Research article in the field of biological sciences: 'SYSTEMATIC ANALYSIS OF GENETIC VARIATIONS IN FGFR2 AND THE ASSOCIATION WITH HUMAN DISEASE'

CC BY



https://doi.org/10.29013/ELBLS-21-4-50-68

Yuan Bi,

The Webb School of California, High School Student. E-mail: andy13842823539@outlook.com

SYSTEMATIC ANALYSIS OF GENETIC VARIATIONS IN FGFR2 AND THE ASSOCIATION WITH HUMAN DISEASE

Abstract. Crouzon Syndrome (CS) and Jackson-Weiss Syndrome (JWS) are two types of craniosynostosis syndrome that result from the premature fusion of fibrous tissue in the skull. The phenotypes of CS and JWS include restricted growth of the skull, brain, and central nervous system. Genetic variants in the fibroblast growth factor receptor 2 (FGFR2) gene have been reported to be related to CS and JWS. To our knowledge, a systematic analysis of all genetic variations in FGFR2 has not previously been reported. We performed a genetic analysis of 947 exonic variants in FGFR2 and characterized the variation pattern of the gene, including the distribution of variants across exons, the types of variants, and the frequency of variants across human population groups. We then examined the relationship between FGFR2 genetic variations and CS as well as JWS, and suggested gene therapy as a potential treatment for these diseases. We also analyzed FGFR2 mutations associated with different cancers and the expression of FGFR2 across cancers, and performed pathogenicity predictions on frequent FGFR2 somatic mutations.

Keywords: Fibroblast Growth Factor Receptor 2, Crouzon Syndrome, Jackson-Weiss Syndrome, Mutations, Cancers, Pathogenicity Prediction.

1. Introduction

Crouzon Syndrome (CS) and Jackson-Weiss Syndrome (JWS) are two types of craniosynostosis syndrome, a condition characterized by premature fusion of fibrous joints and restricted growth of the skull, brain and central nervous system. CS is the most common type of craniosynostosis syndrome [1]. CS has an approximate prevalence of 1 in 25,000 births worldwide. 67% of the cases are familial and 33-56% of the cases derive from spontaneous genetic variations [2].

CS is characterized by abnormal growth of facial bones because of premature fusion of the fibrous tissue. The abnormal skull shapes formed include scaphocephaly (fusion of sagittal suture), oxycephaly (fusion of the lambdoid and coronal sutures), and brachycephaly (fusion of coronal suture). Some craniofacial abnormalities, such as a prominent forehead and protrusion of the eyeballs, are associated with malformation of the skull [3]. The severity of CS and patients' symptoms vary within different families [4].

JWS is another disease that is characterized by abnormal formation of facial structures and the skull. JWS is a rare disease, and its exact prevalence has not yet been reported. JWS is a craniosynostosis disease, and there are many clinical similarities between JWS and CS [5]. JWS also presents limb abnormalities, as people with JWS have toes with medial deviation, and its severity can vary [5, 6].

Studies have reported that both CS and JWS are caused by mutations in human fibroblast growth factor receptor 2 (FGFR2). FGFR2, which spans about 119,744 bp (base pairs) of genomic sequence and resides on chromosome 10, encodes a receptor tyrosine kinase [7]. The full-length messenger RNA (mRNA) of FGFR2 is 4,624 bp and contains a coding sequence of 2,466 bp, which is translated into the FGFR2 protein of 821 amino acid residues [8,9]. FGFR2 protein belongs to the fibroblast growth factor receptor (FGFR) family, whose members differ from each other in tissue distribution and ligand affinities.
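The relationship between the coding sequence length and the protein length quoted above can be checked with a short calculation. The sketch below is illustrative only; it assumes that the 2,466 bp coding sequence includes the stop codon, and the function name is hypothetical.

```python
def protein_length_from_cds(cds_length_bp: int) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    Assumption: the coding sequence length includes the stop codon.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1  # the final codon is a stop codon, not an amino acid


if __name__ == "__main__":
    # 2,466 bp -> 822 codons -> 821 amino acids plus one stop codon
    print(protein_length_from_cds(2466))
```

Under that assumption, 2,466 bp corresponds to 822 codons, that is, 821 amino acid residues followed by a stop codon, which agrees with the protein length reported in the text.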

FGFR2 is composed of three regions: (i) an extracellular region; (ii) a transmembrane region; (iii) intracellular tyrosine kinase domains (TK1 and TK2) [10]. The Ig-like domains IgII and IgIII, together with the linker between them, interact with the FGF ligands, which triggers a cascade of downstream signals. Studies have shown that FGFR2 protein is essential in controlling and regulating cell proliferation, differentiation, migration, and apoptosis [11].

Genetic mutations of FGFR2 increase its binding affinity for fibroblast growth factor (FGF) ligands, disrupting cell differentiation and causing developmental defects [12,13]. Most of the sequence changes that cause craniosynostosis syndromes, including CS and JWS, are located in the extracellular ligand-binding portion, specifically between IgII and IgIII [14].

This study analyzed all reported variants of the FGFR2 gene; to our knowledge, a comprehensive analysis of genetic variations in FGFR2 has not previously been reported. The aim is to conduct a systematic analysis of FGFR2 genetic variations using key information extracted from a human genome variation database. Examples of the key information extracted are the types of variations, nucleotide changes, exon locations, and amino acid changes. The analysis focuses on relationships between FGFR2 variations and CS as well as JWS. In addition, we found that genetic mutations of the FGFR2 gene are associated with multiple types of cancer. Therefore, gene mutations and the subsequent functional effects of FGFR2 in human cancers were also investigated in the study.

2. Materials and Methods

This study uses a series of online tools and web sources to assist our bioinformatics analyses.

2.1 Single nucleotide polymorphism database

The Single Nucleotide Polymorphism Database (dbSNP) was established by the National Center for Biotechnology Information (NCBI) to address large sampling designs [15]. Since its establishment, dbSNP has served as a central public repository of genetic variants. dbSNP documents the following nucleotide sequence variations: (i) single nucleotide substitutions; (ii) small insertion/deletion polymorphisms; (iii) invariant regions of sequence; (iv) microsatellite repeats; (v) named variants; (vi) uncharacterized heterozygous assays. In addition, dbSNP records information on population frequency, neutral polymorphisms, and disease-causing mutations [16]. All currently known genetic variants in the human genome are recorded in the database and can be downloaded freely. Therefore, human genetic variations in the FGFR2 gene were extracted from the online compressed Human Variation Sets file in VCF (Variant Call Format) (ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/00-All.vcf.gz) and used for the next steps of the analysis.
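The extraction step can be pictured as a filter over the VCF records by chromosome and position. The following is a minimal sketch and not the procedure actually used in this study: the function names are hypothetical, the genomic interval of FGFR2 must be supplied by the user (no coordinates are given in the text), and it assumes that the chromosome column holds plain names such as "10".

```python
import gzip
from typing import Iterable, Iterator


def filter_vcf_region(lines: Iterable[str], chrom: str, start: int, end: int) -> Iterator[str]:
    """Yield VCF header lines plus records whose position lies in [start, end]."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta-information and column header lines
            continue
        fields = line.split("\t", 2)  # only CHROM and POS are needed
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line


def extract_region(vcf_gz_path: str, out_path: str, chrom: str, start: int, end: int) -> None:
    """Stream a gzip-compressed VCF and write the matching records to out_path."""
    with gzip.open(vcf_gz_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(filter_vcf_region(src, chrom, start, end))
```

A call such as `extract_region("00-All.vcf.gz", "FGFR2.vcf", "10", gene_start, gene_end)` would then produce a gene-level file, where `gene_start` and `gene_end` are placeholders for the FGFR2 interval in the chosen genome build. Streaming line by line avoids loading the whole dbSNP file into memory.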

2.2 Genetic variant annotation

To obtain Ensembl gene annotations, a VCF file containing all variant sites of FGFR2 (FGFR2.vcf) was uploaded to wANNOVAR (http://wannovar.wglab.org/) [17]. wANNOVAR is the web server version of ANNOVAR, which researchers use to functionally annotate genetic variants from sequence data. The server outputs both the "exome summary results", containing all variants in the exome, and the "genome summary results", containing variants across the whole gene.

2.3 The Cancer Genome Atlas datasets

The Cancer Genome Atlas Project (TCGA), established by the National Cancer Institute, includes data on at least 10,000 cases of more than 30 different tumor types. The processed data in TCGA contain genome sequences, exome sequences, RNA expression data, and clinical datasets [18]. The cancer data in this study come from TCGA. We analyzed mutations in FGFR2 that cause cancers and the expression of FGFR2 across different cancers.

2.4 Somatic gene mutation analysis

We used the online tool cBioPortal (https://www.cbioportal.org/) to perform the FGFR2 gene mutation analysis. cBioPortal for Cancer Genomics was designed to give users easy access to complicated gene datasets and to facilitate the translation of raw genomic data into direct biological information. It provides an extensive set of tools for exploring and visualizing large-scale cancer genomics data. The data sets in cBioPortal contain studies from TCGA [19, 20]. We also used cBioPortal to visualize the alteration frequency of the FGFR2 gene in different cancers.

2.5 Differential expression analysis in cancers

To analyze the expression of FGFR2 across different cancers, we further used UALCAN (http://ualcan.path.uab.edu/). UALCAN is an online tool that uses data from TCGA to compare gene expression between tumor samples and normal samples. UALCAN is also used by researchers to identify over-expressed or under-expressed genes in different types of cancer [21]. This study uses UALCAN to visualize the comparison of gene expression between normal samples and tumor samples.

3. Results

3.1 Research workflow

This study integrated a series of online tools and web sources to perform a systematic analysis of genetic variations in the human FGFR2 gene (Figure 1). Briefly, the genetic variation data were first extracted from the dbSNP database in variant call format (VCF). The VCF file was then uploaded to wANNOVAR to obtain a result file with functional annotations of the genetic variants. The result file contains multiple columns representing different annotation tasks. We selected the key columns that are significant to this research and analyzed these data. The cancer data in this research came from TCGA and were analyzed with online tools including cBioPortal and UALCAN (Figure 1).

Figure 1. Research workflow

3.2 Brief information about FGFR2

Human FGFR2 gene resides on chromosome 10. Ensembl provides comparative analysis at genomic and genic levels, and different transcripts can be compared [22]. The Ensembl gene ID of FGFR2 is ENSG00000066468, which has multiple transcripts from post-transcriptional processing mechanisms. The longest transcript has a transcript ID of ENST00000358487 and this transcript is 4,624 bp in length [23] (Figure 2).

Figure 2. Gene structure of human FGFR2. We used arrows to indicate structures of exons and introns in all transcripts.

The longest transcript is ENST00000358487, which has the greatest number of exons of all transcripts. In this research, we set the exon locations on ENST00000358487 as the reference standard for exons in other transcripts. However, some exons on other transcripts do not match the exon locations on transcript ENST00000358487. We mark these exons using arrows.

FGFR2 is a tyrosine-protein kinase that binds to ligands and activates signaling pathways [24]. The FGF family includes at least 22 known FGF ligands that bind to the extracellular ligand-binding region of the FGFR2 protein. FGFR2 interacts with the FGF1 ligand, while the spliced isoforms of FGFR2 can interact with other members of the FGF family such as FGF1, FGF2, FGF3, and FGF4 [25, 26, 27].

FGFR2 activates several signaling pathways through binding to the FGF ligands (Figure 3). These pathways include: (i) phosphoinositide 3-kinase (PI3K)/protein kinase B (Akt); (ii) phospholipase C gamma 1 (PLCG1) and protein kinase C (PKC); (iii) mitogen-activated protein kinase (MAPK) and extracellular signal-regulated kinases 1 and 2 (ERK1/2) [25, 26, 27]. FGF/FGFR signaling is essential in biological processes including bone formation and homeostasis. Mutations in FGFR2 can cause unregulated FGF signaling and premature suture closure, as downstream signaling becomes dysregulated, including enhancement of the PLCG1, PI3K/Akt, and RAS/RAF/MAPK pathways [28].

Figure 3. FGFR2's main signaling pathway. FGFR2 signaling pathways contain 3 main pathways.

Phosphorylation of PLCG1 leads to the production of the cellular signaling molecules diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). PKC is then activated by DAG, and calcium is released in response to IP3, which regulates cell migration and morphogenesis. Phosphorylation of FRS2 can recruit growth factor receptor-bound protein 2 (GRB2), GRB2-associated-binding protein 1 (GAB1) and son of sevenless homolog 1 (SOS1), which in turn mediate the RAS/RAF/MAPK signaling pathway and the PI3K/Akt pathway. The RAS/RAF/MAPK pathway can then activate ERK1/2, controlling cell proliferation and differentiation. The PI3K/Akt pathway regulates cell survival.

3.3 Genetic variations distribution

wANNOVAR output a total of 26,169 variants, which occur in different regions of the gene (Figure 4). The "genome summary results" file contains all the variants, whereas the "exome summary results" file contains only the variants occurring in exons, which make up approximately 3.6% (947/26,169) of all the genetic variations. Intronic variants account for approximately 91% (23,824/26,169) of all genetic variations in the FGFR2 gene (Figure 4). This study mainly focused on the "exome summary results" because genetic alterations occurring in exons can affect the protein produced from the gene.

wANNOVAR classifies gene variants into the following categories: synonymous single nucleotide variation (SNV), nonsynonymous SNV, frameshift insertion, frameshift deletion, frameshift substitution, nonframeshift insertion, nonframeshift deletion, nonframeshift substitution, startloss, startgain, stoploss, and stopgain [30].

Figure 4. Distribution of genetic variants across different regions in FGFR2.

3'-untranslated regions (UTR3) and 5'-untranslated regions (UTR5) refer to the untranslated regions found on the 3-prime side and the 5-prime side of the coding sequence, respectively. Upstream and downstream refer to relative positions in the DNA: upstream is toward the 5-prime end of the coding sequence, while downstream is toward the 3-prime end. Splicing refers to the boundary between exons and introns. Intergenic region refers to the stretch of sequence located between successive genes. The different regions of the FGFR2 gene are marked with the number of variants found within each region and their percentage of the total number of variants [29].

Among all 947 exonic variations in the FGFR2 gene, nonsynonymous SNV is the most frequent type of variation, accounting for approximately 60% of all exonic genetic variations. The least frequent observed type is nonframeshift insertion, with a single variant. Frameshift substitution and startgain are not found among the exonic genetic variations in FGFR2 (Table 1).

Table 1. - Counts and percentage of exonic variant types in FGFR2 gene

Variation Types | Definition | Counts | Percentage
Synonymous SNV | Single nucleotide variation that causes no amino acid change. | 324 | 34%
Nonsynonymous SNV | Single nucleotide variation that causes an amino acid change. | 574 | 60%
Frameshift Insertion | Insertion of a number of nucleotides not divisible by three, disrupting the reading frame. | 4 | 0.5%
Frameshift Deletion | Deletion of a number of nucleotides not divisible by three, disrupting the reading frame. | 10 | 1%
Frameshift Substitution | Substitution of nucleotides that disrupts the reading frame. | 0 | 0%
Nonframeshift Insertion | Insertion of a number of nucleotides divisible by three, which does not disrupt the reading frame. | 1 | 0.2%
Nonframeshift Deletion | Deletion of a number of nucleotides divisible by three, which does not disrupt the reading frame. | 7 | 0.9%
Nonframeshift Substitution | Substitution of nucleotides that does not disrupt the reading frame. | 5 | 0.6%
Startloss | Variant that results in elimination of the start codon. | 2 | 0.3%
Startgain | Variant that results in creation of a start codon. | 16 | 2%
Stoploss | Variant that results in elimination of the stop codon. | 4 | 0.5%
Stopgain | Variant that results in creation of a stop codon. | 0 | 0%
Total | | 947 |
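The tally behind a table of this kind can be sketched as a count of category labels followed by a division by the total. The example below is only an illustration under stated assumptions: the function name is hypothetical, the labels would in practice come from the ExonicFunc.ensGene column of the annotation file, and the less frequent categories are grouped here under a single "other" label for brevity.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def category_shares(labels: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Map each category to (count, percentage of all labels, one decimal)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: (n, round(100 * n / total, 1)) for name, n in counts.most_common()}


if __name__ == "__main__":
    # Counts for the two largest categories are taken from Table 1;
    # the remaining 49 variants are lumped together as "other".
    labels = ["nonsynonymous SNV"] * 574 + ["synonymous SNV"] * 324 + ["other"] * 49
    for name, (count, percent) in category_shares(labels).items():
        print(f"{name}: {count} ({percent}%)")
```

With the counts of Table 1, nonsynonymous and synonymous SNVs together make up 898 of the 947 exonic variants, so all remaining categories share the other 49.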

Genetic variations are distributed across different exons. Ensembl provides the lengths of the exons of the FGFR2 gene in different transcripts. A total of 26 exons have genetic variations in the result file. Among these 26 exons, exon 18 in transcript ENST00000358487 is the longest, at 1,690 bp, and exon 5 in transcript ENST00000478859 is the shortest, at 40 bp. We counted the genetic variants in each exon in the result file. Exon 3 in transcript ENST00000358487 has the highest number of variations, with 99 genetic variations occurring in this exon. Exon 7 of transcript ENST00000604236 and exon 7 of transcript ENST00000359354 have the fewest, with 3 genetic variations each (Figure 5).

We then calculated the variation density of each exon by dividing the number of variations in the exon by the exon length. Among the 26 exons, exon 7 of transcript ENST00000358487 has the highest variation density, 0.43, followed by exon 8 of transcript ENST00000358487 with a variation density of 0.41. Exon 1 of transcript ENST00000358487 has the lowest variation density, 0, as no variants are reported in this exon. Exon 7 of transcript ENST00000359354 has the second lowest variation density, 0.006 (Figure 5).
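The density calculation described above is a simple ratio, sketched below. The function names are hypothetical, and the exon length and variant count in the usage line are made-up values chosen only to illustrate the arithmetic, since the text does not list the length of every exon.

```python
def variation_density(variant_count: int, exon_length_bp: int) -> float:
    """Number of variants per base pair of exon."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    return variant_count / exon_length_bp


def scaled_density(variant_count: int, exon_length_bp: int, scale: int = 1000) -> float:
    """Density multiplied by a plotting scale factor (1000 in Figure 5)."""
    return scale * variation_density(variant_count, exon_length_bp)


if __name__ == "__main__":
    # Hypothetical exon: 43 variants in 100 bp gives a density of 0.43.
    print(variation_density(43, 100), scaled_density(43, 100))
```

Normalizing by exon length makes short and long exons comparable, which is why the exon with the most variants is not necessarily the one with the highest density.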

Figure 5. Exon lengths and variation numbers across different exons. We multiplied each exon's variation density by 1000 in order to clearly visualize the comparisons.

We set the exon numbering of transcript ENST00000358487 as the standard reference for exons in other transcripts. For exons that are not on transcript ENST00000358487, we marked the transcripts they appear on and their exon numbers in those transcripts.

3.4 Exome annotation result

The file "exome summary results" contains 140 columns and 948 rows. Except for the first row containing column titles, each row represents a genetic variant in FGFR2 exome. Each column describes an annotation task wANNOVAR performs on the FGFR2 variants, and we extracted information from some columns (Table 2) in this FGFR2 genetic research.

Table 2. - Selected columns from exome annotation file

Column Names | Column Descriptions
Chr | Chromosome on which the gene is located
Start | Start nucleotide position of the SNP
End | End nucleotide position of the SNP
Ref | Original nucleotides before the SNP
Alt | New nucleotides after the SNP
Func.ensGene | Region where the SNP occurs
Gene.ensGene | Ensembl ID of the gene
ExonicFunc.ensGene | Type of SNP
AAChange.ensGene | Change of amino acids because of the gene variation
1000G (ALL, AFR, AMR, EAS, EUR, SAS) | Allele frequency in the 1000 Genomes Project. The population groups include the following categories: ALL, African, American, East Asian, European, South Asian.
ClinVar_SIG | Interpretation provided by ClinVar of relationships between genetic variations and diseases [31].
ClinVar_DIS | Diseases associated with the variations [30].
COSMIC_DIS | Effects of SNPs across human cancers
COSMIC_ID | ID assigned by the Catalogue of Somatic Mutations in Cancer (COSMIC) [32]
SIFT score | Predicts the effects of amino acid substitutions on proteins. Ranges from 0.0 (deleterious) to 1.0 (tolerated)
SIFT_pred | Predicts the SNP to be D (deleterious) or T (tolerated)

The "exome summary results" file contains various columns for different annotation tasks on genetic variations, and users can select key columns to perform an analysis of genetic alterations.
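Selecting the key columns from the annotation file can be sketched as follows. This is only an illustration: it assumes the result file has been saved as comma-separated text whose header row uses the column names of Table 2, the function name is hypothetical, and the position values in the sample row are placeholders rather than real coordinates.

```python
import csv
import io
from typing import Dict, List, Sequence, TextIO

KEY_COLUMNS = ["Chr", "Start", "End", "Ref", "Alt",
               "Func.ensGene", "ExonicFunc.ensGene", "AAChange.ensGene"]


def select_columns(handle: TextIO, columns: Sequence[str] = KEY_COLUMNS) -> List[Dict[str, str]]:
    """Read annotation rows and keep only the requested columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return [{c: row[c] for c in columns} for row in reader]


if __name__ == "__main__":
    # One sample row; Start, End, Ref and Alt are placeholder values.
    sample = io.StringIO(
        "Chr,Start,End,Ref,Alt,Func.ensGene,Gene.ensGene,ExonicFunc.ensGene,AAChange.ensGene\n"
        "10,0,0,N,N,exonic,ENSG00000066468,nonsynonymous SNV,"
        "ENSG00000066468:ENST00000358487:exon5:c.T557C:p.M186T\n"
    )
    for row in select_columns(sample):
        print(row["ExonicFunc.ensGene"], row["AAChange.ensGene"])
```

Failing early when an expected column is absent is a deliberate choice here, since annotation servers may rename columns between versions.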

Based on the annotation, 37 genetic variations at 25 genome positions are identified as associated with CS and JWS (Table 3). In addition, 26 variations are predicted to be pathogenic in terms of the Five-Tier Terminology System [33] (Table 4).

Table 3. - FGFR2 genetic variants identified as associated with CS or JWS. Exons that are not marked with a transcript ID are on transcript ENST00000358487. A dash marks rows for which no rsID is listed.

rsID | Exon located | Amino acid change | Diseases related | Relationship with diseases
rs3750819 | Exon2 | R6H | CS|JWS | benign|benign
rs3750819 | Exon2 | R6P | CS|JWS | benign|benign
rs56226109 | Exon3 | S57L | CS|JWS | likely benign|likely benign
rs755793 | Exon5 | M186T | CS | benign
rs121918505 | Exon7 | S267P | CS | pathogenic
- | Exon7 | F276V | CS | pathogenic
- | Exon7 | Y281C | CS | likely pathogenic
rs121918497 | Exon7 | Q289P | CS|JWS | pathogenic|pathogenic
rs121918501 | Exon7 | W290R | CS | pathogenic
rs121918501 | Exon7 | W290G | CS | pathogenic
- | Exon7 | W290S | CS | pathogenic
rs121918500 | Exon7 | K292E | CS | pathogenic
- | Exon7 | Y308C | CS | pathogenic
rs121918493 | Exon8 | Y328C | CS | pathogenic
- | Exon8 | D336G | CS | pathogenic
rs387906676 | Exon8 | A337T | CS | pathogenic
rs387906676 | Exon8 | A337P | CS | pathogenic
- | Exon8 | G338R | CS | pathogenic
- | Exon8 | G338E | CS | pathogenic
rs121918489 | Exon8 | Y340H | CS | pathogenic
rs121918488 | Exon8 | C342S | JWS | pathogenic
rs121918488 | Exon8 | C342R | JWS | pathogenic
- | Exon8 | C342G | JWS | pathogenic
rs121918487 | Exon8 | C342Y | JWS | pathogenic
- | Exon8 | C342S | JWS | pathogenic
- | Exon8 | C342F | JWS | pathogenic
rs121918496 | Exon8 | C342W | CS | pathogenic
rs121918492 | Exon8 | A344G | CS|JWS | pathogenic|pathogenic
rs121918494 | Exon8 | S347C | CS | pathogenic
rs121918490 | Exon8 | S354C | CS | pathogenic
rs121918507 | Exon12 | K526E | CS | pathogenic
- | Exon12 | N549H | CS | likely pathogenic
rs141929882 | Exon13 | R592C | CS|JWS | likely pathogenic|likely pathogenic
rs558460047 | ENST00000429361: Exon9 | T362M | CS|JWS | likely benign|likely benign
rs748777325 | ENST00000357555: Exon17 | K682Rfs*38 | CS|JWS | likely benign|likely benign
rs764959117 | Exon18 | E806K | CS|JWS | Uncertain Significance|Uncertain Significance
rs764959117 | Exon18 | E806Q | CS|JWS | Uncertain Significance|Uncertain Significance

Table 4. - Definition of terms used in the five-tier terminology system. The American College of Medical Genetics and Genomics established the five-tier terminology system to indicate the relationships between genetic variations and diseases.

Terms | Definition
Pathogenic | The genetic variant is disease-causing
Likely Pathogenic | The certainty that the genetic variant is disease-causing is greater than 90% [33]
Uncertain Significance | Unknown whether the genetic variation is disease-causing
Benign | The genetic variation is not disease-causing
Likely Benign | The certainty that the genetic variant is not disease-causing is greater than 90% [33]

We examined the data to determine the relationship between the genetic variations and CS and between the genetic variations and JWS. We analyzed genetic variations across different exons and their associations with the diseases; synonymous SNVs were not included because they do not change amino acids. For JWS, variants in exon 8 of transcript ENST00000358487 account for 87.5% (7/8) of the pathogenic variations, and variants in exon 7 of transcript ENST00000358487 account for the remaining 12.5% (1/8). For CS, exon 8 of transcript ENST00000358487 has 55% of all pathogenic variants, exon 7 of ENST00000358487 has 40%, and exon 12 of ENST00000358487 has 5% (Figure 6).
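These exon shares can be reproduced from Table 3 by counting the rows labelled "pathogenic" for each disease. The sketch below rests on that reading of the table, which is an assumption (rows labelled "likely pathogenic" are left out), and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def exon_shares(exons: Iterable[str]) -> Dict[str, float]:
    """Percentage of variants that fall in each exon."""
    counts = Counter(exons)
    total = sum(counts.values())
    return {exon: 100 * n / total for exon, n in counts.items()}


# Tallies of Table 3 rows labelled "pathogenic" (assumed reading of the table).
JWS_PATHOGENIC = ["exon7"] * 1 + ["exon8"] * 7
CS_PATHOGENIC = ["exon7"] * 8 + ["exon8"] * 11 + ["exon12"] * 1

if __name__ == "__main__":
    print(exon_shares(JWS_PATHOGENIC))  # exon7: 12.5, exon8: 87.5
    print(exon_shares(CS_PATHOGENIC))   # exon7: 40.0, exon8: 55.0, exon12: 5.0
```

Under this reading there are 8 pathogenic rows for JWS and 20 for CS, which gives the same percentages as those reported in the text.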

Figure 6. Distributions of FGFR2 genetic variants that have relationships with CS and JWS across different exons. For exons that are not marked with transcript ID, they belong to transcript ENST00000358487

We also analyzed the different types of SNPs (including synonymous SNVs) and their associations with CS as well as JWS. Most (95%) of the CS pathogenic variants are classified as nonsynonymous SNVs. Interestingly, one synonymous SNV is pathogenic for CS even though synonymous SNVs do not generally cause amino acid changes. All of the variants that are pathogenic for JWS are nonsynonymous SNVs. One frameshift deletion variation is classified as likely benign for both CS and JWS. No other types of variations were reported to have relationships with CS and JWS.

3.5 Genetic variations across human population groups

One of the annotation tasks performed by wANNOVAR is reporting 1000 Genomes Project data for the gene's variants. The aim of the 1000 Genomes Project is to present a comprehensive description of human genetic variation through sequencing individuals' genomes [34]. The 1000 Genomes Project classifies populations into the following 5 major categories: African, American, European, East Asian, and South Asian [35]. In the FGFR2 gene, we identified 5 variants with distinctly different frequencies in different human population groups.

Among the five genetic variations, four are classified as synonymous SNVs, and one is a nonsynonymous SNV. We focused on the analysis of the nonsynonymous SNV because synonymous SNVs do not have a functional consequence for the encoded protein. This nonsynonymous SNV is ENSG00000066468: ENST00000358487: exon5: c.T557C: p.M186T and its rsID is rs755793. This variant has a frequency of 0.36 in the African population, 0.11 in the American population, 0.065 in the East Asian population, 0.007 in the European population, and 0.002 in the South Asian population.

3.6 FGFR2 exonic alterations and associations with tumors

FGFR2 genetic variants are related to different types of cancer. Deregulation of FGFR2 protein caused by FGFR2 alterations has been reported to contribute to tumor progression, and alterations in FGFR2 are found in several types of cancer [36].

cBioPortal provides mutation analysis of different types of tumors using studies from TCGA. In this study, we analyzed data from 30 published cancer studies from TCGA and 4 additional published cancer studies in the cBioPortal database. A total of 11,305 samples from 11,255 patients were included in the cancer analysis (Table 5) [37, 38].

Table 5. - Cancer studies and number of samples in each study

Cancer Studies | Number of Samples
Acute Myeloid Leukemia (TCGA, PanCancer Atlas) | 200
Adrenocortical Carcinoma (TCGA, PanCancer Atlas) | 92
Bladder Urothelial Carcinoma (TCGA, PanCancer Atlas) | 411
Brain Lower Grade Glioma (TCGA, PanCancer Atlas) | 514
Breast Invasive Carcinoma (TCGA, PanCancer Atlas) | 1084
Cervical Squamous Cell Carcinoma (TCGA, PanCancer Atlas) | 297
Cholangiocarcinoma (TCGA, PanCancer Atlas) | 36
Colorectal Adenocarcinoma (TCGA, PanCancer Atlas) | 594
Diffuse Large B-Cell Lymphoma (TCGA, PanCancer Atlas) | 48
Esophageal Adenocarcinoma (TCGA, PanCancer Atlas) | 182
Glioblastoma Multiforme (TCGA, PanCancer Atlas) | 592
Head and Neck Squamous Cell Carcinoma (TCGA, PanCancer Atlas) | 523
Kidney Renal Clear Cell Carcinoma (TCGA, PanCancer Atlas) | 512
Liver Hepatocellular Carcinoma (TCGA, PanCancer Atlas) | 372
Lung Adenocarcinoma (TCGA, PanCancer Atlas) | 566
Lung Squamous Cell Carcinoma (TCGA, PanCancer Atlas) | 487
Mesothelioma (TCGA, PanCancer Atlas) | 87
Ovarian Serous Cystadenocarcinoma (TCGA, PanCancer Atlas) | 585
Pancreatic Adenocarcinoma (TCGA, PanCancer Atlas) | 184
Pheochromocytoma and Paraganglioma (TCGA, PanCancer Atlas) | 178
Prostate Adenocarcinoma (TCGA, PanCancer Atlas) | 494
Sarcoma (TCGA, PanCancer Atlas) | 255
Skin Cutaneous Melanoma (TCGA, PanCancer Atlas) | 448
Stomach Adenocarcinoma (TCGA, PanCancer Atlas) | 440
Testicular Germ Cell Tumors (TCGA, PanCancer Atlas) | 149
Thymoma (TCGA, PanCancer Atlas) | 123
Thyroid Carcinoma (TCGA, PanCancer Atlas) | 500
Uterine Carcinosarcoma (TCGA, PanCancer Atlas) | 57
Uterine Corpus Endometrial Carcinoma (TCGA, PanCancer Atlas) | 529
Uveal Melanoma (TCGA, PanCancer Atlas) | 80
Ampullary Carcinoma (Baylor College of Medicine, Cell Reports 2016) | 160
Metastatic Melanoma (DFCI, Science 2015) | 110
Non-Small Cell Lung Cancer (MSK, Cancer Cell 2018) | 75
Metastatic Esophagogastric Cancer (MSKCC, Cancer Discovery 2017) | 341

Figure 7. Alteration frequency in different types of cancer caused by FGFR2 variants

cBioPortal presents the FGFR2 alteration frequencies of the cancers. Among all types of cancer, Cholangiocarcinoma has the highest alteration frequency, 19.44%. Within this 19.44%, fusion has a frequency of 13.89% and mutation has a frequency of 5.56%. Uterine Corpus Endometrial Carcinoma has the second highest alteration frequency, 16.64%, and Skin Cutaneous Melanoma has the third highest, 10.59%. Within the 16.64% alteration frequency of Uterine Corpus Endometrial Carcinoma, mutation accounts for 14.93%, amplification for 1.13%, deep deletion for 0.19%, and multiple alterations for 0.38%. Within the 10.59% frequency of Skin Cutaneous Melanoma, mutation accounts for 9.91%, deep deletion for 0.45%, and multiple alterations for 0.23% (Figure 7).
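An alteration frequency of this kind is the number of altered samples divided by the number of samples in the study. The sketch below illustrates the arithmetic for Cholangiocarcinoma; the function name is hypothetical, and the event counts (5 fusions, 2 mutations) are not stated in the text but are inferred here from the reported percentages and the 36 samples listed in Table 5.

```python
from typing import Dict


def alteration_frequencies(event_counts: Dict[str, int], n_samples: int) -> Dict[str, float]:
    """Percentage of samples per alteration type, plus the overall total."""
    if n_samples <= 0:
        raise ValueError("number of samples must be positive")
    freqs = {kind: round(100 * n / n_samples, 2) for kind, n in event_counts.items()}
    freqs["total"] = round(100 * sum(event_counts.values()) / n_samples, 2)
    return freqs


if __name__ == "__main__":
    # Inferred counts for Cholangiocarcinoma: 5 fusions and 2 mutations in 36 samples.
    print(alteration_frequencies({"fusion": 5, "mutation": 2}, 36))
```

With these inferred counts the result is 13.89% fusion, 5.56% mutation and 19.44% overall. Because the study has only 36 samples, each altered sample shifts the frequency by almost 3 percentage points.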

3.7 FGFR2 gene expression across different cancers

UALCAN provides analysis of gene expression across different cancers using cancer studies from TCGA. The difference in FGFR2 expression between normal and tumor samples varies across the cancers. Of 24 types of cancer, 19 have FGFR2 gene expression lower than normal, indicating loss-of-function mutations in FGFR2. Interestingly, Cholangiocarcinoma, the type of cancer with the highest genetic alteration frequency, has FGFR2 gene expression higher than normal. This implies that gain-of-function mutations cause Cholangiocarcinoma (Figure 8) [39, 40].

Figure 8. FGFR2 gene expression across normal samples and tumor samples

3.8 FGFR2 mutations pathogenicity prediction

In order to predict frequent FGFR2 somatic mutations' pathogenicity, we analyzed mutation data from the National Cancer Institute Genomic Data Commons (GDC). GDC stores genomic and clinical data information from patients with cancer, and the aim of this system is to allow public access to genomic data [41].

Sorting Intolerant from Tolerant (SIFT) is an algorithm that predicts the effects of genetic variations on proteins. It can help users identify whether genetic variants are disease-causing or tolerated. The underlying theory behind the SIFT algorithm is that evolutionarily conserved regions are less tolerant of genetic variants. SIFT examines the composition of amino acids and calculates the SIFT score, which indicates the normalized probability of observing the amino acid change at the position. SIFT scores range from 0 to 1. Variants with a score less than 0.05 are deleterious (disease-causing), while variants with a score greater than or equal to 0.05 are tolerated (benign) [42].
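The cutoff rule described above can be written as a small function. This is a sketch of the thresholding step only, not of the SIFT algorithm itself; the function name and the "not scored" label for variants without a score are hypothetical choices.

```python
from typing import Optional

SIFT_CUTOFF = 0.05  # threshold stated in the text


def sift_prediction(score: Optional[float]) -> str:
    """Classify a variant from its SIFT score using the 0.05 cutoff."""
    if score is None:
        return "not scored"  # no SIFT score is available for this variant
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores range from 0 to 1")
    return "deleterious" if score < SIFT_CUTOFF else "tolerated"


if __name__ == "__main__":
    for s in (0.0, 0.049, 0.05, 1.0, None):
        print(s, sift_prediction(s))
```

Note that a score of exactly 0.05 falls on the tolerated side of the cutoff, following the rule as stated.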

We extracted FGFR2 exonic mutation data and utilized SIFT to predict the pathogenicity of common mutations (Figure 9).

Figure 9. Prediction of effects of FGFR2 exonic mutations. Among a total of 282 most frequent exonic mutations, 50% (142/282) are deleterious, 19% (52/282) are tolerated

4. Discussion

The FGF/FGFR2 signaling pathway is essential to bone development and homeostasis, and mutations in FGFR2 lead to unregulated FGF signaling and premature suture closure. Besides CS and JWS, other types of craniosynostosis syndrome, such as Apert Syndrome, Pfeiffer Syndrome, and Saethre-Chotzen-like Syndrome, are all caused by FGFR2 genetic mutations [43]. Currently, the only treatment for craniosynostosis syndrome is surgery in the first year of life, which expands the intracranial volume to prevent the buildup of pressure on the brain and restores the cosmetic appearance [44].

Mutations in the FGFR2 gene increase its binding affinity for FGF ligands and enhance downstream signaling, leading to premature suture fusion. Therefore, it would be promising to inhibit the downstream RAS/RAF/MAPK signaling pathway to prevent cells from proliferating prematurely and to minimize the phenotypes of CS and JWS. However, these signaling pathways are important to other aspects of human growth, including mitosis and meiosis [45]. Inhibiting the downstream signaling pathway might therefore carry various risks, so it is important to consider other treatment methods for these diseases. For Crouzon Syndrome and Jackson-Weiss Syndrome, gene therapy is a potential treatment, because normal versions of the FGFR2 gene can be introduced into cells that contain a faulty FGFR2 gene.

This study analyzes all reported genetic variations in the human FGFR2 exome. To our knowledge, an analysis covering all genetic variations in human FGFR2 has not yet been reported. Among all 947 exonic variations, nonsynonymous SNVs and synonymous SNVs account for 60% and 34% of all variations, respectively. Taking into account the number of variations reported and the length of each exon, exon 7 of transcript ENST00000358487 has the highest variation density, 0.43, followed by exon 8 of the same transcript with a variation density of 0.41. We found that the frequency of one specific nonsynonymous SNV, rs755793, differs markedly between human populations: it is 0.36 in the African population but only 0.002 in the South Asian population. We also examined the most frequent somatic mutations in the FGFR2 exome and predicted whether they are deleterious. According to the prediction, 50% of the frequent exonic mutations in FGFR2 are deleterious and may increase an individual's susceptibility to disease. However, SIFT scores could not be calculated for 31% of these mutations because of missing information, so the SIFT prediction does not give a comprehensive analysis of all FGFR2 mutations. Still, the prediction indicates that at least half of the frequent FGFR2 mutations are likely to be deleterious.
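The idea of comparing exons by variation density can be illustrated with a short Python sketch. It assumes that density is the number of reported variants in an exon divided by the exon length in base pairs, which is our reading of the description above. The exon names, variant counts, and lengths below are hypothetical placeholders, not values from this study.

```python
def variation_density(variant_count: int, exon_length_bp: int) -> float:
    """Variants per base pair for a single exon (assumed definition)."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    if variant_count < 0:
        raise ValueError("variant count cannot be negative")
    return variant_count / exon_length_bp


# Hypothetical (variant_count, exon_length_bp) pairs for illustration only.
exons = {
    "exon_A": (43, 100),
    "exon_B": (82, 200),
    "exon_C": (30, 150),
}

densities = {
    name: variation_density(count, length)
    for name, (count, length) in exons.items()
}

# Rank exons from highest to lowest density.
ranked = sorted(densities.items(), key=lambda item: item[1], reverse=True)
for name, density in ranked:
    print(f"{name}: {density:.2f}")
```

Normalizing by length matters here: in this toy data exon_B has the most variants in absolute terms, yet exon_A ranks first once length is taken into account.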

The current database contains genetic variation data from individuals who are not limited to CS or JWS patients, so the data as a whole are not a perfect representation of patients with either disease. According to our current data, pathogenic variations for CS are found only in exon 7, exon 8 and exon 12 of transcript ENST00000358487, and pathogenic variations for JWS are reported only in exon 7 and exon 8 of the same transcript. Most of the variants causing CS and JWS overlap in exon 8, which encodes the extracellular IgIII domain. In fact, many phenotypes of these two diseases overlap, as both are craniosynostosis syndromes. Interestingly, we found one synonymous SNV, rs121918491, to be pathogenic for CS. One possible explanation is that, although the variant does not cause an amino acid change, the specificity of tRNAs for the altered codon can affect the timing of translation, which in turn can affect protein folding. As a result, the properties of the encoded protein might be influenced by a synonymous SNV.

Gene therapy for CS and JWS should target exon 8 of transcript ENST00000358487, and CRISPR/Cas9 is a potential technique for editing FGFR2. CRISPR/Cas9 can be used to replace the faulty gene segments with healthy copies of the FGFR2 sequence. RNA editing is another promising treatment method for CS and JWS.

This study also analyzes the different types of cancer associated with FGFR2 variations. Mutations in the FGFR2 gene have often been related to the progression of multiple tumor types. According to our analysis and visualizations using cBioPortal, the most common type of cancer associated with FGFR2 alterations is cholangiocarcinoma. The visualization, derived from an analysis of 11305 patients and 11255 samples in a total of 34 studies, shows that cholangiocarcinoma has an alteration frequency of 19.44%. Most of the tumors can be caused by loss-of-function mutations, but the one with the highest alteration frequency, cholangiocarcinoma, is caused by gain-of-function mutations. Overexpression of FGFR2 fusion proteins, which is initially caused by genetic translocations, leads to increased sensitivity to FGFR inhibitors [46]. Hromas and colleagues studied the prevention of chromosomal translocations and concluded that poly-adenosine diphosphate ribose polymerase 1 (PARP1) inhibitors can prevent their formation [47]. It would be promising to test experimentally whether PARP1 inhibitors can be used to treat cholangiocarcinoma caused by FGFR2 mutations.
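As a rough illustration of how an alteration frequency such as the one reported above is obtained, the Python sketch below divides the number of altered samples by the number of profiled samples for each cancer type. The cancer type labels and counts are hypothetical and chosen only to make the arithmetic visible. They are not the cBioPortal data used in this study.

```python
def alteration_frequency(altered: int, profiled: int) -> float:
    """Percentage of profiled samples that carry at least one alteration."""
    if profiled <= 0:
        raise ValueError("number of profiled samples must be positive")
    if not 0 <= altered <= profiled:
        raise ValueError("altered samples must be between 0 and profiled")
    return 100.0 * altered / profiled


# Hypothetical (altered, profiled) sample counts for illustration only.
cohorts = {
    "cancer_type_A": (7, 36),
    "cancer_type_B": (12, 500),
    "cancer_type_C": (3, 90),
}

frequencies = {
    name: alteration_frequency(altered, profiled)
    for name, (altered, profiled) in cohorts.items()
}

# Cancer type with the highest alteration frequency.
top = max(frequencies, key=frequencies.get)
print(top, f"{frequencies[top]:.2f}%")
```

Because the frequency is a ratio, a small cohort with few altered samples can rank above a much larger cohort with more altered samples, which is worth keeping in mind when comparing cancer types.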

5. Conclusion

In conclusion, to our knowledge, this is the first systematic analysis of all exonic variants in FGFR2, and it provides important information for understanding this gene. The study shows that nonsynonymous SNVs are the most frequent variant type, accounting for 60% of all 947 exonic variants. Exon 7 has the highest variation density, 0.43. Most of the variants that are pathogenic for CS and JWS are in exon 7 and exon 8. These exons encode IgIII and the linker between IgII and IgIII, regions that bind directly to the FGF ligands. The study also suggests that FGFR2 variants are associated with certain types of cancer, and possible treatment methods are discussed.

References:

1. Houssaint E., Blanquet P.R., Champion-Arnaud P., Gesnel M.C., Torriglia A., Courtois Y. & Breathnach R. Related fibroblast growth factor receptor genes exist in the human genome. Proceedings of the National Academy of Sciences of the United States of America, - 87(20). 1990. - P. 8180-8184.

2. Samatha Y., Vardhan T.H., Kiran A.R., Sankar A.J. & Ramakrishna B. Familial Crouzon syndrome. Contemporary clinical dentistry, - 1(4). 2010. - P. 277-280.

3. Gupta S., Prasad A., Sinha U., Singh R. & Gupta G. Crouzon Syndrome in a Ten-week-old Infant: A Case Report. Saudi journal of medicine & medical sciences, - 8(2). 2020. - P. 146-150.

4. Graul-Neumann L.M., Klopocki E., Adolphs N., Mensah M.A. & Kress W. Mutation c.943G>T (p. Ala-315Ser) in FGFR2 Causing a Mild Phenotype of Crouzon Craniofacial Dysostosis in a Three-Generation Family. Molecular syndromology, - 8(2). 2017. - P. 93-97.

5. Van Herwerden L., Rose C.S., Reardon W., Brueton L.A., Weissenbach J., Malcolm S. & Winter R.M. Evidence for locus heterogeneity in acrocephalosyndactyly: a refined localization for the Saethre-Chotzen syndrome locus on distal chromosome 7p, and exclusion of Jackson-Weiss syndrome from craniosynostosis loci on 7p and 5q. American journal of human genetics, - 54(4). 1994. - P. 669-674.

6. Wilkie A.O. Craniosynostosis: Genes and mechanisms. Hum Mol Genet. - 6. 1997. - P. 1647-1656. Doi: 10.1093/hmg/6.10.1647.

7. UCSC Genome Browser, Homo sapiens fibroblast growth factor receptor 2 (FGFR2), transcript variant 1, mRNA; July 2020. Available from: URL: https://genome.ucsc.edu/cgi-bin/hgGateway

8. The UniProt Consortium, UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47: D506-515. FGFR2 Human, 2019. Available from: URL: https://www.uniprot.org/.

9. Nucleotide. Bethesda: National Library of Medicine, National Center for Biotechnology Information; Accession No. NM_000141.5. Homo sapiens fibroblast growth factor receptor 2 (FGFR2), transcript variant 1, mRNA; July, 2020. Available from: URL: https://www.ncbi.nlm.nih.gov/nuccore/NM_000141.

10. Gene. Bethesda: National Library of Medicine, National Center for Biotechnology Information; July, 2020. Available from: URL: https://www.ncbi.nlm.nih.gov/gene/2263.

11. Al-Namnam N.M., Hariri F., Thong M.K. & Rahman Z.A. Crouzon syndrome: Genetic and intervention review. Journal of oral biology and craniofacial research, - 9(1). 2019. - P. 37-39.

12. Lin Y., Ai S., Chen C., Liu X., Luo L., Ye S., Liang X., Zhu Y., Yang H. & Liu Y. Ala344Pro mutation in the FGFR2 gene and related clinical findings in one Chinese family with Crouzon syndrome. Molecular vision, - 18. 2012. - P. 1278-1282.

13. Lin Y., Liang X., Ai S., Chen C., Liu X., Luo L., Ye S., Li B., Liu Y. & Yang H. FGFR2 molecular analysis and related clinical findings in one Chinese family with Crouzon syndrome. Molecular vision, - 18. 2012. - P. 449-454.

14. Oldridge M., Lunt P.W., Zackai E.H., McDonald-McGinn D.M., Muenke M., Moloney D.M., Twigg S.R., Heath J.K., Howard T.D., Hoganson G. Genotype-phenotype correlation for nucleotide substitutions in the IgII-IgIII linker of FGFR2. Hum Mol Genet. - 6. 1997. - P. 137-143.

15. dbSNP. 2020. Available from: URL: https://www.ncbi.nlm.nih.gov/snp/.

16. Sherry S.T., Ward M.H., Kholodov M., Baker J., Phan L., Smigielski E.M. & Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic acids research, - 29(1). 2001. - P. 308-311.

17. Lab W.G. wANNOVAR. 2021. Available from: URL: http://wannovar.wglab.org/.

18. Chandran U.R., Medvedeva O.P., Barmada M.M., Blood P.D., Chakka A., Luthra S., Ferreira A., Wong K.F., Lee A.V., Zhang Z., Budden R., Scott J.R., Berndt A., Berg J.M. & Jacobson R.S. TCGA Expedition: A Data Acquisition and Management System for TCGA Data. PloS one, - 11(10). 2016. - e0165395.

19. Buechner P., Hinderer M., Unberath P., Metzger P., Boeker M., Acker T., Haller F., Mack E., Nowak D., Paret C., Schanze D., von Bubnoff N., Wagner S., Busch H., Boerries M. & Christoph J. Requirements Analysis and Specification for a Molecular Tumor Board Platform Based on cBioPortal. Diagnostics (Basel, Switzerland), - 10(2). 2020. - 93 p.

20. Gao J., Aksoy B.A., Dogrusoz U., Dresdner G., Gross B., Sumer S.O., Sun Y., Jacobsen A., Sinha R., Larsson E., Cerami E., Sander C. & Schultz N. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Science signaling, - 6(269). 2013. - pl1.

21. Chandrashekar D.S., Bashel B., Balasubramanya S., Creighton C.J., Ponce-Rodriguez I., Chakravarthi B. & Varambally S. UALCAN: A Portal for Facilitating Tumor Subgroup Gene Expression and Survival Analyses. Neoplasia (New York, N.Y.), - 19(8). 2017. - P. 649-658.

22. Herrero J., Muffato M., Beal K., Fitzgerald S., Gordon L., Pignatelli M., Vilella A.J., Searle S.M., Amode R., Brent S., Spooner W., Kulesha E., Yates A. & Flicek P. Ensembl comparative genomics resources. Database: the journal of biological databases and curation, 2016. bav096.

23. EMBL's European Bioinformatics Institute. FGFR2 ENST00000358487 Transcripts. July 2020. Available from: URL: https://asia.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000066468;r=10:121479072-121598458;t=ENST00000358487.

24. Ornitz D.M. & Marie P.J. Fibroblast growth factor signaling in skeletal development and disease. Genes & development, - 29(14). 2015. - P. 1463-1486.

25. Seto M.L., Hing A.V., Chang J., Hu M., Kapp-Simon K.A., Patel P.K. et al. Isolated sagittal and coronal craniosynostosis associated with TWIST box mutations. Am J Med Genet Part A. - 143(7). 2007. - P. 678-86.

26. Lei H., Deng C-X. Fibroblast growth factor receptor 2 signaling in breast cancer. Int J Biol Sci. In press.

27. Robin N.H. FGFR-Related Craniosynostosis Syndromes. Gene Reviews. 2007. - P. 1-30.

28. Azoury S.C., Reddy S., Shukla V. & Deng C.X. Fibroblast Growth Factor Receptor 2 (FGFR2) Mutation Related Syndromic Craniosynostosis. International journal of biological sciences, - 13(12). 2017. - P. 1479-1488.

29. Notari D.L., Molin A., Davanzo V., Picolotto D., Ribeiro H.G. & Silva S. IntergenicDB: a database for intergenic sequences. Bioinformation, - 10(6). 2014. - P. 381-383.

30. Li Q. & Wang K. InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines. American journal of human genetics, - 100(2). 2017. - P. 267-280.

31. Landrum M.J., Chitipiralla S., Brown G.R., Chen C., Gu B., Hart J., Hoffman D., Jang W., Kaur K., Liu C., Lyoshin V., Maddipatla Z., Maiti R., Mitchell J., O'Leary N., Riley G. R., Shi W., Zhou G., Schneider V., Maglott D., ... Kattman B.L. ClinVar: improvements to accessing data. Nucleic acids research, - 48(D1). 2020. - P. D835-D844.

32. Forbes S.A., Bhamra G., Bamford S., Dawson E., Kok C., Clements J., Menzies A., Teague J.W., Futreal P.A. & Stratton M.R. The Catalogue of Somatic Mutations in Cancer (COSMIC). Current protocols in human genetics, Chapter - 10. Unit-10.11. 2008.

33. van Rooij J., Arp P., Broer L., Verlouw J., van Rooij F., Kraaij R., Uitterlinden A. & Verkerk A. Reduced penetrance of pathogenic ACMG variants in a deeply phenotyped cohort study and evaluation of ClinVar classification over time. Genetics in medicine: official journal of the American College of Medical Genetics, - 22(11). 2020. - P. 1812-1820.

34. Altshuler D.L., et al. A map of human genome variation from population-scale sequencing. Nature.

35. Belsare S., Levy-Sakin M., Mostovoy Y. et al. Evaluating the quality of the 1000 genomes project data. BMC Genomics - 20. 2019. - P. 620.

36. Szybowska P., Kostas M., Wesche J., Wiedlocha A. & Haugsten E.M. Cancer Mutations in FGFR2 Prevent a Negative Feedback Loop Mediated by the ERK1/2 Pathway. Cells, - 8(6). 2019. - 518 p.

37. Cerami et al. The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data. Cancer Discovery. May, - 2. 2012. - 401 p.

38. cBioPortal for Cancer Genomics. 2021. Available from: URL: https://www.cbioportal.org.

39. Chandrashekar D.S., Bashel B., Balasubramanya S., Creighton C.J., Ponce-Rodriguez I., Chakravarthi B. & Varambally S. UALCAN: A Portal for Facilitating Tumor Subgroup Gene Expression and Survival Analyses. Neoplasia (New York, N.Y.), - 19(8). 2017. - P. 649-658.

40. UALCAN. 2021. Available from: URL: http://ualcan.path.uab.edu/.

41. Jensen M.A., Ferretti V., Grossman R.L. & Staudt L.M. The NCI Genomic Data Commons as an engine for precision medicine. Blood, - 130(4). 2017. - P. 453-459.

42. Sim N.L., Kumar P., Hu J., Henikoff S., Schneider G. & Ng P.C. SIFT web server: predicting effects of amino acid substitutions on proteins. Nucleic acids research, 40(Web Server issue), 2012. - W452-W457.

43. Agochukwu N.B., Solomon B.D. & Muenke M. Impact of genetics on the diagnosis and clinical management of syndromic craniosynostoses. Child's nervous system: ChNS: official journal of the International Society for Pediatric Neurosurgery, - 28(9). 2012. - P. 1447-1463.

44. Lattanzi W., Barba M., Di Pietro L. & Boyadjiev S.A. Genetic advances in craniosynostosis. American journal of medical genetics. Part A, - 173(5). 2017. - P. 1406-1429.

45. Kalous J., Tetkova A., Kubelka M. & Susor A. Importance of ERK1/2 in Regulation of Protein Translation during Oocyte Meiosis. International journal of molecular sciences, - 19(3). 2018. - 698 p.

46. Wang J., Xing X., Li Q., Zhang G., Wang T., Pan H. & Li D. Targeting the FGFR signaling pathway in cholangiocarcinoma: promise or delusion? Therapeutic advances in medical oncology, - 12. 2020. 1758835920940948.

47. Hromas R., Williamson E., Lee S.H. & Nickoloff J. Preventing The Chromosomal Translocations That Cause Cancer. Transactions of the American Clinical and Climatological Association, - 127. 2016. - P. 176-195.
