User:Arahmatullah/sandbox

c7orf26 (Chromosome 7, Open Reading Frame 26) is a gene in humans that encodes a protein known as c7orf26 (uncharacterized protein c7orf26). Based on the properties of c7orf26 and its conservation over a long period of evolutionary time, the protein is predicted to localize to the cytoplasm and to play a role in regulating transcription.

Gene

Background

Chromosome 7 is one of the 23 pairs of chromosomes in humans. It spans about 159 million base pairs and represents about 5 to 5.5% of the total DNA in cells.[1] Changes to the structure of chromosome 7 can result in a number of genetic disorders, including Williams syndrome, which causes structural and cosmetic changes to the human body and ultimately results in a shorter lifespan.[2] Hundreds of open reading frames (ORFs) are known on chromosome 7; however, little is known about open reading frame 26, which is of considerable interest.

Currently, two isoforms are known in Homo sapiens and are referred to as isoforms 1 and 2, respectively.[3]

Location

Location of c7orf26 on chromosome 7

c7orf26 (accession: NM_024067 / NP_076972; alias: MGC-2178) is located on the short arm of chromosome 7 (7p22.1), starting at position 6590021 and ending at position 6608726. The c7orf26 gene spans 2178 base pairs and is oriented on the plus (+) strand. The coding region encodes a protein 449 amino acids long. The gene has 6 transcripts containing a total of 24 exons on the forward strand and 5952 unique single nucleotide polymorphisms (SNPs).[4]

Gene Neighborhood

The genes ZDHHC4, ZNF853 and ZNF316 neighbor c7orf26 on chromosome 7.[5] ZDHHC4 encodes a zinc-finger protein involved in cytochrome-c oxidase activity and protein-cysteine S-palmitoyltransferase activity, and it overlaps with c7orf26.[6] GRID2IP lies more than 2000 bp upstream of c7orf26 and is heavily involved in synaptogenesis and synaptic plasticity.[7]

Expression

Diagram depicting the expression of c7orf26 in tissues throughout the body.

c7orf26 is highly expressed in lymphatic, reproductive, and nervous tissue, including the brain (frontal and occipital cortex), thymus, salivary glands, endometrium, cervix, and prostate. It is expressed at intermediate levels in the lungs.[8]

Homology

Paralogs

No paralogs of c7orf26 have been found in the human genome. However, six unique isoforms have been identified: c7orf26 isoforms X1, X2, X3 and X4, and isoform 2 (for which two sub-isoforms have been identified).[9]

Orthologs

Below is a table of orthologs of the human c7orf26. The table includes closely, intermediately and distantly related orthologs, listed in descending order of date of divergence.[10] c7orf26 is highly conserved throughout all orthologs, as demonstrated by the 65% identity of the least similar ortholog. c7orf26 has evolved slowly and evenly over time.

| Genus Species | Common Name | Taxonomy | Date of Divergence (MYA) | Sequence Length (# amino acids) | Sequence Identity (%) | Sequence Similarity (%) |
|---|---|---|---|---|---|---|
| Lingula anatina | Lingulata | Invertebrata | 916 | 406 | 72% | 66% |
| Cryptotermes secundus | West Indian Drywood Termite | Invertebrata | 797 | 403 | 77% | 70% |
| Saccoglossus kowalevskii | Acorn Worm | Invertebrata | 794 | 407 | 86% | 81% |
| Python bivittatus | Burmese Python | Reptilia | 286 | 329 | 72% | 66% |
| Colius striatus | Speckled Mousebird | Aves | 273 | 309 | 87% | 83% |
| Callorhinchus milii | Australian Ghostshark | Chondrichthyes | 177 | 441 | 70% | 57% |
| Cynoglossus semilaevis | Tonguefish | Osteichthyes | 128 | 307 | 67% | 52% |
| Amphiprion ocellaris | Clownfish | Osteichthyes | 117 | 481 | 65% | 50% |
| Elephantulus edwardii | Cape Elephant Shrew | Mammalia | 105 | 445 | 91% | 88% |
| Piliocolobus tephrosceles | Ugandan Red Colobus | Mammalia | 102 | 449 | 88% | 84% |
| Theropithecus gelada | Bleeding Heart Monkey | Primate | 43.6 | 617 | 91% | 88% |
| Saimiri boliviensis boliviensis | Black Capped Squirrel Monkey | Primate | 43.2 | 585 | 92% | 90% |
| Homo sapiens | Human | Primate | 0 | 449 | 100% | 100% |
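
The identity values in the table come from pairwise sequence alignments. As a rough illustration only, percent identity can be computed from two already-aligned sequences as sketched below. The function name, the toy sequences, and the choice to ignore gapped columns in the denominator are assumptions made for this example; alignment tools differ in how they define the denominator, so this sketch will not necessarily reproduce the figures above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped alignment columns.

    Both inputs must already be aligned (equal length, gaps as '-').
    Ignoring gapped columns is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not taken from c7orf26).
print(percent_identity("MKT-LLA", "MRTALLG"))
```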

Protein

General Properties

The molecular weight of c7orf26 is 50 kilodaltons (kDa), and its isoelectric point is 7.61. The protein sequence is notably rich in leucine, which makes up 15.8% of its composition; this may indicate a leucine zipper. Further analysis with PSORT predicts a leucine-zipper region from amino acid 318 to position 340 (22 amino acids long). The protein shows no extremes of acidity or alkalinity. c7orf26 has a positive charge cluster from amino acids 245 to 275 and has no negative or mixed charge clusters.[11]
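
The kinds of checks described above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used by SAPS or PSORT: it assumes the common leucine-zipper pattern L-x(6)-L-x(6)-L-x(6)-L (22 residues), uses a rule-of-thumb average of about 110 Da per residue for the mass estimate, and runs on a made-up toy sequence rather than the real c7orf26 sequence. All function names are hypothetical.

```python
import re
from collections import Counter

# Assumed pattern: L-x(6)-L-x(6)-L-x(6)-L, 22 residues in total.
# The lookahead lets overlapping matches be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def composition(seq):
    """Return the percent composition of each residue in a protein sequence."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def find_leucine_zippers(seq):
    """Return 1-based, inclusive (start, end) positions of each pattern match."""
    return [(m.start() + 1, m.start() + 22) for m in LEUCINE_ZIPPER.finditer(seq)]


def approx_mass_kda(length, avg_residue_da=110.0):
    """Rough mass estimate from length; 110 Da per residue is a rule of thumb."""
    return length * avg_residue_da / 1000.0


# Toy sequence (hypothetical) with leucines spaced seven residues apart.
toy = "MA" + "LAAAAAA" * 3 + "L" + "GK"
print(find_leucine_zippers(toy))
print(round(composition(toy)["L"], 1))
print(approx_mass_kda(449))  # length-based estimate for a 449-residue protein
```

For a 449-residue protein the length-based estimate comes to roughly 49 kDa, which is in line with the 50 kDa figure quoted above.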

Composition

The amino acids that make up c7orf26 are fairly evenly distributed. The percent composition of each amino acid is fairly consistent throughout the orthologs of the protein, and the most distant ortholog displays the most variance in amino acid composition. The protein has a higher percent composition of tyrosine, histidine and leucine and a lower percent composition of valine and alanine.

c7orf26 is predicted to be heavily phosphorylated after translation. There are 66 predicted phosphorylation sites according to the NetPhos predictor of phosphorylation sites.[12] There are 4 unique predicted sumoylation sites according to the SUMOplot/SUMOsp programs.[13] Sumoylation is involved in a number of cellular processes, including nuclear-cytosolic transport, transcriptional regulation and protein stability.

According to the DAS-TMfilter server[14], c7orf26 has zero predicted transmembrane sites or transmembrane protein coding regions, which suggests that c7orf26 is not a transmembrane protein.

The GOR (Garnier-Osguthorpe-Robson)[15] method predicts that the secondary structure of c7orf26 is composed of alpha helices, random coil regions and extended strands. Random coil regions are the most common, comprising 53.23% of the protein, while alpha helices comprise 34.30% and extended strands 12.47%.
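
Percentages like these are derived by counting the per-residue states in the predictor output. The sketch below assumes a GOR-style state string that uses `h` for helix, `e` for extended strand and `c` for coil. The example string is hypothetical: its counts (239, 154 and 56) were back-calculated from the quoted percentages and the 449-residue length, not taken from the actual prediction.

```python
from collections import Counter

# Assumed one-letter state codes for a GOR-style prediction string.
STATE_NAMES = {"h": "alpha helix", "e": "extended strand", "c": "random coil"}


def state_percentages(states):
    """Return the percentage of residues assigned to each secondary-structure state."""
    counts = Counter(states.lower())
    total = sum(counts[s] for s in STATE_NAMES)
    if total == 0:
        raise ValueError("no recognised state characters in input")
    return {
        name: round(100.0 * counts[code] / total, 2)
        for code, name in STATE_NAMES.items()
    }


# Hypothetical state string: only the counts matter here, not the order.
example = "c" * 239 + "h" * 154 + "e" * 56
print(state_percentages(example))
```

With these assumed counts the three values sum to 100% and match the percentages reported above.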

Subcellular localization

According to PSORT, c7orf26 is predicted to be localized in the cytoplasm with 70.6% confidence.[16]

Interacting Proteins

According to the Mentha interactome browser, c7orf26 interacts with 11 different proteins.[17] In particular, c7orf26 interacts with Integrator complex subunits 1 through 7 (the 'INTS' family). The Integrator complex associates with the C-terminal domain of the large subunit of RNA polymerase II. It is involved in the transcription of small nuclear RNAs (snRNAs) and in the processing of their transcripts. INTS also mediates recruitment of cytoplasmic dynein to the nuclear envelope.

Outside of the INTS gene family, c7orf26 interacts with AK5[18], HDGF, and ASUN[19].

Clinical Significance

According to Giurato et al. (2018), there may be some evidence that regions on chromosome 7 are linked to a nuclear estrogen receptor (ESR2) that modulates cancer cell proliferation and tumor growth[20]. Fu et al. (2014) give further indication that regions of chromosome 7 located between open reading frames 20 and 30 correlate with the cellular functions of hepatoma-derived growth factor (HDGF), which may be relevant to tumorigenesis[21].

References

  1. Genetics Home Reference. "Chromosome 7". Retrieved 2019-04-29.
  2. Genetics Home Reference. "Williams syndrome". Retrieved 2019-05-02.
  3. "uncharacterized protein C7orf26 isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  4. "C7orf26 chromosome 7 open reading frame 26 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  5. "C7orf26 chromosome 7 open reading frame 26 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  6. www.genecards.org https://www.genecards.org/cgi-bin/carddisp.pl?gene=ZDHHC4. Retrieved 2019-05-02.
  7. "GRID2IP - Google Search". www.google.com. Retrieved 2019-05-02.
  8. "GDS1085 / 4306". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  9. "C7orf26 - Uncharacterized protein C7orf26 - Homo sapiens (Human) - C7orf26 gene & protein". www.uniprot.org. Retrieved 2019-05-02.
  10. "HomoloGene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  11. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-04.
  12. "NetPhos 3.1 Server - prediction results". www.cbs.dtu.dk. Retrieved 2019-05-04.
  13. "SUMOplot™ Analysis Program | Abgent". www.abgent.com. Retrieved 2019-05-04.
  14. "DAS-TMfilter server". www.enzim.hu. Retrieved 2019-05-05.
  15. "NPS@ : GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2019-05-05.
  16. "PSORT II Prediction". psort.hgc.jp. Retrieved 2019-05-05.
  17. "mentha: the interactome browser". mentha.uniroma2.it. Retrieved 2019-05-05.
  18. www.genecards.org https://www.genecards.org/cgi-bin/carddisp.pl?gene=AK5. Retrieved 2019-05-05.
  19. www.genecards.org https://www.genecards.org/cgi-bin/carddisp.pl?gene=INTS13. Retrieved 2019-05-05.
  20. Giurato, Giorgio; Nassa, Giovanni; Salvati, Annamaria; Alexandrova, Elena; Rizzo, Francesca; Nyman, Tuula A.; Weisz, Alessandro; Tarallo, Roberta (2018-03-06). "Quantitative mapping of RNA-mediated nuclear estrogen receptor β interactome in human breast cancer cells". Scientific Data. 5: 180031. doi:10.1038/sdata.2018.31. ISSN 2052-4463. PMC 5839158. PMID 29509190.
  21. Fu, Wenxian; Farache, Julia; Clardy, Susan M; Hattori, Kimie; Mander, Palwinder; Lee, Kevin; Rioja, Inmaculada; Weissleder, Ralph; Prinjha, Rab K (2014-11-19). "Epigenetic modulation of type-1 diabetes via a dual effect on pancreatic macrophages and β cells". eLife. 3. doi:10.7554/eLife.04631. ISSN 2050-084X. PMC 4270084. PMID 25407682.

Suggested Reading

  1. Boeing, Stefan; Williamson, Laura; Encheva, Vesela; Gori, Ilaria; Saunders, Rebecca E.; Instrell, Rachael; Aygün, Ozan; Rodriguez-Martinez, Marta; Weems, Juston C. (2016-05-17). "Multiomic Analysis of the UV-Induced DNA Damage Response". Cell Reports. 15 (7): 1597–1610. doi:10.1016/j.celrep.2016.04.047. ISSN 2211-1247. PMC 4893159. PMID 27184836.
  2. Goto, Yusuke; Kojima, Satoko; Kurozumi, Akira; Kato, Mayuko; Okato, Atsushi; Matsushita, Ryosuke; Ichikawa, Tomohiko; Seki, Naohiko (2016-05-10). "Regulation of E3 ubiquitin ligase-1 (WWP1) by microRNA-452 inhibits cancer cell migration and invasion in prostate cancer". British Journal of Cancer. 114 (10): 1135–1144. doi:10.1038/bjc.2016.95. ISSN 1532-1827. PMC 4865980. PMID 27070713.
  3. Subaran, Ryan L.; Odgerel, Zagaa; Swaminathan, Rajeswari; Glatt, Charles E.; Weissman, Myrna M. (2016-04). "Novel variants in ZNF34 and other brain-expressed transcription factors are shared among early-onset MDD relatives". American Journal of Medical Genetics. Part B, Neuropsychiatric Genetics: The Official Publication of the International Society of Psychiatric Genetics. 171B (3): 333–341. doi:10.1002/ajmg.b.32408. ISSN 1552-485X. PMC 5832964.
  4. Stelzl, Ulrich; Worm, Uwe; Lalowski, Maciej; Haenig, Christian; Brembeck, Felix H.; Goehler, Heike; Stroedicke, Martin; Zenkner, Martina; Schoenherr, Anke (2005-09-23). "A human protein-protein interaction network: a resource for annotating the proteome". Cell. 122 (6): 957–968. doi:10.1016/j.cell.2005.08.029. ISSN 0092-8674. PMID 16169070.