Chromosome 8 open reading frame 58 is an uncharacterised protein that in humans is encoded by the C8orf58 gene.[5] The protein is predicted to be localized in the nucleus.

C8orf58
Identifiers
Aliases: C8orf58, chromosome 8 open reading frame 58
External IDs: MGI: 2145726; HomoloGene: 19540; GeneCards: C8orf58
Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_173686, NM_001013842, NM_001198827 | NM_001004155, NM_001112735
RefSeq (protein) | NP_001013864, NP_001185756, NP_775957 | NP_001004155, NP_001106206
Location (UCSC) | Chr 8: 22.6 – 22.6 Mb | Chr 14: 70.39 – 70.4 Mb
PubMed search | [3] | [4]

Gene

The C8orf58 gene is located on chromosome 8 at position 8p21.3. It spans 4,550 base pairs and has seven exons. C8orf58 is flanked by the genes PDLIM2 and CCAR2.[6] It has no known aliases and is classified as a protein-coding gene.[7]

mRNA

C8orf58 produces three transcript splice variants. Variant 1 is the longest transcript and encodes the largest protein; it is 2,062 base pairs long and contains seven exons. The other two variants arise from alternative splicing.[8]

Isoform | Exons | Length (base pairs) | Features
Transcript Variant 1 | 1, 2, 3, 4, 5, 6, 7 | 2062 | One upstream in-frame stop codon.
Transcript Variant 2 | 1, 2, 3, 4, 5, 6, 7 | 2038 | Alternate in-frame splice site in the 3' coding region.
Transcript Variant 3 | 1, 2, 3, 4, 5, 6 | 1955 | Lacks an alternate exon, resulting in a frameshift in the 3' coding region.

C8orf58 has a relatively short 5’ region and a 3’ region of moderate length. Both regions contain stem-loops.[9] One predicted miRNA binding site is found in the 3’ UTR of C8orf58.[10]

Protein

C8orf58 protein isoform 1 is 365 amino acids long; isoforms 2 and 3 are 357 and 300 amino acids long, respectively. A Kozak consensus sequence is present, which is consistent with a protein-coding sequence.[11]
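
The transcript and isoform lengths reported here can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming each isoform is translated from a single uninterrupted open reading frame that ends in one stop codon; the function names are illustrative and do not come from any of the cited tools.

```python
# Lengths as reported in this article
# (transcripts in base pairs, protein isoforms in amino acids).
TRANSCRIPT_BP = {1: 2062, 2: 2038, 3: 1955}
PROTEIN_AA = {1: 365, 2: 357, 3: 300}


def coding_length_bp(protein_aa: int) -> int:
    """Length of an open reading frame: one codon per residue plus a stop codon."""
    return 3 * (protein_aa + 1)


def untranslated_bp(variant: int) -> int:
    """Bases left for the 5' and 3' untranslated regions combined (assumed model)."""
    return TRANSCRIPT_BP[variant] - coding_length_bp(PROTEIN_AA[variant])


for variant in sorted(TRANSCRIPT_BP):
    print(variant, coding_length_bp(PROTEIN_AA[variant]), untranslated_bp(variant))
# prints:
# 1 1098 964
# 2 1074 964
# 3 903 1052
```

Under this assumption, variants 1 and 2 differ by 24 bases (eight codons), which matches the eight-residue difference between isoforms 1 and 2 and is what an in-frame splice change would produce.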

C8orf58 isoform 1 has a molecular weight of 39.7 kDa and an isoelectric point of about 8.3. It is rich in proline and arginine and poor in isoleucine, asparagine, phenylalanine, and tyrosine.[12]

The predicted secondary structure of the C8orf58 protein includes multiple alpha helices and one beta strand.[12][13]

Isoform | From mRNA Variant | Length (amino acids) | Molecular Weight (kDa) | Isoelectric Point
1 | 1 | 365 | 39.7 | 8.30
2 | 2 | 357 | 38.6 | 8.30
3 | 3 | 300 | 32.0 | 5.82
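
Molecular weight and amino acid composition of the kind reported above can be estimated directly from a protein sequence. The sketch below is one common approach, not the SDSC Biology Workbench method itself: it sums standard average residue masses and adds one water molecule for the free termini. The function names and the example peptide are hypothetical, and the C8orf58 sequence is not reproduced here.

```python
from collections import Counter

# Standard average masses (Da) of amino acid residues within a peptide chain.
AVERAGE_RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS_DA = 18.01528  # added once for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the average molecular weight of an unmodified peptide in kDa."""
    seq = sequence.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS_DA)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    total_da = sum(AVERAGE_RESIDUE_MASS_DA[res] for res in seq) + WATER_MASS_DA
    return total_da / 1000.0


def composition_percent(sequence: str) -> dict:
    """Return the percentage of each residue type present in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {res: 100.0 * n / len(seq) for res, n in counts.items()}


# Hypothetical peptide for illustration only (not part of C8orf58).
example = "MPRPPRAPLS"
print(round(molecular_weight_kda(example), 3))
print(composition_percent(example))
```

As a rough consistency check on the table, 39.7 kDa over 365 residues corresponds to about 109 Da per residue, which is within the typical range for proteins.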

Evolutionary history

C8orf58 is part of the DUF4657 family, a family of proteins found in eukaryotes. Proteins in this family are typically between 305 and 370 amino acids in length.[14] The domain of unknown function (DUF) of C8orf58 spans amino acids 73 to 364.

Expression

According to the NCBI GEO profiles, C8orf58 is a narrowly expressed protein found in spleen, lung, thymus, prostate, and spinal cord tissue. It is constitutively expressed in these tissues.[15]

Post-translational modification

Bioinformatic tools on Expasy were used to identify potential post-translational modification sites in the C8orf58 protein. Two phosphorylation sites and one sumoylation site are predicted.[16]

Subcellular localization

According to PSORT II, C8orf58 is predicted to localize to the nucleus. This is consistent with the predicted sumoylation site, since sumoylation is involved in nucleocytoplasmic transport.

Interacting proteins

Two proteins have been reported to interact with C8orf58: CENPH, identified by a two-hybrid assay, and metG1, identified by the two-hybrid pooling approach.[17] CENPH (Centromere Protein H) plays a critical role in centromere structure, kinetochore formation, and sister chromatid separation.[18] MetG1 (Methionine--tRNA ligase) is required for elongation of protein synthesis and for the initiation of all mRNA translation through initiator tRNA(fMet) aminoacylation.[19]

Homology

An important paralog of this gene is ENSG00000248235.[20] Orthologs of human C8orf58 have been found only in vertebrates.

Scientific Name | Common Name | NCBI Accession Number | Length (Amino Acids) | Date of Divergence (MYA) | Identity (%) | Similarity (%)
Homo sapiens | Human | NP_001013864.1 | 365 | - | - | -
Gorilla gorilla | Gorilla | XP_004046807.1 | 439 | 9.06 | 96 | 79.50
Marmota marmota | Alpine Marmot | XP_015354979.1 | 369 | 90 | 68 | 75.7
Oryctolagus cuniculus | European Rabbit | XP_008248092.1 | 371 | 90 | 66 | 72
Nannospalax galili | Spalax | XP_008848689.1 | 362 | 90 | 65 | 74.7
Ceratotherium simum simum | White Rhinoceros | XP_014652157.1 | 381 | 96 | 66 | 72.7
Odobenus rosmarus divergens | Pacific Walrus | XP_012418498.1 | 388 | 96 | 65 | 74.7
Sus scrofa | Wild Boar | XP_005670472.1 | 382 | 96 | 65 | 73.3
Hipposideros armiger | Great Roundleaf Bat | XP_019487131.1 | 387 | 96 | 62 | 71
Eptesicus fuscus | Big Brown Bat | XP_008149784.1 | 377 | 96 | 62 | 70.1
Loxodonta africana | African Bush Elephant | XP_003412428.1 | 372 | 105 | 71 | 77.2
Orycteropus afer afer | Aardvark | XP_007949039.1 | 370 | 105 | 65 | 71.7
Parus major | Great Tit | XP_015504136.1 | 320 | 312 | 32 | 35.6
Anolis carolinensis | Carolina Anole | XP_008118367.1 | 453 | 312 | 28 | 38.9
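
Identity and similarity percentages like those in the table are derived from a pairwise alignment of each ortholog with the human protein. The sketch below shows one simple way such values can be computed from two already-aligned sequences. It is an illustration under stated assumptions: the residue groups used to define "similar" are a simplified stand-in for a substitution matrix such as BLOSUM62, the denominator counts every aligned column including gaps, and the example sequences are hypothetical. The tool and settings used for the table are not stated in the source.

```python
# Assumed, simplified grouping of residues treated as "similar".
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple:
    """Return (identity %, similarity %) for two gapped, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column with gaps in both sequences carries no information
        compared += 1
        if a == "-" or b == "-":
            continue  # a gap counts toward the length but is never a match
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments, for illustration only.
identity, similarity = identity_and_similarity("MKT-LV", "MRTALI")
print(round(identity, 1), round(similarity, 1))  # prints: 50.0 83.3
```

With this definition every identical position also counts as similar, so the similarity value can never be lower than the identity value for the same alignment.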

References

  1. GRCh38: Ensembl release 89: ENSG00000241852. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000044551. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Entrez Gene: Chromosome 8 open reading frame 58". Retrieved 2017-11-22.
  6. NCBI Nucleotide. Homo sapiens chromosome 8 open reading frame 58 (C8orf58), transcript variant 1, mRNA.
  7. GeneCards. C8orf58 Gene (Protein Coding): Chromosome 8 Open Reading Frame 58.
  8. NCBI Gene. C8orf58 chromosome 8 open reading frame 58 [Homo sapiens (human)].
  9. RNA Folding Form.
  10. TargetScan Human.
  11. NCBI Protein. Uncharacterized protein C8orf58 isoform 1 [Homo sapiens].
  12. SDSC Biology Workbench.
  13. Chou-Fasman Secondary Structure Prediction Server.
  14. UniProtKB - Q8NAV2 (CH058_HUMAN). UniProt.
  15. NCBI GEO Profiles.
  16. Expasy Bioinformatics Resource Portal.
  17. IntAct Molecular Interaction Database.
  18. Centromere protein H.
  19. Methionine--tRNA ligase.
  20. GeneCards. C8orf58 Gene (Protein Coding): Chromosome 8 Open Reading Frame 58.