Chromosome 19 open reading frame 22 (c19orf22) is a protein that in humans is encoded by the c19orf22 gene.[5] The primary alias of the gene is R3H domain containing 4 (R3HDM4), but it is commonly referred to as c19orf22.

R3HDM4
Identifiers
Aliases: R3HDM4, C19orf22, R3H domain containing 4
External IDs: MGI: 1924814; HomoloGene: 16343; GeneCards: R3HDM4; OMA: R3HDM4 - orthologs
Orthologs
Species: Human; Mouse
RefSeq (mRNA): Human NM_138774; Mouse NM_177994, NM_001378997
RefSeq (protein): Human NP_620129; Mouse NP_818775, NP_001365926
Location (UCSC): Human Chr 19: 0.9 – 0.91 Mb; Mouse Chr 10: 79.75 – 79.76 Mb
PubMed search: Human [3]; Mouse [4]

Gene

In the human genome, c19orf22 is located on the minus strand of chromosome 19, at 19p13.3.[6] The gene contains six exons.

 
Location of c19orf22 on the chromosome.

Expression

Expression of the gene is highest in bone marrow, followed by tissues such as the appendix and spleen.[7] Similar patterns were found when cross-checked against strict orthologs, including mouse and rat. Overall, expression is ubiquitous and high across many tissues.

mRNA

The mRNA is 1803 nucleotides long.[8] There are two known isoforms of c19orf22:

  • R3H domain-containing protein 4 isoform X1
  • R3H domain-containing protein 4 isoform X2

Conceptual translation

The conceptual translation shown covers the 5' UTR, the protein sequence, and the end of the 3' UTR.

Conceptual Translation Part 1 - 5' UTR and protein sequence.
Conceptual Translation Part 2 - protein sequence.
Conceptual Translation Part 3 - protein sequence.
Conceptual Translation Part 4 - end of the 3' UTR.

Homology/evolutionary history

There are many orthologs of c19orf22, both strict and distant, but no known paralogs. C19orf22 is estimated to have first appeared in fish more than 550 million years ago.[9] The gene is found in both jawed and jawless fish and across vertebrates, but not in invertebrates. C19orf22 is evolving moderately slowly when compared against fibrinogen alpha and cytochrome c, two proteins commonly used as reference points for gauging the rate of protein evolution.

Strict and Distant Orthologs of c19orf22

| Group | Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity to Human (%) | Sequence Similarity to Human (%) |
|---|---|---|---|---|---|---|---|---|
| Mammals | Homo sapiens | Human | Primates | 0 | NP_620129.2 | 268 | 100 | 100 |
| Mammals | Mus musculus | Mouse | Rodentia | 87 | NP_001365926.1 | 268 | 83.2 | 88.6 |
| Mammals | Felis catus | Domestic Cat | Felis | 94 | XP_023098420.1 | 270 | 91.5 | 94.1 |
| Mammals | Myotis daubentonii | Daubenton's Bat | Chiroptera | 94 | XP_059553395.1 | 272 | 88.2 | 91.2 |
| Mammals | Trichechus manatus latirostris | Florida Manatee | Sirenia | 99 | XP_004378580.1 | 272 | 89.7 | 93.8 |
| Reptiles | Crocodylus porosus | Saltwater Crocodile | Crocodilia | 319 | XP_019393792.1 | 261 | 67.7 | 78.1 |
| Reptiles | Python bivittatus | Burmese Python | Squamata | 319 | XP_007434704.1 | 274 | 60.4 | 72.9 |
| Reptiles | Pogona vitticeps | Bearded Dragon | Iguania | 319 | XP_020636752.1 | 286 | 57.6 | 67.8 |
| Reptiles | Chelonia mydas | Green Sea Turtle | Testudines | 319 | XP_037741422.1 | 281 | 57.2 | 68 |
| Birds | Dromaius novaehollandiae | Emu | Casuariiformes | 319 | XP_025964072.1 | 273 | 64.5 | 76.4 |
| Birds | Scopus umbretta | Hamerkop | Pelecaniformes | 319 | NXX59745.1 | 243 | 63.1 | 75.4 |
| Birds | Hirundo rustica | Barn Swallow | Passeriformes | 319 | NXW75483.1 | 243 | 63.1 | 74.6 |
| Birds | Meleagris gallopavo | Turkey | Galliformes | 319 | XP_010723277.1 | 282 | 60.1 | 71.5 |
| Amphibians | Geotrypetes seraphini | West African Caecilian | Apoda | 352 | XP_033811581.1 | 258 | 61.9 | 77.6 |
| Amphibians | Hyla sarda | Sardinian Tree Frog | Anura | 352 | XP_056373927.1 | 259 | 53 | 70.9 |
| Fish | Salmo salar | Atlantic Salmon | Salmoniformes | 429 | ACI68345.1 | 253 | 46.6 | 60.1 |
| Fish | Solea solea | Flatfish | Pleuronectiformes | 429 | XP_058495099.1 | 256 | 50.7 | 65.4 |
| Fish | Scyliorhinus canicula | Small Spotted Cat Shark | Carcharhiniformes | 462 | XP_038632293.1 | 261 | 54.4 | 70.1 |
| Fish | Amblyraja radiata | Thorny Skate | Rajiformes | 462 | XP_032902988.1 | 260 | 52.4 | 68 |
| Fish | Petromyzon marinus | Sea Lamprey | Petromyzontiformes | 563 | XP_032809394.1 | 577 | 20.6 | 27.7 |
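
Rate comparisons of this kind are often made by converting percent identity into a corrected divergence value and setting it against divergence time. The sketch below shows one assumed way to do that and is not necessarily the method used for this article: it applies a Poisson correction, m = -ln(1 - d) × 100, where d is the observed fraction of differing sites, to a few rows of the ortholog table. The function name `corrected_divergence` and the choice of rows are illustrative.

```python
import math


def corrected_divergence(identity_pct: float) -> float:
    """Poisson-corrected substitutions per 100 residues from percent identity."""
    observed = 1.0 - identity_pct / 100.0  # fraction of sites that differ
    return -math.log(1.0 - observed) * 100.0


# (common name, median divergence in MYA, % identity to human) from the table
orthologs = [
    ("Mouse", 87, 83.2),
    ("Saltwater Crocodile", 319, 67.7),
    ("West African Caecilian", 352, 61.9),
    ("Atlantic Salmon", 429, 46.6),
]

for name, mya, identity in orthologs:
    m = corrected_divergence(identity)
    print(f"{name}: {m:.1f} substitutions per 100 residues over {mya} MYA")
```

Plotting these values against divergence time, alongside the same calculation for reference proteins, gives a visual sense of relative evolutionary rate.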

Protein

The protein contains 268 amino acids and has a molecular weight of 30.3 kDa.[10] This is below the average molecular weight of human proteins, which ranges from 38 to 46 kDa.
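
As a rough sanity check on these figures, protein mass can be approximated from length using an average residue mass of about 110 Da. This is a common rule of thumb and an assumption here, not the method behind the cited value. The constant `AVG_RESIDUE_MASS_DA` and the function name are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average mass per amino acid


def estimate_mass_kda(length_aa: int) -> float:
    """Approximate protein mass in kDa from its length in amino acids."""
    return length_aa * AVG_RESIDUE_MASS_DA / 1000.0


estimate = estimate_mass_kda(268)  # 268 aa, the human protein length
print(f"Estimated mass: {estimate:.1f} kDa (reported: 30.3 kDa)")
```

The estimate lands close to the reported value; the exact figure depends on amino acid composition.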

Protein interactions

Many proteins have been linked to c19orf22 through evidence such as co-expression, experimental data, curated databases, text mining, and protein neighborhood analysis. The most important ones are described in the table below.[11]

Interacting Protein Network

| Protein Name | Full Name | Description |
|---|---|---|
| MT-CO1 | Mitochondrially Encoded Cytochrome C Oxidase 1 | Component of the cytochrome c oxidase, the last enzyme in the mitochondrial electron transport chain which drives oxidative phosphorylation. |
| MT-CO2 | Mitochondrially Encoded Cytochrome C Oxidase 2 | Component of the cytochrome c oxidase, the last enzyme in the mitochondrial electron transport chain which drives oxidative phosphorylation. |
| MT-CO3 | Mitochondrially Encoded Cytochrome C Oxidase 3 | Component of the cytochrome c oxidase, the last enzyme in the mitochondrial electron transport chain which drives oxidative phosphorylation. |
| CYC1 | Cytochrome C1 | Heme protein, mitochondrial, component of the ubiquinol-cytochrome c oxidoreductase, a multiunit transmembrane complex that is part of the mitochondrial electron transport chain which drives oxidative phosphorylation. |

Location and function

C19orf22 is consistently found in the nucleus and cytoplasm across orthologs.[12] It is predicted to enable nucleic acid binding.

Post-translational modifications

C19orf22 contains several significant domains and regions: a disordered region, a mixed charge region, an MVP (major vault protein, also known as vault) region, and an R3H domain.[13] There are also many phosphorylation sites throughout the sequence. Not all of them are included in the figure below; the two most significant sites are indicated by purple circles. The green region represents the vault region, and the yellow region represents the R3H domain. Disordered and mixed charge regions are also shown.

 
Significant regions found on c19orf22 protein sequence.

Protein structure

Tertiary structure of c19orf22 protein.

Secondary structure

Alpha helices and beta sheets are evenly distributed throughout the protein sequence.[14]

Tertiary structure

The tertiary structure of the c19orf22 protein is depicted.[15] As shown in the key, spheres mark the most significant phosphorylation sites, and ball-and-stick rendering marks the conserved arginine (R)-rich regions in the sequence. Other domains and regions are labelled.

The tertiary structure contains positive, negative, neutral, and mixed charge regions.

 
Charge distribution of c19orf22 protein.

Clinical significance

Text-based information

Altered expression of c19orf22 has been associated with medical conditions such as arthritis and cancer.[16][17] C19orf22 is identified as a gene that is part of the erythropoietic signature. Genes in this signature are differentially expressed, with measurable fold changes, in systemic juvenile idiopathic arthritis (sJIA) and cryopyrin-associated periodic syndromes (CAPS). C19orf22 is also included in a list of genes that are depleted in patients with congenital heart defects. It may also be correlated with high myopia, learning difficulties, and dysmorphic features that are symptoms of Peutz-Jeghers syndrome.

Common SNPs

There are multiple missense, 3' UTR, and intron variants in c19orf22.[18] Variants are lacking in the 5' UTR, likely because of its short length. Some variants are listed in the table below.

Short Genetic Variations of c19orf22

| SNP | Position | Transcript Change | Protein Change | Mutation Type | Variant Allele | Clinical Significance |
|---|---|---|---|---|---|---|
| rs1242243978 | 897,544 | c.713G>A | Cys238Tyr | Missense variant | T | none |
| rs1248285503 | 897,551 | c.706C>G | Pro236Ala | Missense variant | T | none |
| rs897751342 | 896,574 | N/A | N/A | 3' UTR variant | A and T | none |
| rs771454485 | 897,579 - 897,583 | N/A | N/A | Intron variant | G and T | none |
| rs1408762003 | 894,877 | N/A | N/A | 2KB upstream variant | C | none |
| rs762093889 | 899,456 | c.687G>A | Met229Ile | Missense variant | T | none |
| rs751997535 | 899,629 | c.253G>A | Ala207Thr | Missense variant | T | none |
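
The transcript and protein columns are linked by simple arithmetic: a coding position c.N falls in codon ceil(N / 3). The sketch below, using the hypothetical helper `codon_number`, shows that mapping for coding positions taken from the table. It assumes standard HGVS coding numbering, in which c.1 is the A of the start codon, and handles only simple single-base substitutions.

```python
import re


def codon_number(hgvs_c: str) -> int:
    """Return the 1-based codon hit by a simple substitution like 'c.713G>A'."""
    match = re.fullmatch(r"c\.(\d+)[ACGT]>[ACGT]", hgvs_c)
    if match is None:
        raise ValueError(f"unsupported notation: {hgvs_c}")
    position = int(match.group(1))
    return (position - 1) // 3 + 1  # positions 1-3 -> codon 1, 4-6 -> codon 2


for change in ("c.713G>A", "c.706C>G", "c.687G>A"):
    print(change, "-> codon", codon_number(change))
```

For these three missense rows the result is codons 238, 236, and 229, matching the listed protein changes. Applied to c.253G>A the same arithmetic gives codon 85, so the transcript or protein entry in that row may be worth rechecking against dbSNP.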

References

  1. GRCh38: Ensembl release 89: ENSG00000198858. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000035781. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "R3H domain-containing protein 4 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2023-10-15.
  6. "Human BLAT Search". genome.ucsc.edu. Retrieved 2023-10-15.
  7. "R3HDM4 R3H domain containing 4 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2023-10-15.
  8. "NCBI Nucleotide". 25 December 2022.
  9. "BLAST".
  10. "Statistical Analysis of Proteins".
  11. "Stringdb".
  12. "Human Protein Atlas".
  13. "ScanProsite".
  14. "Chou and Fasman Secondary Structure Prediction Server".
  15. "iCN3D".
  16. Nirmala N, Grom A, Gram H (September 2014). "Biomarkers in systemic juvenile idiopathic arthritis: a comparison with biomarkers in cryopyrin-associated periodic syndromes". Current Opinion in Rheumatology. 26 (5): 543–552. doi:10.1097/BOR.0000000000000098. PMC 4487522. PMID 25050926.
  17. Qureshi MA, Khan S, Tauheed MS, Syed SA, Ujjan ID, Lail A, Sharafat S (November 2020). "Pan-Cancer Multiomics Analysis of TC2N Gene Suggests its Important Role(s) in Tumourigenesis of Many Cancers". Asian Pacific Journal of Cancer Prevention. 21 (11): 3199–3209. doi:10.31557/APJCP.2020.21.11.3199. PMC 8033114. PMID 33247676.
  18. "NCBI dbSNP".

Further reading

Journal articles that can provide more insight: