User:Carly Rasmussen/sandbox/KIAA1841

KIAA1841 is a gene in humans that encodes a protein known as KIAA1841 (uncharacterized protein KIAA1841).

Gene

Location

Location of KIAA1841 on chromosome 2

KIAA1841 is located on the short arm of chromosome 2 (2p15), starting at position 61,297,486 and ending at 61,349,294. The KIAA1841 gene spans 52,809 base pairs and is oriented on the plus strand (4). The coding region is made up of 4,292 base pairs and the protein sequence consists of 718 amino acids (1).

Gene Neighborhood

Genes PEX13 and C2orf74 neighbor KIAA1841 on chromosome 2. [1]

Expression

Diagram depicting the expression of KIAA1841 in tissues throughout the body.

KIAA1841 is highly expressed in reproductive structures and nervous tissue, including the brain, prostate, cervix and ear. It is expressed at intermediate levels in the lungs and spinal cord.[2] [3] KIAA1841 is also expressed at low levels in a wide range of tissues throughout the human body.

Promoter

According to Genomatix’s ElDorado program, the promoter region of KIAA1841 is predicted to be [add length] base pairs in length. The promoter region starts [add #] base pairs upstream of the 5’ UTR of the KIAA1841 mRNA transcript and contains part of this 5’ UTR.

Transcript Variants

In humans, the KIAA1841 gene produces 18 alternatively spliced transcript variants as well as 3 unspliced forms. Of the 18 spliced variants, 4 encode a protein product. The main transcript in humans is transcript ID ENST00000402291, or OTTHUMT00000325477. [4] [5] [6]

Homology

Paralogs

There are no known paralogs of KIAA1841.[7]

Orthologs

| Species | Common Name | NCBI Accession # | Sequence Length (aa) | Protein Identity | mRNA Identity |
|---|---|---|---|---|---|
| Homo sapiens | Human | NP_001123465.1 | 718 | 100% | 100% |
| Odobenus rosmarus divergens | Walrus | XP_004397774.1 | 718 | 94% | 97% |
| Canis lupus familiaris | Dog | XP_538505 | 718 | 92% | 96% |
| Equus caballus | Horse | XP_001495879.1 | 765 | 93% | 96% |
| Mus musculus | Mouse | NP_082136.2 | 718 | 89% | 94% |
| Echinops telfairi | Lesser Hedgehog Tenrec | XP_004710320.1 | 718 | 86% | 92% |
| Pelodiscus sinensis | Soft-Shelled Turtle | XP_006122225.1 | 718 | 78% | 88% |
| Anas platyrhynchos | Mallard Duck | XP_005016968.1 | 719 | 78% | 87% |
| Gallus gallus | Red Junglefowl | NP_001186348.1 | 718 | 76% | 87% |
| Xenopus (Silurana) tropicalis | Western Clawed Frog | XP_004914757.1 | 715 | 71% | 83% |
| Danio rerio | Zebrafish | XP_001333668.2 | 735 | 60% | 73% |
| Drosophila melanogaster | Fruit Fly | NP_648346.1 | 889 | 38% | 62% |
| Apis mellifera | Western Honey Bee | XP_006559923.1 | 849 | 40% | 62% |
| Anopheles gambiae | Mosquito | XP_558222.3 | 806 | 40% | 61% |

Orthologs of the human protein KIAA1841 are listed above in descending order of date of divergence and then ascending order of percent identity. KIAA1841 is highly conserved throughout all orthologs: even the most distant orthologs listed, the insects, retain roughly 40% protein identity. KIAA1841 has evolved slowly and evenly over time.[8] [9]
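As an illustration of what a protein identity percentage measures, the following is a minimal sketch in Python. It assumes two sequences that have already been aligned, with `-` marking gaps, and it counts identity over gap-free columns only. BLAST reports identity over its own alignment, so values from this sketch would not necessarily match the table. The function name and the toy sequences are hypothetical and are not taken from KIAA1841.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of gap-free aligned columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns where the residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 6 gap-free columns, 5 of them identical.
toy_identity = percent_identity("MKT-LLV", "MKSALLV")
print(f"{toy_identity:.1f}%")
```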

Homologous Domains

The domain of unknown function 3342 (DUF3342) is conserved in all orthologs and is the most highly conserved region of the protein. Conservation of this domain was traced all the way back to a fungus, Batrachochytrium dendrobatidis, which diverged from humans 1,216 million years ago.[10]

Protein

General Properties

The protein is targeted to the nucleus. The table below compares primary sequence properties of the full-length protein, the DUF3342 domain, and the N- and C-terminal regions.

| Property | KIAA1841 | DUF3342 | N Terminus | C Terminus |
|---|---|---|---|---|
| Isoelectric Point | 6.5 | 6.74 | 5.5 | 8.2 |
| Positive Charge (%) | 13.8 | 13.2 | 11.7 | 15.9+ |
| Negative Charge (%) | 14.4 | 13.5 | 14.2 | 14.5 |
| Net Charge | -0.6 | -0.3 | -2.5 | 1.4 |
| Major Hydrophobics (%) | 24.8 | 28.7 | 27.2 | 22.3 |
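The net charge row appears to be the positive charge percentage minus the negative charge percentage for each region. The short Python sketch below reproduces that arithmetic from the values in the table. The variable and function names are hypothetical, and the trailing "+" on the C terminus value is assumed to be an annotation rather than part of the number.

```python
# (positive %, negative %) for each region, copied from the table.
# Assumption: the C terminus value "15.9+" is read as 15.9.
charge_percentages = {
    "KIAA1841": (13.8, 14.4),
    "DUF3342": (13.2, 13.5),
    "N Terminus": (11.7, 14.2),
    "C Terminus": (15.9, 14.5),
}


def net_charge(positive_pct: float, negative_pct: float) -> float:
    """Net charge as positive minus negative percentage, rounded to one decimal."""
    return round(positive_pct - negative_pct, 1)


net_charges = {
    region: net_charge(pos, neg)
    for region, (pos, neg) in charge_percentages.items()
}

for region, value in net_charges.items():
    print(f"{region}: {value:+.1f}")
```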

Composition

The amino acids that make up KIAA1841 are evenly distributed. The percent composition of each amino acid is fairly consistent throughout the orthologs of the protein. The most distant ortholog, Leishmania infantum, displays the most variance in amino acid composition: it has a higher percent composition of alanine, histidine and leucine and a lower composition of lysine.

The protein sequence of KIAA1841 is neither rich nor poor in any amino acid. The same is true for Mus musculus, Danio rerio and Drosophila melanogaster, but not for the most distantly related ortholog (1). Leishmania infantum is rich in histidine. Humans and closely related orthologs are composed of 2.2% to 3.8% histidine, compared to 5% in Leishmania infantum.
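Percent composition figures such as the histidine values above can be obtained by counting residues in a protein sequence. The following Python sketch shows one way to do this, assuming the sequence is given as a string of one-letter amino acid codes. The function name and the toy sequence are hypothetical; the real KIAA1841 sequence is not reproduced here.

```python
from collections import Counter


def percent_composition(sequence: str) -> dict[str, float]:
    """Return the percentage of each one-letter residue in the sequence."""
    residues = sequence.upper()
    if not residues:
        return {}
    counts = Counter(residues)
    total = len(residues)
    return {residue: 100.0 * n / total for residue, n in counts.items()}


# Toy sequence (hypothetical), used only to show the calculation.
toy_sequence = "MHHAKLLAHK"
composition = percent_composition(toy_sequence)
print(f"Histidine: {composition.get('H', 0.0):.1f}%")
```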

Domains

The DUF3342 domain stretches from residue 147 to residue 449 of KIAA1841 and has a molecular weight of 35.7 kDa. The DUF domain is low in glycine (2%) and rich in cysteine (6.3%). Both of the non-polar stretches in the protein are located within the DUF domain, one at the beginning and one at the end.

The Pfam family pfam11822 corresponds to DUF3342, a domain of unknown function found in KIAA1841 (5). This family of proteins has yet to be functionally characterized and it is found in bacteria. This domain is usually between 170 and 303 amino acids in length. The N-terminal half of this protein family is a BTB-like domain. BTB domains are multifunctional protein-protein interaction motifs that are involved in a number of different cellular functions, including regulation of transcription, cytoskeleton dynamics, gating and assembly of ion channels, and ubiquitination of proteins. BTB domain structures are highly conserved and are found on proteins that have only one or two other types of domains (6).
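The domain coordinates and the stated length range can be related with a small calculation. Assuming the coordinates 147 and 449 are 1-based and inclusive, the domain covers 303 residues, which matches the upper end of the 170 to 303 amino acid range. The Python sketch below shows the arithmetic; the function and constant names are hypothetical.

```python
def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates and protein length as given in the text.
DUF3342_START = 147
DUF3342_END = 449
PROTEIN_LENGTH = 718

domain_length = inclusive_length(DUF3342_START, DUF3342_END)
domain_fraction = domain_length / PROTEIN_LENGTH

print(f"DUF3342 length: {domain_length} residues")
print(f"Share of the full protein: {domain_fraction:.1%}")
```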

Post-Translational Modifications

There are three regions of low complexity near the beginning of the protein. KIAA1841 is predicted to be highly phosphorylated after translation, with 50 predicted phosphorylation sites. There is one leucine-rich nuclear export signal toward the end of the protein. The protein is targeted to the nucleus. There is one sulfated tyrosine, a modification that strengthens protein-protein interactions. Two motifs with a high probability of being sumoylation sites were found. Sumoylation is involved in a number of cellular processes, including nuclear-cytosolic transport, transcriptional regulation and protein stability.

Secondary Structure

KIAA1841 is primarily composed of alpha helices and beta sheets. Alpha helices comprise the majority of the protein; this is true for the DUF domain and both termini. The DUF domain has slightly fewer beta sheets than the protein as a whole, and the C terminus has an even smaller proportion of beta sheets in its secondary structure.

| Property | KIAA1841 | DUF3342 | N Terminus | C Terminus |
|---|---|---|---|---|
| 2° Structure: α Helix (%) | 68 | 68 | 63.9 | 70.1 |
| 2° Structure: β Sheet (%) | 61 | 49.2 | 60.8 | 35.8 |
| 2° Structure: β Turn (%) | 14.1 | 9.9 | 12.8 | 15.4 |

3° structure

Paragraph about the tertiary structure

Subcellular Localization

Paragraph about subcellular localization

Interacting Proteins

TFs that might bind to regulatory sequence Proteins found in Y2H screens (developmental; functional) Selective (as in makes some sense) partners

Clinical Significance

Disease Association

Diseases associated with this gene are Crohn’s disease, celiac disease and inflammatory bowel disease.[11] [12]

Mutations

Paragraph about mutations


References

  1. "NCBI gene database". NCBI.
  2. "GEO profiles". NCBI GEO profiles.
  3. "EST profiles". NCBI EST profiles.
  4. "Ensembl". Vega.
  5. "Genecards". The Human Gene Database.
  6. "Aceview". NCBI.
  7. "Genecards". The Human Gene Database.
  8. "BLAST". NCBI.
  9. Hedges, SB. "TimeTree". Bioinformatics.
  10. "Biology Workbench". San Diego Supercomputer Center.
  11. "NCBI gene database". NCBI.
  12. "Genecards". The Human Gene Database.