This section is to be added to the multiple sequence alignment article.

Phylogeny-aware methods

 
Non-homologous exon alignment by an iterative method (a), and by a phylogeny-aware method (b)

Most multiple sequence alignment methods try to minimize the number of insertions/deletions (gaps) and, as a consequence, produce compact alignments. This causes several problems if the sequences to be aligned contain non-homologous regions, or if gaps are informative in a phylogenetic analysis. These problems are common in newly produced sequences that are poorly annotated and may contain frame-shifts, wrong domains or non-homologous spliced exons.
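The effect of gap minimization can be illustrated with a toy example. The sketch below is not taken from any of the tools discussed here: the three sequences, the sum-of-pairs scoring function and the score values (match +1, mismatch −1, gap −2) are all hypothetical choices made for illustration. Two sequences carry independent, non-homologous insertions at the same position. The compact alignment stacks the two insertions in the same columns, while the alternative alignment keeps them in separate columns.

```python
from itertools import combinations

# Hypothetical scores chosen for illustration only.
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of characters taken from a single alignment column."""
    if a == "-" and b == "-":
        return 0  # two gaps facing each other are not penalized
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(alignment):
    """Sum the pairwise scores over every column of the alignment."""
    return sum(
        pair_score(a, b)
        for column in zip(*alignment)
        for a, b in combinations(column, 2)
    )


def gap_count(alignment):
    """Count the gap characters in the whole alignment."""
    return sum(row.count("-") for row in alignment)


# Compact alignment: the TTT and GGG insertions share columns,
# although they are assumed here to be non-homologous.
compact = [
    "ACGTTTACG",
    "ACGGGGACG",
    "ACG---ACG",
]

# Alternative alignment: each insertion gets its own columns.
separate = [
    "ACGTTT---ACG",
    "ACG---GGGACG",
    "ACG------ACG",
]

for name, aln in (("compact", compact), ("separate", separate)):
    print(name, "gaps:", gap_count(aln), "score:", sum_of_pairs(aln))
```

Under these assumed scores the compact alignment has fewer gaps and a higher score, so a method that optimizes this kind of objective would prefer it, even though it pairs residues that do not share a common origin.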

The first phylogeny-aware method was developed by Löytynoja and Goldman in 2005.[1] The same authors released the software PRANK in 2008.[2] PRANK improves alignments when insertions are present. Nevertheless, it runs slowly compared to progressive and/or iterative methods, which have been developed over many years.

In 2012, two new phylogeny-aware tools appeared. One is PAGAN, developed by the same team as PRANK.[3] The other is ProGraphMSA, developed by Szalkowski.[4] The two were developed independently but share common features, notably the use of graph algorithms to improve the recognition of non-homologous regions, and code improvements that make them faster than PRANK.



References

  1. ^ Loytynoja, A.; Goldman, N. (2005). "An algorithm for progressive multiple alignment of sequences with insertions". Proceedings of the National Academy of Sciences. 102 (30): 10557–10562. doi:10.1073/pnas.0409137102.
  2. ^ Loytynoja, A.; Goldman, N. (2008). "Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis". Science. 320 (5883): 1632–1635. doi:10.1126/science.1158395. PMID 18566285.
  3. ^ Loytynoja, A.; Vilella, A. J.; Goldman, N. (2012). "Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm". Bioinformatics. 28 (13): 1684–1691. doi:10.1093/bioinformatics/bts198. PMC 3381962. PMID 22531217.
  4. ^ Szalkowski, A. M. (2012). "Fast and robust multiple sequence alignment with phylogeny-aware gap placement". BMC Bioinformatics. 13: 129. doi:10.1186/1471-2105-13-129. PMC 3495709. PMID 22694311.