List of gene prediction software

This is a list of software tools and web portals used for gene prediction.

| Name | Description | Species | References |
|---|---|---|---|
| FINDER | Automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences | Eukaryotes | [1] |
| FragGeneScan | Predicts genes in complete genomes and sequencing reads | Prokaryotes, Metagenomes | [2] |
| ATGpr | Identifies translational initiation sites in cDNA sequences | Human | [3] |
| Prodigal | Its name stands for Prokaryotic Dynamic Programming Genefinding Algorithm. It is based on log-likelihood functions and does not use hidden or interpolated Markov models. | Prokaryotes, Metagenomes (metaProdigal) | [4] |
| AUGUSTUS | Eukaryote gene predictor | Eukaryotes | [5] |
| BGF | Hidden Markov model (HMM) and dynamic programming based ab initio gene prediction program | | [6] |
| DIOGENES | Fast detection of coding regions in short genome sequences | | |
| Dragon Promoter Finder | Program to recognize vertebrate RNA polymerase II promoters | Vertebrates | [7] |
| EasyGene | Gene finder based on a hidden Markov model (HMM) that is automatically estimated for a new genome | Prokaryotes | [8][9] |
| EuGene | Integrative gene finding | Prokaryotes, Eukaryotes | [10][11] |
| FGENESH | HMM-based gene structure prediction: multiple genes, both chains | Eukaryotes | [12] |
| FrameD | Finds genes and frameshifts in G+C-rich prokaryote sequences | Prokaryotes, Eukaryotes | [13] |
| GeMoMa | Homology-based gene prediction based on amino acid and intron position conservation as well as RNA-Seq data | | [14][15] |
| GENIUS II | Links ORFs in complete genomes to protein 3D structures | Prokaryotes, Eukaryotes | [16] |
| geneid | Program to predict genes, exons, splice sites, and other signals along DNA sequences | Eukaryotes | [17] |
| GeneParser | Parses DNA sequences into introns and exons | Eukaryotes | [18] |
| GeneMark | Family of self-training gene prediction programs | Prokaryotes, Eukaryotes, Metagenomes | [19][20][21][22] |
| GeneTack | Predicts genes with frameshifts in prokaryote genomes | Prokaryotes | [23] |
| GenomeScan | Predicts the locations and exon-intron structures of genes in genome sequences from a variety of organisms; the GENSCAN server is GenomeScan's predecessor | Vertebrate, Arabidopsis, Maize | [24] |
| GENSCAN | Predicts the locations and exon-intron structures of genes in genome sequences from a variety of organisms | Vertebrate, Arabidopsis, Maize | [25][26][27] |
| GLIMMER | Finds genes in microbial DNA | Prokaryotes | [28][29][30] |
| GLIMMERHMM | Eukaryotic gene-finding system | Eukaryotes | [31] |
| GrailEXP | Predicts exons, genes, promoters, poly(A) sites, CpG islands, EST similarities, and repeat elements in DNA sequence | Human, Mus musculus, Arabidopsis thaliana, Drosophila melanogaster | [32][33] |
| mGene | Support-vector machine (SVM) based system to find genes | Eukaryotes | [34] |
| mGene.ngs | SVM-based system to find genes using heterogeneous information: RNA-seq, tiling arrays | Eukaryotes | [35] |
| MORGAN | Decision tree system to find genes in vertebrate DNA | Eukaryotes | [36] |
| BioNIX | Web tool to combine results from different programs: GRAIL, FEX, HEXON, MZEF, GENEMARK, GENEFINDER, FGENE, BLAST, POLYAH, REPEATMASKER, TRNASCAN | Prokaryotes, Eukaryotes | [37] |
| NNPP | Neural network promoter prediction | Prokaryotes, Eukaryotes | [38] |
| NNSPLICE | Neural network splice site prediction | Drosophila, Human | [39] |
| ORFfinder | Graphical analysis tool to find all open reading frames | Prokaryotes, Eukaryotes | [40] |
| Regulatory Sequence Analysis Tools | Series of modular computer programs to detect regulatory signals in non-coding sequences | Fungi, Prokaryotes, Metazoa, Protist, Plants | [41][42] |
| PHANOTATE | Tool to annotate phage genomes | Phages | [43] |
| SplicePredictor | Method to identify potential splice sites in (plant) pre-mRNA by sequence inspection using Bayesian statistical models | Eukaryotes | [44] |
| VEIL | Hidden Markov model to find genes in vertebrate DNA | Eukaryotes | [45] |
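Several entries above describe finding open reading frames (ORFs) as a building block of gene prediction. The following is a minimal Python sketch of a six-frame ORF scan, included only to illustrate the idea. It is not the algorithm of ORFfinder or of any other tool in this list, and the names `find_orfs`, `reverse_complement`, and `min_codons` are hypothetical. It assumes an uppercase or lowercase DNA string over A, C, G, T, the standard start codon ATG, and the three standard stop codons.

```python
# Illustrative six-frame ORF scan; an assumption-laden sketch, not the
# method used by any tool in the list above.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return a list of (strand, start, end, orf) tuples.

    start and end are 0-based, end-exclusive offsets on the scanned strand
    (for "-" they refer to the reverse complement). Each ORF runs from the
    first ATG in a frame to the next in-frame stop codon, stop included.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this reading frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Real gene finders go well beyond this: they score candidate regions statistically, handle alternative start codons, and, for eukaryotes, model introns and splice sites.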

References

  1. ^ Banerjee S, Bhandary P, Woodhouse M, Sen TZ, Wise RP, Andorf CM (Apr 2021). "FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences". BMC Bioinformatics. 22 (1): 205. doi:10.1186/s12859-021-04120-9. PMC 8056616. PMID 33879057.
  2. ^ Rho M, Tang H, Ye Y (November 2010). "FragGeneScan: predicting genes in short and error-prone reads". Nucleic Acids Research. 38 (20): e191. doi:10.1093/nar/gkq747. PMC 2978382. PMID 20805240.
  3. ^ Nishikawa, Tetsuo; Ota, Toshio; Isogai, Takao (2000-11-01). "Prediction whether a human cDNA sequence contains initiation codon by combining statistical information and similarity with protein sequences". Bioinformatics. 16 (11): 960–967. doi:10.1093/bioinformatics/16.11.960. ISSN 1367-4803. PMID 11159307.
  4. ^ Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ (March 2010). "Prodigal: prokaryotic gene recognition and translation initiation site identification". BMC Bioinformatics. 11: 119. doi:10.1186/1471-2105-11-119. PMC 2848648. PMID 20211023.
  5. ^ Keller O, Kollmar M, Stanke M, Waack S (March 2011). "A novel hybrid gene prediction method employing protein multiple sequence alignments". Bioinformatics. 27 (6): 757–63. doi:10.1093/bioinformatics/btr010. hdl:11858/00-001M-0000-0011-F244-D. PMID 21216780.
  6. ^ Li, Heng; Liu, Jin-Song; Xu, Zhao; Jin, Jiao; Fang, Lin; Gao, Lei; Li, Yu-Dong; Xing, Zi-Xing; Gao, Shao-Gen; Liu, Tao; Li, Hai-Hong (2005-07-01). "Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome". Journal of Computer Science and Technology. 20 (4): 446–453. doi:10.1007/s11390-005-0446-x. ISSN 1860-4749. S2CID 13497894.
  7. ^ Bajic, Vladimir B.; Seah, Seng Hong; Chong, Allen; Zhang, Guanglan; Koh, Judice L. Y.; Brusic, Vladimir (2002-01-01). "Dragon Promoter Finder: recognition of vertebrate RNA polymerase II promoters". Bioinformatics. 18 (1): 198–199. doi:10.1093/bioinformatics/18.1.198. ISSN 1367-4803. PMID 11836231.
  8. ^ Nielsen, P.; Krogh, A. (2005-12-15). "Large-scale prokaryotic gene prediction and comparison to genome annotation". Bioinformatics. 21 (24): 4322–4329. doi:10.1093/bioinformatics/bti701. ISSN 1367-4803. PMID 16249266.
  9. ^ Larsen, Thomas Schou; Krogh, Anders (2003-06-03). "EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance". BMC Bioinformatics. 4 (1): 21. doi:10.1186/1471-2105-4-21. ISSN 1471-2105. PMC 521197. PMID 12783628.
  10. ^ Foissac S, Gouzy J, Rombauts S, Mathé C, Amselem J, Sterck L, de Peer YV, Rouzé P, Schiex T (May 2008). "Genome annotation in plants and fungi: EuGene as a model platform". Current Bioinformatics. 3 (2): 87–97. doi:10.2174/157489308784340702.
  11. ^ Sallet, Erika; Gouzy, Jérôme; Schiex, Thomas (2019), Kollmar, Martin (ed.), "EuGene: An Automated Integrative Gene Finder for Eukaryotes and Prokaryotes", Gene Prediction: Methods and Protocols, Methods in Molecular Biology, vol. 1962, New York, NY: Springer, pp. 97–120, doi:10.1007/978-1-4939-9173-0_6, ISBN 978-1-4939-9173-0, PMID 31020556, S2CID 131776381, retrieved 2021-11-24
  12. ^ Salamov AA, Solovyev VV (April 2000). "Ab initio gene finding in Drosophila genomic DNA". Genome Research. 10 (4): 516–22. doi:10.1101/gr.10.4.516. PMC 310882. PMID 10779491.
  13. ^ Schiex T, Gouzy J, Moisan A, de Oliveira Y (July 2003). "FrameD: A flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences". Nucleic Acids Research. 31 (13): 3738–41. doi:10.1093/nar/gkg610. PMC 169016. PMID 12824407.
  14. ^ Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F (May 2016). "Using intron position conservation for homology-based gene prediction". Nucleic Acids Research. 44 (9): e89. doi:10.1093/nar/gkw092. PMC 4872089. PMID 26893356.
  15. ^ Keilwagen J, Hartung F, Paulini M, Twardziok SO, Grau J (May 2018). "Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi". BMC Bioinformatics. 19 (1): 189. doi:10.1186/s12859-018-2203-5. PMC 5975413. PMID 29843602.
  16. ^ Yabuki, Yukimitsu; Mukai, Yuri; Swindells, Mark B.; Suwa, Makiko (2004-03-01). "GENIUS II: a high-throughput database system for linking ORFs in complete genomes to known protein three-dimensional structures". Bioinformatics. 20 (4): 596–598. doi:10.1093/bioinformatics/btg478. ISSN 1367-4803. PMID 14751990.
  17. ^ Blanco, Enrique; Parra, Genís; Guigó, Roderic (June 2007), "Using geneid to Identify Genes", Current Protocols in Bioinformatics, Chapter 4, John Wiley & Sons, Inc.: 4.3.1–4.3.28, doi:10.1002/0471250953.bi0403s18, ISBN 978-0471250951, PMID 18428791
  18. ^ Snyder, Eric E.; Stormo, Gary D. (1995-04-21). "Identification of Protein Coding Regions In Genomic DNA". Journal of Molecular Biology. 248 (1): 1–18. doi:10.1006/jmbi.1995.0198. ISSN 0022-2836. PMID 7731036.
  19. ^ Lukashin AV, Borodovsky M (February 1998). "GeneMark.hmm: new solutions for gene finding". Nucleic Acids Research. 26 (4): 1107–15. doi:10.1093/nar/26.4.1107. PMC 147337. PMID 9461475.
  20. ^ Besemer J, Lomsadze A, Borodovsky M (June 2001). "GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions". Nucleic Acids Research. 29 (12): 2607–18. doi:10.1093/nar/29.12.2607. PMC 55746. PMID 11410670.
  21. ^ Lomsadze A, Burns PD, Borodovsky M (September 2014). "Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm". Nucleic Acids Research. 42 (15): e119. doi:10.1093/nar/gku557. PMC 4150757. PMID 24990371.
  22. ^ Zhu W, Lomsadze A, Borodovsky M (July 2010). "Ab initio gene identification in metagenomic sequences". Nucleic Acids Research. 38 (12): e132. doi:10.1093/nar/gkq275. PMC 2896542. PMID 20403810.
  23. ^ Antonov I, Borodovsky M (June 2010). "Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm". Journal of Bioinformatics and Computational Biology. 8 (3): 535–51. doi:10.1142/S0219720010004847. PMID 20556861.
  24. ^ Yeh, Ru-Fang; Lim, Lee P.; Burge, Christopher B. (2001-05-01). "Computational Inference of Homologous Gene Structures in the Human Genome". Genome Research. 11 (5): 803–816. doi:10.1101/gr.175701. ISSN 1088-9051. PMC 311055. PMID 11337476.
  25. ^ Burge, Chris; Karlin, Samuel (1997-04-25). "Prediction of complete gene structures in human genomic DNA". Journal of Molecular Biology. 268 (1): 78–94. doi:10.1006/jmbi.1997.0951. ISSN 0022-2836. PMID 9149143.
  26. ^ Burge, Christopher B. (1998-01-01), Salzberg, Steven L.; Searls, David B.; Kasif, Simon (eds.), "Chapter 8 - Modeling dependencies in pre-mRNA splicing signals", New Comprehensive Biochemistry, Computational Methods in Molecular Biology, vol. 32, Elsevier, pp. 129–164, doi:10.1016/S0167-7306(08)60465-2, ISBN 978-0-444-82875-0, retrieved 2021-11-24
  27. ^ Burge, Christopher B; Karlin, Samuel (1998-06-01). "Finding the genes in genomic DNA". Current Opinion in Structural Biology. 8 (3): 346–354. doi:10.1016/S0959-440X(98)80069-9. ISSN 0959-440X. PMID 9666331.
  28. ^ Delcher, Arthur L.; Bratke, Kirsten A.; Powers, Edwin C.; Salzberg, Steven L. (2007-01-19). "Identifying bacterial genes and endosymbiont DNA with Glimmer". Bioinformatics. 23 (6): 673–679. doi:10.1093/bioinformatics/btm009. ISSN 1460-2059. PMC 2387122. PMID 17237039.
  29. ^ Delcher, A. (1999-12-01). "Improved microbial gene identification with GLIMMER". Nucleic Acids Research. 27 (23): 4636–4641. doi:10.1093/nar/27.23.4636. ISSN 1362-4962. PMC 148753. PMID 10556321.
  30. ^ Salzberg, S. L.; Delcher, A. L.; Kasif, S.; White, O. (1998-01-01). "Microbial gene identification using interpolated Markov models". Nucleic Acids Research. 26 (2): 544–548. doi:10.1093/nar/26.2.544. ISSN 0305-1048. PMC 147303. PMID 9421513.
  31. ^ Majoros WH, Pertea M, Salzberg SL (November 2004). "TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders". Bioinformatics. 20 (16): 2878–9. doi:10.1093/bioinformatics/bth315. PMID 15145805.
  32. ^ Uberbacher, Edward C.; Hyatt, Doug; Shah, Manesh (2004). "GrailEXP and Genome Analysis Pipeline for Genome Annotation". Current Protocols in Bioinformatics. 8 (1): 4.9.1–4.9.15. doi:10.1002/0471250953.bi0409s04. ISSN 1934-340X. PMID 18428726.
  33. ^ Uberbacher, Edward C.; Hyatt, Doug; Shah, Manesh (2003). "GrailEXP and Genome Analysis Pipeline for Genome Annotation". Current Protocols in Human Genetics. 39 (1): 6.5.1–6.5.15. doi:10.1002/0471142905.hg0605s39. ISSN 1934-8258. PMID 18428363. S2CID 21431978.
  34. ^ Schweikert G, Zien A, Zeller G, Behr J, Dieterich C, Ong CS, et al. (November 2009). "mGene: accurate SVM-based gene finding with an application to nematode genomes". Genome Research. 19 (11): 2133–43. doi:10.1101/gr.090597.108. PMC 2775605. PMID 19564452.
  35. ^ Gan X, Stegle O, Behr J, Steffen JG, Drewe P, Hildebrand KL, et al. (August 2011). "Multiple reference genomes and transcriptomes for Arabidopsis thaliana". Nature. 477 (7365): 419–23. Bibcode:2011Natur.477..419G. doi:10.1038/nature10414. PMC 4856438. PMID 21874022.
  36. ^ "MORGAN". sites.stat.washington.edu. Retrieved 2021-11-24.
  37. ^ Bedő, Justin; Di Stefano, Leon; Papenfuss, Anthony T (November 2020). "Unifying package managers, workflow engines, and containers: Computational reproducibility with BioNix". GigaScience. 9 (11). doi:10.1093/gigascience/giaa121. ISSN 2047-217X. PMC 7672450. PMID 33205815.
  38. ^ Reese, Martin G (2001-12-01). "Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome". Computers & Chemistry. 26 (1): 51–56. doi:10.1016/S0097-8485(01)00099-7. ISSN 0097-8485. PMID 11765852.
  39. ^ Reese, Martin G.; Eeckman, Frank H.; Kulp, David; Haussler, David (1997-01-01). "Improved Splice Site Detection in Genie". Journal of Computational Biology. 4 (3): 311–323. doi:10.1089/cmb.1997.4.311. PMID 9278062.
  40. ^ "Home - ORFfinder - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-11-24.
  41. ^ Santana-Garcia, Walter; Rocha-Acevedo, Maria; Ramirez-Navarro, Lucia; Mbouamboua, Yvon; Thieffry, Denis; Thomas-Chollier, Morgane; Contreras-Moreira, Bruno; van Helden, Jacques; Medina-Rivera, Alejandra (2019-01-01). "RSAT variation-tools: An accessible and flexible framework to predict the impact of regulatory variants on transcription factor binding". Computational and Structural Biotechnology Journal. 17: 1415–1428. doi:10.1016/j.csbj.2019.09.009. ISSN 2001-0370. PMC 6906655. PMID 31871587.
  42. ^ Nguyen, Nga Thi Thuy; Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Santana-Garcia, Walter; Ossio, Raul; Robles-Espinoza, Carla Daniela; Bahin, Mathieu; Collombet, Samuel; Vincens, Pierre; Thieffry, Denis; van Helden, Jacques (2018-05-02). "RSAT 2018: regulatory sequence analysis tools 20th anniversary". Nucleic Acids Research. 46 (W1): W209–W214. doi:10.1093/nar/gky317. ISSN 0305-1048. PMC 6030903. PMID 29722874.
  43. ^ McNair, Katelyn; Zhou, Carol; Dinsdale, Elizabeth A.; Souza, Brian; Edwards, Robert A. (2019-11-01). "PHANOTATE: a novel approach to gene identification in phage genomes". Bioinformatics. 35 (22): 4537–4542. doi:10.1093/bioinformatics/btz265. ISSN 1367-4803. PMC 6853651. PMID 31329826.
  44. ^ Brendel, V.; Xing, L.; Zhu, W. (2004-02-05). "Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus". Bioinformatics. 20 (7): 1157–1169. doi:10.1093/bioinformatics/bth058. ISSN 1367-4803. PMID 14764557.
  45. ^ Henderson, John; Salzberg, Steven; Fasman, Kenneth H. (1997-01-01). "Finding Genes in DNA with a Hidden Markov Model". Journal of Computational Biology. 4 (2): 127–141. doi:10.1089/cmb.1997.4.127. hdl:1903/8004. PMID 9228612.