Online Encyclopedia Search Tool


Gene finding

Gene finding is the area of computational biology concerned with algorithmically identifying stretches of sequence that are functional (that code for proteins or have regulatory functions) and distinguishing them from non-coding or "junk" sequence.

In prokaryotes, this process is much easier because of the presence of specific promoter sequences and the absence of splicing mechanisms.
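Because there is no splicing, a prokaryotic coding region can be approximated as one contiguous open reading frame (ORF). The following is a minimal sketch of that idea, not a complete gene finder. The function name `find_orfs`, the `min_codons` threshold, and the restriction to ATG starts on the forward strand are assumptions made for illustration; promoter detection is not modelled at all.

```python
# Assumption for this sketch: ATG is the only start codon and the three
# standard stop codons end an ORF. Real prokaryotic gene finders also
# consider alternative start codons and the reverse strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons from the start codon up to, but not
    including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTGGGTAACC"):
        print(start, end, frame)
```

In practice the length threshold would be far larger than the toy value used here, since short ORFs occur by chance in any sequence.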

In eukaryotes, a variety of approaches have been attempted, and none has been entirely successful. The underlying proposition is that there are specific rules by which the DNA is transcribed and the transcript is read and translated. These rules can either be part of the DNA itself or be enforced in the cellular environment by signals that are themselves derived from the expression of the DNA. A major problem in identifying genes in eukaryotes is splicing, which often involves multiple and overlapping splice sites. Splice site identification is sometimes treated as a separate problem in computational biology.
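To illustrate why multiple and overlapping splice sites make the problem hard, the sketch below enumerates every candidate intron in a short sequence. It relies on an assumption not stated in the text above: the common convention that most introns begin with GT and end with AG. The function name `candidate_introns` and the `min_len` parameter are hypothetical, and a real splice site predictor would score the surrounding sequence rather than accept every GT/AG pair.

```python
def candidate_introns(seq, min_len=4):
    """List (start, end) spans that begin with GT and end with AG.

    start is 0-based and end is exclusive, so seq[start:end] is the
    candidate intron. Assumes the canonical GT...AG boundary convention.
    """
    seq = seq.upper()
    # Positions of every possible donor (GT) and acceptor (AG) dinucleotide.
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # Pair each donor with every acceptor far enough downstream.
    return [
        (d, a + 2)
        for d in donors
        for a in acceptors
        if a + 2 - d >= min_len
    ]


if __name__ == "__main__":
    seq = "AAGTCCAGTTAGCC"
    for start, end in candidate_introns(seq):
        print(start, end, seq[start:end])
```

Even this 14-base toy sequence yields three overlapping candidates, which suggests how quickly the number of possible exon and intron structures grows on a real gene.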


Gene finding approaches can broadly be classified into:

  • Linguistic approaches - these assume that the DNA sequence contains various semantic elements that can be pieced together like the words of a sentence. They use lexical analysis and grammar rules.
  • Pattern approaches - these assume that coding sequences, and possibly even specific protein families, contain specific patterns that can be expressed as regular expressions. These algorithms are often implemented as deterministic finite-state automata (DFA).
  • Statistical approaches - these assume that coding and non-coding sequences differ in their statistical properties. Some approaches look at the entropy of the sequences, while others look at specific nucleotide ratios. Some improve on the pattern approaches above by interpreting the pattern rules probabilistically rather than as exact matches. These are the hidden Markov model (HMM) based gene finders.
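The pattern and statistical approaches above can be sketched in a few lines. This is an illustration under stated assumptions, not any particular gene finder: the regular expression `ORF_PATTERN` is a toy pattern (ATG, whole codons, then a stop codon), the helper names `shannon_entropy` and `windowed_entropy` are hypothetical, and the window sizes are arbitrary. Note that Python's `re` module uses a backtracking matcher rather than a DFA, although this particular pattern could be compiled to one.

```python
import math
import re
from collections import Counter

# Pattern approach: a toy regular expression for "start codon, any number
# of whole codons, first in-frame stop codon" (lazy match).
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def shannon_entropy(seq):
    """Shannon entropy in bits of the nucleotide composition of seq."""
    counts = Counter(seq.upper())
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(seq, window=4, step=4):
    """Return (offset, entropy) for each window along the sequence."""
    return [
        (i, shannon_entropy(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    match = ORF_PATTERN.search("CCATGAAATTTGGGTAACC")
    if match:
        print("pattern match at", match.span())
    for offset, h in windowed_entropy("AAAAACGT"):
        print(offset, h)
```

A statistical gene finder would compare such per-window measures between regions, whereas an HMM-based finder replaces the exact pattern match with probabilities for each state and transition.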

External links

  • http://www.binf.ku.dk/users/krogh/genefinding.html
  • http://www.swbic.org/links/1.4.3.2.php
Last updated: 03-18-2005 11:16:12