40 Automation of an NSP-Based (Negative Selection Pattern) Gene Family Identification Strategy
Published: 2008
Expressed Sequence Tags (ESTs) are short nucleotide subsequences of expressed genes that can be generated rapidly and in large quantities. As sequencing technology improves, the number of ESTs available in open-access databases is increasing exponentially. A method of gene family identification that makes use of these data can be a valuable tool in genome analysis. We have previously demonstrated such a technique, which uses negative selection patterns (NSP) between family members to screen for potential paralogs among contigs assembled from ESTs (Frank et al., 2006; Frank et al., 2008). This strategy is now fully automated and has been tested on 10 gene families in Arabidopsis thaliana to see how the resulting putative paralogs compare with the actual gene sequences in this fully sequenced genome. The automation correctly identifies specific member genes in these families using only EST data. These results suggest that the automated strategy can identify many gene families in species where they are as yet undiscovered.
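As a rough illustration of what a negative selection pattern between two sequences might look like in code, the sketch below counts mismatches by codon position for a pair of aligned, in-frame coding sequences and flags the pair when most differences fall at the third position, where substitutions are most often synonymous. This is a simplified stand-in and not the procedure of Frank et al.: the function names, the gap-free in-frame alignment, and the 0.6 threshold are all assumptions made for the example.

```python
def mismatches_by_codon_position(seq_a, seq_b):
    """Count mismatches at codon positions 1, 2 and 3.

    Assumes (hypothetically) that both sequences are already aligned,
    gap-free, and start in reading frame.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    counts = [0, 0, 0]
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a != b:
            counts[i % 3] += 1  # i % 3 == 2 is the third codon position
    return counts


def shows_negative_selection_pattern(seq_a, seq_b, min_third_fraction=0.6):
    """Return True if third-position mismatches dominate.

    The 0.6 cutoff is an illustrative assumption, not a published value.
    Identical sequences return False, since they carry no signal.
    """
    counts = mismatches_by_codon_position(seq_a, seq_b)
    total = sum(counts)
    if total == 0:
        return False
    return counts[2] / total >= min_third_fraction


# Two hypothetical contig fragments differing only at third codon positions
contig_1 = "ATGGCTGCAAAA"
contig_2 = "ATGGCCGCGAAG"
print(mismatches_by_codon_position(contig_1, contig_2))
print(shows_negative_selection_pattern(contig_1, contig_2))
```

A real pipeline would also need EST assembly, frame detection, and an alignment step before a comparison of this kind; those are omitted here.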