
Synopsis

Most organisms have developed ways to recognize and interact with other species. Symbiotic interactions range from antagonism (pathogenesis) to altruism (mutualism). Some molecular mechanisms of interspecific interaction are well understood, but many remain to be discovered. Expressed sequence tags (ESTs) from cultures of interacting symbionts can help identify transcripts that regulate symbiosis, but they present a distinctive challenge for functional analysis: given a sequence expressed in an interaction between two symbionts, one must determine from which organism the transcript originated. High-throughput sequencing from interaction cultures requires a reliable computational approach. Previous investigations into GC nucleotide composition and comparative similarity searching provide provisional solutions, but can yield indeterminate results.

Here we use a comparative lexical analysis, based on a likelihood-ratio test of hexamer counts, which is more statistically powerful. The comparative lexical analysis has the following advantages over other approaches: it does not require that the full protein-coding content of both genomes be known for reasonable inferences to be made; it is sensitive to the biases in codon usage and GC content commonly observed when comparing taxa, but does not require knowledge of the reading frame for amino acid translation; unlike GC analysis, it establishes a clear threshold above or below which we can infer the species of origin; likelihood-ratio tests are known to maximize statistical power, that is, to minimize the false-negative rate at a given significance level, when formally testing hypotheses; and a confidence level can readily be assigned to each inference.
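To make the idea of a likelihood-ratio test on hexamer counts concrete, the following Python fragment is a minimal sketch under stated assumptions, not the implementation used in this work. It assumes that hexamer frequencies are estimated from training sequences of each candidate species with additive smoothing, that overlapping hexamers are counted in every frame (so no reading frame is needed), and that a log-likelihood ratio above a threshold of zero favors the first species. The function names, the pseudocount, and the zero threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import product

K = 6              # word length: hexamers
ALPHABET = "ACGT"


def hexamer_counts(seq):
    """Count overlapping hexamers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - K + 1):
        word = seq[i:i + K]
        if all(c in ALPHABET for c in word):
            counts[word] += 1
    return counts


def train(seqs, pseudocount=1.0):
    """Estimate log-probabilities for all 4**K hexamers from training sequences.

    Additive smoothing (an assumption of this sketch) keeps unseen hexamers
    from receiving zero probability.
    """
    counts = Counter()
    for s in seqs:
        counts.update(hexamer_counts(s))
    total = sum(counts.values()) + pseudocount * len(ALPHABET) ** K
    model = {}
    for letters in product(ALPHABET, repeat=K):
        word = "".join(letters)
        model[word] = math.log((counts[word] + pseudocount) / total)
    return model


def log_likelihood_ratio(query, model_a, model_b):
    """Sum over query hexamers of count * log(p_A(word) / p_B(word))."""
    return sum(n * (model_a[w] - model_b[w])
               for w, n in hexamer_counts(query).items())


def infer_origin(query, model_a, model_b, threshold=0.0):
    """Return ('A' or 'B', score); the zero threshold is illustrative only."""
    score = log_likelihood_ratio(query, model_a, model_b)
    return ("A" if score > threshold else "B"), score


if __name__ == "__main__":
    # Toy training data: a GC-rich "species A" and an AT-rich "species B".
    model_a = train(["GCGCGGCCGCGGCGCC" * 5])
    model_b = train(["ATATTAATTATAAATT" * 5])
    print(infer_origin("GCGCGGCCGC", model_a, model_b))
    print(infer_origin("ATATTAATTA", model_a, model_b))
```

In practice the training sets would be known coding sequences from each organism, and the threshold and confidence level would be calibrated from the distribution of the test statistic rather than fixed at zero.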

In the work described below, tests against genes whose origin and function are known yielded 94% accuracy. Non-plant transcripts comprised about 75% of a Phytophthora sojae-infected soybean (Glycine max cv. Harasoy) library, compared with less than 5% in root tissue libraries of Medicago truncatula from axenic, Phytophthora medicaginis-infected, and rhizobacterial treatments. Mycorrhizal libraries contained about 23% microbial transcripts; however, an axenic plant library serving as a negative control contained a similar proportion of putative microbial transcripts. Whether this indicates unreliable inferences, genes that have been transferred horizontally between genomes, or some other phenomenon remains to be determined in future work. Many of the transcripts isolated from mixed cultures were of unknown function, which suggests that they may be specific to symbiotic metabolism and are therefore promising candidates for further functional investigation.


Peter T. Hraber 2001-06-13