Most organisms have developed ways to recognize and interact with
other species. Symbiotic interactions range from antagonistic
(pathogenic) to altruistic (mutualistic). Some molecular mechanisms of
interspecific interaction are well understood, but many remain to be
discovered. Expressed sequence tags (ESTs) from cultures of
interacting symbionts can help identify transcripts that regulate
symbiosis, but present a unique challenge for functional analysis.
Given a sequence expressed in an interaction between two symbionts,
the challenge is to determine from which organism the transcript
originated. For high-throughput sequencing from interaction cultures,
a reliable computational approach is needed. Previous investigations
into GC nucleotide composition and comparative similarity searching
provide provisional solutions, but can yield indeterminate results.
Here we use a comparative lexical analysis, which is based on a
likelihood-ratio test of hexamer counts and is statistically more
powerful than either. The comparative lexical analysis has the following
advantages over other approaches: it does not require that the full
protein-coding content of both genomes be known for reasonable
inferences to be made; it is sensitive to biases in codon usage and GC
content commonly observed when comparing taxa, but does not require
knowledge of the reading frame for amino acid translation; unlike GC
analysis, it establishes a clear threshold above or below which we can
infer the species of origin; likelihood-ratio tests are known (by the
Neyman-Pearson lemma) to maximize statistical power, that is, to
minimize the false-negative rate at a given significance level, when
testing simple hypotheses; and a confidence level
for an inference can readily be assigned.
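To make the idea of a likelihood-ratio test on hexamer counts concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a training set of sequences of known origin for each organism, adds a pseudocount so that unseen hexamers do not produce a zero probability, and treats overlapping hexamers as independent draws, which is a simplification. All function names, the pseudocount value, and the threshold of zero are illustrative assumptions.

```python
from collections import Counter
from math import log

K = 6  # word length: hexamers, as in the text


def hexamer_counts(seqs, k=K):
    """Count overlapping k-mers over a collection of nucleotide sequences."""
    counts = Counter()
    for seq in seqs:
        s = seq.upper()
        for i in range(len(s) - k + 1):
            word = s[i:i + k]
            # Skip words containing ambiguous bases such as N.
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def log_likelihood_ratio(query, counts_a, counts_b, k=K, pseudocount=1.0):
    """Sum over the query's hexamers of log(P(word | A) / P(word | B)).

    Positive scores favour origin A, negative scores favour origin B.
    The pseudocount (an assumption of this sketch) smooths unseen words.
    """
    total_a = sum(counts_a.values()) + pseudocount * 4 ** k
    total_b = sum(counts_b.values()) + pseudocount * 4 ** k
    score = 0.0
    for word, n in hexamer_counts([query], k).items():
        p_a = (counts_a[word] + pseudocount) / total_a
        p_b = (counts_b[word] + pseudocount) / total_b
        score += n * log(p_a / p_b)
    return score


def infer_origin(query, counts_a, counts_b, threshold=0.0):
    """Return 'A' if the score exceeds the threshold, otherwise 'B'."""
    score = log_likelihood_ratio(query, counts_a, counts_b)
    return "A" if score > threshold else "B"


# Toy training data (hypothetical, chosen only to show the mechanics).
plant_counts = hexamer_counts(["ATATATATATATATATAT"])
microbe_counts = hexamer_counts(["GCGCGCGCGCGCGCGCGC"])
```

With these toy counts, `infer_origin("ATATATATAT", plant_counts, microbe_counts)` favours origin A and `infer_origin("GCGCGCGCGC", plant_counts, microbe_counts)` favours origin B. No reading frame is needed because every overlapping hexamer is counted, and the magnitude of the score could in principle be mapped to a confidence level.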
In the work described below, tests against genes whose origin and
function are known yielded 94% accuracy. Non-plant transcripts
comprised about 75% of a Phytophthora sojae-infected soybean
(Glycine max cv. Harasoy) library, contrasted with less than
5% in root tissue libraries of Medicago truncatula from
axenic, Phytophthora medicaginis-infected, and rhizobacterial
treatments. Mycorrhizal libraries contained about 23% microbial
transcripts; a negative control axenic plant library contained a
similar proportion of putative microbial transcripts. Whether this
indicates unreliable inferences, genes that have been transferred
horizontally between genomes, or some other phenomenon, remains to be
determined in future work. Many of the transcripts isolated from
mixed cultures were of unknown function, which suggests that they are
specific to symbiotic metabolism and are therefore promising candidates
for further functional investigation.
Peter T. Hraber
2001-06-13