Next: Base Composition Up: Methods Previous: Validation   Contents


Test Sequences


Table 2.2: Test sets. Number of EST sequences (n) and nucleotides (nt) in the raw, trimmed (limited lengths of N-rich regions, poly-A and poly-T sites), and screened (removed ribosomal, chloroplast, and mitochondrial DNA, and remaining sequences shorter than 300 nt) sets. Sequences were obtained from the library indicated in the ID column. A leading + marks the organism co-cultured with the species in the row above it. The abbreviation dpi stands for "days post-inoculation" and indicates a mixed plant-microbe culture.
LIBRARY                             RAW                TRIMMED            SCREENED
SPECIES          TISSUE      ID         n         nt       n         nt       n         nt
P. sojae         mycelia     MY       969    527,295     902    510,010     895    506,086
P. sojae         zoospores   ZO      1013    583,520     960    569,576     957    567,976
+G. max          2 dpi       HA       994    577,626     938    563,226     927    556,305
M. truncatula    root hairs  MtRHE    899    539,719     893    536,787     890    534,037
+G. versiforme   10-38 dpi   MHAM    3259  1,785,721    3030  1,735,390    3017  1,725,491
+P. medicaginis  10 dpi      DSIR    2462  1,324,815    2289  1,287,568    2284  1,282,518
M. truncatula    roots       KV0     2718  1,387,832    2550  1,351,137    2492  1,318,131
+S. meliloti     1 dpi       KV1     1125    562,452    1012    537,644    1003    531,813
+S. meliloti     2 dpi       KV2     1960    976,344    1732    926,953    1726    922,433
+S. meliloti     3 dpi       KV3     2375  1,316,430    2217  1,279,691    2173  1,251,795

To diagnose the species of origin for sequences expressed in symbiotic cultures, we collected sequences generated by distinct EST sequencing projects from the GenBank database [13,126,127]. Sequences from pathogenic interactions originated from cultures of a species of the genus Phytophthora with its plant host: P. sojae with soybean (G. max), isolated from inoculated hypocotyls two days after infection [93], and P. medicaginis with M. truncatula, isolated from infected roots ten days after infection (C. Vance, unpublished). Sequences expressed during mutualistic interactions were obtained from cultures of M. truncatula with mycorrhizal (Glomus versiforme; M.J. Harrison, unpublished) or rhizobacterial (S. meliloti; K. VandenBosch, unpublished) endosymbionts, sampled several days after inoculation. Sequences expressed in pure, axenic cultures of P. sojae mycelia and zoospores [93] and in sterile, uninoculated M. truncatula roots [34] provided negative controls.

To maximize the reliability of diagnostic comparisons, we screened test sequences by the same quality criteria applied to training sequences, and for low similarity to E. coli, chloroplast, and mitochondrial genes and to noncoding RNA transcripts (ribosomal and transfer RNAs). Independent BLASTN comparisons identified sequences with very high similarity ($E<10^{-100}$) to vector sequences or moderately high similarity ($E<10^{-20}$) to non-nuclear or noncoding sequences obtained from GenBank. Sequences so identified were withheld from analysis. Table 2.2 summarizes the test sequences. All test sequences obtained by the procedure described above are available as supplementary material.
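The screening rule described above can be illustrated with a minimal sketch. It is not the code used in this work. It assumes that BLASTN hits are available as tab-delimited lines in the standard 12-column tabular layout (query ID in the first column, E-value in the eleventh), and that sequence lengths after trimming are known. The function names, sequence IDs, and hit lines are hypothetical; only the two E-value thresholds and the 300 nt minimum come from the text and Table 2.2.

```python
# Hypothetical sketch of the screening step; not the original pipeline.
# Thresholds are taken from the text; everything else is an assumption.
VECTOR_MAX_E = 1e-100       # very high similarity to vector sequences
CONTAMINANT_MAX_E = 1e-20   # moderately high similarity to non-nuclear/noncoding
MIN_LENGTH = 300            # screened sequences shorter than this are removed


def flagged_queries(tabular_lines, max_evalue):
    """Return query IDs having at least one hit with E-value below max_evalue.

    Assumes 12-column tab-delimited BLAST output: column 1 is the query ID
    and column 11 is the E-value.
    """
    flagged = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue < max_evalue:
            flagged.add(query_id)
    return flagged


def screen(lengths, vector_hits, contaminant_hits):
    """Return sorted IDs of sequences that pass both screens and the length cutoff.

    lengths maps sequence ID to its trimmed length in nucleotides.
    """
    withheld = (flagged_queries(vector_hits, VECTOR_MAX_E)
                | flagged_queries(contaminant_hits, CONTAMINANT_MAX_E))
    return sorted(seq_id for seq_id, n in lengths.items()
                  if seq_id not in withheld and n >= MIN_LENGTH)


# Made-up example data, for illustration only.
lengths = {"est1": 520, "est2": 610, "est3": 450, "est4": 250}
vector_hits = [
    "est2\tvecA\t99.0\t400\t2\t0\t1\t400\t1\t400\t1e-150\t700",
    "est1\tvecA\t90.0\t60\t5\t0\t1\t60\t1\t60\t1e-10\t50",
]
contaminant_hits = [
    "est3\trRNA1\t95.0\t200\t8\t0\t1\t200\t1\t200\t1e-40\t300",
]
kept = screen(lengths, vector_hits, contaminant_hits)
```

In this made-up example only est1 is retained: its vector hit is too weak to meet the $E<10^{-100}$ threshold, whereas est2 and est3 are withheld by the vector and contaminant screens and est4 falls below 300 nt.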


Peter T. Hraber 2001-06-13