| LIBRARY | | | RAW | | TRIMMED | | SCREENED | |
| SPECIES | TISSUE | ID | n | nt | n | nt | n | nt |
| P. sojae | mycelia | MY | 969 | 527,295 | 902 | 510,010 | 895 | 506,086 |
| P. sojae | zoospores | ZO | 1013 | 583,520 | 960 | 569,576 | 957 | 567,976 |
| +G. max | 2 dpi | HA | 994 | 577,626 | 938 | 563,226 | 927 | 556,305 |
| M. truncatula | root hairs | MtRHE | 899 | 539,719 | 893 | 536,787 | 890 | 534,037 |
| +G. versiforme | 10-38 dpi | MHAM | 3259 | 1,785,721 | 3030 | 1,735,390 | 3017 | 1,725,491 |
| +P. medicaginis | 10 dpi | DSIR | 2462 | 1,324,815 | 2289 | 1,287,568 | 2284 | 1,282,518 |
| M. truncatula | roots | KV0 | 2718 | 1,387,832 | 2550 | 1,351,137 | 2492 | 1,318,131 |
| +S. meliloti | 1 dpi | KV1 | 1125 | 562,452 | 1012 | 537,644 | 1003 | 531,813 |
| +S. meliloti | 2 dpi | KV2 | 1960 | 976,344 | 1732 | 926,953 | 1726 | 922,433 |
| +S. meliloti | 3 dpi | KV3 | 2375 | 1,316,430 | 2217 | 1,279,691 | 2173 | 1,251,795 |
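The sequence counts in the table above can be summarized as the fraction of raw sequences that survive trimming and screening. The following is a minimal sketch of that arithmetic, not part of the original analysis; the counts are copied from the n columns of the table, and the names `counts` and `retention` are hypothetical.

```python
# (raw n, trimmed n, screened n) for each library ID, copied from the table.
counts = {
    "MY": (969, 902, 895),
    "ZO": (1013, 960, 957),
    "HA": (994, 938, 927),
    "MtRHE": (899, 893, 890),
    "MHAM": (3259, 3030, 3017),
    "DSIR": (2462, 2289, 2284),
    "KV0": (2718, 2550, 2492),
    "KV1": (1125, 1012, 1003),
    "KV2": (1960, 1732, 1726),
    "KV3": (2375, 2217, 2173),
}


def retention(raw, screened):
    """Fraction of raw sequences still present after screening."""
    return screened / raw


# Per-library retention, plus totals over all libraries.
rates = {lib: retention(raw, screened) for lib, (raw, _, screened) in counts.items()}
total_raw = sum(raw for raw, _, _ in counts.values())
total_screened = sum(screened for _, _, screened in counts.values())

for lib, rate in rates.items():
    print(f"{lib}: {rate:.1%} of raw sequences retained")
print(f"all libraries: {retention(total_raw, total_screened):.1%}")
```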
To diagnose the species of origin for sequences expressed in symbiotic cultures, we collected sequences generated by distinct EST sequencing projects from the GenBank database [13,126,127]. Sequences from pathogenic interactions originated from cultures of a Phytophthora species with its plant host: P. sojae with soybean (G. max), isolated from inoculated hypocotyls two days after infection [93], and P. medicaginis with M. truncatula, isolated from infected roots ten days after infection (C. Vance, unpublished). Sequences expressed during mutualistic interactions were obtained from cultures of M. truncatula with mycorrhizal (Glomus versiforme; M.J. Harrison, unpublished) or rhizobacterial (S. meliloti; K. VandenBosch, unpublished) endosymbionts, sampled 10-38 days and 1-3 days after inoculation, respectively. Sequences expressed in pure, axenic cultures of P. sojae mycelia and zoospores [93] and in sterile, uninoculated M. truncatula roots [34] provided negative controls.
To maximize the reliability of diagnostic comparisons, we screened
test sequences for high quality, as we did for training sequences, and
for low similarity to E. coli, chloroplast, and mitochondrial genes and
to noncoding RNA transcripts (ribosomal and transfer RNAs). Independent
BLASTN comparisons identified sequences having very high
similarity (
) to vector sequences or moderately high
similarity (
) to non-nuclear or non-coding sequences
obtained from GenBank. Sequences so identified were withheld from
analysis. A summary of test sequences appears in
Table 2.2. All test sequences obtained
using the procedure described above are available as supplementary
material.
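
The screening step can be sketched as a filter over BLASTN hit tables. The following Python sketch is illustrative only and is not the pipeline used in this study. It assumes that hits are reported in the BLAST+ tabular format (`-outfmt 6`) and that similarity is judged by E-value; both are assumptions. The cutoffs in the usage example are hypothetical placeholders, not the thresholds applied here, and the names `flagged_queries` and `screen` are likewise hypothetical.

```python
import csv
import io


def flagged_queries(blast_tabular, max_evalue):
    """Return query IDs with at least one hit whose E-value is <= max_evalue.

    blast_tabular is text in BLAST+ tabular format (-outfmt 6): 12
    tab-separated columns, query ID in column 1 and E-value in column 11.
    """
    flagged = set()
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            flagged.add(query_id)
    return flagged


def screen(sequence_ids, vector_hits, contaminant_hits,
           vector_cutoff, contaminant_cutoff):
    """Keep sequences not flagged by either independent comparison."""
    withheld = (flagged_queries(vector_hits, vector_cutoff)
                | flagged_queries(contaminant_hits, contaminant_cutoff))
    return [seq for seq in sequence_ids if seq not in withheld]


# Toy hit tables; every ID and value below is invented for illustration.
vector_hits = "est1\tvecA\t99.0\t200\t2\t0\t1\t200\t1\t200\t1e-100\t360\n"
contaminant_hits = (
    "est2\trRNA1\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-40\t170\n"
    "est3\tcp1\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.5\t30\n"
)

# Placeholder cutoffs: a stricter one for vector, a looser one for contaminants.
kept = screen(["est1", "est2", "est3", "est4"],
              vector_hits, contaminant_hits,
              vector_cutoff=1e-50, contaminant_cutoff=1e-10)
print(kept)
```

Running the two comparisons separately and taking the union of flagged sequences mirrors the independent BLASTN comparisons described above: a sequence is withheld if either comparison flags it.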