

Library Complementarity and Similarity

Some libraries were obtained from M. truncatula root tissue grown in pure culture, while others were prepared from infected root cultures. Because all libraries derive from the same living tissue, plant roots, we expect some redundancy in gene content among them. To quantify how closely the transcript composition of one library resembles that of another, we calculated the similarity between each pair of libraries A and B using the Jaccard similarity index $J_{AB}$ [32,88]. This similarity measure is related to the complementarity $C_{AB}$, the distinctness in composition between two libraries, by $J_{AB} = 1 - C_{AB}$.

The complementarity $C_{AB} = U_{AB} / S_{AB}$ is the ratio of the number of distinct transcripts unique to one library or the other ($U_{AB} = S_A + S_B - 2V_{AB}$) to the total diversity of both libraries combined ($S_{AB} = S_A + S_B - V_{AB}$), where $S_A$ and $S_B$ are the numbers of distinct transcripts in libraries A and B, and $V_{AB}$ is the number of distinct transcripts (or quasispecies) present in both libraries [32]. Transcripts present in both libraries were identified with a BLASTN search, in which B is the set of query sequences and A is the subject set.
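
The arithmetic above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function name `complementarity_and_similarity` and the example counts are hypothetical, and the counts $S_A$, $S_B$, and $V_{AB}$ are assumed to have been obtained already (for instance, from the BLASTN comparison).

```python
def complementarity_and_similarity(s_a, s_b, v_ab):
    """Return (C_AB, J_AB) from transcript counts.

    s_a, s_b : numbers of distinct transcripts in libraries A and B
    v_ab     : number of distinct transcripts present in both libraries
    """
    if v_ab > min(s_a, s_b):
        raise ValueError("shared count cannot exceed either library's count")
    u_ab = s_a + s_b - 2 * v_ab   # transcripts unique to one library or the other
    s_ab = s_a + s_b - v_ab       # total diversity of both libraries combined
    if s_ab == 0:
        raise ValueError("both libraries are empty")
    c_ab = u_ab / s_ab            # complementarity
    j_ab = 1.0 - c_ab             # Jaccard similarity, equal to v_ab / s_ab
    return c_ab, j_ab


# Hypothetical counts, chosen only to illustrate the calculation.
c_ab, j_ab = complementarity_and_similarity(s_a=120, s_b=80, v_ab=50)
print(f"C_AB = {c_ab:.3f}, J_AB = {j_ab:.3f}")
```

With these made-up counts, $U_{AB} = 100$ and $S_{AB} = 150$, so $C_{AB} = 2/3$ and $J_{AB} = 1/3 = V_{AB}/S_{AB}$.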

Thus, Jaccard similarity summarizes the proportion of distinct transcripts found in both libraries, $J_{AB} = V_{AB} / S_{AB}$. It varies from 0, when no transcripts are shared between the two libraries, to 1 (100%), when the two contain exactly the same quasispecies of transcripts. An advantage of the Jaccard index is that its complement, $C_{AB} = 1 - J_{AB}$, possesses the properties of a true distance metric [32]. Similarity averaged over all pairs of libraries can also be used as an index of beta diversity.
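
As a rough illustration of the averaging step, the sketch below computes the mean Jaccard similarity over all pairs of libraries. It assumes, hypothetically, that each library has been reduced to a set of transcript identifiers in which transcripts matched between libraries share the same identifier; the library names and identifiers are invented for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / items in the union."""
    union = a | b
    if not union:
        raise ValueError("both sets are empty")
    return len(a & b) / len(union)


def mean_pairwise_jaccard(libraries):
    """Average Jaccard similarity over all unordered pairs of libraries."""
    pairs = list(combinations(sorted(libraries), 2))
    if not pairs:
        raise ValueError("need at least two libraries")
    total = sum(jaccard(libraries[x], libraries[y]) for x, y in pairs)
    return total / len(pairs)


# Invented example data: three libraries of transcript identifiers.
libraries = {
    "lib1": {"t1", "t2", "t3"},
    "lib2": {"t2", "t3", "t4"},
    "lib3": {"t5"},
}
mean_similarity = mean_pairwise_jaccard(libraries)
print(f"mean pairwise Jaccard similarity = {mean_similarity:.3f}")
```

Here only lib1 and lib2 share transcripts (2 of 4 in their union), so the three pairwise similarities are 0.5, 0, and 0.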


Peter T. Hraber 2001-06-13