We validated the diversity estimators against four frequency distributions (Figure 3.2), chosen to model ecological distributions [76,89]. The four distributions came from four families (Poisson, exponential, log-normal, and negative binomial), each producing a long-tailed curve with many rare and a few common individuals [38,89].
The validation distributions were generated in R, version 1.1.1 [62], with the functions rpois, rexp, rlnorm, and rnbinom.
To make the continuous distributions discrete, decimal values were rounded down to the nearest integer (e.g., 0.999 becomes 0). The zero-abundance (null) class was excluded from the frequency calculations because it is not observable: it has no representatives, so it is not counted in diversity (cf. [38]). Sampling was from … as … samples, m = 20 replicates of n = 500 individuals, with replacement between samples.
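The discretization and replicate scheme described above can be sketched as follows. This is a minimal illustration in Python rather than the original R code, and it rests on several assumptions: the distribution parameters are hypothetical (the text does not give them), the helper names `poisson`, `abundance_frequencies`, and `replicate_frequencies` are invented for this example, and each replicate is assumed to be an independent draw of n values from the same distribution. Because the Python standard library has no Poisson or negative binomial generator, the sketch substitutes Knuth's multiplication method and a gamma-Poisson mixture for `rpois` and `rnbinom`.

```python
import math
import random
from collections import Counter


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


# Hypothetical parameters, chosen only for illustration.
SAMPLERS = {
    "poisson": lambda rng: poisson(rng, 2.0),
    "exponential": lambda rng: rng.expovariate(0.5),
    "lognormal": lambda rng: rng.lognormvariate(0.0, 1.0),
    # Negative binomial as a gamma-Poisson mixture (shape 1, scale 2).
    "negative_binomial": lambda rng: poisson(rng, rng.gammavariate(1.0, 2.0)),
}


def abundance_frequencies(values):
    """Round each value down, drop the zero class, and tally the remaining abundances."""
    abundances = (math.floor(v) for v in values)
    return Counter(a for a in abundances if a > 0)


def replicate_frequencies(name, m=20, n=500, seed=0):
    """Return m frequency tables, each built from n independent draws."""
    rng = random.Random(seed)
    draw = SAMPLERS[name]
    return [abundance_frequencies(draw(rng) for _ in range(n)) for _ in range(m)]
```

Calling `replicate_frequencies("lognormal")` returns 20 `Counter` objects that map each nonzero integer abundance to the number of draws that fell in it. A table's total can be smaller than 500 because the unobservable zero class is dropped after rounding down.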
The true underlying frequency distribution of transcripts in a cell is unknown, but expression assay results indicate that the general form of the distribution is likely to vary across cell types [74].