Quantitative methods for studying the mechanisms underlying the
diversity of immune receptors
(dissertation abstract)

An efficient immune response to pathogenic challenges is essential for the survival of the organism. Immunodeficiency syndromes, genetic or acquired, invariably have a fatal outcome. Why an organism can fight off one infectious disease but succumbs to another is not well understood. A basic paradigm in immunology is that the immune system achieves, a priori, broad coverage of the pathogenic universe by creating a very large number of different immune receptors. The sources of this diversity are:
    germline diversity - a relatively large number of genes encoding immune receptors, only one of which is used by any one B or T cell.
    combinatorial diversity - the variable part of immune receptors is assembled from a number of fragments that are randomly chosen from libraries. During this process, some variability is introduced at the junction between fragments, by base deletion or non-templated base addition.
    somatic diversity - through somatic mutation or gene conversion of the already rearranged receptor genes.
However, sequence analysis has shown that immune receptor genes within any one species are highly similar to one another. Moreover, in some organisms, such as sharks, variable region antibody genes are 90% or more similar at the sequence level, yet the survival of the species does not seem to be affected [Hinds:1993, Hohman:1995].

These findings give rise to a number of questions that I will attempt to address in my dissertation:
    1. Is the idea that the immune system realizes a uniform coverage of the complete pathogen space correct? Related to this is the question of how one would measure the "diversity" of immune receptors.
    2. What is the relative importance of germline versus somatic diversification?
    3. What mechanisms and dynamics operate in somatic hypermutation?

Attempts to quantify the diversity of immune receptors [Wu:1970] led to the characterization of framework regions (FR) and antigen-binding regions (CDR). However, that study did not take into account the degree of biochemical similarity between different amino acids. To address the question of whether the immune receptors are indeed "uniformly covering" some space, one would need to assess the degree of functional diversity, that is, the diversity of epitope-binding sites. I intend to use some of the well-established measures of similarity between amino acids to determine the diversity of V region antibody libraries in species for which these libraries have been sequenced.
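
As a point of reference, the per-position variability measure associated with [Wu:1970] (number of different amino acids observed at a position, divided by the frequency of the most common one) can be sketched as below. This is only a minimal illustration: the function names and the toy alignment are hypothetical, gaps are not handled, and every amino acid difference counts equally, which is exactly the limitation noted above.

```python
from collections import Counter


def wu_kabat_variability(column):
    """Variability of one alignment column:
    (number of distinct residues) / (frequency of the most common residue)."""
    counts = Counter(column)
    most_common_freq = max(counts.values()) / len(column)
    return len(counts) / most_common_freq


def variability_profile(aligned_seqs):
    """Per-position variability for equal-length, already aligned sequences."""
    return [wu_kabat_variability(col) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment (not real antibody data).
toy_alignment = ["ACDK", "ACEK", "ACFR", "ACGK"]
profile = variability_profile(toy_alignment)
```

A similarity-aware measure would replace the count of distinct residues with a quantity weighted by an amino acid similarity matrix; which matrix to use is left open here.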

Somatic diversification through targeted mutation of immune receptor genes during an ongoing immune response has been documented in all the species that have an immune system. The importance of this mechanism is not known, however, as the number of mutations introduced into these genes varies widely between species.
Using a highly abstract model of immune system interaction with pathogens, we showed [Oprea:1998] that increasing the number of genes in the germline improves the recognition of a random pathogen only very slowly when the size of the pathogen universe is much larger than the number of genes available. This result argues that somatic mutation might indeed be a very important second-level mechanism for ensuring pathogen space coverage. It also raises the interesting possibility that gene rearrangement increases the achievable fitness not only through the combinatorial nature of this process, but also by opening the opportunity for deletion and non-templated addition of nucleotides at the junction between gene fragments.
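
The qualitative effect can be explored with a toy simulation such as the one below. It is not the model of [Oprea:1998]: the bitstring representation, the match score (number of agreeing bits), the string length, and all function names are assumptions made purely for illustration. Libraries of increasing size are nested, so the best match can only stay the same or improve as genes are added.

```python
import random


def match_score(receptor, pathogen):
    """Number of bit positions at which receptor and pathogen agree."""
    return sum(r == p for r, p in zip(receptor, pathogen))


def random_bits(rng, length):
    """A random bitstring, represented as a tuple of 0/1 values."""
    return tuple(rng.randint(0, 1) for _ in range(length))


def mean_best_match(library_sizes, length=32, n_pathogens=200, seed=0):
    """Mean best match score against random pathogens for each library size."""
    rng = random.Random(seed)
    library = [random_bits(rng, length) for _ in range(max(library_sizes))]
    pathogens = [random_bits(rng, length) for _ in range(n_pathogens)]
    result = {}
    for size in library_sizes:
        genes = library[:size]  # nested libraries: smaller ones are prefixes
        total = sum(max(match_score(g, p) for g in genes) for p in pathogens)
        result[size] = total / n_pathogens
    return result


sizes = [1, 4, 16, 64, 256]
scores = mean_best_match(sizes)
```

Comparing `scores` across the entries of `sizes` shows how much each fourfold increase in library size adds to the mean best match under these assumptions.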

If somatic mutation is indeed an important mechanism underlying the diversity of immune receptors, and given the paradigm that most mutations are deleterious, we would expect to observe traces of genetic pressure on the mutability of immune receptors. Specifically, we would expect the antigen-binding regions to be "strategically" placed in sequence space, so that they can readily diversify once the mutation mechanism operates. We would also expect framework regions to be stable under somatic mutation, so as to reduce the production of non-functional receptors. We document this effect for immunoglobulin genes in a number of species. We show that codon usage bias is responsible for the low mutability of framework regions, and that codon usage bias, together with bias in amino acid composition, accounts for the high mutability of antigen-binding regions. In sheep, a species that relies heavily on somatic mutation for diversification of the V lambda chain, the difference between FR and CDR mutability is especially pronounced. Regarding the mechanism of somatic hypermutation, we show that a set of 150 unrelated genes from the human genome also seems to have a codon usage bias that would give them lower than average mutability under somatic hypermutation. Here "average" refers to the replacement mutability per nucleotide site that these genes would have, given the amino acid sequences they code for, if codon usage were unbiased for all amino acids. A cautious interpretation of this result is that somatic hypermutation seems to exploit a codon usage bias already present in the genome.
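
The comparison between observed codon usage and an unbiased baseline can be illustrated with the sketch below. It rests on a strong simplifying assumption that is not taken from the thesis: every single-nucleotide change is treated as equally likely, and a codon's replacement mutability is simply the fraction of its nine single-nucleotide neighbors that encode a different amino acid (or a stop). The abstract does not specify the mutability model, so the function names and the numbers are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def replacement_fraction(codon):
    """Fraction of the 9 single-nucleotide changes that alter the encoded
    amino acid (changes to a stop codon count as replacements)."""
    changed = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] != CODON_TABLE[codon]:
                changed += 1
    return changed / 9


def observed_and_expected(coding_seq):
    """Mean replacement fraction for the codons actually used, and for the
    same amino acid sequence with all synonymous codons equally likely."""
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq) - 2, 3)]
    observed = sum(replacement_fraction(c) for c in codons) / len(codons)
    expected = 0.0
    for c in codons:
        synonyms = [s for s, aa in CODON_TABLE.items() if aa == CODON_TABLE[c]]
        expected += sum(replacement_fraction(s) for s in synonyms) / len(synonyms)
    return observed, expected / len(codons)


# Two leucine codons, both CTG (hypothetical input, not a real gene).
obs, exp = observed_and_expected("CTGCTG")
```

For leucine, the six synonymous codons differ in how many of their point mutations change the amino acid, so a gene that prefers CTG has a lower replacement fraction than the unbiased baseline for the same protein sequence.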

Estimating mutation rates from mutant distributions would be particularly important in assessing the effects of various genetic manipulations (deletion of the immunoglobulin promoter, knocking in a non-immunoglobulin gene as a target for somatic mutation, etc.) on the mutation rate. In the last section of my thesis, I will explore the possibility of estimating the mutation rate in a number of experimental systems, ranging from exponentially growing bacterial cultures to cultures of eukaryotic cells. In particular, I investigate the effect of the cell cycle time distribution on the mutation rate estimates obtained from fluctuation analysis experiments [Luria:1943]. We introduce a method that improves on the classical Luria-Delbruck distribution-based technique for estimating the mutation rate in exponentially growing bacterial cultures. We discuss a number of generalizations involving cell death or sampling, and introduce a tentative method for estimating the mutation rate from the distribution of mutations in passenger genes.
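
For orientation, the simplest classical estimator used in fluctuation analysis, the "P0" method, can be sketched as follows. It is shown only as a baseline and is not the improved method referred to above. It assumes that the number of mutational events per culture is Poisson distributed, so that the fraction of cultures with no mutants estimates exp(-m); dividing m by the final number of cells per culture is one common convention for obtaining a rate. The function name and the toy counts are hypothetical.

```python
import math


def p0_mutation_rate(mutant_counts, final_cells_per_culture):
    """Classical P0 estimate from parallel cultures.

    Returns (m, rate), where m is the expected number of mutational events
    per culture and rate = m / final_cells_per_culture."""
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for count in mutant_counts if count == 0)
    if n_zero == 0:
        raise ValueError("P0 method needs at least one culture without mutants")
    p0 = n_zero / n_cultures
    m = -math.log(p0)  # Poisson: P(no mutation events) = exp(-m)
    return m, m / final_cells_per_culture


# Hypothetical mutant counts from 10 parallel cultures of 1e8 cells each.
toy_counts = [0, 0, 1, 0, 5, 0, 2, 0, 0, 12]
m, rate = p0_mutation_rate(toy_counts, 1e8)
```

The estimator uses only the presence or absence of mutants in each culture and ignores the rest of the mutant distribution, which is one reason methods based on the full distribution are of interest.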

REFERENCES

[Hinds:1993] Hinds-Frey, K.R., Nishikata, H., Litman, R.T. and Litman, G.W. Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. Journal of Experimental Medicine 178:815-824 (1993).
[Hohman:1995] Hohman, V.S., Schluter, S.F. and Marchalonis, J.J. Diversity of Ig light chain cluster in the Sandbar Shark (Carcharhinus plumbeus). Journal of Immunology 155:3922-3928 (1995).
[Wu:1970] Wu, T.T. and Kabat, E.A. An analysis of the sequences of the variable regions of the Bence-Jones proteins and myeloma light chains and their implications for antibody complementarity. Journal of Experimental Medicine 132:211-250 (1970).
[Oprea:1998] Oprea, M. and Forrest, S. Simulated evolution of antibody libraries under pathogen selection. Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics, San Diego, CA.
[Luria:1943] Luria, S.E. and Delbruck, M. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28:491-511 (1943).