I think I have reached an understanding of the role of germline diversity and somatic mutation in generating the immune repertoire. However, there are still a number of related problems that I am currently working on, or hope to work on in the near future. The most concrete project concerns implementing the mutation rate estimation methods that I introduced only conceptually in my dissertation. Other issues could be pursued on the basis of the work presented here. We parametrized the Luria-Delbrück distribution using a parameter that we determined empirically, by fitting to simulation data. We do not have an analytical form for this parameter, and it would be of great interest to find one. Moreover, for the case of gamma-distributed cell cycle times, we do not even have a distribution based on an empirically determined parameter.
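To make concrete the kind of simulation data to which such a parameter can be fitted, the following is a minimal sketch of a Luria-Delbrück fluctuation experiment. It is not the simulation used in this dissertation: it assumes synchronous divisions (fixed cell cycle times), a single wild-type founder cell, and a per-daughter mutation probability `mu`. The function names `simulate_culture` and `mutant_distribution` are hypothetical.

```python
import random


def simulate_culture(generations, mu, rng):
    """Grow one culture from a single wild-type cell and return its mutant count.

    Assumptions (illustrative only): all cells divide synchronously, each
    daughter of a wild-type cell mutates independently with probability mu,
    and mutants never revert.
    """
    wild, mutants = 1, 0
    for _ in range(generations):
        daughters = 2 * wild
        # Each wild-type daughter is an independent mutation opportunity.
        new_mutants = sum(1 for _ in range(daughters) if rng.random() < mu)
        wild = daughters - new_mutants
        # Existing mutants double; new mutants are added after the division.
        mutants = 2 * mutants + new_mutants
    return mutants


def mutant_distribution(n_cultures, generations, mu, seed=0):
    """Return the mutant counts of n_cultures independent parallel cultures."""
    rng = random.Random(seed)
    return [simulate_culture(generations, mu, rng) for _ in range(n_cultures)]


if __name__ == "__main__":
    counts = mutant_distribution(n_cultures=200, generations=10, mu=1e-3)
    mean = sum(counts) / len(counts)
    zero_fraction = counts.count(0) / len(counts)
    print("mean mutants per culture:", mean)
    print("fraction of cultures without mutants:", zero_fraction)
```

An empirical histogram of `counts` is the kind of object to which a parametrized distribution could be fitted. Replacing the synchronous division step with cell cycle times drawn from a gamma distribution would require tracking individual cells, which this sketch does not attempt.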
The finding that non-immunoglobulin sequences share optimization features with respect to somatic hypermutation raises the exciting possibility that somatic hypermutation is derived from more general mutation mechanisms that operate across the genome. My analyses have not revealed the nature of these mechanisms. I do, however, have a starting point for that search: the observation that somatic mutation targets A/T nucleotides. The next step is to look into which mutation or repair mechanisms might share such a bias. One candidate is the single-base mismatch repair that seems to preserve G nucleotides (1998). I also intend to determine the sequence specificity of the germline mutation mechanism and compare it directly to the specificity of somatic hypermutation.
In my analysis of the properties of the germline-encoded repertoire, I concluded that "sticky" antibodies are a good anticipative strategy. Antibodies of this type have indeed been described, especially in neonatal immune systems. However, they pose the problem that they bind not only pathogens but also self structures. We could now introduce a set of self molecules and require that the repertoire not bind these molecules. It would be very interesting to find out what type of antibodies would emerge under these conditions.
Finally, there are processes that take place at the gene level and that might constrain the learning capacity of the immune system. These have not been taken into account by the models that I described in this dissertation. They are, however, important issues to consider if one is to understand the dynamics of the immune repertoire. The generation of the antibody repertoire in neonates seems to be much more deterministic than in adults. This justified, in part, my analysis of the "germline-encoded" antibodies. These antibodies are characterized by the lack of non-templated nucleotide additions and by preferential V-D, D-J, and V-J associations. It is believed that short regions of homology at the ends of the rearranging fragments are responsible for constraining the rearrangement process. I intend to build a mechanistic model of the rearrangement process, which I can use to test this hypothesis.
Gene conversion introduces sampling dynamics at the level of genes. It would be interesting to know how much germline diversity can be maintained with reasonable values of the parameters that determine the dynamics of gene conversion. In certain species, such as chicken and rabbit, gene conversion is used as a primary diversification mechanism, responsible for creating the naive repertoire. The donor genes are generally pseudo-genes; that is, they cannot, by themselves, generate functional immunoglobulins. It would be interesting to know what type of dynamics these genes have, given that they are constrained by the interaction with the acceptor gene. The mechanism of gene conversion is also not known. Using gene conversion data, one could attempt to infer what this mechanism might be, in a way similar to my attempt to infer the nature of the somatic hypermutation mechanism.
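As a rough illustration of the question of how much germline diversity gene conversion allows, the sketch below simulates a toy gene family. It is only a sketch under strong assumptions that are not taken from this dissertation: a conversion event copies an entire donor gene over an acceptor gene (real conversion tracts are partial), every mutation creates a new allele, and selection is ignored. The function `simulate_gene_family` and its parameters `conversion_rate` and `mutation_rate` are hypothetical.

```python
import random


def simulate_gene_family(n_genes, conversion_rate, mutation_rate,
                         generations, seed=0):
    """Return the number of distinct alleles left in a toy gene family.

    Assumptions (illustrative only): at most one conversion event per
    generation, occurring with probability conversion_rate, in which a
    random donor overwrites a random acceptor; each gene independently
    mutates to a brand-new allele with probability mutation_rate.
    Requires n_genes >= 2.
    """
    rng = random.Random(seed)
    genes = list(range(n_genes))  # start with all genes distinct
    next_label = n_genes          # label for the next new allele
    for _ in range(generations):
        if rng.random() < conversion_rate:
            donor, acceptor = rng.sample(range(n_genes), 2)
            genes[acceptor] = genes[donor]  # conversion removes one variant
        for i in range(n_genes):
            if rng.random() < mutation_rate:
                genes[i] = next_label       # mutation adds a new variant
                next_label += 1
    return len(set(genes))


if __name__ == "__main__":
    for rate in (0.0, 0.1, 1.0):
        kept = simulate_gene_family(n_genes=20, conversion_rate=rate,
                                    mutation_rate=1e-3, generations=5000)
        print("conversion rate", rate, "-> distinct alleles:", kept)
```

Scanning `conversion_rate` against `mutation_rate` in such a model gives a first impression of the balance between homogenization and diversification, although any quantitative conclusion would require a model with realistic conversion tracts and selection on the acceptor gene.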