Distance in Probability Measure Space





In the section above, the notion of distance in measure space was used informally. Here we make this notion more precise and use it to discuss further the accuracy of Markov approximation. There are many ways to define a distance between probability measures. Ideally, one would compare two measures on the basis of the probabilities they assign to blocks of all sizes. This, however, is impractical. Since the measures we discuss are 1- and 2-step Markov measures, we compare them on the basis of the 1- and 2-block probabilities that define them. One of the most straightforward distances is the following. Let $\mu$ and $\nu$ be measures, and let $|\cdot|$ denote absolute value. Now define the metrics $d_n(\mu,\nu) = \frac{1}{2}\sum_{B} |\mu(B) - \nu(B)|$, where the sum is taken over all n-blocks B, n=1 or 2. The maximum distance between any two measures under these metrics is 1, independent of n.
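As a minimal sketch of such a block-based distance, the following Python function sums the absolute differences of the probabilities that two measures assign to every n-block. The factor of 1/2 is an assumption, chosen so that the largest possible distance is 1 as stated above. The function name `block_distance`, the dictionary representation of a measure, and the example measures are hypothetical and serve only as illustration.

```python
from itertools import product

def block_distance(mu, nu, n, alphabet=(0, 1)):
    """Half the summed absolute difference of two measures over all n-blocks.

    mu and nu map n-blocks (tuples of symbols) to probabilities; blocks
    missing from a dictionary are treated as having probability 0.
    The factor 0.5 is an assumed normalization giving a maximum of 1.
    """
    blocks = product(alphabet, repeat=n)
    return 0.5 * sum(abs(mu.get(b, 0.0) - nu.get(b, 0.0)) for b in blocks)

# Two hypothetical 1-block measures on a binary alphabet.
mu1 = {(0,): 0.5, (1,): 0.5}
nu1 = {(0,): 0.25, (1,): 0.75}
d1 = block_distance(mu1, nu1, n=1)

# Two hypothetical 2-block measures with disjoint support,
# which are as far apart as two measures can be.
mu2 = {(0, 0): 1.0}
nu2 = {(1, 1): 1.0}
d2 = block_distance(mu2, nu2, n=2)

print(d1, d2)
```

With this normalization, measures with disjoint support are at distance 1 for both n=1 and n=2, and identical measures are at distance 0.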

Figure 4 shows the distribution of the 1-block and 2-block distances between the empirical class invariant measure and the fixed-point measure of the Markov equations that define the class. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1. Orders of approximation 0-2 are shown. Increasing dash length corresponds to increasing order of approximation.

  
Figure 4: Distribution of the 1-block and 2-block distances between the empirical class invariant measure and the fixed-point measure of the Markov equations that define the class. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1. Orders of approximation 0-2 are shown. Increasing dash length corresponds to increasing order of approximation.
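The weighting and normalization described in the caption can be sketched as follows. This is only an illustration under stated assumptions: the function name `weighted_distribution`, the bin count, the histogram range, and the sample distances and class sizes are all hypothetical and are not taken from the data behind the figure.

```python
def weighted_distribution(distances, weights, n_bins=20, max_distance=1.0):
    """Histogram of per-class distances, weighted and normalized.

    Each class contributes its weight (the number of rules it contains)
    to the bin holding its distance; the heights are then divided by
    their total so that they sum to 1.
    """
    heights = [0.0] * n_bins
    for d, w in zip(distances, weights):
        # Clamp to the last bin so that d == max_distance stays in range.
        i = min(int(d / max_distance * n_bins), n_bins - 1)
        heights[i] += w
    total = sum(heights)
    if total == 0:
        raise ValueError("at least one class with positive weight is required")
    return [h / total for h in heights]

# Hypothetical per-class distances and class sizes (illustrative only).
class_distances = [0.01, 0.03, 0.04, 0.22]
rules_per_class = [8, 4, 2, 2]

hist = weighted_distribution(class_distances, rules_per_class,
                             n_bins=10, max_distance=0.5)
print(hist)
```

Here three of the four classes fall into the first bin, which therefore carries 14 of the 16 rules, while the class at distance 0.22 forms a small tail.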

Under both the 1-block and 2-block metrics, the typical distance from the empirical to the theoretical invariant measure is less than 0.05. These distances are small compared to the maximum distance of 1. They are not insignificant, however, compared to the empirical convergence tolerance of 0.002 (see section xx.x). As order increases, the distributions peak at smaller values of the distance. At the same time, the tails of the distributions become longer. This reflects the fact, also shown in figure 3, that although accuracy typically improves with order, for some classes of rules prediction actually deteriorates as order increases. This phenomenon has been observed previously [5][3] in studies of individual cellular automaton rules.

Rules in a given class have a distribution of invariant statistical properties. The results of figure 4 indicate that this distribution is typically centered near the fixed-point measure of the system of equations that defines the class. Rules within a class vary according to structure not specified by the system of equations. These differences tend to cancel each other when averages over a class are taken. Consider a process in which, at each time step, a measure is stepped forward by a rule selected at random from a class. One may anticipate that (in the limit of large class size) the invariant measure of such a process should be close to the fixed-point measure of the system of equations that defines the class.
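The random-rule process can be explored empirically with a short simulation. The sketch below rests on several assumptions that are not taken from the text: it uses elementary (radius-1, binary) cellular automata with Wolfram rule numbering and periodic boundaries, it steps a single finite configuration rather than a measure, and `hypothetical_class` is an arbitrary list of rule numbers rather than a class defined by a shared system of Markov equations. It estimates the 1- and 2-block probabilities of the configuration reached after many random-rule steps.

```python
import random
from collections import Counter

def step(config, rule):
    """Apply one elementary CA rule (Wolfram number) with periodic boundaries."""
    n = len(config)
    return [
        (rule >> (4 * config[i - 1] + 2 * config[i] + config[(i + 1) % n])) & 1
        for i in range(n)
    ]

def block_frequencies(config, n):
    """Empirical n-block probabilities of a periodic configuration."""
    size = len(config)
    counts = Counter(
        tuple(config[(i + k) % size] for k in range(n)) for i in range(size)
    )
    return {block: c / size for block, c in counts.items()}

def random_rule_process(rule_class, size=2000, steps=200, seed=0):
    """Step a random configuration with a rule drawn at random at each step."""
    rng = random.Random(seed)
    config = [rng.randint(0, 1) for _ in range(size)]
    for _ in range(steps):
        config = step(config, rng.choice(rule_class))
    return block_frequencies(config, 1), block_frequencies(config, 2)

# Arbitrary illustrative set of rules, NOT a class from the text.
hypothetical_class = [22, 54, 90, 150]
p1, p2 = random_rule_process(hypothetical_class)
print(p1)
print(p2)
```

The resulting block probabilities could then be compared, using a block-based distance, with the fixed-point measure of the equations defining a class, once such a class and its equations are specified.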

Since appears to be somewhat better than at resolving differences between behavior of the various orders of Markov approximation, will be used exclusively in the following.







Thu Nov 10 12:16:46 GMT 1994