In the section above, the notion of distance in measure space was
used informally. Here we will make this notion more precise, and
use distance in measure space to further discuss the accuracy of
Markov approximation. There are many ways to define a distance between
probability measures.
Ideally, one would
compare two measures on the basis of their assignment of probability to
blocks of all sizes. This, however, is impractical. Since the measures
we will discuss are 1- and 2-step Markov measures, we opt to compare these
measures on the basis of the 1- and 2-block probabilities which define them.
One of the most straightforward distances
is defined as follows.
Let $\mu$ and $\nu$ be measures, and let $|\cdot|$ denote absolute value.
Now define the metrics
$$d_n(\mu,\nu) = \frac{1}{2} \sum_{B} \left| \mu(B) - \nu(B) \right|,$$
where the sum is taken over all n-blocks B, n=1 or 2.
The maximum distance between any two measures under these metrics is 1,
independent of n.
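A minimal Python sketch of such a block-probability distance is given here for concreteness. It assumes the distance is half the sum of absolute differences of n-block probabilities; the factor of one half is an assumption, chosen so that the maximum distance is 1 as stated. The function name `block_distance` and the dictionary representation of a measure are illustrative choices, not taken from the original.

```python
from itertools import product


def block_distance(mu, nu, n, alphabet=(0, 1)):
    """Half the summed absolute difference of n-block probabilities.

    mu and nu map n-blocks (tuples of symbols) to probabilities;
    blocks missing from a dictionary are treated as probability 0.
    The factor 0.5 is an assumption that makes the maximum distance 1.
    """
    blocks = product(alphabet, repeat=n)
    return 0.5 * sum(abs(mu.get(b, 0.0) - nu.get(b, 0.0)) for b in blocks)


# Two 1-block measures concentrated on different symbols.
mu1 = {(0,): 1.0}
nu1 = {(1,): 1.0}

# Two 2-block measures concentrated on different blocks,
# and the uniform measure on 2-blocks.
mu2 = {(0, 0): 1.0}
nu2 = {(1, 1): 1.0}
uniform2 = {b: 0.25 for b in product((0, 1), repeat=2)}

d1_max = block_distance(mu1, nu1, 1)
d2_max = block_distance(mu2, nu2, 2)
d2_mid = block_distance(mu2, uniform2, 2)
```

Under this assumed normalization, measures with disjoint support are at distance 1 for both n=1 and n=2, which is consistent with the stated maximum being independent of n.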
Figure 4 shows the distribution of the 1-block and 2-block
distances between the empirical class invariant measure and the
fixed-point measure of the Markov equations which define the class.
Figure 4:
Distribution of the 1-block and 2-block
distances between the
empirical class invariant measure and the fixed-point measure of
the Markov equations which define the class.
The contribution of each class to a distribution is weighted by
the number of rules in the class. The curves are then normalized so
that the total bin height is 1.
Orders of approximation 0-2 are shown.
Increasing dash length corresponds to increasing order of approximation.
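The weighting and normalization described for these distributions can be sketched as follows. This is only an illustration under stated assumptions: the function name `weighted_distance_histogram`, the number of bins, the bin range, and the example data are all hypothetical, and NumPy's `histogram` is used as a convenient stand-in for whatever binning procedure was actually used.

```python
import numpy as np


def weighted_distance_histogram(distances, class_sizes, bins=20,
                                value_range=(0.0, 1.0)):
    """Histogram of per-class distances, each class weighted by its
    number of rules, normalized so the bin heights sum to 1."""
    weights = np.asarray(class_sizes, dtype=float)
    counts, edges = np.histogram(distances, bins=bins,
                                 range=value_range, weights=weights)
    total = counts.sum()
    heights = counts / total if total > 0 else counts
    return heights, edges


# Hypothetical data: three classes with their distances and sizes.
distances = [0.01, 0.03, 0.25]
class_sizes = [2, 6, 2]
heights, edges = weighted_distance_histogram(distances, class_sizes, bins=10)
```

With this convention a class containing many rules contributes proportionally more to its bin than a class containing few rules, and curves for different orders of approximation can be compared on the same vertical scale.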
Under both the 1-block and 2-block metrics, the typical distance
from the empirical to the theoretical invariant measure is less than 0.05.
These distances are small compared to the maximum distance of 1.
They are not insignificant, however, compared to the empirical
convergence tolerance of 0.002 (see section xx.x).
As order increases, the distributions peak at smaller values of the
distance, and their tails become longer.
This reflects the fact, also shown in figure 3, that
although accuracy typically improves with order, for some classes of
rules prediction actually deteriorates as order increases.
This phenomenon has been observed previously [5][3]
in studies of individual cellular automaton rules.
Rules in a given class have a distribution of invariant statistical properties. Judging from the results of figure 4, this distribution is typically centered near the fixed-point measure of the system of equations which defines the class. Rules within a class vary according to structure not specified by the system of equations. These differences tend to cancel each other when averages over a class are taken. Consider a process in which a measure is stepped forward, at each time step, by a rule selected at random from a class. One may anticipate that (in the limit of large class size) the invariant measure of such a process should be close to the fixed-point measure of the system of equations which defines the class.
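The random-rule process just described can be illustrated with a small simulation. The sketch below makes several assumptions that are not in the original: a class is taken, for illustration only, to be a set of elementary (two-state, nearest-neighbor) rules sharing the same mean-field map; the example class {60, 90, 102} is chosen because all three rules have the mean-field map p' = 2p(1-p), whose nonzero fixed point is p = 1/2; and the measure is represented by a single large random configuration on a ring. All function names, the ring size, and the number of steps are hypothetical.

```python
import numpy as np


def rule_table(rule_number):
    """Output bit for each neighborhood, indexed by 4*left + 2*center + right."""
    return np.array([(rule_number >> i) & 1 for i in range(8)], dtype=np.uint8)


def step(config, rule_number):
    """Apply one elementary rule to a configuration on a ring."""
    left = np.roll(config, 1)     # left[i] == config[i - 1]
    right = np.roll(config, -1)   # right[i] == config[i + 1]
    index = 4 * left + 2 * config + right
    return rule_table(rule_number)[index]


def mean_field_map(rule_number, p):
    """Mean-field estimate of the density of 1s after one step."""
    table = rule_table(rule_number)
    total = 0.0
    for i in range(8):
        k = bin(i).count("1")     # number of 1s in the neighborhood
        total += table[i] * p**k * (1 - p)**(3 - k)
    return total


def random_class_process(rules, n_cells=10007, n_steps=200, seed=0):
    """Step a random configuration forward, drawing a rule uniformly
    at random from `rules` at every time step; return the final density."""
    rng = np.random.default_rng(seed)
    config = rng.integers(0, 2, size=n_cells, dtype=np.uint8)
    for _ in range(n_steps):
        config = step(config, rules[rng.integers(len(rules))])
    return config.mean()


example_class = [60, 90, 102]   # hypothetical class sharing one mean-field map
density = random_class_process(example_class)
```

In this toy setting the 1-block density of the randomized process can be compared directly with the fixed point of `mean_field_map`; the comparison for higher-order approximations would require block probabilities rather than a single density.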
Since
appears to be somewhat better than
at resolving differences
between behavior of the various orders of Markov approximation,
will be used exclusively in the following.