Worst-Case Analysis





A possible source of occasional poor average predictions is the presence of rules in a class whose behavior is rather different from that predicted by the approximation. Such rules could produce averages over a class which indicate poor performance of the approximation even though most rules in the class in fact conform well with the approximation. This possibility was explored by finding, for each class, the rule whose empirical invariant measure was furthest from that predicted by Markov approximation.
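The search for the worst rule in a class can be sketched as follows. This is only an illustration: the distance used between invariant measures is not stated in this section, so total variation distance is assumed here, and the names `distance`, `worst_case_rule`, and the toy measures are hypothetical.

```python
import numpy as np

def distance(p, q):
    """Assumed metric: total variation distance between two distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def worst_case_rule(rule_measures, predicted):
    """Return (index, distance) of the rule whose measure is furthest
    from the measure predicted by the approximation."""
    dists = [distance(m, predicted) for m in rule_measures]
    worst = int(np.argmax(dists))
    return worst, dists[worst]

# Toy class of three rules; the values are made up for illustration.
predicted = [0.25, 0.25, 0.25, 0.25]
rule_measures = [
    [0.25, 0.25, 0.25, 0.25],  # conforms exactly
    [0.30, 0.20, 0.25, 0.25],  # conforms well
    [0.70, 0.10, 0.10, 0.10],  # outlier
]
worst_index, worst_distance = worst_case_rule(rule_measures, predicted)
```

In this toy class the single outlier determines the worst-case distance even though the other rules agree well with the prediction, which is the situation the paragraph above describes.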

The distribution of the worst distance of the empirical invariant measure of a rule in a class from the empirical class invariant measure is shown in figure 7. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1. Error bars give 1 standard deviation over the bin. Orders of approximation 0-2 are shown. Increasing dash length corresponds to increasing order of approximation.

  
Figure 7: Distribution of the worst distance of the empirical invariant measure of a rule in a class from the empirical class invariant measure. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1. Error bars give 1 standard deviation over the bin. Orders of approximation 0-2 are shown. Increasing dash length corresponds to increasing order of approximation.
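The weighting and normalization described in the caption might be computed along the following lines. This is a minimal sketch under stated assumptions: the bin edges, the data, and the function name `weighted_distribution` are hypothetical, and the error bars are not reproduced.

```python
import numpy as np

def weighted_distribution(worst_distances, class_sizes, bins):
    """Histogram of per-class worst distances, with each class weighted by
    its number of rules, normalized so the bin heights sum to 1."""
    counts, edges = np.histogram(worst_distances, bins=bins, weights=class_sizes)
    total = counts.sum()
    if total == 0:
        raise ValueError("no data falls inside the given bins")
    return counts / total, edges

# Made-up worst distances for four classes and the size of each class.
worst_distances = np.array([0.05, 0.15, 0.17, 0.45])
class_sizes = np.array([10, 30, 40, 20])
bins = np.linspace(0.0, 0.5, 6)  # five equal-width bins (an assumption)

heights, edges = weighted_distribution(worst_distances, class_sizes, bins)
```

Weighting by class size means that a large class contributes to the curve in proportion to the number of rules it contains rather than counting once.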

The improvement of predictions with increasing order is more evident in the worst case than in the average case. At each of orders 0-2, the worst-case distribution is shifted toward higher values relative to the corresponding average-case distribution. The 2nd-order distribution has a long tail, as was the case with average-case predictions (figure 5). While the 0th- and 1st-order distributions for average-case predictions overlap considerably, the 0th- and 1st-order worst-case distributions overlap only slightly.

Recall that the 0th- and 1st-order curves result from sampling rules from classes, while the 2nd-order curve represents all rules in a class. This implies that the distributions shown in figure 7 for 0th- and 1st-order are lower bounds on the actual worst-case distances. That is, until all rules in a class are examined, it remains possible to find a rule whose invariant measure lies at a larger distance from the theoretical prediction than that of any rule already sampled. Hence the improvement of the 2nd-order approximation over the 0th- and 1st-order theories is, if anything, greater than figure 7 indicates.
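The lower-bound argument can be checked numerically: the maximum over a sample of a class can never exceed the maximum over the whole class. The sketch below uses synthetic distances drawn uniformly at random; the class size, the sample size, and the function name `sampled_worst_case` are illustrative assumptions, not values from the experiments.

```python
import random

def sampled_worst_case(distances, sample_size, rng):
    """Worst (largest) distance among a random sample drawn without replacement."""
    return max(rng.sample(distances, sample_size))

rng = random.Random(0)

# Synthetic per-rule distances for one hypothetical class of 1000 rules.
class_distances = [rng.random() for _ in range(1000)]
true_worst = max(class_distances)

# Repeat the sampling many times; no sampled worst case can exceed the true one.
sampled = [sampled_worst_case(class_distances, 50, rng) for _ in range(100)]
never_exceeds = all(s <= true_worst for s in sampled)
```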

Still, the lower-bound distributions represented by the 0th- and 1st-order curves of figure 7 are probably close to the actual worst-case distributions. That is, the outliers in a 0th- or 1st-order class are sufficiently numerous that the samples chosen for these experiments (200 and 50 rules per class for 0th- and 1st-order approximation, respectively) typically include some of them. This is indicated by the results of figure 8.

  
Figure 8: Dependence of the worst distance as in figure 7 on the number of rules sampled from a class. In the 0th-order figure, curves labeled with filled triangle, square, and pentagon represent sampling of 13, 50, and 100 rules, respectively, from each class. The solid curve is the same as the 0th-order curve in figure 7 (200 rules sampled from each class). In the 1st-order figure, curves labeled with filled triangle, square, and pentagon represent sampling of 13, 20, and 33 rules, respectively, from each class. The solid curve is the same as the 1st-order curve in figure 7. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1.

Here, subsamples are taken at random from the samples of rules from each class, and the worst-case distance is computed from the rules contained in these subsamples. In the 0th-order figure, curves labeled with filled triangle, square, and pentagon represent sampling of 13, 50, and 100 rules, respectively, from each class. The solid curve is the same as the 0th-order curve in figure 7 (200 rules sampled from each class). In the 1st-order figure, curves labeled with filled triangle, square, and pentagon represent sampling of 13, 20, and 33 rules, respectively, from each class. The solid curve is the same as the 1st-order curve in figure 7. The contribution of each class to a distribution is weighted by the number of rules in the class. The curves are then normalized so that the total bin height is 1.

These subsample sizes allow comparisons across orders of approximation in which the sample size is held fixed. There are typically about 13 rules in a 2nd-order class, hence the 2nd-order distribution of figure 7 can be compared with the distributions in figure 8 computed on the basis of sample size 13 (the curves labeled by filled triangles). These curves all represent samples of roughly the same size. Still, as the order is increased, the typical worst-case performance improves considerably. In the same way, the distribution based on sub-samples of size 50 of 0th-order classes (the curve labeled with filled squares in the top panel of figure 8) may be compared with the distribution based on samples of size 50 of the 1st-order classes (the solid curve in the bottom panel of figure 8). From this comparison one again concludes that an increase in order typically results in improved worst-case performance.

For both 0th- and 1st-order approximation, the indicated worst-case performance worsens as the sub-sample size is increased. It appears, however, that the distributions in figure 8 are approaching a limit with increased sub-sample size. Hence, the curves based on the largest sample size are probably a good estimate of the actual worst-case performance for these orders of approximation.
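The sub-sampling experiment can be sketched for a single class as follows. This is a hedged illustration on synthetic data: the distances are random numbers, the function name `worst_case_by_subsample_size` is hypothetical, and the sub-samples are taken as nested prefixes of one random permutation, which is an assumption made here so that the trend is exactly monotone (independently drawn sub-samples would show the same trend only on average).

```python
import numpy as np

def worst_case_by_subsample_size(sample_distances, sizes, seed=0):
    """Worst distance found in nested random sub-samples of increasing size."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(sample_distances, dtype=float))
    # The first n entries of a random permutation form a random sub-sample of size n.
    return {n: float(shuffled[:n].max()) for n in sizes}

# Synthetic distances for 200 sampled rules of one hypothetical 0th-order class.
rng = np.random.default_rng(1)
sample_distances = rng.random(200)

sizes = [13, 50, 100, 200]  # the 0th-order sub-sample sizes used in figure 8
curve = worst_case_by_subsample_size(sample_distances, sizes)
```

With nested sub-samples the worst distance can only grow or stay the same as the sub-sample size increases, and it stops growing once the largest outliers have been included.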







Thu Nov 10 12:16:46 GMT 1994