next up previous
Next: Stability of Experiments Up: No Title Previous: Some Important OTTERFeatures

Things to look for in the experiments

Behavior of K-list, comparison with otter, superlinear speedups, ratio strategy, etc. time to proof vs. gen/sec.

The following experiments come from a variety of areas: lattice theory, semigroup theory, circuit design, calculus, non-classical logic, and robotics. They include several standard theorem prover test problems from the literature as well as some more exotic ones.

The tests were conducted on the Sequent Symmetry in the Advanced Computing Research Facility at Argonne National laboratory. We give results for OTTER as well as for ROO, so that speedups can be measured against the best known sequential algorithm, as they should be, to show that the parallel algorithm with one process is in general comparable to that of OTTER, and to illustrate the occasional differences between OTTER and ROO with one process.

We give statistics not only for elapsed time for also for the number of clauses generated, because it is a rough measure of how much work was done during the run. We aim for regular and predictable speedups in the amount of work done per unit of time, even when the rearrangement of the search space caused by nondeterminism causes sudden changes in the time ROO takes to discover the first proof.

Sam's Lemma, a problem in lattice theory named after the first theorem prover that proved it, is a standard problem for evaluating theorem provers. It was one of the first non-trivial problems done by an automated system. It was first described in [#!mccharen-overbeek-wos!#]. Although its solution was reported nearly twenty years ago, it still remains beyond the reach of most current systems. The main reason is common to many problems in algebra: there are many paths to each derived clause, so any system that does not employ subsumption will explore a much larger search space than is necessary.

It is possible for a free semigroup to be known to be finite yet for its size to remain unknown. The semigroup can be represented in terms of generators and an operation, in which the semigroup itself is the closure of the set of generators under that operation. OTTER's (and therefore ROO's) basic algorithm can thus be used to compute the size of the semigroup. It is only necessary to describe the semigroup operation in terms of hyperresolution. this process is described in detail in [#!mcfadden-lusk:f2b2!#] for the case of the semigroup F2B2. The semigroup F3B2 is larger, and potentially contains 5125 elements. The fact that there are actually only 1435 elements was first discovered by ITP, OTTER's predecessor[#!lusk-overbeek-mccune:itp1!#,#!lusk-overbeek-mccune:itp2!#]. Again, this problem relies heavily on subsumption.

Is it possible to construct a 3-input, 3-output binary circuit, consisting of any number of AND gates and OR gates, but only two NOT gates, in which each output is the inversion of the corresponding input? This is a pleasing circuit design puzzle, described in [#!wos-overbeek-lusk-boyle:book!#] and in ???-mathematical-intelligencer-article. The trick is to keep track not of every circuit that can be built, but only of the output patterns that can be constructed, and the use of the inverters. Any interesting result derived by OTTER is that any solution of this problem (of which there are many) must the two inverters in the same way; that is, to invert the same signals.

This is a challenge problem from Woody Bledsoe (ref??), which states a form of the theorem that the sum of continuous functions is continuous.

This is the ``blind hand problem'', a standard problem from the area of robotics. It is included in many theorem prover test suites.

This is a problem in combinatorial group theory first brought to our attention by John Kalman. It is one of a family of problems to determine the structure of certain groups, called Fibonacci groups, defined by a symmetric set of relations. Some of these problems are still open. It illustrates the performance of ROO on a Knuth-Bendix completion problem. The problem is to show that the group on five generators with the five given relations is the cyclic group of order 22.

Neither the well-known ``wos10'' problem nor Schubert's steamroller are represented in the list of problems. They are each done by OTTER in a few seconds, and thus are simply not difficult to warrant the use of ROO.


next up previous
Next: Stability of Experiments Up: No Title Previous: Some Important OTTERFeatures
Karen D. Toonen
1998-11-19