Thursday, March 16, 2017

Description: Theorem Prover in Lisp

[Update: RTP release 1.1 is now available.]
---

As I mentioned, I've now completed basic testing of my (very, very simple) resolution theorem prover in Common Lisp: the code is ready to load. But read the stuff below first.

1. How to run the system

Once you have the code in front of you, go to Part 2: clauses and extended clauses where you will find the test-data definitions as follows:
(defvar *c1* (mk-fact-clause '(member ?item (?item . ?rest ))))
(defvar *c2* (mk-rule-clause '(member ?item (? . ?rest)) '((member ?item ?rest))))
(defvar *c3* (mk-fact-clause '(likes rod horvath)))
(defvar *c4* (mk-fact-clause '(likes sally renner)))
(defvar *c5* (mk-fact-clause '(likes sally rod)))
(defvar *c6* (mk-fact-clause '(likes hardy renner)))
(defvar *c7* (mk-fact-clause '(likes hardy rod)))
(defvar *c8* (mk-fact-clause '(amusing hardy)))
(defvar *c9* (mk-fact-clause '(likes horvath moties)))
(defvar *c10* (mk-rule-clause '(likes sally ?x) '((likes ?x moties)) ) )
(defvar *c11* (mk-rule-clause '(likes rod ?x) '((likes ?x renner) (likes ?x rod))))
(defvar *c12* (mk-rule-clause '(likes ?x ?y) '((amusing ?y) (likes rod ?y)) ) )
(defvar *c13* (mk-fact-clause '(likes ?x ?x)))

(defvar *g1* (mk-goal-clause '((likes sally ?who))))
(defvar *g2* (mk-goal-clause '((member sally (rod renner horvath sally . moties)))))

(setf *axioms* (list *c1* *c2* *c3* *c4* *c5* *c6* *c7* *c8*
                 *c9* *c10* *c11* *c12* *c13*))

(setf *goals* (list *g1*))
Input your own facts, rules and goals and assign them to variables *axioms* and *goals* as shown above. Or just use my test data to check the system out in the first instance.

Now go to Part 4:  TEST DATA and execute the following commands as shown there.
(defvar *igb-list-ia-list* (initialise-databases *goals* *axioms*))

(defvar *igb-list* (first *igb-list-ia-list*))  ; extended goal clause(s)

(defvar *ia-list*   (second *igb-list-ia-list*))  ; extended axiom clauses

(setf archived-goals-and-proofs (prove *goals* *axioms* 6))    ; depth 6 here

(pprint (setf all-proofs (show-all-proofs archived-goals-and-proofs *ia-list*)))
The search depth (eg 6) is arbitrary - try 4 or 8 etc.

With *1step* set to true,
(defvar *1step* t)        ; If t stops proof loop each iteration and
;                                      prints archived-goals, archived proofs & next goals
the theorem-prover will print its workings on each inference cycle and wait for input before proceeding. If you just want the thing to run to completion, set *1step* to nil.

---

2. In-code documentation

Most of the code is heavily documented with runtime examples - but read this post first.

---

3. Data Structures

The data structures are as follows.

1. We start with literals like (LIKES ?X RENNER), (LIKES SALLY ?WHO) - which the function 'unify' tries to unify to create bindings like this:
((?WHO . RENNER) (#:?X1113 . SALLY)).

2. At the level of basic binary resolution we have clauses, which look like this:

A fact is a clause (a list of literals) which contains only one (positive) literal followed by <- .
(defvar *c1* '( (likes horvath moties) <- )                          ; a fact clause
A rule is a clause (a list) which is a (positive) literal (the clause head) followed by <- and then a list of literals - the body of the clause.

We use <- with value nil as a spacer only for human readability.
(defvar *c2* '((likes sally ?who) <- ((likes ?who ?y) (likes ?y moties)))     ;  rule
A goal is a clause (a list) of one or more negative literals. First element is <- for human readability.
(defvar *g* '(<- ((likes ?x renner) (likes ?x rod)) ) )          ;  a goal clause

3. At the level of running the whole proof procedure we have extended-clauses, sometimes written as igb-list and ia-list for extended goal clauses and extended axiom clauses. The letter 'i' stands for index, and 'b' stands for binding.

An extended clause is a triple: index,  clause, then (if a goal) binding. It is implemented as a list like this: (index clause binding). Example:
(defvar *icb* '((16 (4 (nil nil)) (2 (nil nil)))                   ; Index
                     (<- ((likes ?x renner) (amusing ?x)))       ; Clause
                              ((T . T))))                                    ; Binding
An index is a binary tree of integers, with leaves nil.
Example: (1 (nil nil))   or   (16 (12 (7 (nil nil)) (4 (nil nil)) ) (5 (nil nil)) )
Each leading integer is the current clause number; the two immediate children are the two resolution parent clauses .. and so on. Initial goals, and axioms have no parents, thus nil.

The reason for extended clauses is that we need to keep track of the inference steps to reconstruct the entire proofs afterwards. The bindings are kept to allow us to instantiate variables in the original query which we need as the answer!

---

4. FAQs (reprinted from here).

4.1. Isn't this just Prolog?

Prolog is encountered as a black box. You provide a program and a query (as above) and you get back bindings, like this:
?- likes(sally, W).

W = sally;
W = rod;
... and so on

If you want proofs, to add the 'occurs check' or to change the order in which resolution steps are tried (the search strategy) - well tough: all those things are both fixed and opaque in Prolog. To make them explicit for modification, you have to write an explicit theorem-prover in Prolog (which can of course be done).

4.2. You are using Horn clauses?

Yes. I initially thought to implement a general clausal prover, but Horn clauses make the resolution step particularly simple (just one positive literal to match) and you lose neither generality nor expressive power. But since everything in the code is both modular and explicit, it would be easy to extend the program.

4.3. The style is functional?

Yes. Resolution theorem provers on the Internet are heavily optimised with clever, imperative algorithms and direct access data structures such as hash tables. This makes the code efficient but obscure - very hard to understand.

I didn't want to do that. This is a tool-kit and I'm not trying to create thousands of logical inferences per second. My intended application area (simple inference over an ever-changing knowledge-base for a chatbot) never requires massive inferential chains so clarity and easy modifiability was my objective.

4.4. What was hard?

The program is architected at three levels.

(a) Unification

This is basically already quite modular and straightforward. I re-used Peter Norvig's code.

(b) Resolution

Binary resolution itself - specially with Horn clauses - is a straightforward procedure - as you will see in the code. There are some subtleties with bindings and substitutions, but once you realise that resolution is fundamentally a local operation it's not too difficult.

(c) Control and proof construction

The process of creating new goals and resolving them with the axioms is somewhat complex although again, sticking with Horn clauses makes it a lot easier: just think of a naive Prolog execution model.

However, if you want to return proofs, you need to number the axioms and number-and-archive goals as you process them, capturing the resulting tree of inferences. At the end, for each empty-clause = successful proof, you need to trace that index-tree backwards to recover the proof steps. I found it a bit intricate.

4.5. What's next?

In theory you can do anything with a theorem-prover (Prolog being the existence proof) but it's not necessarily the best architecture. For a planner, where state changes are part of the problem definition, I need to adapt the tool-kit to a design centred around actions with pre- and post-condition in the context of a goal-state and an updating world-state. Such a dynamic model can be used both for actions in a (virtual) world and conversation planning via speech acts.

The theorem-prover remains optimal as a separate module, managing the crystallized knowledge base using inference to draw conclusions - for example to drive questions and answers.

I'm thinking of using the already-programmed Eliza front-end as a robust conversational interface. Doesn't matter if it's bluffing some of the time if it can use learning, inference and planning often enough.

Onwards to the GOFAI smart chatbot ...

---

5. Miscellaneous

As usual, code is provided for free use, unsupported and with no guarantees at all that there aren't bugs I've missed. Feel free to tell me in the comments.

Other code you may find useful:
  1. A resolution theorem prover in Lisp (as described here)
  2. Prolog in Lisp
  3. Minimax (noughts-and-crosses) in Lisp
  4. Eliza in Lisp
 An example of input and output of the prover.

---

Update: Saturday March 18th 2017.

1. Code here slightly updated (to v. 1.01) with change to 'show-proof' (and 'show-all-proofs') to add an additional 'top level goals' parameter, thereby getting rid of the previous embedded global variable. This makes the code completely functional now.

2. With further testing (on the genealogical axiom set below) .. with deeper inferences .. I have noticed that naive pprint doesn't do a good job of showing the output - the 'found proofs'. Over the next few days I'll write a dedicated display function which can handle more deeply-nested proofs and let you know when it's done. I'll also provide you with proper version of the genealogical test data below: (not completed testing, so it's illustrative only at this point).

; ---  Test data: family tree : axioms and goal ---
;   https://jameskulkarni17.wordpress.com/2011/09/26/family-tree-using-prolog/
;   with anglicised names and modified rules.

(defvar *g1* (mk-fact-clause '(male gerry)))
(defvar *g2* (mk-fact-clause '(male peter)))
(defvar *g3* (mk-fact-clause '(male john)))
(defvar *g4* (mk-fact-clause '(male mike)))
(defvar *g5* (mk-fact-clause '(male james)))
(defvar *g6* (mk-fact-clause '(male hugo)))

(defvar *g7* (mk-fact-clause '(female mary)))
(defvar *g8* (mk-fact-clause '(female jane)))
(defvar *g9* (mk-fact-clause '(female sarah)))
(defvar *g10* (mk-fact-clause '(female christine)))


(defvar *g11* (mk-fact-clause '(child-of gerry mary peter)))
(defvar *g12* (mk-fact-clause '(child-of gerry mary john)))
(defvar *g13* (mk-fact-clause '(child-of peter jane james)))
(defvar *g14* (mk-fact-clause '(child-of peter jane mike)))
(defvar *g15* (mk-fact-clause '(child-of john sarah christine)))
(defvar *g16* (mk-fact-clause '(child-of john sarah hugo)))


(defvar *g17* (mk-fact-clause '(brother peter john)))
(defvar *g18* (mk-fact-clause '(brother john peter)))
(defvar *g19* (mk-fact-clause '(brother james mike)))
(defvar *g20* (mk-fact-clause '(brother mike james)))

(defvar *g21* (mk-fact-clause '(brother hugo christine)))
(defvar *g22* (mk-fact-clause '(sister christine hugo)))

(defvar *g23* (mk-rule-clause '(parent ?X ?Y) '((child-of ?X ? ?Y))))
(defvar *g24* (mk-rule-clause '(parent ?X ?Y) '((child-of ? ?X ?Y))))

(defvar *g25* (mk-rule-clause '(sibling ?X ?Y) '((brother ?X ?Y))))
(defvar *g26* (mk-rule-clause '(sibling ?X ?Y) '((sister ?X ?Y))))

(defvar *g27* (mk-rule-clause '(father ?X ?Z) '((child-of ?X ? ?Z))))

(defvar *g28* (mk-rule-clause '(mother ?Y ?Z) '((child-of ? ?Y ?Z))))

(defvar *g29* (mk-rule-clause '(son ?X ?Y ?Z)
                              '((male ?X) (father ?Y ?X) (mother ?Z ?X))))

(defvar *g30* (mk-rule-clause '(daughter ?X ?Y ?Z)
                              '((female ?X) (father ?Y ?X) (mother ?Z ?X))))

(defvar *g31* (mk-rule-clause '(wife ?X ?Y) '((female ?X) (child-of ?Y ?X ?))))

(defvar *g32* (mk-rule-clause '(grandfather ?X ?Z)
                              '((male ?X) (parent ?X ?Y) (parent ?Y ?Z))))

(defvar *g33* (mk-rule-clause '(grandmother?X ?Z)
                              '((female ?X) (parent ?X ?Y) (parent ?Y ?Z))))

(defvar *g34* (mk-rule-clause '(uncle ?X ?Z)
                              '((male ?X)   (sibling ?X ?Y) (parent ?Y ?Z))))

(defvar *g35* (mk-rule-clause '(aunt ?X ?Z)
                              '((female ?X) (sibling ?X ?Y) (parent ?Y ?Z))))

(defvar *g36* (mk-rule-clause '(cousin ?X ?Y)
                              '((parent ?U ?X) (sibling ?U ?W) (parent ?W ?Y))))

(defvar *g37* (mk-rule-clause '(ancestor ?X ?Y ?Z) '((child-of ?X ?Y ?Z))))
(defvar *g38* (mk-rule-clause '(ancestor ?X ?Y ?Z)
                              '((child-of ?X ?Y ?W)  (ancestor ?W ? ?Z))))

(defvar *family-tree-axioms* (list *g1* *g2* *g3* *g4* *g5* *g6* *g7*
                                 *g8* *g9* *g10* *g11* *g12* *g13*
                                 *g14* *g15* *g16* *g17* *g18* *g19*
                                 *g20* *g21* *g22* *g23* *g24* *g25*
                                 *g26* *g27* *g28* *g29* *g30* *g31*
                                 *g32* *g33* *g34* *g35* *g36* *g37* *g38*))

(defvar *gg1* (mk-goal-clause '((father ?X ?Y))))
(defvar *gg2* (mk-goal-clause '((mother ?X ?Y))))
(defvar *gg3* (mk-goal-clause '((child-of ?X ?Y ?Z))))
(defvar *gg4* (mk-goal-clause '((son james ?X ?Y))))
(defvar *gg5* (mk-goal-clause '((daughter christine ?X ?Y))))
(defvar *gg6* (mk-goal-clause '((grandfather ?X ?Y))))
(defvar *gg7* (mk-goal-clause '((aunt ?X ?Y))))
(defvar *gg8* (mk-goal-clause '((uncle peter ?X))))
(defvar *gg9* (mk-goal-clause '((cousin ?X ?Y))))
(defvar *gg10* (mk-goal-clause '((ancestor ?X ?Y james))))

(defvar *family-tree-goal* (list *gg1* ))   ; start testing with the first goal

;;; A function, 'prove-it', to save copy-and-paste of the long commands below

(defun prove-it (goal depth)
    (setf *igb-list-ia-list*
        (initialise-databases *family-tree-goal* *family-tree-axioms*))
    (setf *igb-list* (first *igb-list-ia-list*))
    (setf  *ia-list*  (second *igb-list-ia-list*))
     (setf archived-goals-and-proofs (prove (list goal) *family-tree-axioms*  depth))
     (pprint (setf all-proofs
                           (show-all-proofs archived-goals-and-proofs *ia-list* (list goal))))
 nil)

#|   --- Commands to run the test data ---

(defvar *igb-list-ia-list*
        (initialise-databases *family-tree-goal* *family-tree-axioms*))
(defvar *igb-list* (first *igb-list-ia-list*))
(defvar *ia-list*  (second *igb-list-ia-list*))

(setf archived-goals-and-proofs (prove *family-tree-goal* *family-tree-axioms*  6))

(pprint (setf all-proofs (show-all-proofs archived-goals-and-proofs *ia-list* *family-tree-

goal*)))

or type

(prove-it *gg1* 6)    ; but currently doesn't display well

----------------------------------------------------------------------------------
|#

#|
**********************   Intended Output   *************************

(defvar *gg1* (mk-goal-clause '((father ?X ?Y))))

1 ?- father(X,Y).

X = gerry
Y = peter ;

X = gerry
Y = john ;

X = peter
Y = james ;

X = peter
Y = mike ;

X = john
Y = christine ;

X = john
Y = hugo ;

No
******************************************************
(defvar *gg2* (mk-goal-clause '((mother ?X ?Y))))

2 ?- mother(X,Y).

X = mary
Y = peter ;

X = mary
Y = john ;

X = jane
Y = james ;

X = jane
Y = mike ;

X = sarah
Y = christine ;

X = sarah
Y = hugh ;

No
******************************************************
(defvar *gg3* (mk-goal-clause '((child-of ?X ?Y ?Z))))

3 ?- child-of(X,Y,Z).

X = gerry
Y = mary
Z = peter ;

X = gerry
Y = mary
Z = john ;

X = peter
Y = jane
Z = james ;

X = peter
Y = jane
Z = mike ;

X = john
Y = sarah
Z = christine ;

X = john
Y = sarah
Z = hugo ;

No
******************************************************
(defvar *gg4* (mk-goal-clause '((son james ?X ?Y))))

4 ?- son(james,X,Y).

X = peter
Y = jane

Yes
******************************************************
(defvar *gg5* (mk-goal-clause '((daughter christine ?X ?Y))))

5 ?- daughter(christine,X,Y).

X = john
Y = sarah

Yes
******************************************************
(defvar *gg6* (mk-goal-clause '((grandfather ?X ?Y))))

6 ?- grandfather(X,Y).

X = gerry
Y = james ;

X = gerry
Y = mike ;

X = gerry
Y = christine ;

X = gerry
Y = hugo ;

No
******************************************************
(defvar *gg7* (mk-goal-clause '((aunt ?X ?Y))))

7 ?- aunt(X,Y).

X = jane
Y = christine;

X = jane
Y = hugo ;

X = sarah
Y = james ;

X = sarah
Y = mike ;

No
******************************************************
(defvar *gg8* (mk-goal-clause '((uncle peter ?X))))

8 ?- uncle(peter,X).

X = christine;

X = hugo;

No
******************************************************
(defvar *gg9* (mk-goal-clause '((paternal-cousin ?X ?Y))))

9 ?- cousin(X,Y).

X = james
Y = christine ;

X = james
Y = hugo ;

X = mike
Y = christine ;

X = mike
Y = hugo;

X = christine
Y = james ;

X = christine
Y = mike ;

X = hugo
Y = james ;

X = hugo
Y = mike ;

No
******************************************************
(defvar *gg10* (mk-goal-clause '((paternal-ancestor ?X ?Y james))))

10 ?- ancestor(X,Y,james).

X = peter
Y = jane ;

X = gerry
Y = mary ;

|#


8 comments:

  1. OK. I downloaded LISPWORKS today and the theorem prover seems to work on the original test data. I guess that the cleverest thing it deduced is:

    (((<- ((LIKES SALLY HORVATH)))

    because that follows from clauses 9 and 10 so is not a direct axiom. There are lots of numbers in the output, which I dont recognise yet, although it would be useful if the prettyprint just told us that clauses 9 etc were being used.

    I dont claim to have much understanding of LISPWORKS itself as of right now....

    ReplyDelete
    Replies
    1. Actually I have now modified the rule-set to add:

      (defvar *c10b* (mk-rule-clause '(likes sally ?x) '((amusing ?x )) ) )

      and, indeed we now deduce:

      (((<- ((LIKES SALLY HARDY))))

      since Hardy is amusing.

      What I mean about the numbers is that the clause names themselves (e.g. c10 or my new c10b) dont appear in the derivation description, just these other clause numbers which dont seem very informative(?)

      I am also wondering whether its deduction about Hardy is including some irrelevant analysis of Rod and Renner. I shall need to get better acquainted with this output information!

      Delete
    2. Look in the code file at the walkthrough of "one-step-prove" where you will see how clause lists ia-list and igb-list get assigned numbers (initially from 'initialise-databases').

      This has nothing to do with any assignments you personally made in setf or defvar statements.

      The prover output, which shows the proof tree, uses these dynamically-generated index numbers.

      In the end, I'm afraid that understanding the code is the key to it. It's not entirely trivial, I agree.

      Delete
  2. Looking at it on Day 2 I notice an output not contained in your 5, namely:

    (((<- ((LIKES SALLY SALLY))))
    (((13 NIL NIL) ((LIKES ?X ?Y) <- ((AMUSING ?Y) (LIKES ROD ?Y))))
    ((10 NIL NIL) ((LIKES SALLY ?X) <- ((LIKES ?X MOTIES))))
    ((10 NIL NIL) ((LIKES SALLY ?X) <- ((LIKES ?X MOTIES))))
    ((14 NIL NIL) (<- ((LIKES SALLY ?WHO))))
    ((17 14 10) (<- ((LIKES ?WHO MOTIES))))
    ((21 17 10) (<- ((LIKES MOTIES MOTIES))))
    ((27 21 13) +EMPTY-CLAUSE+))))

    This seems to say that Sally likes Sally because Sally (it has now been deduced) likes Moties and Sally likes anyone who likes Moties!
    Though I am not sure why not very amusing Rod is included here in the proof. I guess its back to the code and the debugger to find out what is really going on ....

    ReplyDelete
    Replies
    1. This appearance of Rod might be a "database refresh issue" since when I start with a clean database the ROD clause is not present. However I still get this extra Likes Sally Sally with depth =4 (minus the ROD line).
      Also this relates to my previous day's remark about Rod and Renner which also appeared unexpectedly. At one time I thought that the "proof" included (abandoned)"search" paths, but maybe it is more subtle as Rod etc dont always appear.

      Delete
    2. That's a relief: the "proof" you mentioned in your previous comment is clearly not correct. It would have been a bug.

      Note that as long as the inferential steps are sound, it's possible to create arbitrarily long (and redundant) proofs. This is not a bug, it's a not-too-interesting feature.

      Delete
    3. Here is the clean form:

      (((<- ((LIKES SALLY SALLY))))
      (((13 NIL NIL) ((LIKES ?X ?X) <-))
      ((10 NIL NIL) ((LIKES SALLY ?X) <- ((LIKES ?X MOTIES))))
      ((10 NIL NIL) ((LIKES SALLY ?X) <- ((LIKES ?X MOTIES))))
      ((14 NIL NIL) (<- ((LIKES SALLY ?WHO))))
      ((17 14 10) (<- ((LIKES ?WHO MOTIES))))
      ((21 17 10) (<- ((LIKES MOTIES MOTIES))))
      ((27 21 13) +EMPTY-CLAUSE+))))

      I guess the repetition of clause-index 10 is OK since I think it is used twice. I havent studied the exact cause of the database refresh issue, so it might return to haunt us later....

      Delete
    4. Re: Update - OK I wont be able to work on it for a few days. I agree that the pretty-print could be clarified for the longer proofs. No bugs uncovered as yet.

      Delete

Comments are moderated. Keep it polite and no gratuitous links to your business website - we're not a billboard here.