2.1.2 The occurs check

Instead of saying that Prolog matches terms, you'll find that many books say that Prolog unifies terms. This is very common terminology, and we will often use it ourselves. But while it does not really matter whether you call what Prolog does `unification' or `matching', there is one thing you do need to know: Prolog does not use a standard unification algorithm when it performs unification/matching. Instead, it takes a shortcut. You need to know about this shortcut.

Consider the following query:

father(X) = X.

Do you think these terms match or not?

A standard unification algorithm would say: No, they don't. Do you see why? Pick any term and instantiate X to the term you picked. For example, if you instantiate X to father(father(butch)), the left hand side becomes father(father(father(butch))), and the right hand side becomes father(father(butch)). Obviously these don't match. Moreover, it makes no difference what you instantiate X to. No matter what you choose, the two terms cannot possibly be made the same, for the term on the left will always be one symbol longer than the term on the right (the functor father on the left will always give it that one extra level). The two terms simply don't match.

But now, let's see what Prolog would answer to the above query. With old Prolog implementations you would get a message like:

Not enough memory to complete query!

and a long string of symbols like:

X = father(father(father(father(father(father(father(father
(father(father(father(father(father(father(father(father(father
(father(father(father(father(father(father(father(father(father
(father(father(father(father(father(father(father(father(father
(father(father(father(father(father(father(father(father(father

Prolog is desperately trying to match these terms, but it won't succeed. That strange variable X, which occurs as an argument to a functor on the left hand side, and on its own on the right hand side, makes matching impossible.

To be fair, what Prolog is trying to do here is reasonably intelligent. Intuitively, the only way the two terms could be made to match would be if X was instantiated to `a term containing an infinitely long string of father functors', so that the effect of the extra father functor on the left hand side was canceled out. But terms are finite entities. There is no such thing as a `term containing an infinitely long string of father functors'. Prolog's search for a suitable term is doomed to failure, and it learns this the hard way when it runs out of memory.

Now, current Prolog implementations have found a way of coping with this problem. Try to pose the query father(X) = X to SICStus Prolor or SWI. The answer will be something like:

X = father(father(father(father(father(father(...))))))))))

The dots are indicating that there is an infinite nesting of father functors. So, newer versions of Prolog can detect cycles in terms without running our of memory and have a finite internal representation of such infinite terms.

Still, a standard unification algorithm works differently. If we gave such an algorithm the same example, it would look at it and tell us that the two terms don't unify. How does it do this? By carrying out the occurs check. Standard unification algorithms always peek inside the structure of the terms they are asked to unify, looking for strange variables (like the X in our example) that would cause problems.

To put it another way, standard unification algorithms are pessimistic. They first look for strange variables (using the occurs check) and only when they are sure that the two terms are `safe' do they go ahead and try and match them. So a standard unification algorithm will never get locked into a situation where it is endlessly trying to match two unmatchable terms.

Prolog, on the other hand, is optimistic. It assumes that you are not going to give it anything dangerous. So it does not make an occurs check. As soon as you give it two terms, it charges full steam ahead and tries to match them.

As Prolog is a programming language, this is an intelligent strategy. Matching is one of the fundamental processes that makes Prolog work, so it needs to be carried out as fast as possible. Carrying out an occurs check every time matching was called for would slow it down considerably. Pessimism is safe, but optimism is a lot faster!

Prolog can only run into problems if you, the programmer, ask it to do something impossible like unify X with father(X). And it is unlikely you will ever ask it to anything like that when writing a real program.


Patrick Blackburn, Johan Bos and Kristina Striegnitz
Version 1.2.5 (20030212)