5.3 Arithmetic and lists

Probably the most important use of arithmetic in this course is to tell us useful facts about data-structures, such as lists. For example, it can be useful to know how long a list is. We'll give some examples of using lists together with arithmetic capabilities.

How long is a list? Here's a recursive definition.

The empty list has length zero.
A non-empty list has length 1 + len(T), where len(T) is the length of its tail.

This definition is practically a Prolog program already. Here's the code we need:

len([],0). len([_|T],N) :- len(T,X), N is X+1.

This predicate works in the expected way. For example:

?- len([a,b,c,d,e,[a,b],g],X). X = 7

Now, this is quite a good program: it's easy to understand and efficient. But there is another method of finding the length of a list. We'll now look at this alternative, because it introduces the idea of accumulators, a standard Prolog technique we will be seeing lots more of.

If you're used to other programming languages, you're probably used to the idea of using variables to hold intermediate results. An accumulator is the Prolog analog of this idea.

Here's how to use an accumulator to calculate the length of a list. We shall define a predicate accLen3/ which takes the following arguments.

accLen(List,Acc,Length)

Here List is the list whose length we want to find, and Length is its length (an integer). What about Acc? This is a variable we will use to keep track of intermediate values for length (so it will also be an integer). Here's what we do. When we call this predicate, we are going to give Acc an initial value of 0. We then recursively work our way down the list, adding 1 to Acc each time we find a head element, until we reach the empty list. When we do reach the empty set, Acc will contain the length of the list. Here's the code:

accLen([_|T],A,L) :- Anew is A+1, accLen(T,Anew,L). accLen([],A,A).

The base case of the definition, unifies the second and third arguments. Why? There are actually two reasons. The first is because when we reach the end of the list, the accumulator (the second variable) contains the length of the list. So we give this value (via unification) to the length variable (the third variable). The second is that this trivial unification gives a nice way of stopping the recursion when we reach the empty list. Here's an example trace:

?- accLen([a,b,c],0,L). Call: (6) accLen([a, b, c], 0, _G449) ? Call: (7) _G518 is 0+1 ? Exit: (7) 1 is 0+1 ? Call: (7) accLen([b, c], 1, _G449) ? Call: (8) _G521 is 1+1 ? Exit: (8) 2 is 1+1 ? Call: (8) accLen([c], 2, _G449) ? Call: (9) _G524 is 2+1 ? Exit: (9) 3 is 2+1 ? Call: (9) accLen([], 3, _G449) ? Exit: (9) accLen([], 3, 3) ? Exit: (8) accLen([c], 2, 3) ? Exit: (7) accLen([b, c], 1, 3) ? Exit: (6) accLen([a, b, c], 0, 3) ?

As a final step, we'll define a predicate which calls accLen for us, and gives it the initial value of 0:

leng(List,Length) :- accLen(List,0,Length).

So now we can pose queries like this:

leng([a,b,c,d,e,[a,b],g],X).

Accumulators are extremely common in Prolog programs. (We'll see another accumulator based program later in this lecture. And many more in the rest of the course.) But why is this? In what way is accLen better than len? After all, it looks more difficult. The answer is that accLen is tail recursive while len is not. In tail recursive programs the result is all calculated once we reached the bottom of the recursion and just has to be passed up. In recursive programs which are not tail recursive there are goals in one level of recursion which have to wait for the answer of a lower level of recursion before they can be evaluated. To understand this, compare the traces for the queries accLen([a,b,c],0,L) (see above) and len([a,b,c],0,L) (given below). In the first case the result is built while going into the recursion -- once the bottom is reached at accLen([],3,_G449) the result is there and only has to be passed up. In the second case the result is built while coming out of the recursion -- the result of len([b,c], _G481), for instance, is only computed after the recursive call of len has been completed and the result of len([c], _G489) is known.

?- len([a,b,c],L). Call: (6) len([a, b, c], _G418) ? Call: (7) len([b, c], _G481) ? Call: (8) len([c], _G486) ? Call: (9) len([], _G489) ? Exit: (9) len([], 0) ? Call: (9) _G486 is 0+1 ? Exit: (9) 1 is 0+1 ? Exit: (8) len([c], 1) ? Call: (8) _G481 is 1+1 ? Exit: (8) 2 is 1+1 ? Exit: (7) len([b, c], 2) ? Call: (7) _G418 is 2+1 ? Exit: (7) 3 is 2+1 ? Exit: (6) len([a, b, c], 3) ?