13.4 Feature Structure Unification

Unification is a (partial) operation on feature structures. Intuitively, it is the operation of combining two feature structures such that the new feature structure contains all the information of the original two, and nothing more. For example, let F1 be the feature structure

and let F2 be the feature structure

Then $F1 \sqcup F2$, the unification of these two feature structures, is:

Clearly, $F1 \sqcup F2$ contains all the information that is in F1 and F2, and it doesn't contain any other information.

Why did we call unification a partial operation? Why didn't we just say that it was an operation on feature structures? The point is that unification is not guaranteed to return a result. For example, let F3 be the feature structure

and let F4 be the feature structure

Then $F3 \sqcup F4$ does not exist. There is no feature structure that contains all the information in F3 and F4, because the information in these two feature structures is contradictory. So, the value of this unification is undefined.
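The two cases above (a unification that succeeds and one that fails) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: feature structures are modelled as nested dictionaries, atomic values are strings, reentrancy is ignored, and the name `unify` and the example structures are hypothetical (they are not the F1 to F4 of the text). It is not the Prolog implementation discussed later.

```python
def unify(f, g):
    """Return the unification of f and g, or None if it is undefined.

    Assumption: feature structures are nested dicts, atomic values are
    strings, and reentrancy (structure sharing) is not modelled.
    """
    # Two complex structures: combine them feature by feature.
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)  # shallow copy, so f itself is not modified
        for feature, g_value in g.items():
            if feature in result:
                merged = unify(result[feature], g_value)
                if merged is None:
                    return None  # contradictory information below this feature
                result[feature] = merged
            else:
                result[feature] = g_value
        return result
    # At least one side is atomic: they unify only if they are equal.
    return f if f == g else None


# Hypothetical examples: compatible information is simply combined ...
combined = unify({"number": "sg"}, {"person": "3"})
# ... while contradictory information makes the result undefined (None).
clash = unify({"agr": {"number": "sg"}}, {"agr": {"number": "pl"}})
```

Here `combined` holds both features, while `clash` is `None`, which is how this sketch signals that unification is only a partial operation.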

Those are the basic intuitions about unification, so let's now give a precise definition. This is easy to do if we make use of the idea of subsumption, which we discussed above.

The unification of two feature structures F and G (if it exists) is the smallest feature structure that is subsumed by both F and G. That is, $F \sqcup G$ (if it exists) is the feature structure with the following three properties:

  1. $F \sqsubseteq F \sqcup G$ ($F \sqcup G$ is subsumed by F)

  2. $G \sqsubseteq F \sqcup G$ ($F \sqcup G$ is subsumed by G)

  3. If H is a feature structure such that $F \sqsubseteq H$ and $G \sqsubseteq H$, then $F \sqcup G \sqsubseteq H$ ($F \sqcup G$ is the smallest feature structure fulfilling the first two properties. That is, there is no other feature structure that also has properties 1 and 2 and subsumes $F \sqcup G$.)

If there is no smallest feature structure that is subsumed by both  F and  G, then we say that the unification of  F and  G is undefined.
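The three properties can be checked mechanically on a small example. The following Python sketch is only an illustration under stated assumptions: feature structures are nested dictionaries with string atoms, reentrancy is ignored, an atom is treated as incomparable with any complex structure, and the function name `subsumes` and the structures `F`, `G`, `U` and `H` are hypothetical. `U` is a hand-written candidate for the unification of `F` and `G`.

```python
def subsumes(f, g):
    """Return True if f subsumes g, i.e. g contains all the information in f.

    Assumption: nested dicts with string atoms, no reentrancy.
    """
    if isinstance(f, dict):
        if not isinstance(g, dict):
            return False  # simplification: an atom never extends a complex structure
        # Every feature of f must be present in g with a value at least as specific.
        return all(
            feature in g and subsumes(value, g[feature])
            for feature, value in f.items()
        )
    # Atomic values subsume only themselves.
    return f == g


F = {"cat": "np"}
G = {"agr": {"number": "sg"}}
U = {"cat": "np", "agr": {"number": "sg"}}                 # candidate unification
H = {"cat": "np", "agr": {"number": "sg", "person": "3"}}  # also subsumed by F and G

property_1 = subsumes(F, U)  # F subsumes the candidate
property_2 = subsumes(G, U)  # G subsumes the candidate
# Property 3, checked for this particular H only: H is subsumed by F and G,
# so the candidate must subsume H as well.
property_3 = subsumes(F, H) and subsumes(G, H) and subsumes(U, H)
```

All three checks come out true for this example. `H` also has properties 1 and 2, but it carries extra information, so it is not the smallest such structure: `U` subsumes `H`, and not the other way round.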

We said above that the subsumption relation $\sqsubseteq$ on feature structures can be thought of as analogous to the subset relation on sets. Similarly, unification $\sqcup$ is rather like an analog of set-theoretic union (recall that the union of two sets is the smallest set that contains all the elements in both sets). But there is a difference: union of sets is an operation (that is, the union of two sets is always defined) whereas (as we have discussed) unification is only a partial operation on feature structures.

Now that the formal definition is in place, let's look at a few more examples of feature structure unification. First, let F5 be

and let  F6 be the feature structure

Then $F5 \sqcup F6$ is

Next, an example involving reentrancies. Let  F7 be

Then $F7 \sqcup F6$ is

A final example. What happens if one of the feature structures we are trying to unify is the empty feature structure $[\ ]$, that is, the feature structure containing no information at all? Clearly, for any feature structure F, we have:

 $F \sqcup [\ ] = [\ ] \sqcup F = F$

That is, the empty feature structure is the identity element for the (partial) operation of feature structure unification. In this respect it behaves like the empty set in set theory (recall that the union of the empty set with any set S is just S).

That concludes our discussion of feature structure unification. But before going on to discuss how to implement it in Prolog, let's briefly look at an operation on feature structures called generalization.

Generalization can be thought of as an analog of the set-theoretic operation of intersection. Recall that the intersection of two sets is the set that contains only the elements common to both sets. Similarly, the generalization of two feature structures contains only the information that is common to both feature structures.

For example, let F9 be

and let  F10 be

Then $F9 \sqcap F10$, the generalization of F9 and F10, is

Clearly, $F9 \sqcap F10$ contains only information which can be found in both F9 and F10.

Here's a precise definition of generalization: The generalization of two feature structures F and G is the largest feature structure that subsumes both F and G. That is, $F \sqcap G$ is the feature structure with the following three properties:

  1. $F \sqcap G \sqsubseteq F$

  2. $F \sqcap G \sqsubseteq G$

  3. If H is a feature structure such that $H \sqsubseteq F$ and $H \sqsubseteq G$, then $H \sqsubseteq F \sqcap G$

There is an important difference between unification and generalization: unlike unification, generalization is an operation on feature structures, not just a partial operation. That is, the generalization of two feature structures is always defined. Think about it. Consider the worst case: two feature structures F and G that contain no common information at all. Is there a largest feature structure that subsumes both? Of course there is: the empty feature structure $[\ ]$!
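To make the contrast with unification concrete, here is a minimal Python sketch of generalization. As before, it is only an illustration under assumptions that are not part of the text: feature structures are nested dictionaries with string atoms, reentrancy is ignored, and the name `generalize` and the example structures are hypothetical. It is not the Prolog implementation left as an exercise.

```python
def generalize(f, g):
    """Return the generalization of f and g: the information common to both.

    Assumption: f and g are nested dicts with string atoms, no reentrancy.
    The function always returns a dict, so the result is always defined.
    """
    result = {}
    for feature in f.keys() & g.keys():  # only features present in both
        f_value, g_value = f[feature], g[feature]
        if isinstance(f_value, dict) and isinstance(g_value, dict):
            # Both values are complex: keep whatever they have in common.
            result[feature] = generalize(f_value, g_value)
        elif f_value == g_value:
            # Identical atomic values are common information.
            result[feature] = f_value
        # Otherwise the values disagree and the feature is dropped.
    return result


# Hypothetical examples: shared information survives ...
shared = generalize({"number": "sg", "person": "3"},
                    {"number": "sg", "person": "1"})
# ... and with nothing in common, the result is the empty feature structure.
nothing = generalize({"cat": "np"}, {"number": "pl"})
```

Unlike the unification case, there is no failure value here: `shared` keeps only the `number` feature, and `nothing` is the empty dictionary, the counterpart of the empty feature structure.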

Thus, from a mathematical perspective, generalization is somewhat easier to work with than unification. Nonetheless, it is not used nearly so often in computational linguistics as unification is, so we won't discuss how to implement it. But implementing generalization in Prolog is a very interesting task; one you may like to try for yourself.


Patrick Blackburn and Kristina Striegnitz
Version 1.2.4 (20020829)