We begin by reviewing some basic features covered in the Overview, along with a few simple facts:
- Attributes consist of types which are sets of values.
- The types and values are identifiers. They have no inherent meaning beyond what the language description assigns to them.
- The values may carry grammatical and/or semantic information. vinci makes no distinction.
- Attributes are used, among other things, to restrict the selection of lexicon entries and to control morphology. As a consequence, they may be used to enforce agreement between parts of an utterance.
- Individual values are "atomic", i.e. indivisible. They can, however, be composed into compound attributes, and these can be deconstructed into their components.
The language describer specifies the attributes with a sequence of lists in the form:
Number(sing, plur)
Person(first, second, third)
Nountrait(human, animate, edible, ...)
Function(subj, objd, obji)
each type name being followed by a parenthesized list of its values. These are gathered in an attributes file. For reasons lost to antiquity, the file must be terminated with the symbol %. This will be rectified in due course.
The identifiers, both types and values, must be distinct.
In the present implementation, and in the foreseeable future, the number of types and the number of values in any one type are each limited to 254. This, however, is not really restrictive; if a type needs more values, one may define two separate types and use them as a compound pair, giving about 64,000 combinations (254 × 254 = 64,516).
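The rules above (parenthesized value lists, a closing %, distinct identifiers, and the limit of 254) can be sketched as a small checker. This is only an illustrative model in Python, not vinci's own reader: the function name `parse_attributes`, the constant `MAX_COUNT`, and the error messages are hypothetical, and the Nountrait list is left out of the sample because its full set of values is not given in the text.

```python
import re

MAX_COUNT = 254  # limit quoted in the text, for types and for values per type


def parse_attributes(text: str) -> dict[str, list[str]]:
    """Parse 'Type(v1, v2, ...)' lists from a text that must end with '%'."""
    body = text.strip()
    if not body.endswith("%"):
        raise ValueError("attributes file must be terminated with %")
    body = body[:-1]

    types: dict[str, list[str]] = {}
    seen: set[str] = set()
    # Each match is (type name, comma-separated values inside the parentheses).
    for name, values in re.findall(r"(\w+)\s*\(([^)]*)\)", body):
        names = [v.strip() for v in values.split(",") if v.strip()]
        # Type and value identifiers share one namespace and must all differ.
        for ident in [name, *names]:
            if ident in seen:
                raise ValueError(f"duplicate identifier: {ident}")
            seen.add(ident)
        if len(names) > MAX_COUNT:
            raise ValueError(f"type {name} has more than {MAX_COUNT} values")
        types[name] = names
    if len(types) > MAX_COUNT:
        raise ValueError(f"more than {MAX_COUNT} types")
    return types


sample = """
Number(sing, plur)
Person (first, second, third)
Function(subj, objd, obji)
%
"""
attrs = parse_attributes(sample)
print(attrs["Person"])        # ['first', 'second', 'third']
print(MAX_COUNT * MAX_COUNT)  # 64516: pairs available from two full types
```

The single `seen` set reflects the statement that types and values must all be distinct from one another; a real implementation may report errors differently.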
The type (or class) name, for example Number, may be used as a default, in that it stands for any value in the class (such as sing or plur). This is useful in representing things like nouns which may be either singular or plural. It is the responsibility of the language describer to place values which may co-exist, like neg and interrogative, in separate classes.
On the other hand, as we will see later when discussing partial ordering, values may exist in a subset relation, where one value implies another.
Compounding and Deconstruction
The following, slightly modified from an earlier example, shows a Number attribute being chosen, and passed to subject and verb to ensure agreement in number between them:
ROOT = choose No: Number; NP[No] V[vtdi, third, No, pres] NP %
NP = inherit Nu: Number; DET[Nu] N[Nu] %
Suppose we try to adapt this to ensure semantic agreement between subject and verb, and between verb and object, so that, for example, monkeys may eat bananas, but bananas may not eat monkeys. By analogy, we attempt:
ROOT = choose No: Number, Nt1, Nt2: Nountrait; NP[No, Nt1] V[vtdi, third, No, pres, Nt1, Nt2] NP[Nt2] %
NP = inherit Nu: Number, Nt: Nountrait; DET[Nu] N[Nu, Nt] %
But here we encounter a snag. The attribute list of the verb contains two values of type Nountrait, say animate and edible, but vinci has no way of distinguishing one from the other, no way of knowing which belongs to the subject and which to the object.
vinci solves this problem by way of compounding. Two attribute values, say, animate and subj, may be compounded using the dot-operator to form the compound attribute animate.subj. Replacing the V node in the above context-free rule by:
V[vtdi, third, No, pres, Nt1.subj, Nt2.objd]
then provides the necessary distinction. The lexicon entry for a verb like "eat" must contain in its attributes field compound attributes such as animate.subj and edible.objd to ensure appropriate lexical selection.
Compound attributes may contain as many components as the language describer wants. The order of the components matters: plur.masc is not the same as masc.plur. The grouping in which the compound was put together does not: first.(masc.plur) is the same as (first.masc).plur. We write both of these as first.masc.plur without parentheses, which would not, in fact, be recognized by vinci.
In mathematical terms, the dot-operator is associative but not commutative.
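One way to picture these two properties is to model a compound attribute as an ordered tuple of simple values, with the dot-operator as tuple concatenation. The sketch below is an illustrative Python model only; the helper names `compound` and `show` are hypothetical and say nothing about how vinci stores compounds internally.

```python
def compound(*parts):
    """Join simple attributes (strings) or compounds (tuples) with the dot-operator.

    The dot-operator is modelled as tuple concatenation, so nested
    groupings flatten to the same ordered sequence of components.
    """
    result = ()
    for part in parts:
        result += part if isinstance(part, tuple) else (part,)
    return result


def show(c):
    """Render a compound in the dotted notation used in the text."""
    return ".".join(c)


left = compound("first", compound("masc", "plur"))   # first.(masc.plur)
right = compound(compound("first", "masc"), "plur")  # (first.masc).plur

print(show(left))     # first.masc.plur
print(left == right)  # True: grouping does not matter (associative)
print(compound("plur", "masc") == compound("masc", "plur"))  # False: order matters
```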
In time, we will come upon many uses for compound attributes, but will not dwell on them here.
By the way, when searching an attribute list containing compound attributes, we need a generalized form of pattern to match against the elements in the list. This is the compound attribute pattern, which looks like a compound attribute but which may contain types as well as values. As in the uncompounded attribute case, the types simply act as wildcards which match any values of that type; for example, Nountrait.subj matches human.subj, animate.subj, edible.subj, ...
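The wildcard behaviour of types in a compound attribute pattern can be sketched as follows. This is a simplified Python model under stated assumptions: the `TYPES` table holds only the values named in the text (the Nountrait list is elided there), the function names are hypothetical, and the pattern is assumed to need the same number of components as the attribute it matches.

```python
# Type names mapped to their values; only values named in the text are listed.
TYPES = {
    "Nountrait": {"human", "animate", "edible"},
    "Function": {"subj", "objd", "obji"},
}


def component_matches(pattern_part, value):
    """A type name matches any of its values; a value matches only itself."""
    if pattern_part in TYPES:
        return value in TYPES[pattern_part]
    return pattern_part == value


def matches(pattern, attribute):
    """Match a dotted pattern against a dotted attribute, component by component."""
    p = pattern.split(".")
    a = attribute.split(".")
    # Assumption: the whole attribute must be covered by the pattern.
    return len(p) == len(a) and all(component_matches(x, y) for x, y in zip(p, a))


def find(pattern, attribute_list):
    """Return every attribute in the list that the pattern matches."""
    return [a for a in attribute_list if matches(pattern, a)]


verb_attrs = ["vtdi", "animate.subj", "edible.objd"]
print(find("Nountrait.subj", verb_attrs))      # ['animate.subj']
print(find("Nountrait.Function", verb_attrs))  # ['animate.subj', 'edible.objd']
```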
If a child node, or indeed any other part of a vinci process, is passed a compound attribute, it may need to break it apart to obtain subsections: individual simple attributes or shorter compounds. This operation is called deconstruction.
Deconstruction may be carried out by an inherit clause (see Syntax), or by any other process which picks up a compound attribute from a list intending to pass it on. (These are de facto inheritances.) Some of the components of the compound attribute pattern used to search for the required attribute are preceded by slash symbols, /, which replace the dot-operators (except on the first component, where there is no dot). For example, we may write: Nountrait/subj or /Gender.Number. In each case, vinci searches for a compound attribute matching the whole pattern, and then discards components matching the ones preceded by the deconstruction slash. So the former example gives us the Nountrait value from, say, edible.subj, while the latter gives us plur from masc.plur.
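The two-step behaviour described here (match the whole pattern, then drop the slash-marked components) can be sketched in Python. This is an illustrative model only: the function names are hypothetical, the `Gender` type with its value `fem` is an assumption (only masc appears in the text), and returning the first match in the list is a simplification.

```python
import re

TYPES = {
    "Nountrait": {"human", "animate", "edible"},
    "Function": {"subj", "objd", "obji"},
    "Gender": {"masc", "fem"},  # assumed type; only masc appears in the text
    "Number": {"sing", "plur"},
}


def parse_pattern(pattern):
    """Split a pattern into (name, discard) pairs.

    discard is True when the component is preceded by the deconstruction
    slash, whether the slash replaces a dot or leads the first component.
    """
    return [(name, sep == "/") for sep, name in re.findall(r"([./]?)(\w+)", pattern)]


def deconstruct(pattern, attribute_list):
    """Find the first attribute matching the whole pattern, then drop slashed parts."""
    parts = parse_pattern(pattern)
    for attribute in attribute_list:
        values = attribute.split(".")
        if len(values) != len(parts):
            continue
        # A type name matches any of its values; a value matches only itself.
        if all(v in TYPES[n] if n in TYPES else v == n
               for (n, _), v in zip(parts, values)):
            kept = [v for (_, discard), v in zip(parts, values) if not discard]
            return ".".join(kept)
    return None


print(deconstruct("Nountrait/subj", ["vtdi", "edible.subj"]))  # edible
print(deconstruct("/Gender.Number", ["masc.plur"]))            # plur
```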
The set of values of an attribute type may form a partial ordering. A partial order is a relation in which two or more elements may share a common dominating element. A genealogy is an example of this: brothers and sisters share a common parent.
We represent partial ordering with the symbols > and <, so that, to use the genealogical example, mother>daughter means that mother is superior to daughter in the hierarchy, while daughter<mother means that daughter is inferior to mother in the hierarchy.
The partial ordering relation is transitive, so several layers of precedence are possible. So, to continue with the example, if grandmother>mother and mother>daughter, then grandmother>daughter.
Partial ordering helps to capture some interesting semantic relations. Consider, for example, the following attributes:
- mobile - with the meaning 'things capable of movement'
- vehicle - with the meaning 'means of transport'
- animate - with the meaning 'alive and capable of autonomous action'
It is clear that both vehicles and humans are capable of movement. This can be captured by the following attribute specification:
where 'mobile<vehicle' means that all vehicles are mobile.
Now, if we have a lexicon containing the fields:
where '>vehicle' means 'having the trait vehicle or any higher traits in the partial ordering', and a syntax rule like:
ROOT = N[mobile] %
then either "car" or "child" may be chosen.
Alternatively, to capture the fact that verbs, for example, may select partially ordered subsets of arguments, we might have lexical entries like:
"eat"|V|ingest, <animate.subj, <edible.obj, ...
"apple"|N|>fruit, ...
"turnip"|N|>vegetable, ...
"boy"|N|>human, ...
"dog"|N|>animal, ...
and attributes like:
Entity(animate<human, animate<animal, ... edible<fruit, edible<vegetable, ... )
Action(ingest, move, ... )
Function(subj, obj)
then a syntax like:
ROOT = N[animate] V[ingest, animate.subj, edible.obj] N[edible] %
will generate, among other things:
boy eat apple
dog eat turnip
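The selection of nouns in this example can be modelled with a short Python sketch. It assumes, by analogy with 'mobile<vehicle', that 'edible<fruit' means all fruits are edible, and that an entry such as '>fruit' carries fruit plus every higher trait. The names `ORDER`, `LEXICON`, `higher_traits`, `traits`, and `select` are hypothetical, and the verb's '<animate.subj' entries are not modelled here.

```python
# Each pair (general, specific) models an attribute entry such as 'edible<fruit'.
ORDER = [
    ("animate", "human"),
    ("animate", "animal"),
    ("edible", "fruit"),
    ("edible", "vegetable"),
]

# Noun entries from the example lexicon, reduced to word -> attribute field.
LEXICON = {"apple": ">fruit", "turnip": ">vegetable", "boy": ">human", "dog": ">animal"}


def higher_traits(value):
    """Return the value plus every trait above it, following the ordering transitively."""
    found = {value}
    frontier = [value]
    while frontier:
        current = frontier.pop()
        for general, specific in ORDER:
            if specific == current and general not in found:
                found.add(general)
                frontier.append(general)
    return found


def traits(entry):
    """Expand '>x' to x and its higher traits; a plain value stands for itself only."""
    return higher_traits(entry[1:]) if entry.startswith(">") else {entry}


def select(required):
    """Return the words whose expanded traits include the required value."""
    return sorted(word for word, entry in LEXICON.items() if required in traits(entry))


print(select("animate"))  # ['boy', 'dog']
print(select("edible"))   # ['apple', 'turnip']
```

Under these assumptions, N[animate] can pick "boy" or "dog" and N[edible] can pick "apple" or "turnip", which is consistent with the generated utterances shown above.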
Of course, if we do not want a particular noun to participate in a partial ordering, we may simply avoid adding the > or < in its lexical entry.
Judicious use of partial ordering can produce complex semantic networks, thereby capturing some lexico-semantic relations. We will see later that lexical pointers provide a valuable complement to this approach.
By accident or design, we often shorten commonly used compound nouns. Thus:
- attribute means attribute value.
- Number value, for example, means a value of type Number. By extension, Nountrait.subj value means a value matching this pattern (which presumably contains at least one type, otherwise we would speak of the attribute human.subj).
- The term compound attribute includes simple attribute as a special case. If the distinction matters, we use simple and non-simple attribute.
- By analogy, compound attribute pattern includes simple attribute pattern.