vinci STore and REcover

The STore command

In this section, we will illustrate and discuss the several textual representations produced by the STore command. To do this, we make use of examples produced for the utterance:

    the fairy godmother passed a magic sword to the prince

The utterance is, of course, a variation on those associated with our much-loved generous professor. In this case, it is a good supernatural being who presents a magic artifact to the hero of a fairy tale, to assist him in rescuing the victim/heroine.

The sentence, taken from a generated fairy story, was created using a global preselection (a dramatis personae):

    PRESELECT =
        twit:       PN/"Midas";
        hero:       PN[male, brave, handsome];
        victim:     PN/_pre_ twit/@14: daughter;
        villain:    PN/"Merlin";
        goodfairy:  PN[female, good, supernatural];
        magicobj:   N[physobj, magic]
    %

a local preselection for the current utterance:

    PRESELECT =
        vtdi;
        action:      V[give];
        agent:       PN/_pre_ goodfairy;
        beneficiary: PN/_pre_ hero;
        theme:       N/_pre_ magicobj
    %

and syntax for vtdi sentences:

    ROOT =

    ...

    < _pre_ vtdi:
        NPP[agent, def, nomin]
        V[p3, sing, past]/_pre_ action
        (
            {this "dative shift" applies only to some vtdi verbs}
            NPP[beneficiary, def, accus]
            NPT[theme, indef, accus]
        |
            NPT[theme, indef, accus]
            NPP[beneficiary, def, dative]
        )

    ...

    %

where PN is a word category for proper nouns, NPP is a noun phrase for proper nouns (which can yield "Midas", "the king", "a king" or a pronoun, depending on the attributes), NPPX is a subsidiary used by NPP, and NPT is a common noun phrase (yielding "the sword" or "a sword"; not currently a pronoun). The attribute dative, and certain others, cause both NPP and NPT to prefix their phrase with an appropriate preposition.

Note that the syntax allows for the so-called English dative shift: "she gave the prince a sword", as an alternative to: "she gave a sword to the prince".

Single Tree

In the first instance, let us assume that the cluster of trees produced by the generation consists of a single ROOT tree.

The full tree representation in path-signature style, produced by the command STore 1 | 0, is:

$u
$t 999999
N|G|ROOT||0|""||
S|G|NPP|agent, def, nomin|0|""||
SS|G|NPPX|agent, def, nomin|0|""||
SSS|G|DET|def|0|""|LEX_7|
SSSR|G|PN||1|""|LEX_4|
SR|G|V|p3, sing, past|0|""|LEX_6|
SRR|G|NPT|theme, indef, accus|0|""||
SRRS|G|DET|indef|0|""|LEX_8|
SRRSR|G|N||0|""|LEX_5|
SRRR|G|NPP|beneficiary, def, dative|0|""||
SRRRS|G|PREP|dative|0|""|LEX_9|
SRRRSR|G|NPPX|beneficiary, def, accus|0|""||
SRRRSRS|G|DET|def|0|""|LEX_10|
SRRRSRSR|G|PN||1|""|LEX_1|
$t 0
N|G|ROOT||0|""||
S|G|NPP|agent, def, nomin|0|""||
SS|G|NPPX|agent, def, nomin|0|""||
SSS|G|DET|def|0|"the"|LEX_7|
SSSR|G|PN||1|"fairy godmother"|LEX_11|
SR|G|V|p3, sing, past|0|"passed"|LEX_6|
SRR|G|NPT|theme, indef, accus|0|""||
SRRS|G|DET|indef|0|"a"|LEX_8|
SRRSR|G|N||0|"magic sword"|LEX_5|
SRRR|G|NPP|beneficiary, def, dative|0|""||
SRRRS|G|PREP|dative|0|"to"|LEX_9|
SRRRSR|G|NPPX|beneficiary, def, accus|0|""||
SRRRSRS|G|DET|def|0|"the"|LEX_10|
SRRRSRSR|G|PN||1|"prince"|LEX_12|
$L
{LEX_0} "Midas"|PN|male, rich, vain, weak||#1||||||||type:"king"/N|daughter:"Marie"|home:"castle"/N|
{LEX_1} "Braveheart"|PN|male, kill.monster, handsome, brave, strong||#1||||||||type:"prince"/N||
{LEX_2} "Marie"|PN|female, beautiful, kind||#1||||||||type:"princess"/N||
{LEX_3} "Merlin"|PN|male, bad, supernatural||#1||||||||type:"sorcerer"/N||
{LEX_4} "Wanda"|PN|female, good, supernatural||#1||||||||type:"fairy godmother"/N||
{LEX_5} "magic sword"|N|neuter, magic, physobj||#1||
{LEX_6} "pass"|V|Number, Personne, Tense, give||$v7||"passed"|
{LEX_7} "the"|DET|def||#1||
{LEX_8} "a"|DET|indef||#1||
{LEX_9} "to"|PREP|dative||#1||
{LEX_10} "the"|DET|def||#1||
{LEX_11} "fairy godmother"|N|female||#1||
{LEX_12} "prince"|N|male||#1||
%

The textual representation begins with $u and ends with %. (In the intended design, the identifier for an individual cluster of trees was to appear on the $u line.) For a ROOT tree only, it has three parts, heralded by:

    $t 999999
    $t 0
    $L

The first part, $t 999999, describes the ROOT tree before it has undergone any syntax transformations. The second, $t 0, describes the transformed ROOT tree, the final tree for the utterance. In this case the tree shape is unchanged from the first tree, though its leaf nodes now include the words created by the morphology. The third part, $L, contains copies of all lexicon entries used in the generated sentences.
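The three-part layout can be separated mechanically. The following is a minimal sketch in Python, not part of vinci itself; the function name split_sections and the abridged sample are illustrative assumptions.

```python
def split_sections(text):
    """Group the lines of a STore dump under their "$..." header lines."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line == "$u":
            continue          # skip blank lines and the opening marker
        if line == "%":
            break             # closing marker: nothing follows it
        if line.startswith("$"):
            current = line    # e.g. "$t 999999", "$t 0", "$L"
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


# Abridged sample: only two nodes per tree and one lexicon entry are kept.
sample = '''$u
$t 999999
N|G|ROOT||0|""||
S|G|V|p3, sing, past|0|""|LEX_6|
$t 0
N|G|ROOT||0|""||
S|G|V|p3, sing, past|0|"passed"|LEX_6|
$L
{LEX_6} "pass"|V|Number, Personne, Tense, give||$v7||"passed"|
%'''

sections = split_sections(sample)
print(list(sections))
```

The same loop works unchanged for the leaf-list form, since it treats any line starting with $ as a section header.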

In the tree sections, each line represents a single tree node, and has seven fields, containing its:

The last five fields are self-explanatory.

The path signatures define the shape of the tree. The root node has path N (which actually denotes an empty path). In all other cases, the path describes how to get to the node from the root. Reading it sequentially, S tells us to go to the leftmost child of the current position, and R tells us to go to the next right sibling.

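Thus, in the present example, S is the leftmost child of ROOT (the agent NPP) and SR is that node's right sibling (V). SSSR is reached from ROOT by descending three times to a leftmost child (NPP, NPPX, DET) and then stepping once to the right, arriving at the PN node.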

We will not discuss the properties of the path notation here. Suffice it to say that the nodes are in an order (parent before child, left sibling before right) which allows the tree to be easily rebuilt node by node.
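A minimal Python sketch of such a node-by-node rebuild follows. It is an illustration only, not the code used by REcover. The class Node, the functions build_tree and leaves, and the field name flag (for the fifth field, whose meaning is not covered here) are assumptions made for the example.

```python
class Node:
    def __init__(self, colour, name, attrs, flag, string, lex):
        self.colour, self.name, self.attrs = colour, name, attrs
        self.flag = flag                  # fifth field; name is an assumption
        self.string = string.strip('"')   # word produced by the morphology
        self.lex = lex                    # e.g. "LEX_7", or "" if none
        self.parent = None
        self.children = []


def build_tree(lines):
    """Rebuild a tree from path-signature lines (parent before child)."""
    by_path = {}
    root = None
    for line in lines:
        fields = line.split("|")
        path, node = fields[0], Node(*fields[1:7])
        if path == "N":                   # N denotes the empty path
            root = by_path[""] = node
            continue
        anchor = by_path[path[:-1]]       # node reached by all but the last step
        # Final S: first child of anchor. Final R: next sibling of anchor.
        parent = anchor if path[-1] == "S" else anchor.parent
        node.parent = parent
        parent.children.append(node)
        by_path[path] = node
    return root


def leaves(node):
    """Return the leaf nodes below node, left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


T0 = '''
N|G|ROOT||0|""||
S|G|NPP|agent, def, nomin|0|""||
SS|G|NPPX|agent, def, nomin|0|""||
SSS|G|DET|def|0|"the"|LEX_7|
SSSR|G|PN||1|"fairy godmother"|LEX_11|
SR|G|V|p3, sing, past|0|"passed"|LEX_6|
SRR|G|NPT|theme, indef, accus|0|""||
SRRS|G|DET|indef|0|"a"|LEX_8|
SRRSR|G|N||0|"magic sword"|LEX_5|
SRRR|G|NPP|beneficiary, def, dative|0|""||
SRRRS|G|PREP|dative|0|"to"|LEX_9|
SRRRSR|G|NPPX|beneficiary, def, accus|0|""||
SRRRSRS|G|DET|def|0|"the"|LEX_10|
SRRRSRSR|G|PN||1|"prince"|LEX_12|
'''.strip().splitlines()

root = build_tree(T0)
utterance = " ".join(leaf.string for leaf in leaves(root))
print(utterance)  # the fairy godmother passed a magic sword to the prince
```

Because every node arrives after both its parent and its left sibling, a single pass with a dictionary keyed by path is enough; no look-ahead is needed.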

In our environment, the trees are sometimes transmitted to, or imported by, a program called disptree, which displays them graphically. The colour field tells disptree what colour to use for the node, in every case here G (green). Nodes in other colours were sometimes added later.

disptree was oriented to an earlier environment (Solaris 8 and OpenWindows), and has not been compiled for any other. In preference to making it available, we may, at some future time, add a further style to STore to produce data for a commonly available download such as Graphviz. The "commands" ($u, $t, $L, ...) are part of a much larger set, which can serve to direct a more comprehensive user interface.
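Until such a style exists, path-signature output can be converted externally. The sketch below is a hypothetical Python converter to Graphviz DOT text, not a vinci feature; the function name to_dot and the node-naming scheme are assumptions, and it presumes that node strings contain no double quotes.

```python
def to_dot(lines):
    """Turn path-signature node lines into Graphviz DOT text."""
    parent_of = {}                        # path -> path of its parent
    out = ["digraph tree {"]
    for line in lines:
        fields = line.split("|")
        path, name, string = fields[0], fields[2], fields[5].strip('"')
        key = "" if path == "N" else path   # N denotes the empty path
        label = name if not string else f"{name}\\n{string}"
        out.append(f'  "n{key}" [label="{label}"];')
        if key:
            prefix = key[:-1]
            # Final S: child of prefix. Final R: sibling of prefix.
            parent = prefix if key[-1] == "S" else parent_of[prefix]
            parent_of[key] = parent
            out.append(f'  "n{parent}" -> "n{key}";')
    out.append("}")
    return "\n".join(out)


# Abridged tree: ROOT with a verb and a two-word noun phrase.
nodes = '''
N|G|ROOT||0|""||
S|G|V|p3, sing, past|0|"passed"|LEX_6|
SR|G|NPT|theme, indef, accus|0|""||
SRS|G|DET|indef|0|"a"|LEX_8|
SRSR|G|N||0|"magic sword"|LEX_5|
'''.strip().splitlines()

dot = to_dot(nodes)
print(dot)
```

The resulting text can be saved to a file and rendered with the dot program.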

The lexicon section contains all the lexicon entries used in the tree cluster, making the representation independent of any changes to the lexicon between storage and recovery. It also avoids the need for REcover (see below) to re-scan the lexicon or to re-perform indirections in order to carry out student error-checking.

The order of the LEX entries is dependent on the order in which tree nodes were developed. In this example, LEX_0 through LEX_5 are the global preselections, LEX_6 the local one. (Only one lexicon entry, action, is preselected locally; the other local preselections involve references to global ones.) LEX_7 through LEX_10 are the various determiners and prepositions requested by the syntax. LEX_11 and LEX_12 arise from LEX_4 and LEX_1 respectively, by way of indirections.

LEX entries may be marked unused, implying that vinci did not search the lexicon for some terminal node. This arises in the example of The Generous Professor if his gift, say, is pronominalized. Since the syntax itself specifies its gender and the noun node is discarded by the pronominal transformation, there is no need for vinci to select the noun entry itself.

Note that in the example above, LEX_0, LEX_2 and LEX_3 are not used in the utterance. They are not marked unused, however, because they have corresponding lexicon entries.

LEX entries may also display an error message if the corresponding search was unsuccessful.
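As an illustration of how the lexicon section makes the representation self-contained, the Python sketch below reads $L lines into a dictionary keyed by label, so that the LEX reference on a tree node can be resolved without consulting the original lexicon. It is an assumption-laden sketch: the name parse_lexicon is invented, only the first three fields (word, category, attributes) are interpreted, and entries marked unused or carrying an error message are not handled because their layout is not shown here.

```python
import re

# "{LEX_4} ..." : a label in braces followed by the entry proper
ENTRY = re.compile(r'^\{(LEX_\d+)\}\s+(.*)$')


def parse_lexicon(lines):
    """Map each LEX label to its word, category and attribute list."""
    entries = {}
    for line in lines:
        match = ENTRY.match(line)
        if not match:
            continue                      # not a plain lexicon entry line
        label, rest = match.groups()
        fields = rest.split("|")
        entries[label] = {
            "word": fields[0].strip('"'),
            "category": fields[1],
            "attributes": [a.strip() for a in fields[2].split(",") if a.strip()],
            "rest": fields[3:],           # remaining fields kept uninterpreted
        }
    return entries


lexicon_lines = [
    '{LEX_4} "Wanda"|PN|female, good, supernatural||#1||||||||type:"fairy godmother"/N||',
    '{LEX_6} "pass"|V|Number, Personne, Tense, give||$v7||"passed"|',
    '{LEX_11} "fairy godmother"|N|female||#1||',
]
entries = parse_lexicon(lexicon_lines)

# Resolve the lexicon reference carried by a tree node line.
node_line = 'SSSR|G|PN||1|"fairy godmother"|LEX_11|'
ref = node_line.split("|")[6]
print(ref, entries[ref]["word"], entries[ref]["category"])
```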

Indentation Style

The full tree in indentation style, produced by the command STore 2 | 0, corresponding to $t 0, is:

$t 0
ROOT||0|""||
  NPP|agent, def, nomin|0|""||
    NPPX|agent, def, nomin|0|""||
      DET|def|0|"the"|LEX_7|
      PN||1|"fairy godmother"|LEX_11|
  V|p3, sing, past|0|"passed"|LEX_6|
  NPT|theme, indef, accus|0|""||
    DET|indef|0|"a"|LEX_8|
    N||0|"magic sword"|LEX_5|
  NPP|beneficiary, def, dative|0|""||
    PREP|dative|0|"to"|LEX_9|
    NPPX|beneficiary, def, accus|0|""||
      DET|def|0|"the"|LEX_10|
      PN||1|"prince"|LEX_12|

This is visually more friendly to a human reader. We can clearly see the four children of ROOT: NPP, V, NPT and NPP, along with their children and grandchildren. It is a little less convenient for a subsequent computer algorithm.
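The extra work for a program lies in tracking depth. A minimal Python sketch, assuming two spaces of indentation per level as in the listing above, keeps a stack of the most recent node at each depth; the name parse_indented and the dictionary layout are illustrative choices, and the sample is abridged.

```python
def parse_indented(lines, indent=2):
    """Rebuild a tree from indentation-style node lines."""
    root = None
    stack = []                            # stack[d] = latest node at depth d
    for line in lines:
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // indent
        fields = stripped.split("|")
        node = {
            "name": fields[0],
            "attrs": fields[1],
            "string": fields[3].strip('"'),
            "lex": fields[4],
            "children": [],
        }
        del stack[depth:]                 # drop nodes at this depth or deeper
        if stack:
            stack[-1]["children"].append(node)
        else:
            root = node
        stack.append(node)
    return root


# Abridged sample: ROOT with a verb and one noun phrase.
indented = [
    'ROOT||0|""||',
    '  V|p3, sing, past|0|"passed"|LEX_6|',
    '  NPT|theme, indef, accus|0|""||',
    '    DET|indef|0|"a"|LEX_8|',
    '    N||0|"magic sword"|LEX_5|',
]
tree = parse_indented(indented)
print([child["name"] for child in tree["children"]])
```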

List of Leaf Nodes

The abbreviated output, with leaf nodes only, produced by STore 0 | 0 for the same utterance is:

$u
$l 999999
DET|def|0|""|LEX_7|
PN||1|""|LEX_4|
V|p3, sing, past|0|""|LEX_6|
DET|indef|0|""|LEX_8|
N||0|""|LEX_5|
PREP|dative|0|""|LEX_9|
DET|def|0|""|LEX_10|
PN||1|""|LEX_1|
$l 0
DET|def|0|"the"|LEX_7|
PN||1|"fairy godmother"|LEX_11|
V|p3, sing, past|0|"passed"|LEX_6|
DET|indef|0|"a"|LEX_8|
N||0|"magic sword"|LEX_5|
PREP|dative|0|"to"|LEX_9|
DET|def|0|"the"|LEX_10|
PN||1|"prince"|LEX_12|
$L
{LEX_0} "Midas"|PN|male, rich, vain, weak||#1||||||||type:"king"/N|daughter:"Marie"|home:"castle"/N|
{LEX_1} "Braveheart"|PN|male, kill.monster, handsome, brave, strong||#1||||||||type:"prince"/N||
{LEX_2} "Marie"|PN|female, beautiful, kind||#1||||||||type:"princess"/N||
{LEX_3} "Merlin"|PN|male, bad, supernatural||#1||||||||type:"sorcerer"/N||
{LEX_4} "Wanda"|PN|female, good, supernatural||#1||||||||type:"fairy godmother"/N||
{LEX_5} "magic sword"|N|neuter, magic, physobj||#1||
{LEX_6} "pass"|V|Number, Personne, Tense, give||$v7||"passed"|
{LEX_7} "the"|DET|def||#1||
{LEX_8} "a"|DET|indef||#1||
{LEX_9} "to"|PREP|dative||#1||
{LEX_10} "the"|DET|def||#1||
{LEX_11} "fairy godmother"|N|female||#1||
{LEX_12} "prince"|N|male||#1||
%

The trees are replaced by lists of their leaf nodes; the list headers are marked by $l (lowercase L) instead of $t; and path and colour fields are omitted. Otherwise the representation is the same.

REcover creates a tree corresponding to $l 0 with ROOT as root and the eight lines as children.
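The flat tree can be pictured with the following Python sketch. It only mimics the shape described above and is not REcover's actual code; the name recover_flat and the dictionary layout are assumptions.

```python
def recover_flat(leaf_lines):
    """Build a ROOT node whose children are the listed leaf nodes."""
    root = {"name": "ROOT", "children": []}
    for line in leaf_lines:
        # Leaf lines have no path or colour field.
        name, attrs, flag, string, lex = line.split("|")[:5]
        root["children"].append({
            "name": name,
            "attrs": attrs,
            "string": string.strip('"'),
            "lex": lex,
        })
    return root


L0 = '''
DET|def|0|"the"|LEX_7|
PN||1|"fairy godmother"|LEX_11|
V|p3, sing, past|0|"passed"|LEX_6|
DET|indef|0|"a"|LEX_8|
N||0|"magic sword"|LEX_5|
PREP|dative|0|"to"|LEX_9|
DET|def|0|"the"|LEX_10|
PN||1|"prince"|LEX_12|
'''.strip().splitlines()

flat = recover_flat(L0)
print(len(flat["children"]))
print(" ".join(child["string"] for child in flat["children"]))
```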

Multiple Trees

The output for multiple trees is very similar. To illustrate this, context-free rules were added for QUESTION and for R_6, both simply being copies of ROOT. vinci selected a different 'give' verb:

    Question : the fairy godmother handed a magic sword to the prince
    R_6 : the fairy godmother handed a magic sword to the prince

The output for STore 0 | 0 is:

$u
$l 1
DET|def|0|"the"|LEX_7|
PN||1|"fairy godmother"|LEX_11|
V|p3, sing, past|0|"handed"|LEX_6|
DET|indef|0|"a"|LEX_8|
N||0|"magic sword"|LEX_5|
PREP|dative|0|"to"|LEX_9|
DET|def|0|"the"|LEX_10|
PN||1|"prince"|LEX_12|
$l 6
DET|def|0|"the"|LEX_7|
PN||1|"fairy godmother"|LEX_13|
V|p3, sing, past|0|"handed"|LEX_6|
DET|indef|0|"a"|LEX_8|
N||0|"magic sword"|LEX_5|
PREP|dative|0|"to"|LEX_9|
DET|def|0|"the"|LEX_10|
PN||1|"prince"|LEX_14|
$L
{LEX_0} "Midas"|PN|male, rich, vain, weak||#1||||||||type:"king"/N|daughter:"Marie"|home:"castle"/N|
{LEX_1} "Braveheart"|PN|male, kill.monster, handsome, brave, strong||#1||||||||type:"prince"/N||
{LEX_2} "Marie"|PN|female, beautiful, kind||#1||||||||type:"princess"/N||
{LEX_3} "Merlin"|PN|male, bad, supernatural||#1||||||||type:"sorcerer"/N||
{LEX_4} "Wanda"|PN|female, good, supernatural||#1||||||||type:"fairy godmother"/N||
{LEX_5} "magic sword"|N|neuter, magic, physobj||#1||
{LEX_6} "hand"|V|Number, Personne, Tense, give||$v7||"handed"|
{LEX_7} "the"|DET|def||#1||
{LEX_8} "a"|DET|indef||#1||
{LEX_9} "to"|PREP|dative||#1||
{LEX_10} "the"|DET|def||#1||
{LEX_11} "fairy godmother"|N|female||#1||
{LEX_12} "prince"|N|male||#1||
{LEX_13} "fairy godmother"|N|female||#1||
{LEX_14} "prince"|N|male||#1||
%

The list for the untransformed ROOT is now omitted, but lists are shown for QUESTION ($l 1) and R_6 ($l 6). Since syntax transformations and indirections are now applied independently for each tree, we see separate LEX entries for "fairy godmother" and "prince".

The corresponding outputs for STore 1 | 0 and STore 2 | 0 are self-evident and need no further comment.

Interplay of STore, SAve and REcover

The combination of STore and SAve allows ivi/vinci to create a 'replayable' grammar. For example, consider the following steps:

  1. Load a vinci grammar
  2. Generate an utterance with <Esc g>
  3. Go to a blank corefile and run the command STore
  4. SAve the data in the corefile
  5. Quit ivi
  6. Sometime later, reopen ivi
  7. Reload the grammar from step (1)
  8. Run the command REcover <filename>, where <filename> is the file in which the STored data was SAved
  9. Go to a corefile and type one or more of <Esc m> <digit> <return>

The string generated originally will appear in the corefile. Alternatively, the REcovered grammar output may be used as the basis for error analysis. So, for example, if a string appears in some corefile and the command <Esc K> is typed at the beginning of the string, vinci will compare the string against the model produced by the grammar.