vinci commands and operations

Commands for installing and uninstalling vinci language files have been described in the section on Files and file installation. This section describes other vinci commands and operations. Most have been described in detail either in the Overview or in a specific section of the Manual. We have also included a few not-yet-implemented ones. These are next on the to-do list when we take a break from writing manual sections.

Generation

Operation: Generate utterances

Purpose: Generate an utterance or cluster of utterances based on the currently installed language description.

Default Key: <ESC> g

Operation Number: 97

How many utterances form the cluster depends on the presence of the standard rootnode metavariables (ROOT, QUESTION, ANSWER, R_3, etc.) in the syntax and user files. The utterances are stored in a set of 21 buffers from which they can be retrieved by the Retrieve utterance operation.

The utterances for QUESTION, ANSWER, R_3, ..., R_20 are stored in buffers 1, 2, 3, ..., 20, respectively. If ROOT is the only rootnode, its utterance is stored in buffer 0; otherwise, the ROOT utterance is never created.
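The buffer assignment just described can be sketched in a few lines of Python. This is only an illustration of the numbering rule, not vinci's own code; the function names buffer_for and stored_buffers are hypothetical.

```python
def buffer_for(rootnode: str) -> int:
    """Return the buffer number (0-20) for a standard rootnode name."""
    if rootnode == "ROOT":
        return 0
    if rootnode == "QUESTION":
        return 1
    if rootnode == "ANSWER":
        return 2
    if rootnode.startswith("R_") and rootnode[2:].isdigit():
        n = int(rootnode[2:])
        if 3 <= n <= 20:
            return n
    raise ValueError(f"not a standard rootnode: {rootnode}")


def stored_buffers(rootnodes: set[str]) -> dict[int, str]:
    """Map buffer numbers to the rootnodes whose utterances are stored.

    The ROOT utterance is kept only when ROOT is the sole rootnode present.
    """
    others = rootnodes - {"ROOT"}
    chosen = others if others else rootnodes & {"ROOT"}
    return {buffer_for(name): name for name in chosen}
```

For instance, a cluster with ROOT, QUESTION and ANSWER fills buffers 1 and 2 only, while a description with ROOT alone fills buffer 0.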

All utterances are also displayed in the log in corefile 7. In a learner-testing environment, the learner is presumably not shown corefile 7; rather, the QUESTION is retrieved to some other corefile, where the learner will type a response.

As noted in an earlier section, vinci displays a series of progress messages on the second-to-last line of the screen during generation, showing where it has got to. On today's machines, however, these usually disappear before the user has a chance to see them, so they are now of doubtful value. (Except when vinci crashes!!)

Operation: Retrieve utterance

Purpose: Fetch an utterance from a buffer to the current corefile.

Default Key: <ESC> m <number> <RETURN>

Operation Number: 102

The operation requires a parameter, <number>, in the range 0 to 20. It appends a copy of the corresponding utterance to the end of the current corefile. Parameter 0 denotes the ROOT utterance; 1, the QUESTION; 2, the ANSWER; 3, R_3; and so on.

Non-existent utterances produce the error message:

    sentence n does not exist

on the error message line.

Reminder: If any utterance other than ROOT exists, the one corresponding to ROOT itself is not produced. In such a case, the ROOT tree is used only for the generation of the other utterances.

Example: <ESC> m 1 <RETURN> appends a copy of the QUESTION to the current corefile.

Operation: Global preselection

Purpose: Carry out global preselections.

Default Key: <ESC> <ESC> P

Operation Number: 39

Local preselections (if any) are carried out automatically before the generation of any utterance. Global preselections are carried out when the user executes this operation. The actual operations are the same except at the start. Global preselection begins by discarding all previous preselections; local, only the most recent local ones. In either case, vinci searches the syntax (both US and SY) for a PRESELECT rule, and carries out preselection for each clause in turn. The resulting preselections are placed on a list, local ones being appended to any global ones. Since the _pre_ phrases search the list from the bottom, local preselections will supersede global ones (or earlier ones) in the event of a clash of tags.
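The way local preselections supersede global ones can be modelled with a short Python sketch. It assumes, purely for illustration, that a preselection can be reduced to a (tag, value) pair; the class PreselectionList and its method names are hypothetical and do not reflect vinci's internal structures.

```python
class PreselectionList:
    """Toy model of the preselection list (global entries first, local last)."""

    def __init__(self):
        self._entries = []  # each entry is (tag, value, is_local)

    def global_preselect(self, pairs):
        """Discard all previous preselections, then record the global ones."""
        self._entries = [(tag, value, False) for tag, value in pairs]

    def local_preselect(self, pairs):
        """Discard only earlier local preselections, then append the new ones."""
        self._entries = [e for e in self._entries if not e[2]]
        self._entries += [(tag, value, True) for tag, value in pairs]

    def lookup(self, tag):
        """Search from the bottom of the list, so the latest entry wins."""
        for entry_tag, value, _ in reversed(self._entries):
            if entry_tag == tag:
                return value
        return None
```

Because lookup starts at the bottom, a local entry with the same tag as a global one hides it until the local entries are discarded.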

It is worth repeating a warning. After global preselection, it is advisable to uninstall the file containing the PRESELECT rule. (So it should probably be a US file.) If not, and if a subsequent generation takes place without the rule being discarded or superseded, it will be seen as a local rule. In that case, a new set of preselections will be created with the same tags, and these will temporarily supersede the global ones.

Storage and Retrieval

Command: STore

Purpose: Convert the trees and lexicon entries which created the most recent cluster of utterances from their internal form to a textual representation, and append this to the current corefile.

Format: ST <digit> | <digit>

Parameters: The first parameter specifies the style of the tree:

Style	Meaning
0	requests only the leaf nodes of the trees.
1	requests all nodes, each preceded by a path signature and colour
2	requests all nodes, each followed by its children indented two spaces; in their turn, these are followed by their children indented two more spaces, and so on. (This gives a good non-graphical visual representation.)

while the second parameter specifies its destination:

Destination	Meaning
0	requests transmission to the current corefile
1	requests transmission to an ivi driver program if one is active

In both cases the lexicon entries used in the utterances are displayed.
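As a rough picture of the difference between styles 0 and 2, the following Python sketch prints a small tree either as leaf nodes only or with each level of children indented two further spaces. The tree, its node labels and the function names are hypothetical, and the sketch shows the layout idea only; the actual STore text format is the one documented in The Fairy Godmother.

```python
def render_style2(node, depth=0):
    """Return one line per node, children indented two spaces per level.

    A node is a (label, children) tuple; children is a list of nodes.
    """
    label, children = node
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(render_style2(child, depth + 1))
    return lines


def leaf_labels(node):
    """Return only the leaf labels, in left-to-right order (cf. style 0)."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaf_labels(child))
    return result


if __name__ == "__main__":
    tree = ("S", [("NP", [("N", [])]), ("VP", [("V", [])])])
    print("\n".join(render_style2(tree)))
```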

Command: REcover

Purpose: Recover a cluster of trees and lexicon entries from a file containing the textual representation (style 0 or 1) produced by STore.

Format: RE <filename>

Parameter: The parameter is the name of the file containing the textual representation. If style 0 was used for STore, the recovered trees are flattened versions of the originals.

On the face of it, STore appears to be a misnomer, because the textual representation is not sent to a file, and REcover is not the exact inverse, as the names suggest. However, the command has a dual purpose. On the one hand, it allows a user to view the syntax trees. This is useful for debugging a language description, since it shows exactly what the various steps in the generation have produced. On the other hand, STore (to a clear corefile) can be followed by SAve, to produce the file from which the trees can be recovered at a future time.

The REcover command brings a stored cluster back into the vinci environment with sufficient information to permit the vinci error comparison process to be applied to the "old" sentences.

The original design allowed multiple clusters of utterances in a single file, but this was never implemented. (For the record, I believe there was to be an identifying string on the $u line, with a matching string as a second parameter to REcover.)

Details of the textual representation, along with examples, are discussed in The Fairy Godmother.

Miscellaneous

Command: RAndom seed

Purpose: Initialize the random number sequence

Format: RA <number>

Parameter: <number> is a 6- to 8-digit odd number. If the parameter is non-zero, RA uses it to reinitialize the random number sequence; otherwise it reinitializes the sequence using digits obtained from the time-of-day clock. An even number is changed to the next higher odd one. The "seed" is reported in corefile 7.

When vinci needs to make a random choice, it does so by consulting a sequence of "pseudorandom" numbers. The sequence is initialized with the help of the time-of-day clock when ivi is started up, and the initial value is reported in corefile 7. This means that on different occasions, exactly the same language files and sequences of actions will give rise to different utterances, which is what the user normally wants.

It is sometimes useful, however, especially when debugging either vinci or a language description, to be able to repeat a sequence of utterances. If RAndom is used to reinitialize the seed to an earlier value, the pseudorandom sequence, and therefore the utterances, will be repeated.
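The seed rules (zero means "use the clock", an even number becomes the next higher odd one) and the resulting repeatability can be illustrated with a Python sketch. Python's random module stands in for vinci's own generator, which is not described here, and the way clock digits are taken is an assumption; normalize_seed and choices_for are hypothetical names.

```python
import random
import time


def normalize_seed(number: int) -> int:
    """Apply the RA rules to a seed parameter."""
    if number == 0:
        # Assumption: take up to eight digits from the clock.
        number = int(time.time()) % 100_000_000
    if number % 2 == 0:
        number += 1  # an even number is changed to the next higher odd one
    return number


def choices_for(seed: int, count: int = 5):
    """Return a short sequence of pseudorandom choices for a given seed."""
    rng = random.Random(normalize_seed(seed))
    return [rng.randrange(100) for _ in range(count)]
```

Calling choices_for twice with the same non-zero seed yields the same sequence, which is the property that makes RA useful for debugging.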

Command: MPasses

Purpose: Alter the number of morphology passes and the lexical fields containing their initial expressions.

Format: MP <number> | <number> | <number> | <number> | <number>

Parameters: Up to five integers may be given. The first indicates the root node for which the passes are being altered. Thus 0, 1, 2, 3, ..., 20 denote ROOT, QUESTION, ANSWER, R_3, ..., R_20, respectively. If this parameter is 21, the command applies to the passes used to build phonological variants in the student error-checking process.

The remaining numbers specify the fields containing the initial morphology expressions for up to four passes.

Example: MP 2 | 19 | 17 | 24 causes the ANSWER tree to have three morphology passes with initial expressions in fields 19, 17 and 24.
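To make the meaning of the parameters concrete, here is a small Python sketch that splits such a command into its target and its list of fields. It assumes the | separator shown in the example; parse_mpasses and ROOT_NAMES are hypothetical names, and the sketch does not reproduce vinci's own command parsing.

```python
# Names for targets 0..20; target 21 is handled separately below.
ROOT_NAMES = ["ROOT", "QUESTION", "ANSWER"] + [f"R_{n}" for n in range(3, 21)]


def parse_mpasses(command: str):
    """Split an MP command into (target name, list of field numbers)."""
    keyword, _, rest = command.strip().partition(" ")
    if not keyword.upper().startswith("MP"):
        raise ValueError("not an MPasses command")
    numbers = [int(part) for part in rest.split("|")]
    if not 1 <= len(numbers) <= 5:
        raise ValueError("MP takes between one and five integers")
    target, fields = numbers[0], numbers[1:]
    if not 0 <= target <= 21:
        raise ValueError("first parameter must be in the range 0 to 21")
    name = ROOT_NAMES[target] if target <= 20 else "phonological variants"
    return name, fields


if __name__ == "__main__":
    print(parse_mpasses("MP 2 | 19 | 17 | 24"))
```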

Command: VFreq

Purpose: Vary the frequency of selected lexicon entries.

Format: VF <terminal node>

Parameter: The parameter is a terminal node of the kind which appears in a syntax rule. It normally has a frequency variation attachment. The terminal node serves as a lexical selection pattern. The command applies the frequency variation to all lexicon entries identified by the pattern. The effect is temporary, lasting until the lexicon is re-installed. Full details are in the Lexicon web page.

Example: VF N[masc]/+0 prevents masculine nouns from being selected.

Operation: Sleep

Purpose: Cause ivi to sleep (do nothing) for three seconds.

Default Key: <ESC> j

Operation Number: 101

In the context of generation for the purposes of testing language learners, we sometimes wish to display a QUESTION for a brief time, before erasing it and allowing the learner to type a response. This operation can be used in an ivi function or procedure to cause a three-second delay between operations. It was used in terminal-based language learning but has been superseded in web-based environments.

Debugging -- vinci itself and Language Descriptions

Command: VDisplay

Purpose: Display some of the details of vinci data structures

Format: VD <number>

Parameter: The parameter is a number from 0 through 8 selecting one of 9 alternatives. If a higher number is entered the command is ignored.

The primary purpose of the command is to help debug vinci itself and possibly a language description. Most alternatives simply let the user see that the components of a language description have been installed as intended. Some are very low-level, and may be refined in the future.

The alternatives:

0 - displays, by way of the command letters (AT, SY, MO, TB, ...) on the second-to-last line, which language description files are currently installed. (So LE indicates the presence of a lexicon, MO of morphology rules, etc.)
1 - this is the same as, and executes, ST 2 | 0
2 - this is the same as, and executes, ST 1 | 1. It was intended to transmit tree representations to disptree, a program which can run as an ivi driver program and display the trees graphically. We retain VD 2 primarily because disptree still uses it.
3 - appends extracts of the currently installed context-free syntax rules to corefile 7. The extracts are merely strings of bytes, and include some which are non-displayable (values 5, 6, etc.). ivi shows all of these as ~. Unhappily these include byte 255 (or -1), which acts as an internal signal in some of ivi's structures. It is therefore very unwise to edit a corefile containing these lines.

4, 5, 6, 7 and 8 - are analogous to 3. They display the currently installed:

4	syntax transformations
5	morphology tables
6	morphology rules
7	lexicon
8	actual preselections

Similar strictures apply to these as to 3, though 4 and 5 have no non-displayable bytes. In the case of 7 and 8, the attribute fields (field 3) are in a binary form, and again appear largely as undisplayable characters (~).

Command: TStinput

Purpose: Display the tokens produced by the common subprocess of the installation commands.

Format: TS <filename>

Parameter: The parameter is the name of any vinci language description file (including LX, MR, IP and TG files). The file is passed to the common subprocess, and the output is displayed in corefile 7. This lets the user see, for example, that macros are working as intended.

Serious warning. Use this only on very short sample files. The output contains one line per token, and might be excessive if used on a complete file, as the following example illustrates:

    linefeed
    identifier: PRESELECT
    char '='
    linefeed
    identifier: vtdi
    char ';'
    linefeed
    identifier: action
    char ':'
    identifier: V
    char '['
    identifier: give
    char ']'
    char ';'
    linefeed
    identifier: agent
    char ':'
    identifier: PN
    char '/'
    token: _pre_
    identifier: goodfairy
    char ';'

Error Comparison

An utterance typed by the user is compared to several alternative expected utterances generated by vinci, specifically the ones produced by the standard root-node metavariables ANSWER, R_3, etc. There are two comparison processes, one straightforward requiring an exact match, the other far more complex. The former was an early version, supposedly superseded by the latter, but the simple form still proves useful in many of our applications.

The Simple Process

Operation: Error compare (simple version)

Purpose: Compare a student response with an expected utterance or cluster of utterances, using exact matches.

Default Key: <ESC> k

Operation Number: 99

The student response consists of the contents of the current corefile from the cursor position to the end. This is normalized, i.e. converted to a list of words independent of layout, and compared to the expected responses in the utterance buffers 2, 3, ... If an exact match is found, an error function corresponding to the first match is executed. Error functions are defined by the VError command (below).

Two observations: a glance at the code suggests that the "exact" match actually permits vinci wildstars in the expected responses, if the user has contrived to generate them; also that the comparison begins with utterance buffer 1 (QUESTION). The latter would cause a problem if the user merely copied the question as the response. It should probably be corrected.
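The simple process can be sketched in Python as follows. The sketch assumes that normalization amounts to splitting on whitespace (whether vinci also folds case or treats punctuation specially is not stated here), it ignores wildstars, and it returns the number of the first matching buffer rather than executing an error function. The names normalize and simple_compare are hypothetical, and the first_buffer parameter exists only to show the effect of starting at buffer 1 instead of 2.

```python
def normalize(text: str):
    """Convert text to a list of words independent of layout."""
    return text.split()


def simple_compare(response: str, buffers: dict, first_buffer: int = 2):
    """Return the number of the first buffer matching the response exactly.

    buffers maps buffer numbers to expected utterances; None means no match.
    """
    words = normalize(response)
    for number in range(first_buffer, 21):
        expected = buffers.get(number)
        if expected is not None and normalize(expected) == words:
            return number
    return None
```

With first_buffer set to 1, a response that merely repeats the QUESTION is reported as a match, which is the problem noted above.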

Command: VError

Purpose: Define an error function

Format: VE <number>|<function string>

Parameters: <number> is the number of the error function being defined; the second parameter specifies the sequence of keys which the function is to be equivalent to.

This is precisely analogous to the ivi FUnction command, except that it defines the error functions, which are triggered by the Error compare (simple) operation. Error function numbers must be less than or equal to 20.

The Full Process

Operation: Error compare (full version)

Purpose: Compare a student response with an expected utterance or cluster of utterances, using the complex error matching process.

Default Key: <ESC> K

Operation Number: 103

The vinci error matching process will be described in a subsequent section. Suffice it to say here that the process detects differences in word order, omission and insertion of words, and certain variations (lexical, morphological, phonological or orthographical) within words. It is helped in this process by morphological, phonological and lexical variants files, which have been mentioned in the Files and Installation section.

The process provides a detailed report of differences between expected and actual utterances, which is appended to corefile 15.

In its present form, the process assigns a score to each report. If there is more than one alternative expected utterance (i.e. more than one root node from ANSWER, R_3, ...), only the match with the best score is reported. In some applications it would be better to give the user control over this. To remedy this, a further operation and a new command are provided which allow the user to specify an utterance number for the comparison.

The new operation takes the form:

     <ESC> <CTRL> K <number> <RETURN>

This compares the student response with a single root (for example, 9 corresponds to R_9). The operation produces a report including a score. (I will probably cause the score to be entered in the result register to simplify use by a driver program.) The number can be 1 through 20.

Obviously this gives the user free rein to test whichever generated sentences he/she wants.

The new command takes the form:

VRootset <number>, <number>, ...

where the sequence of numbers corresponds to the set of roots to be compared. So, for example, VR 3,5,13 sets the current root set to 3, 5 and 13, so that only R_3, R_5 and R_13 are compared.

VR with no arguments resets the root set to the default.
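A Python sketch of how such a root set might be read from the command text is given below. The contents of the default root set are an assumption (here taken to be ANSWER and R_3 to R_20, i.e. numbers 2 to 20), as is the permitted range of 1 to 20, borrowed from the operation described above; parse_vrootset is a hypothetical name.

```python
# Assumption: the default root set is ANSWER, R_3, ..., R_20.
DEFAULT_ROOT_SET = frozenset(range(2, 21))


def parse_vrootset(command: str, default=DEFAULT_ROOT_SET):
    """Return the set of root numbers named by a VRootset command."""
    parts = command.strip().split(None, 1)
    if len(parts) < 2:
        return set(default)  # no arguments: reset to the default
    roots = {int(item) for item in parts[1].split(",") if item.strip()}
    if any(not 1 <= root <= 20 for root in roots):
        raise ValueError("root numbers must be in the range 1 to 20")
    return roots


if __name__ == "__main__":
    print(sorted(parse_vrootset("VR 3,5,13")))
```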

Morphological Enumeration

Operation: Enumerate morphology

Purpose: List morphological variants of words on the current ROOT-tree

Default Key: <ESC> L

Operation Number: 104

This operation arose as a by-product of the error comparison process. It has been used to check the morphology of lexical entries, and for language-learning applications. The operation enumerates the morphological variants of leafnodes on the last generated ROOT-tree. The variants to be enumerated are determined by the currently installed MR file. The enumerations are appended to corefile 7.

Typical output for the present indicative tense of the French verb dire takes the form:

    plur, p3, prés; disent
    plur, p2, prés; dites
    plur, p1, prés; disons
    sing, p3, prés; dit
    sing, p2, prés; dis
    sing, p1, prés; dis

in which each variant is accompanied by the list of attributes which produce it. The enumeration is produced in reverse order.
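A driver program that wants to use this output could split each line at the semicolon, as in the Python sketch below. The line layout is inferred from the sample only, and parse_enumeration is a hypothetical name; reversing the parsed list restores the more conventional order.

```python
def parse_enumeration(lines):
    """Turn lines like 'plur, p3, prés; disent' into (attributes, form) pairs."""
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        attributes, _, form = line.partition(";")
        names = [name.strip() for name in attributes.split(",")]
        result.append((names, form.strip()))
    return result


if __name__ == "__main__":
    sample = ["plur, p3, prés; disent", "sing, p1, prés; dis"]
    for names, form in reversed(parse_enumeration(sample)):
        print(form, names)
```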

Since the operation sometimes produces results which, while correct, are not what the user was expecting, it is necessary to describe more precisely what the operation does. This will form part of the subsequent section on student error-checking.

Lexical Transformations

A fuller description of the lexical transformation process, along with examples, can now be found on a separate page.

Command: LXtransf

Purpose: Install a lexical transformations file.

Format: LX <filename>

Parameter: The parameter is the name of a file containing one or more numbered lexical transformation rules.

The command installs a file containing lexical transformations and discards any installed previously. This is analogous to other installation commands, and reports any errors in corefile 7.

Command: RXtransf

Purpose: Select a numbered lexical transformation rule among those installed by the LXtransf command

Format: RX <number>

Operation: Generate lexical entries

Purpose: Start (or restart) the generation of lexical entries based on the selected rule.

Default Key: <ESC> l (i.e. lower-case L)

Operation Number: 98

The operation creates new lexical entries from the current lexicon by applying the rule selected with the LXtransf and RXtransf commands. The new lexical entries, which are flagged as ivi records, are appended to the current corefile. The operation can be interrupted with the ivi Abort key. If it is, and if the operation is executed again without a further LX or RX command, creation resumes where it left off.

Warning: There is currently nothing to prevent the generated entries overflowing the total available corefile space and crashing the program. If the entries are about 50 characters long and the corefiles empty, the total space should accept about 80,000 entries. The user is therefore advised to interrupt generation at around 50,000, saving the corefile, clearing it and resuming. A running count is displayed.
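The figures in this warning imply a total of roughly 80,000 x 50 = 4,000,000 characters of corefile space. The Python sketch below uses that derived figure to estimate how many entries fit for other entry lengths; it is back-of-envelope arithmetic only, it ignores any per-line overhead, and entries_that_fit is a hypothetical name.

```python
# Derived from the figures above: about 80,000 entries of about 50 characters.
TOTAL_CHARS = 80_000 * 50


def entries_that_fit(average_length: int, used_chars: int = 0) -> int:
    """Estimate how many entries of a given average length still fit."""
    return (TOTAL_CHARS - used_chars) // average_length


if __name__ == "__main__":
    for length in (50, 80, 100):
        print(length, entries_that_fit(length))
```

On these assumptions, stopping at 50,000 entries of 50 characters uses about 2,500,000 characters, which leaves a comfortable margin.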

There may be a problem on a very fast computer, in that the creation of entries may take place too quickly for the user to react. Pending changes to the program to prevent overflow, the user should consider installing the lexicon in sections to prevent very large yields in a single step.

Semantic expressions and transformations

Operation: Transform semantic expression

Purpose: Make a preselection rule from a semantic expression

Default Key: <ESC> H

Operation Number: 106

This feature is currently being redeveloped. It transforms the semantic expression found on the current cursor line to a preselection rule, based on the semantic transformations in the SM file.

The intended scenario is that a user prepares a list of semantic expressions in a corefile. He/she moves the cursor to each in turn, generates a corresponding preselection rule, installs it, generates a sentence, brings this to the corefile, and then proceeds to the next expression. In this way, he/she creates a paragraph based on the list of expressions.

The first form for semantic expressions and semantic transformations was designed and implemented several years ago, and we have used it in a number of projects; for example, in the generation of fairy tales. The semantic expressions are intended to capture meaning in a language-independent way. They can then be turned into preselections, and thus into text, using language-dependent semantic transformations and syntax. With this approach, we generated corresponding fairy tales both in English and French; and, in fact, the preselections themselves proved to be language-independent. Although these languages are closely related, it is clear that we could also have generated the same tales in many other languages.

Our current semantic expressions and transformations are, however, not adequately general to capture many kinds of meaning that we want to express. We and our students have been conducting research in this area, and the current semantic expressions and transformations will be replaced by enhanced ones in the future.