vinci error analysis

Some history

vinci was originally designed to provide a tool for generative computer aided language learning (CALL). In that role, it is distinct from two other approaches: a) canned learning materials, where the designer manually enters everything a learner will see, and b) open-ended tools like grammar and spelling checkers designed to deal with unstructured input. In addition, the generative paradigm opens the door to an adaptive learning approach, where a learner's performance on preceding materials is used to calculate what is shown subsequently.

Generation of questions and answers

The actual generation of questions and answers follows the framework presented earlier, where a syntax rule produces, by means of transformations, QUESTION, ANSWER and potentially variant trees (R_3, R_4, etc.) and where lexical pointers and morphology rules permit the creation of regular variants. So, in a simple case, a syntax like:

    ROOT = CHOOSE Ge: Genre, No: Nombre;

        ADJ[french, Ge, No] %

    QUESTION = ROOT %

    ANSWER = MAKE_ENG:ROOT %

    MAKE_ENG = TRANSFORMATION
    ADJ[french] : 1[english, -Genre, -Nombre];
    %

will produce a question composed of a French adjective of variable gender and number and an answer composed of its equivalent in English (minus gender and number, since these are not used in English). This assumes lexical entries like:

"rond"|ADJ|Langue, Genre, Nombre||$adjreg|||"round"|
"petit"|ADJ|Langue, Genre, Nombre||$adjreg|||"little"|

and so on, as well as a morphology rule like:

    rule adjreg
        english : #8;
        fem, plur : #1 + "es";
        fem, sing : #1 + "e";
        masc, plur : #1 + "s";
        masc, sing : #1;
    %
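To make the selection among guarded subrules concrete, here is a minimal Python sketch of what a rule like adjreg computes. It is an illustration only, not vinci's implementation: the function name, the dictionary used for the lexical entry, and the assumption that the first guard whose attributes all hold is the one applied are ours.

```python
def adjreg(entry, attributes):
    """Return the form built by the first subrule whose guard is satisfied.

    entry:      dict mapping a field number to its content
                (1 = French headword, 8 = English equivalent)
    attributes: set of attribute values, e.g. {"french", "fem", "plur"}
    """
    # Guards and actions mirror the adjreg rule, in the same order.
    subrules = [
        ({"english"}, lambda e: e[8]),
        ({"fem", "plur"}, lambda e: e[1] + "es"),
        ({"fem", "sing"}, lambda e: e[1] + "e"),
        ({"masc", "plur"}, lambda e: e[1] + "s"),
        ({"masc", "sing"}, lambda e: e[1]),
    ]
    for guard, build in subrules:
        if guard <= attributes:  # every attribute in the guard is present
            return build(entry)
    raise ValueError("no subrule matches " + repr(sorted(attributes)))


rond = {1: "rond", 8: "round"}
print(adjreg(rond, {"french", "fem", "plur"}))  # rondes
print(adjreg(rond, {"english"}))                # round
```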

As an alternative, use of the $0 frequency adjustment (remember: this reduces a lexical item's frequency to 0 so it won't be chosen again) allows for multiple choice questions generated by a syntax like:

    ROOT = CHOOSE Ge: Genre, No: Nombre;

        ADJ[french, Ge, No]/$0 ADJ[french, Ge, No]/$0  
    %

    QUESTION = MAKE_FRN:ROOT %

    ANSWER = MAKE_ENG:ROOT %

    MAKE_FRN = TRANSFORMATION
    ADJ[french] ADJ[french] : 1;
    %

    MAKE_ENG = TRANSFORMATION
    ADJ[french] ADJ[french] : 1[english, -Genre, -Nombre];
    %

Note that this simple example produces two French adjectives, with the first one being the correct one. It is assumed that some subsequent process will 'shuffle' their order, although this could be done with a vinci grammar as well. The answer is the appropriate English word. Note also that constant use of the $0 operator will eventually reduce the available lexical items to 0, so it is usual to 'refresh' the list between questions by the VFreq command to reset frequencies to some positive value.
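The 'shuffle' step left to a subsequent process might look like the following Python sketch. Everything in it is hypothetical: the function name, the list of (French, English) pairs (of which only "rond" and "petit" come from the lexicon above), and the use of a working copy of the list to imitate the effect of $0 followed by a frequency reset.

```python
import random

# Illustrative mini-lexicon of (French, English) pairs; the last two are
# additions made up for this sketch.
LEXICON = [("rond", "round"), ("petit", "little"),
           ("grand", "tall"), ("vert", "green")]


def make_item(lexicon, rng):
    """Build one two-way multiple choice item."""
    remaining = list(lexicon)    # fresh copy: plays the role of reset frequencies
    rng.shuffle(remaining)
    correct = remaining.pop()    # a drawn item leaves the pool, as with /$0
    distractor = remaining.pop() # so the distractor can never equal the answer
    choices = [correct[0], distractor[0]]
    rng.shuffle(choices)         # hide the fact that the first one was correct
    return {"choices": choices, "answer": correct[1]}


item = make_item(LEXICON, random.Random(0))
print(item)
```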

This simple example illustrates the basic mechanisms, but it is useful to bear in mind that comparable grammars could call images or sounds as questions or answers. This is done on the VinciLingua site, which functions on a web page, with ivi/vinci as a back end.

Once questions and answers have been generated, and learner input obtained, vinci permits two types of error analysis, simple and complex.

The first, which is based on simple string matching and requires less processing, is most useful when the set of answers is simple and predictable, as in multiple choice questions and answers. The second, which may include many levels of analysis, comes into its own in the case of more complex student answers. We will discuss each of these in turn.

Simple error analysis

vinci's simple error analysis is triggered by the <ESC k> key sequence and applies to anything in the current corefile's text zone beyond the current cursor position. So a typical sequence of actions would include:

  1. define the system's response to each input (see below)
  2. generate a question/answer set
  3. emit the appropriate question
  4. open an empty corefile
  5. place the cursor at the start of the corefile
  6. enter the learner's response
  7. place the cursor at the start of the learner's response
  8. enter the sequence <ESC k>

Within this sequence, the system's analysis is based on the VError command, which takes the form:

    VE <number>|<string>

The values for <number> may range from 0 to 20. Each corresponds to the content of a generated tree, so VE 0 corresponds to the ROOT, VE 1 to the QUESTION, VE 2 to the ANSWER, and so on.

The <string> associated with each number is sent to the current corefile if the learner's input corresponds to its tree. So, given a grammar which produces an ANSWER of "une table", and an R_3 of "un table", where only the first is correct, and assuming the set of commands:

    VE 2|Yes, you're correct!
    VE 3|No, that's not right.

if the learner has typed

    une table

the system will output Yes, you're correct!, and so on. Note that the string that is output may contain any ivi/vinci operation, including calls to emit generated strings. So, in the place of the VE 3 string above, which is not very helpful, it would be possible to define:

    VE 3|No, the right answer is {[}m2{m}

which will emit the correct answer.

This simple approach to errors is particularly useful in cases like multiple choice questions, where the set of potential answers is known in advance. To deal with freeform input, it also has a 'wildcard' feature, which can serve as an else test, as the following example illustrates.

Let us assume a case where a learner has typed "table", with no article, or "les tables". These responses will fall through the tests defined by VE 2 and VE 3 above. To remedy this, let us assume a new lexical entry:

   "*"|N|star||#1|||

and a rule R_3 which generates the star. Assume also the rule:

    VE 3|No, the answer is {[}m2{m}

Now, if the learner types something that does not match "une table", he or she will see the error message defined by VE 3 above. (For more details on wildcards, see the ivi manual description of Functions.)

Note that more complex star-based patterns are also possible, like play*, to catch all variants of "play", or * two *, to capture a sequence with "two" as its middle part.
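The logic of simple error analysis, including the wildcard used as an else test, can be modelled in a few lines of Python. This is only a sketch: the function and variable names are ours, the assumption that trees are tried in ascending numerical order is ours, and Python's fnmatch module merely stands in for ivi's own wildcard matching (it also treats ? and [ as special characters, which ivi may not).

```python
from fnmatch import fnmatchcase

# Text of each generated tree, keyed by VE number (2 = ANSWER, 3 = R_3).
TREES = {2: "une table", 3: "*"}

# Message associated with each tree by the VE commands.
MESSAGES = {2: "Yes, you're correct!", 3: "No, the answer is une table"}


def simple_analysis(response, trees, messages):
    """Return the message of the first tree that matches the response."""
    for number in sorted(messages):          # assumed order of testing
        pattern = trees.get(number)
        if pattern is not None and fnmatchcase(response, pattern):
            return messages[number]
    return None                              # nothing matched, nothing emitted


print(simple_analysis("une table", TREES, MESSAGES))
print(simple_analysis("les tables", TREES, MESSAGES))
```

Because the star tree is tested after the ANSWER tree, it only catches responses that the earlier test has let through.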

Complex error analysis

Unlike simple error analysis, vinci's complex error analysis performs a more thorough study of a learner's input at a variety of levels, including syntax, lexical choice, morphology, and spelling, and generates a detailed report on what it finds. We will examine each of these in turn, but several things are worth noting from the outset.

vinci performs what might be called parsing by generation, in the sense that it attempts to generate what the student has entered. In the case of a match, especially in the case of errors, it is assumed that the set of rules used by vinci corresponds in some way to the operations used by the student, and that these will help in diagnosing what has gone wrong. We do not assume psychological plausibility (that what the machine has done corresponds to what has occurred in the learner's brain), but we do assume that being able to obtain the same output as the student provides at least an indication of what might underlie any errors found.

The order of analysis used is as follows. First, for each of the potential answers generated by the system, a two-dimensional matrix is produced with the learner's response in one dimension and the system's response in the other. In the case of a perfect match between the two, we can think of a row of checks down the diagonal, where the learner's first word corresponds to the system's first word, the learner's second word corresponds to the system's second word, and so on. Note that the trees generated by the system may correspond either to various possible expected correct answers or to expected erroneous answers.

In the case where the 'diagonal' just described is not found, vinci first checks whether the expected words of the response can be found 'off' the diagonal, in other words, out of order. It also checks whether erroneous words 'on' the diagonal could be the product of some morphology rule (as in making a singular plural), some lexical relation (as in using a synonym), or some spelling error. In each of these cases, it is considered that a match has been found.
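The matrix and its diagonal can be pictured with a short Python sketch. It covers exact matches only and makes no claim about vinci's internal data structures; the function names are ours.

```python
def match_matrix(expected, response):
    """One row per expected word, one column per learner word."""
    exp = expected.split()
    res = response.split()
    return [[e == r for r in res] for e in exp]


def on_diagonal(matrix):
    """True when the matrix is square and every diagonal cell is a match."""
    rows = len(matrix)
    cols = len(matrix[0]) if matrix else 0
    return rows == cols and all(matrix[i][i] for i in range(rows))


perfect = match_matrix("one two", "one two")
swapped = match_matrix("one two", "two one")
print(on_diagonal(perfect))   # True
print(on_diagonal(swapped))   # False
print(swapped)                # [[False, True], [True, False]]
```

In the swapped case the matches sit off the diagonal, which is the situation where the out-of-order check applies.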

Since vinci may check multiple possible syntax trees, multiple morphological and lexical generations, and so on, the best match to the learner's input is to some extent a matter of how various errors are scored. For example, are syntax errors judged to be graver than inflectional errors, or the reverse? To allow for these variations, the system has two commands: EScore, which sets the scores, and PErrorscores, which displays them.

The EScore command has six parameters which represent the scores to be assigned to:

  1. missing or extra items
  2. order errors
  3. lexical errors
  4. morphological errors
  5. phonological errors
  6. spelling errors

The EScore command takes the following form:

ES <integer>|<integer>|<integer>|<integer>|<integer>|<integer>

where the integers (whole numbers from 0 upwards) represent the six elements shown above.

Fields may be left blank, as in:

ES |<integer>||<integer>||<integer>

Blank fields are given a value of 0 by default.

The current status of the EScore values can be seen by means of the PErrorscores command, which takes the form:

PE <return>

and which prints the current error values in Corefile 7. Initially, output looks like this:

Error scores: missing/extra 0; order 0; lex 0; morph 0; phon 0; spell 0

After an EScore command like:

ES 5|10|3|0|5|2

the PErrorscores command would show:

Error scores: missing/extra 5; order 10; lex 3; morph 0; phon 5; spell 2

Note that the initial value of all error scores is 0, so they should be set in the initial procedure file or driver.

Also, remember that these scores represent penalties, so a correct answer has a score of 0. As in golf, a lower score is better.
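The treatment of blank fields and the layout of the PErrorscores line can be checked against a small Python model. The function names are hypothetical and the sketch models only the argument string of an ES command, not the command processor itself.

```python
# Labels in the order in which PErrorscores prints them.
LABELS = ["missing/extra", "order", "lex", "morph", "phon", "spell"]


def parse_escore(arguments):
    """Turn the argument string of an ES command into a table of penalties."""
    fields = arguments.split("|")
    if len(fields) > len(LABELS):
        raise ValueError("at most six fields are expected")
    fields += [""] * (len(LABELS) - len(fields))   # pad missing trailing fields
    # A blank field is given the value 0.
    return {label: int(field) if field.strip() else 0
            for label, field in zip(LABELS, fields)}


def show_scores(scores):
    """Format a table of penalties in the style of the PErrorscores output."""
    return "Error scores: " + "; ".join(
        "{} {}".format(label, scores[label]) for label in LABELS)


print(show_scores(parse_escore("5|10|3|0|5|2")))
# Error scores: missing/extra 5; order 10; lex 3; morph 0; phon 5; spell 2
```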

By default, in complex error analysis, a user's input is checked against roots R_2 (aka ANSWER) to R_19. However, this can be problematic. For example, a grammar writer may wish to give instructions to a learner by generating them and representing them in one of the root trees. Or again, a sequence of trees may be used to set the context for a question and answer. To cater to these instances, and provide a grammar writer with the option of checking only some trees, vinci provides the VRootset command, which has the format:

    VRootset <number>, <number>, ..

where the sequence of numbers corresponds to the set of roots to be compared. So, for example, VR 3,5,13 sets the current root set to 3, 5 and 13, so that only R_3, R_5 and R_13 are compared. (VR with no arguments resets the root set to the default.)
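The effect of a root set on the choice of the best match can be sketched as follows in Python. This is an assumption-laden illustration: the function name, the penalty table and the rule for breaking ties (lowest root number first) are ours, not a description of vinci's code.

```python
DEFAULT_ROOTSET = range(2, 20)   # R_2 .. R_19, the default described above


def best_root(penalties, rootset=None):
    """Pick the root with the lowest penalty among those in the root set.

    penalties: dict mapping a root number to the penalty of its best match
    rootset:   roots to compare, or None for the default set
    """
    candidates = [r for r in (rootset or DEFAULT_ROOTSET) if r in penalties]
    if not candidates:
        return None
    # Lower is better; ties go to the lower root number (our assumption).
    return min(candidates, key=lambda r: (penalties[r], r))


penalties = {2: 20, 3: 5, 4: 0}
print(best_root(penalties))           # 4
print(best_root(penalties, [2, 3]))   # 3
```

Restricting the set to 2 and 3 changes the outcome because R_4, although a perfect match, is no longer compared.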

The error analysis itself is triggered by the sequence <ESC K> (note the uppercase K). As in the case of simple analysis, the cursor should be on the start of the learner's response, and everything from that point forward will form part of what is analysed.

We will turn now to each of the levels of analysis.

Syntax analysis

Since vinci can generate multiple parallel trees, it can be used to generate multiple potential answers, all of them correct. So, for example, given a question like:

    What is the capital of Canada?

the following possible trees might be produced by means of transformations:

    ANSWER: The capital of Canada is Ottawa.
    R_3: It is Ottawa.
    R_4: Ottawa 

and so on, where R_3 and R_4 are systematic reductions of the initial tree, as described in some detail in Z. Harris' work on Mathematical Linguistics. (Of course, the variant trees might also be entirely unrelated.)

The syntax analysis will find the closest match to one of these (which may still include errors or differences) and use that for its response to the learner.

Where no tree matches the learner's response perfectly, vinci checks whether the correct items are present but in the wrong order, whether some are absent, or whether items are present which have not been predicted. The following examples illustrate this.

Let us assume a simple grammar which generates a sequence of two number names ("one" or "two") in any order:

ROOT = NUM NUM %

QUESTION = ROOT %

ANSWER = ROOT %

Let us assume that the expected answer is "one one". If the learner types either "one two", or "two one", the system will show the extra item:

  BEST ROOT : R2
  SCORE     : 10
  EXPECTED : one one
  RESPONSE : two one
  -- S1  EXTRA    two
  C1 S2   EXACT   one/one
  ***

  BEST ROOT : R2
  SCORE     : 10
  EXPECTED : one one
  RESPONSE : one two
  -- S2  EXTRA    two
  C1 S1   EXACT   one/one
  ***

On the other hand, if the system was expecting "one two", and the learner misses an item by typing "two" only, a typical response might look like:

  BEST ROOT : R2
  SCORE     : 20
  EXPECTED : one two
  RESPONSE : two
  C1 --  MISSING  one
  C2 S1   EXACT   two/two
  ***

In the case of misordered responses, the first two columns, where C represents the computer's expectation and S the learner's response, with the digits corresponding to the order in each case, will show the nature of the error or errors:

  BEST ROOT : R2
  SCORE     : 20
  EXPECTED : one two
  RESPONSE : two one
  C1 S2   EXACT   one/one
  C2 S1   EXACT   two/two
  ***

Of course, these may all be combined in the case of more complex responses. Note in passing that the separation of the learner's response is based on spaces, so the system will need to adapt to sequences like "il n'a pas", which corresponds to three items in the analyzed learner response, but four as generated by the computer. The usual remedy is either to pre-parse the learner's response or to reshape the system's response by means of transformations.
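The C and S columns can be illustrated with a simplified alignment written in Python. It is a greedy sketch of our own devising that handles exact matches only; it does not reproduce vinci's scoring, and on some inputs (for example an expected "one one") it will not list exactly the same lines as the reports above.

```python
def align(expected, response):
    """Pair each expected word with the first unused identical learner word."""
    exp, res = expected.split(), response.split()   # separation is on spaces
    used = set()
    rows = []
    for i, word in enumerate(exp, start=1):
        match = next((j for j, r in enumerate(res, start=1)
                      if r == word and j not in used), None)
        if match is None:
            rows.append(("C%d" % i, "--", "MISSING", word))
        else:
            used.add(match)
            rows.append(("C%d" % i, "S%d" % match, "EXACT", word + "/" + word))
    # Learner words that were never paired are extra items.
    for j, r in enumerate(res, start=1):
        if j not in used:
            rows.append(("--", "S%d" % j, "EXTRA", r))
    return rows


for row in align("one two", "two one"):
    print(*row)
```

A pair whose C and S digits differ, such as C1 S2, is an item found off the diagonal, that is, out of order.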

Lexical analysis

As noted above, if vinci does not find a match at some point on the diagonal formed by the learner's response and the tree being compared, it searches for lexical variants of each item. We will illustrate this with the case of synonymy. To begin, let us assume the following lexical items:

"cat"|N|animal, Number||$s||syn:"feline";||||
"dog"|N|animal, Number||$s||syn:"canine";||||
"feline"|N|animal, Number||$s||syn:"cat";||||
"canine"|N|animal, Number||$s||syn:"dog";||||

Note that each has a pointer to a synonym (syn:) in its seventh field. Let us assume also a file defining lexical pointers. For example:

    syn(7)
    %

which specifies that synonym pointers are to be found in field 7. (By convention, we give such tagfiles the suffix .tg.)

Now, if this file is loaded by the TGinfo command, and a learner types a synonym in place of the expected word, say, "feline" in place of "cat", then the complex error analysis will note this, as shown below:

***
  BEST ROOT : R2
  SCORE     : 0
  EXPECTED : cat
  RESPONSE : feline
  C1 S1   LEX cat/feline  7:syn
  ***

Note in particular that the score assigned by the analysis is 0, since it considers that use of the synonym is possible as a response. This means that a grammar writer needs to take care over what appears in a tagfile. Relations which would not yield a correct response should not be placed there.

In addition, the morphological attributes attached to the expected response can be attached to the pointed-to response as well, as the following example shows:
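The lookup performed at this level amounts to following the declared pointers from the expected word and seeing whether one of them leads to the learner's word. A minimal Python sketch, with a hypothetical dictionary standing in for the lexicon and tagfile:

```python
# Pointers of each headword, grouped by tag (only "syn" is declared here).
POINTERS = {
    "cat": {"syn": ["feline"]},
    "dog": {"syn": ["canine"]},
    "feline": {"syn": ["cat"]},
    "canine": {"syn": ["dog"]},
}


def lexical_relation(expected, response, pointers):
    """Return the tag linking the expected word to the response, if any."""
    for tag, targets in pointers.get(expected, {}).items():
        if response in targets:
            return tag
    return None


print(lexical_relation("cat", "feline", POINTERS))   # syn
print(lexical_relation("cat", "canine", POINTERS))   # None
```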

"cat"|N|animal, Number||$s||syn:"feline"[!Number];||||

Morphological analysis

In the case of morphology, if the learner's form on the diagonal is neither identical to the expected form, nor to the product of a lexical pointer, then vinci attempts a morphological analysis. In a nutshell, taking the rule defined in field 5 of the expected lexical item, it enumerates all forms based on the entire set of guarded subrules. To see how this works, consider the following snippet from a French lexicon:

    "grand"|ADJ|dimension, Number, Gender||$adj||||
    "petit"|ADJ|dimension, Number, Gender||$adj||||

and the morphological rule called by both lexical items:

    rule adj
        fem, plur:  #1 + "es";
        fem, sing:  #1 + "e";
        masc, plur: #1 + "s";
        masc, sing: #1;
    %

And assume a grammar has been written which gives an adjective in context and asks for its inflected form:

ROOT = CHOOSE Nu: Number, Ge: Gender;

        DET[Nu, Ge] ADJ[Nu, Ge] N[Nu, Ge]
%

QUESTION = BASEADJ:ROOT %

ANSWER = SHOWADJ:ROOT %

BASEADJ = TRANSFORMATION
DET ADJ N : 1 SYM[lbracket] 2[masc, sing] SYM[rbracket] 3;
%

SHOWADJ = TRANSFORMATION
DET ADJ N : 2;
%

This grammar will produce questions like "une [ grand ] table" and expect the appropriately inflected form of the adjective in brackets (here, "grande"). Let us assume that a learner has entered "grand". The resulting report looks like this:

 BEST ROOT : R2
 SCORE     : 5
 EXPECTED : grande
 RESPONSE : grand
 C1 S1   MORPH    grande/grand fem/masc
 ***

Note that the report includes both the expected response/learner response pair and the attributes by which they differ. This provides a mechanism for tracking over questions which morphology dimensions are causing a learner problems.

In some cases, more than one generated form might match the learner's input. For example, in the case of verbs of the "-er" class in French, the present tense for both the first person and the third person singular will be the same (as in "je pense" and "elle pense"). In cases like these, vinci outputs all possible forms with their analyses, separated by OR.
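The enumeration of forms from the guarded subrules, and the comparison of attributes that follows, can be modelled in Python. The sketch below is ours: it encodes the adj rule as (guard, suffix) pairs, and the way the differing attributes are written (expected before the slash, found after it) is modelled on the report shown above rather than taken from vinci's source.

```python
# The adj rule as (guard, suffix) pairs, in the order of the subrules.
ADJ_RULE = [
    (("fem", "plur"), "es"),
    (("fem", "sing"), "e"),
    (("masc", "plur"), "s"),
    (("masc", "sing"), ""),
]


def morph_analyses(stem, expected_attrs, response, rule):
    """List the attribute differences of every subrule that yields the response."""
    analyses = []
    for guard, suffix in rule:
        if stem + suffix == response:
            wanted = [a for a in expected_attrs if a not in guard]
            found = [a for a in guard if a not in expected_attrs]
            analyses.append(" ".join(wanted) + "/" + " ".join(found))
    return analyses   # more than one entry corresponds to the OR case


print(morph_analyses("grand", ("fem", "sing"), "grand", ADJ_RULE))
# ['fem/masc']
```

An empty list means that no subrule produces the learner's form, so the error is not a morphological one under this rule.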

Phonological analysis

As we noted earlier, vinci permits the use of an indefinite number of parallel morphologies. This feature makes it possible to capture cases where a learner has misspelled a form, but where the misspelled form would be pronounced in the same way as the correct form. For example, in English, "cat" and "kat" would both have the same pronunciation, as would "fix" and "phicks". We refer to these as phonographic errors. (Note in passing, if we actually wanted to accept "kat", which is found in some texts, we would use a lexical pointer to the spelling variant.) To capture phonographic errors, two elements are needed: a table relating each phonetic symbol to its possible spellings, and a phonetic field in the lexical entries.

An example of the first is:

"k"("c", "k", "ck", "cc")
"a"("a")
"t"("t", "tt")
"d"("d", "dd")
"O"("o", "aw")
"g"("g", "gg")
"A"("o", "a")
"w"("w", "au")
"h"("h")
"o"("o", "oa")
"R"("r", "rr")
"s"("s", "c", "ce", "se")
"I"("e", "i")
"z"("z", "zz")

This table is stored in a file with the conventional suffix .ip and is loaded into vinci with the command IP <filename>.

An example of a phonetic lexical field is:

"cat"|N|Number||$num||||||||||||||"kat"|||$numphon_s|
"dog"|N|Number||$num||||||||||||||"dOg"|||$numphon_z|
"cow"|N|Number||$num||||||||||||||"kAw"|||$numphon_z|
"horse"|N|Number||$num||||||||||||||"hoRs"|||$numphon_Iz|

As a default, phonetic rules are stored in fields 22 and 23 of the lexical entry, but this can be changed by means of the MPasses command. The rules referred to here look like this:

rule num
        sing : #1;
        plur : #1 + "s";
%

rule numphon_s
        sing : #19;
        plur : #19 + "s";
%

rule numphon_z
        sing : #19;
        plur : #19 + "z";
%

rule numphon_Iz
        sing : #19;
        plur : #19 + "Iz";
%

in order to capture the phonology of the plural of the nouns.

If a grammar is written which expects a learner to type some word, say "cats", and the learner types instead "katts", then the error analysis will show:

  BEST ROOT : R2
  SCORE     : 1
  EXPECTED : cats
  RESPONSE : katts
  C1 S1   PHONO cats/katts   c/k t/tt

showing which letters have been mistyped and the keyword PHONO to show the type of error. In the current defaults, the penalty for a phonographic error is low (1), since the learner clearly knows the answer but simply cannot spell it.
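One way to picture the test for a phonographic error is to turn the phonetic form into a pattern that accepts any spelling allowed by the table. The Python sketch below does this with a regular expression. It is our own illustration, it assumes that every phonetic symbol is a single character (as in the table above), and it only answers yes or no, without listing the letter differences shown in the report.

```python
import re

# Extract of the table above: phonetic symbol -> possible spellings.
TABLE = {
    "k": ["c", "k", "ck", "cc"],
    "a": ["a"],
    "t": ["t", "tt"],
    "s": ["s", "c", "ce", "se"],
}


def spelling_pattern(phonetic, table):
    """Build a regular expression accepting every spelling of a phonetic form."""
    parts = ["(?:" + "|".join(re.escape(g) for g in table[symbol]) + ")"
             for symbol in phonetic]
    return re.compile("".join(parts))


def is_phonographic(phonetic, response, table):
    """True when the response is one possible spelling of the phonetic form."""
    return spelling_pattern(phonetic, table).fullmatch(response) is not None


print(is_phonographic("kats", "katts", TABLE))   # True
print(is_phonographic("kats", "dogs", TABLE))    # False
```

Here "kats" is the phonetic plural that the rule numphon_s builds from field 19 of the entry for "cat".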

Reporting

In the case of simple error analysis, responses provided by the system are emitted as strings and can be copied to a webpage or otherwise manipulated as desired. In the case of complex error analysis, as we have seen, reports are more complex. At the simplest level, they can be sent directly to a learner who can interpret them after some instruction. Alternatively, since their format is fixed, they can be parsed by some program (PHP on a website, a driver or procedure file in ivi) and turned into something more palatable.

At the same time, error reports can be saved and processed later to detect patterns in a learner's responses. These patterns can drive adaptive generation in subsequent examples, or allow an instructor to focus on areas of difficulty in a later classroom or individual interaction with a learner.
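Since the format of the reports is fixed, a parser for them is short. The following Python sketch is one possible reading of the layout seen in the examples of this section; the function name and the keys of the resulting dictionary are our own choices.

```python
# Header labels of a report and the keys under which we store them.
HEADERS = {"BEST ROOT": "root", "SCORE": "score",
           "EXPECTED": "expected", "RESPONSE": "response"}

SAMPLE = """\
  BEST ROOT : R2
  SCORE     : 20
  EXPECTED : one two
  RESPONSE : two
  C1 --  MISSING  one
  C2 S1   EXACT   two/two
  ***
"""


def parse_report(text):
    """Turn one complex error analysis report into a dictionary."""
    report = {"details": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "***":        # blank line or report separator
            continue
        head, _, value = line.partition(":")
        key = HEADERS.get(head.strip())
        if key is not None:                  # one of the four header lines
            report[key] = value.strip()
            continue
        fields = line.split()
        if len(fields) >= 3:                 # a detail line: C, S, type, rest
            report["details"].append({
                "computer": fields[0],
                "learner": fields[1],
                "type": fields[2],
                "info": " ".join(fields[3:]),
            })
    report["score"] = int(report.get("score", 0))
    return report


parsed = parse_report(SAMPLE)
print(parsed["root"], parsed["score"])
print([d["type"] for d in parsed["details"]])
```

Counting the values of "type" over many saved reports gives a simple picture of the levels at which a learner is having difficulty.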