ALLC/ACH96
Generating Coherent Paragraphs
Greg Lessard, French Studies, Queen's University
Michael Levison, Computing and Information Science, Queen's University
1 introduction
1.1 computer aided instruction and authenticity
- tradeoff between authenticity and control
- communicative competence versus grammatical knowledge
1.2 one solution: authentic texts plus parsing
- hand parsing (CALIS)
- machine parsing and understanding (BORIS, etc.)
1.3 another solution: machine construction of texts
- microworlds (FLUENT, Hamburger, 1994)
- the problem of extension
1.4 needed: a simple metalanguage for paragraph construction
- relatively powerful
- non-programmer friendly
- easily adaptable
- supports dialogues
1.5 VINCI
2 theories of the paragraph
2.1 the paragraph as unit
- a barrier for anaphora (Hofmann, 1989)
- a semantic domain
Reagan thinks bananas. (Zadrozny & Jensen, 1991)
2.2 paragraph types
- narrative
- procedural
- expository
2.3 discourse prominence
- a model for English (Longacre, 1989)
Band 1: Storyline
    Past (S/Agent) Action, (S/Agent/Patient) Motion
    Past (S/Experiencer) Cognitive events (punctiliar adverbs)
    Past (S/Patient) Contingencies
Band 2: Background
    Past Progressive (S/Agent) background activities
    Past (S/Experiencer) Cognitive states (durative adverbs)
Band 3: Flashback
    Pluperfects (events, activities which are out of sequence)
    Pluperfects (cognitive events/states that are out of sequence)
Band 4: Setting (expository)
    Stative verbs/adjectival predicates/verbs with inanimate subjects (descriptive)
    "Be" verbs/verbless clauses (equational)
    "Be"/"Have" (existential, relational)
Band 5: Irrealis (other possible worlds)
    Negatives
    Modals/futures
Band 6: Evaluation (author intrusion)
    Past tense (cf. setting)
    Gnomic present
Band 7: Cohesive band (verbs in preposed/postposed adverbial clauses)
    Script determined
    Repetitive
    Back Reference
2.4 given and new (Weissberg, 1984)
- linear progression
Hydrology is based on the water cycle, more commonly called the
hydrologic cycle. This cycle can be visualized as beginning with
the evaporation of water from oceans and continental lands. The
resulting vapor... (p. 489)
- constant topic
Herbage of crested wheatgrass was harvested from 10 unfertilized
plots and 10 permanent plots annually fertilized with 8 pounds of
nitrogen per acre. Herbage from ... The selected herbage...
(p. 489)
- hypertheme
The reflector was protected from the weather by an outer window of
0.10 mm tedlar. The focal length of the reflector... The back of
the reflector... The reflector rack... (p. 490)
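The three progression patterns can be told apart mechanically once each sentence is reduced to a theme (given) and a rheme focus (new). The sketch below is only an illustration under that assumption: the (theme, rheme) labels are hand-assigned, `None` marks rhemes elided in the quoted passages, and the `parts_of` table and the function name `progression` are hypothetical, not part of Weissberg's account.

```python
def progression(sentences, parts_of=None):
    """Classify a list of (theme, rheme) pairs (at least two sentences)."""
    parts_of = parts_of or {}
    themes = [theme for theme, _ in sentences]
    rhemes = [rheme for _, rheme in sentences]
    # Linear progression: each theme picks up the previous sentence's rheme.
    if all(themes[i] == rhemes[i - 1] for i in range(1, len(themes))):
        return "linear progression"
    # Constant topic: every sentence keeps the first theme.
    if all(theme == themes[0] for theme in themes):
        return "constant topic"
    # Hypertheme: later themes are aspects of the first theme.
    if all(parts_of.get(theme) == themes[0] for theme in themes[1:]):
        return "hypertheme"
    return "mixed"


linear = [("hydrology", "water cycle"), ("water cycle", "evaporation")]
constant = [("herbage", "plots"), ("herbage", None), ("herbage", None)]
hyper = [("reflector", "outer window"), ("focal length", None),
         ("back", None), ("rack", None)]
reflector_parts = {"focal length": "reflector", "back": "reflector",
                   "rack": "reflector"}

print(progression(linear))
print(progression(constant))
print(progression(hyper, reflector_parts))
```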
2.5 computational approaches
- planning (McKeown, 1985; Hovy, 1991)
- parsing (Zadrozny & Jensen, 1991)
- text structure (Mann & Thompson, 1987)
- reference (Dale, 1992)
2.6 pedagogical approaches
3 VINCI formalism
3.1 attributes
Nounsemantics (
human,
player<human,
goalie<player,
roy<goalie,
forward<player,
sakic<forward,
action,
goal<action,
shot<action,
drop<action,
kill<action,
physobj,
machine<physobj,
computer<machine,
laptop<computer,
document,
program<document,
dtd<program,
...)
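One way to read the "child<parent" declarations is as a tree in which each value inherits from the value to the right of "<". The Python sketch below models only the declarations listed above; the names `PARENT`, `ancestors` and `is_a` are hypothetical and say nothing about how VINCI stores attributes internally.

```python
# Each key is a value declared as "child<parent" in the listing.
PARENT = {
    "player": "human",
    "goalie": "player",
    "roy": "goalie",
    "forward": "player",
    "sakic": "forward",
    "goal": "action",
    "shot": "action",
    "drop": "action",
    "kill": "action",
    "machine": "physobj",
    "computer": "machine",
    "laptop": "computer",
    "program": "document",
    "dtd": "program",
}


def ancestors(value):
    """Return the chain of parents of `value`, nearest first."""
    chain = []
    while value in PARENT:
        value = PARENT[value]
        chain.append(value)
    return chain


def is_a(value, target):
    """True if `value` is `target` or lies below it in the hierarchy."""
    return value == target or target in ancestors(value)


print(ancestors("roy"))
print(is_a("laptop", "machine"), is_a("sakic", "goalie"))
```

Under this reading, a rule that asks for a "player" would accept both "roy" and "sakic", while one that asks for a "goalie" would accept only "roy".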
3.2 lexical pointers
"Patrick Roy"|N|npr.>roy.Function, Number.Function||#1|||
hyper:"goalie"; appos:"the Colorado goaltender";||||
"error"|N|Determ.error.Function, Number.Function||$1|||
syn:"mistake"||||
"mistake"|N|Determ.error.Function, Number.Function||$1|||
syn:"error"||||
"apple"|N|>indef,>apple.Function,Attitude,Number||$1|||
ptaste:"sweet"/ADJ; ntaste:"sour"/ADJ; ptexture:"crunchy"/ADJ, "crisp"/ADJ,
"juicy"/ADJ; ntexture:"mushy"/ADJ,"pulpy"/ADJ, "soft"/ADJ;
pcolour:"red"/ADJ; ncolour:"green"/ADJ; kind:"MacIntosh","Spy";||||
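The pointer field of an entry pairs a pointer name (syn, hyper, appos, ptaste, ...) with one or more quoted target words. As a rough illustration, the Python sketch below splits such a field into a dictionary. It handles only the pointer field, not the full "|"-delimited entry, and it assumes that quoted targets never contain ";" or ":". The function name `parse_pointers` is hypothetical.

```python
import re

# Pointer field of the "apple" entry, copied from the listing above.
POINTER_FIELD = (
    'ptaste:"sweet"/ADJ; ntaste:"sour"/ADJ; '
    'ptexture:"crunchy"/ADJ, "crisp"/ADJ, "juicy"/ADJ; '
    'ntexture:"mushy"/ADJ,"pulpy"/ADJ, "soft"/ADJ; '
    'pcolour:"red"/ADJ; ncolour:"green"/ADJ; kind:"MacIntosh","Spy";'
)


def parse_pointers(field):
    """Map each pointer name to the list of quoted words it points at."""
    pointers = {}
    for chunk in field.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        name, _, targets = chunk.partition(":")
        # Keep the quoted words; category tags such as /ADJ are dropped.
        pointers[name.strip()] = re.findall(r'"([^"]*)"', targets)
    return pointers


apple = parse_pointers(POINTER_FIELD)
print(apple["ptexture"])
print(apple["kind"])
```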
3.3 guarded syntax
ROOT =
S[npr.roy.subj, stop, n1.shot.objd, past, sing.subj, sing.objd]
S =
< Determ.Nounsemantics.objd :
inherit As : Number.subj,
Ao : Number.objd,
Ns : Determ.Nounsemantics.subj,
No : Determ.Nounsemantics.objd,
Vs : Verbsemantics,
Ts : Tense;
NP[As, Ns]
VP[Ns, Vs, No, Ts, As, Ao]
>
inherit As : Number.subj,
Ns : Determ.Nounsemantics.subj,
Vs : Verbsemantics,
Ts : Tense;
NP[As, Ns]
VP[Ns, Vs, Ts, As] %
NP =
< npr.Nounsemantics.Function :
inherit Ns : Determ.Nounsemantics.Function;
N[Ns]
>
inherit Nr : Number.Function,
Ns : Determ.Nounsemantics.Function;
DET[Nr, Ns] N[Nr, Ns] %
VP =
< Determ.Nounsemantics.objd :
inherit As : Number.subj,
Ao : Number.objd,
Ns : Determ.Nounsemantics.subj,
No : Determ.Nounsemantics.objd,
Vs : Verbsemantics,
Ts : Tense;
V[Ns, Vs, No, Ts, As, p3] NP[Ao, No] PUNCT[period]
>
inherit As : Number.subj, Ns : Determ.Nounsemantics.subj,
Vs : Verbsemantics, Ts : Tense;
V[Ns, Vs, Ts, As, p3] PUNCT[period] %
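The guards in S and VP appear to choose between a transitive and an intransitive expansion according to whether an object attribute of the form Determ.Nounsemantics.objd is present. The Python sketch below illustrates that reading only. It treats attributes as plain strings, takes a three-part attribute ending in ".objd" as the guard condition, and uses hypothetical names (`has_object`, `expand_s`); it is not a description of VINCI's actual matching.

```python
def has_object(attrs):
    """Guard: is there a three-part attribute such as 'n1.shot.objd'?"""
    return any(a.count(".") == 2 and a.endswith(".objd") for a in attrs)


def expand_s(attrs):
    """Pick the guarded (transitive) or default (intransitive) expansion."""
    if has_object(attrs):
        return ["NP[subj]", "V", "NP[objd]", "PUNCT[period]"]
    return ["NP[subj]", "V", "PUNCT[period]"]


transitive = ["npr.roy.subj", "stop", "n1.shot.objd", "past",
              "sing.subj", "sing.objd"]
intransitive = ["npr.roy.subj", "stop", "past", "sing.subj"]

print(expand_s(transitive))
print(expand_s(intransitive))
```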
3.4 transformations
APPOS = TRANSFORMATION
PRIORITY 22
* N[npr.Nounsemantics.Function] * : 1 2 PUNCT[comma] 2/@8:appos 3 ;
* NP * : 1 APPOS: 2 3 ;
%
HYPER = TRANSFORMATION
PRIORITY 20
* N[npr.Nounsemantics.Function] * : 1 DET/"the" 2/@8:hyper 3 ;
* N * : 1 2/@8:hyper 3 ;
* NP * : 1 HYPER: 2 3 ;
%
PASSIVE = TRANSFORMATION
PRIORITY 30
* NP * VP * : 1 PASSMORPH: 4 3 PREP[by] 2 5;
%
PASSMORPH = TRANSFORMATION
PRIORITY 31
* V NP[plur.objd] * : 1 3 V[vcop, 2!Tense, p3, plur.subj] 2[past] 4 ;
* V NP[sing.objd] * : 1 3 V[vcop, 2!Tense, p3, sing.subj] 2[past] 4 ;
%
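Read informally, APPOS and HYPER follow a lexical pointer from a proper noun, and PASSIVE with PASSMORPH reorders the clause as object, copula, past participle, "by", subject. The Python sketch below mimics those effects on plain strings. The small `LEXICON`, `COPULA` and `PARTICIPLE` tables and the function names are assumptions made for illustration; the real transformations operate on syntax trees and on VINCI's own morphology.

```python
# Hypothetical miniature lexicon holding the pointers used by APPOS and HYPER.
LEXICON = {
    "Patrick Roy": {"appos": "the Colorado goaltender", "hyper": "goalie"},
}
COPULA = {
    ("past", "sing"): "was", ("past", "plur"): "were",
    ("pres", "sing"): "is", ("pres", "plur"): "are",
}
PARTICIPLE = {"stop": "stopped", "score": "scored",
              "open": "opened", "kill": "killed"}


def appos(name):
    """APPOS: proper noun, comma, then the target of its appos pointer."""
    return f"{name}, {LEXICON[name]['appos']}"


def hyper(name):
    """HYPER: replace a proper noun by 'the' plus its hyperonym."""
    return f"the {LEXICON[name]['hyper']}"


def passive(subj, verb, obj, obj_number, tense):
    """PASSIVE + PASSMORPH: object, copula agreeing with it, participle, by, subject."""
    return f"{obj} {COPULA[(tense, obj_number)]} {PARTICIPLE[verb]} by {subj}"


print(appos("Patrick Roy"))
print(hyper("Patrick Roy"))
print(passive("Joe Sakic", "score", "two goals", "plur", "past"))
```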
4 operations
4.1 static versus dynamic information
- static
I like apples. They are sweet, red and crunchy. Apples are good
for you. I particularly like MacIntosh.
- dynamic
Patrick Roy stopped twenty shots. Joe Sakic scored two goals.
- dynamic with static (apposition)
Patrick Roy, the Colorado goaltender, stopped twenty shots.
Joe Sakic, Colorado's high-scoring centre, scored twenty goals.
4.2 negation
Patrick Roy didn't stop twenty shots.
Joe Sakic didn't score twenty goals.
4.3 connectors
Patrick Roy stopped twenty shots while Joe Sakic scored two goals.
4.4 pronominalisation
he stopped twenty shots. Joe Sakic scored them.
4.5 hyperonymy/hyponymy
the goalie stopped twenty shots. the player scored two goals.
4.6 tense
Patrick Roy stopped twenty shots while Joe Sakic scored two goals.
Patrick Roy stops twenty shots while Joe Sakic scores two goals.
Patrick Roy will stop twenty shots while Joe Sakic will score two goals.
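The three variants differ only in the tense attribute passed to both clauses, with the connector held constant. A minimal Python sketch of that idea follows. The `FORMS` table is a hypothetical stand-in for VINCI's morphology, covers third person singular only, and the function names are illustrative.

```python
# Hypothetical third-person-singular verb forms, indexed by tense.
FORMS = {
    "stop": {"past": "stopped", "pres": "stops", "fut": "will stop"},
    "score": {"past": "scored", "pres": "scores", "fut": "will score"},
}


def clause(subj, verb, obj, tense):
    """Realise one subject-verb-object clause in the given tense."""
    return f"{subj} {FORMS[verb][tense]} {obj}"


def connect(first, second, connector="while"):
    """Join two clauses with a connector and close the sentence."""
    return f"{first} {connector} {second}."


for tense in ("past", "pres", "fut"):
    print(connect(clause("Patrick Roy", "stop", "twenty shots", tense),
                  clause("Joe Sakic", "score", "two goals", tense)))
```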
4.7 promotion/demotion
(tense)
Joe Sakic scored two goals. he had been in a slump.
(passive)
Patrick Roy stopped twenty shots. Two goals were scored by Joe Sakic.
5 operations on a larger span
5.1 the base specification
ROOT =
S[npr.jones.subj, open, def.door.objd, past, sing.subj, sing.objd]
PUNCT[period]
S[npr.jones.subj, drop, poss.laptop.objd, past, sing.subj, sing.objd]
PUNCT[period]
S[npr.smith.subj, be, past, dead, sing.subj]
PUNCT[period]
S[def.computer.subj, kill, past, npr.smith.objd, sing.subj, sing.objd]
PUNCT[period]
S[npr.smith.subj, make, indef.error.objd, in, poss.dtd.obji, past,
sing.subj, sing.objd, sing.obji]
PUNCT[period]
%
5.2 basic output
Professor Jones opened the door. Professor Jones dropped his laptop.
Professor Smith was dead. the computer killed Professor Smith.
Professor Smith made a mistake in his DTD.
5.3 enhanced output
the door was opened by Professor Jones, the literature professor. he
dropped his laptop. Professor Smith, the humanities computing
specialist was dead. he had been killed by his computer. he had made an
error in his DTD.
5.4 alternative output
the door was opened by Professor Jones, the literature professor.
Professor Smith, the humanities computing specialist was dead. he had
made an error in his DTD. the computer had killed him. Professor Smith
dropped his laptop.
5.5 excessive pronominalisation
he opened it. he was dead. he had made an error in his DTD. the
computer had killed him. he dropped his laptop.
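The output above becomes ambiguous because every mention is replaced by a pronoun. One simple restraint, shown here purely as an illustration and not as the policy VINCI uses, is to pronominalise a subject only when it repeats the subject of the immediately preceding sentence. The `PRONOUN` table and the function name are hypothetical.

```python
# Hypothetical pronoun table for the referents in the base specification.
PRONOUN = {
    "Professor Jones": "he",
    "Professor Smith": "he",
    "the computer": "it",
}


def pronominalise(sentences):
    """Replace a subject by its pronoun only if it repeats the previous subject."""
    output = []
    previous = None
    for subject, rest in sentences:
        shown = PRONOUN[subject] if subject == previous else subject
        output.append(f"{shown} {rest}.")
        previous = subject
    return " ".join(output)


story = [
    ("Professor Jones", "opened the door"),
    ("Professor Jones", "dropped his laptop"),
    ("Professor Smith", "was dead"),
    ("the computer", "killed Professor Smith"),
]
print(pronominalise(story))
```

With this restraint the second sentence becomes "he dropped his laptop", while the first mention of Professor Smith stays in full.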
6 interactivity
6.1 formatting of output
the door was opened by Professor Jones, the literature professor.
he dropped his laptop.
Professor Smith, the humanities computing specialist was dead.
he had been killed by his computer.
he had made an error in his DTD.
6.2 questions and answers
What did Professor Jones open?
What did Professor Jones drop?
What did Professor Smith do?
Who was dead?
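Questions of this kind can be derived from the same clause specifications that produce the text. The Python sketch below shows the idea for past-tense clauses with human subjects: an object question uses "What did" plus the bare verb, and a subject question puts "Who" in front of the predicate. The function names and the string-based clause representation are assumptions for illustration only.

```python
def object_question(subject, verb):
    """Ask about the object of a past-tense clause, e.g. 'open' -> 'What did X open?'."""
    return f"What did {subject} {verb}?"


def subject_question(predicate):
    """Ask about a human subject by putting 'Who' in front of the predicate."""
    return f"Who {predicate}?"


print(object_question("Professor Jones", "open"))
print(object_question("Professor Jones", "drop"))
print(object_question("Professor Smith", "do"))
print(subject_question("was dead"))
```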
6.3 rewriting
7 conclusions
7.1 extensions of the model
- multilingual generation
- clothing information in syntax
7.2 additions to VINCI
- deconstruction of attributes
- predecessor and successor relations
- preselection of lexical classes
- inferencing
References:
Dale, R. (1992) Generating referring expressions: constructing descriptions
in a domain of objects and processes. Cambridge, Mass.: MIT Press.
Hamburger, H. (1994) Foreign Language Immersion: Science, Practice and a
System. Journal of Artificial Intelligence in Education 5/4:429-454.
Hofmann, T.R. (1989) Paragraphs and Anaphora. Journal of Pragmatics 13:
239-250.
Hovy, E.H. (1991) Approaches to the Planning of Coherent Text. In Paris,
Swartout and Mann (eds.) Natural Language Generation in Artificial
Intelligence and Computational Linguistics. Boston: Kluwer.
Lehnert, W. et al. (1983) BORIS - An experiment in in-depth understanding of
narratives. Artificial Intelligence 20:15-62.
Lenat, D. (1990) Building large knowledge-based systems: representation and
inference in the Cyc project. Reading, Mass.: Addison-Wesley.
Levison, M., Lessard, G. (1992) A System for Natural Language Generation.
Computers and the Humanities 26:43-58.
Longacre, R.E. (1989) Two Hypotheses Regarding Text Generation and Analysis.
Discourse Processes 12:413-460.
Longacre, R.E. (1979) The Paragraph as a Grammatical Unit. Syntax and
Semantics 12: 115-134.
Mann, W.C. and Thompson, S.A. (1987) Rhetorical Structure Theory: Description
and Construction of Text Structures. In Kempen, G. (ed.) Natural Language
Generation. Dordrecht: Martinus Nijhoff.
McKeown, K.R. (1985) Text generation: using discourse strategies and focus
constraints to generate natural language text. Cambridge: Cambridge University
Press.
Weissberg, R.C. (1984) Given and New: Paragraph Development Models from
Scientific English. TESOL Quarterly 18/3:485-500.
Zadrozny, W., Jensen, K. (1991) Semantics of Paragraphs. Computational
Linguistics 17/2:171-209.