ALLC/ACH96


Generating Coherent Paragraphs

Greg Lessard, French Studies, Queen's University

Michael Levison, Computing and Information Science, Queen's University



1 introduction


1.1 computer aided instruction and authenticity


- tradeoff between authenticity and control

- communicative competence versus grammatical knowledge


1.2 one solution: authentic texts plus parsing


- hand parsing (CALIS)

- machine parsing and understanding (e.g. BORIS; Lehnert et al., 1983)


1.3 another solution: machine construction of texts


- microworlds (FLUENT, Hamburger, 1994)

- the problem of extension


1.4 needed: a simple metalanguage for paragraph construction


- relatively powerful

- non-programmer friendly

- easily adaptable

- supports dialogues


1.5 VINCI



2 theories of the paragraph


2.1 the paragraph as unit


- a barrier for anaphora (Hofmann, 1989)


- a semantic domain


Reagan thinks bananas. (Zadrozny and Jensen, 1991)


2.2 paragraph types


- narrative

- procedural

- expository


2.3 discourse prominence


- a model for English (Longacre, 1989)


Band 1: Storyline
    Past (S/Agent) Action, (S/Agent/Patient) Motion
    Past (S/Experiencer) Cognitive events (punctiliar adverbs)
    Past (S/Patient) Contingencies

Band 2: Background
    Past Progressive (S/Agent) background activities
    Past (S/Experiencer) Cognitive states (durative adverbs)

Band 3: Flashback
    Pluperfects (Events, activities which are out of sequence)
    Pluperfects (Cognitive events/states that are out of sequence)

Band 4: Setting (expository)
    Stative verbs/adjectival predicates/verbs with inanimate subjects
    (descriptive)
    "Be" verbs/verbless clauses (equational)
    "Be"/"Have" (existential, relational)

Band 5: Irrealis (other possible worlds)
    Negatives
    Modals/futures

Band 6: Evaluation (author intrusion)
    Past tense (cf. setting)
    Gnomic present

Band 7: Cohesive band (verbs in preposed/postposed adverbial clauses)
    Script determined
    Repetitive
    Back Reference
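
The band scheme can be read as a ranking in which lower band numbers lie
closer to the storyline. The following Python fragment is a minimal sketch of
that reading only. The clause labels and the names BANDS, band_of and
most_prominent are assumptions made for illustration; they are not part of
Longacre's model or of VINCI, and the mapping is coarser than the table (for
instance, a past tense may also belong to Band 6).

```python
# Hypothetical, simplified mapping from a clause label to a band number.
BANDS = {
    "past": 1,              # storyline
    "past_progressive": 2,  # background
    "pluperfect": 3,        # flashback
    "stative": 4,           # setting
    "negative": 5,          # irrealis
    "modal": 5,
    "future": 5,
    "gnomic_present": 6,    # evaluation
    "adverbial_clause": 7,  # cohesive band
}


def band_of(label):
    """Return the band number assumed for a clause label."""
    return BANDS[label]


def most_prominent(clauses):
    """Return the text of the clause in the lowest-numbered band.

    clauses is a list of (text, label) pairs.
    """
    return min(clauses, key=lambda clause: band_of(clause[1]))[0]
```

Given the pair of clauses "Joe Sakic scored two goals" (past) and "he had been
in a slump" (pluperfect), the sketch treats the first as the more prominent.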


2.4 given and new (Weissberg, 1984)


- linear progression


Hydrology is based on the water cycle, more commonly called the
hydrologic cycle. This cycle can be visualized as beginning with
the evaporation of water from oceans and continental lands. The
resulting vapor... (p. 489)


- constant topic


Herbage of crested wheatgrass was harvested from 10 unfertilized
plots and 10 permanent plots annually fertilized with 8 pounds of
nitrogen per acre. Herbage from ... The selected herbage...
(p. 489)


- hypertheme


The reflector was protected from the weather by an outer window of
0.10 mm tedlar. The focal length of the reflector... The back of
the reflector... The reflector rack... (p. 490)
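
The three patterns can be told apart mechanically once each sentence has been
reduced to a theme and a rheme. The Python sketch below assumes that this
reduction has already been done by hand and that themes are normalised labels;
the function name progression and the part_of table are hypothetical and serve
only to illustrate the distinctions.

```python
def progression(sentences, part_of=None):
    """Classify a list of (theme, rheme) pairs.

    'constant'   : every sentence has the same theme
    'linear'     : each theme repeats the rheme of the previous sentence
    'hypertheme' : the themes differ but all derive from one overall theme,
                   according to the hand-built part_of table
    'other'      : none of the above
    """
    part_of = part_of or {}
    themes = [theme for theme, _ in sentences]
    if len(set(themes)) == 1:
        return "constant"
    if all(sentences[i][0] == sentences[i - 1][1]
           for i in range(1, len(sentences))):
        return "linear"
    if len({part_of.get(theme, theme) for theme in themes}) == 1:
        return "hypertheme"
    return "other"
```

With hand-assigned labels, the hydrology passage comes out as linear, the
herbage passage as constant, and the reflector passage as hypertheme, provided
that part_of records that the focal length, the back and the rack all belong to
the reflector.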


2.5 computational approaches


- planning (McKeown, 1985; Hovy, 1991)

- parsing (Zadrozny and Jensen, 1991)

- text structure (Mann & Thompson, 1987)

- reference (Dale, 1992)


2.6 pedagogical approaches


3 VINCI formalism


3.1 attributes


Nounsemantics (

human,

player<human,

goalie<player,

roy<goalie,

forward<player,

sakic<forward,


action,

goal<action,

shot<action,

drop<action,

kill<action,


physobj,

machine<physobj,

computer<machine,

laptop<computer,


document,

program<document,

dtd<program,


...)
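
The "<" notation places each attribute value below a more general one, so that
roy is a goalie, a player and a human. As an illustration only, the hierarchy
can be mimicked in Python with a parent table and an upward walk. The names
PARENT and is_a are assumptions of this sketch and say nothing about how VINCI
stores attributes internally.

```python
# child -> parent, copied from the Nounsemantics listing above
PARENT = {
    "player": "human", "goalie": "player", "roy": "goalie",
    "forward": "player", "sakic": "forward",
    "goal": "action", "shot": "action", "drop": "action", "kill": "action",
    "machine": "physobj", "computer": "machine", "laptop": "computer",
    "program": "document", "dtd": "program",
}


def is_a(value, ancestor):
    """True if value equals ancestor or lies below it in the hierarchy."""
    while value is not None:
        if value == ancestor:
            return True
        value = PARENT.get(value)  # None once a top-level value is passed
    return False
```

A test such as is_a("roy", "human") succeeds, while is_a("roy", "forward")
fails, which is the behaviour a selection restriction on "human" subjects
would need.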


3.2 lexical pointers


"Patrick Roy"|N|npr.>roy.Function, Number.Function||#1|||
hyper:"goalie"; appos:"the Colorado goaltender";||||

"error"|N|Determ.error.Function, Number.Function||$1|||
syn:"mistake"||||

"mistake"|N|Determ.error.Function, Number.Function||$1|||
syn:"error"||||

"apple"|N|>indef,>apple.Function,Attitude,Number||$1|||
ptaste:"sweet"/ADJ; ntaste:"sour"/ADJ; ptexture:"crunchy"/ADJ, "crisp"/ADJ,
"juicy"/ADJ; ntexture:"mushy"/ADJ,"pulpy"/ADJ, "soft"/ADJ;
pcolour:"red"/ADJ; ncolour:"green"/ADJ; kind:"MacIntosh","Spy";||||
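
Each pointer field pairs a label (hyper, appos, syn, ptaste and so on) with one
or more quoted target words, optionally followed by a category such as /ADJ.
The Python sketch below reads such a field into a dictionary. It is based only
on the entries shown here, not on the full VINCI lexicon format, and the
function name parse_pointers is hypothetical.

```python
import re


def parse_pointers(field):
    """Parse 'label:"word"/CAT, "word"; ...' into
    {label: [(word, category_or_None), ...]}.

    Assumes that quoted words contain no ';' and that labels contain no ':'.
    """
    pointers = {}
    for chunk in field.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        label, _, targets = chunk.partition(":")
        pointers[label.strip()] = [
            (word, category or None)
            for word, category in re.findall(r'"([^"]*)"(?:/(\w+))?', targets)
        ]
    return pointers
```

Applied to the pointer field of "Patrick Roy", with the trailing field
separators removed, the sketch yields one hyper target ("goalie") and one appos
target ("the Colorado goaltender").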


3.3 guarded syntax


ROOT =


S[npr.roy.subj, stop, n1.shot.objd, past, sing.subj, sing.objd]


S =

< Determ.Nounsemantics.objd :

inherit As : Number.subj,

Ao : Number.objd,

Ns : Determ.Nounsemantics.subj,

No : Determ.Nounsemantics.objd,

Vs : Verbsemantics,

Ts : Tense;

NP[As, Ns]

VP[Ns, Vs, No, Ts, As, Ao]


>

inherit As : Number.subj,

Ns : Determ.Nounsemantics.subj,

Vs : Verbsemantics,

Ts : Tense;

NP[As, Ns]

VP[Ns, Vs, Ts, As] %


NP =

< npr.Nounsemantics.Function :

inherit Ns : Determ.Nounsemantics.Function;

N[Ns]


>

inherit Nr : Number.Function,

Ns : Determ.Nounsemantics.Function;

DET[Nr, Ns] N[Nr, Ns] %


VP =

< Determ.Nounsemantics.objd :

inherit As : Number.subj,

Ao : Number.objd,

Ns : Determ.Nounsemantics.subj,

No : Determ.Nounsemantics.objd,

Vs : Verbsemantics,

Ts : Tense;

V[Ns, Vs, No, Ts, As, p3] NP[Ao, No] PUNCT[period]


>

inherit As : Number.subj, Ns : Determ.Nounsemantics.subj,

Vs : Verbsemantics, Ts : Tense;

V[Ns, Vs, Ts, As, p3] PUNCT[period] %
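
In the rules above, the material between "<" and ">" appears to be tried only
when the guard (here, the presence of a direct object attribute) is satisfied;
otherwise the alternative after ">" is used. The Python sketch below imitates
that choice for S. It is an interpretation of the listing, not a description of
the VINCI implementation, and the dictionary keys and the name expand_s are
assumptions.

```python
def expand_s(attrs):
    """Expand S into a flat list of (category, value) leaves.

    attrs is a hypothetical dict with 'subj', 'verb', 'tense' and,
    optionally, 'objd'. The guard is the test for 'objd'.
    """
    subject = ("NP", attrs["subj"])
    verb = ("V", (attrs["verb"], attrs["tense"]))
    if "objd" in attrs:
        # guard succeeds: transitive alternative
        return [subject, verb, ("NP", attrs["objd"]), ("PUNCT", "period")]
    # guard fails: intransitive alternative
    return [subject, verb, ("PUNCT", "period")]
```

With an object attribute present the expansion has four leaves; without one it
has three, and no object NP is generated.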


3.4 transformations


APPOS = TRANSFORMATION

PRIORITY 22

* N[npr.Nounsemantics.Function] * : 1 2 PUNCT[comma] 2/@8:appos 3 ;

* NP * : 1 APPOS: 2 3 ;

%


HYPER = TRANSFORMATION

PRIORITY 20

* N[npr.Nounsemantics.Function] * : 1 DET/"the" 2/@8:hyper 3 ;

* N * : 1 2/@8:hyper 3 ;

* NP * : 1 HYPER: 2 3 ;

%


PASSIVE = TRANSFORMATION

PRIORITY 30

* NP * VP * : 1 PASSMORPH: 4 3 PREP[by] 2 5;

%


PASSMORPH = TRANSFORMATION

PRIORITY 31

* V NP[plur.objd] * : 1 3 V[vcop, 2!Tense, p3, plur.subj] 2[past] 4 ;

* V NP[sing.objd] * : 1 3 V[vcop, 2!Tense, p3, sing.subj] 2[past] 4 ;

%
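
Read together, PASSIVE and PASSMORPH move the object to the front, insert a
copula agreeing with it, put the verb in its participial form, and demote the
subject to a by-phrase. The Python sketch below reproduces that reordering on a
flat clause, for the past tense only. The dictionary keys and the function name
passive are hypothetical, and the past participle is supplied by hand rather
than computed.

```python
def passive(clause):
    """Return the past passive of a simple transitive clause.

    clause is a dict with 'subj', 'verb_pp' (past participle), 'obj'
    and 'obj_number' ('sing' or 'plur').
    """
    # The copula agrees with the promoted object, as in PASSMORPH.
    copula = "were" if clause["obj_number"] == "plur" else "was"
    words = [clause["obj"], copula, clause["verb_pp"], "by", clause["subj"]]
    return " ".join(words) + "."
```

For the subject "Joe Sakic", the participle "scored" and the plural object "two
goals", the result is "two goals were scored by Joe Sakic."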



4 operations


4.1 static versus dynamic information


- static


I like apples. They are sweet, red and crunchy. Apples are good

for you. I particularly like MacIntosh.


- dynamic


Patrick Roy stopped twenty shots. Joe Sakic scored two goals.


- dynamic with static (apposition)


Patrick Roy, the Colorado goaltender, stopped twenty shots.

Joe Sakic, Colorado's high-scoring centre, scored two goals.


4.2 negation


Patrick Roy didn't stop twenty shots.

Joe Sakic didn't score two goals.


4.3 connectors


Patrick Roy stopped twenty shots while Joe Sakic scored two goals.


4.4 pronominalisation


he stopped twenty shots. Joe Sakic scored them.


4.5 hyperonymy/hyponymy


the goalie stopped twenty shots. the player scored two goals.


4.6 tense


Patrick Roy stopped twenty shots while Joe Sakic scored two goals.

Patrick Roy stops twenty shots while Joe Sakic scores two goals.

Patrick Roy will stop twenty shots while Joe Sakic will score two goals.
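
The three variants differ only in the tense attribute shared by the two
clauses. A minimal Python sketch of that idea follows. The table FORMS and the
functions clause and linked are hypothetical, and the verb forms are listed by
hand rather than derived by morphology rules as they would be in VINCI.

```python
# Hand-listed third person singular forms for the two verbs in the examples.
FORMS = {
    "stop": {"past": "stopped", "pres": "stops", "fut": "will stop"},
    "score": {"past": "scored", "pres": "scores", "fut": "will score"},
}


def clause(subj, verb, obj, tense):
    """Realise one clause without final punctuation."""
    return f"{subj} {FORMS[verb][tense]} {obj}"


def linked(first, second, tense, connector="while"):
    """Join two (subj, verb, obj) triples with a connector, in one tense."""
    return f"{clause(*first, tense)} {connector} {clause(*second, tense)}."
```

Calling linked with the triples for Roy and Sakic and the tense "pres" gives
the second sentence above; changing only the tense argument gives the other
two.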


4.7 promotion/demotion

(tense)


Joe Sakic scored two goals. he had been in a slump.


(passive)


Patrick Roy stopped twenty shots. Two goals were scored by Joe Sakic.


5 operations on a larger span


5.1 the base specification


ROOT =

S[npr.jones.subj, open, def.door.objd, past, sing.subj, sing.objd]

PUNCT[period]

S[npr.jones.subj, drop, poss.laptop.objd, past, sing.subj, sing.objd]

PUNCT[period]

S[npr.smith.subj, be, past, dead, sing.subj]

PUNCT[period]

S[def.computer.subj, kill, past, npr.smith.objd, sing.subj, sing.objd]

PUNCT[period]

S[npr.smith.subj, make, indef.error.objd, in, poss.dtd.obji, past,

sing.subj, sing.objd, sing.obji]

PUNCT[period]

%


5.2 basic output


Professor Jones opened the door. Professor Jones dropped his laptop.

Professor Smith was dead. the computer killed Professor Smith.

Professor Smith made a mistake in his DTD.


5.3 enhanced output


the door was opened by Professor Jones, the literature professor. he

dropped his laptop. Professor Smith, the humanities computing

specialist was dead. he had been killed by his computer. he had made an

error in his DTD.


5.4 alternative output


the door was opened by Professor Jones, the literature professor.

Professor Smith, the humanities computing specialist was dead. he had

made an error in his DTD. the computer had killed him. Professor Smith

dropped his laptop.


5.5 excessive pronominalisation


he opened it. he was dead. he had made an error in his DTD. the

computer had killed him. he dropped his laptop.
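
One simple guard against this kind of over-pronominalisation is to allow a
pronoun only when the immediately preceding sentence had the same referent in
focus. The Python sketch below applies that rule to a sequence of focused
referents. It is one possible heuristic, not the rule used in VINCI, and it
assumes that every referent is masculine singular; the function name
pronominalise is hypothetical.

```python
def pronominalise(referents):
    """Return surface forms for a sequence of focused referents.

    'he' is used only when the previous sentence focused on the same
    referent; otherwise the full name is repeated.
    """
    forms = []
    previous = None
    for name in referents:
        forms.append("he" if name == previous else name)
        previous = name
    return forms
```

For the sequence Jones, Jones, Smith, Smith, Smith, the sketch keeps the first
mention of each professor and pronominalises the rest, which matches the
pattern of the enhanced output in 5.3.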



6 interactivity


6.1 formatting of output


the door was opened by Professor Jones, the literature professor.

he dropped his laptop.

Professor Smith, the humanities computing specialist was dead.

he had been killed by his computer.

he had made an error in his DTD.


6.2 questions and answers


What did Professor Jones open?

What did Professor Jones drop?

What did Professor Smith do?

Who was dead?
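
Questions of this kind can be derived from the same propositions that produce
the paragraph, by leaving out one participant and substituting a question word.
The Python sketch below shows the idea with fixed templates. The function names
are hypothetical, and do-support is hard-coded for the past tense rather than
generated by a transformation.

```python
def object_question(subj, verb):
    """Ask for the direct object of a past-tense transitive clause."""
    return f"What did {subj} {verb}?"


def subject_question(predicate):
    """Ask for the (human) subject of a predicate."""
    return f"Who {predicate}?"
```

From the first proposition of the base specification, object_question
("Professor Jones", "open") yields the first question above, and
subject_question("was dead") yields the last.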


6.3 rewriting



7 conclusions


7.1 extensions of the model


- multilingual generation

- clothing information in syntax


7.2 additions to VINCI


- deconstruction of attributes

- predecessor and successor relations

- preselection of lexical classes

- inferencing



References:


Dale, R. (1992) Generating referring expressions: constructing descriptions
in a domain of objects and processes. Cambridge, Mass.: MIT Press.

Hamburger, H. (1994) Foreign Language Immersion: Science, Practice and a
System. Journal of Artificial Intelligence in Education 5/4:429-454.

Hofmann, T.R. (1989) Paragraphs and Anaphora. Journal of Pragmatics
13:239-250.

Hovy, E.H. (1991) Approaches to the Planning of Coherent Text. In Paris,
Swartout and Mann (eds.) Natural Language Generation in Artificial
Intelligence and Computational Linguistics. Boston: Kluwer.

Lehnert, W. et al. (1983) BORIS - An experiment in in-depth understanding of
narratives. Artificial Intelligence 20:15-62.

Lenat, D. (1990) Building large knowledge-based systems: representation and
inference in the Cyc project. Reading, Mass.: Addison-Wesley.

Levison, M., Lessard, G. (1992) A System for Natural Language Generation.
Computers and the Humanities 26:43-58.

Longacre, R.E. (1979) The Paragraph as a Grammatical Unit. Syntax and
Semantics 12:115-134.

Longacre, R.E. (1989) Two Hypotheses Regarding Text Generation and Analysis.
Discourse Processes 12:413-460.

Mann, W.C. and Thompson, S.A. (1987) Rhetorical Structure Theory: Description
and Construction of Text Structures. In Kempen, G. (ed.) Natural Language
Generation. Dordrecht: Martinus Nijhoff.

McKeown, K.R. (1985) Text generation: using discourse strategies and focus
constraints to generate natural language text. Cambridge: Cambridge University
Press.

Weissberg, R.C. (1984) Given and New: Paragraph Development Models from
Scientific English. TESOL Quarterly 18/3:485-500.

Zadrozny, W., Jensen, K. (1991) Semantics of Paragraphs. Computational
Linguistics 17/2:171-209.