Linguistic and Cognitive Underpinnings of Verbal Humour

Greg Lessard, French Studies, Queen's University

Michael Levison, Computing and Information Science, Queen's University

1. Introduction

A fundamental tenet of many current linguistic models is the distinction between competence and performance, with the former representing an internalized set of abstract rules and the latter the manifestation of linguistic knowledge in a real-world context, subject to constraints of memory, physiology, situation and so on. At the same time, many linguistic theories have assumed that the study of performance is somehow secondary or marginal, to be taken up once more central questions have been resolved.

In fact, however, there exists a broad range of linguistic activities which fall in some sense between the two terms of this opposition. Sometimes known as `exceptional language' (Obler and Menn 1982), this class includes verbal humour, riddles, and language games. Such phenomena are interesting in that they embody both linguistic relations such as homonymy and synonymy and a number of extralinguistic rules, such as those which define the precise `grammar' of particular styles of humour. (See Maranda 1971, Pepicello and Green 1984, Green and Pepicello 1984 for discussion of the notion of the grammar of riddles, which form one class of linguistic humour, and Scherzer 1982 for a survey of language games.)

In this paper, we propose to discuss one particular type of verbal humour known as Tom Swifties. Originally a parody of the writing style of ***, who studiously avoided the verb "said" in his writings in favour of occasionally far-fetched alternatives, Tom Swifties have taken on a life of their own[1], with the only trace of their origin remaining in the use of the proper name Tom in their structure, as the following examples illustrate.

(1a) We've struck oil, Tom said crudely.

(1b) Turn up the heat, Tom said coldly.

(1c) I hate seafood, Tom said crabbily.

The phenomenon is interesting for at least three reasons:

First, there has grown up in the past several years a research field built upon the computational modelling of linguistic phenomena (see Ephratt 1990, Steinhart 1995 for examples). In our research area of natural language generation, we are concerned with developing a computational environment (known as VINCI; see Levison and Lessard 1992 for details) which provides a collection of metalanguages describing the semantics, syntax, lexicon and morphology of a language in a formalism familiar to linguists, thus allowing them to define subsets of a language description in order to test particular models. If the linguist's model is correct, the system will generate appropriate utterances; if not, it will not. The system has already been used for a variety of languages, notably French, English and Russian, and at a variety of levels, including word formation, pronominal usage and the modelling of second-language syntax errors. Linguistic humour, by its heavy use of a range of linguistic devices and relations, provides an acid test for a generation environment. Thus, in previous work, we have modelled Tom Swifties (Lessard and Levison 1992) as well as riddles (Lessard and Levison 1993).

Second, by their very nature, examples of linguistic humour provide an ideal testing ground for the points of contact between cognitive and linguistic knowledge. As we shall see below, Tom Swifties illustrate that there is no sharp line of demarcation between strictly linguistic relations such as synonymy and antonymy and more `encyclopedic', cognitive relations which rely much more extensively on real-world knowledge, and similarly that the successful production of Tom Swifties calls into play a range of skills normally associated with language use in the strict sense.

Finally, since the production of Tom Swifties is a learned skill, as is that of most forms of exceptional language, there exists the potential for insight into the steps of the learning process. In the case which interests us here, we have access to transcripts of an on-line electronic discussion environment in which various members of a loosely-defined group appended their contributions to a growing list of Tom Swifties. As we will see below, this provides us with a unique window into the group's acquisition of the elements of this type of verbal humour.

2. Verbal Humour

The study of humour has a long and complex history covering a variety of disciplines ranging from psychology (Freud 19**) to sociology (***) to linguistics (for example, Hockett 1977). In many of these approaches, it is usual to distinguish two subclasses of humour, illustrated by examples (2a,b)[2].

(2a) A: Knock knock.

B: Who's there?

A: Banana. Knock knock.

B: Who's there?

A: Banana. Knock knock.

B: Who's there?

A: Orange you glad I didn't say banana?

(2b) Q: How many Californians does it take to screw in a light bulb?

A: Five. One to screw in the light bulb and four to share the experience.

On the one hand, verbal humour is based on the phonological, morphological, lexical or syntactic characteristics of utterances, such that it is impossible to substitute other items without losing the humorous quality of the utterance. Thus, in (2a), it is the homonymy of "orange you" and "aren't you" which grounds the humour. On the other hand, in non-verbal humour (2b), no particular linguistic element is crucial to the joke. We are dealing rather with cultural `knowledge' (the purported tendency of Californians to share experiences), and this knowledge could be expressed in a variety of fashions.

Note however that this distinction assumes a clear line of demarcation between linguistic and cognitive phenomena. As we will see below in the case of Tom Swifties, this line of demarcation between the two classes is not always clearcut. However, before exploring this question, we will begin by demonstrating that at least core Tom Swifties do obey a formal grammar which may be computationally modelled.

3. Modelling Tom Swifties

In order to model (and computationally generate) Tom Swifties, it is necessary to make explicit the `grammar' of the genre. In fact, there exist a number of sub-genres of Tom Swifties, depending on the pivotal element, which may be adverbial, as in (3a), verbal, as in (3b) or nominal (usually within a prepositional phrase), as in (3c)[3].

(3a) ***replace***

(3b) "It's not a candy mint, it's a breath mint", Tom asserted.

(3c) "I'm wearing my wedding ring", said Tom with abandon.

In what follows, on the basis of Lessard 1988 and Lessard and Levison 1992, we will present in detail the grammar of adverbial Tom Swifties. This model is easily extensible to the other classes.

In essence, Tom Swifties turn upon two and possibly three simultaneous relations, one formal and the other two semantic. Consider example (4):

(4) I hate seafood, Tom said crabbily.

The central element, which we will henceforth call the pivot, is formed by the adverb "crabbily". On the one hand, there exists an orthographic and phonological subset relation between the noun "crab" (`type of crustacean') and the adverb "crabbily" (`with ill humour'). At the same time, there exists a semantic relation of hyponymy between the subset noun "crab" and the noun "seafood" in the embedded sentence. Finally, there exists a third semantic relation (shared negative evaluation) between the adverb "crabbily" and the syntactic structure X hate Y. Note that this third relation is not found in all Tom Swifties: for example, it is lacking in examples (3a-c). In the remainder of this section, we shall concern ourselves only with the first two relations.
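The first of these relations, the orthographic subset relation between base noun and pivot adverb, is simple enough to sketch outside the VINCI formalism. The following Python fragment is a toy illustration only; the word list and the function name paronyms are our own inventions, not part of the system described in this paper. It approximates the relation as a prefix test:

```python
# Toy word list for illustration only; a real lexicon would be far larger.
ADVERBS = ["crabbily", "crustily", "shellfishly", "coldly"]

def paronyms(noun, adverbs):
    """Return the adverbs whose spelling contains the noun as a prefix,
    approximating the orthographic subset relation described above."""
    return [adv for adv in adverbs if adv.startswith(noun)]

print(paronyms("crab", ADVERBS))   # ['crabbily']
```

A full treatment would of course also require the phonological side of the relation, which simple string matching does not capture.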

If we are to computationally model a Tom Swifty such as that given in (4), we must begin with a rich lexical specification, of which (5) (specified in a simplified VINCI formalism) provides a simple example. Each lexical entry forms a separate line with fields delimited by vertical bars. The first field contains the lexical entry itself, within quotes. The second field of each lexical entry provides its part of speech, while the third field provides semantic attributes such as `edible'. The last field shown includes a set of lexical pointers, which provide named links to related lexical entries. Thus, the noun "crab" includes a pointer named hyperonym which points at the lexical entry "seafood" and another pointer named paronym which points at the adverb "crabbily".

(5) {Lexicon}

"crab"|N|edible...|hyperonym:"seafood"; paronym:"crabbily"/ADV;...




Given this rich set of lexical specifications, we can then define a syntax which generates a Tom Swifty from a base form. We will assume that the base form is defined by the noun, since it is the noun which grounds the formal and semantic links described above. More precisely, an appropriate base noun must satisfy all of the following conditions:

- it must describe something edible, since it will appear in a sentence describing tastes

- it must possess a hyperonym in the thirteenth field of its lexical entry

- it must also possess an adverbial paronym in its thirteenth field.

As shown in (6), this base form provides the seed for a transformation which produces a pronoun ("I"), a verb of evaluation taking an edible direct object ("hate"), the hyperonym of the base noun ("seafood"), the verb "said", the noun "Tom", and finally the adverbial paronym of the base noun. Application of this transformation to the base form generates the entire Tom Swifty, as well as all those which follow a similar pattern.

(6) {Syntax}

{Find a base noun which has both a hyperonym and a paronym in field 13}

BASE = N[edible]/13=hyperonym/13=paronym/ADV %

{Generate a sentence from the root using first its hyperonym, then its paronym}


N : PRON V[evaluation, edible.directobj] 1/@13:hyperonym

V[said] N[Tom] 1/@13:paronym/ADV ; %

{Apply the transformation to the BASE}
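For readers unfamiliar with the VINCI notation, the same generation step can be sketched in ordinary Python. The dictionary below mirrors the simplified lexical entry in (5); the second entry and the helper name generate_swifty are our own illustrative inventions, not part of VINCI:

```python
# Toy lexicon mirroring (5): part of speech, semantic attributes,
# and named lexical pointers to related entries.
LEXICON = {
    "crab": {
        "pos": "N",
        "attrs": {"edible"},
        "pointers": {"hyperonym": "seafood", "paronym": "crabbily"},
    },
    "table": {"pos": "N", "attrs": set(), "pointers": {}},
}

def generate_swifty(base):
    """Apply the transformation: base noun -> full adverbial Tom Swifty."""
    entry = LEXICON.get(base)
    if not entry or entry["pos"] != "N" or "edible" not in entry["attrs"]:
        return None                      # condition 1: an edible noun
    ptrs = entry["pointers"]
    if "hyperonym" not in ptrs or "paronym" not in ptrs:
        return None                      # conditions 2 and 3: both pointers present
    return f"I hate {ptrs['hyperonym']}, Tom said {ptrs['paronym']}."

print(generate_swifty("crab"))
# I hate seafood, Tom said crabbily.
```

The two guard clauses correspond directly to the conditions on the base noun listed above: a noun which lacks the attribute or either pointer simply fails to generate.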


As we have shown elsewhere (Lessard and Levison 1993), specifications of a similar sort enable us to computationally model a range of verbal humour, including, for example, homonym riddles (Terban 1982), as shown in (7a,b).

(7a) What do you call a less expensive bird?

A cheaper cheeper.

(7b) What do you call a naked bruin?

A bare bear.

5. Formal complexity

Despite our success in modelling simpler examples of verbal humour, it must be recognized that there exist much more complex examples which are currently beyond our capabilities. One of the principal reasons for this is the existence of non-lexicalized elements, whose generation requires that we create a new (and sometimes complex) semantic or conceptual structure prior to syntactic development.

Let us illustrate the range of difficulties encountered with a series of examples (8-11)[4]. In (8), both the pivot and the target are lexical entries ("crab"/"seafood"; "udder"/"cow", "milk"; "alarm"/"security system"). Consequently, generation of the Tom Swifty is relatively easy and involves essentially little more than lexical lookup and construction of a simple syntactic frame such as X hate Y, X sell Y. In most instances, the syntactic frame would be usable with a wide range of lexical items.

(8a) I hate seafood, said Tom crabbily.

(8b) Go out and milk the cows, Tom uddered.

(8c) You sell security systems, Tom exclaimed with alarm.

On the other hand, the examples in (9) present lexicalized pivots, but the targets themselves are free syntactic constructions, more or less direct paraphrases of the pivot ("run rough"/"sputter").

(9a) The engine seems to be running rough, Tom sputtered.

(9b) I think those hairy cows that walk on mountains are cute, yakked Tom.

(9c) I'm not doing well in class, said Tom failingly.

As a result of this, a variety of forms would be possible for the same target. Thus, in (9a) we could have "she's missing on the second cylinder", "this car never runs smoothly when it's damp", and so on.

The examples in (10) represent a point still farther along the continuum toward non-lexicalization: the pivot, while still lexicalized, is now morphologically complex, with its component elements each entering into a semantic relation with some element of the non-lexicalized target.

(10a) They had to amputate him at the shoulder, Tom announced disarmingly.

(10b) Yeah, my lobotomy was successful, said Tom absent-mindedly.

(10c) All my employees have quit, said Tom helplessly.

(10d) Yes, I will resume persistent questioning, replied Tom Swift.

Thus, in (10a), the prefix "dis-" is associated with the notion of removal, as in "dismember", "dismast", and linked with the lexical item "amputate" in the embedded sentence. Similarly, in (10b) the compound element "absent" in the pivot is linked with the semantics of "lobotomy".

Finally, the examples in (11) illustrate how the pivot may be reanalyzed not just to isolate a single lexical item within an adverb, but a complex series of lexical items.

(11a) I'm trapped in the wizard's necklace, said Tom independently.

(11b) I'll take the prisoner downstairs, said Tom Swift condescendingly.

Thus, in (11a), the adverb "independently" is reanalyzed as "in" + "de" (= "the") + "pendant", to correspond to "in the wizard's necklace". In (11b), the string "condescendingly" is reanalyzed as "con" + "descend".
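Reanalysis of this kind can be approximated computationally as an exhaustive segmentation of the pivot into items drawn from a word list, as in the following Python sketch. The word list here is a hypothetical toy (with "pendent" standing in for the "pendant" reading), and deciding which segmentations are humorously usable, the genuinely hard part, is not addressed:

```python
# Toy word list for illustration; a real system would consult a full lexicon.
WORDS = {"con", "descend", "ing", "in", "de", "pendent"}

def segmentations(adverb, words=WORDS):
    """Return every way of splitting the adverb (minus a final '-ly')
    into a sequence of items drawn from the word list."""
    stem = adverb[:-2] if adverb.endswith("ly") else adverb

    def split(rest):
        if not rest:
            yield []
        for i in range(1, len(rest) + 1):
            if rest[:i] in words:            # known word at the front?
                for tail in split(rest[i:]):  # recursively split the remainder
                    yield [rest[:i]] + tail

    return list(split(stem))

print(segmentations("condescendingly"))   # [['con', 'descend', 'ing']]
```

Even this toy makes clear why such examples resist generation: the segmentation is easy, but mapping each recovered item onto an element of a freely constructed target sentence is not.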

6. Semantic complexity

Tom Swifties may be not only formally complex, but also semantically varied. To begin with, one finds the whole gamut of `standard' semantic relations between pivot and target, as in (12).

(12) synonymy: We've struck oil, Tom said crudely.

antonymy: Turn up the heat, Tom said coldly.

hyponymy: I hate seafood, Tom said crabbily.

hyperonymy: No, you can't have any of my lobster, Tom said shellfishly.

meronymy: I like pizza, Tom said crustily.

However, beyond these, one finds as well a complex web of more complex relations some of which require a high level of encyclopedic knowledge, as we can see from the examples in (13).

(13a) instrument: Passing a mirror, `Gee, I look great tonight,' Tom reflected.

I'm being crucified, Tom said crossly.

agent: I'm sure I can repair your engine, Tom said mechanically.

(13b) causality: I'm six inches taller, Tom grow-ned.

(13c) degree: I wish I were taller, Tom said longingly.

(13d) quality: We're going to have sun today, Tom said brightly.

There's too much tabasco in this chili, Tom said hotly.

Put on the new nightie I bought you honey, Tom said transparently.

(13e) domain: I don't like computers, Tom said byte-ingly.

I've just come from a funeral, said Tom gravely.

(13f) anaphor: I've lost my underwear, Tom said briefly.

I know I'll find them, Tom said shortly.

While it is possible to argue that relations such as those in (13a), which hinge upon the argument structure of verbs, are still purely linguistic in character, this becomes progressively more difficult as we descend the list. Jackendoff (19**) includes a discussion of causality (13b) in his semantic model, and most componential analyses would recognize a common semantic trait `above the norm' in (13c). However, the adjective-noun relations of quality shown in (13d) are more difficult to link to purely linguistic phenomena. Should we treat "bright", "hot" and "transparent" as the product of collocational relations, or as parts of the semantic content of the nouns "sun", "tabasco" and "nightie"? Finally, relations such as those in (13e) have a clear grounding in encyclopedic information, but a less clear basis in narrowly linguistic relations. Thus, in what sense could we link "byte" and "computer" apart from a world-view based on a complex frame describing in some detail the operation of computational devices? (See Barsalou 19** for a discussion of how this might be accomplished.)

Finally, example (13f) shows that discourse relations such as anaphora must be taken into account when analyzing authentic Tom Swifties.

7. Acquisition

Given that Tom Swifties utilize such a complex range of linguistic and semi-linguistic relations, it is not surprising that they are a learned phenomenon. In fact, this is true of most exceptional language. The passage in (14) provides interesting insight into this process.

(14) A Very Mice Joke Book really has four authors. Son Kevin, daughter Kristi, and husband Roger all contributed to the store of Gounaud mouse jokes. In fact, the entire project was begun by eight-year old Kevin who volunteered the very first mouse joke at dinner one night.... A family of pun-lovers, we eagerly followed his lead and began keeping a notebook of our collective ideas. Mouse-mania dominated many supper times thereafter, spread to other world-weary friends, and infiltrated even my husband's office. (Gounaud, 1981)

The question then arises: how do people learn structures of this sort? Clearly, it would be valuable to have access to a longitudinal corpus in which one could trace the development of the model. As it happens, we possess such a corpus. At Queen's University in Canada, there exists an electronic discussion facility known as SOAPBOX in which any member of the university community may propose a subject for discussion and invite appends by others. Since its inception, SOAPBOX has given rise to over 2000 discussions, some serious, some frivolous. One of these was opened by the text reproduced in (15)[5]. Note that examples are given, but no formal specification ever appears in the corpus. In what follows, we will examine some of the developmental aspects of this corpus.

(15) All right, all right. Who remembers Tom Swifties? You know, jokes

along the lines of the following:

"I'm being crucified," Tom said crossly.

"I'm not doing well in class," Tom said failingly.

"Turn up the heat," Tom said coldly.

"We're going to have sun today," Tom said brightly.

I want to see your best (or worst) of these. Enjoy!

... End of Opening Comments by xxx

7.1 Sharing and Extension of the Universe of Discourse

An essential characteristic of our use of language is our ability to create a new universe of discourse and to dynamically alter and extend it on the fly. The following examples (16a-c) show that a similar skill is at work in the corpus, where appender aaa has used the encyclopedic knowledge that the punishment for theft in traditional Islamic law is removal of the hand, appender bbb has envisaged the consequence of applying the penalty, and appender ccc has generalized the model to the overarching concept of criminality.

(16a) You have been found guilty of theft, said Ahyatola Tom offhandedly.

... End of Append by aaa

(16b) Arrghhh... my hand... it's gone..., said Tom disjointedly.

... End of Append by bbb

(16c) "I'll take the prisoner downstairs," said Tom Swift condescendingly.

... End of Append by ccc

Similarly, when the corpus of Tom Swifties was presented to evaluators (see section 7.5 for discussion), one of the evaluators could not help extending the original universe of discourse of underwear found in the corpus, adding the parenthetical note "Birth of the meta-Tom-Swifty".

(17) corpus: I've lost my underwear, Tom said briefly.

I know I'll find them, Tom said shortly.

evaluator: This is an ordered pair, said the clerk at the underwear counter logically.

7.2 Analogical Extension to Other Domains

We also find evidence in the corpus for the existence of a kind of analogical extension of an original model to other domains. Thus, the same appender discovers the model based on domain relations and in short order applies it to six different domains (18a-f).

(18a) "I don't like computers," Tom said byte-ingly.

(18b) "I like this song. I should buy a copy," Tom noted.

(18c) "I dislike biology," Tom said lifelessly.

(18d) "I hate chemistry," Tom said acidly.

(18e) "I like physics," Tom said energetically.

(18f) "I also like math," Tom added.

... End of Append by xxx

7.3 Development of New Formal Frames

Beyond extension of existing models, we also find evidence for the creation of new formal frameworks. Thus, example (19) illustrates the first occurrence in the corpus of a bilingual Tom Swifty, where the string "ver" is French for `worm'.

(19) "Yes, we sell earthworms," Tom VERified.

... End of Append by aaa

Similarly, examples (20a-c) show the first examples in the corpus of complex pivots, containing more than one link to the embedded sentence. Again, this model is taken up by other appenders subsequently.

(20a) "Yes, I will resume persistent questioning," replied Tom Swift.

(20b) "The secret to strength lies in the muscles. Right, Edward?" insinuated Tom


(20c) "NO, you fool!! It used to be a fundamental principle, but it isn't any

more!!!" expostulated Tom Swift.

... End of Append by bbb

Similarly, we observe in the corpus a first use (21a) of the mechanism whereby capital letters are used to highlight the embedded element in the pivot. After being used twice by appender aaa, it is subsequently taken up by others (21b), and becomes part of the shared `grammar' in the rest of the corpus.

(21a) "There were two rotten eggs in the dozen you sold me!" Tom shouted TENsely.

<deleted material>

"Bartender, pour me one more." Tom muttered GINgerly.

<deleted material>

... End of Append by aaa

(21b) "I wish I had seen the first manned rocket launch." Tom said APOLLOgetically.

... End of Append by bbb

7.4 Lexicalization

As the corpus grows, we can see the development of the sense of a shared `lexicalized' set of examples. Thus, in (22a), xxx states:

(22a) I'm sure this is probably in here already, but I'm not goin to read thru

the whole thing anymore to find out.

'Ouch! I've cut myself!' said Tom sharply.

... End of Append by xxx

In fact, he is correct: the earlier append had the form shown in (22b).

(22b) "This loose-leaf has given me a paper-cut," Tom said sharply.

7.5 Metalinguistic Instructions and Judgements

At the same time as they develop increasingly complex models for Tom Swifties, appenders to the corpus show increasingly complex abilities to explain and evaluate their own productions and those of others. Thus, in (23a), appender aaa produces a relatively obscure example and some time later feels obliged to explain it.

(23a) "Hey, that bird is deformed. " said Tom knowingly.

... End of Append by aaa

<deleted appends by others>

And if you didn't get number 62...

'Hey, that bird is deformed.' said Tom knowingly...

Read it No Wing ly.

Wouldn't want you to have missed out.

... End of Append by aaa

More interestingly, in (23b), appender bbb (implicitly) criticizes the use of an adverb by aaa, reflecting the generally shared judgement that the base form of the pivot must be a lexicalized form.

(23b) I am new at this, so pardon me excessively.

the program stop running;

"there is a bug right here" Tom swift said raidily.

... End of Append by aaa

Interesting Adverb....what's the definition?

... End of Append by bbb

In an attempt to explore under more controlled conditions the extent of the shared judgements about the quality of productions in the corpus, we asked two adult male evaluators to read the items in the corpus and to rank each production on a three-point scale, with values of good, average and bad. If the two judges' rankings were independent and random, we would expect perfect agreement in 33.3% of cases, one level of separation in 44.4%, and two levels of separation in 22.2% (of the nine equally likely pairs of rankings, three agree, four differ by one level, and two differ by two levels). In fact, the results shown in (24) demonstrate a markedly higher level of agreement between the two judges: perfect agreement in 45% of cases, and two-level separation in only 12%. In addition, where differences in level do exist, they are most often in the same direction, indicating that by and large judge B is more severe than judge A.

(24) Scale: a = good, b = average, c = bad. Judges: two male adults, A and B.

     Perfect agreement (A = B):      41  (45%)
     Separation of 1 level:          40  (43%)
         A > B:                      29
         B > A:                      11
     Separation of 2 levels:         11  (12%)
         A > B:                      10
         B > A:                       1
     Total:                          92 (100%)
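The comparison with a chance baseline can be made explicit with a short Python calculation. The counts are those reported in (24); the variable names are ours. With two independent judges each choosing uniformly among three levels, the nine possible pairs of rankings split 3 : 4 : 2 across separations of 0, 1 and 2 levels:

```python
from fractions import Fraction

# Observed counts from (24): separation level -> number of items.
observed = {0: 41, 1: 40, 2: 11}
total = sum(observed.values())   # 92 items in all

# Chance baseline: of the 9 equally likely (A, B) ranking pairs on a
# 3-point scale, 3 agree, 4 differ by one level, 2 differ by two levels.
chance = {0: Fraction(3, 9), 1: Fraction(4, 9), 2: Fraction(2, 9)}

for sep in (0, 1, 2):
    print(f"separation {sep}: observed {observed[sep] / total:.1%}, "
          f"chance {float(chance[sep]):.1%}")
```

Observed perfect agreement (44.6%) well exceeds the chance figure of 33.3%, while two-level separation (12.0%) falls well below its chance figure of 22.2%, supporting the claim of substantial shared judgement.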

8. Conclusion

In sum, by the end of the corpus, we have seen evidence for a wide range of `linguistic' skills, including creativity, analogy, lexicalization, and metalinguistic judgement. All of these are applied not to purely linguistic phenomena, but to the `grammar' of Tom Swifties as such.

Taken together with the use of both strict semantic and more complex cognitive relations observed in section 6, these results suggest that there is no clearcut demarcation between linguistic productions in the usual sense (utterances used to refer unambiguously to extralinguistic reality) and at least some aspects of `exceptional language'.


[1] There are even collections of Tom Swifties available on the World Wide Web, for example, at

[2] Examples are taken from a collection of light bulb jokes at

[3] Examples are taken from the Web site given in [1].

[4] All subsequent examples are drawn from the corpus described in section 7.

[5] Note that in what follows, we have hidden the identity of those who appended to the discussion. However, in cases where the distinctness of appenders is significant, we have retained a separate identifier for each individual.


References

Ephratt, M. (1990) What's in a Joke.

Gounaud, Maud (1981) A Very Mice Joke Book, Boston.

Green, T.J., W.J. Pepicello (1984) The Riddle Process, Journal of American Folklore 97/384: 189-203.

Hockett, Charles F. (1977) Jokes, in The View from Language, Athens: Univ. of Georgia Press, pp. 257-289.

Lessard, Greg (1988) Tom Swifties: analyse sémantique et formelle, Annual Meeting, Canadian Linguistic Association, University of Windsor.

Lessard, Greg, Levison, Michael (1992) Computational Modelling of Linguistic Humour, ALLC/ACH Conference, Oxford University.

Lessard, Greg, Levison, Michael (1993) Computational Modelling of Riddling Strategies, ACH/ALLC Conference, Georgetown University. To appear in Research in Humanities Computing 5, Nancy Ide, Susan Hockey, eds., Oxford University Press.

Levison, Michael, Lessard, Greg (1992) A System for Natural Language Generation, Computers and the Humanities 26:43-58.

Maranda, E.K. (1971) The Logic of Riddles, in P. Maranda, ed. Structural Analysis of Oral Tradition, Philadelphia.

Obler, Lynn, Menn, Lise (1982) Exceptional Language and Linguistics, New York: Academic Press.

Pepicello, William J., Green, Thomas A. (1984) The Language of Riddles: New Perspectives, Columbus.

Scherzer, Joel (1982) Play Languages: with a note on Ritual Languages, in Obler and Menn, eds., 1982, pp. 175-199.

Steinhart, E. (1995) NETMET: A Program for Generating and Interpreting Metaphors, Computers and the Humanities 28:383-392.

Terban, M. (1982) Eight Ate: A Feast of Homonym Riddles, Boston.