CRT: Toki Pona
Design
equals intention.—Richard
Eckersley
Surely
the most powerful of all humanity’s beastly urges is the deep-seated lust to
edit other people’s writing. You’ve certainly felt it, reading the faux pas in
the local paper (“the official spoke on condition of animosity”), or the purple
prose in a best seller (“You know your body loves this, Anastasia”), or the
typo on the chalkboard at the deli counter (“your welcome to a sample”). But in
my case, the urge is directed not at correcting words or even sentences, but at
correcting grammars. That is why I studied for so many years to become a CRT.
No, not a Cathode Ray Tube, not a Communist Republican Terrorist, not even a
Cerebrally Ruptured Trumpite. I am a CRT: a Conlang Repair Technician.
When you think of an ideal repair tech you
should think of Darren, the guy who repaired my air-conditioner. There was the
poor young man, slaving away on a sunny day in August with a heat index of 110.
He wasn’t being creative, wasn’t adding anything, wasn’t improving anything. I
didn’t want my central air improved: I wanted not to roast alive in my own
living room. And I didn’t, thanks to Darren. As global warming proceeds apace I
believe fixers of air-conditioners will come to be worshipped as gods.
A repair technician doesn’t make things
better; s/he makes things the same in a better way. They restore function to
what should be functional and isn’t. And plenty of conlangs could work more
efficiently than they do. By “more efficiently” I don’t mean the endless
quibbles about Esperanto’s “Standard European” vocabulary, or which conlang is
most speakable/learnable/useful, or what an ideal conlang “should” be. I
mean: are the intentions of the conlang’s creator fully realized? And if they
aren’t, how can they be made to be so? Any mismatch between design and
intention is what a Conlang Repair Technician repairs.
There are about 900 conlangs listed on
various websites, and more keep coming. Dothraki, a language of Essos in
George R. R. Martin’s A Song of Ice and Fire series, is a recent addition to the corpus, as is Na’vi from the film Avatar. But of all this cornucopia of
conlangs, surely the easiest to learn is the delightful “minilang” called Toki
Pona, the “Language of Good.” Hardly the kind of extremolang we’ve been
discussing in these pages, it is nonetheless a marvelous example of what you can
do once you realize the vast potential of language as a plaything. And like any
language, nat- or con-, Toki Pona is also a window into the human mind.
In doing my repair work I stick closely to
Pu, the “official” book about Toki
Pona written by its creator, Sonja Lang. jan
Sonja, as she is known to her tribe, has written a marvel of scholarship,
creativity, and playfulness, inspired—uniquely, as far as I can tell—by the
teachings of Lào Zi, the (perhaps imaginary) author of the Dào Dé Jīng. Toki Pona is an “a posteriori” conlang, meaning that
its vocabulary and grammar come from other languages. About 30% of the lexicon
is borrowed directly or indirectly from English, and the derivations are quite
obvious. For example, jaki is translated
in the Pu dictionary as “disgusting,
obscene, sickly, toxic, unclean, unsanitary”: in a word, yucky. On the other hand, “you” is sina and “strong” is wawa, both
of which are Finnish, and pana is
“give,” from KiSwahili pa-ana, “give-each.other.”
Facebook’s Toki Pona group hosts extensive discussions on the sources Lang used
for her vocabulary, and unlike Esperanto or Lojban, Toki Pona’s sources are
quite far-ranging.
Like many conlangs, Toki Pona is intended
to have a positive effect on the minds of its speakers. The reader already
knows my take on Sapir-Whorfism. But really, the “point” of Toki Pona—if a
language needs to make a point—is not to conduct an experiment, or to prove a
theory, but to have fun. And jan Sonja certainly
sets a good example. In referring to her book as Pu she makes a wonderful pun: pu
in Classical Chinese is the “uncarved block” of Taoism, the fundamental,
unspeakable ground-of-being. But to native English-speakers of a certain
temperament, Pu also opens the
gateless gate of verbal mischief:
(1) mi wile musi kepeken Pu
I want play using Book
“I like to play with Pu.”
Which, of course, is what I’m doing right here. But more practically, in this chapter I do four things with Toki Pona:
1. interpret one principle
2. discover two morphemes
3. erect three rules, and
4. make four morphemes out of two.
This is not to say that
Toki Pona needs fixing: it doesn’t.
It already has what it needs to attract a fairly sizable online fanbase: 3,700
members of the Facebook group and counting. But as I develop my arguments in
this chapter, I hope it will become clear that, while Toki Pona is a “language
of good,” in certain ways it can be better. And your Conlang Repair Technician
can make it better without changing a single one of its major features.
One
What does Pu mean by
its repeated references to “simplicity”? The “simplicity” jan Sonja intends is a simplification of thought:
Training your mind to
think in Toki Pona can lead to deeper insights. If many of life’s problems are
created by our excess thoughts, then Toki Pona filters out the noise and points
to the center of things. (p. 12)
How is this done? “The
wisdom of life consists in the elimination of non-essentials,” as the
20th-century Chinese philosopher Lin Yutang is quoted as saying on Pu’s page 80. Eliminate the “clutter”
from a language and what is left will help its speakers have “simpler” minds,
more calm, honest, and collected. But what part
of language is to be simplified? In a nimi:
vocabulary.
There are only 120 or so words in Toki Pona. With so small a
word-hoard each item must cover a lot of semantic territory. Toki can mean “communicate, speak, say,
talk, use language, think,” while pona translates
“good, positive, useful, friendly, peaceful, simple.” The stretch from “good”
to “simple” is quite broad, and embodies a very particular view of “goodness,”
but jan Sonja isn’t out to construct
a “neutral” interspeech like Esperanto or a “logical” tongue like Lojban or
Ithkuil. Her intent is to boil down the word-soup of language until only the
bones remain, the essential concepts necessary for basic human communication.
In training the mind to think in these concepts the fundamental simplicity of
reality might then reveal itself
whenever you speak. You can’t say “credit default swap” in Toki Pona, but you
can say jan li suli mute, mani li suli
lili, “people are of much importance, money is of little importance” (p. 73).
This does not seem to me to be a restriction of thought à la Newspeak but
an elegant statement of fact: even a Wall Street hustler could understand it, once
past capitalist duckspeak.
But Pu cannot get
past a basic fact of human speech: any vocabulary
of a human language, no matter how small, is bound to contain boatloads of
hidden complexity. Recall Chomsky’s remarks on the “richness” of the lexicon:
Internal conditions on meaning are rich, complex, and
unsuspected; in fact, barely known. The most elaborate dictionaries do not
dream of such subtleties; they provide no more than hints that enable the intended
concept to be identified by those who already have it.
As a demonstration of
this, take pona as it is used in the
common phrase jan pona. Picking out
the “friendly” definition of pona, this
word is used by the Toki Pona community to mean “friend.” But why this meaning? It could just as easily
mean:
(a) “simple person”—one
who is genuine, unaffected, honest; or
(b) “peaceful
person”—someone like Gandhi, MLK, Kabir, or the Dalai Lama; or
(c) “good person”—one who
is noble, decent, honorable; or
(d) “positive person”—an
optimist, one who looks on the bright side; or
(e) “useful
person”—someone who can be counted on, who is dependable.
Of course, you could
argue that a “friend” is any or all of these things. But jan as used in jan pona
appears to pick out one particular meaning from those listed in Pu—the meaning “friendly”—and allows the
other meanings to hover in the background. Is this selection process “simple”?
And what rules or principles guide speakers in making it? And why translate
“friend” as jan pona and not as, say, jan pi pilin mi, “person of my heart,”
or jan (lon) poka (mi), “person (at)
(my) side”?
Or take a more serious issue: the grammaticalization of
nurture. Suli is translated “big,
large, heavy, long, tall; important; adult.” Its complement lili, however, means “little, small,
short; few; a bit; young.” If a jan suli is
an important person, is a jan lili unimportant?
The word for “parent” is mama, but
the word for “child” is . . . jan lili,
a compound. Why this asymmetry? There
are a number of words for being in charge—mama,
lawa, kute—but no words for a charge,
for someone under the care of another. Kute
means “hear, listen, pay
attention to, obey.” All examples in Pu of
kute as “obey” refer to children
obeying their parents. Do parents ever kute
their children? What Inuktitut expresses with -gi- is apparently beyond Toki Pona’s power. Whether it should be or not is beyond the power of
a mere CRT. I raise the issue of grammatical nurture to show that the
“simplicity” of Toki Pona’s lexicon masks complex presuppositions that are
nowhere made explicit. If Chomsky is to be believed, such complexity is as
natural as the Way of Heaven, and as inescapable.
Syntax, however, is another matter. The syntactic rules of Toki
Pona are refreshingly simple. Basic order is SVO; heads come before modifiers;
there are prepositions and a postposition, and a half-dozen form-classes. Simple
indeed—but could it be more so? In Chomsky’s current grammatical model, his
“Minimalist Program,” the only syntactic processes are merge and move,
and all word-order phenomena are explained by just these two. At least to
Chomsky, what is “simple” about language is not what most of us would guess
after being exposed in school to the “spiders from Mars,” those stick-leggedy
parsing trees. It’s word order and relations between words that are the easy
part of language, and which should be the easy part of any conlang.
In keeping with the Chomskian view, my first move as Toki Pona
CRT is to nail down what “simplicity” means as a conlang design principle. It
means this:
Principle of Lexical Complexity
Whenever possible, move complexity out of the syntax and into the lexicon.
I call this principle
“LexPlex” for short. It is the foundation of almost all the “repairs” I make to
Toki Pona.
Two
“Officially,” Toki Pona has only those morphemes listed in Pu, plus the occasional newcomer
accepted by the online kulupu pi toki
pona such as the recent kipisi, “cut”
(from Inuktitut kipi-, plus
antipassive -si-). However, language can
be sneakier than even its own inventor might imagine. For Toki Pona’s
vocabulary is haunted by—gasp!—invisible
morphemes, ghostly presences rising from the darkness of the Uncarved Block.
“Looked for, they are not seen; listened for, they are not heard; reached for,
they cannot be grasped.” But Lào Zi did not know the master science. These
intangible morphemes are quite graspable.
Something from nothing
In the first iteration of Toki Pona published in 2001 there
were only three numbers: ala (zero), wan (one), and tu (two). Later, three others were added: luka (five), mute (twenty),
and ale (one hundred). Numbers other
than these are built up by compounding, exactly as is done in Dyirbal,
Inuktitut, or Pawnee. However, numeral compounds are unique, made by a process
that is not found elsewhere in the language. Most compounds in Toki Pona are of
the head-modifier type: luka wawa, “hand
strong,” i.e., a strong hand. But when luka
is used as the number “five” it does not
serve as the head of a modifier: luka tu does
not mean *“five of two” but “five and two.” The silence that links numbers
into bigger numbers is not the silence that links non-number words into
phrases. Where there is meaning there is morpheme: there is a meaningful
distinction between these two silences, therefore the silence between numbers
must represent a zero morpheme, ø. (We’ll
get to the silent “of” later.)
ø in Toki Pona does
not function quite like the Boolean and
used in logic. In The Lord of the Rings, when
Barliman Butterbur tells of a battle with robbers, he does not give the number
of casualties as “five” but “three and two”: three Men and two Hobbits. In
Bree, good fences make good neighbors, so lumping Men and Hobbits together
under a single number just isn’t done. If good master Barliman had spoken Old
Toki Pona he’d have counted Merry’s ponies as tu tu wan, but he’d have counted the robbers’ victims as tu wan en tu. That is, he’d have kept
the functions of ø and “and” discrete.
Unlike “and,” ø is non-commutative: it orders the numbers
flanking it in such a way that the larger number is always to its left, the
smaller to its right. Though wan tu makes
as much sense as tu wan arithmetically,
tu wan is the only term for “3” given
in Pu. Similarly for all the other
numbers listed: the number to the left is always ≥ the number to the right:
“13” is luka luka tu wan, and so on.
Using one instance of each number, the biggest number you can make is “128,”
and it must be stated thus: ale mute luka
tu wan.
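The ordering behavior of ø can be sketched in a few lines of Python. This is only an illustration: the word values are the ones given above, while the function names, the greedy strategy, and the optional `numerals` argument (used to mimic Old Toki Pona, which had only tu and wan) are my own assumptions, not anything found in Pu.

```python
# Sketch of additive numerals. Word values follow the text above; the
# function names and the greedy strategy are assumptions of this sketch.
NUMERALS = [("ale", 100), ("mute", 20), ("luka", 5), ("tu", 2), ("wan", 1)]
VALUE = dict(NUMERALS)


def encode(n, numerals=NUMERALS):
    """Spell n by taking the largest numeral that still fits, left to right."""
    if n < 0:
        raise ValueError("no negative numerals in this sketch")
    if n == 0:
        return "ala"
    words = []
    for word, value in numerals:
        count, n = divmod(n, value)
        words.extend([word] * count)
    return " ".join(words)


def decode(phrase):
    """Sum a numeral compound; reject a smaller number placed before a larger."""
    if phrase == "ala":
        return 0
    values = [VALUE[w] for w in phrase.split()]
    if any(a < b for a, b in zip(values, values[1:])):
        raise ValueError("larger numbers must stand to the left")
    return sum(values)


print(encode(13))               # luka luka tu wan
print(encode(128))              # ale mute luka tu wan
print(encode(5, NUMERALS[3:]))  # tu tu wan (only tu and wan available)
```

Here `decode("wan tu")` raises an error, which is the sketch's way of saying that ø, unlike arithmetic addition, cares about order.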
Speakers of Toki Pona do not consider coining new words to be
appropriate: o weka e nimi namako, “avoid
new words,” to quote a member of the Facebook group. But ø is not namako: it has
always been present, though by implication. There are still only 120+ nimi in Toki Pona, if by nimi is meant “(audible) word,”
as opposed to “morpheme.” I suspect jan Sonja didn’t list ø in the dictionary simply because she
was used to grammars that don’t use zero-morphemes. (English grammars don’t
usually describe the singular suffix -ø in
“girl-ø.”) Instead of describing ø she wrote a traditional chapter on the
rules for making numbers. All I’m doing with ø is folding the rules in Pu’s lesson 12 into a single morpheme.
Thus I take complexity out of the grammar and put it into the lexicon, which is
where, under the LexPlex principle, complexity belongs.
The sentence initiator
There is one more silent morpheme I would like to recognize for
Toki Pona, and that is σ, the
sentence initiator. This morpheme (technical name “sigma”) places a beginning
and an end to any sentence—and the end is just a new beginning. Its most
important job is to set boundaries around any sentence so that the words within
it can be more effectively interpreted. Despite its abstract nature it can be
pronounced: it is the longest of Toki Pona’s three pause phonemes, short (μ), regular (#), and long (σ).
To see what σ can do
for us, let’s look at la. Pu describes the word la as “very powerful. It allows you to
link two sentences, or link a fragment to a sentence” (p. 51). It is also said
to “separate context from the main sentence.” The examples show that anything
preceding la is used either as an
adverbial phrase (tenpo ni la, “now”)
or a dependent clause such as the kind made in English with “if” or “when.” So
far, ale li pona (“all is good”).
Trouble arises when we try to tell what la is doing in running text. As pause phonemes of any length tend
to be elided in ordinary speech, there are several possible interpretations to
(2):
(2) jan Sili li lape lili lon supa tenpo suno pini la jan Sili li pona e tomo
“Sili napped on the sofa. Yesterday Sili tidied up the house” (p. 61) or
“If Sili napped on the yesterday sofa, then Sili tidied up the house” or
“Sili, having finished napping on the day sofa, Sili tidied up the house”
The meaning of (2) hinges
crucially on the presence or absence of what in writing is a period and in
hearing is a long pause. In fact, jan
Sonja places a period between supa and
tenpo, thus making the first
translation of (2) the only one that is plausible. But that period represents a
meaning, and where there is meaning there is morpheme. Therefore, the long
pause shown in writing with a “.” is in Toki Pona a morpheme, σ.
La is two-sided, or
two-faced, if you will: its use extends to the constituents on either side of
it. σ puts boundaries on la, limiting its scope such that, for
example, the second and third translations of (2) are not allowed. In the
examples in lesson 14 jan Sonja puts
a comma before la, which tells me
that in her mind la has this quality
of “binary scope,” and that the constituent to the left is a modifier, the
constituent to the right a head. And what is a “constituent,” as far as la is concerned? Anything between itself
and σ.
Sigma can do something else for the grammar, and that is
eliminate the rule that deletes li after
mi and sina. The deletion rule can be rewritten as a formula within li’s lexical entry, so:
(α) li → $ / σ {mi, sina} ___
Unlike ø, the
silence of li after mi or sina—symbolized “$” for “empty” because the
morpheme’s sound has been “emptied” out of it—represents the process called
“dropping.” $
is not a separate morpheme but an “allomorph,” another version of a morpheme,
as the -s, -z, and -ez plurals in English are allomorphs of
a single morpheme. Dropping in Spanish allows $
te amo for yo te amo. Many languages have this option because the verb agrees
with the pronoun, and these are called, logically enough, “pro-drop” languages. Empty and zero require
distinct symbols because they represent distinct functions: $
is a variation of a morpheme, or of several morphemes, whereas ø is a morpheme in itself.
(α) is a formal way of saying what the dictionary says about
the use of li: it stands “between any
subject except mi alone or sina alone and its verb” (p. 128). The
appearance of σ in (α) merely makes the description simpler.
It accounts for the absence of li in mi toki but its presence in sina en mi li toki (“you and I talk”) or
in tomo sina li namako (“your house
is new”). It is not necessary to make
a syntactic rule for this, as all (α) describes is the behavior of li in a particular context, and such
“behavior” is a part of its meaning. And all meaning belongs in the lexicon.
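For readers who like to see a formula run, here is a minimal Python sketch of (α) as a filter over a list of morphemes. The token-list representation and the name `apply_alpha` are assumptions of mine; σ is written out as an explicit token so that the context in the formula can be checked literally.

```python
# Formula (α) as a surface filter. Token lists and the name apply_alpha
# are assumptions of this sketch; σ marks the start of a sentence.
def apply_alpha(tokens):
    """Drop li when it directly follows a sentence-initial mi or sina."""
    surface = []
    for i, tok in enumerate(tokens):
        dropped = (
            tok == "li"
            and i >= 2
            and tokens[i - 1] in {"mi", "sina"}
            and tokens[i - 2] == "σ"
        )
        if not dropped:
            surface.append(tok)
    return surface


for sentence in ("σ mi li toki", "σ sina en mi li toki", "σ tomo sina li namako"):
    print(" ".join(apply_alpha(sentence.split())))
```

Only the first sentence loses its li. In the other two, mi or sina is not alone next to σ, so the context of (α) is not met.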
Three
There are only two kinds of words in Toki Pona, function and
content. All human languages, con- or nat-, have at least these two word-types;
the distinction goes back to the Classical Chinese grammarians, who divided
their vocabularies into “empty” words and “full” words. These two basic
word-types allow us to recognize three kinds of basic grammatical relations:
between function and content: the phrase relation,
between two function words: the scope relation, and
between two content words: the modifier relation.
Once we’ve accounted for
these three relations we have written the grammar of Toki Pona.
The grammar described on Toki Pona’s Wikipedia page lists ten
rules of syntax. This sounds pretty simple: Esperanto has a whopping 16, and
the grammars of most natlangs have dozens if not hundreds. But your CRT can do
better, if by “better” you mean “fewer.” Here is my “repair” of the syntactic
rules of Toki Pona:
1. Rule of Templatic Syntax:
All well-formed strings have the underlying form (function word + content word)*
Def. An f(unction-)word is an ∈ {anu, e, en, kepeken, la, li, lon, oL, oR, pi, sama, tan, taso, tawa, μ, σ, ø}
Def. * = may be repeated as desired, indefinitely
Cor. redundant words are deleted at the surface
2. Rule of Nested Hierarchy:
F-words group c(ontent)-words into phrases that nest in the Scope Hierarchy:
σ < |laB, oL| < en < |li, oR| < e < {kepeken, lon, sama, tan, tawa} < anuB < pi < μ
Def. “_” = “must be included in all sentences”
Def. “ < ” = “includes under its scope” or “is interpreted after”
Def. “|…|” = “choose one and only one per sentence from this set”
Def. “{…}” = “choose as many as needed per sentence from this set”
Def. all f-words have rightward scope unless otherwise indicated
3. Rule of Structural Conjunction:
Boolean and is formed by the repetition of any ∈ {e, en, li} within its own scope
Cor. function scope is binary in this context
Like the ten rules listed on Wikipedia,
the Three Rules listed here provide instruction on how to make and interpret
grammatical utterances in Toki Pona. Unlike the Ten, the Three do so in a way
that is easier to memorize and is logically more powerful.
The template
I have borrowed the notion of “template” here
from the North Afro-Asiatic languages, where it refers to morphology, not
syntax. A “templatic morphology” is one that makes words the way Arabic,
Berber, or Hebrew do, with a string of consonants that carry the meaning and
vowels inserted to distinguish, for example, the Hebrew katav, “he wrote” from katvah,
“text,” or kotev, “writing.” My
“templatic syntax” of Toki Pona means that a small number of function words fit
into the “template” to form the backbone of the sentence, with content words
inserted between them to flesh it out. Now let’s try making a sentence: “I want
a drink of water.” We start with our template:
f + c + f + c + f + c . . . .
and some vocabulary:
e, object noun phrase (f)
en, subject noun phrase (f)
li, verb phrase (f)
mi, I/me (c)
moku, eat/drink (c)
telo, water (c)
wile, want (c)
μ, head-modifier relator (f)
σ, sentence initiator (f)
All those words marked “(f)” are members of the
function-word class, listed under the definitions of Rule 1. All words not in
the f-word set are content words by default, as function and content are the
only two options for defining the syntactic role of a word. Two f-words, en and μ, we haven’t got around to explaining yet; we’ll define them more
precisely later. The Scope Hierarchy of Rule 2 orders the f-words so that they
must appear in a particular sequence. Also, function-words can only occupy the
f positions in the template, so there are gaps between them:
f c f c f c f c
σ . . . en . . . li . . . e . . .
E has no equivalent
in toki Inli (“English”): it is used whenever a c-word follows
another c-word that in turn follows li or
o. If e is present the following c-word is the object of the sentence; if
it is not, the following c-word is an adverb modifying the preceding verb:
(3) ona li toki ala
she predicate talk not
“she did not talk” (but did something else)
(3’) ona li toki e ala
she predicate talk object not
“she said nothing” (but remained silent)
The distinction between
sentences like (3) and (3’) is central to Toki Pona’s grammar. It sometimes causes trouble for learners whose
native tongue has no marker of the direct object. Speakers of Biblical Hebrew (‘eth)
or Hawai’ian (i) would have no
trouble with e.
Having decided
that “I” is the subject of our sentence, “drink” and “want” are the verbs, and
“water” is the object, we plug these into the relevant c-positions in the
template. That is, we put “I” directly after en, the verbs directly after li,
and “water” after e, like this:
f (c) f c f c (f) c f c
σ . . . en mi li (wile, moku) e telo
We’re not done yet: there is still the gap between σ and en, plus we have to find the missing f-word between wile and moku. The mystery f-word must be under the scope of li or by Rule 2 it won’t fit into the
ordering of the f-words we have already. Also, it must be consistent with the
meanings we have chosen—it must be on our list. The only morpheme that fits
both criteria is μ, so we plug it in.
This morpheme lets us know that wile is
the head of the li phrase and moku is modifying wile: “want” is what we’re doing, and “consume” restricts the meaning of “want”
to a particular type of wanting: a wanting to consume something. Now, after
inserting μ and applying (α), we have
σ . . . en mi [li] wile [μ] moku e telo,
with brackets around words that are present but silent (li because it has been dropped, μ because it is inherently silent). The
only remaining piece of the puzzle is what to do with the c-word missing
between σ and en. There’s a hole in the
template that needs to be filled, but no meaning available to fill it. When
this happens in a natural language we postulate the existence of what is called
an “empty category.” In English, the sentence “he would like you to come” does
not contain an empty category because all the syntactic slots are filled. The
sentence “he would like to come” does
contain an empty category: the unpronounced subject of the verb “come.” In
other words, an empty category in English is a noun without a pronunciation,
but which you can guess is there because the sentence’s meaning and grammar
require it.
Toki Pona’s
empty category is a c-word with no pronunciation and no meaning. It is simply
a place-holder, as “0” is a place holder in numbers like “10” or “1001.” I’ll
list it here as “øC” to
distinguish it from “øF,” the compound number-maker. Now we can write our sentence as:
[σ] [øC] en mi [li] wile [μ] moku e telo
We still have one more step to go. In all known examples
of Toki Pona text, en never appears
at the beginning of a sentence. To account for this we write a formula similar
to (α):
(β) en → $ / øC ___.
This does two things at once: it explains the lack of
sentence-initial en, and it restricts
what we can do with øC. (β)
implies that whenever a sentence does not contain an initial en, en and øC are both present but silent: øC is inherently silent, and en has been dropped. (β) preserves the (f + c) template while
explaining why the sentence as pronounced/written appears to violate it. And
with these various steps and procedures followed we now have our surface
output, with silent morphemes left unwritten:
(4) mi wile moku e telo.
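The whole derivation of (4) can be replayed as a short Python sketch. Everything named here (`underlying`, `fits_template`, `surface`) is hypothetical, the sketch handles only one-word subjects, verbs, and objects, and it reads the σ context of (α) loosely as “nothing audible stands before mi or sina,” which is an assumption on my part.

```python
# Underlying template and surface output for a simple sentence.
# Hypothetical helper names; one-word subject, verbs, and object only.
F_WORDS = {"anu", "e", "en", "kepeken", "la", "li", "lon", "o", "pi",
           "sama", "tan", "taso", "tawa", "μ", "σ", "ø"}
SILENT = {"σ", "μ", "øC"}  # inherently silent, per the text


def underlying(subject, verbs, obj):
    """Fill the (f + c)* template: σ øC en SUBJ li VERB (μ VERB)* e OBJ."""
    tokens = ["σ", "øC", "en", subject, "li", verbs[0]]
    for verb in verbs[1:]:
        tokens += ["μ", verb]
    return tokens + ["e", obj]


def fits_template(tokens):
    """True if f-words sit in even positions and c-words in odd positions."""
    return len(tokens) % 2 == 0 and all(
        (tok in F_WORDS) == (i % 2 == 0) for i, tok in enumerate(tokens)
    )


def surface(tokens):
    """Skip silent morphemes and apply (β) and (α) in a single pass."""
    heard = []
    for i, tok in enumerate(tokens):
        if tok in SILENT:
            continue
        if tok == "en" and i > 0 and tokens[i - 1] == "øC":
            continue  # (β): en is dropped after the empty category
        if tok == "li" and heard in (["mi"], ["sina"]):
            continue  # (α): li is dropped after a lone mi or sina
        heard.append(tok)
    return " ".join(heard)


deep = underlying("mi", ["wile", "moku"], "telo")
print(" ".join(deep))       # σ øC en mi li wile μ moku e telo
print(fits_template(deep))  # True
print(surface(deep))        # mi wile moku e telo
```

Swap in ona for mi and the li survives at the surface, since (α) no longer applies.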
There! Wasn’t that simple?
No? Then don’t
bother with any of the stuff I’ve just described. Seriously: if your native
language has features similar to Toki Pona’s (short words, few inflections, an
SVO core syntax, no ergativity, no evidentials), as English,
Mandarin, or ǀXam do, then you might be better off playing with Pu. If “simplicity” for you means “don’t
make me memorize a bunch of stuff” then you can learn and speak Toki Pona quite
nicely by fitting it to your preconceptions: that the subject comes before the
verb and the object after it, and so on. There will be a price, of course: by
going the easy route and not digging deeper into what makes Toki Pona tick,
you’ll miss out on methods of analysis you can apply to other languages,
including your own. You’ll speak the
language, you’ll comprehend the
language, but you won’t understand it.
In other words: you’ll miss out on a lot of cool stuff. Your loss, ya big
linguistic weeny!
You know it’s
not enough for a chemist to mix baking soda and vinegar together and watch them
fizz—she’ll want to know why bases and acids neutralize each other. And that
means atoms and molarity and valence shells and what “pH” means. And as any
chemist will tell you, atoms are just plain weird.
The same is true of linguists: it’s not enough to memorize rules or write
descriptions. Why do subject and
object come before the verb in Tibetan, but after it in Hawai’ian? Why does KiSwahili have prepositions but
Japanese has postpositions? Why does
Mandarin have no adjectives and English no evidentials? There’s no way to
answer those questions without entering the Temple of Supreme Weirdness: the
mind. In the split second between thought and speech something amazing is
happening, something that happens nowhere else in the universe, and linguists
just love being amazed. All the technical jargon, parsing diagrams, and arcane
theorizing have one goal: to explore the most amazing thing in the known
universe. Your mind. jan Sonja designed
her language to do just this.
Complexity is
a relative term. What is “simple” to a speaker of English may be anything but
to a speaker of Salish. The reverse is also true: Pu spends a lot of ink showing us how to make nouns into verbs,
verbs into adjectives, and so on. Chief Seattle would have needed no such
guidance.
Scope
The key concept of Rule 2 is “scope.” All
function-words in Toki Pona (and for that matter, in all other languages, nat-
or con-) combine with content-words to form phrases. A “noun phrase” is one
that is initiated by a nominal f-word such as “a” or “the” in English, ka or nā in Hawai’ian, or e or lon in Toki Pona; similarly for verb
phrases. Because the affected content words follow them we say such f-words
have “rightward scope.” Other f-words establish a phrase by standing at the end
of a constituent rather than at its beginning: in Toki Pona, o makes a vocative phrase by following a
noun. These functions have “leftward scope.” A few f-words such as anu (“or”) make phrases out of the words on either side
of them: they have “binary scope.”
This is how f and c relate to form
phrases. But how do phrases relate? We saw how when building (4): the f-words
in phrases (and therefore, the phrases themselves) must follow one another in a
particular order. The li phrase must
follow the en phrase because li is lower in “rank” than en, as symbolized by the “<” in Rule
2. Another way to say this is that an en phrase
includes a li phrase within its own
meaning; similarly, a li phrase
contains an e phrase within its
meaning. When one phrase is included within the meaning of another we say it
is “nested” within it. A sentence is like a matryoshka doll, constituents
inside of other constituents like dolls inside of dolls, and getting smaller
the deeper you go from sentence to clause to phrase to word.
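The doll-within-doll picture can be made concrete with a small Python sketch. The rank numbers simply copy the left-to-right order of the Scope Hierarchy in Rule 2 (0 for the widest scope). The function name `nest`, the list-based tree, and the decision to treat phrases of equal rank as siblings are all assumptions of this sketch, not claims about how Toki Pona must be parsed.

```python
# Nesting phrases by rank. Ranks copy the Scope Hierarchy of Rule 2
# (0 = widest scope); subscripted f-words are written laB, oL, oR, anuB.
RANK = {"σ": 0, "laB": 1, "oL": 1, "en": 2, "li": 3, "oR": 3, "e": 4,
        "kepeken": 5, "lon": 5, "sama": 5, "tan": 5, "tawa": 5,
        "anuB": 6, "pi": 7, "μ": 8}


def nest(tokens):
    """Turn an alternating f, c, f, c ... list into nested [f, c, *children]."""
    root, stack = None, []
    for f, c in zip(tokens[0::2], tokens[1::2]):
        node = [f, c]
        # Close every open phrase that cannot contain this one.
        while stack and RANK[stack[-1][0]] >= RANK[f]:
            stack.pop()
        if stack:
            stack[-1].append(node)
        elif root is None:
            root = node
        else:
            raise ValueError(f"{f} has no phrase to nest in")
        stack.append(node)
    return root


print(nest("σ øC en mi li wile μ moku e telo".split()))
# ['σ', 'øC', ['en', 'mi', ['li', 'wile', ['μ', 'moku'], ['e', 'telo']]]]
```

Note that the μ phrase and the e phrase both end up inside the li phrase, even though e follows μ in the string: linear order and nesting are not the same thing.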
Having the concept of “scope” under our
belts, the “power” of la mentioned in
Pu (p. 51) can now be more precisely
defined: it is the second-highest-ranking function-word in the hierarchy. This means that it can cover a lot of
territory: it can include more than one phrase within its scope. The only
f-word of higher rank is the highest-ranking morpheme of all, σ.
Three²
And since we’re dealing with threes in this section, there are
three morphemes in Toki Pona that change their meanings by changing their
scope. These are:
eR, the direct object preposition, but eB, the object phrase “and”
enR, the subject preposition, but enB, the subject phrase “and”
liR, the predicate marker, but liB, the verb phrase “and”
Besides marking the
direct object, e is used to add
another object to an object phrase:
(5) ona li seli e soweli e pan
s/he predicate fire object animal object grain
“she cooked the hares and some rice” (p. 61)
The first e in (5) serves to mark off the
object(s) of the sentence from the subject and the verb. Because it applies
only to the words following it, it has rightward scope. But the second e applies to the words on either side of
it: it has binary scope. In the process, the second e
has changed its meaning, from object-marker to conjunction, and changed its
place in the Scope Hierarchy, being under the scope of the first e. It seems that we must “split” e into two morphemes here, as the two
uses—object-marker and conjunction—differ in meaning and in syntax. But e is not the only morpheme that acts
this way.
En is an interesting
morpheme as it is part of a rather drastic asymmetry in Toki Pona’s lexicon. Of
the Boolean (“logical”) operations and,
not, and or, the disjunct
and negative operations are each expressed with a single word: not is ala, and or is anu. The conjunct operation, on the
other hand, is represented by three words:
e: within direct object phrases, as we have just seen in (5)
en: within a subject phrase, the only function of en mentioned in Pu (pp. 56, 57)
li: within verb + verb constructions to indicate “and also,” or “and then”
En is only used in subject phrases, and is used in
exactly the same way as e is used in
object phrases. That is, to express “you and I washed” you say sina en mi li telo, but “washed
clothes and dishes” is li telo e len e
ilo moku, not *li telo e len en
ilo moku. To link verbs rather than nouns it is li that is repeated: li telo
li seli, “cleaned and cooked.”
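Conjunction by repetition is easy to mimic in Python. The helper name `conjoin` is hypothetical, and the special handling of en (the first en of a subject phrase is left unpronounced) is my reading of the examples above, not a rule quoted from Pu.

```python
# "And" as repetition of a phrase marker. The helper name conjoin is
# hypothetical; only the three markers discussed above are accepted.
CONJOINING = {"e", "en", "li"}


def conjoin(marker, items):
    """Repeat the phrase marker before each item; the repeats read as 'and'."""
    if marker not in CONJOINING:
        raise ValueError(f"{marker} is not one of e, en, li")
    phrases = [f"{marker} {item}" for item in items]
    if marker == "en":
        phrases[0] = items[0]  # the first en of a subject phrase stays silent
    return " ".join(phrases)


print(conjoin("en", ["sina", "mi"]), conjoin("li", ["telo", "seli"]))
# sina en mi li telo li seli
print("ona li seli", conjoin("e", ["soweli", "pan"]))
# ona li seli e soweli e pan
```

No separate word for “and” appears anywhere in the output: the marker does double duty.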
E is basically a
preposition forming direct object phrases. We can also look at li as a “preposition” forming verb
phrases. It follows, then, that we can interpret en as the marker of the subject phrase: the subject preposition, as
it were. Interpreting en as a subject
marker allows us to replace the syntactic notion “subject” with a lexical
notion “subject marker,” eliminating the need for a separate rule to define
what a “subject” is. In this way we are able to move another bit of complexity
into the lexicon. The location of an en-phrase can be fixed in front of the predicate
(that is, the li-phrase) by letting en have scope over li, and this is what is done in the Scope Hierarchy described in
Rule 2.
Interpreting e, en, and
li as phrase-markers and as conjunctions means there is no
overt and in Toki Pona: and is expressed structurally, by the
repetition of the appropriate f-word. This is not such an odd notion: Mandarin
has a structural or, formed by repeating
the verb with a different object. Although Pu
does not explicitly license other forms of and,
I would think we could use any right-handed f-word as a conjunction:
(6) ?mi tawa tomo esun tawa tomo lipu
I (go) to building business to building book
“I went to the store and to the library”
(7) ?mi lon telo suli lon poka telo
I at water big at side water
“I am at the sea, (specifically) at the
shore”
(8) ?sina pali e ni kepeken ilo palisa kepeken ilo kiwen
you do object this use tool wood use tool stone
“you made this using tools of wood and tools
of stone”
(9) ?ona li tawa sama waso sama kala
s/he predicate go like bird like fish
“she went gliding like a bird, and also like
a fish”
(10) ?o tawa o pali
command go command do
“go do it!”
I do not feel that such
an extension is within my competence as a mere repair tech, so I must leave it
to the Toki Pona community to rule on this matter. If the community is pleased
to find that such sentences as (6) - (10) are grammatical, all we have to do is
change {e, en, li} in Rule 3 to “any f R” and we’ve
covered all bases. If we wished, we could even eliminate Rule 3 entirely and
instead cover the conjunctive uses of e,
en, etc. by making lexical rewrite formulae, so:
e → and / e … ___ … {preposition, σ}.
By erecting Rule 3 as I
do, I take some complexity out of the lexicon and put it into the grammar, but
I do this in the name of ease of memorization, and to point up the similarities
in the uses of e, en, and li. Rule 3 as stated also makes
extensions such as those shown in (6) - (10) easier to accommodate, should the kulupu pi toki pona so desire.
Four
It is the custom in conlang repair to spend lots of time
futzing with function morphemes: critics of Esperanto, for example, tend to
dislike the accusative case-marker -n. However,
I don’t want to eliminate anything from Pu,
nor do I wish to change any established meanings. What I want to do is
tease apart meanings that don’t belong together. There are two morphemes in
Toki Pona that can each be seen as performing two discrete roles, but which
differ in ways that cannot easily be captured within rules. I propose to split
each of these morphemes in two:
o splits to become oR (irreal mode) and oL
(vocative case)
pi splits to become piR (genitive case) and μB (modifier relation)
Once again: I am not
conjuring morphemes out of nowhere. I am finding morphemes that have always
been there, and writing descriptions of these morphemes that are consistent
with what is already known about the language.
OL and oR
According to Pu, the
f-word o is used in three contexts:
“1. after a noun phrase to show who is being called or addressed; 2. before a
verb to express a command or request, and 3. after the subject (and replacing li) to express a wish or desire” (p. 41)
The first use of o is
quite different from the other two. It is used to indicate a nominal case, the
“vocative.” This is a case used in Latin, Sanskrit, Hawai’ian, and other
languages to hail someone: “Hey!” or “o Such-and-so!” Unlike all other
case-markers in Toki Pona the scope of vocative o extends leftward, over the words that precede it. In this it
patterns with la, which is also
left-handed and also terminates its scope at σ. From the examples I’ve seen, oL
and la are in complementary
distribution, and so may be grouped together in the Scope Hierarchy: |la, o|L.
There is a single technical term to cover uses 2 and 3 of o, and that is “irreal.” This is the
verbal mode used to indicate that the event in question does not represent
something describable with the S-prime know.
Instead, an irreal event is something represented by feel, if, think, want, or not.want.
Irreal events include the future tense (“will”), desideratives (“wish that,”
“intend to”), conditionals (“if/then,” “assuming that,” “might be”), subjunctives
(“would that it were,” “that X may”), jussives (“let’s”), and commands (“you
must,” “do it!”). My view of oR
as an irreal marker is reinforced by the use jan Sonja makes of it in her translation of a Bahá’í prayer:
(11) I bear witness O my God, that You have created me to know You and to worship You.
sewi mi o! mi toki wawa e ni: sina pali e mi tawa seme? mi o sona e sina. mi o olin e sina. (p. 89)
divine my vocative I say strongly object this: you make object me for what? I irreal know object you. I irreal love object you.
It seems here that o has a subjunctive sense, and should be
translated “that I may.” That is, the speaker does not yet know God, but feels and wants
that she might, or should. Li and oR are in complementary distribution,
and in the Scope Hierarchy are grouped together into a set like this: |li, o|R. If o before a verb is the irreal marker,
then it follows that li marks the
real mode, statements asserted as known.
As li may be dropped but o may not, we can say that real is the
default mode: all sentences in Toki Pona are assumed to describe the world as
it is known unless we are given to
believe otherwise by oR.
Pi and μ (and la)
The most common binary-scope function word in Toki Pona is pi. In any phrase word1 + pi + word2, pi lets us know that word1 is
the “head” of the phrase and that word2 modifies word1 in
much the same way that an adjective modifies a noun or an adverb modifies a
verb in a natlang like English. Between two nouns, pi translates the English word “of.” However, pi is not used in phrases with only two c-words: to say “a good
person” you do not say *jan pi pona, “a
person of goodness.” Pi is only used
when three or more c-words occur in a phrase:
(12) jan pona toki pona: “a friend good to
talk to”
(13) jan pona pi toki pona: “a
proponent of Toki Pona”
Example (12) shows how the modifier relation operates in
strings of three or more words: “When another word is added to a noun phrase,
it describes the sum of all previous words” (p. 44). That is, in any sequence
of content words, the last is interpreted as a modifier of all the words that
precede it; the second to the last modifies the words that precede it but not
the word it precedes, and so on. That is, the “scope” of a modifier is all the
content words to the left of it, and the modifier scopes “nest” like this:
(((head + modifier) + modifier) + … ). If the head is what we are used to
calling a “noun,” the modifiers are “adjectives”; if the head is a “verb” the
modifiers are “adverbs.” The underlying structure of (12) looks like this:
(12a) (((jan
pona) toki) pona)
We go from left to right:
jan means “person” and pona means “good,” so we have “a good
person,” i.e., a “friend.” Adding the next word toki gives us a “friend speaking,” or “a friend who speaks.” The
final pona modifies all that comes
before it, so we have “a good speaking-friend,” a friend who is good to speak
to.
Adding pi to the mix
changes things. We start with (13) as we did with (12), interpreting the
head-modifier phrases first. This time, however, whereas toki and pona2 are
linked, pona1 and toki are not, and the bracketing looks
like this:
(13a) ((jan
pona) pi (toki pona)).
Pi not only separates jan pona from toki pona, it treats each h-m phrase as a unit that may in turn be
bound together into another unit, a pi-phrase.
The constituent to the left of pi is
the head, and the constituent to its right is the modifier. Pi thus creates a head-modifier phrase,
but at a higher (that is, more inclusive) level than the head-modifier phrases
with nothing between the words.
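The two bracketings, (12a) and (13a), can be reproduced mechanically. What follows is a minimal sketch, not a parser: bracket_plain and bracket are hypothetical names, and the treatment of a phrase with more than one pi (each pi-group simply attaches to everything before it) is my own assumption, since the examples here show only a single pi.

```python
def bracket_plain(words):
    """Left-nest a run of content words: each word modifies all words before it."""
    tree = words[0]
    for word in words[1:]:
        tree = f"({tree} {word})"
    return tree

def bracket(phrase):
    """Bracket a phrase, treating pi as a higher-level head-modifier join."""
    # Split into runs of content words separated by pi.
    groups = [group.split() for group in phrase.split(" pi ")]
    tree = bracket_plain(groups[0])
    for group in groups[1:]:
        # Assumption: every later pi-group modifies everything to its left.
        tree = f"({tree} pi {bracket_plain(group)})"
    return tree

print(bracket("jan pona toki pona"))     # (((jan pona) toki) pona)
print(bracket("jan pona pi toki pona"))  # ((jan pona) pi (toki pona))
```

The only difference between the two outputs is where toki attaches, which is exactly the work pi is doing.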
Or is this nothing a something? That is: do we need a rule that
says “an adjective follows the noun,” or “when another word is added to a noun
phrase, it describes the sum of all previous words”? There is another option,
the LexPlex option: split pi into two
morphemes, one of which forms h-m relations on the deepest level of the syntax,
and another which does the same thing but on the next higher level. The
higher-level morpheme is pi itself;
the other is a morpheme that establishes the deepest-level head-modifier
relation, a silent morpheme which I call “mem” and symbolize as “μ.”
Both pi and
μ can be translated “of”: toki μ pona, “language of good.” Where pi and μ differ most crucially is in their ability to nest: μ-phrases can nest inside of pi-phrases, but pi-phrases cannot nest inside of μ-phrases: μ is lower on
the Scope Hierarchy than pi. Also, μ can nest under itself: like and, it is “recursive”:
(12b) (((jan μ pona) μ toki) μ pona)
We see that phrases differ according to the presence or absence
of pi, and that we can capture this
difference by interpreting the absence of pi
as the presence of μ. But μ and øF also differ: though both are silent, non-commutating,
and of binary scope, μ binds any
adjacent c-words into a h-m relation, whereas øF binds adjacent words into an and relation and appears only between words used as numbers.
We may also say that la enacts the
h-m relation but on the level of the clause and in the opposite direction: any
string of words after σ and before la is a dependent clause (i.e., a
modifier-clause), while anything after la
and before σ is an independent
clause (i.e., a head-clause). La, μ, and
pi all do essentially the same thing
but at different levels and in different directions.
Mem has a pronunciation: it is a short pause between words,
shorter than the pause between f and c that occurs in li sona or e jan. Compound
words are formed by deleting μ: jan μ
pona, “a good person,” jan pona, “friend.”
That is, jan pona as “friend” is
treated by the lexicon as a single word, not as a string of two c-words
(regardless of how it is written).
Details
In addition to the four “repairs” I describe above, there are
other changes that can be made to how words in Toki Pona are described within
the lexicon. By “lexicon” here I mean not just a listing of words and meanings
(as found in Pu, pp. 125-134), but
labels accompanying words to help describe their syntactic behavior. For
example, a detailed lexicon of Toki Pona would specify which words are functions,
and the scope of each. We can do more than this, however.
Exclamations can be listed within a rule, but LexPlex would
suggest they be specified in the lexicon. Any word from the set {a, ala, ike, jaki, mu, o, pakala, pona, toki} may be used as
an exclamation. Most of these words change their meaning when so used: μ pakala, “broken,” σ pakala, “sorry!” I propose including seme, the question-marker, in this set,
with the meaning “huh?” “what the?”
The only form-classes in Toki Pona are
function-words and content-words. The labels “noun,” “(pre-)verb,” “adjective,”
“adverb,” and “preposition” are unnecessary, as any c-word may play any of
these roles depending on what f-word it stands under:
toki =
“hello!” / σ ___ σ
“say” / |li, o| ___
“language” / {e, en, kepeken, lon, sama, tan, tawa} ___
“linguistic” / μ ___
“speaking of” / ___ la
sona =
“know (something)” / |li, o| ___ e
“know how (to do something)” / |li, o| ___ μ
To make things more convenient we can use the traditional
labels to indicate sets of words of related function. The set {e, en, kepeken, lon, sama, tan, tawa}
can be replaced with {preposition}, the set |li, o|R with |mode|, and so on. We can also rearrange
items into different sets as needed the way I do in Rule 2, where I specify e and en separately from the other prepositions to allow subject phrases
to precede the verb and objects to follow it.
You’ll recall my argument about the phrase
jan pona and why it means “friend”
and not, say, “buddha.” We can make use
of the power granted us by lexical rewrite formulas to narrow down the meaning
of any word according to its context:
pona =
“friend” / jan + ___
“simple” / toki + ___
“good” / lape + ___ (“good night” = σ lape pona σ)
and so on.
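Lexical rewrite formulas of this kind amount to a lookup keyed on a word's neighbors. Here is a rough sketch of that idea, with several assumptions of my own: the names REWRITES and gloss are hypothetical, each formula is reduced to "allowed word on the left, allowed word on the right," the first matching formula wins, and the silent μ contexts are left out because they never appear in the written string.

```python
MODE = {"li", "o"}
PREPOSITION = {"e", "en", "kepeken", "lon", "sama", "tan", "tawa"}

# Each entry: (gloss, allowed left neighbors, allowed right neighbors).
# None means "any neighbor"; "σ" stands for the sentence boundary.
REWRITES = {
    "toki": [
        ("hello!", {"σ"}, {"σ"}),
        ("speaking of", None, {"la"}),
        ("say", MODE, None),
        ("language", PREPOSITION, None),
    ],
    "pona": [
        ("friend", {"jan"}, None),
        ("simple", {"toki"}, None),
        ("good", {"lape"}, None),
    ],
}

def gloss(tokens, i):
    """Return the gloss of tokens[i] chosen by its immediate context, or None."""
    padded = ["σ"] + tokens + ["σ"]
    left, right = padded[i], padded[i + 2]
    for meaning, lefts, rights in REWRITES.get(tokens[i], []):
        if (lefts is None or left in lefts) and (rights is None or right in rights):
            return meaning
    return None

print(gloss(["toki"], 0))             # hello!
print(gloss("ona li toki".split(), 2))  # say
print(gloss("jan pona".split(), 1))     # friend
print(gloss("toki pona".split(), 1))    # simple
```

A word with no matching formula returns None, which in a fuller lexicon would fall back to the word's general dictionary meaning.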
Some words in Toki Pona are
“underspecified” for class. That is, they are inherently neither f- nor
c-words, but take on their class assignment according to where they fit in the
template: lon before a c-word is a
preposition (i.e., an f-word) meaning “at,” but before an f-word it is a c-word
meaning “exist,” among other things:
lon =
“at” / ___ c-word
“exist” / |li, o| ___
“true” / μ ___
“yes” / σ ___
It
sometimes happens that when lon and tawa are used as verbs the underlying
noun-phrase that follows will begin with the same word used as a preposition.
That is, “to be at” is li lon lon and
“go to” is li tawa tawa. You could
say that such expressions are “overspecified”—they contain more information
than necessary. Whenever this happens the redundant preposition is dropped:
(γ) {lon, tawa} → $ / {lon, tawa} ___
A similar
rule drops oL when it
appears before oR. Redundancy
is incompatible with simplicity so away it goes, into the great void of $.
Several
sections ago I claimed to discover two silent morphemes in Toki Pona, the
f-words øF and σ.
However, I then proceeded to sneak in a third silent morpheme, øC, to represent the empty
category. My perspicacious readers undoubtedly discovered this deception many
pages ago and have been chuckling in anticipation of seeing my comeuppance in The New York Times Book Review. But your
author will get the last laugh! øF
and øC are in cold fact a single zero, whose apparent duality
comes about because it is unspecified for the f/c distinction. Just as suno means “sun,” “sunny,” or “shine”
depending on context, so ø takes on
one or the other role according to contexts specified in the lexicon:
ø → øF / larger number ___ smaller number (additive function)
ø → øC / σ ___ en (empty category)
We don’t
have to specify ø as f or c until the
template demands it, and until it does, ø,
like lon, is a single morpheme despite
the reader’s detective genius. You’ve got to get up early if you’re going to
get the drop on the Cunning Linguist! A a
a!
Mem
is not the only f-word that can appear inaudibly between two c-words. Consider
this sentence:
(14) sina toki ala toki e toki Inli?
you talk not talk object talk English
According
to what we know of silent morphemes and of dropping we can deduce that sina toki has the underlying form sina li toki and that toki Inli has the underlying form toki μ Inli. But what of the three
c-word string toki ala toki? “Not
speaking talkatively” might be the translation if the underlying form was toki μ ala μ toki and this were a saying
of Master Lào. However, the string verb + ala
+ verb, where both verbs are identical, is actually the way Toki Pona asks
questions requiring a “yes” or “no” answer. (The format is borrowed from
Mandarin.) A more accurate translation might be something along the lines of
“Do you speak English, or do you not speak (it)?” In other words, there’s a
Boolean or hidden in this
construction. The logical place to put it is after ala, so that the full underlying sentence is:
(14a)
f    c    f    c    f c   f     c    f c    f c
[en] sina [li] toki μ ala [anu] toki e toki μ Inli?
“do you speak English?”
Anu is dropped from the surface output by the
formula
(δ) anu → $ / li c1 ala ___ c2, where c1 ≡ c2.
as
the ala between the identical verbs
makes anu redundant, and thus
droppable.
øC
is not the only c-word that can appear inaudibly between two f-words. Sina, for example, is always deleted before oR
in commands:
(ε) sina → $ / ___ oR when used as imperative,
or, to state it another way:
oR → command / sina → $ ___
Every language must navigate between the
Scylla and Charybdis of “say what you mean” and “don’t take too long saying
it.” (γ), (δ), and (ε) are three examples of how to shoot the rapids. More
rewrite formulas can be written for various other contexts, and more uses can
be made of øC, but I will
leave finding these for the amusement of the reader. O musi! Enjoy!
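For readers who like to see the rapids shot in slow motion, the three dropping formulas can be sketched as operations on a list of words. This is an illustration under stated assumptions, not a grammar engine: the function names are hypothetical, silent morphemes such as μ are simply omitted from the input, (γ) is read as dropping only a repeat of the same word, and (ε) treats every sina directly before o as imperative.

```python
def drop_redundant_preposition(tokens):
    """(γ): drop lon or tawa when it immediately repeats the same word."""
    out = []
    for tok in tokens:
        if tok in {"lon", "tawa"} and out and out[-1] == tok:
            continue  # redundant preposition goes to $
        out.append(tok)
    return out

def drop_anu(tokens):
    """(δ): drop anu in 'li c1 ala anu c2' when c1 and c2 are identical."""
    out = list(tokens)
    i = 0
    while i + 4 < len(out):
        if (out[i] == "li" and out[i + 2] == "ala" and out[i + 3] == "anu"
                and out[i + 1] == out[i + 4]):
            del out[i + 3]
        i += 1
    return out

def drop_sina(tokens):
    """(ε): drop sina directly before o (imperative reading assumed)."""
    return [t for i, t in enumerate(tokens)
            if not (t == "sina" and i + 1 < len(tokens) and tokens[i + 1] == "o")]

def surface(tokens):
    """Apply (γ), (δ), and (ε) in turn to an underlying word string."""
    return drop_sina(drop_anu(drop_redundant_preposition(tokens)))

print(" ".join(surface("ona li lon lon tomo".split())))
# ona li lon tomo
print(" ".join(surface("sina li toki ala anu toki e toki Inli".split())))
# sina li toki ala toki e toki Inli
print(" ".join(surface("sina o tawa".split())))
# o tawa
```

Note that the second output still shows li after sina; dropping that li is a separate rule of Pu, not one of the three formulas sketched here.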
How
do you say “What did you do to her?” in Toki Pona? Most English-speakers might
come up with something like:
(16a) sina pali e seme tawa ona?
you do object what? to her
This
is the translation to be found in Pu:
(16b) sina seme e ona? (p. 32)
Although I don’t think jan Sonja speaks any Australian
languages, the use of seme as a verb
“do what?” is exactly parallel to the use of the verb wiyamal in Dyirbal. (16b) is a neat example of what you can do with
a conlang once you’ve freed yourself from the habit-patterns of your native
tongue. And you can do this even if—gasp!—you don’t speak any Native Australian
at all.
Sapir-Whorf . . . and beyond
jan
Sonja intends her language to be an experiment involving our old buddy the
Sapir-Whorf Hypothesis. She set up Toki Pona to be a benign Newspeak, a way of
simplifying thought to bring about spiritual (or at least, psychological)
insight. Rather than come to love Big Brother, jan Sonja wants us to love the “simplicity” extolled in the Dào Dé Jīng:
More words count less
Hold fast to the center (chapter 5)
It is most important
To see simplicity
To realize one’s true nature
To cast off selfishness
And to temper desire. (chapter 19)
Have little and gain
Have much and be confused (chapter 22)
I believe I have shown throughout The Cunning Linguist that Sapir-Whorfian notions of linguistic
determinism simply won’t wash. But I think jan
Sonja has accomplished something much more interesting than yet another S-W
experiment: she has created a CSM—a conlang
semantic metalanguage.
In the chapter “Atoms For Peace” I discussed the S-prime, which
some linguists believe is the “atom of thought,” or at least, of language. The
best-known Natural Semantic Metalanguage is the one described by Wierzbicka and
Goddard, which makes use of about 65 primes. Toki Pona’s 120+ root-words
perform very much the same role as S-primes: “Toki Pona is a language that
breaks down advanced ideas to their most basic elements” (p. 9). The process of
translating from a natlang to Toki Pona is, in effect, a way of performing
semantic analysis by means of a metalanguage. There’s a bit of CSM on page 12
of Pu: “What is a ‘bad friend’? The
Toki Pona expression for friend is jan
pona, or literally ‘good person.’ You quickly realize that a bad friend is
a contradiction in itself.” Which you do, that is, assuming you know that pona and ike are antonyms (nowhere described as such in the dictionary, but
easily deduced), or that you know that pona
is how Toki Pona says good and
ike is how it says bad. Knowing these things, the
contradiction in *jan pona ike instantly
becomes obvious. Perhaps we might even state this as
Rule 4: Non-Contradiction
Two content-words that are antonyms to each
other may not modify the same head.
We specify in the lexicon
which words are antonyms and thus in complementary distribution: |ike, pona|, |pimeja, suno|, and so on.
Contradictions such as Ayn Rand’s “rape by invitation” are tolerated in
English, but in Toki Pona it seems they are not only not tolerated, they are
not grammatical. In giving us a language in which words represent so directly
the “atoms of thought,” jan Sonja may
have made her most wonderful contribution to the wonderful world of conlangs.
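Rule 4 is simple enough to check by machine. The sketch below rests on assumptions of my own: ANTONYM_PAIRS and violates_non_contradiction are hypothetical names, only the two antonym pairs mentioned above are listed, and the phrase is taken to be a flat, non-empty string of content words with no pi, so that the first word is the head and every other word modifies it.

```python
# Antonym sets as they might be specified in the lexicon (only the two named above).
ANTONYM_PAIRS = [{"ike", "pona"}, {"pimeja", "suno"}]

def violates_non_contradiction(phrase):
    """True if two antonyms both appear among the modifiers of one head."""
    head, *modifiers = phrase.split()
    mods = set(modifiers)
    # A pair is violated when both of its members modify the same head.
    return any(pair <= mods for pair in ANTONYM_PAIRS)

print(violates_non_contradiction("jan pona ike"))  # True
print(violates_non_contradiction("jan pona"))      # False
```

A fuller version would have to apply the check separately to each side of a pi, since each pi-group has a head of its own.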
And speaking of wonderful, let’s see what the Story of the Girl
looks like in the Language of Good:
ni la mama meli mi li toki e mi:
meli lili li insa e luka ona lon ko seli, ona li pana e ko seli tawa sewi. ona
li toki e ko seli, “ko seli lon ma ni la ale pi ona o ante tawa nasin pi sewi
pimeja . . .”
I use the compound: ko seli, “fire powder” to mean “ashes.”
And nasin pi sewi pimeja, “path of
the dark above,” is the Milky Way. Get on the ‘Net, buy Sonja Lang’s lovely
little book, and keep the story going.
#