ZE is monolexic; it contains only ze, the "joining" or "mixing" operator which has the sense of 'and, jointly'. Ze is not a logical connective; it does not allow a number of connected claims to be made simultaneously. A claim made with ze is always a single claim. Thus, Da redro, e cmalo makes two claims about X, namely that it is red, and that it is small. De redro ze nigro makes only one claim about Y, namely that it is red-and-black mixed together. Perhaps it has red stripes alternating with black ones; perhaps it has red dots on a black field. But if a thing is redro ze nigro, it is not true that it is either red or black separately.
ZE may be used to join arguments or, as above, predicates. The designata of ze-joined arguments are like teams. If Da ze de pa berti leva felstaga ('X and Y, jointly, carried that log (fallen trunk)'), then what is being asserted is that the team of them did. We may gather that it is most unlikely that either of them could have carried it separately.
The grammar of ZE is very similar to that of the afterthought connectives; see Lexeme A. However, since ze-binding is "tighter" than shek-binding, that is, since strings like Da, e de ze di = 'X, and Y and Z, jointly' will always parse as (da e (de ze di)), grammatically "earlier" rules are required to effect ze-joints than shek-joints. This is what requires ze to be in its own lexeme.
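The difference in binding strength can be pictured with a small sketch. It is only an illustration of "tighter" binding, not the mechanism of the actual grammar: it assumes the string has already been split into words, it ignores the comma, it handles only the connectives e and ze, and it assumes left-to-right grouping within each level. The function name parse_connected is hypothetical.

```python
def parse_connected(tokens):
    """Group a flat list of words so that 'ze' binds tighter than 'e'."""
    # First pass: split the string at the looser connective 'e'.
    groups, current = [], []
    for tok in tokens:
        if tok == "e":
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)

    def join_ze(group):
        # Second pass: inside each piece, join the operands around 'ze'
        # (left-to-right grouping is an assumption of this sketch).
        operands = [t for t in group if t != "ze"]
        tree = operands[0]
        for operand in operands[1:]:
            tree = (tree, "ze", operand)
        return tree

    # Finally connect the ze-joined pieces with 'e'.
    trees = [join_ze(g) for g in groups]
    result = trees[0]
    for t in trees[1:]:
        result = (result, "e", t)
    return result


print(parse_connected(["da", "e", "de", "ze", "di"]))
# ('da', 'e', ('de', 'ze', 'di'))
```

Because the ze-joints are built before the e-joints, the ze-pair always ends up as a single operand of e, which is the grouping (da e (de ze di)) described above.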
ZE is also used by the preparser to recognize acronymic PREDA's. This is because pseudo-ze's are sometimes generated by the acronymic hyphen -z-.
*Lexeme ZI: Magnitude Suffixes
There are three of these words, zi za zu, indicating small, intermediate, and large magnitudes, respectively, in their role as suffixes in tense (fazi = 'right away') and location (vizu = 'in this region') compounds; see PA. A separate lexeme is required to enable the preparser to recognize such compounds and also to identify some acronymic PREDA's, because, just as with ZE (q.v.), pseudo-instances of za zi zu may be generated in these words by the acronymic hyphen -z-.
*Lexeme ZO: The Quantity Abstractor
ZO is monolexic; its sole member is zo, the quantitative abstraction operator. It has a grammar parallel to that of PO, q.v. The only reason zo occupies a separate lexeme is that, just as with ZE and ZI words, pseudo-zo syllables may be generated by the use of hyphen -z- in some acronymic PREDA's. If it weren't for this mechanical use of ZO by the preparser, zo would be a member of PO.
GRAMMAR (UTTERANCE FORMS)
4.1 Design Objectives: The objectives that have controlled the design of Loglan grammar have been, first, to accommodate the rich variety of claims and designations found in natural language but, second, to do so with a grammar modeled on the predicate calculus, thus facilitating both the logical manipulation of its utterances by its users and the design of a language with parsimonious metaphysical assumptions. Third, it was to make the operations of that grammar natural enough to fit the human central nervous system so as to make the language speakable, and fourth, it was to be flexible enough to imitate the natural word orders of the planet's major tongues when desired…for example, in translation. But it was also, fifth, to be expressible in a rule set small enough to be easily learned by adults, possibly even under experimental conditions. Finally, sixth, the grammar was to be syntactically unambiguous for two reasons. The most obvious one was to make interaction with machines possible. Somewhat less obvious is a reason that is the same reason logicians and mathematicians require unambiguous codes, namely to make it possible rigorously to entertain implausible ideas…a prime requirement of a logical language.
Syntactic unambiguity was achieved heuristically, and therefore doubtfully, as early as 1963, but was not formally demonstrated until 1982. A few years before that achievement a constructive-proof algorithm for demonstrating conflict-freeness in certain classes of computer languages had become available and was soon adapted to disambiguating human grammars. Syntactic unambiguity was lost again in 1984 when The Institute's small computer proved inadequate to the task of servicing the growing grammar. It was redemonstrated for the enlarged language when The Institute acquired more capacious computing machinery in 1986. During the Winter and Spring of 1987 a large backlog of planned new grammatical features was installed in the language one by one; and I was gratified to discover that the condition of conflict-freeness was easily returned to each time. In short, given our present tools for disambiguating human grammars, there is no reason to believe that Loglan grammar will not remain syntactically unambiguous for the indefinite future.
4.2 Definitions and Conventions: Let us first look at a typical grammar rule and provide a terminology for discussing its parts:
150: kekpred => M3 KA predicate KI predicate     Local
151:         => NO kekpred     De no ke ckano ki bunbo = Y is not both kind and a fool.
This is the grammar rule by which "kekked", or forethoughtfully connected, predicates are formed. The specimens on the right illustrate each line of the grammar rule. The number on the left gives its position in the grammar. The remark 'Local' tells us that the rules which use 'kekpred' are all nearby.
Each numbered line in a grammar rule is called a rule. The sign [=>] in each rule may be read 'may be developed as' or 'may produce' and is called the production sign. The expression to the left of the production sign in a rule is called its left-half. If a rule has no explicit left-half, it is assumed to have the same left-half as the first preceding rule that has one.
Rules 150-1 are all the rules in this grammar that have 'kekpred' as their left-halves. A grammar rule is the complete set of rules in a given grammar that have a given left-half. Within a grammar rule, the order of rules is arbitrary.
This is a context-free grammar. In such grammars the left-halves of all grammar rules are single elements. Right-halves may either be single elements, in which case the rule is called a replacement, or they may be strings of two or more elements (up to about five in number), in which case it is called an expansion.
Elements which are written entirely in upper-case letters, like 'KA' and 'NO', or in a mixture of upper-case letters and numerals, like 'M3', are called lexemes. Recall that lexemes are sets of grammatically interchangeable words, roughly corresponding to the "parts of speech" of conventional grammar.
Elements whose names are written at least partly in lower-case letters—for example, 'kekpred' and 'predicate'—are known as gramemes. By definition, each grameme in a grammar appears in the left-half of exactly one of its grammar rules. At present, there are 88 gramemes in Loglan grammar; so there are 88 grammar rules.
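These naming conventions are mechanical enough to sketch in code. The following is a minimal illustration, not part of the machine grammar: the dictionary layout and the function names is_lexeme, is_grameme and rule_kind are hypothetical, and the pattern assumes that a lexeme name begins with a letter. The two right-halves are those of the kekpred grammar rule discussed in this section.

```python
import re

# The kekpred grammar rule: one left-half, two right-halves.
KEKPRED_RULE = {
    "kekpred": [
        ["M3", "KA", "predicate", "KI", "predicate"],
        ["NO", "kekpred"],
    ]
}


def is_lexeme(name):
    """Lexemes: upper-case letters only, or upper-case letters plus numerals."""
    return re.fullmatch(r"[A-Z][A-Z0-9]*", name) is not None


def is_grameme(name):
    """Gramemes: names written at least partly in lower-case letters."""
    return any(ch.islower() for ch in name)


def rule_kind(right_half):
    """One element makes a replacement; two or more make an expansion."""
    return "replacement" if len(right_half) == 1 else "expansion"


for right_half in KEKPRED_RULE["kekpred"]:
    kinds = ["lexeme" if is_lexeme(e) else "grameme" for e in right_half]
    print(rule_kind(right_half), list(zip(right_half, kinds)))
```

Both rules of the kekpred grammar rule come out as expansions, since each right-half has more than one element.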
The remark 'Local' appended to the 'kekpred' grammar rule tells us that the rules that use this grameme are all nearby (actually, in the same "grammar group"; see below). Sometimes a series of one or more numerals occupies the space occupied here by 'Local'; these are references to the non-local rules which use the grameme defined by that grammar rule. The forward references are given first; the backward references, if any, are separated from the forward ones by a semicolon.
The right-halves of a grammar rule are called the allograms of its grameme. 'kekpred' has two allograms. When either gramemes or their allograms are referred to in text, their names will be shown in angle brackets. Thus we can say that <kekpred> has the allograms <NO kekpred> and <M3 KA predicate KI predicate>. Alternatively, we can write the Kekpred Grammar Rule as <kekpred => NO kekpred | M3 KA predicate KI predicate>. In this second formulation the bar [|] is used to separate the names of allograms and is read 'or'.
A grameme that appears in the allograms of another grameme, as <predicate> appears in one of the allograms of <kekpred>, is said to be used by that other grameme. A grameme which is used in one of its own allograms is said to be recursive. In general, we will find it advantageous to develop early in the grammar the gramemes that will be used by later gramemes.
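The notions of "used by" and "recursive" can be checked mechanically. This is a minimal sketch under stated assumptions: the grammar is held in a plain dictionary from grameme to its list of allograms, only the kekpred rule is included, and the helper names gramemes_used and is_recursive are hypothetical. It tests direct recursion only, which is all the definition above asks for.

```python
# Only the kekpred grammar rule is represented here.
GRAMMAR = {
    "kekpred": [
        ["M3", "KA", "predicate", "KI", "predicate"],
        ["NO", "kekpred"],
    ]
}


def gramemes_used(grammar, grameme):
    """Return the gramemes that appear in any allogram of the given grameme."""
    used = set()
    for allogram in grammar[grameme]:
        for element in allogram:
            # Gramemes are the elements written partly in lower case.
            if any(ch.islower() for ch in element):
                used.add(element)
    return used


def is_recursive(grammar, grameme):
    """A grameme is recursive if it is used in one of its own allograms."""
    return grameme in gramemes_used(grammar, grameme)


print(sorted(gramemes_used(GRAMMAR, "kekpred")))  # ['kekpred', 'predicate']
print(is_recursive(GRAMMAR, "kekpred"))           # True
```

On this reading kekpred is itself recursive, because 'kekpred' appears in its own allogram NO kekpred.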
A sequence of grammar rules which has been ordered by the above principle, and which terminates in a widely used structure, is called a grammar group, or sometimes just a group. Ideally, all the gramemes except the final one in a group are used only within that group. When this is true of a grameme in a group, that grameme is said to be local to that group and so is marked 'Local' as above. Groups are usually given the name of the major non-local grameme with which they terminate. For example, <kekpred> is part of the Predicates Group (Rules 128-54) and is local to that group. The last grameme developed in the Predicates Group is <predicate>.
As shown in the example, each rule will be illustrated by a specimen of Loglan that it helped produce. When we need to refer to a rule or specimen, we will use the ordinal of the rule but prefix it with 'R' or 'S'; thus S150-1 are the specimens of R150-1. The part of a specimen that is in boldface corresponds to the part produced by its rule. In the specimens given for R150-1 above, all but the Loglan words [Da] and [De] were produced by the two rules; so all but those parts of the two specimens are shown in boldface. A corresponding pattern of bolding is shown in the English translations of the Loglan specimens.
Some rules use M- or machine lexemes. For example R150 uses M3. Machine lexemes are inaudible to the human ear, but, as we will see later, they are indispensable for machine parsing. They are in fact lexemes in machine Loglan, which may be thought of as that "dialect" of Loglan that machines will be able to read and hear. By convention, the name of a machine lexeme is formed of the letter 'M' followed by a numeral. There are at present 11 machine lexemes in Loglan: M1-M11. All are involved in one way or another with extending the limited 1-element lookahead of the LR1 parser. This Yacc-generated parser is one of the three components of our machine grammar. Another component is the preparser which, among other mechanizing functions, inserts machine lexemes into the strings to be parsed. These machine lexemes, together with certain other lexemes (like ERROR in the first grammar group), are of concern only to the machine and so are not part of human Loglan. These and other computational features are removed from the parsed string by the third component of the machine grammar, the postparser. It is the function of the postparser to humanize the parse by making it intelligible to humans. We retain these machine-oriented features in our exposition of the grammar in the current notebook only because some of our readers may be interested in how machines may be said to "understand Loglan". We assume that most readers, however, will wish to ignore all grammatical embellishments that have been put there solely for machines. So we will glance only briefly at these computational devices. Readers interested primarily in the computability of the language must go to other sources for detailed information about how this has been accomplished; e.g., Notebook 1, 1982.
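The humanizing step can be pictured with a small sketch. It is not the actual postparser, whose internals are not described here: it assumes that a parse is represented as nested lists of word strings, that machine lexemes show up in it as tokens named M1 to M11, and that the dummy lexeme appears as the token ERROR. The function name humanize and the sample parse are hypothetical.

```python
import re

# Tokens of concern only to the machine: M followed by a numeral, or ERROR.
MACHINE_TOKEN = re.compile(r"M\d+|ERROR")


def humanize(parse):
    """Drop machine-only tokens from a nested-list parse, keeping the grouping."""
    if isinstance(parse, str):
        return parse
    return [
        humanize(node)
        for node in parse
        if not (isinstance(node, str) and MACHINE_TOKEN.fullmatch(node))
    ]


# Hypothetical machine parse: an M3 inserted before 'ke', and a final ERROR.
machine_parse = ["De", ["no", ["M3", "ke", "ckano", "ki", "bunbo"]], "ERROR"]
print(humanize(machine_parse))
# ['De', ['no', ['ke', 'ckano', 'ki', 'bunbo']]]
```

The grouping survives unchanged; only the tokens that were inserted for the machine's benefit disappear.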
4.3 The Structure of Loglan Grammars: The grammar rules of Loglan may be conveniently divided into twelve functional groups. We will discuss those groups here in the order in which the listener is likely to make use of them (the so-called "top-down" order). In that order they are the groups of rules that govern the formation of (1) Optional Punctuators, (2) Linked Arguments, (3) Predicate Units, (4) Descriptive Predicates, (5) Sentence Predicates, (6) Modifiers, (7) Argument Modifiers, (8) Arguments, (9) Term Sets, (10) Predicates, (11) Sentences, and (12) Utterances.
In general, later structures on the above list involve earlier ones. Thus, looking at the grammar from the "bottom-up", we would find that
Utterances require Sentences;
Sentences require Modifiers, Predicates and Arguments;
Arguments require Descriptive Predicates and Argument Modifiers;
Argument Modifiers require Modifiers;
Modifiers require Arguments;
Sentence Predicates require Predicate Units and Descriptive Predicates;
Descriptive Predicates require Predicate Units;
Predicate Units require Linked Arguments;
Linked Arguments require Arguments again, which are thus the most "circular" structures in the language, in that they are used by nearly everything which they in turn use; and that
Optional Punctuators are used by nearly every other grammar group.
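The "requires" relations just listed form a small directed graph, and the circularity of Arguments can be read off it mechanically. The sketch below encodes only the relations stated in the list; Optional Punctuators are left out because they are said to be used by nearly every group. The names REQUIRES and reachable are hypothetical.

```python
# Each group maps to the groups it is said to require.
REQUIRES = {
    "Utterances": ["Sentences"],
    "Sentences": ["Modifiers", "Predicates", "Arguments"],
    "Arguments": ["Descriptive Predicates", "Argument Modifiers"],
    "Argument Modifiers": ["Modifiers"],
    "Modifiers": ["Arguments"],
    "Sentence Predicates": ["Predicate Units", "Descriptive Predicates"],
    "Descriptive Predicates": ["Predicate Units"],
    "Predicate Units": ["Linked Arguments"],
    "Linked Arguments": ["Arguments"],
}


def reachable(start):
    """Return every group required, directly or indirectly, by the start group."""
    seen, stack = set(), list(REQUIRES.get(start, []))
    while stack:
        group = stack.pop()
        if group not in seen:
            seen.add(group)
            stack.extend(REQUIRES.get(group, []))
    return seen


# A group is "circular" if it indirectly requires itself.
circular = sorted(g for g in REQUIRES if g in reachable(g))
print(circular)
# ['Argument Modifiers', 'Arguments', 'Descriptive Predicates',
#  'Linked Arguments', 'Modifiers', 'Predicate Units']
```

Arguments sit on both loops, one through Argument Modifiers and Modifiers and one through Descriptive Predicates, Predicate Units and Linked Arguments, which is the sense in which they are the most "circular" structures.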
Therefore we will start with Group A, the Optional Punctuators, for these are at the "top" of the parse tree. We will then work "down" to Group K, which develops Utterances, for these are near its "root". The grammar is thus presented in its "top-down" order; it goes from the "leaves" (the lexemes) toward the "root" (the grameme itself) of the parse tree. This is the same order, by the way, in which the rules appear in the formal (machine) grammar which was given to Yacc to build the LR1 parser. While this may not be the best order in which to teach grammar rules to human learners, it is the most economical one in that it minimizes forward references. It is thus most suitable for a work whose most enduring value to its users will probably be its utility as a reference work.
The three "optional punctuators" gu, gue and PAUSE, the latter being represented by a pause in speech and a comma [,] in text, are optional only in the sense that they may be omitted when they are not necessary. But in fact, usage requires that these punctuators be omitted when the intended grouping is clear without them; so they are seldom optional in any real sense. Oddly enough, the computer sometimes treats these punctuators as "present in spirit" even when they are omitted. S1 is an example of this.
Mu titci fa (ERROR) = We eat later. (The machine sees this as an error because it expects a gu. But if it supplies the dummy lexeme ERROR at the end of this utterance, it can complete the parse.
We arrange for it to do this by making an allogram of in R2, and of in R4.)
Mu titci vi (ERROR) = We eat here.
Mu titci vi gu le supta = We eat here the soup. (Without gu the utterance would be heard as 'We eat in the soup'; so gu is necessary here.)
Both Mu vi titci le supta = 'We here eat the soup' and Mu titci le supta vi = 'We eat the soup here' avoid the need for punctuation. The unpunctuated word orders tend to be stylistically preferred in Loglan but are certainly not obligatory. In fact, the judicious use of punctuation makes almost any conceivable word-order possible in Loglan, a feature which is very useful in translating whenever one wishes to reproduce the flavor of a natural language text as closely as possible.
There is also an invisible '(ERROR)' at the end of S3 as of nearly all specimens. But we don't show it here because the allogram of is not part of R3.
Da bilti ge cmalo nirli ckela (ERROR) = X is beautiful for a small-girls school. (Punctuation at the end of an utterance may always be omitted. Again, the computer will compensate for such "errors" by inserting the dummy lexeme ERROR.)
Da bilti ge cmalo nirli gue ckela = X is a beautiful small-girls [pause] school, i.e., a school for small girls who are beautiful. (Here the gue serves as a right parenthesis matched with ge.)
Mu titci fa gu le mitro = We eat later the meat. (Again, an explicit mark is called for; to omit it is to generate the phrase fa le mitro = 'after the meat'.)
Mu titci fa, le mitro = We eat later, the meat. (When is invoked, the required mark may be a comma in writing or a pause in speech.)