The “Style Machine” and its Codes
for
Statistical Studies of Natural Syntactic Development:
An On-going KISS Project
Dr. Ed Vavra, the Developer of KISS Grammar
For the “Introduction” to the Statistical Studies, click here.
El Greco’s The Fifth Seal of the Apocalypse (or The Vision of Saint John), 1608–1614
(I love El Greco’s elongated figures.)
© 2014
The “Style Machine” is a computer program made in an authoring program called ToolBook
almost twenty years ago. I have been asked if I would give it to others to use. I’ll be happy to
give it to anyone who wants it, but I cannot give them a version that works. First, it is rather
complex and making it “user friendly” would require a great deal of work. In addition, it is
somewhat cranky, giving me problems that I have been able to fix, but I would not have the time
to fix such problems in copies I gave away. Moreover, it will not work in Windows 7. I would
love to have other people use it, but I simply cannot spend the time.
The program itself creates a separate data file for each collection of writing to be studied.
(See the “Introduction.”) In a data file, each sample has two “pages.” The first page includes
fields for an electronic copy of the writing sample, and for identification information. A button
on that page leads to the “analysis” page. Initially, the uncoded writing sample is simply copied
to a field in the analysis page. The analysis page, however, includes over 250 data fields that are
filled in when the sample is processed through the “coding” part of the program. (See below.)
As noted, I have been using this program for almost twenty years. Originally, I put nine
tables of data for each study in KISS’s “Cobweb Corner.” This was EXTREMELY time-consuming, so I will probably not do it in the future. In this on-going project, all the data
documents include coded versions of each sample. This enables anyone who is interested to
check what is being counted and how. In many cases, you will also be able to see how some
writers tend to favor various constructions. Some writers use deeply embedded subordinate
clauses; others do not. Some writers favor gerundives over clauses. Others do a lot of
compounding.
The “Style Machine” Codes
After an uncoded sample gets to the “analysis” page, a click sends it to the “coding page.”
No, the machine cannot code the structure of a sample on its own; codes have to be “clicked”
into the text. The uncoded sample arrives in a “locked” text field. To the right of that field there
are buttons for different codes. The coding is a process of selecting a button and then clicking on
the word to which that code should be attached. The codes, as described below, are divided into
four groups. Once the sample has been coded, it is sent to the “Analysis” page where the various
numbers are automatically counted. From there, it and the numbers are sent to the “Analyzed”
page for that sample.
The “Main” Group of Codes
At the top of the coding page, four buttons can be clicked, each of which brings up one
group of codes and hides the others. One group is titled “Main.” It brings up buttons for the
following codes.
Prepositional Phrases
With this button clicked, any click on the text of the sample will insert a brace { }. A left
mouse button click inserts an opening brace; the right button inserts a closing brace. (Braces are
used instead of parentheses because braces are rarely—if ever—found in real texts.) The braces
are colored pink. To insert a code correctly, one has to click on the word to which the code
should be attached. These codes enable the program to count both the number of prepositional
phrases and the number of words in those prepositional phrases. Perhaps the most interesting
statistical result here is the percentage of words within a text that are in prepositional phrases.
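The Style Machine itself was built in ToolBook, so what follows is only a rough Python sketch of the kind of count described above, not the program's own code. It assumes that braces are never nested, that every opening brace has a closing one, and that the sample carries no other codes. The function name and the sample sentence are invented for illustration.

```python
import re

def prep_phrase_stats(coded: str) -> dict:
    """Count brace-delimited phrases and the share of words inside them."""
    # Assumption: braces are not nested and each phrase is closed.
    phrases = re.findall(r"\{([^{}]*)\}", coded)
    words_in_phrases = sum(len(p.split()) for p in phrases)
    # Braces are attached to words, so delete them before counting all words.
    total_words = len(re.sub(r"[{}]", "", coded).split())
    percent = 100.0 * words_in_phrases / total_words if total_words else 0.0
    return {
        "phrases": len(phrases),
        "words_in_phrases": words_in_phrases,
        "total_words": total_words,
        "percent": percent,
    }

# Invented sentence, coded only for prepositional phrases.
sample = "She climbed {into her khakis} and hurried {to the barn.}"
stats = prep_phrase_stats(sample)
print(stats)
```

In this made-up sentence, six of the ten words fall inside the two phrases, so the percentage comes out at 60.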
Main Clause Indicators
In typical KISS analysis keys, a vertical line indicates the end of a main clause. For
statistical analysis, however, the following codes are put before every main clause (or fragment):
/-/ means the following is a typical main clause.
/C/ means the following clause is a compound.
/R/ means the following is a run-on.
/,/ means the following is a comma-splice.
/F/ means the following is a fragment.
These codes enable the program to automatically count the number of compounds, run-ons,
comma-splices, and fragments per main clause.
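A count of this kind might be sketched in Python as follows. This is not the ToolBook code. It assumes that every marker introduces exactly one main clause or fragment, and that the rate "per main clause" uses the total number of markers as its denominator. The function names and the sample text are invented.

```python
import re
from collections import Counter

# The five markers listed above, keyed by the character between the slashes.
MAIN_CLAUSE_CODES = {
    "-": "typical",
    "C": "compound",
    "R": "run-on",
    ",": "comma-splice",
    "F": "fragment",
}
MARKER = re.compile(r"/([-CR,F])/")

def main_clause_counts(coded: str) -> Counter:
    """Tally each kind of main-clause marker in a coded sample."""
    return Counter(MAIN_CLAUSE_CODES[m] for m in MARKER.findall(coded))

def rate_per_main_clause(counts: Counter, kind: str) -> float:
    """Share of all markers that are of the given kind (assumed denominator)."""
    total = sum(counts.values())
    return counts[kind] / total if total else 0.0

# Invented sample with one comma-splice, one fragment, and one compound.
sample = ("/-/The dog barked, /,/the cat ran. /F/Because of the noise. "
          "/-/We left, /C/and they stayed.")
counts = main_clause_counts(sample)
print(counts, rate_per_main_clause(counts, "comma-splice"))
```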
Subject/Verb Agreement
With this button activated, clicking on a word attaches “=SV” to the word to indicate an
error in subject/verb agreement.
Three Codes for Other Compounds
=CS is attached to the second (or more) of compound subjects.
=CV is similarly attached to compound verbs.
=CC similarly identifies compound complements.
Passive Verbs
=PV is attached to passive finite verbs.
The “Subordinate Clause” Group of Codes
As in the regular KISS analysis keys, left and right mouse-button clicks insert opening and
closing brackets. (Unlike the regular analysis keys, here subordinate clause codes are in green.)
There are 33 code buttons in this group that identify eleven different functions of clauses. The
codes for these buttons attach four letters to the opening bracket. The first letter indicates
whether the code is “L”eft, “M”id, or “R”ight branching. If the second letter is “N,” the clause is
a noun clause. The machine distinguishes six types of noun clauses:
DO = direct object
PN = predicate noun
OP = object of a preposition
DS = delayed subject
SU = subject
AP = appositive
For example, if [LNDO is attached to the first word of a clause, that clause should be a
subordinate clause that functions as a left-branching direct object:
/-/[LNDO It was too late], he thought.
If the second and third letters are “AJ” or “AV,” the clause is adjectival or adverbial. In these
clauses, the fourth letter is either “F” (for a full clause) or “R” (for a reduced clause). For
example:
/-/One must be careful [RAVRwhen hunting].
indicates that the “when hunting” is right-branching, adverbial, and is reduced from “when one is
hunting.”
The last three codes in this group have “INJ” as their final three letters to denote clauses that
function as interjections.
/-/It was, [MINJshe thought], a good idea.
Confused? It takes a while to get used to the system, and it does take some time to code a
passage. A 250-word passage usually takes fifteen to twenty minutes.
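For readers who find the letters easier to follow as a lookup, here is a small Python sketch that spells out a subordinate clause code. It is only my reading of the scheme described above, not the program's own logic, and the function and table names are invented. Enumerating every combination gives eleven functions and 33 codes, which agrees with the counts given earlier.

```python
BRANCHING = {"L": "left", "M": "mid", "R": "right"}
NOUN_FUNCTIONS = {
    "DO": "direct object",
    "PN": "predicate noun",
    "OP": "object of a preposition",
    "DS": "delayed subject",
    "SU": "subject",
    "AP": "appositive",
}
MODIFIERS = {"AJ": "adjectival", "AV": "adverbial"}
FORMS = {"F": "full", "R": "reduced"}

def decode_clause_code(code: str) -> dict:
    """Spell out a four-letter code such as '[LNDO' or 'RAVR'."""
    code = code.lstrip("[")
    if len(code) != 4 or code[0] not in BRANCHING:
        raise ValueError(f"not a clause code: {code!r}")
    branching, rest = BRANCHING[code[0]], code[1:]
    if rest == "INJ":
        return {"branching": branching, "type": "interjection"}
    if rest[0] == "N" and rest[1:] in NOUN_FUNCTIONS:
        return {"branching": branching, "type": "noun",
                "function": NOUN_FUNCTIONS[rest[1:]]}
    if rest[:2] in MODIFIERS and rest[2] in FORMS:
        return {"branching": branching, "type": MODIFIERS[rest[:2]],
                "form": FORMS[rest[2]]}
    raise ValueError(f"not a clause code: {code!r}")

# Every suffix the description allows, combined with each branching letter.
SUFFIXES = (["N" + f for f in NOUN_FUNCTIONS]
            + [m + f for m in MODIFIERS for f in FORMS]
            + ["INJ"])
ALL_CODES = [b + s for b in BRANCHING for s in SUFFIXES]

print(decode_clause_code("[RAVR"))
print(len(SUFFIXES), len(ALL_CODES))
```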
The “Analysis,” or “counting,” page automatically counts the number of words in each main
clause and each subordinate clause. It also distinguishes and counts embedding levels. (A
subordinate clause in a main clause is considered to be at Level One; a subordinate clause within
the subordinate clause is at Level Two, etc.)
Consider, for example, the sentence from Twain’s Tom Sawyer:
\-\Forty times I've said [RNDO[LAVFif you didn't let that jam alone] I'd skin you.]
The program recognizes “[RNDO” as the beginning of a Level One, right-branching noun clause
that functions as a direct object, and it starts counting the words in it. The connected “[LAVF”
denotes the beginning of a Level Two clause that functions as a left-branching adverb. So the
program begins to count the words in it. The “]” after “alone” indicates the end of the Level Two
“if” clause. The “]” after “you” indicates the end of the Level One direct object clause.
The studies thus far suggest that professional writers rarely go beyond Level Three.
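The level counting described for the Twain sentence can be imitated with a simple stack. The Python below is a sketch, not the ToolBook program. It assumes that brackets are balanced, and that a word inside an embedded clause also counts toward every clause that contains it (the description does not say how the program handles this). The names are invented. The main-clause marker is accepted with either kind of slash, since the sample sentences write it with backslashes.

```python
import re

# An opening clause code, a closing bracket, or any other run of characters.
TOKEN = re.compile(r"(\[[LMR][A-Z]{3})|(\])|([^\s\[\]]+)")
MAIN_MARKER = re.compile(r"[/\\][-CR,F][/\\]")

def clause_levels(coded: str) -> list:
    """Return (code, level, word_count) for each subordinate clause."""
    coded = MAIN_MARKER.sub(" ", coded)
    stack = []    # one [code, word_count] entry per clause still open
    clauses = []
    for open_code, close, word in TOKEN.findall(coded):
        if open_code:
            stack.append([open_code[1:], 0])
        elif close:
            if not stack:
                raise ValueError("closing bracket without an opening code")
            code, count = stack.pop()
            # The level is the number of clauses open when this one started.
            clauses.append((code, len(stack) + 1, count))
        else:
            # Assumption: a word counts toward every clause it sits inside.
            for entry in stack:
                entry[1] += 1
    return clauses

twain = (r"\-\Forty times I've said [RNDO[LAVFif you didn't let that jam "
         r"alone] I'd skin you.]")
levels = clause_levels(twain)
print(levels)
```

Under those assumptions, the “if” clause comes out as a seven-word Level Two clause, and the direct object clause as a ten-word Level One clause.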
The “Verbals” Group of Codes
The twenty-one buttons in this group work differently. The codes show up in the analyzed
text in blue, preceded by an asterisk and followed by numbers. Ten codes identify functions of
infinitives:
*INFDO = direct object
*INFDE = ellipsed direct object, as in “They made him captain.”
*INFPN = predicate noun
*INFOP = object of a preposition
*INFSU = subject
*INFDS = delayed subject
*INFAP = appositive
*INFIJ = interjection
*INFAJ = adjective
*INFAV = adverb
Similarly, eight codes for gerunds show up as:
*GerDO = direct object
*GerPN = predicate noun
*GerOP = object of a preposition
*GerSU = subject
*GerDS = delayed subject
*GerAP = appositive
*GerIJ = interjection
*GerNU = gerund that functions as a Noun Used as an Adverb
Three codes indicate the branching of gerundives:
*GiveL = before its subject
*GiveM = between the subject and verb
*GiveR = after the subject and verb
The two-digit number that follows each of the preceding codes indicates the number of
words that chunk to the verbal. These have to be counted (by hand), and the number put into
another field where it is automatically attached to the code. The following sentence is from
“Mary White,” by William Allen White:
\-\She climbed {into her khakis,} chattering*GiveR10 {to her mother} {about the
work} [RAJFshe was doing,] and hurried=CV to get*INFAV04 her horse and
be*INFAV19 out {on the dirt roads} {for the country air and the radiant green
fields} {of the spring.}
Although “chattering” is a participle that could be considered a gerund that functions as a Noun
Used as an Adverb (to “climbed”), KISS opts for the gerundive (adjectival) explanation because
students have problems with misplaced participles that function as gerundives. In other words,
either explanation would work here, but KISS has students focus on the gerundive first in order
to avoid misplaced modifiers.
In this case, the “chattering” phrase consists of ten words—“chattering to her mother about
the work she was doing.” Counting the number of words in such phrases is important because we
should expect the numbers to rise with age.
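Once the numbers are attached, pulling the verbal codes back out of a coded sample is straightforward. The following Python sketch, which is not the ToolBook code, reads each code and its two-digit number from the “Mary White” sentence above and averages the phrase lengths by kind of verbal. The names are invented, and the averaging is just one plausible use of the numbers.

```python
import re
from collections import defaultdict

# A word, an asterisk, a verbal code, and the two-digit phrase length.
VERBAL = re.compile(r"([\w'’]+)\*(INF|Ger|Give)([A-Z]{1,2})(\d{2})")
KINDS = {"INF": "infinitive", "Ger": "gerund", "Give": "gerundive"}

def find_verbals(coded: str) -> list:
    """Return (word, kind, role, phrase_length) for each coded verbal."""
    return [(word, KINDS[kind], role, int(n))
            for word, kind, role, n in VERBAL.findall(coded)]

def mean_length_by_kind(verbals: list) -> dict:
    """Average number of words per verbal phrase, grouped by kind."""
    lengths = defaultdict(list)
    for _, kind, _, n in verbals:
        lengths[kind].append(n)
    return {kind: sum(ns) / len(ns) for kind, ns in lengths.items()}

white = (r"\-\She climbed {into her khakis,} chattering*GiveR10 {to her mother} "
         "{about the work} [RAJFshe was doing,] and hurried=CV to get*INFAV04 "
         "her horse and be*INFAV19 out {on the dirt roads} {for the country air "
         "and the radiant green fields} {of the spring.}")
verbals = find_verbals(white)
print(verbals)
print(mean_length_by_kind(verbals))
```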
The “Other” Group of Codes
The last group of codes works in the same way as do the codes for verbals, except that they
are preceded by a #.
#NuA = Noun Used as an Adverb
#APP = Appositive
#INJ = Interjection
#DirA = Direct Address
#DSu = Delayed Subject (single word)
#RCM = Retained Complement
#PPP = Post-positioned Adjective
#AAV = Noun Absolute – Adverbial
#ASu = Noun Absolute – Subject
#ADO = Noun Absolute – Direct Object
#AOP = Noun Absolute – Object of Preposition
#APN = Noun Absolute – Predicate Noun
There are some constructions that are missed in this coding system, but the preceding
explanation will help you to decipher the codes in the selections.
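If the codes get in the way of reading a selection, they can be stripped back out. The Python below is a sketch of one way to do that, based only on the codes described in this document; it is not part of the Style Machine. It assumes that the “Other” codes may be followed by digits, as the verbal codes are, and it accepts the main-clause marker written with either kind of slash. The names are invented.

```python
import re

MAIN_MARKER = re.compile(r"[/\\][-CR,F][/\\]")
ATTACHED_CODES = re.compile("|".join([
    r"\[[LMR](?:N[A-Z]{2}|A[JV][FR]|INJ)",            # opening clause codes
    r"[\]{}]",                                        # closing brackets, braces
    r"=(?:SV|CS|CV|CC|PV)\b",                         # agreement, compounds, passives
    r"\*(?:INF[A-Z]{2}|Ger[A-Z]{2}|Give[LMR])\d{2}",  # verbals
    # Assumption: the "Other" codes may carry trailing digits.
    r"#(?:NuA|APP|INJ|DirA|DSu|RCM|PPP|AAV|ASu|ADO|AOP|APN)\d*",
]))

def strip_codes(coded: str) -> str:
    """Remove every code and return the plain text with single spaces."""
    text = MAIN_MARKER.sub(" ", coded)
    text = ATTACHED_CODES.sub("", text)
    return " ".join(text.split())

white = (r"\-\She climbed {into her khakis,} chattering*GiveR10 {to her mother} "
         "{about the work} [RAJFshe was doing,] and hurried=CV to get*INFAV04 "
         "her horse and be*INFAV19 out {on the dirt roads} {for the country air "
         "and the radiant green fields} {of the spring.}")
plain = strip_codes(white)
print(plain)
print(len(plain.split()))
```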