Computer Science 112
Fundamentals of Programming II
Recursive Processing of Languages
Applications
• Grammar checkers in word processors
• Programming language compilers
• Natural language queries (Google, etc.)
Languages and Grammars
• A grammar specifies the rules for
constructing well-formed sentences in a
language
• Every language, including a programming
language, has a grammar
Generate Sentences in English
• Given a vocabulary and grammar rules, one can generate some
random and perhaps rather silly sentences
• Vocabulary - the set of words belonging to the parts of speech
(nouns, verbs, articles, prepositions)
• Grammar - the set of rules for building phrases in a sentence
(noun phrase, verb phrase, prepositional phrase)
The Structure of a Sentence
• A sentence is a noun phrase followed by a verb phrase
• A noun phrase is an article followed by a noun
• Pick actual words for those parts of speech at random (the, girl)
• A verb phrase is a verb followed by a noun phrase and a prepositional phrase
• Pick a verb at random (hit)
• Expand a noun phrase again
• Pick an article and a noun at random (the, boy)
• A prepositional phrase is a preposition followed by a noun phrase
• Pick a preposition at random (with)
• Expand another noun phrase
• More random words from the parts of speech (a, bat)
The completed tree:
sentence
    noun phrase
        article: the
        noun: girl
    verb phrase
        verb: hit
        noun phrase
            article: the
            noun: boy
        prepositional phrase
            preposition: with
            noun phrase
                article: a
                noun: bat
Representing the Vocabulary
nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair',
'fence', 'table', 'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below',
'above', 'beside']
articles = ['a', 'the']
Use a list of words for each part of speech (lexical category)
Picking a Word at Random
import random
print(random.choice(verbs))
# Prints a randomly chosen verb
The random module includes functions to select numbers,
sequence elements, etc., at random
Grammar Rules
sentence = nounphrase verbphrase
nounphrase = article noun
verbphrase = verb nounphrase prepositionalphrase
prepositionalphrase = preposition nounphrase
A sentence is a noun phrase followed by a verb phrase
Etc., etc.
Define a Function for Each Rule
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()
Each function builds and returns a string that is an instance
of the phrase
Separate phrases and words with a space
Define a Function for Each Rule
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()
# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)
When a part of speech is reached, select an instance at
random from the relevant list of words
Call sentence() to Try It Out
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()
# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)
…
# Display 10 sentences
for x in range(10):
    print(sentence())
You can also generate examples of the other phrases by
calling their functions
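The slides elide the remaining functions with "…". The following is a minimal sketch of a complete generator, assuming that verbphrase() and prepositionalphrase() are written by analogy with the two functions shown and follow the grammar rules listed earlier. Their bodies are an assumption, not code taken from the slides.

```python
import random

nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair',
         'fence', 'table', 'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below',
                'above', 'beside']
articles = ['a', 'the']

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

# verbphrase = verb nounphrase prepositionalphrase
# (assumed body, written by analogy with the functions above)
def verbphrase():
    return (random.choice(verbs) + ' ' + nounphrase() + ' ' +
            prepositionalphrase())

# prepositionalphrase = preposition nounphrase
# (assumed body, written by analogy with the functions above)
def prepositionalphrase():
    return random.choice(prepositions) + ' ' + nounphrase()

if __name__ == '__main__':
    # Display 5 random sentences
    for x in range(5):
        print(sentence())
```

With these rules every sentence has exactly eight words: article, noun, verb, article, noun, preposition, article, noun.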
Kinds of Symbols in a Grammar
• Terminal symbols: words in the vocabulary
of the language
• Non-terminal symbols: words that describe
phrases or portions of sentences
• Metasymbols: used to construct rules
Metasymbols for a Grammar
Metasymbols    Use
""             Enclose literal items
=              Means "is defined as"
[ ]            Enclose optional items
{ }            Enclose zero or more items
( )            Group together required choices
|              Indicates a choice
A Grammar of Arithmetic Expressions
expression = term { addingOperator term }
term = factor { multiplyingOperator factor }
factor = primary ["^" primary ]
primary = number | "(" expression ")"
number = digit { digit }
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
addingOperator = "+" | "-"
multiplyingOperator = "*" | "/"
Example sentences: 3, 4 + 5, 5 + 2 * 3, (5 + 2) * 3 ^ 4
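One way to read the metasymbols is as control structures: { } behaves like a loop, [ ] like an optional branch, and | like a choice. The sketch below applies the sentence-generator idea to this grammar, with one function per rule. The depth limit and the probabilities are assumptions, added only so that the recursion through primary always terminates.

```python
import random

DIGITS = '0123456789'

def expression(depth=0):
    # expression = term { addingOperator term }
    result = term(depth)
    while random.random() < 0.3:          # { } : zero or more repetitions
        result += ' ' + random.choice('+-') + ' ' + term(depth)
    return result

def term(depth=0):
    # term = factor { multiplyingOperator factor }
    result = factor(depth)
    while random.random() < 0.3:
        result += ' ' + random.choice('*/') + ' ' + factor(depth)
    return result

def factor(depth=0):
    # factor = primary [ "^" primary ]
    result = primary(depth)
    if random.random() < 0.2:             # [ ] : optional item
        result += ' ^ ' + primary(depth)
    return result

def primary(depth=0):
    # primary = number | "(" expression ")"
    # | : a choice; past the (assumed) depth limit, always pick number
    if depth >= 3 or random.random() < 0.7:
        return number()
    return '(' + expression(depth + 1) + ')'

def number():
    # number = digit { digit }
    result = random.choice(DIGITS)
    while random.random() < 0.3:
        result += random.choice(DIGITS)
    return result

if __name__ == '__main__':
    for x in range(5):
        print(expression())
```

Note that expression() calls term(), which calls factor(), which calls primary(), which may call expression() again. That indirect recursion is what lets parentheses nest.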
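Alternative Notation: Train Track
term = factor { multiplyingOperator factor }
[Train track diagram: factor, with a loop back through * or / to another factor]
primary = number | "(" expression ")"
[Train track diagram: one track through number, an alternative track through ( expression )]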
Parsing
• A parser analyzes a source program to
determine whether or not it is syntactically
correct
Source language program -> Parser -> OK or not OK, plus syntax error messages
Scanning
• A scanner picks out words in a source
program and sends these to the parser
Source language program -> Scanner -> Tokens -> Parser -> OK or not OK
The scanner reports lexical error messages; the parser reports syntax error messages
The Scanner Interface
Scanner(aString)    Creates a scanner on a source string
get()               Returns the current token (at the cursor)
next()              Advances the cursor to the next token
Tokens
• A Token object has two attributes:
– type (indicating an operand or operator)
– value (an int if it’s an operand, or the source string otherwise)
• Token types are
– Token.EOE
– Token.PLUS, Token.MINUS
– Token.MUL, Token.DIV
– Token.INT
– Token.UNKNOWN
The Token Interface
Token(source)     Creates a token from a source string
str(aToken)       String representation
isOperator()      True if an operator, false otherwise
getType()         Returns the type
getValue()        Returns the value
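The slides give only the interfaces of Scanner and Token. The following is one possible implementation, offered as a sketch rather than the course's actual code. Assumptions: the numeric values of the type codes, the use of a regular expression to split the source into words, the private names _tokens and _pos, and the convention that an empty source string marks the end of the expression. Token.L_PAR and Token.R_PAR are included because the primary() parsing method refers to them.

```python
import re

class Token:
    # Type codes; the numeric values are arbitrary (an assumption).
    EOE, PLUS, MINUS, MUL, DIV, INT, L_PAR, R_PAR, UNKNOWN = range(9)
    _OPERATORS = {'+': PLUS, '-': MINUS, '*': MUL, '/': DIV}
    _PARENS = {'(': L_PAR, ')': R_PAR}

    def __init__(self, source):
        """Creates a token from a source string ('' marks end of expression)."""
        if source == '':
            self._type, self._value = Token.EOE, source
        elif source.isdecimal():
            self._type, self._value = Token.INT, int(source)
        elif source in Token._OPERATORS:
            self._type, self._value = Token._OPERATORS[source], source
        elif source in Token._PARENS:
            self._type, self._value = Token._PARENS[source], source
        else:
            self._type, self._value = Token.UNKNOWN, source

    def __str__(self):
        return str(self._value)

    def isOperator(self):
        return self._type in Token._OPERATORS.values()

    def getType(self):
        return self._type

    def getValue(self):
        return self._value


class Scanner:
    def __init__(self, aString):
        """Creates a scanner on a source string."""
        # A word is a run of digits or any single non-space character.
        words = re.findall(r'\d+|\S', aString)
        self._tokens = [Token(w) for w in words] + [Token('')]
        self._pos = 0

    def get(self):
        """Returns the current token (at the cursor)."""
        return self._tokens[self._pos]

    def next(self):
        """Advances the cursor to the next token, stopping at EOE."""
        if self._pos < len(self._tokens) - 1:
            self._pos += 1


if __name__ == '__main__':
    scanner = Scanner("(5 + 2) * 3")
    while scanner.get().getType() != Token.EOE:
        print(scanner.get())
        scanner.next()
```

Scanning the whole string up front keeps get() and next() trivial; a scanner that reads one character at a time on demand would expose the same interface.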
Recursive Descent Parsing
• Each rule in the grammar translates to a
Python parsing method
expression = term { addingOperator term }
def expression(self):
    self.term()                      # the first, required term
    token = self.scanner.get()
    # { addingOperator term }: zero or more repetitions
    while token.getType() in (Token.PLUS, Token.MINUS):
        self.scanner.next()
        self.term()
        token = self.scanner.get()
Recursive Descent Parsing
• Each method is responsible for a phrase in
an expression
term = factor { multiplyingOperator factor }
def term(self):
    self.factor()                    # the first, required factor
    token = self.scanner.get()
    # { multiplyingOperator factor }: zero or more repetitions
    while token.getType() in (Token.MUL, Token.DIV):
        self.scanner.next()
        self.factor()
        token = self.scanner.get()
Recursive Descent Parsing
primary = number | "(" expression ")"
def primary(self):
    token = self.scanner.get()
    if token.getType() == Token.INT:
        self.scanner.next()
    elif token.getType() == Token.L_PAR:
        self.scanner.next()
        self.expression()            # recursive call for the nested expression
        self.accept(self.scanner.get(),
                    Token.R_PAR, "')' expected")
        self.scanner.next()
    else:
        self.fatalError(token, "bad primary")
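The three methods above can be assembled into a runnable recognizer. This is a sketch under stated assumptions, not the course's parser: the slides do not show factor(), accept(), fatalError(), or any driver method, so those are written here from the grammar rules and from how primary() calls them. The names parse, ParseError, and Token.EXPO are hypothetical, and the compact Token and Scanner classes are stand-ins that follow the interfaces described earlier.

```python
import re

class ParseError(Exception):
    """Hypothetical exception type used to report syntax errors."""

class Token:
    EOE, PLUS, MINUS, MUL, DIV, EXPO, INT, L_PAR, R_PAR, UNKNOWN = range(10)
    _TYPES = {'': EOE, '+': PLUS, '-': MINUS, '*': MUL, '/': DIV,
              '^': EXPO, '(': L_PAR, ')': R_PAR}

    def __init__(self, source):
        self._source = source
        if source.isdecimal():
            self._type = Token.INT
        else:
            self._type = Token._TYPES.get(source, Token.UNKNOWN)

    def __str__(self):
        return self._source if self._source else 'end of expression'

    def getType(self):
        return self._type

class Scanner:
    def __init__(self, aString):
        words = re.findall(r'\d+|\S', aString)
        self._tokens = [Token(w) for w in words] + [Token('')]
        self._pos = 0

    def get(self):
        return self._tokens[self._pos]

    def next(self):
        if self._pos < len(self._tokens) - 1:
            self._pos += 1

class Parser:
    def parse(self, source):
        """Returns 'OK' or a syntax error message for the source string."""
        self.scanner = Scanner(source)
        try:
            self.expression()
            self.accept(self.scanner.get(), Token.EOE, "unexpected symbol")
            return "OK"
        except ParseError as e:
            return "Syntax error: " + str(e)

    def accept(self, token, expected, message):
        # Assumed helper: complain unless the token has the expected type.
        if token.getType() != expected:
            self.fatalError(token, message)

    def fatalError(self, token, message):
        # Assumed helper: abandon the parse with an error message.
        raise ParseError(message + " at '" + str(token) + "'")

    def expression(self):
        # expression = term { addingOperator term }
        self.term()
        token = self.scanner.get()
        while token.getType() in (Token.PLUS, Token.MINUS):
            self.scanner.next()
            self.term()
            token = self.scanner.get()

    def term(self):
        # term = factor { multiplyingOperator factor }
        self.factor()
        token = self.scanner.get()
        while token.getType() in (Token.MUL, Token.DIV):
            self.scanner.next()
            self.factor()
            token = self.scanner.get()

    def factor(self):
        # factor = primary [ "^" primary ]
        self.primary()
        if self.scanner.get().getType() == Token.EXPO:
            self.scanner.next()
            self.primary()

    def primary(self):
        # primary = number | "(" expression ")"
        token = self.scanner.get()
        if token.getType() == Token.INT:
            self.scanner.next()
        elif token.getType() == Token.L_PAR:
            self.scanner.next()
            self.expression()
            self.accept(self.scanner.get(), Token.R_PAR, "')' expected")
            self.scanner.next()
        else:
            self.fatalError(token, "bad primary")

if __name__ == '__main__':
    parser = Parser()
    for source in ["3", "5 + 2 * 3", "(5 + 2) * 3 ^ 4", "(5 + 2", "4 + * 5"]:
        print(source, "->", parser.parse(source))
```

The final accept() call in parse() checks that the whole input was consumed, so a string such as "3 4" is rejected even though it begins with a valid expression.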
For Monday
Expression Trees