ARTIFICIAL INTELLIGENCE
BCA - 601
This SIM has been prepared exclusively under the guidance of Punjab Technical
University (PTU) and reviewed by experts and approved by the concerned statutory
Board of Studies (BOS). It conforms to the syllabi and contents as approved by the
BOS of PTU.
Reviewer
Dr. N. Ch. S.N. Iyengar
Senior Professor, School of Computing Sciences,
VIT University, Vellore
Author: Prof. Angajala Srinivasa Rao
Copyright © Author, 2009
Reprint 2010
All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Publisher.
Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and are correct to the best of their
knowledge. However, the Publisher and its Authors shall in no event be liable for any errors, omissions
or damages arising out of use of this information and specifically disclaim any implied warranties
of merchantability or fitness for any particular use.
Vikas® is the registered trademark of Vikas® Publishing House Pvt. Ltd.
VIKAS® PUBLISHING HOUSE PVT LTD
E-28, Sector-8, Noida - 201301 (UP)
Phone: 0120-4078900 • Fax: 0120-4078999
Regd. Office: 576, Masjid Road, Jangpura, New Delhi 110 014
• Website: www.vikaspublishing.com • Email: [email protected]
CAREER OPPORTUNITIES
Computer software engineers are projected to be one of the
fastest growing occupations in the next decade. To improve
the intelligence of computer systems, computer applications
software engineers analyze users' needs and design, construct,
and maintain specialized utility programs. Software engineers
coordinate the construction and maintenance of a company's
computer systems and plan their future growth. Software
engineers work for companies that configure, implement, and
install complete computer systems.
Increasing emphasis on the intelligence of computers suggests that
software engineers with advanced degrees that include
artificial intelligence will be sought after by software
developers, government agencies, and consulting firms
specializing in information assurance and security.
PTU DEP SYLLABI-BOOK MAPPING TABLE
Artificial Intelligence

SECTION I
Introduction to AI: Definitions, AI problems, the underlying assumption, and AI techniques, Level of Model, Criteria for Success.
Problems, Problem Space and Search: Defining the problem as a state space search, Production System, Problem Characteristics, Production System Characteristics, issues in design of search programs.
Mapping in Book: Unit 1: What is Artificial Intelligence? (Pages 3-30); Unit 2: Problems, Problem Spaces and Search (Pages 31-62)

SECTION II
Knowledge Representation Issues: Representation and mapping, approaches to knowledge representation, issues in knowledge representation, the frame problem.
Knowledge Representation Using Predicate Logic: Representing simple facts in logic, representing instance and isa relationships, resolution.
Mapping in Book: Unit 3: Knowledge Representation Issues (Pages 63-83); Unit 4: Using Predicate Logic (Pages 85-112)

SECTION III
Weak Slot and Filler Structures: Semantic nets, frames as sets and instances.
Strong Slot and Filler Structures: Conceptual dependency, scripts, CYC.
Natural Language Processing: Syntactic processing, semantic analysis, discourse and pragmatic processing.
Mapping in Book: Unit 5: Weak Slot and Filler Structures (Pages 113-132); Unit 6: Strong Slot and Filler Structures (Pages 133-153); Unit 7: Natural Language Processing (Pages 155-190)
CONTENTS
INTRODUCTION 1

UNIT 1 WHAT IS ARTIFICIAL INTELLIGENCE? 3-30
1.0 Introduction
1.1 Unit Objectives
1.2 Basics of Artificial Intelligence (AI)
    1.2.1 AI Influence on Communication Systems
    1.2.2 Timeline of Telecommunication-related Technology
    1.2.3 Timeline of AI-related Technology
    1.2.4 Timeline of AI Events
    1.2.5 Modern Digital Communications
    1.2.6 Building Intelligent Communication Systems
1.3 Beginning of AI
    1.3.1 Definitions
    1.3.2 AI Vocabulary
1.4 Problems of AI
1.5 Branches of AI
1.6 Applications of AI
    1.6.1 Perception
    1.6.2 Robotics
    1.6.3 Natural Language Processing
    1.6.4 Planning
    1.6.5 Expert Systems
    1.6.6 Machine Learning
    1.6.7 Theorem Proving
    1.6.8 Symbolic Mathematics
    1.6.9 Game Playing
1.7 Underlying Assumption of AI
    1.7.1 Physical Symbol System Hypothesis
1.8 AI Technique
    1.8.1 Tic-Tac-Toe
    1.8.2 Question Answering
1.9 Level of the AI Model
1.10 Criteria for Success
1.11 Conclusion
1.12 Summary
1.13 Key Terms
1.14 Answers to 'Check Your Progress'
1.15 Questions and Exercises
1.16 Further Reading
UNIT 2 PROBLEMS, PROBLEM SPACES AND SEARCH 31-62
2.0 Introduction
2.1 Unit Objectives
2.2 Problem of Building a System
    2.2.1 AI - General Problem Solving
2.3 Defining Problem in a State Space Search
    2.3.1 State Space Search
    2.3.2 The Water Jug Problem
2.4 Production Systems
    2.4.1 Control Strategies
    2.4.2 Search Algorithms
    2.4.3 Heuristics
2.5 Problem Characteristics
    2.5.1 Problem Decomposition
    2.5.2 Can Solution Steps be Ignored?
    2.5.3 Is the Problem Universe Predictable?
    2.5.4 Is Good Solution Absolute or Relative?
2.6 Characteristics of Production Systems
2.7 Design of Search Programs and Solutions
    2.7.1 Design of Search Programs
    2.7.2 Additional Problems
2.8 Summary
2.9 Key Terms
2.10 Answers to 'Check Your Progress'
2.11 Questions and Exercises
2.12 Further Reading
UNIT 3 KNOWLEDGE REPRESENTATION ISSUES 63-83
3.0 Introduction
3.1 Unit Objectives
3.2 Knowledge Representation
    3.2.1 Approaches to AI Goals
    3.2.2 Fundamental System Issues
    3.2.3 Knowledge Progression
    3.2.4 Knowledge Model
    3.2.5 Knowledge Typology Map
3.3 Representation and Mappings
    3.3.1 Framework of Knowledge Representation
    3.3.2 Representation of Facts
    3.3.3 Using Knowledge
3.4 Approaches to Knowledge Representation
    3.4.1 Properties for Knowledge Representation Systems
    3.4.2 Simple Relational Knowledge
3.5 Issues in Knowledge Representation
    3.5.1 Important Attributes and Relationships
    3.5.2 Granularity
    3.5.3 Representing Set of Objects
    3.5.4 Finding Right Structure
3.6 The Frame Problem
    3.6.1 The Frame Problem in Logic
3.7 Summary
3.8 Key Terms
3.9 Questions and Exercises
3.10 Further Reading
UNIT 4 USING PREDICATE LOGIC 85-112
4.0 Introduction
4.1 Unit Objectives
4.2 The Basics of Logic
4.3 Representing Simple Facts in Logic
    4.3.1 Logic
4.4 Representing Instance and Isa Relationships
4.5 Computable Functions and Predicates
4.6 Resolution
    4.6.1 Algorithm: Convert to Clausal Form
    4.6.2 The Basis of Resolution
    4.6.3 Resolution in Propositional Logic
    4.6.4 The Unification Algorithm
    4.6.5 Resolution in Predicate Logic
4.7 Natural Deduction
4.8 Summary
4.9 Key Terms
4.10 Answers to 'Check Your Progress'
4.11 Questions and Exercises
4.12 Further Reading
UNIT 5 WEAK SLOT AND FILLER STRUCTURES 113-132
5.0 Introduction
5.1 Unit Objectives
5.2 Semantic Nets
    5.2.1 Representation in a Semantic Net
    5.2.2 Inference in a Semantic Net
    5.2.3 Extending Semantic Nets
5.3 Frames
    5.3.1 Frame Knowledge Representation
    5.3.2 Distinction between Sets and Instances
5.4 Summary
5.5 Key Terms
5.6 Answers to 'Check Your Progress'
5.7 Questions and Exercises
5.8 Further Reading
UNIT 6 STRONG SLOT AND FILLER STRUCTURES 133-153
6.0 Introduction
6.1 Unit Objectives
6.2 Conceptual Dependency
6.3 Scripts
6.4 CYC
    6.4.1 Motivation
6.5 CYCL
6.6 Global Ontology
6.7 Summary
6.8 Key Terms
6.9 Answers to 'Check Your Progress'
6.10 Questions and Exercises
6.11 Further Reading
UNIT 7 NATURAL LANGUAGE PROCESSING 155-190
7.0 Introduction
7.1 Unit Objectives
7.2 The Problem of Natural Language
    7.2.1 Steps in the Process
    7.2.2 Why is Language Processing Difficult?
7.3 Syntactic Processing
7.4 Augmented Transition Networks
7.5 Semantic Analysis
7.6 Discourse and Pragmatic Processing
    7.6.1 Conversational Postulates
7.7 General Comments
7.8 Summary
7.9 Key Terms
7.10 Answers to 'Check Your Progress'
7.11 Questions and Exercises
7.12 Further Reading
INTRODUCTION
Artificial Intelligence (AI) is a technique that helps create software programs to
make computers perform operations, which require human intelligence. Currently,
AI is being used in various application areas such as intelligent game playing, natural
language processing, vision processing and speech processing. In AI, a large amount
of knowledge is required to solve problems such as natural language understanding
and generation. To represent knowledge in AI, predicate logic is used, which helps
represent simple facts. Artificial intelligence is also referred to as computational
intelligence.
This book, Artificial Intelligence is divided into seven units. The first unit
covers the basics of AI whereas the second unit covers general problem solving,
heuristics, control strategies, problem decomposition and design of search programs.
The third unit discusses knowledge representation issues, including knowledge
progression, the knowledge model and the typology map.
The fourth unit focuses on predicate logic, covering the basis of resolution,
the unification algorithm and resolution in propositional logic as well as predicate
logic.
Units five and six are devoted to weak and strong slots and filler structures.
While the former discusses semantic nets and frames, the latter discusses global
ontology, conceptual dependency, scripts, CYC, CYCL and motivation.
The last unit is devoted to natural language processing, syntactic processing,
semantic analysis, discourse and pragmatic processing and conversational postulates.
The book follows the self-instructional mode wherein each Unit begins with
an Introduction to the topic. The Unit Objectives are then outlined before going on
to the presentation of the detailed content in a simple and structured format. Check
Your Progress questions are provided at regular intervals to test the student’s
understanding of the subject. A Summary, a list of Key Terms and a set of Questions
and Exercises are provided at the end of each unit for recapitulation.
Self-Instructional
Material
1
UNIT 1 WHAT IS ARTIFICIAL INTELLIGENCE?
Structure
1.0 Introduction
1.1 Unit Objectives
1.2 Basics of Artificial Intelligence (AI)
    1.2.1 AI Influence on Communication Systems
    1.2.2 Timeline of Telecommunication-related Technology
    1.2.3 Timeline of AI-related Technology
    1.2.4 Timeline of AI Events
    1.2.5 Modern Digital Communications
    1.2.6 Building Intelligent Communication Systems
1.3 Beginning of AI
    1.3.1 Definitions
    1.3.2 AI Vocabulary
1.4 Problems of AI
1.5 Branches of AI
1.6 Applications of AI
    1.6.1 Perception
    1.6.2 Robotics
    1.6.3 Natural Language Processing
    1.6.4 Planning
    1.6.5 Expert Systems
    1.6.6 Machine Learning
    1.6.7 Theorem Proving
    1.6.8 Symbolic Mathematics
    1.6.9 Game Playing
1.7 Underlying Assumption of AI
    1.7.1 Physical Symbol System Hypothesis
1.8 AI Technique
    1.8.1 Tic-Tac-Toe
    1.8.2 Question Answering
1.9 Level of the AI Model
1.10 Criteria for Success
1.11 Conclusion
1.12 Summary
1.13 Key Terms
1.14 Answers to 'Check Your Progress'
1.15 Questions and Exercises
1.16 Further Reading
1.0 INTRODUCTION
In this unit, you will learn about the basics of artificial intelligence (AI), a key
technology today, used in many applications. AI is also known as Computational
Intelligence.
With the development of the digital electronic computer in 1941, the
technology to create machine intelligence was finally available. The term 'Artificial
Intelligence' was first used in 1956 at the Dartmouth conference, and since then AI
has expanded with theories and principles developed by dedicated researchers.
Though advances in the field of AI have been slower than first estimated, progress
continues to be made. During the last four decades, a variety of AI programs have
been developed in sync with other technological advancements.
In 1941, the development of the digital computer revolutionized the storage and
processing of information in the US and Germany. The early computers required
large, separate air-conditioned rooms and were a programmer's nightmare: running
a program involved separately configuring thousands of wires.
The stored-program computer, a 1949 innovation, made the job of entering a
program easier. Further advances in computer theory led to the development of
computer science and eventually to AI. With the invention of an electronic means
of processing data came a medium that made AI possible.
1.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Explain the basics of AI
• Discuss the advantages and constraints of AI
• Explain the beginning of AI
• Describe the key applications of AI
• Understand the underlying assumption of AI
• Discuss AI techniques
• Explain the level of the model
• Describe the criteria for success
1.2 BASICS OF ARTIFICIAL INTELLIGENCE
Artificial intelligence is 'the learning and devising of an intelligent system': one
that perceives its surroundings and initiates actions that maximize its chance of
success. The term AI is also used to describe the 'intelligence' that such a system
demonstrates. AI draws tools and insights from many fields, including linguistics,
psychology, computer science, cognitive science, neuroscience, probability,
optimization and logic. Since the human brain uses many techniques to plan and
cross-check facts, system integration is now seen as essential for AI.
1.2.1 AI Influence on Communication Systems
• The telephone is one of the most marvellous inventions of the communications
era: physical distance is conquered instantly.
  ■ Any telephone in the world can be reached through a vast communication
    network that spans oceans and continents.
  ■ The form of communication is natural, namely human speech.
• Communication is no longer just two persons talking; it is much more.
  ■ Humans communicate with a knowledge source to gather facts.
  ■ Humans communicate with intelligent systems to solve problems requiring
    higher mental processes.
  ■ Humans communicate with experts to seek specialized opinion.
  ■ Humans communicate with logic machines to seek guidance.
  ■ Humans communicate with reasoning systems to get new knowledge.
  ■ Humans communicate for many more purposes.
• Advances in communication technologies have led to increased worldwide
connectivity, while new technology like the cell phone has increased mobility.
1.2.2 Timeline of Telecommunication-related Technology
• Roots of Communication: The development of communication systems began
two centuries ago with wire-based electrical systems called the telegraph and
the telephone. Before that, messages were carried by human messengers on
foot or horseback; Egypt and China built messenger relay stations.
• Electric Telegraph was invented by Samuel Morse in 1831.
• Morse code was invented by Samuel Morse in 1835, a method for transmitting
telegraphic information.
• Typewriter was invented by Christopher Latham Sholes in 1867, the first
practical typewriting machine for business offices.
• Telephone was invented by Alexander Graham Bell in 1875, an instrument
through which speech sounds, not voice, were first transmitted electrically.
• Telephone Exchange, a Rotary Dialling system became operational in New
Haven, Connecticut in 1878.
• Kodak Camera was invented in 1888 by George Eastman, who also introduced
Rolled Photographic Film.
• Telegraphone was invented by Valdemar Poulsen in 1899; it was a tape
recorder whose recording medium was a magnetized steel tape.
• Wireless telegraphy was invented by Guglielmo Marconi in 1902; it
transmitted MF band radio signals across the Atlantic Ocean, from Cornwall
to Newfoundland.
• Audion vacuum tube was invented by Lee Deforest in 1906, a two-electrode
detector device, extended in 1908 into a three-electrode amplifier device.
• Cross-continental telephone call, made by Graham Bell in 1914, marked the
evolution of the telegraph into the telephone; the Greek word tele means
'from afar' and phone means 'voice'.
• Radios with tuners came in 1916, a technological revolution of its time; pilots
directed the gunfire of artillery batteries on the ground through a wireless
operator attached to each battery.
• Iconoscope was invented by Vladimir Zworykin in 1923, a tube for television
camera needed for TV transmission.
• Television system was invented by John Logie Baird in 1925; in 1927 it
transmitted TV signals between London and Glasgow over a telephone line.
• Radio networks came in 1927, a system which distributed programming
(contents) to multiple stations simultaneously, for the purpose of extending
total coverage beyond the limits of a single broadcast station.
Radio broadcasting is traditionally through air as radio waves; it can be via
cable, FM, local wire networks, satellite and internet.
Stations are linked in radio networks, either in syndication or simulcast or both.
1.2.3 Timeline of AI-related Technology
• Roots of AI: The development of artificial intelligence actually began centuries
ago, long before the computer.
• Roman Abacus (5000 years ago): machine with memory.
• Pascaline (1652): Calculating machines that mechanized arithmetic.
• Diff. Engine (1849): Mechanical calculating machines programmable to
tabulate polynomial functions.
• Boolean algebra (1854): 'An Investigation of the Laws of Thought', a symbolic
calculus for logic.
• Turing machine (1936): an abstract symbol-manipulating device that can be
adapted to simulate the logic of any computer; the first computer invented
(on paper only).
• Von Neumann architecture (1945): Computer design model, a processing
unit and a shared memory structure to hold both instructions and data.
• ENIAC (1946): Electronic Numerical Integrator and Computer, the first
electronic 'general-purpose' digital computer, by Eckert and Mauchly.
1.2.4 Timeline of AI Events
The concept of AI as a true scientific pursuit is very new; for centuries it remained
a plot for popular science fiction stories. Most researchers date the beginning of
AI to Alan Turing.
• Turing test, proposed by Alan Turing in 1950 in the paper 'Computing
Machinery and Intelligence', is used to measure machine intelligence.
• Intelligent behaviour: Norbert Wiener in 1950 observed the link between
human intelligence and machines, and theorized about intelligent behaviour.
• Logic Theorist, a program by Allen Newell and Herbert Simon in 1955, proved
38 of the first 52 theorems in Principia Mathematica; its authors claimed
that machines can contain minds just as human bodies do.
• Birth of AI: the Dartmouth Summer Research Conference on Artificial
Intelligence in 1956, organized by John McCarthy, who is regarded as the
father of AI.
• Seven years later, in 1963, AI began to pick up momentum. The field was
still undefined, and the ideas formed at the conference were re-examined.
• In 1957, General Problem Solver (GPS) was tested. GPS was an extension of
Wiener's feedback principle, capable of solving common-sense problems to
a great extent.
• In 1958, LISP language was invented by McCarthy and soon adopted as the
language of choice among most AI developers.
• In 1963, DoD's Advanced Research Projects Agency started research at MIT
into Machine-Aided Cognition (artificial intelligence), drawing computer
scientists from around the world.
• In 1968, the micro-world program SHRDLU, at MIT, controlled a robot arm
operating above a flat surface scattered with play blocks.
• In mid-1970s, Expert systems for Medical diagnosis (Mycin), Chemical data
analysis (Dendral) and Mineral exploration (Prospector) were developed.
• During the 1970s, Computer vision (CV) technology of machines that ‘see’
emerged. David Marr was first to model the functions of the visual system.
• In 1972, Prolog, by Alain Colmerauer, introduced logic programming: the
use of logic as both a declarative and a procedural representation language.
1.2.5 Modern Digital Communications
In 1947, Shannon created a mathematical theory of communication, which formed
the basis for modern digital communications. Since then, developments have
included:
• 1960s: three geosynchronous communication satellites, launched by NASA.
• 1961: packet switching theory was published by Leonard Kleinrock at MIT.
• 1965: the first wide-area computer network, over a low-speed dial-up
telephone line, created by Lawrence Roberts and Thomas Merrill.
• 1966: Optical fibre was used for transmission of telephone signals.
• Late 1966: Roberts went to DARPA to develop the computer network concept
and put forward his plan for the 'Advanced Research Projects Agency
Network – ARPANET', which he presented at a conference in 1967. There
Paul Baran and others at RAND presented a paper on packet switching
networks for secure voice communication in military use.
• 1968: Roberts and DARPA revised the overall structure and specifications
for the ARPANET and released an RFQ for development of its key
components: the packet switches called Interface Message Processors (IMPs).
• Architectural design by Bob Kahn.
• Network topology and economics design and optimization by Roberts with
Howard Frank and his team at Network Analysis Corporation.
• Data networking technology, network measurement system preparation by
Kleinrock’s team at University of California, Los Angeles (UCLA).
The year 1969 saw the beginning of the Internet era with the development of
ARPANET, an unprecedented integration of the capabilities of the telegraph,
telephone, radio, television, satellite, optical fibre and computer.
In 1969, a day after Labour Day, UCLA became the first node to join the
ARPANET: the UCLA team connected the first switch (IMP) to the first host
computer (a minicomputer from Honeywell).
A month later the second node was added at SRI (Stanford Research Institute),
and the first host-to-host message on the Internet was launched from UCLA.
It worked in a clever way, as described below:
• To log on to the SRI Host from the UCLA Host, programmers typed in 'log'
and the system at SRI added 'in', thus creating the word 'login'.
• Programmers could communicate by voice as the message was transmitted
using telephone headsets at both ends.
• Programmers at the UCLA end typed in the 'l' and asked SRI if they received
it; back came the voice reply, 'got the l'.
• By the end of 1969, four nodes were connected (UCLA, SRI, UC Santa
Barbara and the University of Utah). UCLA served for many years as the
ARPANET Measurement Centre.
• In the mid-1970s, UCLA controlled a geosynchronous satellite by sending
messages through ARPANET from California to an East Coast satellite dish.
By 1970, ten nodes were connected.
• In 1972, the International Network Working Group (INWG) was formed to
further explore packet switching concepts and internetworking, as there
would be multiple independent networks of arbitrary design.
• In 1973, Kahn developed a protocol that could meet the needs of an open
architecture network environment. This protocol is popularly known as
TCP/IP (Transmission Control Protocol/Internet Protocol).
Note: TCP/IP is named after the two most important protocols in it.
• IP is responsible for moving packets of data from node to node.
• TCP is responsible for verifying delivery of data from client to server.
• Sockets are subroutines that provide access to TCP/IP on most systems.
• In 1976, the X.25 protocol was developed for public packet networking.
• In 1977, the first internetwork was demonstrated by Cerf and Kahn. They
connected three networks with TCP: the Packet Radio network, the Satellite
network (SATNET) and the Advanced Research Projects Agency Network
(ARPANET).
• In the 1980s, ARPANET evolved into the Internet. The Internet is officially
defined as the networks using TCP/IP.
• On 1 January 1983, the ARPANET and every other network attached to it
officially adopted the TCP/IP networking protocol. From then on, all
networks using TCP/IP have been collectively known as the Internet. The
standardization of TCP/IP allowed the number of Internet sites and users to
grow exponentially.
• Today, the Internet has millions of computers and hundreds of thousands of
networks. The network traffic is dominated by its ability to promote
'people-to-people' interaction.
1.2.6 Building Intelligent Communication Systems
Modern telecommunications are based on coordination and utilization of individual
services, such as telephony, cellular, cable, microwave terrestrial and satellite, and
their integration into a seamless global network. Successful design, planning,
coordination, management and financing of a global communications network
require a broad understanding of these services, their costs and their advantages and
limitations. The next generation of wireless and wired communication systems and
networks has requirements for many AI and AI-related concepts, algorithms,
techniques and technologies. Research groups are currently exploring and using
Semantic Web languages in monitoring, learning, adaptation and reasoning.
To present an idea of intelligent communication systems, the examples of an
'Intelligent Mobile Platform', 'Voice Recognition across Mobile Phones' and
DARPA's 'Knowledge Based Networking' project are briefly illustrated below.
Ex.1 Intelligent Mobile Platform: Magitti
The mobile phone is no longer a simple two-way communication device. Intelligent
mobiles infer our behaviour and suggest appropriate lists of restaurants, stores and
events.
What distinguishes Magitti from other GPS-enabled mobile applications is artificial
intelligence: Magitti is an intelligent personal assistant.
Magitti's specifications have not been released, but mobile phones are becoming
increasingly powerful, with sensors, entertainment tools, accelerometers, GPS, etc.
Perhaps on-phone AI will make even more sense in the not-so-distant future.
Ex.2 Voice Recognition across mobile phone: Lingo
Mobile phones can do lots of things, but a majority of people never use them for
more than calls and short text messages. A voice recognition-correction interface
across mobile phone applications is coming to market to provide speech recognition.
Now you can talk to your phone and the phone understands you.
Ex.3 Project Knowledge Based Networking by DARPA
Military research aims to develop self-configuring, secure wireless nets. Academic
concepts of Artificial Intelligence and Semantic Web, combined with technologies
such as the Mobile Ad-hoc Network (MANET), Cognitive Radio and Peer-to-Peer
networking provide the elements of such a network. This project by DARPA is
intended for soldiers in the field.
Ex.4 Semantic Web
The Web is the first generation of the World Wide Web and contains static HTML
pages. Web 2.0 is the second generation, focused on the ability of people to
collaborate and share information online; it is dynamic, serves applications to users
and offers open communications. The Web requires a human operator using
computer systems to perform the tasks required to find, search and aggregate its
information. It is impossible for a computer to do these tasks without human
guidance because Web pages are designed specifically for human readers.
Ex. 5 Mobile Ad-hoc Network (MANET)
The traditional wireless mobile networks have been ‘Infrastructure based’ in which
mobile devices communicate with access points like base stations connected to the
fixed network. Typical examples of this kind of wireless networks are GSM, WLL,
WLAN, etc. Approaches to the next generation of wireless mobile networks are
‘Infrastructure less’. MANET is one such network. A MANET is a collection of
wireless nodes that can dynamically form a network to exchange information without
using any existing fixed network infrastructure. This is very important because in
many contexts information exchange between mobile units cannot rely on any fixed
network infrastructure, but on rapid configuration of wireless connections
on-the-fly. The MANET is a self-configuring network of mobile routers and associated
hosts connected by wireless links, the union of which forms an arbitrary topology
that may change rapidly and unpredictably.
MANET concept
A mobile ad-hoc network is a collection of wireless nodes that can dynamically be
set up anywhere and anytime without using any existing network infrastructure. It
is an autonomous system in which mobile hosts connected by wireless links are
free to move randomly and often act as routers at the same time.
The MANET traffic includes:
• Peer-to-Peer: between two nodes
• Remote-to-Remote: two nodes beyond a single hop, and
• Dynamic Traffic: nodes are moving around.
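The traffic types above rely on every node acting as a router, with routes rediscovered as the topology changes. The following is a hypothetical sketch in Python (the topology, node names and the use of breadth-first search are invented for illustration; real MANET routing protocols such as AODV or OLSR are far more elaborate):

```python
from collections import deque

def find_route(links, src, dst):
    """Breadth-first multi-hop route discovery over the current radio links."""
    frontier = deque([[src]])
    visited = {src}
    while frontier:
        path = frontier.popleft()
        if path[-1] == dst:
            return path
        for nxt in links.get(path[-1], ()):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # destination currently unreachable

# Snapshot of an ad-hoc topology: every node can act as a router.
links = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
route = find_route(links, "A", "D")        # remote-to-remote: beyond one hop

# Dynamic traffic: nodes move, the links change, and the route is rediscovered.
links = {"A": {"B", "D"}, "B": {"A"}, "D": {"A"}}
new_route = find_route(links, "A", "D")    # D has moved into direct range
```

In the first snapshot the only route from A to D is the three-hop path through B and C; after the nodes move, the same call finds the single-hop route, with no fixed infrastructure consulted at any point.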
Ex.6 Cognitive Radio
The concept of 'cognitive' radio originated with Defense Advanced Research
Projects Agency (DARPA) scientist Joseph Mitola in 1999, and is the 'next step
up' for the software defined radios (SDR) that are emerging today, primarily in military
applications. Most commercial radios and many two-way communication devices
are hardware-based with predetermined, analog operating parameters.
A Software Defined Radio (SDR) is a radio communication system where
components that have typically been in hardware (e.g. mixers, filters, amplifiers,
modulators/demodulators, detectors, etc.) are implemented using software. The
SDR concept is not new. A basic SDR consists of a computer (PC), a sound card,
some form of RF front end and embedded signal-processing algorithms. An SDR
can receive and transmit different radio protocols simply by running different
software.
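As a toy illustration of implementing 'radio components in software', the sketch below performs AM envelope detection purely in Python. The sample rate, carrier frequency and message signal are invented for the example; a real SDR would apply the same kind of processing to samples delivered by an RF front end.

```python
import math

SAMPLE_RATE = 50_000          # samples per second (assumed for this example)
CARRIER_HZ = 5_000            # carrier frequency (assumed)

def am_modulate(message, t):
    """Classic AM: the carrier's amplitude follows the message signal."""
    return (1.0 + 0.5 * message(t)) * math.cos(2 * math.pi * CARRIER_HZ * t)

def envelope_detect(samples, window=25):
    """Software 'detector': rectify, then smooth with a moving average --
    the job a diode and an RC filter would do in a hardware radio."""
    rectified = [abs(s) for s in samples]
    out = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Transmit a 100 Hz audio tone and recover its envelope entirely in software.
message = lambda t: math.sin(2 * math.pi * 100 * t)
samples = [am_modulate(message, n / SAMPLE_RATE) for n in range(2000)]
envelope = envelope_detect(samples)
```

Swapping `envelope_detect` for a different demodulator function changes the "radio" into a receiver for a different protocol, which is the SDR idea in miniature.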
1.3 BEGINNING OF AI
Although the computer provided the technology necessary for AI, it was not until
the early 1950s that the link between human intelligence and machines was really
observed. Norbert Wiener was one of the first Americans to make observations on
the principle of feedback theory. The most familiar example of feedback theory is
the thermostat: It controls the temperature of an environment by gathering the actual
temperature of the house, comparing it to the desired temperature, and responding
by turning the heat up or down. What was so important about his research into
feedback loops was that Wiener theorized that all intelligent behaviour was the
result of feedback mechanisms—mechanisms that could possibly be simulated by
machines. This discovery influenced much of early development of AI.
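Wiener's feedback loop (sense the temperature, compare it with the target, act on the heater) can be sketched in a few lines of Python. The temperature model and all constants here are invented purely for illustration:

```python
def thermostat_step(room_temp, target, heater_on):
    """One pass of the feedback loop: sense, compare, respond."""
    if room_temp < target - 0.5:       # too cold: turn the heat up
        heater_on = True
    elif room_temp > target + 0.5:     # too warm: turn the heat down
        heater_on = False
    # Toy room model (assumed): the heater adds heat, the walls leak it.
    room_temp += (1.0 if heater_on else 0.0) - 0.3
    return room_temp, heater_on

temp, heater = 15.0, False
for _ in range(100):
    temp, heater = thermostat_step(temp, target=20.0, heater_on=heater)
```

However cold the room starts, repeated passes of the loop pull the temperature into a narrow band around the target; the intelligence lies entirely in feeding the measured output back into the next decision.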
In late 1955, Newell and Simon developed The Logic Theorist, considered
by many to be the first AI program. The program, representing each problem as a
tree model, would attempt to solve it by selecting the branch that would most likely
result in the correct conclusion. The impact that the Logic Theorist made on both
the public and the field of AI has made it a crucial stepping stone in the development
of the field.
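The Logic Theorist itself is not reproduced here, but "selecting the branch most likely to lead to the correct conclusion" is essentially best-first search over a problem tree. A minimal sketch in Python, run on an invented arithmetic puzzle rather than theorem proving (the problem, heuristic and names are all assumptions for illustration):

```python
import heapq

def best_first_search(root, children, promise, is_goal):
    """Explore a problem tree, always expanding the branch that currently
    looks most promising (lower score = more promising)."""
    frontier = [(promise(root), root)]
    while frontier:
        _, node = heapq.heappop(frontier)
        if is_goal(node):
            return node
        for child in children(node):
            heapq.heappush(frontier, (promise(child), child))
    return None

# Toy problem (assumed): reach 20 from 1, where each branch either
# doubles the number or adds 3 to it.
children = lambda n: [n * 2, n + 3] if n < 20 else []
promise = lambda n: abs(20 - n)      # heuristic: distance from the goal
found = best_first_search(1, children, promise, lambda n: n == 20)
```

The heuristic steers the search down the branch closest to the goal at each step, just as the Logic Theorist preferred the branch judged most likely to complete a proof.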
In 1956 John McCarthy, regarded as the father of AI, organized a conference to
draw on the talent and expertise of others interested in machine intelligence
for a month of brainstorming. He invited them to Dartmouth College in New
Hampshire for ‘The Dartmouth Summer Research Project on Artificial Intelligence.’
From that point, because of McCarthy, the field would be known as Artificial
Intelligence. Although not a huge success, the Dartmouth conference did bring
together the founders of AI and served to lay the basis for the future of AI
research.
What is Artificial
Intelligence?
NOTES
The 1956 Dartmouth conference in New Hampshire discussed the possibilities of
simulating human intelligence and thinking in computers. Today, Artificial
Intelligence is a well-established and rapidly growing branch of computer
science. It is the science and engineering of making intelligent machines,
especially intelligent computer programs. It is related to the similar task of
using computers to understand human intelligence: understanding language,
learning, reasoning and solving problems. But AI does not have to confine itself
to methods that are biologically observable.
We will discuss alternative definitions of Artificial Intelligence below.
1.3.1 Definitions
Artificial Intelligence is the study of how to make computers do things which,
at the moment, people do better. This definition is ephemeral because it refers
to the current state of computer science, and it excludes major problem areas
that cannot currently be solved well either by computers or by people.
Alternative - I
Artificial Intelligence is the branch of computer science that is concerned with the
automation of intellectual performance. AI is based upon the principles of computer
science, namely data structures used in knowledge representation, the algorithms
needed to apply that knowledge and the languages and programming techniques
used in their implementation.
Alternative - II
Artificial Intelligence is a field of study that encompasses computational techniques
for performing tasks that seem to require intelligence when performed by humans.
Alternative - III
Artificial Intelligence is the field of study that seeks to explain and emulate
intelligent behaviour in terms of computational processes.
Alternative - IV
Artificial Intelligence is about generating representations and procedures that
automatically or autonomously solve problems heretofore solved by humans.
Alternative - V
Artificial Intelligence is that part of computer science concerned with designing
intelligent computer systems, i.e. computer systems that exhibit the
characteristics we associate with intelligence in human behaviour: understanding
language, learning, reasoning and solving problems.
There are various definitions of Artificial Intelligence by different authors.
Most of these definitions take a technical direction and avoid the philosophical
problems connected with the idea that AI's purpose is to construct an artificial
human. The definitions fall into four categories:
1. Systems that think like humans (focus on reasoning and human framework)
2. Systems that think rationally (focus on reasoning and a general concept of
intelligence)
3. Systems that act like humans (focus on behaviour and human framework)
4. Systems that act rationally (focus on behaviour and a general concept of
intelligence)
What is rationality? - ‘Doing the right thing,’ simply speaking.
Definitions:
1. ‘The art of creating machines that perform functions that require intelligence
when performed by humans’ (Kurzweil). This involves cognitive modelling: we
have to determine how humans think in a literal sense.
2. GPS, the General Problem Solver (Newell and Simon), deals with ‘right
thinking’ and dives into the field of logic. It uses logic to represent the world
and the relationships between objects in it, and to draw conclusions about it.
Problems: it is hard to encode informal knowledge into a formal logic system,
and theorem provers have limitations (e.g. when there is no solution to a given
logical notation).
3. Turing defined intelligent behaviour as the ability to achieve human-level
performance in all cognitive tasks, sufficient to fool a human interrogator (the
Turing Test). Physical contact with the machine is avoided, because physical
appearance is not relevant to exhibiting intelligence. However, the ‘Total Turing
Test’ includes appearance by encompassing visual input and robotics as well.
4. The rational agent: achieving one's goals given one's beliefs. Instead of
focusing on humans, this approach is more general, focusing on agents (which
perceive and act); it is also more general than the strict logical approach
(i.e. thinking rationally).
In practical ways, the Intelligence of a Computer System provides:
• Ability to automatically perform tasks that currently require human operators
• More autonomy in computer systems; fewer requirements for human
interference or monitoring
• Flexibility in dealing with inconsistency in the environment
• Ability to understand what the user wants from limited instructions
• Increase in performance by learning from experience
1.3.2 AI Vocabulary
Intelligence relates to tasks involving higher mental processes, e.g. creativity, solving
problems, pattern recognition, classification, learning, induction, deduction, building
analogies, optimization, language processing, knowledge and many more.
Intelligence is the computational part of the ability to achieve goals.
Intelligent behaviour includes perceiving one's environment, acting in complex
environments, learning and understanding from experience, reasoning to solve
problems and discover hidden knowledge, applying knowledge successfully in new
situations, thinking abstractly, using analogies, communicating with others,
and more.
Science based goals of AI pertain to developing concepts, mechanisms and
understanding biological intelligent behaviour. The emphasis is on understanding
intelligent behaviour.
Engineering based goals of AI relate to developing concepts, theory and practice
of building intelligent machines. The emphasis is on system building.
AI Techniques depict how we represent, manipulate and reason with knowledge in
order to solve problems. Knowledge is a collection of ‘facts’. To manipulate these
facts by a program, a suitable representation is required. A good representation
facilitates problem solving.
Learning denotes changes in a system that are adaptive; in other words, it
enables the system to do the same task(s) more efficiently next time. Programs
can only learn what their formalisms can represent as facts or behaviour.
Applications of AI refers to problem solving, search and control strategies, speech
recognition, natural language understanding, computer vision, expert systems, etc.
1.4 PROBLEMS OF AI
Intelligence does not imply perfect understanding; every intelligent being has limited
perception, memory and computation. Many points on the spectrum of intelligence
versus cost are viable, from insects to humans. AI seeks to understand the
computations required for intelligent behaviour and to produce computer systems
that exhibit intelligence. Aspects of intelligence studied by AI include
perception, communication using human languages, reasoning, planning, learning
and memory.
Let us consider some of the problems that Artificial Intelligence is used to
solve. Early examples are game playing and theorem proving, which involves
resolution. Common sense reasoning formed the basis of GPS, a general problem
solver. Natural language processing met with early success; then the limited
power of computers hindered progress, but currently the topic is experiencing a
revival. Expert systems represent one of the best examples of an application of
AI that appears useful to non-AI people: an expert system solves a particular
subset of problems using knowledge and rules about a particular topic.
The following questions are to be considered before we can step forward:
1. What are the underlying assumptions about intelligence?
2. What kinds of techniques will be useful for solving AI problems?
3. At what level can human intelligence be modelled?
4. How will we know when an intelligent program has been built?
Check Your Progress
1. Who advanced mathematical theory?
2. Web 2.0 is the second generation of the World Wide Web, focused on enabling
people to collaborate and share information online. (True or False)
3. A MANET is a collection of wireless nodes that can dynamically form a network
to exchange information without using any existing fixed network infrastructure.
(True or False)
1.5 BRANCHES OF AI
A list of the branches of AI is given below. However, some branches are surely
missing, because no one has identified them yet. Some of these may be regarded
as concepts or topics rather than full branches.
Logical AI
What a program knows about the world in general, the facts of the specific
situation in which it must act, and its goals are all represented by sentences
of some mathematical logical language. The program decides what to do by
inferring that certain actions are appropriate for achieving its goals.
Search
Artificial Intelligence programs often examine large numbers of possibilities – for
example, moves in a chess game and inferences by a theorem proving program.
Discoveries are frequently made about how to do this more efficiently in various
domains.
Pattern Recognition
When a program makes observations of some kind, it often needs to compare what
it sees with a pattern. For example, a vision program may try to match a pattern
of eyes and a nose in a scene in order to find a face. More complex patterns,
such as those in a natural language text, a chess position or the history of
some event, require quite different methods from the simple patterns that have
been studied the most.
Representation
Usually languages of mathematical logic are used to represent the facts about the
world.
Inference
From some facts, others can be inferred. Mathematical logical deduction is
sufficient for some purposes, but new methods of non-monotonic inference have
been added to logic since the 1970s. The simplest kind of non-monotonic
reasoning is default reasoning, in which a conclusion is inferred by default but
can be withdrawn if there is evidence to the contrary. For example, when we hear
of a bird, we infer that it can fly, but this conclusion can be reversed when we
hear that it is a penguin. It is the possibility that a conclusion may have to
be withdrawn that constitutes the non-monotonic character of the reasoning.
Ordinary logical reasoning is monotonic in that the set of conclusions that can
be drawn from a set of premises is a monotonically increasing function of the
premises. Circumscription is another form of non-monotonic reasoning.
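The bird example can be sketched as a toy program. This is only an illustrative
sketch, not a real non-monotonic logic system; the `can_fly` rule and its
exception set are invented for the example.

```python
# Toy default reasoning: a conclusion drawn by default is withdrawn
# when contrary evidence (e.g. 'penguin') is added to the fact set.

def can_fly(facts):
    """Default rule: a bird flies, unless an exception is known."""
    if "bird" not in facts:
        return False
    exceptions = {"penguin", "ostrich"}
    return not (facts & exceptions)      # withdrawn on contrary evidence

facts = {"bird"}
print(can_fly(facts))        # True: flies by default

facts.add("penguin")         # new evidence arrives
print(can_fly(facts))        # False: the default conclusion is withdrawn
```

Note that the set of conclusions shrinks as the set of facts grows, which is
precisely the non-monotonic behaviour described above.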
Common sense knowledge and Reasoning
This is the area in which AI is farthest from the human level, in spite of the fact that
it has been an active research area since the 1950s. While there has been
considerable progress in developing systems of non-monotonic reasoning and
theories of action, more new ideas are still needed.
Learning from experience
Some rules for learning can be expressed in logic. However, programs can only
learn what facts or behaviours their formalisms can represent, and unfortunately
learning systems are almost all based on very limited abilities to represent
information.
Planning
Planning starts with general facts about the world (especially facts about the effects
of actions), facts about the particular situation and a statement of a goal. From
these, planning programs generate a strategy for achieving the goal. In the most
common cases, the strategy is just a sequence of actions.
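The idea of generating a sequence of actions from facts and a goal can be
illustrated with a minimal sketch. It assumes a toy STRIPS-style action format
(preconditions, add-list, delete-list) and invented actions; real planners are
far more sophisticated.

```python
# A minimal planner: breadth-first search over world states, returning
# the sequence of actions that achieves the goal.
from collections import deque

# STRIPS-style actions: name -> (preconditions, add-list, delete-list).
ACTIONS = {
    "open-door":   ({"door-closed"}, {"door-open"}, {"door-closed"}),
    "walk-inside": ({"door-open"},   {"inside"},    set()),
}

def plan(initial, goal):
    """Search for a sequence of actions whose result satisfies the goal."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:                    # goal facts all hold
            return steps
        for name, (pre, add, dele) in ACTIONS.items():
            if pre <= state:                 # action is applicable
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

print(plan({"door-closed"}, {"inside"}))   # ['open-door', 'walk-inside']
```

Breadth-first search over states is only one (inefficient) way to generate a
plan; it is used here because it keeps the sketch short.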
Epistemology
This is a study of the kinds of knowledge that are required for solving problems in
the world.
Ontology
Ontology is the study of the kinds of things that exist. In AI the programs and
sentences deal with various kinds of objects and we study what these kinds are and
what their basic properties are. Ontology assumed importance from the 1990s.
Heuristics
A heuristic is a way of trying to discover something, or an idea embedded in a
program. The term is used variously in AI. Heuristic functions are used in some
approaches to search to measure how far a node in a search tree seems to be from
a goal. Heuristic predicates compare two nodes in a search tree to see if one is
better than the other, i.e. constitutes an advance toward the goal, and may be
more useful.
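As a small illustration of a heuristic function guiding search, the sketch below
runs greedy best-first search on an invented puzzle: reach a target number from
a start number using the operators ‘add 1’ and ‘double’. The heuristic scores a
node by its numeric distance from the goal; everything here is an assumption
made for the example.

```python
import heapq

def heuristic(n, target):
    """Estimate of how far node n seems to be from the goal."""
    return abs(target - n)

def best_first(start, target):
    """Greedy best-first search: always expand the frontier node that the
    heuristic scores as closest to the goal."""
    frontier = [(heuristic(start, target), start, [start])]
    seen = set()
    while frontier:
        _, n, path = heapq.heappop(frontier)
        if n == target:
            return path
        if n in seen or n > 2 * target:
            continue                     # prune revisits and hopeless nodes
        seen.add(n)
        for succ in (n + 1, n * 2):      # the two operators
            heapq.heappush(frontier,
                           (heuristic(succ, target), succ, path + [succ]))
    return None

print(best_first(1, 10))   # [1, 2, 4, 8, 9, 10]
```

The heuristic does not guarantee the shortest path; it only biases the order in
which nodes are expanded, which is exactly the role described above.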
Genetic programming
Genetic programming is an automated method for creating a working computer
program from a high-level statement of a problem. Genetic programming starts
from a high-level statement of ‘what needs to be done’ and automatically creates
a computer program to solve the problem. It has been developed by John Koza's
group.
1.6 APPLICATIONS OF AI
AI has applications in all fields of human study, such as finance and economics,
environmental engineering, chemistry, computer science, and so on. Some of the
applications of AI are listed below:
• Perception
■
Machine vision
■
Speech understanding
■
Touch ( tactile or haptic) sensation
• Robotics
• Natural Language Processing
■
Natural Language Understanding
■
Speech Understanding
■
Language Generation
■
Machine Translation
• Planning
• Expert Systems
• Machine Learning
• Theorem Proving
• Symbolic Mathematics
• Game Playing
1.6.1 Perception
Perception is defined as ‘the formation, from a sensory signal, of an internal
representation suitable for intelligent processing’. Computer perception is an
example of artificial intelligence; it focuses on vision and speech. Since it
involves incident time-varying continuous energy distributions prior to
interpretation in symbolic terms, perception is seen to be distinct from
intelligence.
1.6.1.1 Machine vision
It is easy to interface a TV camera to a computer and get an image into memory;
the problem is understanding what the image represents. Vision takes a lot of
computation; in humans, roughly 10 per cent of all calories consumed are burned
in vision computation.
1.6.1.2 Speech understanding
Speech understanding is available now. Some systems must be trained for the
individual user and require pauses between words. Understanding continuous speech
with a larger vocabulary is harder.
1.6.1.3 Touch (tactile or haptic) sensation
Important for robot assembly tasks.
1.6.2 Robotics
Although industrial robots have been expensive, robot hardware can be cheap: the
limiting factor in the application of robotics is not the cost of the robot
hardware itself. What is needed is perception and intelligence to tell the robot
what to do; ‘blind’ robots are limited to very well-structured tasks (like spray
painting car bodies).
1.6.3 Natural Language Processing
Natural language processing is explained below.
1.6.3.1 Natural language understanding
Natural languages are human languages such as English. Making computers
understand English allows non-programmers to use them with little training.
Getting computers to generate human (natural) languages is called natural
language generation; it is much easier than natural language understanding. A
text written in one language and then rendered in another language by a computer
is called machine translation, which is important for organizations that operate
in many countries.
1.6.3.2 Natural language generation
Generation is easier than Natural Language understanding. It can be an inexpensive
output device.
1.6.3.3 Machine translation
Usable translation of text is available now. It is important for organizations that
operate in many countries.
1.6.4 Planning
Planning attempts to find an ordered sequence of actions for achieving goals. Planning
applications include logistics, manufacturing scheduling and planning manufacturing
steps to construct a desired product. There are huge amounts of money to be saved
through better planning.
1.6.5 Expert Systems
Expert Systems attempt to capture the knowledge of a human and present it through
a computer system. There have been many successful and economically valuable
applications of expert systems.
1.6.5.1 Benefits
• Reducing the skill level needed to operate complex devices
• Diagnostic advice for device repair
• Interpretation of complex data
• ‘Cloning’ of scarce expertise
• Capturing the knowledge of an expert who is about to retire
• Combining knowledge of multiple experts
• Intelligent training
1.6.6 Machine Learning
Machine learning has been a central part of AI research from the beginning. In
machine learning, unsupervised learning is a class of problems in which one seeks
to determine how the data is organized. Unsupervised learning is the ability to find
patterns in a stream of input. Supervised learning includes both classification
(determines what category something belongs in) and regression (given a set of
numerical input/output examples, discovers a continuous function that would
generate the outputs from the inputs).
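The two supervised settings can be illustrated with minimal, self-contained
sketches; the training data and function names are invented for the example.
Classification is shown as nearest-neighbour labelling, regression as a
least-squares line fit.

```python
def nearest_neighbour(train, x):
    """Classification: label x with the class of the closest training example.
    train is a list of (value, label) pairs."""
    return min(train, key=lambda example: abs(example[0] - x))[1]

def fit_line(points):
    """Regression: least-squares line y = slope * x + intercept through
    (x, y) examples."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
print(nearest_neighbour(train, 7.5))       # large

print(fit_line([(0, 1), (1, 3), (2, 5)]))  # (2.0, 1.0), i.e. y = 2x + 1
```

Unsupervised learning, by contrast, would receive only the values without any
labels and have to discover the two groups itself.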
1.6.7 Theorem Proving
Proving mathematical theorems might seem to be mainly of academic interest.
However, many practical problems can be cast in terms of theorems. A general
theorem prover can therefore be widely applicable.
1.6.8 Symbolic Mathematics
Symbolic mathematics refers to manipulation of formulas, rather than arithmetic
on numeric values.
• Algebra
• Differential Calculus
• Integral Calculus
Symbolic manipulation is often used in conjunction with ordinary scientific
computation as a generator of programs used to actually do the calculations. Symbolic
manipulation programs are an important component of scientific and engineering
workstations.
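A toy illustration of symbolic manipulation: assuming a polynomial represented
as an {exponent: coefficient} dictionary (an invented representation for this
sketch), the program manipulates the structure of the formula rather than
numeric values.

```python
def differentiate(poly):
    """d/dx of a polynomial given as {exponent: coefficient}."""
    return {e - 1: c * e for e, c in poly.items() if e != 0}

def integrate(poly):
    """Indefinite integral (the constant of integration is omitted)."""
    return {e + 1: c / (e + 1) for e, c in poly.items()}

p = {2: 3, 1: 2, 0: 7}      # p(x) = 3x^2 + 2x + 7
print(differentiate(p))     # {1: 6, 0: 2}, i.e. 6x + 2
print(integrate(p))         # {3: 1.0, 2: 1.0, 1: 7.0}, i.e. x^3 + x^2 + 7x
```

Full symbolic mathematics systems apply the same idea to arbitrary expression
trees, not just polynomials.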
1.6.9 Game Playing
Games are good subjects for research because they are well formalized, small and
self-contained. They are also good models for competitive situations, so principles
discovered in game-playing programs may be applicable to practical problems.
Apart from these, the four following categories highlight application areas where
AI technology is having a strong impact on industry and everyday life.
1. Authorizing Financial Transactions: Credit card providers, telephone
companies, mortgage lenders, banks and the US Government employ AI systems to
detect fraud and expedite financial transactions, with daily transaction
volumes in the billions. These systems first use learning algorithms to
construct profiles of customer usage patterns, and then use the resulting
profiles to detect unusual patterns and take the appropriate action (e.g.
disabling the credit card). Such automated oversight of financial transactions
is an important component in achieving a viable basis for electronic commerce.
2. Configuring Hardware and Software: Systems that diagnose and treat
problems, whether illness in people or problems in hardware and software,
are now in widespread use. Diagnostic systems based on Artificial Intelligence
Technology are being built into photocopiers, computer operating systems
and office automation tools to reduce service calls. Standalone units are being
used to monitor and control operations in factories and office buildings.
Artificial Intelligence based systems assist physicians in many kinds of medical
diagnosis, in prescribing treatments and monitoring patient responses.
Microsoft’s Office Assistant, an integral part of every office application,
provides users with customized help by means of decision-theoretic reasoning.
3. Scheduling for Manufacturing: The use of automatic scheduling for
manufacturing operations is growing as manufacturers realize that remaining
competitive demands ever more efficient use of resources. This Artificial
Intelligence technology, which supports rapid rescheduling up and down the
‘supply chain’ to respond to changing orders, changing markets and unexpected
events, has been shown to be superior to less adaptive systems based on older
technology. The same technology has proven highly effective in other commercial
tasks, including job shop scheduling and assigning airport gates and railway
crews. It has also proven highly effective in military settings: DARPA reported
that an AI-based logistics planning tool, DART, pressed into service for
Operations Desert Shield and Desert Storm, completely repaid its three decades
of investment in AI research.
4. The Future: AI began as an attempt to answer some of the most fundamental
questions about human existence by understanding the nature of intelligence,
but it has grown into a scientific and technological field affecting many aspects
of commerce and society.
Even as AI technology becomes integrated into the framework of everyday
life, AI researchers remain focused on the grand challenges of automating
intelligence. Artificial intelligence is used in fields such as medical
diagnosis, stock trading, robot control, law, scientific discovery, video games
and toys. It can also be applied to specific problems such as small problems in
chemistry, handwriting recognition and game playing. Natural language processing
(NLP), which is concerned with the interaction between computers and human
(natural) languages, gives AI machines the ability to read and understand the
languages that human beings speak. Some straightforward applications of natural
language processing include information retrieval (text mining) and machine
translation. The pursuit of the ultimate goals of AI—design of intelligent artifacts,
understanding of human intelligence, abstract understanding of intelligence (possibly
super human)—continues to have practical consequences in the form of new
industries, enhanced functionality for existing systems, increased productivity in
general and improvements in the quality of life. But the ultimate promises of AI are
still decades away and the necessary advances in knowledge and technology will
require a continuous fundamental research effort.
1.7 UNDERLYING ASSUMPTION OF AI
Newell and Simon presented the Physical Symbol System Hypothesis, which lies at
the heart of research in Artificial Intelligence.
A physical symbol system consists of a set of entities called symbols, which
are physical patterns that can occur as components of another entity called an
expression or symbol structure. A symbol structure is a collection of instances or
tokens of symbols related in some physical way. At any instant, the system will
contain a collection of these symbol structures. In addition, the system also contains
a collection of processes that operate on expressions to produce other expressions;
processes of creation, modification, reproduction and destruction.
A physical symbol system is a machine that produces an evolving collection
of symbol structures.
1.7.1 Physical Symbol System Hypothesis
A physical symbol system has the necessary and sufficient attributes for general
intelligent action. It is only a hypothesis, and there is no way to prove it
other than by empirical validation.
Evidence in support of the physical symbol system hypothesis has come not
only from the areas such as game playing but also from the areas such as visual
perception. We will discuss the visual perception area later.
The importance of the physical symbol system hypothesis is twofold. It is a
major theory of human intelligence, and it forms the basis of the belief that it
is possible to build programs that can perform intelligent tasks now performed
by people.
Check Your Progress
Fill in the blanks:
4. Perception is one of the _______ by AI.
5. Pattern recognition is one of the ____ of AI.
6. Robotics is one of the _______ of AI.
1.8 AI TECHNIQUE
Artificial Intelligence research during the last three decades has concluded
that intelligence requires knowledge. Although indispensable, knowledge
possesses some less desirable properties:
A. It is voluminous.
B. It is difficult to characterize accurately.
C. It is constantly changing.
D. It differs from data by being organized in a way that corresponds to its
application.
E. It is complicated.
An AI technique is a method that exploits knowledge that is represented so that:
• The knowledge captures generalizations: situations that share important
properties are grouped together rather than being represented separately.
• It can be understood by the people who must provide it, even though for many
programs the bulk of the data comes automatically from readings. In many AI
domains, most of the knowledge a program has must ultimately be provided by
people, in terms they understand.
• It can be easily modified to correct errors and reflect changes in real
conditions.
• It can be used even if it is incomplete or inaccurate.
• It can be used to help overcome its own sheer bulk by helping to narrow the
range of possibilities that must ordinarily be considered.
In order to characterize an AI technique, let us initially consider OXO or tic-tac-toe and use a series of different approaches to play the game.
The programs increase in complexity, their use of generalizations, the clarity
of their knowledge and the extensibility of their approach. In this way they move
towards being representations of AI techniques.
1.8.1 Tic-Tac-Toe
1.8.1.1 The first approach (simple)
The Tic-Tac-Toe game uses a nine-element vector called BOARD, which represents
the squares numbered 1 to 9 in three rows:

    1  2  3
    4  5  6
    7  8  9
An element contains the value 0 for blank, 1 for X and 2 for O. A MOVETABLE
vector of 19,683 (3^9) elements is needed, where each element is a nine-element
vector. The contents of the vector are especially chosen to help the algorithm.
The algorithm makes moves by pursuing the following:
1. View the vector as a ternary number. Convert it to a decimal number.
2. Use the decimal number as an index in MOVETABLE and access the vector.
3. Set BOARD to this vector indicating how the board looks after the move.
This approach is efficient in time but it has several disadvantages: it takes a
great deal of space, and specifying the contents of the move table requires
considerable effort. The method is also specific to this game and cannot be
generalized.
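The ternary-indexing step of this first approach can be sketched as follows. The
full 19,683-entry MOVETABLE is omitted, and `board_to_index` is an illustrative
name invented for the sketch.

```python
def board_to_index(board):
    """Interpret BOARD (a list of nine values: 0 blank, 1 X, 2 O) as a
    ternary number, giving a decimal index into MOVETABLE."""
    index = 0
    for square in board:          # most significant digit first
        index = index * 3 + square
    return index

print(board_to_index([0] * 9))                        # 0 : empty board
print(board_to_index([1, 0, 0, 0, 2, 0, 0, 0, 0]))    # 6723 : X at 1, O at centre
# A full MOVETABLE would map each of the 3**9 = 19683 possible indices to
# the board vector that should result from the program's move.
```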
1.8.1.2 The second approach
The structure of the data is as before but we use 2 for a blank, 3 for an X and 5 for
an O. A variable called TURN indicates 1 for the first move and 9 for the last. The
algorithm consists of three actions:
MAKE2 which returns 5 if the centre square is blank; otherwise it returns
any blank non-corner square, i.e. 2, 4, 6 or 8.
POSSWIN (p) returns 0 if player p cannot win on the next move; otherwise it
returns the number of the square that gives a winning move. It checks each line
using products: 3*3*2 = 18 means X can win on that line, 5*5*2 = 50 means O can
win, and the winning move is the blank square in the line.
GO (n) makes a move to square n setting BOARD[n] to 3 or 5.
This algorithm is more involved and takes longer but it is more efficient in
storage which compensates for its longer time. It depends on the programmer’s
skill.
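The POSSWIN test can be sketched as below, assuming a zero-indexed internal
board that holds 2, 3 or 5 as described above; the function returns a 1-based
square number to match the text. This is an illustrative simplification, not
the book's actual code.

```python
# A line whose product is 18 (3*3*2) has two X's and a blank, so X can
# win there; a product of 50 (5*5*2) does the same for O.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def posswin(board, p):
    """Return the winning square (1-9) for player p ('X' or 'O'), or 0."""
    target = 18 if p == 'X' else 50
    for line in LINES:
        a, b, c = line
        if board[a] * board[b] * board[c] == target:
            for i in line:
                if board[i] == 2:          # the blank square completes the win
                    return i + 1
    return 0

#      X  X  .
#      O  .  .
#      O  .  .
board = [3, 3, 2, 5, 2, 2, 5, 2, 2]
print(posswin(board, 'X'))   # 3 : X wins by taking square 3
print(posswin(board, 'O'))   # 0 : O has no immediate win
```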
1.8.1.3 The final approach
The structure of the data consists of BOARD, which contains a nine-element
vector, a list of board positions that could result from the next move, and a
number representing an estimate of how likely the board position is to lead to
an ultimate win for the player to move.
This algorithm looks ahead to make a decision on the next move by deciding
which the most promising move or the most suitable move at any stage would be
and selects the same.
Consider all possible moves and replies that the program can make. Continue
this process for as long as time permits until a winner emerges, and then choose the
move that leads to the computer program winning, if possible in the shortest time.
This is actually the most difficult of the three approaches to program, but it
is the technique that can be extended to games more complicated than
tic-tac-toe. This method makes relatively few demands on the programmer in terms
of game-specific technique, but the overall game strategy must be known to the
adviser.
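The look-ahead described here is essentially minimax search. A compact sketch
for tic-tac-toe (using 'X'/'O'/' ' cells rather than the numeric encodings
above; the function names are invented for the sketch) might look like this:

```python
# Minimax: try every move and reply, score terminal positions, and pick
# the move whose worst-case outcome is best for the side to move.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != ' ' and b[i] == b[j] == b[k]:
            return b[i]
    return None

def minimax(b, player):
    """Return (score, move): +1 if X can force a win, -1 for O, 0 for a draw."""
    w = winner(b)
    if w is not None:
        return (1 if w == 'X' else -1), None
    moves = [i for i, cell in enumerate(b) if cell == ' ']
    if not moves:
        return 0, None                     # board full: draw
    best_score, best_move = None, None
    for m in moves:
        b[m] = player                      # try the move...
        score, _ = minimax(b, 'O' if player == 'X' else 'X')
        b[m] = ' '                         # ...and undo it
        if best_move is None or \
           (player == 'X' and score > best_score) or \
           (player == 'O' and score < best_score):
            best_score, best_move = score, m
    return best_score, best_move

board = ['X', 'X', ' ',
         'O', 'O', ' ',
         ' ', ' ', ' ']
print(minimax(board, 'X'))   # (1, 2): X completes the top row and wins
```

Exhaustive look-ahead like this is feasible only for small games; for chess-sized
games the search must be cut off and guided by heuristics.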
1.8.2 Question Answering
Let us consider Question Answering systems that accept input in English and provide
answers also in English. This problem is harder than the previous one, as it is
more difficult to specify properly. Another area of difficulty concerns deciding
whether the answer obtained is correct or not, and further, what is meant by
‘correct’.
For example, consider the following situation:
1.8.2.1 Text
Rani went shopping for a new Coat. She found a red one she really liked.
When she got home, she found that it went perfectly with her favourite dress.
1.8.2.2 Question
1. What did Rani go shopping for?
2. What did Rani find that she liked?
3. Did Rani buy anything?
Method 1
1.8.2.3 Data Structures
A set of templates matches common questions and produces patterns that are
matched against the input text. A template that matches a given question is
associated with a corresponding pattern used to find the answer in the input
text. For example, if the template ‘who did x y’ matches a question, the pattern
‘x y z’ is applied to the text, and z is the answer to the question. The given
text and the question are both stored as strings.
1.8.2.4 Algorithm
Answering a question requires the following steps:
• Compare the templates against the questions and store all successful matches
to produce a set of text patterns.
• Pass these text patterns through a substitution process to change the person
or voice and produce an expanded set of text patterns.
• Apply each of these patterns to the text; collect all the answers and then print
the answers.
1.8.2.5 Example
In question 1 we use the template WHAT DID X Y, which generates
Rani go shopping for z
and after substitution we get
Rani goes shopping for z and Rani went shopping for z
giving z = a new coat.
In question 2 we need a very large number of templates, and also a scheme to
allow the insertion of ‘find’ before ‘that she liked’, the insertion of ‘really’
in the text, and the substitution of ‘she’ for ‘Rani’; this gives the answer
‘a red one’.
Question 3 cannot be answered.
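A toy version of Method 1 can be sketched with regular expressions standing in
for the template machinery. The template, the tense-substitution rule and the
function name are all invented simplifications, handling only one question form.

```python
# One question template ('What did X Y?') turned into a text pattern,
# after a tense substitution, and matched against the story.
import re

TEXT = ("Rani went shopping for a new coat. "
        "She found a red one she really liked.")

def answer_what_did(question):
    """Handle questions of the form 'What did X Y?'"""
    m = re.match(r"What did (\w+) (\w+ .*)\?", question)
    if not m:
        return None                          # template does not match
    x, y = m.groups()
    # Substitution step: 'go shopping' in the question appears as
    # 'went shopping' in the past-tense text.
    y_past = y.replace("go ", "went ")
    m = re.search(re.escape(x) + " " + re.escape(y_past) + r" ([^.]*)\.", TEXT)
    return m.group(1) if m else None

print(answer_what_did("What did Rani go shopping for?"))   # a new coat
```

As in the text, a question the templates do not cover (such as question 3)
simply gets no answer.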
1.8.2.6 Comments
This is a very primitive approach that does not really meet the criteria we set
for intelligence; it is worse than the technique used in the game. Surprisingly,
this type of technique was actually used in ELIZA, which will be considered
later in the course.
Method 2
1.8.2.7 Data Structures
A structure called English consists of a dictionary, grammar and some semantics
about the vocabulary we are likely to come across. This data structure provides the
knowledge to convert English text into a storable internal form and also to convert
the response back into English. The structured representation of the text is a processed
form and defines the context of the input text by making explicit all references such
as pronouns. There are three types of such knowledge representation systems:
production rules of the form ‘if x then y’, slot and filler systems and statements in
mathematical logic. The system used here will be the slot and filler system.
Take, for example, the sentence:
What is Artificial
Intelligence?
NOTES
‘She found a red one she really liked’.
Event1
    instance: finding
    tense:    past
    agent:    Rani
    object:   Thing1

Event2
    instance: liking
    tense:    past
    modifier: much
    object:   Thing1

Thing1
    instance: coat
    colour:   red
The question is stored in two forms: as input and in the above form.
1.8.2.8 Algorithm
• Convert the question to a structured form using English know-how; then use
a marker to indicate the substring (like 'who' or 'what') of the structure that
should be returned as an answer. If a slot and filler system is used, a special
marker can be placed in more than one slot.
• The answer appears by matching this structured form against the structured
text.
• The structured form is matched against the text and the requested segments
of the question are returned.
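As a sketch, the matching step can be written down directly, with the structured forms represented as Python dictionaries. The frame contents and the `ANSWER` marker below are illustrative assumptions, not the output of a real English front end:

```python
# Sketch: answering a question by matching its structured form against
# the structured text. Frames are slot-and-filler structures written as
# plain dicts; the contents are illustrative, not a real parse.

ANSWER = "?answer"  # special marker in the slot whose filler is the answer

# Structured text: 'Rani went shopping for a new coat.'
text_frames = [
    {"instance": "shopping", "tense": "past", "agent": "Rani",
     "object": "new coat"},
]

# Structured question: 'What did Rani go shopping for?'
question = {"instance": "shopping", "tense": "past", "agent": "Rani",
            "object": ANSWER}

def match(question, frame):
    """Return the filler of the marked slot if all other slots agree."""
    answer = None
    for slot, filler in question.items():
        if filler == ANSWER:
            if slot not in frame:
                return None
            answer = frame[slot]
        elif frame.get(slot) != filler:
            return None
    return answer

def answer_question(question, frames):
    """Match the question against each text frame in turn."""
    for frame in frames:
        result = match(question, frame)
        if result is not None:
            return result
    return None  # like question 3: no direct response in the text

print(answer_question(question, text_frames))  # prints: new coat
```

A question whose slots do not match any text frame returns nothing, which is exactly the behaviour described for question 3.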
1.8.2.9 Examples
Both questions 1 and 2 generate answers: 'a new coat' and 'a red coat' respectively.
Question 3 cannot be answered, because there is no direct response in the text.
1.8.2.10 Comments
This approach is more meaningful than the previous one and so is more effective.
The extra power must be paid for by additional search time in the knowledge
bases. A warning must be given here: generating an unambiguous English
knowledge base is a complex task and must be left until later in the course. The
problems of handling pronouns are difficult. For example:
Rani walked up to the salesperson: she asked where the toy department was.
Rani walked up to the salesperson: she asked her if she needed any help.
Whereas in the original text the linkage of ‘she’ to ‘Rani’ is easy, linkage of
‘she’ in each of the above sentences to Rani and to the salesperson requires additional
knowledge about the context via the people in a shop.
Method 3
1.8.2.11 Data Structures
The world model contains knowledge about the objects, actions and situations that
are described in the input text. This structure is used to create an integrated text
from the input text. The diagram shows how the system's knowledge of shopping
might be represented and stored. This information is known as a script; in this case
it is a shopping script.
Shopping Script: C - Customer, S - Salesperson
Props: M - Merchandize, D - Money (dollars); Location: L - a Store.
[The figure sketches the flow of the script:]
• C (Customer) selects the products to purchase.
• C contacts S (Salesperson) to learn the product's details and the payment scheme.
• Materials are delivered by S after the payment is cleared.
• The Admin Dept holds M (Merchandize), D (Money-dollars) and L (Store details)
to control the whole shopping transaction; it refers to and instructs S.
• The Marketing Dept keeps C's details, such as name, address, phone number
and email.
(Boxes represent the C, S, Marketing and Admin Dept activities; arrows represent
the control of the shopping flow.)
Fig. 1.1 Diagrammatic Representation of Shopping Script
1.8.2.12 Algorithm
Convert the text to a structured form using both the knowledge contained in
Method 2 and the world model, generating even more possible structures, since
even more knowledge is being used. Sometimes filters are introduced to prune the
possible structures. To answer a question, the scheme followed is:
Convert the question to a structured form as before but use the world model
to resolve any ambiguities that may occur. The structured form is matched against
the text and the requested segments of the question are returned.
1.8.2.13 Example
Both questions 1 and 2 generate answers, as in the previous program.
Question 3 can now be answered. The shopping script is instantiated and
from the last sentence the path through step 14 is the one used to form the
representation.
'M' is bound to the red coat. 'Rani buys a red coat' comes from
step 10, and the integrated text generates that she bought a red coat.
1.8.2.14 Comments
This program is more powerful than both the previous programs because it has
more knowledge. Thus, like the last game program it is exploiting AI techniques.
However, we are not yet in a position to handle any English question. The major
omission is that of a general reasoning mechanism known as inference to be used
when the required answer is not explicitly given in the input text. But this approach
can handle, with some modifications, questions of the following form with the
answer—Saturday morning Rani went shopping.
Her brother tried to call her but she did not answer.
Question
Why couldn’t Rani’s brother reach her?
Answer
Because she was not in.
This answer is derived because we have supplied an additional fact that a person
cannot be in two places at once.
This patch is not sufficiently general so as to work in all cases and does not provide
the type of solution we are really looking for.
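A hypothetical sketch of how such an extra fact can be used in code; the fact encoding and the helper name are invented for illustration:

```python
# Sketch: using one extra world fact, 'a person cannot be in two places
# at once', to answer a question the text never answers directly.
# The triple encoding and the helper below are hypothetical.

facts = {("Rani", "at", "the shops")}   # from 'Rani went shopping'

def cannot_be_at(person, place, facts):
    """Infer that `person` is not at `place` if they are known to be
    somewhere else (a person is in one place at a time)."""
    return any(who == person and where != place
               for (who, _rel, where) in facts)

# Her brother calls her at home; the extra fact licenses the answer.
if cannot_be_at("Rani", "home", facts):
    print("Because she was not in.")
```

The point is not the trivial lookup but that the answer depends on a fact supplied from outside the input text, which is what a general inference mechanism would do systematically.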
1.9 LEVEL OF THE AI MODEL
'What is our goal in trying to produce programs that do the intelligent things that
people do?'
Are we trying to produce programs that do the tasks the same way that people
do?
OR
Are we trying to produce programs that simply do the tasks in the easiest way
that is possible?
Programs in the first class attempt to solve problems that a computer can
easily solve and do not usually use AI techniques. AI techniques usually include
search, as no direct method is available; the use of knowledge about the objects
involved in the problem area; and abstraction, which allows an element of pruning
to occur and enables a solution to be found in real time, since otherwise the data
could explode in size. Examples of these trivial problems in the first class, which are now
of interest only to psychologists are EPAM (Elementary Perceiver and Memorizer),
which memorized nonsense syllables.
The second class of problems attempts to solve problems that are non-trivial for a
computer and use AI techniques. We wish to model human performance on these:
1. To test psychological theories of human performance. Ex. PARRY [Colby,
1975] – a program to simulate the conversational behaviour of a paranoid
person.
2. To enable computers to understand human reasoning – for example, programs
that answer questions based upon newspaper articles indicating human
behaviour.
3. To enable people to understand computer reasoning. Some people are reluctant
to accept computer results unless they understand the mechanisms involved
in arriving at the results.
4. To exploit the knowledge gained by people who are best at gathering
information. This persuaded the earlier workers to simulate human behaviour,
the SB (simulated behaviour) part of AISB. Examples of this type of approach
led to GPS (General Problem Solver), discussed in detail later, and also to
natural language understanding, which will be covered in later lessons.
1.10 CRITERIA FOR SUCCESS
In any engineering project, we often ask when we will know we have succeeded. In
1950, Alan Turing proposed the Turing Test to determine whether a machine could
think.
The test is conducted with two people, one of whom is the interrogator; the
other sits with a computer C in another room. The only communication is by typed
messages on a simple terminal (the technology of around 1950). The interrogator is
given the task of determining whether a response comes from the person or the
computer C.
The goal of the machine is to fool the interrogator into believing that the
machine is the person, so that the interrogator will conclude that the machine can
think. The machine may even trick the interrogator by deliberately giving a wrong
answer to a question such as 12324 times 73981.
Some believe a computer will never pass the Turing Test, that is, will never be
able to maintain the kind of dialogue that Turing believed a computer would need
to exhibit to pass the test.
Recently, however, computers have convinced five people out of ten that the
programs running on them were humanly intelligent. Also, Chinook, a computer
program, became the world draughts (checkers) champion.
This is a good example of AI techniques at work, and it involves three important AI techniques:
Search
A search program finds a solution for a problem by trying various sequences of
actions or operators until a solution is found.
Advantages
To solve a problem using search, it is only necessary to code the operators that can
be used; the search will find the sequence of actions that will provide the desired
result. For example, a program can be written to play chess using search if one
knows the rules of chess; it is not necessary to know how to play good chess.
Disadvantages
Most problems have search spaces so large that it is impossible to search the whole
space. Chess has been estimated to have 10^120 possible games. The rapid growth of
combinations of possible moves is called the combinatorial explosion problem.
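The explosion is easy to quantify. A minimal sketch (the branching factor of roughly 35 legal moves per chess position is the usual rough estimate, and is an assumption here):

```python
# Sketch: the combinatorial explosion. A uniform tree with branching
# factor b and depth d has b**d leaf nodes, so exhaustive search becomes
# hopeless even at modest depths.

def tree_size(branching_factor, depth):
    """Number of leaf nodes of a uniform game tree."""
    return branching_factor ** depth

# Chess averages roughly 35 legal moves per position (the 10^120 figure
# quoted above counts complete games, an even larger number).
for depth in (2, 4, 10):
    print(depth, tree_size(35, depth))
```

Even a 10-ply lookahead already yields on the order of 10^15 positions, which is why pruning and knowledge are needed.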
Use of knowledge
Use of knowledge provides a way of solving complex problems by exploiting the
structures of the objects that are concerned.
Abstraction
Abstraction provides a way of separating important features and variations from
the many unimportant ones that would otherwise overwhelm any process.
1.11 CONCLUSION
We live in an era of rapid change, moving towards the information (or knowledge,
or network) society. We require seamless, easy-to-use, high-quality, affordable
communications between people and machines, anywhere and anytime. The creation
of intelligence in a machine, replicating the functionality of the human mind, has
been a long-cherished desire.
Intelligent information and communications technology (IICT) emulates and
employs some aspects of human intelligence in performing a task. The IICT based
systems include sensors, computers, knowledge-based software, human-machine
interfaces and other devices. IICT enabled machines and devices anticipate
requirements and deal with environments that are complex, unknown, unpredictable
and bring the power of computing technology into our daily lives and business
practices. Intelligent systems were first developed for use in traditional industries,
such as manufacturing, mining, etc., enabling the automation of routine or dangerous
tasks to improve productivity and quality. Today, an intelligent systems application
exists virtually in all sectors where they deliver social as well as economic benefits.
Post-World War II, a number of people started working on intelligent
machines. The English mathematician Alan Turing was perhaps the first: he argued
in a 1947 lecture that AI was best researched by programming computers rather
than by building machines. By the late 1950s, there were many researchers in AI,
and most of them based their work on programming computers.
Daniel Dennett’s book Brainchildren has an excellent explanation about the
Turing test and various partial Turing tests have been implemented, i.e. with
restrictions on the observer’s knowledge of AI and the subject matter of questioning.
It turns out that some people are easily led into believing that a rather dumb program
is intelligent.
Alan Turing's 1950 article 'Computing Machinery and Intelligence' proposed
conditions for considering a machine to be intelligent. He argued that if a machine
could successfully pretend to be human to a knowledgeable observer, then we should
certainly consider it intelligent. This test would satisfy most people, but not all
philosophers. The observer could interact with the machine and with a human by
teletype (to prevent the machine from having to imitate the appearance or voice of
a person); the human would try to persuade the observer that he or she was human,
and the machine would try to fool the observer.
The Turing test is a one-sided test. A machine that passes the test should
certainly be considered intelligent, but a machine could still be considered intelligent
without knowing enough about humans to imitate one.
1.12 SUMMARY
In this unit, you have learned about the various concepts related to AI, beginning
with its basics. It was in late 1955 that Newell and Simon developed The Logic
Theorist, considered by many to be the first AI program. The program, representing
each problem as a tree model, would attempt to solve it by selecting the branch that
would most likely result in the correct conclusion.
You have also learned the various definitions of AI according to different
authors. Most of these definitions take a very technical direction and avoid
philosophical problems connected with the idea that AI’s purpose is to construct an
artificial human. AI seeks to understand the computations required from intelligent
behaviour and to produce computer systems that exhibit intelligence. This unit also
explained in detail, the applications of AI which mean problem solving, search and
control strategies, speech recognition, natural language understanding, computer
vision, expert systems, etc. Aspects of intelligence studied by AI include perception,
communication using human languages, reasoning, planning, learning and memory.
Another important topic that you have learned is AI techniques. The latter
depict how we represent, manipulate and reason with knowledge in order to solve
problems. Knowledge is a collection of ‘facts’. To manipulate these facts by a
program, a suitable representation is required. A good representation facilitates
problem solving. The level of the AI model and the criteria for success has been
also discussed.
1.13 KEY TERMS
• Web 2.0: It is the second generation of the World Wide Web, focused on the
ability for people to collaborate and share information online. It is dynamic,
serves applications to users and offers open communications.
• MANET concept: A mobile ad-hoc network is a collection of wireless nodes
that can dynamically be set up anywhere and anytime without using any
existing network infrastructure.
• Software defined radio: It is a radio communication system where
components that have typically been in hardware (e.g., mixers, filters,
amplifiers, modulators/demodulators, detectors, etc.) are implemented using
software.
• Artificial intelligence: It is the branch of computer science that is concerned
with the automation of intellectual performance.
• AI techniques: These depict how we represent, manipulate and reason with
knowledge in order to solve problems. Knowledge is a collection of ‘facts’.
1.14 ANSWERS TO ‘CHECK YOUR PROGRESS’
1. Shannon
2. True
3. True
4. Aspects of intelligence studied by AI
5. Branches
6. Applications
1.15 QUESTIONS AND EXERCISES
1. Explain the Turing test to find out whether a machine is intelligent or not.
2. What would convince you that a given machine is truly intelligent?
3. What is intelligence? How do we measure it? Are these measurements useful?
4. When the temperature falls and the thermostat turns the heater on, does it act
because it believes the room to be too cold? Does it feel cold? What sorts of
things can have beliefs or feelings? Is this related to the idea of consciousness?
5. Some people believe that the relationship between your mind (a non-physical
thing) and your brain (the physical thing inside your skull) is exactly like the
relationship between a computational process (a non-physical thing) and a
physical computer. Do you agree?
6. How good are machines at playing chess? If a machine can consistently beat
all the best human chess players, does this prove that the machine is intelligent?
1.16 FURTHER READING
1. www.mm.iit.uni-miskolc.hu
2. www.cs.washington.edu
3. www.cs.sonoma.edu
4. www.littera.deusto.es
5. www.cs.waikato.ac.nz
6. www.en.wikibooks.org
7. www.deri.net
8. www.cs.wise.edu
9. www.cs.bham.ac.uk
10. www.mediateam.oulu.fi
11. www.indastudychannel.com
12. www.isoc.org
13. www.lk.cs.ucla.edu
14. www.precarn.ca
15. www.cotsjournalonline.com
16. www.jhu-oei.com
17. www.webopedia.com
18. www.answers.com
19. www.en.wikipedia.org
20. www.genetic-programming.com
21. www.cc.gatech.com
22. www.aamhomeschool.org
23. www.etd.gsu.edu
24. www.ebiguity.umbc.edu
25. www.cs.utexas.edu
26. www.lutter1.de
27. www.etd.gsu.edu
28. www.techiwarehouse.com
29. www.uli.se
30. www.cs.ou.edu
31. www.aamhomeschool.org
32. www.a-website.org
33. www.moloth.com
34. www.sir.com
UNIT 2 PROBLEMS, PROBLEM SPACES AND SEARCH
Structure
2.0 Introduction
2.1 Unit Objectives
2.2 Problem of Building a System
2.2.1 AI - General Problem Solving
2.3 Defining Problem in a State Space Search
2.3.1 State Space Search
2.3.2 The Water Jug Problem
2.4 Production Systems
2.4.1 Control Strategies
2.4.2 Search Algorithms
2.4.3 Heuristics
2.5 Problem Characteristics
2.5.1 Problem Decomposition
2.5.2 Can Solution Steps be Ignored?
2.5.3 Is the Problem Universe Predictable?
2.5.4 Is Good Solution Absolute or Relative?
2.6 Characteristics of Production Systems
2.7 Design of Search Programs and Solutions
2.7.1 Design of Search Programs
2.7.2 Additional Problems
2.8 Summary
2.9 Key Terms
2.10 Answers to 'Check Your Progress'
2.11 Questions and Exercises
2.12 Further Reading

2.0 INTRODUCTION
In the preceding unit, you learnt about the basic concepts and applications of AI. In
this unit, you will learn about problems, their definitions, characteristics and design
of search programs. You will learn to define problems in a state space research. A
state space represents a problem in terms of states and operators that change states.
You will also learn about problem solving which is a process of generating solutions
from observed or given data. This unit also examines production systems. Production
systems provide appropriate structures for performing and describing search
processes. Other important topics discussed in this unit are—problem characteristics
and production systems’ characteristics. The latter provide us with a good way of
describing the operations that can be performed in a search for a solution to the
problem. Finally, you will also learn about design of search programs and solutions.
2.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Understand the problem of building a system
• Define problems in a state space search
• Discuss production systems
• Explain characteristics of problems
• Describe production systems
• Design search programs and solutions
2.2 PROBLEM OF BUILDING A SYSTEM
To solve the problem of building a system, you should take the following steps:
1. Define the problem accurately, including detailed specifications of what
constitutes a suitable solution.
2. Scrutinize the problem carefully, for some features may have a central effect
on the chosen method of solution.
3. Segregate and represent the background knowledge needed in the solution of
the problem.
4. Choose the best problem-solving techniques and apply them to find a solution.
Problem solving is a process of generating solutions from observed data.
• A 'problem' is characterized by a set of goals, a set of objects, and a set of
operations.
These could be ill-defined and may evolve during problem solving.
• A ‘problem space’ is an abstract space.
■
A problem space encompasses all valid states that can be generated by
the application of any combination of operators on any combination of
objects.
■
The problem space may contain one or more solutions. A solution is a
combination of operations and objects that achieve the goals.
• A ‘search’ refers to the search for a solution in a problem space.
■
Search proceeds with different types of ‘search control strategies’.
■
The depth-first search and breadth-first search are the two common search
strategies.
2.2.1 AI - General Problem Solving
Problem solving has been the key area of concern for Artificial Intelligence.
Problem solving is a process of generating solutions from observed or given
data. It is however not always possible to use direct methods (i.e. go directly from
data to solution). Instead, problem solving often needs to use indirect or model-based methods.
General Problem Solver (GPS) was a computer program created in 1957 by Simon
and Newell to build a universal problem solver machine. GPS was based on Simon
and Newell’s theoretical work on logic machines. GPS in principle can solve any
formalized symbolic problem, such as theorem proving, geometric problems and
chess playing.
GPS solved many simple problems, such as the Towers of Hanoi, that could
be sufficiently formalized, but GPS could not solve any real-world problems.
To build a system to solve a particular problem, we need to:
• Define the problem precisely – find input situations as well as final situations
for an acceptable solution to the problem
• Analyse the problem – find few important features that may have impact on
the appropriateness of various possible techniques for solving the problem
• Isolate and represent task knowledge necessary to solve the problem
• Choose the best problem-solving technique(s) and apply to the particular
problem
2.2.1.1 Problem definitions
A problem is defined by its ‘elements’ and their ‘relations’. To provide a formal
description of a problem, we need to do the following:
a. Define a state space that contains all the possible configurations of the relevant
objects, including some impossible ones.
b. Specify one or more states that describe possible situations, from which the
problem-solving process may start. These states are called initial states.
c. Specify one or more states that would be acceptable solution to the problem.
These states are called goal states.
d. Specify a set of rules that describe the actions (operators) available.
The problem can then be solved by using the rules, in combination with an
appropriate control strategy, to move through the problem space until a path from
an initial state to a goal state is found. This process is known as ‘search’. Thus:
• Search is fundamental to the problem-solving process.
• Search is a general mechanism that can be used when a more direct method
is not known.
• Search provides the framework into which more direct methods for solving
subparts of a problem can be embedded. A very large number of AI problems
are formulated as search problems.
• Problem space
A problem space is represented by a directed graph, where nodes represent
search state and paths represent the operators applied to change the state. To
simplify search algorithms, it is often convenient to logically and
programmatically represent a problem space as a tree. A tree usually decreases
the complexity of a search at a cost. Here, the cost is due to duplicating some
nodes on the tree that were linked numerous times in the graph, e.g. node B
and node D.
A tree is a graph in which any two vertices are connected by exactly one
path. Alternatively, any connected graph with no cycles is a tree.
[Figure: on the left, a graph with nodes such as B and D linked by several paths;
on the right, its expansion into trees, in which such nodes are duplicated.]
Fig. 2.1 Graph and Tree
• Problem solving
The term, Problem Solving relates to analysis in AI. Problem solving may be
characterized as a systematic search through a range of possible actions to
reach some predefined goal or solution. Problem-solving methods are
categorized as special purpose and general purpose.
• A special-purpose method is tailor-made for a particular problem, often
exploits very specific features of the situation in which the problem is
embedded.
• A general-purpose method is applicable to a wide variety of problems. One
general-purpose technique used in AI is 'means-end analysis': a step-by-step,
or incremental, reduction of the difference between the current state and the
final goal.
2.3 DEFINING PROBLEM AS A STATE SPACE SEARCH
To solve the problem of playing a game, we require the rules of the game and
targets for winning as well as representing positions in the game. The opening position
can be defined as the initial state and a winning position as a goal state. Moves from
initial state to other states leading to the goal state follow legally. However, the
rules are far too abundant in most games, especially in chess, where they exceed
the number of particles in the universe. Thus, the rules cannot be supplied accurately,
and computer programs cannot handle them easily. The storage also presents another
problem but searching can be achieved by hashing.
The number of rules that are used must be minimized and the set can be
created by expressing each rule in as general a form as possible. The representation of games
leads to a state space representation and it is common for well-organized games
with some structure. This representation allows for the formal definition of a problem
that needs the movement from a set of initial positions to one of a set of target
positions. It means that the solution involves using known techniques and a systematic
search. This is quite a common method in Artificial Intelligence.
2.3.1 State Space Search
A state space represents a problem in terms of states and operators that change
states.
A state space consists of:
• A representation of the states the system can be in. For example, in a board
game, the board represents the current state of the game.
• A set of operators that can change one state into another state. In a board
game, the operators are the legal moves from any given state. Often the
operators are represented as programs that change a state representation to
represent the new state.
• An initial state.
• A set of final states; some of these may be desirable, others undesirable. This
set is often represented implicitly by a program that detects terminal states.
2.3.2 The Water Jug Problem
In this problem, we use two jugs called four and three; four holds a maximum of
four gallons of water and three a maximum of three gallons of water. How can we
get two gallons of water in the four jug?
The state space is a set of prearranged pairs giving the number of gallons of
water in the pair of jugs at any time, i.e., (four, three) where four = 0, 1, 2, 3 or 4
and three = 0, 1, 2 or 3.
The start state is (0, 0) and the goal state is (2, n), where n may be any value,
as the three jug can hold from 0 to 3 gallons of water or be empty. 'Three' and
'four' name the jugs, and the numbers show the amount of water in each jug while
solving the water jug problem. The major production rules for solving this problem
are shown below:
Initial condition                        Goal (comment)
1. (four, three) if four < 4             (4, three)         fill four from tap
2. (four, three) if three < 3            (four, 3)          fill three from tap
3. (four, three) if four > 0             (0, three)         empty four into drain
4. (four, three) if three > 0            (four, 0)          empty three into drain
5. (four, three) if four + three <= 4    (four + three, 0)  empty three into four
6. (four, three) if four + three <= 3    (0, four + three)  empty four into three
7. (0, three) if three > 0               (three, 0)         empty three into four
8. (four, 0) if four > 0                 (0, four)          empty four into three
9. (0, 2)                                (2, 0)             empty three into four
10. (2, 0)                               (0, 2)             empty four into three
11. (four, three) if four < 4            (4, three - diff)  pour diff = 4 - four into four from three
12. (four, three) if three < 3           (four - diff, 3)   pour diff = 3 - three into three from four
A solution is given below.
Fig. 2.2 Production Rules for the Water Jug Problem
Gallons in Four Jug    Gallons in Three Jug    Rule Applied
0                      0                       -
0                      3                       2
3                      0                       7
3                      3                       2
4                      2                       11
0                      2                       3
2                      0                       9
Fig. 2.3 One Solution to the Water Jug Problem
The problem is solved by using the production rules in combination with an appropriate
control strategy, moving through the problem space until a path from an initial state
to a goal state is found. In this problem-solving process, search is the fundamental
concept. For simple problems it is easier to achieve this goal by hand, but there will
be cases where this is far too difficult.
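This hand search can also be mechanized. Below is a minimal breadth-first sketch over the same state space; note that the two pour operators combine rules 5 to 12 of the table into a general form, so this is an illustration rather than a transcription of the rule set:

```python
from collections import deque

# Sketch: breadth-first search over the water jug state space.
# A state is (four, three); the pour operators below subsume
# rules 5-12 of the production-rule table in a general form.

def successors(state):
    four, three = state
    result = []
    if four < 4:
        result.append(((4, three), "fill four"))
    if three < 3:
        result.append(((four, 3), "fill three"))
    if four > 0:
        result.append(((0, three), "empty four"))
    if three > 0:
        result.append(((four, 0), "empty three"))
    if three > 0 and four < 4:                    # pour three into four
        t = min(three, 4 - four)
        result.append(((four + t, three - t), "pour three into four"))
    if four > 0 and three < 3:                    # pour four into three
        t = min(four, 3 - three)
        result.append(((four - t, three + t), "pour four into three"))
    return result

def solve(start=(0, 0), goal_four=2):
    """Return a shortest action sequence reaching a state (2, n)."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if state[0] == goal_four:
            return path
        for nxt, action in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None

print(solve())  # one shortest sequence of fills, empties and pours
```

Breadth-first search guarantees that the first goal state dequeued is reached by a shortest operator sequence, here six steps.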
2.4 PRODUCTION SYSTEMS
Production systems provide appropriate structures for performing and describing
search processes. A production system has four basic components as enumerated
below.
• A set of rules each consisting of a left side that determines the applicability of
the rule and a right side that describes the operation to be performed if the
rule is applied.
• A database of current facts established during the process of inference.
• A control strategy that specifies the order in which the rules will be compared
with facts in the database and also specifies how to resolve conflicts in selection
of several rules or selection of more facts.
• A rule firing module.
The production rules operate on the knowledge database. Each rule has a
precondition—that is, either satisfied or not by the knowledge database. If the
precondition is satisfied, the rule can be applied. Application of the rule changes
the knowledge database. The control system chooses which applicable rule should
be applied and ceases computation when a termination condition on the knowledge
database is satisfied.
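A minimal sketch of these components in code; the rule and database encodings are illustrative, and the control strategy is simply 'the first applicable rule fires':

```python
# Sketch: a minimal production system. Each rule is a (precondition,
# action) pair over a working-memory database; the control strategy
# here is 'first applicable rule fires'.

def run(rules, database, terminated, limit=100):
    """Fire applicable rules until the termination condition holds."""
    for _ in range(limit):
        if terminated(database):
            return database
        for precondition, action in rules:
            if precondition(database):
                database = action(database)
                break  # conflict resolution: first match wins
        else:
            return database  # no rule applicable
    return database

# Toy knowledge base: count up to 3.
rules = [
    (lambda db: db["n"] < 3, lambda db: {**db, "n": db["n"] + 1}),
]
print(run(rules, {"n": 0}, terminated=lambda db: db["n"] == 3))  # {'n': 3}
```

Real systems differ mainly in the conflict-resolution step, where a smarter control strategy chooses among several applicable rules.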
Example: Eight puzzle (8-Puzzle)
The 8-puzzle is a 3 × 3 array containing eight square pieces, numbered 1 through 8,
and one empty space. A piece can be moved horizontally or vertically into the empty
space, in effect exchanging the positions of the piece and the empty space. There
are four possible moves, UP (move the blank space up), DOWN, LEFT and RIGHT.
The aim of the game is to make a sequence of moves that will convert the board
from the start state into the goal state:
Initial State        Goal State
2 8 3                1 2 3
1 6 4                8 _ 4
7 _ 5                7 6 5
This example can be solved by the operator sequence UP, UP, LEFT, DOWN,
RIGHT.
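The moves can be checked mechanically. The sketch below assumes the classic start board (2 8 3 / 1 6 4 / 7 _ 5) and goal board (1 2 3 / 8 _ 4 / 7 6 5), since the printed figures are garbled, with 0 marking the blank; it applies the operator sequence UP, UP, LEFT, DOWN, RIGHT that solves that instance:

```python
# Sketch: applying an operator sequence to the 8-puzzle. 0 marks the
# blank; the boards are the classic start and goal states assumed above.

MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}

def apply_move(board, move):
    """Slide the blank one step; a board is a tuple of 3 row-tuples."""
    rows = [list(r) for r in board]
    br, bc = next((r, c) for r in range(3) for c in range(3)
                  if rows[r][c] == 0)
    dr, dc = MOVES[move]
    nr, nc = br + dr, bc + dc
    rows[br][bc], rows[nr][nc] = rows[nr][nc], rows[br][bc]
    return tuple(tuple(r) for r in rows)

start = ((2, 8, 3), (1, 6, 4), (7, 0, 5))
goal  = ((1, 2, 3), (8, 0, 4), (7, 6, 5))

state = start
for move in ["UP", "UP", "LEFT", "DOWN", "RIGHT"]:
    state = apply_move(state, move)
print(state == goal)  # True
```

In a full solver, `apply_move` would be the operator used by a search program, with out-of-range blank moves filtered out as inapplicable.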
Example: Missionaries and Cannibals
The Missionaries and Cannibals problem illustrates the use of state space search for
planning under constraints:
Three missionaries and three cannibals wish to cross a river using a two-person boat. If at any time the cannibals outnumber the missionaries on either side
of the river, they will eat the missionaries. How can a sequence of boat trips be
performed that will get everyone to the other side of the river without any missionaries
being eaten?
State representation:
1. BOAT position: original (T) or final (NIL) side of the river.
2. Number of Missionaries and Cannibals on the original side of the river.
3. Start is (T 3 3); Goal is (NIL 0 0).
Operators:
(MM 2 0) Two Missionaries cross the river.
(MC 1 1) One Missionary and one Cannibal.
(CC 0 2) Two Cannibals.
(M 1 0) One Missionary.
(C 0 1) One Cannibal.
Missionaries/Cannibals Search Graph
[Figure: a search graph whose nodes are states giving the number of missionaries
on the left, the number of cannibals on the left and the boat position, and whose
arcs are labelled with the crossing operators MM, MC, CC, M and C. The graph
leads from the start state (3 3 T) down to the goal state (0 0 NIL).]
2.4.1 Control Strategies
The word ‘search’ refers to the search for a solution in a problem space.
• Search proceeds with different types of ‘search control strategies’.
• A strategy is defined by picking the order in which the nodes are expanded.
The search strategies are evaluated along the following dimensions:
completeness, time complexity, space complexity and optimality. (The search-related
terms are explained first, and then the search algorithms and control strategies
are illustrated.)
Search-related terms
• Algorithm’s performance and complexity
Ideally we want a common measure so that we can compare approaches in
order to select the most appropriate algorithm for a given situation.
■ Performance of an algorithm depends on internal and external factors.
Internal factors:
❑ Time required to run
❑ Space (memory) required to run
External factors:
❑ Size of input to the algorithm
❑ Speed of the computer
❑ Quality of the compiler
■ Complexity is a measure of the performance of an algorithm. Complexity
measures the internal factors, usually in time rather than space.
• Computational complexity
It is the measure of resources in terms of Time and Space.
■ If A is an algorithm that solves a decision problem f, then the run-time of A is the number of steps taken on an input of length n.
■ Time Complexity T(n) of a decision problem f is the run-time of the ‘best’ algorithm A for f.
■ Space Complexity S(n) of a decision problem f is the amount of memory used by the ‘best’ algorithm A for f.
• Big-O notation
The Big-O is a theoretical measure of the execution of an algorithm; it usually indicates the time or the memory needed, given the problem size n, which is usually the number of items. The notation gives an approximation to the run-time efficiency of an algorithm; the letter ‘O’ stands for the order of magnitude of operations or space at run-time.
• The Big-O of an Algorithm A
■ If an algorithm A requires time proportional to f(n), then algorithm A is said to be of order f(n), and it is denoted as O(f(n)).
■ If algorithm A requires time proportional to n², then the order of the algorithm is said to be O(n²).
■ If algorithm A requires time proportional to n, then the order of the algorithm is said to be O(n).
The function f(n) is called the algorithm’s growth-rate function. In other words, if an algorithm has performance complexity O(n), the run-time t should be directly proportional to n, i.e., t ∝ n, or t = k·n, where k is a constant of proportionality. Similar interpretations hold for algorithms having performance complexity O(log₂(n)), O(log N), O(N log N), O(2ᴺ) and so on.
Example 1:
Determine the Big-O of an algorithm:
Calculate the sum of the n elements in an integer array a[0..n-1].
Line no.   Instructions                 No. of execution steps
line 1     sum = 0                      1
line 2     for (i = 0; i < n; i++)      n + 1
line 3     sum += a[i]                  n
line 4     print sum                    1
Total                                   2n + 3
Thus, the polynomial (2n + 3) is dominated by its first term, n, as the number of elements in the array becomes very large.
• In determining the Big-O, ignore constants such as 2 and 3. So the algorithm is of order n.
• So the Big-O of the algorithm is O(n).
• In other words, the run-time of this algorithm increases roughly as the size of the input data n, e.g., an array of size n.
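The 2n + 3 step count can be checked mechanically. The sketch below is my own illustration (not from the text): it mirrors the loop and counts one step for the initialization, one for each of the n + 1 loop tests, one for each of the n additions, and one for the final print.

```python
def summation_steps(a):
    """Sum the n elements of a, counting executed steps as in the table."""
    steps = 1                  # line 1: initialize sum
    total = 0
    i = 0
    while True:
        steps += 1             # line 2: loop test, executed n + 1 times
        if i >= len(a):
            break
        steps += 1             # line 3: sum += a[i], executed n times
        total += a[i]
        i += 1
    steps += 1                 # line 4: print sum
    return total, steps

# For a 5-element array the count is 2*5 + 3 = 13 steps.
```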
Example 2:
Determine the Big-O of an algorithm for finding the largest element in a square 2D array a[0 . . . . . n-1] [0 . . . . . n-1]
Line no.   Instructions                                   No. of execution steps
line 1     max = a[0][0]                                  1
line 2     for (row = 0; row < n; row++)                  n + 1
line 3     for (col = 0; col < n; col++)                  n·(n + 1)
line 4     if (a[row][col] > max) max = a[row][col]       n·n
line 5     print max                                      1
Total                                                     2n² + 2n + 3
Thus, the polynomial (2n² + 2n + 3) is dominated by its first term, n², as the number of elements in the array becomes very large.
• In determining the Big-O, ignore constants such as 2, 2 and 3. So the algorithm is of order n².
• The Big-O of the algorithm is O(n²).
• In other words, the run-time of this algorithm increases roughly as the square of the size of the input data, which is n², e.g., an array of size n × n.
Example 3: Polynomial in n with degree k.
The number of steps needed to carry out an algorithm is expressed as
f(n) = aₖnᵏ + aₖ₋₁nᵏ⁻¹ + ... + a₁n + a₀
Then f(n) is a polynomial in n with degree k and f(n) ∈ O(nᵏ).
• To obtain the order of a polynomial function, use the term which is of the highest degree and disregard the constants and the terms which are of lower degrees.
The Big-O of the algorithm is O(nᵏ). In other words, the run-time of this algorithm grows polynomially, as the k-th power of the problem size.
Example of growth-rate variation:
Problem: If an algorithm requires 1 second of run-time for a problem of size 8, find the run-time of that algorithm for a problem of size 16.
Solution: If the order of the algorithm is O(f(n)), then the execution times T(n) of the algorithm as the problem size increases are calculated as below.
O(f(n))        Run-time T(n) required as problem size increases
O(1)       ⇒  T(n) = 1 second; the algorithm is constant time, independent of the size of the problem.
O(log₂n)   ⇒  T(n) = (1·log₂16) / log₂8 = 4/3 seconds; the algorithm is logarithmic time, increases slowly with the size of the problem.
O(n)       ⇒  T(n) = (1·16) / 8 = 2 seconds; the algorithm is linear time, increases directly with the size of the problem.
O(n·log₂n) ⇒  T(n) = (1·16·log₂16) / (8·log₂8) = 8/3 seconds; the algorithm is log-linear time, increases more rapidly than a linear algorithm.
O(n²)      ⇒  T(n) = (1·16²) / 8² = 4 seconds; the algorithm is quadratic time, increases rapidly with the size of the problem.
O(n³)      ⇒  T(n) = (1·16³) / 8³ = 8 seconds; the algorithm is cubic time, increases more rapidly than a quadratic algorithm with the size of the problem.
O(2ⁿ)      ⇒  T(n) = (1·2¹⁶) / 2⁸ = 2⁸ seconds = 256 seconds; the algorithm is exponential time, increases too rapidly to be practical.
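The whole table follows from one scaling rule: if a problem of size 8 takes 1 second, a problem of size 16 takes 1 second times f(16)/f(8). The sketch below is my own illustration (the function and variable names are assumptions, not from the text):

```python
import math

def scaled_runtime(f, old_n=8, new_n=16, old_t=1.0):
    """Scale a known run-time by the ratio of the growth-rate function."""
    return old_t * f(new_n) / f(old_n)

growth = {
    'O(1)':        lambda n: 1,
    'O(log2 n)':   lambda n: math.log2(n),
    'O(n)':        lambda n: n,
    'O(n log2 n)': lambda n: n * math.log2(n),
    'O(n^2)':      lambda n: n ** 2,
    'O(n^3)':      lambda n: n ** 3,
    'O(2^n)':      lambda n: 2 ** n,
}

# scaled_runtime reproduces the table: 4/3 s for O(log2 n), 2 s for O(n),
# 8/3 s for O(n log2 n), 4 s for O(n^2), 8 s for O(n^3), 256 s for O(2^n).
```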
Tree structure
Tree is a way of organizing objects, related in a hierarchical fashion.
• Tree is a type of data structure in which each element is attached to one or
more elements directly beneath it.
• The connections between elements are called branches.
• Trees are often called inverted trees because they are drawn with the root at the top.
• The elements that have no elements below them are called leaves.
• A binary tree is a special type: each element has only two branches below it.
Properties
• Tree is a special case of a graph.
• The topmost node in a tree is called the root node.
• At root node all operations on the tree begin.
• A node has at most one parent.
• The topmost node (root node) has no parents.
Self-Instructional
Material
41
Problems, Problem
Spaces and Search
• Each node has zero or more child nodes, which are below it.
• The nodes at the bottommost level of the tree are called leaf nodes. Since leaf nodes are at the bottommost level, they do not have children.
• A node that has a child is called the child’s parent node.
• The depth of a node n is the length of the path from the root to the node.
• The root node is at depth zero.
• Stacks and Queues
The Stacks and Queues are data structures that maintain the order of last-in,
first-out and first-in, first-out respectively. Both stacks and queues are often
implemented as linked lists, but that is not the only possible implementation.
Stack - Last In First Out (LIFO) lists
■ An ordered list; a sequence of items, piled one on top of the other.
■ The insertions and deletions are made at one end only, called the Top.
■ If Stack S = (a[1], a[2], . . . , a[n]) then a[1] is the bottom-most element.
■ Any intermediate element a[i] is on top of element a[i-1], 1 < i <= n.
■ In a Stack, all operations take place at the Top.
The Pop operation removes the item from the top of the stack.
The Push operation adds an item on top of the stack.
Queue - First In First Out (FIFO) lists
• An ordered list; a sequence of items; there are restrictions about how items can be added to and removed from the list. A queue has two ends.
• All insertions (enqueue) take place at one end, called the Rear or Back.
• All deletions (dequeue) take place at the other end, called the Front.
• If a Queue has a[n] as its rear element, then a[i+1] is behind a[i], 1 <= i < n.
• All operations take place at one end of the queue or the other.
The Dequeue operation removes the item at the Front of the queue.
The Enqueue operation adds an item to the Rear of the queue.
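As a minimal illustration (using Python's list and collections.deque, which is my implementation choice; the text does not prescribe one), the two disciplines behave as follows:

```python
from collections import deque

stack = []                       # LIFO list
stack.append('a')                # Push
stack.append('b')
stack.append('c')
top = stack.pop()                # Pop removes the item at the Top
# top is 'c', the last item pushed

queue = deque()                  # FIFO list
queue.append('a')                # Enqueue at the Rear
queue.append('b')
queue.append('c')
front = queue.popleft()          # Dequeue removes the item at the Front
# front is 'a', the first item enqueued
```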
Search
Search is the systematic examination of states to find path from the start / root state
to the goal state.
• Search usually results from a lack of knowledge.
• Search explores knowledge alternatives to arrive at the best answer.
• Search algorithm output is a solution, that is, a path from the initial state to a
state that satisfies the goal test.
For general-purpose problem-solving – ‘Search’ is an approach.
• Search deals with finding nodes having certain properties in a graph that
represents search space.
• Search methods explore the search space ‘intelligently’, evaluating
possibilities without investigating every single possibility.
Examples:
• For a Robot this might consist of PICKUP, PUTDOWN, MOVEFORWARD,
MOVEBACK, MOVELEFT, and MOVERIGHT—until the goal is reached.
• Puzzles and Games have explicit rules: e.g., the ‘Tower of Hanoi’ puzzle.
(a) Start   (b) Final
Fig. 2.4 Tower of Hanoi Puzzle
• This puzzle involves a set of rings of different sizes that can be placed on
three different pegs.
• The puzzle starts with the rings arranged as shown in Figure 2.4(a).
• The goal of this puzzle is to move them all to the arrangement of Figure 2.4(b).
• Condition: Only the top ring on a peg can be moved, and it may only be placed on a larger ring, or on an empty peg.
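The standard recursive solution to this puzzle (a well-known sketch, not given in the text) parks the top n - 1 rings on the spare peg, moves the largest ring, and then moves the parked rings on top of it, producing 2ⁿ - 1 moves in total:

```python
def hanoi(n, source, target, spare, moves=None):
    """List the (from-peg, to-peg) moves that solve an n-ring puzzle."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, source, spare, target, moves)   # park n-1 rings
        moves.append((source, target))               # move the largest ring
        hanoi(n - 1, spare, target, source, moves)   # bring the parked rings over
    return moves

# hanoi(3, 'A', 'C', 'B') solves the 3-ring puzzle in 2**3 - 1 = 7 moves.
```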
In this Tower of Hanoi puzzle:
• Situations encountered while solving the problem are described as states.
• Set of all possible configurations of rings on the pegs is called ‘problem space’.
• States
A State is a representation of elements in a given moment.
A problem is defined by its elements and their relations.
At each instant of a problem, the elements have specific descriptors and relations; the descriptors indicate how to select the elements.
Among all possible states, there are two special states called:
■ Initial state – the start point
■ Final state – the goal state
• State Change: Successor Function
A ‘successor function’ is needed for state change. The Successor Function
moves one state to another state.
Successor Function:
■ It is a description of possible actions; a set of operators.
■ It is a transformation function on a state representation, which converts that state into another state.
■ It defines a relation of accessibility among states.
■ It represents the conditions of applicability of a state and the corresponding transformation function.
Check Your Progress
1. What does a search refer to?
2. Problem solving is a process of generating solutions from observed or given data. (True or False)
3. What are the two characteristics by which a problem is defined?
• State space
A state space is the set of all states reachable from the initial state.
■ A state space forms a graph (or map) in which the nodes are states and the arcs between nodes are actions.
■ In a state space, a path is a sequence of states connected by a sequence of actions.
■ The solution of a problem is part of the map formed by the state space.
• Structure of a state space
The structures of a state space are trees and graphs.
■ A tree is a hierarchical structure in a graphical form.
■ A graph is a non-hierarchical structure.
• A tree has only one path to a given node;
i.e., a tree has one and only one path from any point to any other point.
• A graph consists of a set of nodes (vertices) and a set of edges (arcs). Arcs
establish relationships (connections) between the nodes; i.e., a graph has
several paths to a given node.
• The Operators are directed arcs between nodes.
A search process explores the state space. In the worst case, the search explores
all possible paths between the initial state and the goal state.
• Problem solution
In the state space, a solution is a path from the initial state to a goal state or, sometimes, just a goal state.
■ A solution cost function assigns a numeric cost to each path; it also gives the cost of applying the operators to the states.
■ A solution quality is measured by the path cost function; an optimal solution has the lowest path cost among all solutions.
■ The solutions can be any solution, an optimal solution, or all solutions.
■ The importance of cost depends on the problem and the type of solution asked for.
• Problem description
A problem consists of the description of:
■ The current state of the world,
■ The actions that can transform one state of the world into another,
■ The desired state of the world.
The following actions are taken to describe the problem:
❑ State space is defined explicitly or implicitly.
A state space should describe everything that is needed to solve a problem and nothing that is not needed to solve the problem.
❑ Initial state is the start state.
❑ Goal state is the conditions it has to fulfil.
The description by a desired state may be complete or partial.
❑ Operators are to change state.
❑ Operators do actions that can transform one state into another.
❑ Operators consist of: Preconditions and Instructions.
■ Preconditions provide a partial description of the state of the world that must be true in order to perform the action, and
■ Instructions tell the user how to create the next state.
■ Operators should be as general as possible, so as to reduce their number.
■ Elements of the domain have relevance to the problem:
❑ Knowledge of the starting point.
■ Problem solving is finding a solution:
❑ Find an ordered sequence of operators that transform the current (start) state into a goal state.
■ Restrictions on solution quality: any, optimal, or all:
❑ Finding the shortest sequence, or
❑ finding the least expensive sequence defining cost, or
❑ finding any sequence as quickly as possible.
This can also be explained with the help of an algebraic function, as given below.
Algebraic Function
A function may take the form of a set of ordered pairs, a graph or an equation. Regardless of the form it takes, a function must obey the condition that no two of its ordered pairs have the same first member with different second members.
Relation: A set of ordered pairs of the form (x, y) is called a relation.
Function: A relation in which no two ordered pairs have the same x-value but different y-values is called a function. Functions are usually named by lower-case letters such as f, g, and h.
For example, f = {(-3, 9), (0, 0), (3, 9)},
g = {(4, -2), (4, 2)}
Here f is a function; g is not a function, since the x-value 4 is paired with two different y-values.
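The defining condition is easy to test mechanically. The helper below is my own sketch (names are assumptions); the pair set g deliberately pairs the same x-value, 4, with two different y-values:

```python
def is_function(pairs):
    """True if no first member is paired with two different second members."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False     # same x-value, different y-values
        seen[x] = y
    return True

f = {(-3, 9), (0, 0), (3, 9)}
g = {(4, -2), (4, 2)}        # x = 4 appears with two different y-values
```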
Domain and Range: The domain of a function is the set of all the first members of
its ordered pairs, and the range of a function is the set of all second members of its
ordered pairs.
If function f = {(a, A), (b, B), (c, C)}, then its domain is {a, b, c} and its range is {A, B, C}.
Function and mapping: A function may be viewed as a mapping or a pairing of elements of one set with elements of a second set such that each element of the first set (called the domain) is paired with exactly one element of the second set (called the co-domain).
e.g., if a function f maps {a, b, c} into {A, B, C, D} such that a → A (read ‘a is mapped into A’), b → B, c → C, then the domain is {a, b, c} and the co-domain is {A, B, C, D}. Since a is paired with A in the co-domain, A is called the image of a. Each element of the co-domain that corresponds to an element of the domain is called the image of that element.
The set of image points, {A, B, C}, is called the range. Thus, the range is a subset of the co-domain.
Onto mappings: Set A is mapped onto set B if each element of set B is the image of an element of set A. Thus, every function maps its domain onto its range.
Describing a function by an equation: The rule by which each x-value gets paired with the corresponding y-value may be specified by an equation. For example, the function described by the equation y = x + 1 requires that for any choice of x in the domain, the corresponding range value is x + 1. Thus, 2 → 3, 3 → 4, and 4 → 5.
Restricting domains of functions: Unless otherwise indicated, the domain of a function is assumed to be the largest possible set of real numbers. Thus: The domain of y = x / (x² - 4) is the set of all real numbers except ±2, since for these values of x the denominator is 0.
The domain of y = (x - 1)^(1/2) is the set of real numbers greater than or equal to 1, since for any value of x less than 1 the radical has a negative radicand, so the radical does not represent a real number.
Example: Find which of the relations describe functions.
(a) y = x^(1/2) , (b) y = x³ , (c) y > x , (d) x = y²
Equations (a) and (b) produce exactly one value of y for each value of x. Hence, equations (a) and (b) describe functions.
The relation (c), y > x, does not represent a function since it contains ordered pairs such as (1, 2) and (1, 3), where the same value of x is paired with different values of y.
The equation (d), x = y², is not a function since ordered pairs such as (4, 2) and (4, -2) satisfy the equation but have the same value of x paired with different values of y.
Function notation: For any function f, the value of y that corresponds to a given
value of x is denoted by f(x).
If y = 5x - 1, then f(2), read as ‘f of 2’, represents the value of y when x = 2: f(2) = 5·2 - 1 = 9;
when x = 3, then f(3) = 5·3 - 1 = 14.
In an equation that describes function f, then f(x) may be used in place of y, for
example, f(x) = 5x -1. If y = f(x), then y is said to be a function of x.
Since the value of y depends on the value of x, y is called the dependent variable
and x is called the independent variable.
2.4.2 Search Algorithms
Many traditional search algorithms are used in AI applications. For complex
problems, the traditional algorithms are unable to find the solutions within some
practical time and space limits. Consequently, many special techniques are
developed, using heuristic functions.
The algorithms that use heuristic functions are called heuristic algorithms.
• Heuristic algorithms are not really intelligent; they appear to be intelligent
because they achieve better performance.
• Heuristic algorithms are more efficient because they take advantage of
feedback from the data to direct the search path.
• Uninformed search algorithms or Brute-force algorithms search the search space for all possible candidates for the solution, checking whether each candidate satisfies the problem’s statement.
• Informed search algorithms use heuristic functions that are specific to the
problem, apply them to guide the search through the search space to try to
reduce the amount of time spent in searching.
A good heuristic will make an informed search dramatically outperform any uninformed search: for example, the Travelling Salesman Problem (TSP), where the goal is to find a good solution instead of the best solution.
In such problems, the search proceeds using current information about the
problem to predict which path is closer to the goal and follow it, although it does
not always guarantee to find the best possible solution. Such techniques help in
finding a solution within reasonable time and space (memory). Some prominent
intelligent search algorithms are stated below:
1. Generate and Test Search
2. Best-first Search
3. Greedy Search
4. A* Search
5. Constraint Search
6. Means-ends analysis
There are some more algorithms. They are either improvements or combinations of
these.
• Hierarchical Representation of Search Algorithms: A Hierarchical
representation of most search algorithms is illustrated below. The
representation begins with two types of search:
• Uninformed Search: Also called blind, exhaustive or brute-force search, it
uses no information about the problem to guide the search and therefore may
not be very efficient.
• Informed Search: Also called heuristic or intelligent search, this uses information about the problem to guide the search (usually guessing the distance to a goal state) and is therefore efficient, but the search may not always be possible.
Search Algorithms G(State, Operator, Cost) divide into two families:
• Uninformed Search (no heuristics): Depth First Search (DFS) using a LIFO stack; Breadth First Search (BFS) using a FIFO queue; Depth Limited Search with an imposed fixed depth limit; Iterative Deepening DFS with a gradually increased depth limit; and Cost First Search using a priority queue ordered by g(n).
• Informed Search (uses heuristics h(n)): Generate and Test; Hill Climbing; Best First Search using a priority queue ordered by h(n); Problem Reduction; Constraint Satisfaction; Means-end Analysis; and A* / AO* Search using a priority queue ordered by f(n) = h(n) + g(n).
Fig. 2.5 Different Search Algorithms
The first requirement of a control strategy is that it causes motion: in a game-playing program it moves pieces on the board, and in the water jug problem it fills the jugs. Control strategies that do not cause motion will never lead to a solution.
The second requirement is that it is systematic, that is, it corresponds to the need for global motion as well as for local motion. Clearly, it would neither be rational to fill a jug and empty it repeatedly, nor worthwhile to move a piece round and round the board in a cyclic way in a game. We shall initially consider two systematic approaches to searching. Searches can be classified by the order in which operators are tried: depth-first, breadth-first, bounded depth-first.
(Figure) Breadth First search expands nodes level by level and requires a lot of storage. Depth First search follows a single path downward and goes too deep if the tree is very large. Bounded Depth First search bounds the depth artificially with a maximum depth boundary and has a low storage requirement.
2.4.2.1 Breadth-first search
A search strategy in which the highest layer of a decision tree is searched completely before proceeding to the next layer is called Breadth-first search (BFS).
• In this strategy, no viable solutions are omitted and therefore it is guaranteed
that an optimal solution is found.
• This strategy is often not feasible when the search space is large.
Algorithm
1. Create a variable called LIST and set it to be the starting state.
2. Loop until a goal state is found or LIST is empty, Do
a. Remove the first element from the LIST and call it E. If the LIST is empty,
quit.
b. For each way that each rule can match the state E, Do
(i) Apply the rule to generate a new state.
(ii) If the new state is a goal state, quit and return this state.
(iii) Otherwise, add the new state to the end of LIST.
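The LIST-based procedure above can be sketched as follows. This is an illustrative Python rendering, not the text's own code: a visited set is added to avoid revisiting states, and `successors` is an assumed function returning the states one rule application away.

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    frontier = deque([start])      # the algorithm's LIST
    visited = {start}
    while frontier:                # loop until LIST is empty
        state = frontier.popleft() # remove the first element and call it E
        if is_goal(state):
            return state           # goal state found
        for nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(nxt)  # add new states to the end of LIST
    return None                    # LIST exhausted: no goal found
```

Because states are taken from the front of the queue, shallower states are always expanded before deeper ones, which is what makes BFS complete and shortest-path optimal.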
Advantages
1. Guaranteed to find an optimal solution (in terms of shortest number of steps
to reach the goal).
2. Can always find a goal node if one exists (complete).
Disadvantages
1. High storage requirement: exponential with tree depth.
2.4.2.2 Depth-first search
A search strategy that extends the current path as far as possible before backtracking
to the last choice point and trying the next alternative path is called Depth-first
search (DFS).
• This strategy does not guarantee that the optimal solution has been found.
• In this strategy, search reaches a satisfactory solution more rapidly than breadth
first, an advantage when the search space is large.
Algorithm
Depth-first search applies operators to each newly generated state, trying to drive
directly toward the goal.
1. If the starting state is a goal state, quit and return success.
2. Otherwise, do the following until success or failure is signalled:
a. Generate a successor E to the starting state. If there are no more successors,
then signal failure.
b. Call Depth-first Search with E as the starting state.
c. If success is returned signal success; otherwise, continue in the loop.
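A recursive rendering of these steps (an illustrative sketch, not the text's own code): the function call stack does the bookkeeping, a visited set guards against cycles, and `successors` is assumed as before.

```python
def depth_first_search(state, is_goal, successors, visited=None):
    if visited is None:
        visited = set()
    if is_goal(state):                 # step 1: the starting state is a goal
        return [state]
    visited.add(state)
    for nxt in successors(state):      # step 2a: generate a successor E
        if nxt in visited:
            continue                   # avoid revisiting states
        path = depth_first_search(nxt, is_goal, successors, visited)
        if path is not None:           # step 2c: success was returned
            return [state] + path
    return None                        # no more successors: failure
```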
Advantages
1. Low storage requirement: linear with tree depth.
2. Easily programmed: function call stack does most of the work of maintaining
state of the search.
Disadvantages
1. May find a sub-optimal solution (one that is deeper or more costly than the
best solution).
2. Incomplete: without a depth bound, may not find a solution even if one exists.
2.4.2.3 Bounded depth-first search
Depth-first search can spend much time (perhaps infinite time) exploring a very
deep path that does not contain a solution, when a shallow solution exists. An easy
way to solve this problem is to put a maximum depth bound on the search. Beyond
the depth bound, a failure is generated automatically without exploring any deeper.
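The bound changes the recursion only slightly. The sketch below is illustrative (cycle checking is omitted for brevity; `successors` is an assumed function as before):

```python
def depth_limited_search(state, is_goal, successors, limit):
    if is_goal(state):
        return [state]
    if limit == 0:
        return None                # depth bound reached: automatic failure
    for nxt in successors(state):
        path = depth_limited_search(nxt, is_goal, successors, limit - 1)
        if path is not None:
            return [state] + path
    return None
```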
Problems:
1. It’s hard to guess how deep the solution lies.
2. If the estimated depth is too deep (even by 1) the computer time used is dramatically increased, by a factor of b^extra (where b is the branching factor).
3. If the estimated depth is too shallow, the search fails to find a solution; all
that computer time is wasted.
2.4.3 Heuristics
A heuristic is a method that improves the efficiency of the search process. Heuristics are like tour guides: they are good to the extent that they point in generally interesting directions, and bad to the extent that they may miss points of interest to particular individuals. Some heuristics help the search process without sacrificing any claims to completeness that the process might previously have had. Others may occasionally cause an excellent path to be overlooked; by sacrificing completeness, they increase efficiency. Heuristics may not find the best solution every time, but they guarantee finding a good solution in a reasonable time. They are particularly useful in solving tough and complex problems whose exact solutions would require infinite time, i.e., far longer than a lifetime, and which cannot be solved in any other way.
2.4.3.1 Heuristic search
To find a solution in proper time rather than a complete solution in unlimited time
we use heuristics. ‘A heuristic function is a function that maps from problem state
descriptions to measures of desirability, usually represented as numbers’. Heuristic
search methods use knowledge about the problem domain and choose promising
operators first. These heuristic search methods use heuristic functions to evaluate
the next state towards the goal state. For finding a solution, by using the heuristic
technique, one should carry out the following steps:
1. Add domain—specific information to select what is the best path to continue
searching along.
2. Define a heuristic function h(n) that estimates the ‘goodness’ of a node n. Specifically, h(n) = estimated cost (or distance) of the minimal cost path from n to a goal state.
3. The term heuristic means ‘serving to aid discovery’ and refers to an estimate, based on domain-specific information computable from the current state description, of how close we are to a goal.
Finding a route from one city to another city is an example of a search problem in
which different search orders and the use of heuristic knowledge are easily
understood.
1. State: The current city in which the traveller is located.
2. Operators: Roads linking the current city to other cities.
3. Cost Metric: The cost of taking a given road between cities.
4. Heuristic information: The search could be guided by the direction of the
goal city from the current city, or we could use airline distance as an estimate
of the distance to the goal.
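As a toy illustration of item 4 (all city names and coordinates below are hypothetical, not taken from the text), the airline distance can serve as the heuristic that ranks the neighbouring cities:

```python
import math

# Hypothetical map: coordinates of cities and the goal city.
coords = {'B': (1, 2), 'C': (4, 0), 'Goal': (5, 1)}

def h(city):
    """Airline (straight-line) distance from a city to the goal."""
    (x1, y1), (x2, y2) = coords[city], coords['Goal']
    return math.hypot(x2 - x1, y2 - y1)

neighbours = ['B', 'C']            # cities reachable by road from here
best = min(neighbours, key=h)      # the heuristic prefers the closer city
```

Here the search would follow the road toward the city with the smaller h value, without any guarantee that this road lies on the best route.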
Heuristic search techniques
For complex problems, the traditional algorithms, presented above, are unable to
find the solution within some practical time and space limits. Consequently, many
special techniques are developed, using heuristic functions.
• Blind search is not always possible, because it requires too much time or space (memory).
50
Self-Instructional
Material
• Heuristics are rules of thumb; they do not guarantee a solution to a problem.
• Heuristic Search is a weak technique but can be effective if applied correctly;
it requires domain specific information.
Characteristics of heuristic search
• Heuristics are knowledge about domain, which help search and reasoning in
its domain.
• Heuristic search incorporates domain knowledge to improve efficiency over
blind search.
• A heuristic is a function that, when applied to a state, returns a value estimating the merit of that state with respect to the goal.
■ Heuristics might, for various reasons, underestimate or overestimate the merit of a state with respect to the goal.
■ Heuristics that underestimate are desirable and are called admissible.
• A heuristic evaluation function estimates the likelihood of a given state leading to the goal state.
• A heuristic search function estimates the cost from the current state to the goal, presuming the function is efficient.
Heuristic search compared with other search
The Heuristic search is compared with Brute force or Blind search techniques below:
Comparison of Algorithms
Brute force / Blind search:
• Can only search what it has knowledge about already.
• Has no knowledge about how far a node is from the goal state.
Heuristic search:
• Estimates the ‘distance’ to the goal state through explored nodes.
• Guides the search process toward the goal; prefers states (nodes) that lead close to and not away from the goal state.
Example: Travelling salesman
A salesman has to visit a list of cities, and he must visit each city only once. There are different routes between the cities. The problem is to find the shortest route so that the salesman visits each city exactly once.
Suppose there are N cities; then a brute-force solution would be to examine all N! possible combinations to find the shortest route. This is not efficient, as with N = 10 there are 3,628,800 possible routes. This is an example of combinatorial explosion.
There are better methods for the solution of such problems: one is called
branch and bound.
First, generate all the complete paths and find the distance of the first complete path. If the next path is shorter, then save it, and proceed this way, abandoning a path as soon as its length exceeds the saved shortest path length. This is better than the previous exhaustive method.
Algorithm: Travelling salesman problem
1. Select a city at random as a starting point
2. Repeat
3. Select the next city from the list of all the cities to be visited and choose the
nearest one to the current city, then go to it,
4. until all cities are visited.
This produces a significant improvement and reduces the time from order N! to order N². Our goal is to find the shortest route that visits each city exactly once. Suppose the cities to be visited and the distances between them are as shown below:
                Hyderabad   Secunderabad   Mumbai   Bangalore   Chennai
Hyderabad           -            15          270       780        740
Secunderabad       15            -           280       760        780
Mumbai            270           280           -        340        420
Bangalore         780           760          340        -         770
Chennai           740           780          420       770         -
(Distance in Kilometres)
One possibility is for the salesman to start from Hyderabad. In that case, one path might be followed as shown below:
Hyderabad → Secunderabad (15) → Chennai (780) → Mumbai (420) → Bangalore (340) → Hyderabad (780)
TOTAL: 2335 km
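The nearest-neighbour algorithm stated earlier can be applied directly to this distance table. The sketch below is my own rendering (the function names are assumptions): starting from Hyderabad it happens to produce a 2145 km tour, shorter than the 2335 km path above, though nearest neighbour is still not guaranteed to be optimal.

```python
# Symmetric distance table from the text, in kilometres.
DIST = {
    ('Hyderabad', 'Secunderabad'): 15,  ('Hyderabad', 'Mumbai'): 270,
    ('Hyderabad', 'Bangalore'): 780,    ('Hyderabad', 'Chennai'): 740,
    ('Secunderabad', 'Mumbai'): 280,    ('Secunderabad', 'Bangalore'): 760,
    ('Secunderabad', 'Chennai'): 780,   ('Mumbai', 'Bangalore'): 340,
    ('Mumbai', 'Chennai'): 420,         ('Bangalore', 'Chennai'): 770,
}

def d(a, b):
    """Look the distance up in either direction of the symmetric table."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]

def nearest_neighbour_tour(start, cities):
    tour, total = [start], 0
    unvisited = set(cities) - {start}
    while unvisited:
        # Step 3 of the algorithm: go to the nearest unvisited city.
        nxt = min(unvisited, key=lambda c: d(tour[-1], c))
        total += d(tour[-1], nxt)
        tour.append(nxt)
        unvisited.remove(nxt)
    total += d(tour[-1], start)    # finally return to the starting city
    return tour + [start], total
```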
Here the total distance is 2335 km. But this may not be the solution to the problem; other paths may give a shorter route.
It is sometimes possible to place a bound on the error in the answer, but in general it is not possible to construct such an error bound. In real problems, the value of a particular solution is trickier to establish; this problem is easier because the value is measured in kilometres, while other problems have vaguer measures.
Although heuristics can be created for unstructured knowledge, producing a coherent analysis is another issue, and this means that the solution lacks reliability. Rarely is the heuristic solution optimal, since the required approximations are usually insufficient.
Although heuristic solutions are bad in the worst case, the worst case occurs very infrequently.
Formal Statement:
Problem solving starts from a set of statements describing the desired states, expressed in a suitable language, e.g., first-order logic.
The solution of many problems (like chess, crosses) can be described by
finding a sequence of actions that lead to a desired goal.
• Each action changes the state, and
• The aim is to find the sequence of actions that lead from the initial (start)
state to a final (goal) state.
A well-defined problem can be described by the example given below:
Example
• Initial State: (S)
• Operator or successor function: for any state x , returns s(x), the set of states
reachable from x with one action.
• State space: all states reachable from the initial one by any sequence of actions.
• Path: sequence through state space.
• Path cost: function that assigns a cost to a path; cost of a path is the sum of
costs of individual actions along the path.
• Goal state: (G)
• Goal test: test to determine if at goal state.
• Search notations
Search is the systematic examination of states to find path from the start /
root state to the goal state.
The notations used for this purpose are given below:
• Evaluation function f (n) estimates the least cost solution through node n
• Heuristic function h(n)
Estimates least cost path from node n to goal node
• Cost function g(n) estimates the least cost path from start node to node n
f (n) = g(n) + h(n)
[Figure: a path from Start through node n to Goal. g(n) is the actual cost of the path from Start to n; h(n) is the estimated cost from n to Goal; f(n) estimates the total cost of a solution through n.]
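The relation f(n) = g(n) + h(n) can be sketched in a few lines; the node names and cost values below are invented purely for illustration.

```python
# Evaluation function for best-first search: f(n) = g(n) + h(n).
# g(n) is the known cost from the start to n; h(n) is a heuristic
# estimate of the remaining cost from n to the goal.

def f(n, g, h):
    """Estimated cost of the cheapest solution path through node n."""
    return g[n] + h[n]

g = {"start": 0, "n1": 4, "n2": 7}    # cost so far (illustrative)
h = {"start": 10, "n1": 5, "n2": 1}   # estimated cost to goal

best = min(g, key=lambda n: f(n, g, h))
print(best)   # n2, since the f-values are 10, 9 and 8 respectively
```

A best-first search would expand n2 next, since it has the lowest estimated total cost.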
• The notations ^f, ^g, ^h are sometimes used to indicate that these values are
estimates of f , g , h
^f(n) = ^g(n) + ^h(n)
• If h(n) ≤ the actual cost of the shortest path from node n to the goal, then h(n) is an
underestimate.
AI - Search and Control of Strategies
• Estimate cost function g*
The estimated least cost path from start node to node n is written as g*(n).
• g* is calculated as the actual cost, so far, of the explored path.
• g* is known exactly by summing all path costs from start to current state.
• If search space is a tree, then g* = g, because there is only one path from start
node to current node.
• In general, the search space is a graph.
• If the search space is a graph, then g* ≥ g:
■ g* can never be less than the cost of the optimal path; it can only overestimate
the cost.
■ g* can be equal to g in a graph if the paths are chosen properly.
• Estimate heuristic function h*
The estimated least cost path from node n to goal node is written h*(n).

Check Your Progress
4. Production systems provide appropriate structures for performing and describing search processes. (True or False)
5. The Stacks and Queues are data structures that maintain the order of first-in, first-out and last-in, first-out respectively. (True or False)
6. Heuristic algorithms are more efficient than traditional algorithms. (True or False)
7. What is a Breadth-first search (BFS)?
8. Heuristics may not find the best solution every time but guarantee that they find a good solution in reasonable time. (True or False)
• h* is heuristic information, represents a guess at: ‘How hard it is to reach
from current node to goal state ?’.
• h* may be estimated using an evaluation function f(n) that measures
‘goodness’ of a node.
• h* may have different values; the values lie in the range 0 ≤ h*(n) ≤ h(n); different
values yield different search algorithms.
• If h* = h, it is a perfect heuristic; means no unnecessary nodes are ever
expanded.
2.5 PROBLEM CHARACTERISTICS
Heuristics cannot be generalized, as they are domain specific. Production systems
provide ideal techniques for representing such heuristics in the form of IF-THEN
rules. Most problems requiring simulation of intelligence use heuristic search
extensively. Some heuristics are used to define the control structure that guides the
search process, as seen in the example described above. But heuristics can also be
encoded in the rules to represent the domain knowledge. Since most AI problems
make use of knowledge and guided search through the knowledge, AI can be
described as the study of techniques for solving exponentially hard problems in
polynomial time by exploiting knowledge about problem domain.
To use the heuristic search for problem solving, we suggest analysis of the
problem for the following considerations:
• Decomposability of the problem into a set of independent smaller subproblems
• Possibility of undoing solution steps, if they are found to be unwise
• Predictability of the problem universe
• Possibility of obtaining an obvious solution to a problem without comparison
of all other possible solutions
• Type of the solution: whether it is a state or a path to the goal state
• Role of knowledge in problem solving
• Nature of solution process: with or without interacting with the user
The general classes of engineering problems such as planning, classification,
diagnosis, monitoring and design are generally knowledge intensive and use a large
amount of heuristics. Depending on the type of problem, the knowledge
representation schemes and control strategies for search are to be adopted. Combining
heuristics with the two basic search strategies have been discussed above. There are
a number of other general-purpose search techniques which are essentially heuristics
based. Their efficiency primarily depends on how they exploit the domain-specific
knowledge to prune undesirable paths. Such search methods are called ‘weak
methods’, since the progress of the search depends heavily on the way the domain
knowledge is exploited. A few of such search techniques which form the centre of
many AI systems are briefly presented in the following sections.
2.5.1 Problem Decomposition
Suppose the expression to be solved is: ∫(x³ + x² + 2x + 3 sin x) dx
The integral of a sum decomposes into smaller integrals, each of which is solved by one specific rule:
∫x³ dx = x⁴/4
∫x² dx = x³/3
∫2x dx = 2∫x dx = x²
∫3 sin x dx = 3∫sin x dx = -3 cos x
This problem can be solved by breaking it into smaller problems, each of
which we can solve by using a small collection of specific rules. Using this technique
of problem decomposition, we can solve very large problems very easily. This can
be considered as an intelligent behaviour.
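The decomposition above can be sketched as a tiny term-by-term integrator. The term encoding here is invented for this illustration and covers only the two rules the example needs: the power rule and the rule for sin x.

```python
# A toy sketch of problem decomposition: the integral of a sum is
# split into independent sub-integrals, each solved by one small rule.

def integrate_term(term):
    kind, coeff, *rest = term
    if kind == "poly":                 # power rule: ∫ c·x^n dx = c·x^(n+1)/(n+1)
        n = rest[0]
        return ("poly", coeff / (n + 1), n + 1)
    if kind == "sin":                  # ∫ c·sin x dx = -c·cos x
        return ("cos", -coeff)
    raise ValueError(kind)

# x^3 + x^2 + 2x + 3 sin x, decomposed into four smaller problems
expr = [("poly", 1, 3), ("poly", 1, 2), ("poly", 2, 1), ("sin", 3)]
antiderivative = [integrate_term(t) for t in expr]
print(antiderivative)
```

Each sub-problem is solved independently, and the four partial answers together form the answer to the original, larger problem.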
2.5.2 Can Solution Steps be Ignored?
Suppose we are trying to prove a mathematical theorem: first we proceed considering
that proving a lemma will be useful. Later we realize that it is not at all useful. We
start with another one to prove the theorem. Here we simply ignore the first method.
Consider the 8-puzzle problem to solve: we make a wrong move and realize
that mistake. But here, the control strategy must keep track of all the moves, so that
we can backtrack to the initial state and start with some new move.
Consider the problem of playing chess. Here, once we make a move we never recover
from that step. These problems are illustrated in the three important classes of
problems mentioned below:
1. Ignorable, in which solution steps can be ignored.
Eg: Theorem Proving
2. Recoverable, in which solution steps can be undone.
Eg: 8-Puzzle
3. Irrecoverable, in which solution steps cannot be undone.
Eg: Chess
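The difference between these classes shows up in what the control strategy must remember. A minimal sketch of a strategy for a recoverable problem keeps every move on a stack so that a bad move can be undone; the states and move names here are invented for illustration.

```python
class RecoverableSolver:
    """Control strategy for a recoverable problem (e.g. the 8-puzzle):
    every move is pushed on a stack so that a bad move can be undone."""

    def __init__(self, state):
        self.state = state
        self.history = []               # stack of (move, previous_state)

    def apply(self, move, new_state):
        self.history.append((move, self.state))
        self.state = new_state

    def undo(self):
        move, previous = self.history.pop()
        self.state = previous
        return move                     # the move that was taken back

s = RecoverableSolver("A")
s.apply("slide-left", "B")
s.apply("slide-up", "C")                # turns out to be a mistake
undone = s.undo()                       # recoverable: back to state "B"
print(s.state, undone)                  # B slide-up
```

In an ignorable problem the stack is unnecessary, and in an irrecoverable problem like chess no such undo operation exists at all.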
2.5.3 Is the Problem Universe Predictable?
Consider the 8-Puzzle problem. Every time we make a move, we know exactly
what will happen. This means that it is possible to plan an entire sequence of moves
and be confident what the resulting state will be. We can backtrack to earlier moves
if they prove unwise.
Suppose we want to play Bridge. We need to plan before the first play, but we
cannot play with certainty. So, the outcome of this game is very uncertain. In case
of 8-Puzzle, the outcome is very certain. To solve uncertain outcome problems, we
follow the process of plan revision as the plan is carried out and the necessary
feedback is provided. The disadvantage is that the planning in this case is often very
expensive.
2.5.4 Is Good Solution Absolute or Relative?
Consider the problem of answering questions based on a database of simple facts
such as the following:
1. Siva was a man.
2. Siva was a worker in a company.
3. Siva was born in 1905.
4. All men are mortal.
5. All workers in the company died when there was an accident in 1952.
6. No mortal lives longer than 100 years.
Suppose we ask a question: ‘Is Siva alive?’
By representing these facts in a formal language, such as predicate logic, and
then using formal inference methods we can derive an answer to this question easily.
There are two ways to answer the question shown below:
Method I:
1. Siva was a man.
2. Siva was born in 1905.
3. All men are mortal.
4. Now it is 2008, so Siva’s age is 103 years.
5. No mortal lives longer than 100 years.
Method II:
1. Siva is a worker in the company.
2. All workers in the company died in 1952.
Answer: Siva is not alive. Both methods lead to this answer.
We are interested only in answering the question; it does not matter which path we
follow. If one path leads successfully to the correct answer, there is no reason to go
back and check whether another path would also lead to the solution.
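Method II can be sketched as a tiny forward-chaining inference over the given facts. The tuple encoding of facts and the single rule function are invented for this illustration.

```python
# A minimal forward-chaining sketch of Method II.

facts = {
    ("worker", "Siva"),                # fact 2: Siva was a worker
}

def rule_workers_died(facts):
    """All workers in the company died in the 1952 accident."""
    return {("died_in", who, 1952)
            for kind, who in facts if kind == "worker"}

facts |= rule_workers_died(facts)      # apply the rule once

def is_alive(person, facts):
    return not any(f[0] == "died_in" and f[1] == person for f in facts)

print(is_alive("Siva", facts))         # False: Siva is not alive
```

Once the rule has fired, the derived fact answers the question directly, with no need to explore the longer chain of Method I.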
2.6 CHARACTERISTICS OF PRODUCTION SYSTEMS
Production systems provide us with good ways of describing the operations that can
be performed in a search for a solution to a problem.
At this time, two questions may arise:
1. Can production systems be described by a set of characteristics? And how
can they be easily implemented?
2. What relationships are there between the problem types and the types of
production systems well suited for solving the problems?
To answer these questions, first consider the following definitions of classes
of production systems:
1. A monotonic production system is a production system in which the application
of a rule never prevents the later application of another rule that could also
have been applied at the time the first rule was selected.
2. A nonmonotonic production system is one in which this is not true.
3. A partially commutative production system is a production system with
the property that if the application of a particular sequence of rules transforms
state P into state Q, then any combination of those rules that is allowable also
transforms state P into state Q.
4. A commutative production system is a production system that is both
monotonic and partially commutative.
Table 2.1 Four Categories of Production Systems

Production System           Monotonic            Non-monotonic
Partially Commutative       Theorem Proving      Robot Navigation
Non-partially Commutative   Chemical Synthesis   Bridge
Is there any relationship between classes of production systems and classes
of problems? For any solvable problem, there exists an infinite number of production
systems that show how to find solutions. Any problem that can be solved by any
production system can be solved by a commutative one, but the commutative one is
practically useless. It may use individual states to represent entire sequences of
applications of rules of a simpler, non-commutative system. In the formal sense,
there is no relationship between kinds of problems and kinds of production systems
since all problems can be solved by all kinds of systems. But in the practical sense,
there is definitely such a relationship between the kinds of problems and the kinds
of systems that lend themselves to describing those problems.
Partially commutative, monotonic production systems are useful for solving
ignorable problems. This matters from an implementation point of view, because
such systems can be implemented without the ability to backtrack to previous states
when it is discovered that an incorrect path has been followed. Both types of partially commutative production
systems are significant from an implementation point of view, since they tend to
lead to many duplications of individual states during the search process.
Production systems that are not partially commutative are useful for many
problems in which permanent changes occur.
2.7 DESIGN OF SEARCH PROGRAMS AND SOLUTIONS
Solutions can be good in different ways. They can be good in terms of time or
storage or in difficulty of the algorithm. In case of the travelling salesman problem,
finding the best path can lead to a significant amount of computation. The solution
of such problems is generally only feasible by using heuristics. Suppose, in such a
problem, one path of 8850 miles is found and then another of 7750 miles. It is clear
that the second is better than the first, but is it the best? Establishing that may need
infinite time, so heuristics are usually used to find a very good path in finite time.
2.7.1 Design of Search Programs
Each search process can be considered to be a tree traversal. The object of the
search is to find a path from the initial state to a goal state using a tree. The number
of nodes generated might be huge; and in practice many of the nodes would not be
needed. The secret of a good search routine is to generate only those nodes that are
likely to be useful, rather than having a precise tree. The rules are used to represent
the tree implicitly and only to create nodes explicitly if they are actually to be of
use.
The following issues arise when searching:
• The tree can be searched forward from the initial node to the goal state or
backwards from the goal state to the initial state.
• To select applicable rules, it is critical to have an efficient procedure for
matching rules against states.
• How to represent each node of the search process? This is the knowledge
representation problem or the frame problem. In games, an array suffices; in
other problems, more complex data structures are needed.
Finally, in terms of data structures, considering the water jug problem as a typical
case: do we use a graph or a tree? A breadth-first search keeps note of all the nodes
generated, while a depth-first search can be modified to do so.
2.7.1.1 Check duplicate nodes
1. Examine the set of nodes that have already been generated to see whether the
new node already exists.
2. If it does not exist, add it to the graph.
3. If it already exists, then:
a. Set the node that is being expanded to point to the already existing
node corresponding to its successor, rather than to the new one. The new
node can be thrown away.
b. If the best or shortest path is being determined, check whether the new path is
better or worse than the old one. If worse, do nothing.
If better, save the new path and propagate the change in length through the chain of
successor nodes, if necessary.
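This duplicate-node bookkeeping is exactly what a uniform-cost graph search does: a table of the best known cost to each node lets a better path replace a worse one. A minimal sketch, with an illustrative graph:

```python
import heapq

def uniform_cost_search(start, goal, neighbours):
    best_g = {start: 0}                  # best known path cost per node
    frontier = [(0, start)]
    while frontier:
        g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue                     # stale entry: a better path exists
        for nxt, cost in neighbours(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g      # better path found: replace the old one
                heapq.heappush(frontier, (new_g, nxt))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 1), ("G", 7)],
         "B": [("G", 2)], "G": []}
print(uniform_cost_search("S", "G", lambda n: graph[n]))   # 4, via S-A-B-G
```

Without the `best_g` check, the same state would be re-expanded once per path that reaches it, which is the duplication the text warns about.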
Example: Tic-Tac-Toe
State spaces are good representations for board games such as Tic-Tac-Toe. The
position of a game can be explained by the contents of the board and the player
whose turn is next. The board can be represented as an array of 9 cells, each of
which may contain an X or O or be empty.
• State:
■ Player to move next: X or O.
■ Board configuration, for example:
    X | O | O
      | X |
      |   | X
• Operators: Change an empty cell to X or O.
• Start State: Board empty; X’s turn.
• Terminal States: Three X’s in a row; Three O’s in a row; All cells full.
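The state description above can be sketched directly. The board is encoded as a 9-character string; this is only a sketch of the representation, not a game-playing program.

```python
# Tic-Tac-Toe as a state space: a 9-cell board, operators that fill an
# empty cell, and a terminal test for three-in-a-row or a full board.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def moves(board, player):
    """Operator: change any empty cell to the player's mark."""
    return [board[:i] + player + board[i + 1:]
            for i, c in enumerate(board) if c == " "]

def terminal(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]                    # "X" or "O" wins
    return "draw" if " " not in board else None

start = " " * 9                                # empty board, X to move
print(len(moves(start, "X")))                  # 9 possible first moves
print(terminal("XXX" + " " * 6))               # X
```

Applying `moves` repeatedly from the start state generates the search tree described below, one ply per application.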
Search Tree
The sequence of states formed by possible moves is called a search tree. Each level
of the tree is called a ply.
Since the same state may be reachable by different sequences of moves, the
state space may in general be a graph. It may be treated as a tree for simplicity, at
the cost of duplicating states.
2.7.1.2 Solving problems using search
• Given an informal description of the problem, construct a formal description
as a state space:
■ Define a data structure to represent the state.
■ Make a representation for the initial state from the given data.
■ Write programs to represent operators that change a given state
representation to a new state representation.
■ Write a program to detect terminal states.
• Choose an appropriate search technique:
■ How large is the search space?
■ How well structured is the domain?
■ What knowledge about the domain can be used to guide the search?
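Applied to the water jug problem mentioned earlier (a 4-litre and a 3-litre jug, goal: exactly 2 litres in the 4-litre jug), the steps above give a small breadth-first program. The state encoding `(big, small)` is one possible choice.

```python
from collections import deque

def successors(state):
    big, small = state
    yield (4, small)                      # fill the 4-litre jug
    yield (big, 3)                        # fill the 3-litre jug
    yield (0, small)                      # empty the 4-litre jug
    yield (big, 0)                        # empty the 3-litre jug
    pour = min(big, 3 - small)            # pour big into small
    yield (big - pour, small + pour)
    pour = min(small, 4 - big)            # pour small into big
    yield (big + pour, small - pour)

def bfs(start, goal_test):
    seen, frontier = {start}, deque([(start, [start])])
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [nxt]))

path = bfs((0, 0), lambda s: s[0] == 2)
print(path)        # a shortest sequence of states reaching 2 litres
```

The four steps map directly onto the code: the tuple is the state data structure, `(0, 0)` the initial state, `successors` the operators, and the lambda the terminal-state test.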
2.7.2 Additional Problems
The additional problems are given below.
2.7.2.1 The Monkey and Bananas Problem
The monkey is in a closed room in which there is a small table. There is a bunch of
bananas hanging from the ceiling but the monkey cannot reach them. Let us assume
that the monkey can move the table and if the monkey stands on the table, the
monkey can reach the bananas. Establish a method to instruct the monkey on how
to capture the bananas.
1. Task Environment
A. Information
Set of places: {on the floor, place1, place2, under the bananas}
B. Operators
CLIMB
Pretest: the monkey's place is the table's place.
Move: the monkey is on the table.
WALK (variable x ranges over the set of places)
Move: the monkey's place becomes x.
MOVE TABLE (variable x ranges over the set of places)
Pretest: the monkey's place is the table's place.
Move: the monkey's place becomes x; the table's place becomes x.
GET BANANAS
Pretest: the table's place is under the bananas and the monkey is on the table.
Move: the contents of the monkey's hand are the bananas.
C. Differences
d1: the monkey's place
d2: the table's place
d3: the contents of the monkey's hand
D. Difference ordering
d3 is harder to reduce than d2, which is harder to reduce than d1.
Specific Task
Goal: transform the initial object into the desired object.
Objects
Initial: the monkey's place is place1; the table's place is place2; the contents of the monkey's hand are empty.
Desired: the contents of the monkey's hand are the bananas.
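The task environment above can be turned into a small state-space search. The state encoding (monkey place, table place, on-table flag, has-bananas flag) is one possible formalization, not the only one, and climbing down is omitted for brevity.

```python
from collections import deque

PLACES = ["place1", "place2", "under-bananas"]

def successors(s):
    monkey, table, on_table, has = s
    if not on_table:
        for p in PLACES:                       # WALK to p
            yield (p, table, False, has)
        if monkey == table:
            for p in PLACES:                   # MOVE TABLE to p
                yield (p, p, False, has)       # the monkey moves with it
            yield (monkey, table, True, has)   # CLIMB onto the table
    elif table == "under-bananas":
        yield (monkey, table, True, True)      # GET BANANAS

def plan(start):
    seen, frontier = {start}, deque([(start, [])])
    while frontier:
        state, path = frontier.popleft()
        if state[3]:                           # holding the bananas
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [nxt]))

start = ("place1", "place2", False, False)
route = plan(start)
print(route)       # walk to the table, move it under the bananas,
                   # climb, grab: four steps
```

The operator preconditions ("Pretests") appear as the `if` guards, and the difference ordering is respected automatically: the search must reduce d1 (walk) before d2 (move table) before d3 (get bananas).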
2.8 SUMMARY
In this unit, you have learned about problems, their definitions, characteristics and
design of search programs. You have also learned how to define the problem in a
state space research. A state space represents a problem in terms of states and
operators that change states. You have further learned about problem solving which
is a process of generating solutions from observed or given data.
In this unit, you have also studied the two key steps that must be taken
towards designing a program to solve a particular problem:
1. First, define the problem accurately: specify the problem space, the
operators for moving within the space, and the starting and goal states.
2. Second, analyse the problem to determine where it falls among the problem
characteristics.
Finally, consider the last two steps for developing a program to solve that
problem:
3. Identify and represent the knowledge required by the task.
4. Choose one or more techniques to solve the problem and apply those
techniques.
Gradually, the relationship between the problem characteristics and specific
techniques should become clearer and clearer.
2.9 KEY TERMS
• Problem solving: It is a process of generating solutions from observed or
given data. It is however not always possible to use direct methods (i.e. go
directly from data to solution).
• Problem: A problem is defined by its ‘elements’ and their ‘relations’.
• State space search: A state space represents a problem in terms of states and
operators that change states.
• Production systems: These provide appropriate structures for performing
and describing search processes.
• Stacks and queues: The Stacks and Queues are data structures that maintain
the order of last-in, first-out and first-in, first-out respectively.
2.10 ANSWERS TO ‘CHECK YOUR PROGRESS’
1. A search refers to the search for a solution in a problem space.
2. True
3. A problem is defined by its ‘elements’ and their ‘relations’.
4. True
5. False
6. True
7. A search strategy in which the highest layer of a decision tree is searched
completely before proceeding to the next layer is called Breadth-first search
(BFS).
8. True
2.11 QUESTIONS AND EXERCISES
1. Discuss the meaning of ‘The first requirement of a good control strategy is
that it causes motion’.
2. Describe, with the necessary diagrams, a suitable state space representation
for the 8-puzzle problem, and explain how the problem can be solved by state
space search. Show how heuristics can improve the efficiency of the search.
3. State the travelling salesman problem and explain: Combinatorial explosion,
branch and bound technique and nearest neighbour heuristic.
4. Show that the Tower of Hanoi problem can be classified under the area of AI.
Give a state space representation of the problem.
2.12 FURTHER READING
1. www.mm.iit.uni-miskolc.hu
2. www.induastudychannel.com
3. www.mtsu.edu
4. www.cosc.brocku.com
5. www.cc.gatech.edu
6. www.cosmo.marymount.edu
7. www.tu-berlin.de
8. www.dubmail.gcd.ie
9. www.alanturing.net
10. www.en.wikipedia.org
11. www.cds.caltech.edu
12. www.dl.icr.org
13. www.cs.cardiff.ac.uk
14. www.niceindia.com
15. www.rdspc.com
16. www.cs.wise.edu
17. www.web.inet-tr.org.tr
18. www.books.google.com
19. www.acs.ilstu.edu
20. www.amsglossary.allenpress.com
21. www.sjc.edu.hk
22. www.lif-sud.uni-mrs.fr
23. Albrecht. Encyclopedia of Disability
24. Kumar, Deepak. 1995. ‘Undergraduate AI and its Non-imperative
Prerequisite’. ACM SIGART Bulletin, 4th Jan.
25. Zhou, R. 2006. ‘Breadth-first Heuristic Search’. Artificial Intelligence.
26. Bagchi, A. 1985. ‘Three Approaches to Heuristic Search in Networks’. Journal
of the ACM, 1st Jan.
27. Chandrasekaran, B., T. Johnson and R. Smit. 1992. ‘Task-Structure Analysis
for Knowledge Modeling (Technical)’. Communications of the ACM, Sept.
UNIT 3 KNOWLEDGE REPRESENTATION ISSUES
Structure
3.0 Introduction
3.1 Unit Objectives
3.2 Knowledge Representation
    3.2.1 Approaches to AI Goals
    3.2.2 Fundamental System Issues
    3.2.3 Knowledge Progression
    3.2.4 Knowledge Model
    3.2.5 Knowledge Typology Map
3.3 Representation and Mappings
    3.3.1 Framework of Knowledge Representation
    3.3.2 Representation of Facts
    3.3.3 Using Knowledge
3.4 Approaches to Knowledge Representation
    3.4.1 Properties for Knowledge Representation Systems
    3.4.2 Simple Relational Knowledge
3.5 Issues in Knowledge Representation
    3.5.1 Important Attributes and Relationships
    3.5.2 Granularity
    3.5.3 Representing Set of Objects
    3.5.4 Finding Right Structure
3.6 The Frame Problem
    3.6.1 The Frame Problem in Logic
3.7 Summary
3.8 Key Terms
3.9 Questions and Exercises
3.10 Further Reading

3.0 INTRODUCTION
In the preceding unit, you learnt about problems, their definitions, characteristics
and design of search programs. In this unit, you will learn about the various aspects
of knowledge representation. Let us start with the observation that we have no
perfect method of knowledge representation today. This stems largely from our
ignorance of just what knowledge is. Nevertheless many methods have been worked
out and used by AI researchers.
You will learn that the knowledge representation problem concerns the
mismatch between human and computer ‘memory’, i.e., how to encode knowledge
so that it is a faithful reflection of the expert’s knowledge and can be manipulated
by computers.
We call these representations of knowledge ‘knowledge bases’, and the
manipulative operations on these knowledge bases are known as inference
engine programs. In this unit you will learn about the frame problem, which is the
problem of representing the facts that change as well as those that do not change.
For example, consider a table with a plant on it under a window. Suppose, it is
moved to the centre of the room. Here, it must be inferred that the plant is now in
the centre but the window is not.
3.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Explain the different knowledge representation issues
• Understand representation and mapping
• Describe the various approaches to knowledge representation
• Explain the issues in knowledge representation
• Describe the frame problem
3.2 KNOWLEDGE REPRESENTATION

An example of knowledge expressed in first-order logic:
∀x: animal(x) → eat(x, fruit) ∨ eat(x, meat)
• What is knowledge?
• How do we search through knowledge?
• How do we get knowledge?
• How do we get a computer to understand knowledge?
What AI researchers call ‘knowledge’ appears as data at the level of
programming. Data becomes knowledge when a computer program represents and
uses the meaning of some data. Many knowledge-based programs are written in the
LISP programming language, which is designed to manipulate data as symbols.
Knowledge may be declarative or procedural. Declarative knowledge is
represented as a static collection of facts with a set of procedures for manipulating
the facts. Procedural knowledge is described by an executable code that performs
some action. Procedural knowledge refers to ‘how-to’ do something. Usually, there
is a need for both kinds of knowledge representation to capture and represent
knowledge in a particular domain.
First-order predicate calculus (FOPC) is the best-understood scheme for
knowledge representation and reasoning. In FOPC, knowledge about the world is
represented as objects and relations between objects. Objects are real-world things
that have individual identities and properties, which are used to distinguish the
things from other objects. In a first-order predicate language, knowledge about the
world is expressed in terms of sentences that are subject to the language’s syntax
and semantics.
3.2.1 Approaches to AI Goals
There are four approaches to the goals of AI: (1) computer systems that act like
humans; (2) programs that simulate the human mind; (3) knowledge representation
and mechanistic reasoning; and (4) intelligent or rational agent design. The first
two approaches focus on studying humans and how they solve problems, while the
latter two approaches focus on studying real-world problems and developing rational
solutions regardless of how a human would solve the same problems.
Programming a computer to act like a human is a difficult task and requires
that the computer system be able to understand and process commands in natural
language, store knowledge, retrieve and process that knowledge in order to derive
conclusions and make decisions, learn to adapt to new situations, perceive objects
through computer vision and have robotic capabilities to move and manipulate
objects. Although this approach was inspired by the Turing Test, most programs
have been developed with the goal of enabling computers to interact with humans
in a natural way rather than passing the Turing Test.
Some researchers focus instead on developing programs that simulate the
way in which the human mind works on problem-solving tasks. The first attempt to
imitate human thinking was the Logic Theorist and the General Problem Solver
programs developed by Allen Newell and Herbert Simon. Their main interest was
in simulating human thinking rather than solving problems correctly. Cognitive
science is the interdisciplinary field that studies the human mind and intelligence.
The basic premise of cognitive science is that the mind uses representations that are
similar to computer data structures and computational procedures that are similar
to computer algorithms that operate on those structures.
Other researchers focus on developing programs that use logical notation to
represent a problem and use formal reasoning to solve a problem. This is called the
‘logicist approach’ to developing intelligent systems. Such programs require huge
computational resources to create vast knowledge bases and to perform complex
reasoning algorithms. Researchers continue to debate whether this strategy will
lead to computer problem solving at the level of human intelligence.
Still other researchers focus on the development of ‘intelligent agents’ within
computer systems. Russell and Norvig define these agents as ‘anything that can be
viewed as perceiving its environment through sensors and acting upon that
environment through effectors.’ The goal for computer scientists working in this
area is to create agents that incorporate information about the users and the use of
their systems into the agents’ operations.
3.2.2 Fundamental System Issues
A robust AI system must be able to store knowledge, apply that knowledge to the
solution of problems and acquire new knowledge through experience. Among the
challenges that face researchers in building AI systems, there are three that are
fundamental: knowledge representation, reasoning and searching, and learning.
How do we represent what we know?
• Knowledge is a general term.
An answer to the question, ‘how to represent knowledge’, requires an analysis
to distinguish between knowledge ‘how’ and knowledge ‘that’.
■ knowing ‘how to do something’, e.g. ‘how to drive a car’, is Procedural knowledge.
■ knowing ‘that something is true or false’, e.g. knowing the speed limit for a car on a motorway, is Declarative knowledge.
• Knowledge and representation are distinct entities that play central but
distinguishable roles in the intelligence system.
■ Knowledge is a description of the world. It determines a system’s
competence by what it knows.
■ Representation is the way knowledge is encoded. It defines the performance
of a system in doing something.
• Different types of knowledge require different kinds of representation.
The Knowledge Representation models/mechanisms are often based on:
■ Logic
■ Rules
■ Frames
■ Semantic Net
• Different types of knowledge require different kinds of reasoning.
Knowledge is a general term. Knowledge is a progression that starts with
data which is of limited utility. By organizing or analysing the data, we understand
what the data means, and this becomes information. The interpretation or evaluation
of information yields knowledge. An understanding of the principles within the
knowledge is wisdom.
3.2.3 Knowledge Progression
[Figure: Data, by organizing and analysing, becomes Information; Information, by interpretation and evaluation, becomes Knowledge; Knowledge, by an evolving understanding of principles, becomes Wisdom.]
Fig. 3.1 Knowledge Progression
Data is viewed as a collection of disconnected facts.
Eg: It is raining.
Information emerges when relationships among facts are established and
understood. Providing answers to ‘who’, ‘what’, ‘where’, and ‘when’ gives the
relationships among facts.
Eg: The temperature dropped 15 degrees and it started raining.
Knowledge emerges when relationships among patterns are identified and
understood. Answering ‘how’ gives the relationships among patterns.
Eg: If the humidity is very high, temperature drops substantially, and then atmosphere
holds the moisture, so it rains.
Wisdom is the understanding of the principles of relationships that describe patterns.
Providing the answer for ‘why’ gives understanding of the interaction between
patterns.
Eg: Understanding of all the interactions that may happen between raining,
evaporation, air currents, temperature gradients and changes.
Let us look at the various kinds of knowledge that an AI system might need to represent:
Objects
Objects are defined through facts.
Eg: Guitars have strings; trumpets are brass instruments.
Events
Events are actions.
Eg: Vinay played the guitar at the farewell party.
Performance
Playing the guitar involves behavioural knowledge about how to do things.
Meta-knowledge
Knowledge about what we know.
To solve problems in AI, we must represent knowledge and must deal with the
entities.
Facts
Facts are truths about the real world on what we represent. This can be considered
as knowledge level.
3.2.4 Knowledge Model
The knowledge model shows that as the level of ‘connectedness’ and
‘understanding’ increases, we progress from data through information and
knowledge to wisdom.
The model represents transitions and understanding.
• The transitions are from data to information, information to knowledge, and
finally knowledge to wisdom;
• The support of understanding is the transitions from one stage to the next
stage.
The distinctions between data, information, knowledge and wisdom are not
very discrete. They are more like shades of gray, rather than black and white.
Data and information deal with the past and are based on gathering facts and
adding context.
Knowledge deals with the present and that enables us to perform.
Wisdom deals with the future vision for what will be rather than for what it is
or it was.
[Figure: the knowledge model plots degrees of understanding against degrees of connectedness. Data becomes Information by understanding relations; Information becomes Knowledge by understanding patterns; Knowledge becomes Wisdom by understanding principles.]
Fig. 3.2 Knowledge Model
3.2.5 Knowledge Typology Map
According to the topology map:
1. Tacit knowledge derives from the experience, doing action and subjective
insight
2. Explicit knowledge derives from principle, procedure, process and concepts.
3. Information derives from data, context and facts.
4. Knowledge conversion derives from internalization, socialization, wisdom,
combination and externalization.
Finally Knowledge derives from the Tacit and Explicit Knowledge and
Information.
Let’s see the contents of the typology map:
Facts are data or instance that is specific and unique.
Concepts are classes of items or words or ideas that are known by a common name
and share common features.
Processes are flows of events or activities that describe how things work, rather
than how to do things.
Procedures are a series of step-by-step actions and decisions that give the result of
a task.
Principles are guidelines, rules and parameters that govern or monitor.
Fig. 3.3 Knowledge Typology Map (tacit knowledge derives from experience,
doing/action and subjective insight; explicit knowledge from principles,
procedures, processes and concepts; information from data, context and facts;
knowledge conversion from internalization, socialization, externalization,
combination and wisdom)
Principles are the basic building blocks of theoretical models and allow for making
predictions and drawing implications. These artifacts support the knowledge
creation process in creating two knowledge types: declarative and procedural,
which are explained below.
• Knowledge Types
Cognitive psychologists sort knowledge into Declarative and Procedural
categories, and some researchers add Strategic as a third category.
Procedural knowledge
• Examples: procedures, rules, strategies, agendas, models.
• Focuses on tasks that must be performed to reach a particular objective or goal.
• Knowledge about ‘how to do something’; e.g., to determine if Peter or Robert
is older, first find their ages.
Declarative knowledge
• Examples: concepts, objects, facts, propositions, assertions, semantic nets,
logic and descriptive models.
• Refers to representations of objects and events; knowledge about facts and
relationships.
• Knowledge about ‘whether something is true or false’; e.g., a car has four
tyres; Peter is older than Robert.
Note:
About procedural knowledge, there is some disparity in views.
One view is that it is close to tacit knowledge: it manifests itself in the doing
of something, yet cannot be expressed in words; e.g., we read faces and moods.
Another view is that it is close to declarative knowledge: the difference is
that a task or method is described instead of facts or things.
All declarative knowledge is explicit knowledge; it is knowledge that can be
and has been articulated.
Strategic knowledge is considered to be a subset of declarative knowledge.
3.3 REPRESENTATION AND MAPPINGS
When we collect knowledge, we face the problem of how to record it. And
when we try to build the knowledge base, we face a similar problem of how to
represent it.
We could just write down what we are told but as the information grows, it
becomes more and more difficult to keep track of the relationships between the
items.
Let us start with the observation that we have no perfect method of knowledge
representation today.
This stems largely from our ignorance of just what knowledge is.
Nevertheless many methods have been worked out and used by AI researchers.
The knowledge representation problem concerns the mismatch between
human and computer ‘memory’, i.e. how to encode knowledge so that it is a faithful
reflection of the expert’s knowledge and can be manipulated by the computer.
We call these representations of knowledge ‘knowledge bases’, and the
programs implementing the manipulative operations on these knowledge bases,
‘inference engines’.
What to Represent?
Facts: Truths about the real world and what we represent. This can be regarded as
the base knowledge level.
Representation of the facts: That which we manipulate. This can be regarded as
the symbol level since we usually define the representation in terms of symbols that
can be manipulated by programs.
Simple Representation: Simple way to store facts. Each fact about a set of
objects is set out systematically in columns. Little opportunity for inference.
Knowledge basis for inference engines.
3.3.1 Framework of Knowledge Representation
A computer requires a well-defined problem description to process, and it provides
a well-defined, acceptable solution. To collect fragments of knowledge, we first need
to formulate a description in our spoken language and then represent it in a formal
language so that the computer can understand it. The computer can then use an
algorithm to compute an answer. This process is illustrated below.
Fig. 3.4 Knowledge Representation Framework (informal level: problem → solution
via ‘solve’; formal level: the problem is represented formally, an output is computed,
and the output is interpreted as the solution)
The steps are:
• The problem is first formulated informally.
• It is then represented formally, and the computer produces an output.
• This output can then be rendered as an informally described solution that
the user understands or checks for consistency.
Note: Problem solving requires
• Formal knowledge representation, and
• Conversion of informal (implicit) knowledge to formal (explicit) knowledge.
• Knowledge and Representation
Problem solving requires a large amount of knowledge and some mechanism
for manipulating that knowledge.
Knowledge and representation are distinct entities and play central but
distinguishable roles in an intelligent system.
■
Knowledge is a description of the world; it determines a system’s
competence by what it knows.
■
Representation is the way knowledge is encoded; it defines the system’s
performance in doing something.
In simple words, we:
• need to know about things we want to represent, and
• need some means by which we can manipulate things.
Things to represent:
■
Objects - facts about objects in the domain
■
Events - actions that occur in the domain
■
Performance - knowledge about how to do things
■
Meta-knowledge - knowledge about what we know
Means to manipulate:
■
Requires some formalism for what we represent
Thus, knowledge representation can be considered at two levels:
(a) the knowledge level, at which facts are described, and
(b) the symbol level, at which the representations of the objects, defined in terms of
symbols, can be manipulated in the programs.
Note: A good representation enables fast and accurate access to knowledge and
understanding of the content.
3.3.2 Representation of Facts
Representations of facts are those which we manipulate. This can be regarded as
the symbol level since we normally define the representation in terms of symbols
that can be manipulated by programs. We can structure these entities in two levels:
The knowledge level
At which the facts are going to be described.
The symbol level
At which representations of objects are going to be defined in terms of symbols that
can be manipulated in programs.
Fig. 3.5 Two Entities in Knowledge Representation
Natural language (or English) is the way of representing and handling the facts.
Logic enables us to represent the following fact:
Spot is a dog
as dog(Spot)
We can state that all dogs have tails:
∀x: dog(x) → has_a_tail(x)
Logical deduction then yields:
has_a_tail(Spot)
Using a backward mapping function, the English sentence
Spot has a tail can be generated.
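The chain of reasoning above can be sketched as a tiny forward-chaining step; this is only an illustration, with the single rule dog(x) → has_a_tail(x) hard-coded:

```python
# Facts are (predicate, argument) pairs; we start with one stored fact.
facts = {("dog", "Spot")}

def forward_chain(facts):
    """Apply the rule dog(x) -> has_a_tail(x) until no new facts appear."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for pred, arg in list(derived):
            if pred == "dog" and ("has_a_tail", arg) not in derived:
                derived.add(("has_a_tail", arg))
                changed = True
    return derived

print(forward_chain(facts))  # includes ('has_a_tail', 'Spot')
```

The backward mapping step, turning ('has_a_tail', 'Spot') back into the sentence ‘Spot has a tail’, is the part that this sketch leaves out.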
The available mapping functions are not always one-to-one; in fact, they are often
many-to-many, which is a characteristic of English representations. The sentences
All dogs have tails and Every dog has a tail both say that each dog has a tail, but
from the first sentence one could also conclude that each dog has more than one tail
(try substituting teeth for tails). When an AI program manipulates the internal
representation of facts, the new representations should also be interpretable as
representations of facts.
Consider the classic problem of the mutilated chessboard. From a normal
chessboard, two diagonally opposite corner squares have been removed. The given
task is to cover all the squares on the remaining board with dominoes so that each
domino covers exactly two squares. Overlapping of dominoes is not allowed. Consider
three data structures:
Fig. 3.6 Mutilated Chessboard
The first two data structures are shown in the diagrams above; the third
data structure is the number of black squares and the number of white squares. The
first diagram loses the colour of the squares, so a solution is not easy to see. The
second preserves the colours but still does not make the answer obvious. Counting
the number of squares of each colour, giving 32 black and 30 white, yields the
negative answer directly: each domino must cover one white square and one black
square, so the numbers of black and white squares must be equal for a tiling to
exist.
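The counting argument can be checked mechanically. A minimal sketch (the board coordinates and the convention that squares with an even coordinate sum are black are assumptions of this illustration):

```python
def colour_counts(removed):
    """Count (black, white) squares on an 8x8 board after removing squares."""
    black = white = 0
    for r in range(8):
        for c in range(8):
            if (r, c) in removed:
                continue
            if (r + c) % 2 == 0:   # even coordinate sum labelled black here
                black += 1
            else:
                white += 1
    return black, white

# Remove two diagonally opposite corners; they always share a colour.
black, white = colour_counts({(0, 7), (7, 0)})
print(black, white)    # 32 30
print(black == white)  # False -> no perfect domino tiling exists
```

Because each domino always covers one black and one white square, the unequal counts settle the question without any search.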
3.3.3 Using Knowledge
We have briefly discussed above where we can use knowledge. Let us consider how
knowledge can be used in various applications.
Learning
Acquiring knowledge is learning. It simply means adding new facts to a knowledge
base. New data may have to be classified prior to storage for easy retrieval, and
integrated with the existing facts for interaction and inference; redundancy and
replication in the knowledge base must be avoided. Existing facts should also be
updated as needed.
Retrieval
The representation scheme used has a critical effect on the efficiency of retrieval.
Humans are very good at it, and many AI methods have tried to model human retrieval.
Reasoning
Get or infer facts from the existing data.
If a system only knows that:
• Ravi is a Jazz musician.
• All Jazz musicians can play their instruments well.
If the questions are:
Is Ravi a Jazz musician?
OR
Can Jazz musicians play their instruments well?
then the answers are easy to get directly from the data structures and procedures.
However, a question like
Can Ravi play his instrument well?
requires reasoning. The tasks above are all related; for example, it is fairly obvious
that learning and reasoning involve retrieval.
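This kind of one-step reasoning can be sketched as follows (the predicate names and the rule encoding are assumptions of the illustration, not part of the text):

```python
# Stored fact: Ravi is a Jazz musician.
facts = {("jazz_musician", "Ravi")}
# Rule: jazz_musician(x) -> plays_well(x), encoded as (premise, conclusion).
rules = [("jazz_musician", "plays_well")]

def holds(query, facts, rules):
    """Answer a (predicate, subject) query, chaining backward through rules."""
    if query in facts:
        return True
    pred, subj = query
    for premise, conclusion in rules:
        if conclusion == pred and holds((premise, subj), facts, rules):
            return True
    return False

print(holds(("jazz_musician", "Ravi"), facts, rules))  # True (stored fact)
print(holds(("plays_well", "Ravi"), facts, rules))     # True (requires reasoning)
```

The first query is pure retrieval; the second succeeds only because the rule bridges the stored fact and the asked-for conclusion.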
3.4 APPROACHES TO KNOWLEDGE REPRESENTATION
We discuss the various concepts relating to approaches to knowledge representation
in the following sections.
3.4.1 Properties for Knowledge Representation Systems
A knowledge representation system should possess the following properties:
• Representational Adequacy
• Inferential Adequacy
• Inferential Efficiency
• Acquisitional Efficiency
• Well-defined Syntax and Semantics
• Naturalness
• Frame problem
3.4.1.1 Representational adequacy
A knowledge representation scheme must be able to actually represent the knowledge
appropriate to our problem.
3.4.1.2 Inferential adequacy
A knowledge representation scheme must allow us to make new inference from old
knowledge. It means the ability to manipulate the knowledge represented to produce
new knowledge corresponding to that inferred from the original. It must make
inferences that are:
• Sound – The new knowledge actually does follow from the old knowledge.
• Complete – It should make all the right inferences.
Here soundness is usually easy, but completeness is very hard.
Example: Given the knowledge
Tom is a man.
All men are mortal.
the inference
James is mortal
is not sound, whereas
Tom is mortal
is sound.
3.4.1.3 Inferential efficiency
A knowledge representation scheme should be tractable, i.e. make inferences in
reasonable time. Unfortunately, any knowledge representation scheme with
interesting expressive power is not going to be efficient; often, the more general
knowledge representation schemes are less efficient. We have to use knowledge
representation schemes tailored to the problem domain, i.e., less general but more
efficient. Inferential efficiency also includes the ability to direct the inferential
mechanisms in the most productive directions by storing appropriate guides.
3.4.1.4 Acquisitional efficiency
The scheme should acquire new knowledge using automatic methods wherever
possible, rather than relying on human intervention. To date, no single system
optimizes all of these properties. We will now discuss some of the representation
schemes.
3.4.1.5 Well-defined syntax and semantics
It should be possible to tell:
• Whether any construction is ‘grammatically correct’.
• How to read any particular construction, i.e. no ambiguity.
Thus a knowledge representation scheme should have well-defined syntax. It
should be possible to precisely determine for any given construction, exactly what
its meaning is. Thus a knowledge representation scheme should have well-defined
semantics. Here syntax is easy, but semantics is hard for a knowledge representation
scheme.
3.4.1.6 Naturalness
A knowledge representation scheme should closely correspond to our way of
thinking, reading and writing. A knowledge representation scheme should allow a
knowledge engineer to read and check the knowledge base. Simply put, knowledge
representation is the problem of representing what the computer knows.
3.4.1.7 Frame problem
The frame problem is the problem of representing the facts that change as well as
those that do not change. For example, consider a table with a plant on it, standing
under a window. Suppose we move the table to the centre of the room. We must
infer that the plant is now in the centre of the room but that the window has not
moved. Frame axioms are used to describe all the things that do not change when
an operator is applied to go from one state, say n, to the next state, n + 1.
3.4.2 Simple Relational Knowledge
• Simple way to store facts.
• Each fact about a set of objects is set out systematically in columns.
• Little opportunity for inference.
• Knowledge basis for inference engines.
Musician    Instrument    Style          Age
Ravi        Trumpet       Jazz           36
John        Saxophone     Avant Garde    35
Robert      Guitar        Rock           43
Krishna     Guitar        Jazz           47

Fig 3.7 Simple Relational Knowledge
We ask things like:
• Who is dead?
• Who plays Jazz/Trumpet, etc.?
This sort of representation is popular in database systems.
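A relational table like the one above can be queried directly; a minimal sketch, with the rows taken from Fig. 3.7 and a hypothetical query helper:

```python
musicians = [
    {"name": "Ravi",    "style": "Jazz",        "instrument": "Trumpet",   "age": 36},
    {"name": "John",    "style": "Avant Garde", "instrument": "Saxophone", "age": 35},
    {"name": "Robert",  "style": "Rock",        "instrument": "Guitar",    "age": 43},
    {"name": "Krishna", "style": "Jazz",        "instrument": "Guitar",    "age": 47},
]

def who_plays(style=None, instrument=None):
    """Return the names of musicians matching the given style/instrument."""
    return [m["name"] for m in musicians
            if (style is None or m["style"] == style)
            and (instrument is None or m["instrument"] == instrument)]

print(who_plays(style="Jazz"))         # ['Ravi', 'Krishna']
print(who_plays(instrument="Guitar"))  # ['Robert', 'Krishna']
```

Such lookups are exactly what database systems provide; note that there is little room for inference beyond matching stored columns.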
3.4.2.1 Inheritable knowledge
Fig. 3.8 Property Inheritance Hierarchy (Musician isa Adult Male, with age 35;
Jazz and Avant Garde/Jazz isa Musician; John and Ravi are instances, with bands
Naked City, Ravi's Group and Ravi's Quintet, Hyderabad)
Relational knowledge is made up of objects consisting of
• attributes
• corresponding associated values
We can extend the base by allowing inference mechanisms:
• Property inheritance
■
Elements inherit values from being members of a class.
■
Data must be organized into a hierarchy of classes.
• Boxed nodes – objects and values of attributes of objects.
• Values can be objects with attributes and so on.
• Arrows – point from object to its value.
• This structure is known as a slot and filler structure, semantic network or a
collection of frames.
The algorithm to retrieve a value for an attribute of an instance object is:
Algorithm: Property Inheritance
1. Find the object in the knowledge base.
2. If there is a value there for the attribute, report it.
3. Otherwise, look for a value of the attribute instance; if there is none, fail.
4. Otherwise, go to the node that instance points to and look for a value for the
attribute; if one is found, report it.
5. Otherwise, search upward through isa links until a value is found for the
attribute, and report it.
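The algorithm can be sketched in code. The tiny knowledge base below loosely mirrors Fig. 3.8; the slot names instance and isa follow the text, while everything else is an assumption of the illustration:

```python
kb = {
    "Adult Male": {"age": 35},
    "Musician":   {"isa": "Adult Male"},
    "Jazz":       {"isa": "Musician"},
    "Ravi":       {"instance": "Jazz", "band": "Ravi's Group"},
}

def get_value(obj, attr):
    """Retrieve attr from obj, falling back through instance and isa links."""
    node = kb.get(obj)
    while node is not None:
        if attr in node:
            return node[attr]          # value stored directly at this node
        parent = node.get("instance") or node.get("isa")
        node = kb.get(parent) if parent else None
    return None                        # no value found anywhere -> fail

print(get_value("Ravi", "band"))  # Ravi's Group (stored directly)
print(get_value("Ravi", "age"))   # 35 (inherited via Jazz -> Musician -> Adult Male)
```

The second query shows property inheritance at work: no age is stored for Ravi, so the value is found by climbing the instance/isa chain.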
3.4.2.2 Inferential knowledge
Represent knowledge as formal logic:
All dogs have tails:
∀x: dog(x) → has_a_tail(x)
Advantages:
• A set of strict rules.
■
Can be used to derive more facts.
■
Truths of new statements can be verified.
■
Guaranteed correctness.
• Many inference procedures available to implement standard rules of logic.
• Popular in AI systems. Ex. Automated theorem proving.
3.4.2.3 Procedural knowledge
The basic idea of Procedural Knowledge is the Knowledge encoded in some
procedures. Small programs know how to do specific things and how to proceed.
For example, a parser in a natural language understander has the knowledge that a
noun phrase may contain articles, adjectives and nouns. It is represented by calls to
routines that know how to process articles, adjectives and nouns.
Advantages:
• Heuristic or domain specific knowledge can be represented.
• Extended logical inferences, such as default reasoning facilitated.
• Side effects of actions may be modelled. Some rules may become false in
time. Keeping track of this in large systems may be tricky.
Disadvantages:
• Completeness - not all cases may be represented.
• Consistency - not all deductions may be correct.
• Modularity is sacrificed.
• Cumbersome control information.
3.5 ISSUES IN KNOWLEDGE REPRESENTATION
The issues in knowledge representation are discussed below.
3.5.1 Important Attributes and Relationships
There are two attributes of general importance in knowledge representation: instance
and isa. They are important because each supports property inheritance.
What about the relationships among the attributes of an object, such as
inverses, existence, techniques for reasoning about values and single-valued
attributes? Consider an example of an inverse in
band(John, Naked City)
This can be read as ‘John plays in the band Naked City’ or ‘John’s band is Naked
City’.
Another representation is band = Naked City
Band-members = John, Robert,
The relationships among the attributes of an object are independent of the specific
knowledge they encode, and may hold properties like: inverses, existence in an isa
hierarchy, techniques for reasoning about values and single-valued attributes.
Inverses
This is about a consistency check performed when a value is added to one attribute.
Entities are related to each other in many different ways. Figure 3.8 shows attributes
(isa, instance and band), each drawn as a directed arrow, originating at the object being
described and terminating either at another object or at its value.
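Keeping an attribute and its inverse consistent can be sketched as follows (the band and band-members structures follow the example above; the helper function is hypothetical):

```python
bands = {}    # musician -> band          (the attribute)
members = {}  # band -> set of musicians (its inverse)

def set_band(musician, band):
    """Update the attribute and its inverse together so they stay consistent."""
    old = bands.get(musician)
    if old is not None:
        members[old].discard(musician)   # undo the old inverse entry
    bands[musician] = band
    members.setdefault(band, set()).add(musician)

set_band("John", "Naked City")
print(bands["John"])                    # Naked City
print("John" in members["Naked City"])  # True
```

Routing every update through one function is what makes the consistency check possible: the two views of the same fact can never drift apart.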
Existence in an isa hierarchy
This is about generalization-specialization, like classes of objects and Specialized
subsets of those classes, their attributes and specialization of attributes.
Ex.: The height is a specialization of general attribute; physical-size is a specialization
of physical-attribute. These generalization-specialization relationships are important
for attributes because they support inheritance.
Techniques for reasoning about values
This is about reasoning about values of attributes that are not given explicitly. Several
kinds of information are used in such reasoning; for example, height must be in a
unit of length, and the age of a person cannot be greater than the age of the person’s
parents. The values are often specified when a knowledge base is created.
Single-valued attributes
This is about a specific attribute that is guaranteed to take a unique value. For
example, a guitar player can, at any time, be a member of only one band. KR
systems take different approaches to providing support for single-valued attributes.
3.5.2 Granularity
At what level should the knowledge be represented, and what are the primitives? In
choosing the granularity of representation, primitives are fundamental concepts, such
as holding, seeing and playing. English is a very rich language, with over half a million
words, so clearly we will find it difficult to decide which words to choose as our
primitives in a series of situations.
If Syam feeds a dog, then it could be represented as:
feeds(Syam, dog)
If Syam gives the dog a bone, it will be:
gives (Syam, dog, bone)
Are these the same?
In any sense, does giving an object food constitute feeding?
If give(x, food) → feed(x), then we are making progress. But we need to add certain
inferential rules.
In the famous program on relationships:
Ravi is Sai’s cousin.
How do we represent this?
Ravi = daughter (brother or sister (father or mother (Sai)))
Suppose the cousin is Sree; we may not know whether Sree is male or female, and
then son applies as well.
Clearly, separate levels of understanding require different levels of primitives,
and these need many rules to link together apparently similar primitives.
Obviously, there is a potential storage problem, and the underlying question must be
what level of comprehension is needed.
3.5.3 Representing Set of Objects
There are certain properties of objects that are true as member of a set but not as
individual; for example, consider the assertion made in the sentences:
‘There are more sheep than people in India’, and ‘English speakers can be
found all over the world.’
The only way to describe these facts is to attach assertions to the sets
representing people, sheep and English speakers.
The reason to represent sets of objects is that if a property is true of all (or most)
of the elements of a set, it is more efficient to associate it with the set rather than
to assert it explicitly for every element of the set. This is done
• in logical representation through the use of a universal quantifier, and
• in hierarchical structure where nodes represent sets and inheritance propagate
a set level assertion down to the individual.
For example, with the assertion large(elephant), remember to make a clear distinction
between
• asserting some property of the set itself, meaning the set of elephants is
large, and
• asserting some property that holds for individual elements of the set, meaning
anything that is an elephant is large.
There are three ways in which sets may be represented:
(a) by a name, for example the node Musician and the predicates Jazz and Avant
Garde in logical representation (see Fig. 3.8);
(b) by an extensional definition, which lists the members; and
(c) by an intensional definition, which provides a rule that returns true or false
depending on whether the object is in the set or not.
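The extensional and intensional styles can be contrasted in a few lines (the musician data is reused from earlier figures; the membership rule is an assumption of the illustration):

```python
# Extensional definition: list the members of the set explicitly.
jazz_musicians = {"Ravi", "Krishna"}

# Intensional definition: a rule that decides membership on demand.
def is_jazz_musician(person, styles):
    """Return True if the person's recorded style is Jazz."""
    return styles.get(person) == "Jazz"

styles = {"Ravi": "Jazz", "John": "Avant Garde", "Krishna": "Jazz"}
print("Ravi" in jazz_musicians)          # True  (extensional test)
print(is_jazz_musician("John", styles))  # False (intensional test)
```

The extensional set is fixed once listed, while the intensional rule automatically tracks changes to the underlying styles data.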
3.5.4 Finding Right Structure
To describe a particular situation, we must access the right structure. This requires
selecting an initial structure and then revising the choice. While doing so, it is
necessary to solve the following problems:
• how to perform an initial selection of the most appropriate structure
• how to fill in appropriate details from the current situations
• how to find a better structure if the one chosen initially turns out not to be
appropriate
• what to do if none of the available structures is appropriate
• when to create and remember a new structure
There is no good and general-purpose method for solving all these problems.
Some knowledge representation techniques solve some of them.
Let’s consider two problems:
1. How to select initial structure
2. How to find a better structure
Selecting an initial Structure: There are three important approaches for the selection
of an initial structure.
(a) Index the structures directly by specific English words. Let each verb have
its own structure that describes its meaning.
Disadvantages
(i) Many words may have several different meanings.
Eg: 1. John flew to Delhi.
2. John flew the kite.
‘flew’ has a different meaning in the two different contexts.
(ii) It is useful only when there is an English description of the problem.
(b) Consider a major concept as a pointer to all of the structures.
Example:
1. The concept steak might point to two scripts, one for the restaurant and
the other for the supermarket.
2. The concept bill might point to a restaurant script or a shopping script.
We take the intersection of these sets to get the structure that involves all
the content words.
Disadvantages
1. If the problem contains extraneous concepts, then the intersection will be
empty.
2. It may require a great deal of computation to compute all the possible sets
and then to intersect them.
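The intersection approach in (b) can be sketched as follows (the script names follow the example; the candidate table is an assumption of the illustration):

```python
# Each major concept points to the structures (scripts) it might belong to.
candidates = {
    "steak": {"restaurant_script", "supermarket_script"},
    "bill":  {"restaurant_script", "shopping_script"},
}

def select_structure(concepts):
    """Intersect the candidate sets of every concept mentioned in the problem."""
    sets = [candidates[c] for c in concepts if c in candidates]
    if not sets:
        return set()
    result = sets[0].copy()
    for s in sets[1:]:
        result &= s
    return result

print(select_structure(["steak", "bill"]))  # {'restaurant_script'}
```

Both disadvantages are visible in the sketch: adding an extraneous concept with disjoint candidates empties the intersection, and every concept's candidate set must be enumerated and intersected.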
(c) Locate the major clue and use that to select an initial structure
Disadvantages
1. We can’t identify a major clue in some situations.
2. It is difficult to anticipate which clues are important and which are not.
(2) Revising the choice when necessary
Once we have found a structure, if it does not seem to be appropriate, we
opt for another choice. The different ways in which this can be done
are:
1. Select the fragments of the current structure and match them against the
other alternatives available. Choose the best one.
2. Make an excuse for the failure of the current structures and continue to
use that. There are heuristics such as the fact that a structure is more
appropriate if a desired feature is missing rather than having an
inappropriate feature.
Example: A person with one leg is more plausible than a person with a
tail.
3. Refer to specific links between the structures in order to follow new
directions to explore.
4. If the knowledge is represented in an isa hierarchy, then move upward
until a structure is found that does not conflict with the evidence. Use
this structure if it provides the required knowledge, or create a new structure
below it.
3.6 THE FRAME PROBLEM
The frame problem is the challenge of representing the effects of actions in logic
without having to represent explicitly a large number of intuitively obvious
non-effects. To philosophers and AI researchers, the frame problem is suggestive of a wider
epistemological issue, namely whether it is possible, even in principle, to limit the
scope of the reasoning required to derive the consequences of an action.
3.6.1 The Frame Problem in Logic
Check Your Progress
1. What AI researchers call ‘knowledge’ appears as data at the level of
programming. (True or False)
2. Knowledge is a description of the world. It determines a system’s
competence by what it knows. (True or False)
3. What is a typology map?
Using mathematical logic, how is it possible to write formulae that describe the
effects of actions without having to write a large number of accompanying formulae
that describe the mundane, obvious non-effects of those actions? Let’s look at an
example. The difficulty can be illustrated without the full apparatus of formal logic,
but it should be borne in mind that the devil is in the mathematical details. Suppose
we write two formulae, one describing the effects of painting an object and the
other describing the effects of moving an object.
1. colour(x, c) holds after paint(x, c)
2. position(x, p) holds after move(x, p)
Now, suppose we have an initial situation in which colour(A, Red) and
position(A, House). According to the machinery of deductive logic, what then
holds after the action paint(A, Blue) followed by the action move(A, Garden)?
Intuitively, we would expect colour(A, Blue) and position(A, Garden) to hold.
Unfortunately, this is not the case. If written out more formally in classical
predicate logic, using a suitable formalism for representing time and action
such as the situation calculus, the two formulae above yield only the conclusion
that position(A, Garden) holds. This is because they do not rule out the
possibility that the colour of A gets changed by the move action.
The most obvious way to augment such a formalization, so that the right common
sense conclusions fall out, is to add a number of formulae that explicitly
describe the non-effects of each action. These formulae are called frame
axioms. For our example, we need a pair of frame axioms.
3. colour(x, c) holds after move(x, p) if colour(x, c) held beforehand
4. position(x, p) holds after paint(x, c) if position(x, p) held beforehand
In other words, painting an object will not affect its position, and moving an
object will not affect its colour. With the addition of these two formulae, all the
desired conclusions can be drawn. However, this is not at all a satisfactory solution.
Since most actions do not affect most properties of a situation, in a domain comprising
M actions and N properties we will, in general, have to write out almost MN frame
axioms. Whether these formulae are destined to be stored explicitly in a computer’s
memory, or are merely part of the designer’s specification, this is an unwelcome
burden.
The challenge is to find a way to capture the non-effects of actions more
succinctly in formal logic. What we need is some way of declaring the general
rule-of-thumb that an action can be assumed not to change a given property of a
situation unless there is evidence to the contrary. This default assumption is known
as the common sense law of inertia. The technical frame problem can be viewed as
the task of formalizing this law.
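The common sense law of inertia can be sketched operationally: apply an action's explicit effects and let every other property persist by default. This is a minimal STRIPS-style illustration, not a logical formalization; the state encoding is an assumption:

```python
def apply_action(state, action, *args):
    """Update only the fluent the action mentions; all others persist (inertia)."""
    new_state = dict(state)          # default: nothing changes
    if action == "paint":
        obj, colour = args
        new_state[("colour", obj)] = colour
    elif action == "move":
        obj, place = args
        new_state[("position", obj)] = place
    return new_state

s0 = {("colour", "A"): "Red", ("position", "A"): "House"}
s1 = apply_action(s0, "paint", "A", "Blue")
s2 = apply_action(s1, "move", "A", "Garden")
print(s2[("colour", "A")], s2[("position", "A")])  # Blue Garden
```

Copying the state and overwriting only the named fluent gives the intuitive result without writing any frame axioms; the hard part, which this sketch sidesteps, is expressing the same default in monotonic classical logic.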
The main obstacle to doing this is the monotonicity of classical logic. In classical
logic, the set of conclusions that can be drawn from a set of formulae always increases
with the addition of further formulae. This makes it impossible to express a rule
that has an open-ended set of exceptions, and the common sense law of inertia is
just such a rule.
Researchers in logic-based AI have put a lot of effort into developing a
variety of non-monotonic reasoning formalisms, such as circumscription, and
investigating their application to the frame problem. None of this has turned out to
be at all straightforward. One of the most troublesome barriers to progress was
highlighted in the so-called Yale shooting problem, a simple scenario that gives rise
to counter-intuitive conclusions if naively represented with a non-monotonic
formalism. To make matters worse, a full solution needs to work in the presence of
concurrent actions, actions with non-deterministic effects, continuous change and
actions with indirect ramifications. In spite of these subtleties, a number of solutions
to the technical frame problem now exist that are adequate for logic-based AI
research. Although improvements and extensions continue to be found, it is fair to
say that the dust has settled, and that the frame problem, technically, is more or less
solved.
3.7 SUMMARY
In this unit, you have learned about the various aspects of knowledge representation.
There is no perfect method of knowledge representation today. This stems largely
from our ignorance of just what knowledge is. Nevertheless, many methods have
been worked out and used by AI researchers.
You have also learned about mapping. The knowledge representation problem
concerns the mismatch between human and computer ‘memory’, i.e., how to encode
knowledge so that it is a faithful reflection of the expert’s knowledge and can be
manipulated by computer. For mapping, we call these representations of knowledge
‘knowledge bases’, and the manipulative operations on these knowledge bases,
inference engine programs.
You have also learned about the frame problem which is a problem of
representing the facts that change as well as those that do not change.
In this unit, you have also studied knowledge representation and reasoning
formalisms, which let us draw new conclusions from the knowledge we
have. It might seem that we have enough knowledge to build a knowledge-based
agent. However, propositional logic is a weak language; there are many things
that cannot be expressed. To express knowledge about objects, their properties and
the relationships that exist between objects, we need a more expressive language:
first-order logic.
3.8 KEY TERMS
• Knowledge: It is a general term. Knowledge is a progression that starts with
data, which is of limited utility. By organizing or analysing the data, we
understand what the data means, and this becomes information. The
interpretation or evaluation of information yields knowledge.
• Information: It emerges when relationships among facts are established and
understood. Providing answers to ‘who’, ‘what’, ‘where’, and ‘when’ gives
the relationships among facts.
• Wisdom: It is the understanding of the principles of relationships that describe
patterns. Providing the answer for ‘why’ gives understanding of the interaction
between patterns.
• Frame problem: It is a problem of representing the facts that change as well
as those that do not change.
3.9 QUESTIONS AND EXERCISES
1. What do AI researchers mean by ‘knowledge’?
2. In which language are many knowledge-based programs written?
3. Knowledge is a description of the world. It determines a system’s competence
by what it knows. Explain.
4. Discuss the knowledge model.
5. Explain the key features of the frame problem.
82
Self-Instructional
Material
3.10 FURTHER READING
Knowledge
Representation Issues
1. www.plato.stanford.edu
2. www.indiastudychannel.com
3. www.egeria.cs.cf.ac.uk
4. www.cs.cardiff.ac.uk
5. www.guthulamurali.freeservers.com
6. www.meanderingpassage.com
7. www.jpmf.house.cern.ch
8. www.mm.iit.uni-miskolc.hu
9. www.bookrags.com
10. www.stewardess.inhatc.ac.kr
11. www.seop.leeds.ac.uk
12. www.nwlink.com
UNIT 4 USING PREDICATE LOGIC
Structure
4.0 Introduction
4.1 Unit Objectives
4.2 The Basics in Logic
4.3 Representing Simple Facts in Logic
    4.3.1 Logic
4.4 Representing Instance and Isa Relationships
4.5 Computable Functions and Predicates
4.6 Resolution
    4.6.1 Algorithm: Convert to Clausal Form
    4.6.2 The Basis of Resolution
    4.6.3 Resolution in Propositional Logic
    4.6.4 The Unification Algorithm
    4.6.5 Resolution in Predicate Logic
4.7 Natural Deduction
4.8 Summary
4.9 Key Terms
4.10 Answers to 'Check Your Progress'
4.11 Questions and Exercises
4.12 Further Reading
4.0 INTRODUCTION
In the preceding unit, you learnt about the various aspects of knowledge
representation. In this unit, you will learn about the various concepts relating to the
use of predicate logic. You will also learn about how to use simple facts in logic.
The unit will also discuss resolution and natural deduction.
4.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Define the types of logic
• Discuss the use of predicate logic
• Describe the representation of simple facts in logic
• Discuss the representation of Instance and Isa Relationships
• Analyse resolution and natural deduction
4.2 THE BASICS IN LOGIC
Logic is a language for reasoning; a collection of rules are used in logical reasoning.
We briefly discuss the following:
Types of logic
• Propositional logic - logic of sentences
• Predicate logic - logic of objects
• Logic involving uncertainties
• Fuzzy logic - dealing with fuzziness
• Temporal logic, etc.
Types of logic
Propositional logic and Predicate logic are fundamental to all logic.
Propositional logic
• Propositions are 'sentences', either true or false but not both.
• A sentence is the smallest unit in propositional logic.
• If a proposition is true, its truth value is 'true'; otherwise it is 'false'.
• Ex: Sentence - 'Grass is green'; truth value - 'true'; proposition - 'yes'.
Predicate logic
• A predicate is a function that may be true or false for its arguments.
• Predicate logic includes rules that govern quantifiers.
• Predicate logic is propositional logic extended with quantifiers.
Ex: ‘The sky is blue’,
‘The cover of this book is blue’
Predicate is ‘is blue’, give a name ‘B’;
Sentence is represented as ‘B(x)’, B(x) reads as ‘x is blue’;
Object is represented as ‘x’
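The idea of a predicate as a function that is true or false of its arguments can be sketched in Python; the predicate name B and the set of blue objects below are invented purely for illustration:

```python
# A hypothetical domain of blue things, made up for this example.
BLUE_THINGS = {"sky", "cover of this book"}

def B(x):
    """Predicate B(x), read 'x is blue': returns True or False for its argument."""
    return x in BLUE_THINGS

print(B("sky"))    # True
print(B("grass"))  # False
```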
Mathematical Predicate Logic is an important area of Artificial Intelligence:
• Predicate logic is one of the major knowledge representation and reasoning
methods.
• Predicate logic serves as a standard of comparison for other representation
and reasoning methods.
• Predicate logic has a sound mathematical basis.
• The PROLOG Language is based on logic.
The form of predicate logic most commonly used in AI is first order predicate
calculus. Predicate logic requires that certain strong conditions be satisfied by the
data being represented:
• Discrete Objects: The objects represented must be discrete individuals
like people, trucks, but not 1 ton of sugar.
• Truth or Falsity: Propositions must be entirely true or entirely false,
inaccuracy or degrees of belief are not representable.
• Non-contradiction: Not only must data not be contradictory, but facts
derivable by rules must not contradict.
In many applications, the information to be encoded in a database originates from descriptive sentences that are difficult and unnatural to represent with simple structures such as arrays or sets. Hence, logic is used to represent these sentences; we call this predicate calculus, which represents real-world facts as logical propositions written as well-formed formulas.
Predicate logic is one in which a statement from a natural language like English is translated into symbolic structures comprising predicates, functions, variables, constants, quantifiers and logical connectives.
The syntax for the predicate logic is determined by the symbols that are allowed
and the rules of combination. The semantics is determined by the interpretations
assigned to predicates. The symbols and rules of combination of a predicate logic
are:
(1) Predicate Symbol
(2) Constant Symbol
(3) Variable Symbol
(4) Function Symbol
In addition to these symbols, parentheses, delimiters and commas are also used.
Predicate Symbol: This symbol is used to represent a relation in the domain
of discourse. For example: Valmiki wrote Ramayana – wrote (Valmiki, Ramayana)
In this, wrote is a Predicate. e.g., Rama loves Sita – loves (Rama, Sita).
Constant Symbol: A constant symbol is a simple term used to represent objects or entities in the domain of discourse. These objects may be physical objects or people or anything we name, e.g., Rama, Sita, Ramayana, Valmiki.
Variable Symbol: These symbols are permitted to have indefinite values
about the entity which is being referred to or specified. e.g., X loves Y – loves(X,
Y). In this X, Y are variables.
Function Symbol: These symbols are used to represent a special type of
relationship or mapping. e.g., Rama’s father is married to Rama’s Mother –
Married(father(Rama), Mother(Rama)). In this Father and Mother are functions
and married is Predicate.
Constants, variables and functions are referred to as terms, and predicates are referred to as atomic formulas or atoms. The statements in predicate logic are termed well-formed formulas. A predicate with no variables is called a ground atom.
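One possible way to store such ground atoms in a program (a sketch, not something prescribed by the text) is as tuples in a set, with the predicate name first and the terms after it:

```python
# Ground atoms such as wrote(Valmiki, Ramayana) stored as tuples:
# (predicate_name, term1, term2, ...)
kb = {
    ("wrote", "Valmiki", "Ramayana"),
    ("loves", "Rama", "Sita"),
}

def holds(predicate, *terms):
    """Check whether a ground atom is recorded as true in the knowledge base."""
    return (predicate, *terms) in kb

print(holds("wrote", "Valmiki", "Ramayana"))  # True
print(holds("wrote", "Valmiki", "Gita"))      # False
```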
4.3 REPRESENTING SIMPLE FACTS IN LOGIC
4.3.1 Logic
Logic is concerned with the truth of statements about the world.
Generally each statement is either TRUE or FALSE.
Logic includes: Syntax, Semantics and Inference Procedure.
Syntax: Specifies the symbols in the language about how they can be combined
to form sentences. The facts about the world are represented as sentences in logic.
Semantics: Specifies how to assign a truth value to a sentence based on its meaning in the world. It specifies what facts a sentence refers to. A fact is a claim about the world, and it may be TRUE or FALSE.
Inference Procedure: Specifies methods for computing new sentences from existing sentences.
Note:
Facts are claims about the world that are True or False.
A representation is an expression (sentence) that stands for objects and relations.
Sentences can be encoded in a computer program.
• Logic as a Knowledge Representation Language: Logic is a language for
reasoning, a collection of rules used while doing logical reasoning. Logic is
studied as KR languages in artificial intelligence.
• Logic is a formal system in which the formulas or sentences have true
or false values.
• The problem of designing a KR language is a tradeoff between that
which is:
i. Expressive enough to represent important objects and relations in
a problem domain.
ii. Efficient enough in reasoning and answering questions about
implicit information in a reasonable amount of time.
• Logics are of different types: propositional logic, predicate logic, temporal logic, modal logic, description logic, etc. Each represents different kinds of things and allows more or less efficient inference.
• Propositional logic and predicate logic are fundamental to all logic. Propositional logic is the study of statements and their connectivity. Predicate logic is the study of individuals and their properties.
4.3.1.1 Logic representation
The Facts are claims about the world that are True or False.
Logic can be used to represent simple facts.
To build a Logic-based representation:
• User defines a set of primitive symbols and the associated semantics.
• Logic defines ways of putting symbols together so that user can define legal
sentences in the language that represent TRUE facts.
• Logic defines ways of inferring new sentences from existing ones.
• Sentences that are either TRUE or FALSE, but not both, are called propositions.
• A declarative sentence expresses a statement with a proposition as content;
Example:
The declarative 'snow is white' expresses the statement that snow is white; further, 'snow is white' is TRUE exactly when snow is white.
In this section, first Propositional Logic (PL) is briefly explained and then
the Predicate logic is illustrated in detail.
4.3.2 Propositional logic (PL)
A proposition is a statement, which in English would be a declarative sentence.
Every proposition is either TRUE or FALSE.
Examples: (a) The sky is blue. (b) Snow is cold. (c) 12 * 12 = 144
• Propositions are ‘sentences’, either true or false but not both.
• A sentence is the smallest unit in propositional logic.
• If a proposition is true, its truth value is 'true'; if it is false, its truth value is 'false'.
Example:

Sentence                           Truth value   Proposition (Y/N)
'Grass is green'                   'true'        Yes
'2 + 5 = 5'                        'false'       Yes
'Close the door'                   -             No
'Is it hot outside?'               -             No
'x > 2', where x is a variable     -             No (since x is not defined)
'x = x'                            -             No (we don't know what 'x' and '=' are; '3 = 3' or 'air is equal to air' or 'water is equal to water' has no meaning)
– Propositional logic is fundamental to all logic.
– Propositional logic is also called Propositional calculus, sentential
calculus, or Boolean algebra.
– Propositional logic describes the ways of joining and/or modifying entire
propositions, statements or sentences to form more complicated
propositions, statements or sentences, as well as the logical relationships
and properties that are derived from the methods of combining or altering
statements.
Statement, variables and symbols
These and a few more related terms such as, connective, truth value, contingencies,
tautologies, contradictions, antecedent, consequent and argument are explained
below.
– Statement: Simple statements (sentences), TRUE or FALSE, that do not
contain any other statement as a part, are basic propositions; lower-case letters,
p, q, r, are symbols for simple statements.
Large, compound or complex statements are constructed from basic propositions by combining them with connectives.
– Connective or Operator: The connectives join simple statements into
compounds, and join compounds into larger compounds.
Table 4.1 indicates five basic connectives and their symbols:
– Listed in decreasing order of operation priority;
– An operation with higher priority is solved first.
Example of a formula: (((a ∧ ¬b) ∨ c → d) ↔ ¬(a ∨ c))
Table 4.1 Connectives and Symbols in Decreasing Order of Operation Priority

Connective     Symbols                         Read as
assertion      p                               'p is true'
negation       ¬p, ~p, !p                      NOT; 'p is false'
conjunction    p Λ q, p · q, p & q, p && q     AND; 'both p and q are true'
disjunction    p v q, p | q, p || q            OR; 'either p is true, or q is true, or both'
implication    p → q, p ⊃ q, p ⇒ q             if..then; 'if p is true, then q is true'; 'p implies q'
equivalence    p ↔ q, p ≡ q, p ⇔ q             if and only if; 'p and q are either both true or both false'
Note: The propositions and connectives are the basic elements of propositional
logic.
Truth value
The truth value of a statement is its TRUTH or FALSITY.
Example:
p is either TRUE or FALSE,
~p is either TRUE or FALSE,
p v q is either TRUE or FALSE, and so on. Use 'T' or '1' to mean TRUE;
use 'F' or '0' to mean FALSE.
Truth table defining the basic connectives:

p  q | ¬p | ¬q | p v q | p → q | p ↔ q | q → p
T  T |  F |  F |   T   |   T   |   T   |   T
T  F |  F |  T |   T   |   F   |   F   |   T
F  T |  T |  F |   T   |   T   |   F   |   F
F  F |  T |  T |   F   |   T   |   T   |   T
Tautologies
A proposition that is always true is called a tautology, e.g. (P v ¬P) is always true
regardless of the truth value of the proposition P.
Contradictions
A proposition that is always false is called a contradiction, e.g., (P Λ ¬P) is always false, regardless of the truth value of the proposition P.
Contingencies
A proposition is called a contingency, if that proposition is neither a tautology nor a
contradiction, e.g. (P v Q) is a contingency.
Antecedent, Consequent
In the conditional statement p → q, the first statement or 'if-clause' (here p) is called the antecedent; the second statement or 'then-clause' (here q) is called the consequent.
Argument
Any argument can be expressed as a compound statement. Take all the premises,
conjoin them and make that conjunction the antecedent of a conditional and make
the conclusion the consequent. This implication statement is called the corresponding
conditional of the argument.
Note:
• Every argument has a corresponding conditional, and every implication
statement has a corresponding argument.
• Because the corresponding conditional of an argument is a statement, it is
therefore either a tautology, or a contradiction or a contingency.
• An argument is valid ‘if and only if’ its corresponding conditional is a
tautology.
• Two statements are consistent ‘if and only if’ their conjunction is not a
contradiction.
• Two statements are logically equivalent 'if and only if' their truth table columns are identical, i.e., 'if and only if' the statement of their equivalence using '↔' is a tautology.
Note
Truth tables are adequate to test validity, tautology, contradiction, contingency,
consistency and equivalence.
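This mechanical test can be sketched in a few lines of Python; the `classify` helper and the encoding of formulas as Boolean functions are choices made for this illustration, not a standard API:

```python
from itertools import product

def classify(formula, num_vars):
    """Classify a propositional formula by enumerating its full truth table.

    `formula` is a function taking `num_vars` Boolean arguments.
    """
    values = [formula(*assignment)
              for assignment in product([True, False], repeat=num_vars)]
    if all(values):
        return "tautology"      # true in every row
    if not any(values):
        return "contradiction"  # false in every row
    return "contingency"        # mixed rows

print(classify(lambda p: p or not p, 1))   # tautology: P v ¬P
print(classify(lambda p: p and not p, 1))  # contradiction: P Λ ¬P
print(classify(lambda p, q: p or q, 2))    # contingency: P v Q
```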
Here we will highlight major principles involved in knowledge representation.
In particular, predicate logic will be met in other knowledge representation schemes
and reasoning methods. The following are the standard logic symbols we use in this
topic:
There are two types of quantifiers:
1. For all: ∀
2. There exists: ∃
Connectives:
Implies: →
Not: ¬
Or: ∨
And: ∧
Let us now look at an example of how predicate logic is used to represent
knowledge. There are other ways but this form is popular.
The propositional logic is not powerful enough for all types of assertions;
Example: The assertion 'x > 1', where x is a variable, is not a proposition because it is neither true nor false unless the value of x is defined. For x > 1 to be a proposition,
• either we substitute a specific number for x;
• or change it to something like
‘There is a number x for which x > 1 holds’;
or
‘For every number x, x > 1 holds’.
Consider the following example:
‘All men are mortal.
Socrates is a man.
Then Socrates is mortal’,
These cannot be expressed in propositional logic as a finite and logically
valid argument (formula).
We need languages that allow us to describe properties (predicates) of objects,
or a relationship among objects represented by the variables. Predicate logic satisfies
the requirements of a language.
• Predicate logic is powerful enough for expression and reasoning.
• Predicate logic is built upon the ideas of propositional logic.
Predicate:
Every complete sentence contains two parts: a subject and a predicate. The subject
is what (or whom) the sentence is about. The predicate tells something about the
subject;
Example:
A sentence: 'Judy runs'.
The subject is Judy and the predicate is runs.
Predicate always includes verb and tells something about the subject. Predicate
is a verb phrase template that describes a property of objects or a relation among
objects represented by the variables.
Example:
‘The car Tom is driving is blue’;
‘The sky is blue’;
‘The cover of this book is blue.’
Predicate is ‘is blue’, describes property.
Predicates are given names; Let ‘B’ be the name for predicate, ‘is blue’.
Sentence is represented as ‘B(x)’, read as ‘x is blue’;
‘x’ represents an arbitrary Object .
Predicate logic expressions:
The propositional operators combine predicates, like
If (p(....) && ( !q(....) || r (....) ) )
Examples of logic operators: disjunction (OR) and conjunction (AND).
Consider the expression x < y || (y < z && z < x), written with the logic symbols || and &&.
If each comparison holds, this is true || (true && true); applying the truth table, the result is True.
Now assign the values 3, 2, 1 to x, y, z, so that each comparison evaluates to TRUE or FALSE:
3 < 2 || (2 < 1 && 1 < 3)
is false || (false && true), which is False.
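The same evaluation can be checked directly in Python, whose `or` and `and` operators play the roles of || and && here:

```python
def expr(x, y, z):
    # x < y || (y < z && z < x), with Python's `or`/`and` standing in for ||/&&
    return x < y or (y < z and z < x)

print(expr(3, 2, 1))  # 3 < 2 or (2 < 1 and 1 < 3) -> False
print(expr(1, 2, 3))  # 1 < 2 -> True
```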
Predicate Logic Quantifiers
As said before, x > 1 is not a proposition and why?
Also said, that for x > 1 to be a proposition, what is required?
Generally, a predicate with variables (called an atomic formula) can be made a proposition by applying one of the following two operations to each of its variables:
1. Assign a value to the variable; e.g. x > 1, if 3 is assigned to x becomes 3 >
1, and it then becomes a true statement, hence a proposition.
2. Quantify the variable using a quantifier on formulas of predicate logic
(called wff), such as x > 1 or P(x), by using Quantifiers on variables.
Apply Quantifiers on Variables
Variable x
* x > 5 is not a proposition; its truth depends upon the value of the variable x. To reason about such statements, x needs to be declared.
Declaration x: a
* x: a declares a variable
* x: a is read as 'x is an element of set a'
* p is a statement about x
* Q x: a • p is the quantification of the statement, where Q is the quantifier, x: a is the declaration of variable x as an element of set a, and p is the statement.
* Quantifiers are of two types: universal quantifiers, denoted by the symbol ∀, and existential quantifiers, denoted by the symbol ∃.
Universe of Discourse
The universe of discourse, also called the domain of discourse or universe, indicates:
- a set of entities that the quantifiers deal with.
- entities can be the set of real numbers, the set of integers, the set of all cars in a parking lot, the set of all students in a classroom, etc.
- the universe is thus the domain of the (individual) variables.
- propositions in predicate logic are statements about objects of a universe.
The universe is often left implicit in practice, but it should be obvious from the context.
Examples:
- For natural numbers, ∀x, y (x < y or x = y or x > y); there is no need to be more precise and say ∀x, y in N, because N is implicit, being the universe of discourse.
- For a property that holds for natural numbers but not for real numbers, it is necessary to qualify what the allowable values of x and y are.
Apply Universal quantifier 'For All'
Universal quantification allows us to make a statement about a collection of objects.
* Universal quantification: ∀x: a • p
* read 'for all x in a, p holds'
* a is the universe of discourse
* x is a member of the domain of discourse
* p is a statement about x
* In propositional form, it is written as: ∀x P(x)
* read 'for all x, P(x) holds', 'for each x, P(x) holds' or 'for every x, P(x) holds', where P(x) is the predicate, ∀x means all the objects x in the universe, and P(x) is true for every object x in the universe.
Example: English language to propositional form
* 'All cars have wheels'
∀x: car • x has wheel
* ∀x P(x), where P(x) is the predicate 'x has wheels' and x is a variable for the objects 'cars' that populate the universe of discourse.
Apply Existential quantifier 'There Exists'
Existential quantification allows us to state that an object exists without naming it.
Existential quantification: ∃x: a • p
* read 'there exists an x such that p holds'
* a is the universe of discourse
* x is a member of the domain of discourse
* p is a statement about x
In propositional form it is written as: ∃x P(x)
* read 'there exists an x such that P(x)' or 'there exists at least one x such that P(x)'
* where P(x) is the predicate, ∃x means at least one object x in the universe, and P(x) is true for at least one object x in the universe.
Example: English language to propositional form
* 'Someone loves you'
∃x: Someone • x loves you
* ∃x P(x), where P(x) is the predicate 'x loves you' and x is a variable for the object 'someone' that populates the universe of discourse.
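Over a finite universe of discourse, ∀ and ∃ correspond directly to Python's built-in all() and any(); the domains below are made-up examples for illustration:

```python
# A finite universe of discourse: some cars and some people.
cars = ["car1", "car2", "car3"]
has_wheels = {"car1": True, "car2": True, "car3": True}
loves_you = {"alice": False, "bob": True}

# Universal quantification, ∀x: car • x has wheels
print(all(has_wheels[x] for x in cars))       # True

# Existential quantification, ∃x: someone • x loves you
print(any(loves_you[x] for x in loves_you))   # True
```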
Formula:
In mathematical logic, a formula is a type of abstract object, a token of which is a
symbol or string of symbols which may be interpreted as any meaningful unit in a
formal language.
Terms:
Terms are defined recursively as variables, or constants, or functions like f(t1, . . . , tn), where f is an n-ary function symbol and t1, . . . , tn are terms. Applying predicates to terms produces atomic formulas.
Atomic formulas:
An atomic formula (or simply atom) is a formula with no deeper propositional structure, i.e. a formula that contains no logical connectives, or a formula that has no strict sub-formulas.
- Atoms are thus the simplest well-formed formulas of the logic.
- Compound formulas are formed by combining the atomic formulas using the logical connectives.
- A well-formed formula ('wff') is a symbol or string of symbols (a formula) generated by the formal grammar of a formal language. An atomic formula is one of the form:
- t1 = t2, where t1 and t2 are terms, or
- R(t1, . . . , tn), where R is an n-ary relation symbol and t1, . . . , tn are terms.
- ¬a is a formula when a is a formula.
- (a ∧ b) and (a v b) are formulas when a and b are formulas.
Compound formula: example
(((a ∧ b) ∧ c) ∨ ((¬a ∧ b) ∧ c) ∨ ((a ∧ ¬b) ∧ c))
Example:
Consider the following:
– Prince is a megastar.
– Megastars are rich.
– Rich people have fast cars.
– Fast cars consume a lot of petrol.
and try to draw the conclusion: Prince’s car consumes a lot of petrol.
So we can translate
Prince is a megastar
into:
megastar (prince)
and
Megastars are rich
into:
∀m: megastar(m) → rich(m)
Rich people have fast cars; the third axiom is more difficult:
• Is car a relation, so that car(c, m) says that car c is m's car? OR
• Is car a function? Then we may have car_of(m).
Assume car is a relation; then axiom 3 may be written:
∀c, m: car(c, m) Λ rich(m) → fast(c).
The fourth axiom is a general statement about fast cars. Let consume(c) mean that car c consumes a lot of petrol. Then we may write:
∀c: fast(c) Λ (∃m: car(c, m)) → consume(c).
Is this enough? No! Does Prince have a car? We need the car_of function after all (in addition to car):
∀m: car(car_of(m), m).
The result of applying car_of to m is m’s car. The final set of predicates is:
megastar (prince)
∀m: megastar(m) → rich(m)
∀m: car(car_of(m), m)
∀c, m: car(c, m) Λ rich(m) → fast(c)
∀c: fast(c) Λ (∃m: car(c, m)) → consume(c)
Given this, we could conclude:
consume(car_of(prince)).
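One way to mimic this derivation is a small forward-chaining sketch in Python. The rule encoding and the car_of naming below are illustrative choices made for this example, not the only possible ones (in particular, car(car_of(m), m) is instantiated here only for individuals already known to be rich):

```python
# Forward chaining over the megastar axioms; facts are stored as tuples.
facts = {("megastar", "prince")}

def car_of(m):
    # Hypothetical Skolem-style name for m's car
    return "car_of(" + m + ")"

def step(facts):
    """Apply each axiom once, returning any newly derived facts."""
    new = set()
    for f in facts:
        if f[0] == "megastar":                 # megastar(m) -> rich(m)
            new.add(("rich", f[1]))
        if f[0] == "rich":                     # car(car_of(m), m), for known m
            new.add(("car", car_of(f[1]), f[1]))
        if f[0] == "car" and ("rich", f[2]) in facts:
            new.add(("fast", f[1]))            # car(c, m) ^ rich(m) -> fast(c)
        if f[0] == "fast" and any(g[0] == "car" and g[1] == f[1] for g in facts):
            new.add(("consume", f[1]))         # fast(c) ^ exists m car(c, m) -> consume(c)
    return new - facts

# Iterate to a fixpoint.
while True:
    derived = step(facts)
    if not derived:
        break
    facts |= derived

print(("consume", car_of("prince")) in facts)  # True
```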
Example: A Predicate Logic
1. Marcus was a man.
   Man (Marcus)
2. Marcus was a Pompeian.
   Pompeian (Marcus)
3. All Pompeians were Romans.
   ∀x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
   Ruler (Caesar)
5. All Romans were either loyal to Caesar or hated him.
   ∀x: Roman(x) → loyalto(x, Caesar) V hate(x, Caesar)
6. Everyone is loyal to someone.
   ∀x: ∃y: loyalto(x, y)
7. People only try to assassinate rulers they aren't loyal to.
   ∀x: ∀y: person(x) Λ ruler(y) Λ tryassassinate(x, y) → ¬loyalto(x, y)
8. Marcus tried to assassinate Caesar.
   tryassassinate(Marcus, Caesar)
9. All men are people.
   ∀x: man(x) → person(x)
4.4 REPRESENTING INSTANCE AND ISA RELATIONSHIPS
The two attributes, isa and instance play an important role in many aspects of
knowledge representation. The reason for this is that they support property
inheritance.
isa
Used to show class inclusion. Ex. isa(megastar, rich).
instance
Used to show class membership, Ex. instance (prince, megastar).
From the above it should be simple to see how to represent these in predicate logic.
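A minimal sketch of property inheritance through instance and isa links follows; the dictionaries below are invented for the example and assume a single parent per class:

```python
# Property inheritance via isa (class inclusion) and instance
# (class membership), following the megastar example.
isa = {"megastar": "rich"}         # isa(megastar, rich)
instance = {"prince": "megastar"}  # instance(prince, megastar)

def classes_of(obj):
    """All classes an object belongs to, following the isa chain upward."""
    result = []
    cls = instance.get(obj)
    while cls is not None:
        result.append(cls)
        cls = isa.get(cls)
    return result

print(classes_of("prince"))  # ['megastar', 'rich']
```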
4.5 COMPUTABLE FUNCTIONS AND PREDICATES
The objective is to define a class of functions C computable in terms of F. This is expressed as C {F} and is explained below using two examples:
(1) 'evaluate factorial n' and
(2) 'expression for triangular functions'.
Ex: (1) A conditional expression to define factorial n or n!
Expression:
'If p1 then e1 else if p2 then e2 . . . else if pn then en'
i.e. (p1 → e1, p2 → e2, . . . , pn → en)
Here p1, p2, . . . , pn are propositional expressions taking the values T or F for true and false respectively.
The value of (p1 → e1, p2 → e2, . . . , pn → en) is the value of the e corresponding to the first p that has value T.
The expressions defining n!, for example n = 5, recursively are:
n! = n x (n-1)! for n ≥ 1
5! = 1 x 2 x 3 x 4 x 5 = 120
0! = 1
The above definition incorporates the instance that the product of no numbers is 1, i.e. 0! = 1; only then does the recursive relation (n + 1)! = n! x (n + 1) work for n = 0.
Now, use conditional expressions
n! = (n = 0 → 1, n ≠ 0 → n · (n-1)!)
to define functions recursively.
Ex: Evaluate 2! according to the above definition.
2! = (2 = 0 → 1, 2 ≠ 0 → 2 · (2-1)!)
   = 2 x 1!
   = 2 x (1 = 0 → 1, 1 ≠ 0 → 1 · (1-1)!)
   = 2 x 1 x 0!
   = 2 x 1 x (0 = 0 → 1, 0 ≠ 0 → 0 · (0-1)!)
   = 2 x 1 x 1
   = 2
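The conditional-expression definition translates directly into a recursive Python function, with Python's `x if c else y` playing the role of the conditional expression:

```python
def factorial(n):
    # Direct transcription of n! = (n = 0 -> 1, n != 0 -> n * (n - 1)!)
    return 1 if n == 0 else n * factorial(n - 1)

print(factorial(2))  # 2
print(factorial(5))  # 120
```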
4.6 RESOLUTION
Resolution is a procedure used in proving that arguments which are expressible in
predicate logic are correct. Resolution is a procedure that produces proofs by
refutation or contradiction. Resolution leads to refute a theorem-proving technique
for sentences in propositional logic and first-order logic.
Resolution is a purely syntactic, uniform proof procedure. Resolution does not consider what predicates may mean, but only what logical conclusions may be derived from the axioms. The advantage is that resolution is universally applicable to problems that can be described in first-order logic. The disadvantage is that resolution by itself cannot make use of any domain-dependent heuristics. Despite many attempts to improve the efficiency of resolution, it often takes exponential time.
– Resolution is a rule of inference.
– Resolution is a computerized theorem prover.
– Resolution is so far only defined for Propositional Logic.
The strategy is that the Resolution techniques of Propositional logic be adopted
in Predicate Logic.
1. Proof in predicate logic is composed of a variety of processes including
modus ponens, chain rule, substitution, matching, etc.
2. However, we have to take variables into account, e.g. parent of(Ram,
Sam) AND Not Parent of(Ram, Lakshman) cannot be resolved.
3. In other words, the resolution method has to look inside predicates to
discover if they can be resolved.
4.6.1 Algorithm: Convert to Clausal Form
1. Eliminate →
P → Q ≡ ¬P ∨ Q
2. Reduce the scope of each ¬ to a single term.
¬(P ∨ Q) ≡ ¬P ∧ ¬Q
¬(P ∧ Q) ≡ ¬P ∨ ¬Q
¬∀x: P ≡ ∃x: ¬P
¬∃x: P ≡ ∀x: ¬P
¬¬P ≡ P
3. Standardize variables so that each quantifier binds a unique variable.
(∀x: P(x)) ∨ (∃x: Q(x)) ≡ (∀x: P(x)) ∨ (∃y: Q(y))
4. Move all quantifiers to the left without changing their relative order.
(∀x: P(x)) ∨ (∃y: Q(y)) ≡ ∀x: ∃y: (P(x) ∨ Q(y))
5. Eliminate ∃ (Skolemization).
∃x: P(x) ≡ P(c)   (c is a Skolem constant)
∀x: ∃y: P(x, y) ≡ ∀x: P(x, f(x))   (f is a Skolem function)
6. Drop ∀.
∀x: P(x) ≡ P(x)
7. Convert the formula into a conjunction of disjuncts.
(P ∧ Q) ∨ R ≡ (P ∨ R) ∧ (Q ∨ R)
8. Create a separate clause for each conjunct.
9. Standardize apart the variables in the set of obtained clauses.
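For the propositional fragment, steps 1, 2 and 7 can be sketched as follows; the tuple encoding of formulas is an assumption made for this illustration, not part of the algorithm itself:

```python
# Formulas as nested tuples: ('->', p, q), ('not', p), ('and', p, q),
# ('or', p, q), or a variable name string.

def eliminate_implies(f):
    """Step 1: P -> Q becomes (not P) or Q."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate_implies(a) for a in args]
    if op == '->':
        return ('or', ('not', args[0]), args[1])
    return (op, *args)

def push_not(f):
    """Step 2: reduce the scope of each negation to a single term (De Morgan)."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op == 'not':
        a = args[0]
        if isinstance(a, str):
            return f
        aop, *aargs = a
        if aop == 'not':
            return push_not(aargs[0])
        if aop == 'and':
            return ('or', push_not(('not', aargs[0])), push_not(('not', aargs[1])))
        if aop == 'or':
            return ('and', push_not(('not', aargs[0])), push_not(('not', aargs[1])))
    return (op, *[push_not(a) for a in args])

def distribute(f):
    """Step 7: distribute 'or' over 'and', giving a conjunction of disjuncts."""
    if isinstance(f, str) or f[0] == 'not':
        return f
    op, a, b = f[0], distribute(f[1]), distribute(f[2])
    if op == 'or':
        if not isinstance(a, str) and a[0] == 'and':
            return ('and', distribute(('or', a[1], b)), distribute(('or', a[2], b)))
        if not isinstance(b, str) and b[0] == 'and':
            return ('and', distribute(('or', a, b[1])), distribute(('or', a, b[2])))
    return (op, a, b)

formula = ('->', 'P', ('and', 'Q', 'R'))   # P -> (Q and R)
cnf = distribute(push_not(eliminate_implies(formula)))
print(cnf)  # ('and', ('or', ('not', 'P'), 'Q'), ('or', ('not', 'P'), 'R'))
```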
Example 1:
∀x: [Roman(x) Λ know(x, Marcus)] → [hate(x, Caesar) V (∀y: ∃z: hate(y, z) → thinkcrazy(x, y))]
1. Eliminate →
∀x: ¬[Roman(x) Λ know(x, Marcus)] V [hate(x, Caesar) V (∀y: ¬∃z: hate(y, z) V thinkcrazy(x, y))]
2. Reduce scope of ¬.
∀x: [¬Roman(x) V ¬know(x, Marcus)] V [hate(x, Caesar) V (∀y: ∀z: ¬hate(y, z) V thinkcrazy(x, y))]
3. 'Standardize' variables:
∀x: P(x) V ∀x: Q(x) converts to ∀x: P(x) V ∀y: Q(y)
4. Move quantifiers.
∀x: ∀y: ∀z: [¬Roman(x) V ¬know(x, Marcus)] V [hate(x, Caesar) V (¬hate(y, z) V thinkcrazy(x, y))]
5. Eliminate existential quantifiers.
∃y: President(y) will be converted to President(S1)
∀x: ∃y: father-of(y, x) will be converted to ∀x: father-of(S2(x), x)
6. Drop the prefix.
[¬Roman(x) V ¬know(x, Marcus)] V [hate(x, Caesar) V (¬hate(y, z) V thinkcrazy(x, y))]
7. Convert to a conjunction of disjuncts.
¬Roman(x) V ¬know(x, Marcus) V hate(x, Caesar) V ¬hate(y, z) V thinkcrazy(x, y)
8. Create a separate clause corresponding to each conjunct.
9. Standardize apart the variables in the set of clauses generated in step 8.
First-Order Resolution
Generalized Resolution Rule: for clauses P V Q and ¬Q′ V R, with Q and Q′ atomic formulae, where θ is a most general unifier for Q and Q′, (P V R)θ is the resolvent of the two clauses.
Applying Resolution Refutation
– Negate query to be proven (resolution is a refutation system)
– Convert knowledge base and negated query into CNF and extract clauses
– Repeatedly apply resolution to clauses or copies of clauses until either the
empty clause (contradiction) is derived or no more clauses can be derived (a
copy of a clause is the clause with all variables renamed)
– If the empty clause is derived, answer ‘yes’ (query follows from knowledge
base); otherwise, answer ‘no’ (query does not follow from knowledge base).
Resolution: Example 1
|- ∃x (P(x) → ∀x P(x))
CNF(¬∃x (P(x) → ∀x P(x))):
1. ∀x ¬(¬P(x) ∨ ∀x P(x))
2. ∀x (¬¬P(x) ∧ ¬∀x P(x))
3. ∀x (P(x) ∧ ∃y ¬P(y))
4. ∀x ∃y (P(x) ∧ ¬P(y))
5. ∀x (P(x) ∧ ¬P(f(x)))
Clauses: P(x), ¬P(f(x))
1. P(x) [¬ Conclusion]
2. ¬P(f(y)) [Copy of ¬ Conclusion]
3. _ [1, 2 Resolution {x/f(y)}]
Resolution: Example 2
|- ∃x ∀y ∀z ((P(y) ∨ Q(z)) → (P(x) ∨ Q(x)))
1. P(f(x)) ∨ Q(g(x)) [¬ Conclusion]
2. ¬P(x) [¬ Conclusion]
3. ¬Q(x) [¬ Conclusion]
4. ¬P(y) [Copy of 2]
5. Q(g(x)) [1, 4 Resolution {y/f(x)}]
6. ¬Q(z) [Copy of 3]
7. _ [5, 6 Resolution {z/g(x)}]
4.6.2 The Basis of Resolution
The resolution procedure is a simple iterative process: at each step, two clauses (the parent clauses) are compared, and a new clause is inferred from them. The new clause represents the ways in which the two parent clauses interact with each other.
4.6.2.1 Soundness and completeness
The notation p |= q is read 'p entails q'; it means that q holds in every model in which p holds.
The notation p |-m q means that q can be derived from p by proof mechanism m.
A proof mechanism m is sound if p |-m q → p |= q.
A proof mechanism m is complete if p |= q → p |-m q.
Resolution for predicate calculus is:
Sound: If the empty clause is derived by resolution, then the original set of clauses is unsatisfiable.
Complete: If a set of clauses is unsatisfiable, resolution will eventually derive the empty clause. However, this is a search problem and may take a very long time.
We generally are not willing to give up soundness, since we want our conclusions to be valid. We might be willing to give up completeness: if a sound proof procedure will prove the theorem we want, that is enough.
Example: Resolution
– Another type of proof system based on refutation
– Better suited to computer implementation than systems of axioms and rules (can give correct 'no' answers)
– Generalizes to first-order logic
– The basis of Prolog's inference method
– To apply resolution, all formulae in the knowledge base and the query must be in clausal form (cf. Prolog clauses)
Resolution Rule of Inference
Resolution Rule: from the clauses P ∨ Q and ¬Q′ ∨ R, infer the resolvent (P ∨ R)θ, where θ is a most general unifier for Q and Q′.
Resolution Rule: Key Idea
Consider A V B and ¬B V C:
– if B is True, ¬B is False and the truth of the second formula depends only on C
– if B is False, the truth of the first formula depends only on A
Only one of B and ¬B is True, so if both A V B and ¬B V C are True, either A or C is True, i.e. A V C is True.
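The rule can be sketched for propositional clauses represented as sets of literal strings; the '~' prefix for negation is an encoding chosen for this example:

```python
def resolvents(c1, c2):
    """All resolvents of two propositional clauses (frozensets of literals).

    A literal is a string like 'B' or '~B'.
    """
    out = []
    for lit in c1:
        # The complementary literal: B <-> ~B
        comp = lit[1:] if lit.startswith('~') else '~' + lit
        if comp in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {comp})))
    return out

# A v B and ~B v C resolve to A v C:
print(resolvents(frozenset({'A', 'B'}), frozenset({'~B', 'C'})))
```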
Applying Resolution
– The resolution rule is sound (resolvent entailed by two ‘parent’ clauses)
– How can we use the resolution rule? One way:
– Convert knowledge base into clausal form
– Repeatedly apply resolution rule to the resulting clauses
– A query A follows from the knowledge base if and only if each of the clauses
in the CNF of A can be derived using resolution
– There is a better way . . .
Refutation Systems
– To show that P follows from S (i.e. S |= P) using refutation, start with S and ¬P
in clausal form and derive a contradiction using resolution
– A contradiction is the ‘empty clause’ (a clause with no literals)
– The empty clause □ is unsatisfiable (always False)
– So if the empty clause □ is derived using resolution, the original set of clauses
is unsatisfiable (never all True together)
– That is, if we can derive □ from the clausal forms of S and ¬P, these clauses
can never be all True together
– Hence whenever the clauses of S are all True, at least one clause from ¬P
must be False, i.e. ¬P must be False and P must be True
– By definition, S |= P (so P can correctly be concluded from S)
Applying Resolution Refutation
– Negate query to be proven (resolution is a refutation system)
– Convert knowledge base and negated conclusion into CNF and extract clauses
– Repeatedly apply resolution until either the empty clause (contradiction) is
derived or no more clauses can be derived
– If the empty clause is derived, answer ‘yes’ (query follows from knowledge
base), otherwise answer ‘no’ (query does not follow from knowledge base)
Resolution: Example 1
(G V H)→(¬J Λ ¬K), G |– ¬J
Clausal form of (G V H) → (¬J Λ ¬K) is
{¬G V ¬J, ¬H V ¬J, ¬G V ¬K, ¬H V ¬K}
Check Your Progress
1. Predicate logic does not serve as a standard of comparison for other
representation and reasoning methods. (True or False)
2. List four types of logic.
3. Define atomic formula.
4. List the two natural deduction methods.
1. ¬G V ¬J [Premise]
2. ¬H V ¬J [Premise]
3. ¬G V ¬K [Premise]
4. ¬H V ¬K [Premise]
5. G [Premise]
6. J [¬ Conclusion]
7. ¬G [1, 6 Resolution]
8. □ [5, 7 Resolution]
Resolution: Example 2
P→¬Q, ¬Q→R |– P→R
Recall P→R ≡ ¬P V R
Clausal form of ¬(¬P V R) is {P, ¬R}
1. ¬P V ¬Q [Premise]
2. Q V R [Premise]
3. P [¬ Conclusion]
4. ¬R [¬ Conclusion]
5. ¬Q [1, 3 Resolution]
6. R [2, 5 Resolution]
7. □ [4, 6 Resolution]
Resolution: Example 3
((P V Q) ∧ ¬P)→Q
Clausal form of ¬(((P V Q) ∧ ¬P)→Q) is {P V Q, ¬P, ¬Q}
1. P V Q [¬ Conclusion]
2. ¬P [¬ Conclusion]
3. ¬Q [¬ Conclusion]
4. Q [1, 2 Resolution]
5. □ [3, 4 Resolution]
Soundness and Completeness Again
– Resolution refutation is sound, i.e. it preserves truth (if the premises are
all true, any conclusion drawn from them must also be true)
– Resolution refutation is complete, i.e. it is capable of proving all consequences
of any knowledge base (not shown here!)
– Resolution refutation is decidable, i.e. there is an algorithm implementing
resolution which when asked whether S |– P, can always answer ‘yes’ or ‘no’
(correctly)
Heuristics in Applying Resolution
– Clause elimination—can disregard certain types of clauses
– Pure clauses: contain literal L where ¬ L doesn’t appear elsewhere
– Tautologies: clauses containing both L and ¬L
– Subsumed clauses: another clause exists containing a subset of the literals
– Ordering strategies
– Resolve unit clauses (only one literal) first
– Start with query clauses
– Aim to shorten clauses
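These clause-elimination heuristics are easy to state in code. A hedged Python sketch, again assuming clauses are represented as frozensets of literal strings with '~' for negation:

```python
def subsumes(c1, c2):
    """c1 subsumes c2 if every literal of c1 also occurs in c2."""
    return c1 <= c2

def is_tautology(clause):
    """A clause containing both L and ~L is always True."""
    return any(("~" + lit) in clause for lit in clause
               if not lit.startswith("~"))

def pure_literals(clauses):
    """Literals whose complement never appears in the clause set."""
    all_lits = set().union(*clauses)
    def neg(l):
        return l[1:] if l.startswith("~") else "~" + l
    return {l for l in all_lits if neg(l) not in all_lits}

print(subsumes(frozenset({"P"}), frozenset({"P", "Q"})))   # True
print(pure_literals([frozenset({"P", "Q"}), frozenset({"~Q", "R"})]))
```

A clause containing a pure literal can never take part in a refutation, so it (and any subsumed or tautologous clause) can be discarded before the search starts.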
4.6.2.2 Resolution strategies
Different strategies have been tried for selecting the clauses to be resolved. These
include:
a. Level saturation or two-pointer method: the outer pointer starts at the
negated conclusion; the inner pointer starts at the first clause. The two
clauses denoted by the pointers are resolved, if possible, with the result
added to the end of the list of clauses. The inner pointer is incremented
to the next clause until it reaches the outer pointer; then the outer pointer
is incremented and the inner pointer is reset to the front. The two-pointer
method is a breadth-first method that will generate many duplicate
clauses.
b. Set of support: One clause in each resolution step must be part of the
negated conclusion or a clause derived from it. This can be combined
with the two-pointer method by putting the clauses from the negated
conclusion at the end of the list. Set-of-support keeps the proof process
focused on the theorem to be proved rather than trying to prove
everything.
c. Unit preference: Clauses are prioritized, with unit clauses (or, more
generally, shorter clauses) preferred. The goal is the empty clause,
which has zero literals and can only be obtained by resolving two unit
clauses; resolution with a unit clause always makes the resolvent smaller.
d. Linear resolution: one clause in each step must be the result of the
previous step. This is a depth-first strategy. It may be necessary to back
up to a previous clause if no resolution with the current clause is possible.
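Strategy (b) can be sketched as follows. This illustrative Python fragment (clause representation assumed as before: frozensets of literal strings, '~' for negation) keeps a queue of supported clauses, so every resolution step involves a clause descended from the negated conclusion:

```python
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtainable by resolving c1 with c2 on one literal."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refute_sos(axioms, support):
    """Set-of-support resolution: each step uses a clause descended
    from the negated conclusion (the `support` clauses)."""
    usable = set(axioms)
    queue = list(support)            # breadth-first over supported clauses
    seen = set(support)
    while queue:
        c = queue.pop(0)
        for other in usable | seen:
            for r in resolvents(c, other):
                if not r:
                    return True      # empty clause derived
                if r not in seen:
                    seen.add(r)
                    queue.append(r)
        usable.add(c)
    return False

axioms  = [frozenset({"P"}), frozenset({"~P", "Q"})]
support = [frozenset({"~Q"})]        # negated conclusion Q
print(refute_sos(axioms, support))   # True
```

Because the axioms alone are assumed consistent, restricting attention to supported clauses loses no refutations while pruning much of the search.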
4.6.3 Resolution in Propositional Logic
Algorithm: Propositional Resolution
1. Convert all the propositions of KB to clause form (S).
2. Negate α and convert it to clause form. Add it to S.
3. Repeat until either a contradiction is found or no progress can be made.
a. Select two clauses (α ∨ ¬P) and ( γ ∨ P).
b. Add the resolvent (α ∨ γ) to S.
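The loop in step 3 can be sketched as a saturation procedure. A minimal Python illustration (the clause encoding as frozensets of '~'-prefixed literal strings is an assumption; real provers use far more efficient data structures):

```python
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(clauses):
    """Saturate the clause set; True iff the empty clause is derived."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolvents(a, b):
                    if not r:
                        return True      # empty clause: contradiction
                    new.add(r)
        if new <= clauses:
            return False                 # no progress: not refutable
        clauses |= new

# KB = {P, ¬P ∨ ¬Q ∨ R, ¬S ∨ Q, ¬T ∨ Q, T}; query R, so add ¬R:
kb = [frozenset(c) for c in
      [{"P"}, {"~P", "~Q", "R"}, {"~S", "Q"}, {"~T", "Q"}, {"T"}, {"~R"}]]
print(refutes(kb))   # True: R follows from the knowledge base
```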
Ex: 1.
KB = {P, (P ∧ Q) → R, (S ∨ T) → Q, T}
α = R
In first-order form this becomes:
KB = {P(a), ∀x: (P(x) ∧ Q(x)) → R(x), ∀y: (S(y) ∨ T(y)) → Q(y), T(a)}
α = R(a)
Ex: 2. A Few Facts in Propositional Logic
Given Axioms        Clause Form
P                   P                  (1)
(P Λ Q) → R         ¬P V ¬Q V R        (2)
(S V T) → Q         ¬S V Q             (3)
                    ¬T V Q             (4)
T                   T                  (5)
[Fig. 4.1 shows the resolution tree: ¬P V ¬Q V R resolved with ¬R gives ¬P V ¬Q;
resolved with P gives ¬Q; ¬Q resolved with ¬T V Q gives ¬T; ¬T resolved with T
gives the empty clause]
Fig. 4.1 Resolution in Propositional Logic
4.6.4 The Unification Algorithm
The resolution step is still valid for predicate calculus, i.e. if clause C1 contains a
literal L and clause C2 contains ¬ L, then the resolvent of C1 and C2 is a logical
consequence of C1 and C2 and added to the set of clauses without changing its truth
value.
However, since L in C1 and ¬ L in C2 may contain different arguments, we
must do something to make L in C1 and ¬ L in C2 exactly complementary.
Since every variable is universally quantified, we are justified in substituting
any term for a variable. Thus, we need to find a set of substitutions of terms for
variables that will make the literals L in C1 and ¬ L in C2 exactly the same. The
process of finding this set of substitutions is called unification. We wish to find the
most general unifier of two clauses, which will preserve as many variables as
possible.
Example: C1 = P(z) v Q(z), C2 = ¬ P(f(x)) v R(x) . If we substitute f(x) for z in
C1, C1 becomes P(f(x)) v Q(f(x)) . The resolvent is Q(f(x)) v R(x) .
Examples of Unification
Consider unifying the literal P(x, g(x)) with:
1. P(z, y) : unifies with {x/z, g(x)/y}
2. P(z, g(z)): unifies with {x/z} or {z/x}
3. P(Socrates, g(Socrates)) : unifies, {Socrates/x}
4. P(z, g(y)): unifies with {x/z, x/y} or {z/x, z/y}
5. P(g(y), z): unifies with {g(y)/x, g(g(y))/z}
6. P (Socrates, f (Socrates)): does not unify: f and g do not match.
7. P (g(y), y): does not unify: no substitution works.
Example 2:
S = {f (x, g(x)), f (h(y), g(h(z)))}
D0 = {x, h(y)} so s1 = {x/h(y)}
S1 = { f (h(y),g(h(y))), f (h(y),g(h(z)))}
D1 = {y, z} so s2 = {x/h(z), y/z}
S2 = { f (h(z),g(h(z))), f (h(z),g(h(z)))}
i.e. s2 is an m.g.u.
What about S = {f (h(x), g(x)), f (g(x), h(x))}? (Exercise: try to unify this pair.)
4.6.4.1 Substitutions
Unification will produce a set of substitutions that make two literals the same. A
substitution ti / vi specifies substitution of term ti for variable vi.
A substitution set can be represented as either sequential substitutions (done
one at a time in sequence) or as simultaneous substitutions (done all at once).
Unification can be done correctly either way.
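The distinction matters: applying the same bindings one at a time is not the same as applying them all at once. A small Python illustration (compound terms as tuples and variables as plain strings are assumed encodings):

```python
def apply_subst(term, subst):
    """Simultaneously replace each variable with its binding (one pass)."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])
    return subst.get(term, term)

t = ("f", "x", "y")
# Simultaneous {x -> y, y -> A} touches only the original variables:
print(apply_subst(t, {"x": "y", "y": "A"}))                  # ('f', 'y', 'A')
# Sequential x -> y, then y -> A, also captures the x that became y:
print(apply_subst(apply_subst(t, {"x": "y"}), {"y": "A"}))   # ('f', 'A', 'A')
```

This is why, in the algorithm below, new bindings are first applied to the existing substitution set: that bookkeeping keeps the set simultaneous.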
4.6.4.2 Unification algorithm
The basic unification algorithm is simple. However, it must be implemented with
care to ensure that the results are correct. We begin by making sure that the two
expressions have no variables in common. If there are common variables, substitute
a new variable in one of the expressions. (Since variables are universally quantified,
another variable can be substituted without changing the meaning.)
Imagine moving a pointer left-to-right across both expressions until parts are
encountered that are not the same in both expressions. If, for example, one is a variable
and the other is a term not containing that variable:
1. Substitute the term for the variable in both expressions.
2. Substitute the term for the variable in the existing substitution set. (This
is necessary so that the substitution set will be simultaneous.)
3. Add the substitution to the substitution set.
Algorithm: Unify (L1, L2)
1. If L1 or L2 is a variable or constant, then:
(a) If L1 and L2 are identical, then return NIL.
(b) Else if L1 is a variable, then if L1 occurs in L2 then return FAIL, else
return {(L2/L1)}.
(c) Else if L2 is a variable, then if L2 occurs in L1 then return FAIL, else
return {(L1/L2)}.
(d) Else return FAIL.
2. If the initial predicate symbols in L1 and L2 are not identical, then return FAIL.
3. If L1 and L2 have a different number of arguments, then return FAIL
4. Set SUBST to NIL.
5. For i ← 1 to number of arguments in L1:
a) Call Unify with the ith argument of L1 and the ith argument of L2,
putting result in S.
b) If S = FAIL then return FAIL.
c) If S is not equal to NIL then:
i. Apply S to the remainder of both L1 and L2.
ii. SUBST: = APPEND(S, SUBST).
6. Return SUBST.
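The algorithm above can be sketched in Python. The representation is assumed for illustration: variables are lowercase strings, constants are capitalized strings, and literals or compound terms are tuples whose first element is the predicate or function symbol. None plays the role of FAIL and a dict plays the role of the substitution set:

```python
def is_var(t):
    return isinstance(t, str) and t[0].islower()

def substitute(t, s):
    """Apply substitution s to term t, chasing chains of bindings."""
    if is_var(t):
        return substitute(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear inside term t under s?"""
    t = substitute(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s=None):
    """Return an m.g.u. extending s, or None on failure (FAIL)."""
    if s is None:
        s = {}
    a, b = substitute(a, s), substitute(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

# P(x, g(x)) unifies with P(z, g(z)); it fails against P(g(y), y):
print(unify(("P", "x", ("g", "x")), ("P", "z", ("g", "z"))))   # {'x': 'z'}
print(unify(("P", "x", ("g", "x")), ("P", ("g", "y"), "y")))   # None
```

The second call fails via the occurs check, matching example 7 above; omitting that check (as Prolog does by default, for speed) would build the infinite term g(g(g(...))).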
4.6.4.3 Unification implementation
1. Initialize the substitution set to be empty. A nil set indicates failure.
2. Recursively unify expressions:
1. Identical items match.
2. If one item is a variable vi and the other is a term ti not containing that
variable, then:
1. Substitute ti / vi in the existing substitutions.
2. Add ti / vi to the substitution set.
3. If both items are functions, the function names must be identical and all
arguments must unify. Substitutions are made in the rest of the expression
as unification proceeds.
Unification
We can get the inference immediately if we can find a substitution θ such that
King(x) and Greedy(x) match King(John) and Greedy(y).
θ = {x/John, y/John} works.
Unify (α, β) = θ if αθ = βθ

p                  q                      θ
Knows (John, x)    Knows (John, Jane)     {x/Jane}
Knows (John, x)    Knows (y, OJ)          {x/OJ, y/John}
Knows (John, x)    Knows (y, Mother(y))   {y/John, x/Mother(John)}
Knows (John, x)    Knows (x, OJ)          fail

The last pair fails because x would have to be bound to both OJ and John.
Standardizing apart eliminates the overlap of variables, e.g. renaming the second
literal to Knows (z17, OJ) allows the unification to succeed.
To unify Knows (John, x) and Knows (y, z), θ = {y/John, x/z } or θ = {y/John,
x/John, z/John}
The first unifier is more general than the second.
There is a single most general unifier (MGU) that is unique up to renaming of
variables.
MGU = { y/John, x/z }
4.6.5 Resolution in Predicate Logic
Algorithm: Resolution
1. Convert all the propositions of F to clause form.
2. Negate P and convert to clause form. Add it to the set of clauses obtained in
1.
3. Repeat until a contradiction is found, no progress can be made, or a
predetermined amount of effort has been expended.
a. Select two clauses. Call these parent clauses.
b. Resolve them together. The resolvent will be the disjunction of all the
literals of both parent clauses with appropriate substitutions performed
and with the following exception: If there is one pair of literals T1 and
¬ T2, such that one of the parent clauses contains T1 and the other
contains ¬ T2 and if T1 and T2 are unifiable, then neither T1 nor ¬ T2
should appear in the resolvent. If there is more than one pair of
complementary literals, only one pair should be omitted from the
resolvent.
c. If the resolvent is the empty clause, then a contradiction has been found.
If it is not, then add it to the set of clauses available to the procedure.
Example:
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Pompeians were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
• Axioms in clause form are:
1. man(Marcus)
2. Pompeian(Marcus)
3. ¬ Pompeian(x1) v Roman(x1)
4. Ruler(Caesar)
5. ¬ Roman(x2) v loyalto(x2, Caesar) v hate(x2, Caesar)
6. loyalto(x3, f1(x3))
7. ¬ man(x4) v ¬ ruler(y1) v ¬ tryassassinate(x4, y1) v ¬ loyalto(x4, y1)
8. tryassassinate(Marcus, Caesar)
Prove: hate(Marcus, Caesar)
[Fig. 4.2 shows the refutation, starting from the negated goal ¬hate(Marcus, Caesar):
resolved with 5 (Marcus/x2) → ¬Roman(Marcus) V loyalto(Marcus, Caesar)
resolved with 3 (Marcus/x1) → ¬Pompeian(Marcus) V loyalto(Marcus, Caesar)
resolved with 2 → loyalto(Marcus, Caesar)
resolved with 7 (Marcus/x4, Caesar/y1) → ¬man(Marcus) V ¬ruler(Caesar) V ¬tryassassinate(Marcus, Caesar)
resolved with 1 → ¬ruler(Caesar) V ¬tryassassinate(Marcus, Caesar)
resolved with 4 → ¬tryassassinate(Marcus, Caesar)
resolved with 8 → □ (contradiction)]
Fig. 4.2 Resolution Proof
Prove: loyalto(Marcus, Caesar)
[Fig. 4.3(a) starts from the negated goal ¬loyalto(Marcus, Caesar):
resolved with 5 (Marcus/x2) → ¬Roman(Marcus) V hate(Marcus, Caesar)
resolved with 3 (Marcus/x1) → ¬Pompeian(Marcus) V hate(Marcus, Caesar)
resolved with 2 → hate(Marcus, Caesar), after which no further progress is possible.
Fig. 4.3(b): with additional clauses 9 and 10 relating hate and persecute, resolution loops:
hate(Marcus, Caesar) resolved with 10 (Marcus/x6, Caesar/y3) → persecute(Caesar, Marcus)
resolved with 9 (Marcus/x5, Caesar/y2) → hate(Marcus, Caesar), and so on indefinitely.]
Fig. 4.3 An Unsuccessful Attempt at Resolution
4.7 NATURAL DEDUCTION
Natural deduction methods perform deduction in a manner similar to reasoning
used by humans, e.g., in proving mathematical theorems. Forward chaining and
backward chaining are natural deduction methods. These are similar to the algorithms
described earlier for propositional logic, with extensions to handle variable bindings
and unification.
Backward chaining by itself is not complete, since it only handles Horn clauses
(clauses that have at most one positive literal). Not all clauses are Horn; for example,
‘every person is male or female’ becomes ¬ Person(x) V Male(x) V Female(x)
which has two positive literals. Such clauses do not support backward chaining.
Splitting can be used with back chaining to make it complete. Splitting makes
assumptions (Ex. ‘Assume x is Male’) and attempts to prove the theorem for each
case.
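Backward chaining over Horn clauses can be sketched briefly. This propositional Python fragment is an illustrative assumption (the rule encoding is invented here, and cycles in the rule base are not handled); it tries each rule body whose head matches the goal:

```python
def backward_chain(goal, rules, facts):
    """Prove `goal` by backward chaining over propositional Horn rules.

    `rules` maps a head to a list of alternative bodies (subgoal lists).
    """
    if goal in facts:
        return True
    for body in rules.get(goal, []):
        # the rule fires if every subgoal in its body can be proved
        if all(backward_chain(g, rules, facts) for g in body):
            return True
    return False

rules = {"R": [["P", "Q"]], "Q": [["S"], ["T"]]}
facts = {"P", "T"}
print(backward_chain("R", rules, facts))   # True: R <- P ∧ Q and Q <- T
```

Because every clause here has exactly one positive literal (the head), the goal-directed search is straightforward; the non-Horn clause ¬Person(x) V Male(x) V Female(x) above has no single head to chain backward from.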
4.8 SUMMARY
In this unit, you have learned how predicate logic can be used as the basis of a
technique for knowledge representation. You also learned how resolution is applied
in knowledge representation and problem solving. There are two ways to approach
the computational goal. The first is to search for good heuristics that can inform a
theorem-proving program. The second is to change the data given to the program
rather than the program itself. A difficulty with the use of theorem proving in AI
systems is that some kinds of information are not easily represented in predicate
logic. Consider the following examples:
• ‘It is very cool today.’ How can the relative degree of coolness be represented?
• ‘Black-haired people often have grey eyes.’ How can the amount of certainty be
represented?
• ‘It is better to have more pieces on the board than the opponent has.’ How
can we represent this kind of heuristic information?
• ‘I know Dravid thinks Sachin will win, but I think he is going to lose.’
How can several different belief systems be represented at once?
For these examples we have not yet given a suitable representation. They primarily
deal with a knowledge base that is incomplete, although other problems also exist,
such as the difficulty of representing continuous phenomena in a discrete system.
You will learn about their solutions in the next units. You have also learned in this
unit that:
• First-order logic allows us to speak about objects, properties of objects
and relationships between objects.
• It also allows quantification over variables.
• First-order logic is quite an expressive knowledge representation language;
much more so than propositional logic.
• However, you need to add things like equality if you wish to be able to do
things like counting.
• However, you have also traded decidability for expressiveness.
• How much of a problem is this?
• If you add (Peano) axioms for mathematics, then you encounter Gödel’s
famous incompleteness theorem (which is beyond the scope of this course).
You have now investigated one knowledge representation and reasoning
formalism.
This means you can draw new conclusions from the knowledge you have:
you can reason.
You have enough to build a knowledge-based agent.
However, you have also learned that propositional logic is a weak language;
there are many things that cannot be expressed.
To express knowledge about objects, their properties and the relationships
that exist between objects, you need a more expressive language: first-order logic.
4.9 KEY TERMS
• Logic: It is a language for reasoning; a collection of rules used in logical
reasoning.
• Predicate logic: It is one of the major knowledge representation and reasoning
methods. It serves as a standard of comparison for other representation and
reasoning methods.
• Propositional Logic: It is the study of statements and the connectives between them.
• Argument: Any argument can be expressed as a compound statement. Take
all the premises, conjoin them and make that conjunction the antecedent of a
conditional and make the conclusion the consequent. This implication
statement is called the corresponding conditional of the argument.
• Natural deduction: These methods perform deduction in a manner similar
to reasoning used by humans, e.g. in proving mathematical theorems. Forward
chaining and backward chaining are natural deduction methods.
4.10 ANSWERS TO ‘CHECK YOUR PROGRESS’
1. False
2. Propositional logic, predicate logic, temporal logic and modal logic.
3. An atomic formula (or simply atom) is a formula with no deeper propositional
structure, i.e., a formula that contains no logical connectives or a formula
that has no strict sub-formulas.
4. Forward chaining and backward chaining are natural deduction methods.
4.11 QUESTIONS AND EXERCISES
1. Assume the facts given below:
• Ram only likes easy courses.
• Science courses are hard.
• All the courses in the basket weaving department are easy.
• MA-301 is a basket weaving course.
Use a resolution to answer the question ‘What course would Ram like?’.
2. Consider the following sentences:
• Sam likes all kinds of food.
• Apples are food.
• Chicken is food.
• Anything anyone eats and isn’t killed by is food.
• John eats peanuts and is still alive.
• Myke eats everything John eats.
a. Translate these sentences into formulae in predicate logic.
b. Prove that Sam likes peanuts using a backward chaining.
c. Convert the formulas of part a into clause form.
d. Prove that Sam likes peanuts using resolution.
e. Use a resolution to answer the question ‘What food does Myke eat?’
3. Trace the operation of the unification algorithm on each of the following
pairs of literals.
a. f(Marcus) and f(Caesar)
b. f(x) and f(g(x))
c. f(Marcus, g(x, y)) and f(x, g(Caesar, Marcus))
4. Suppose that we are attempting to resolve the following clauses:
loves(mother(a), a)
¬loves(y, x) V loves(x, y)
a. What will be the result of the unification algorithm?
b. What must be generated as a result of resolving these two clauses?
c. What does the example show about the order in which the substitutions
determined by the unification procedure must be performed?
5. Assume the following facts:
– Steve only likes easy courses.
– Computing courses are hard.
– All courses in Sociology are easy.
– ‘Society is evil’ is a sociology course.
Represent these facts in predicate logic and answer the question:
What course would Steve like?
6. Find out what knowledge representation schemes are used in the STRIPS
system.
7. Translate the following sentences into propositional logic:
(i) If Jane and John are not in town we will play tennis.
(ii) It will either rain today or it will be dry today.
(iii) You will not pass this course unless you study.
To do the translation you will need to
(a) Identify a scheme of abbreviation
(b) Identify logical connectives
8. Convert the following formulae into Conjunctive Normal Form (CNF):
(i) P → Q
(ii) (P → ¬Q) → R
(iii) ¬(P ∧ ¬Q) → (¬R ∨ ¬Q)
9. Show using the truth table method that the following inferences are valid.
(i) P → Q, ¬Q |= ¬P
(ii) P → Q |= ¬Q → ¬P
(iii) P → Q, Q → R |= P → R
10. Repeat question 9 using resolution. In this case we want to show:
(i) P → Q, ¬Q ⊢ ¬P
(ii) P → Q ⊢ ¬Q → ¬P
(iii) P → Q, Q → R ⊢ P → R
11. Determine whether the following sentences are valid (i.e., tautologies) using
truth tables.
(i) ((P ∨ Q) ∧ ¬P) → Q
(ii) ((P → Q) ∧ ¬(P → R)) → (P → Q)
(iii) ¬(¬P ∧ P) ∧ P
(iv) (P ∨ Q) → ¬(¬P ∧ ¬Q)
12. Repeat question 11 using resolution. In this case we aim to show:
(i) ⊢ ((P ∨ Q) ∧ ¬P) → Q
(ii) ⊢ ((P → Q) ∧ ¬(P → R)) → (P → Q)
(iii) ⊢ ¬(¬P ∧ P) ∧ P
(iv) ⊢ (P ∨ Q) → ¬(¬P ∧ ¬Q)
4.12 FURTHER READING
1. www.mm.iit.uni-kiskilc.hu
2. www.cs.cardiff.ac.uk
3. www.yoda.cis.temple.edu
4. www.csie.ntu.edu.tw
5. www.csie.cyut.edu.tw
6. www.asutonlab.org
7. www.ndsu.nodak.edu
8. www.cs.wise.edu
9. www.acs.madonna.edu
10. www.pub.ufasta.edu.ar
11. www.virtualworldlets.net
12. www.neda10.norminet.org.ph
13. www.csi.usiuthal.edu
14. www.cs.colostate.edu
15. www.cs.cf.ac.uk
16. www.cs.ust.hk
17. www.it.bond.edu.au
18. www.rwc.us.edu
19. www.cs.uwp.edu
20. Hon, Giora. 1995. ‘Going Wrong: To Make a Mistake, to Fall into an Error’.
The Review of Metaphysics, September.
21. Jensen, Thomas. 1997. ‘Disjunctive Program Analysis for Algebraic Data
Types’. ACM Transactions on Programming Languages and Systems, 1 September.
22. Field, Hartry. 2006. ‘Compositional Principle VS Schematic Reasoning’. The
Monist, January.
24. Edgington, Dorothy. 1995. ‘On Conditionals’. Mind, April.
UNIT 5 WEAK SLOT AND FILLER
STRUCTURES
Weak Slot and Filler
Structures
NOTES
Structure
5.0 Introduction
5.1 Unit Objectives
5.2 Semantic Nets
5.2.1 Representation in a Semantic Net
5.2.2 Inference in a Semantic Net
5.2.3 Extending Semantic Nets
5.3 Frames
5.3.1 Frame Knowledge Representation
5.3.2 Distinction between Sets and Instances
5.4 Summary
5.5 Key Terms
5.6 Answers to ‘Check Your Progress’
5.7 Questions and Exercises
5.8 Further Reading
5.0 INTRODUCTION
In the preceding unit, you learnt the various concepts relating to the use of predicate
logic and how to use simple facts in logic. In this unit, you will learn about the
various concepts about weak slot and filler structures.
There are two attributes that are of very general significance, the use of which
you have already studied in the previous units: instance and isa. These attributes
are important because they support property inheritance. They are called a variety
of things in AI systems, but the names do not matter. What does matter is that they
represent class membership and inclusion and that class inclusion is transitive. In
slot and filler systems, these attributes are usually represented explicitly. In logicbased systems, these relationships may be represented this way or they may be
represented implicitly by a set of predicates describing a particular class.
This is introduced as a device to support property inheritance along isa and
instance links. This is an important aspect of these structures. Monotonic inheritance
can be performed substantially more efficiently with such structures than with pure
logic, and non-monotonic inheritance is easily supported. The reason that inheritance
is easy is that the knowledge in slot and filler systems consists of structures as a set
of entities and their attributes. These structures turn out to be useful for other reasons
besides the support for inheritance.
A weak slot and filler structure is a data structure; the reasons to use this data
structure are as follows:
• It enables attribute values to be retrieved quickly.
■ Assertions are indexed by the entities.
■ Binary predicates are indexed by the first argument.
• Properties of relations are easy to describe.
• It allows ease of consideration as it embraces aspects of object-oriented
programming.
This structure is called a weak slot and filler structure because:
• A slot is an attribute value pair in its simplest form.
• A filler is a value that a slot can take—could be a numeric, string (or any
data type) value or a pointer to another slot.
• A weak slot and filler structure does not consider the content of the
representation.
We describe two kinds of structures: semantic nets and frames. In this unit,
you will learn about the representation of these structures and techniques
for reasoning with them. We call these ‘knowledge poor’ structures ‘weak’ by analogy
with the weak methods for problem solving.
5.1
UNIT OBJECTIVES
After going through this unit, you will be able to:
• Define slot and filler systems
• Analyse the concept of semantic nets
• Discuss the role of frames
5.2
SEMANTIC NETS
The simplest kind of structured objects are the semantic nets, which were originally
developed in the early 1960s to represent the meaning of English words. They are
important both historically, and also in introducing the basic ideas of class hierarchies
and inheritance.
A semantic net is really just a graph, where the nodes in the graph represent
concepts, and the arcs represent binary relationships between concepts. The most
important relations between concepts are subclass relations between classes and
subclasses, and instance relations between particular objects and their parent class.
However, any other relations are allowed, such as has-part, colour, etc. So, to
represent some knowledge about animals (as AI people so often do) we might have
the following network:
This network represents the fact that mammals and reptiles are animals, that
mammals have heads, an elephant is a large grey mammal, Clyde and Nellie are
both elephants and that Nellie likes apples. The subclass relations define a class
hierarchy (in this case very simple).
The subclass and instance relations may be used to derive new information,
which is not explicitly represented. We should be able to conclude that both Clyde
and Nellie have a head and are large and grey. They inherit information from their
parent classes. Semantic networks normally allow efficient inheritance-based
inferences using special-purpose algorithms.
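Inheritance along isa/instance links can be sketched with a few lines of code. In this illustrative Python fragment (the Node class and property names are assumptions made here, not a standard API), a property lookup climbs the hierarchy until a value is found:

```python
class Node:
    def __init__(self, name, parent=None, **props):
        self.name, self.parent, self.props = name, parent, props

    def get(self, prop):
        """Look up a property, inheriting along isa/instance links."""
        if prop in self.props:
            return self.props[prop]
        return self.parent.get(prop) if self.parent else None

mammal   = Node("mammal", has_part="head")
elephant = Node("elephant", parent=mammal, size="large", colour="grey")
clyde    = Node("Clyde", parent=elephant)
print(clyde.get("colour"), clyde.get("has_part"))   # grey head
```

Clyde stores no properties of his own, yet the lookup finds 'grey' on elephant and 'head' on mammal; this chain-following is exactly the special-purpose inference the text describes.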
Semantic nets are fine at representing relationships between two objects, but
what if we want to represent a relation between three or more objects? Let us say
we want to represent the fact that ‘John gives Mary the book’. This might be
114
Self-Instructional
Material
represented in logic as gives(john, mary, book2) where book2 represents the particular
book we are talking about. However, in semantic networks we have to view the fact
as representing a set of binary relationships between a ‘giving’ event and some
objects.
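This reification trick can be made concrete with a small sketch. The event and relation names below (give1, agent, recipient) are illustrative assumptions; the point is that one ternary fact becomes several binary triples around an event node:

```python
# gives(john, mary, book2) reified as binary relations on an event node:
triples = [
    ("give1", "instance",  "giving-event"),
    ("give1", "agent",     "john"),
    ("give1", "recipient", "mary"),
    ("give1", "object",    "book2"),
]

def objects_with(relation, value, triples):
    """Subjects standing in `relation` to `value`."""
    return [s for s, r, v in triples if r == relation and v == value]

print(objects_with("agent", "john", triples))   # ['give1']
```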
When semantic networks became popular in the 1970s, there was much
discussion about what the nodes and relations really meant. People were using them
in subtly different ways, which led to much confusion. For example, a node such as
elephant might be used to represent the class of all elephants or just a typical elephant.
Saying that an elephant has_part head could mean that every elephant has some
particular head, that there exists some elephant that has a head, or (more reasonably
in this case) that they all have some object belonging to the class head. Depending
on what interpretation you choose for your nodes and links, different inferences are
valid. For example, if it’s just a typical elephant, then Clyde may have properties
different from general elephant properties (such as being pink and not grey).
The simplest way to interpret the class nodes is as denoting sets of objects.
So, an elephant node denotes the set of all elephants. Nodes such as Clyde and
Nellie denote individuals. So, the instance relationship can be defined in terms of
set membership (Nellie is a member of the set of all elephants), while the subclass
relation can be defined in terms of a subset relation— the set of all elephants is a
subset of the set of all mammals. Saying that elephants are grey means (in the
simple model) that every individual in the set of elephants is grey (so Clyde can’t be
pink). If we interpret networks in this way we have the advantage of a clear, simple
semantics, but the disadvantage of a certain lack of flexibility—maybe Clyde is
pink!
In the debate about semantic nets, people were also concerned about their
representational adequacy (i.e. what sort of facts they were capable of representing).
Things that are easy to represent in logic (such as ‘every dog in town has bitten the
constable’) are hard to represent in nets (at least, in a way that has a clear and well-defined interpretation). Techniques were developed to allow such things to be
represented, which involved partitioning the net into sections, and introducing a
special relationship. These techniques didn’t really catch on, so we won’t go into
them here, but they can be found in many AI textbooks.
To summarize, nets allow us to simply represent knowledge about an object
that can be expressed as binary relations. Subclass and instance relations allow us
to use inheritance to infer new facts/relations from the explicitly represented one.
However, early nets didn’t have very clear semantics (i.e. it wasn’t clear what the
nodes and links really meant). It was difficult to use nets in a fully consistent and
meaningful manner, and still use them to represent what you wanted to represent.
Techniques evolved to get round this, but they are quite complex, and seem to
partly remove the attractive simplicity of the initial idea.
A Semantic Net is a formal graphic language representing facts about entities in
some world about which we wish to reason. The meaning of a concept comes from
the ways in which it is connected to other concepts. Computer languages cannot
tolerate the ambiguity of natural languages. It is therefore necessary to have well-defined sets of predicates and arguments for a domain. Semantic nets for realistic
problems become large and cluttered; the problems need to be broken down.
Semantic nets are a simple way of representing the relationships between
entities and concepts.
Self-Instructional
Material
115
Weak Slot and Filler
Structures
The major idea is that:
• The meaning of a concept comes from its relationship to other concepts, and
that,
NOTES
• The information is stored by interconnecting nodes with labelled arcs.
Let’s discuss the following aspects of Semantic Nets:
• Representation in a Semantic Net
• Inference in a Semantic Net
• Extending Semantic Nets
5.2.1 Representation in a Semantic Net
The physical attributes of a person can be represented as in Figure 5.1.
[Figure: Dhoni is an instance of Person; Person is a Mammal (isa) and has_part Head; Dhoni has team India and team_colours Black/Blue.]
Fig. 5.1 A Semantic Network
These values can also be represented in logic as: isa (person, mammal),
instance (Dhoni, person), team(Dhoni, India).
We have already seen how conventional predicates such as lecturer (Rao)
can be written as instance (Rao, lecturer). Recall that isa and instance represent
inheritance and are popular in many knowledge representation schemes. But we
have a problem: How we can have more than 2 place predicates in semantic nets?
E.g. score (India, Australia, 236). Solution:
• Create new nodes to represent new objects either contained or alluded to
in the knowledge, game and fixture in the current example. Relate
information to nodes and fill up slots (see Fig. 5.2).
[Figure: Fixture 3 is an instance (isa) of Cricket Match, with home-team India, away-team Australia and score 236.]
Fig. 5.2 A Semantic Network for n-Place Predicate
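As a rough sketch, this reification trick can be implemented by storing the net as a set of labelled binary edges; the node and relation names here (fixture3, home_team and so on) are illustrative choices, not fixed notation:

```python
# A semantic net stored as a set of labelled binary edges (triples).
# An n-place fact such as score(India, Australia, 236) is reified into
# a new node ("fixture3") with one binary link per argument.
net = set()

def add(subject, relation, obj):
    net.add((subject, relation, obj))

add("fixture3", "isa", "cricket_match")
add("fixture3", "home_team", "India")
add("fixture3", "away_team", "Australia")
add("fixture3", "score", 236)

def about(node):
    """All (relation, value) pairs attached to a node."""
    return {(r, o) for (s, r, o) in net if s == node}
```

Because every fact is now a binary link hanging off the new node, querying `about("fixture3")` recovers the whole original 3-place fact.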
As a more complex example consider the sentence: Sam gave Ravi a book. Here we have several aspects of an event (see Fig. 5.3).
[Figure: event 1 is an instance of gave, with agent Sam, object book 13 (an instance of book) and receiver Ravi.]
Fig. 5.3 A Semantic Network for a Sentence
5.2.2 Inference in a Semantic Net
Basic inference mechanism: follow links between nodes.
Two methods to do this:
Intersection search
The idea is that spreading activation out of two nodes and finding their intersection reveals relationships among objects. This is achieved by assigning a special tag to each visited node.
The advantages include the entity-based organization and fast parallel implementation. However, every structured question needs a highly structured network.
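A minimal sketch of intersection search, assuming a toy net held as an adjacency map (the node names are illustrative):

```python
from collections import deque

# Toy net: node -> list of (relation, neighbour) links.
links = {
    "Dhoni":        [("instance", "person"), ("team", "India")],
    "person":       [("isa", "mammal")],
    "India":        [("isa", "cricket_team")],
    "cricket_team": [("isa", "team")],
}

def activate(start):
    """Spread activation outward from start, tagging every node reached."""
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        for _, nbr in links.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return seen

# Nodes tagged from both starting points are candidate meeting places
# that suggest a relationship between the two objects.
common = activate("Dhoni") & activate("India")
```

The two activation waves can be run in parallel in a real implementation, which is where the speed advantage mentioned above comes from.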
Inheritance
The isa and instance representations provide a mechanism to implement this.
Inheritance also provides a means of dealing with default reasoning. e.g., we could
represent:
• Emus are birds.
• Typically birds fly and have wings.
• Emus run.
In the following Semantic net:
[Figure: bird has action fly and has_part wings; emu is an instance of bird with action run.]
Fig. 5.4 A Semantic Network for Default Reasoning
In making certain inferences we will also need to distinguish between the
link that defines a new entity and holds its value and the other kind of link that
relates two existing entities. Consider the example shown where the height of two
people is depicted and we also wish to compare them.
We need extra nodes for the concept as well as its value.
[Figure: Ravi has height 170; Sam has height 160.]
Fig. 5.5 Two Heights
Special procedures are needed to process these nodes, but without this
distinction the analysis would be very limited.
[Figure: Ravi's height points to node H1 (value 170); Sam's height points to H2 (value 160); H1 is linked to H2 by a greater-than relation.]
Fig. 5.6 Comparison of Two Heights
Defaults and inheritance
• Defaults and inheritance are ways of achieving some common sense.
• Inheritance is a way of reasoning by default, that is, when information is
missing, fall back to defaults.
• Semantic networks represent inheritance.
[Figure: elephant has skin grey, 4 legs, 1 tail and 1 trunk; e1 is an elephant (isa) with name clyde.]
Using inheritance
• To find the value of a property of e1, first look at e1.
• If the property is not attached to that node, ‘climb’ the isa link to the node’s
parent and search there.
■ isa signifies set membership (an element of a set).
■ ako signifies the subset relation (a kind of).
• Repeat, using isa/ako links, until the property is found or the inheritance
hierarchy is exhausted.
• Sets of things in a semantic network are termed types.
• Individual objects in a semantic network are termed instances.
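The lookup procedure above can be sketched in a few lines; the nodes dictionary mirrors the Clyde example, and the helper name get_property is our own:

```python
# Property lookup that climbs isa/ako links, mirroring the steps above:
# look at the node itself first, then search its parent, and so on.
nodes = {
    "e1":       {"isa": "elephant", "name": "clyde"},
    "elephant": {"ako": "mammal", "skin": "grey", "legs": 4},
    "mammal":   {"ako": "animal", "warm_blooded": True},
    "animal":   {},
}

def get_property(node, prop):
    while node is not None:
        frame = nodes[node]
        if prop in frame:                    # found locally: stop climbing
            return frame[prop]
        node = frame.get("isa") or frame.get("ako")   # climb one level
    return None                              # hierarchy exhausted
```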
Examples of Semantic Networks
State: I own a tan leather chair.
[Figure: chair is ako furniture; my-chair is an instance of chair, with owner me (an instance of person), part seat, covering leather and colour tan (tan isa brown).]
Event: John gives the book to Mary.
[Figure: event7 is an instance of give, with agent John (isa person), beneficiary Mary (isa person) and object book23 (isa book).]
Complex event: Bilbo finds the magic ring in Gollum’s cave.
[Figure: event9 is an instance of find, with agent Bilbo (isa hobbit; hobbit ako person), object magicring2 (isa ring) and location cave77 (isa cave, owned-by Gollum).]
5.2.3 Extending Semantic Nets
Here, we will consider some extensions to Semantic nets that overcome a few
problems (see Questions and Exercises) or extend their expression of knowledge.
Partitioned Networks
Partitioned Semantic Networks allow for:
• propositions to be made without commitment to truth.
• expressions to be quantified.
Basic idea: Break network into spaces which consist of groups of nodes and
arcs and regard each space as a node.
Consider the following: John believes that the earth is flat. We can encode
the proposition the earth is flat in a space and within it have nodes and arcs that
represent the fact (see Fig. 5.7). We can have the nodes and arcs to link this space
with the rest of the network to represent John’s belief.
[Figure: event1 is an instance of believes, with agent John and object space 1; within space 1, Prop 1 asserts that Object 1 (an instance of earth) has_property flat.]
Fig. 5.7 Partitioned Network
Now consider the quantified expression: Every parent loves their child.
To represent this we:
• Create a general statement, GS, special class.
• Make node g an instance of GS.
• Every element will have at least 2 attributes:
■ a form that states which relation is being asserted.
■ one or more forall (∀) or exists (∃) connections; these represent the universally or existentially quantified variables in such statements, e.g. x and y in
∀x: parent(x) → ∃y: child(y) ∧ loves(x, y)
Here we have to construct two spaces, one for each of x and y.
Note: We can express the variables as existentially quantified variables and express the event of love having an agent p and receiver b for every parent p, which could simplify the network.
Also, if we change the sentence to Every parent loves child then the node of
the object being acted on (the child) lies outside the form of the general statement.
Thus it is not viewed as an existentially quantified variable whose value may depend on the agent. So we could construct a partitioned network as in Figure 5.8.
[Figure: gs1 is an instance of GS; its form is a loves assertion i3 whose agent p1 (a parent, marked forall in space1) and receiver c2 (a child, marked exists in space2) carry the quantification.]
Fig. 5.8 A (forall) Partitioned Network
Construct the semantic nets for the following sentences:
1. The dog bit the mail carrier.
2. Every dog has bitten a mail carrier.
3. Every dog in town has bitten the mail carrier.
4. Every dog has bitten every mail carrier.
1. The dog bit the mail carrier.
[Figure: d isa Dogs, m isa mail carrier, b isa Bite; b has assailant d and victim m.]
2. Every dog has bitten a mail carrier.
[Figure: g is an instance of GS (and of SA); its form is a space S1 containing d isa Dogs, m isa mail carrier and b isa Bite, with assailant d and victim m.]
3. Every dog in town has bitten the mail carrier.
[Figure: g is an instance of GS (and of SA); its form is a space S1 containing d isa Town Dogs (a subset of Dogs) and b isa Bite, with assailant d and victim the particular mail carrier C.]
4. Every dog has bitten every mail carrier.
[Figure: g is an instance of GS (and of SA); its form contains d isa Dogs, m isa mail carrier and b isa Bite, with assailant d and victim m, both d and m universally quantified.]

Check Your Progress
1. A semantic net is a formal, non-graphic language representing facts about entities in some world about which we wish to reason. (True or False)
2. Semantic nets do not allow us to simply represent knowledge about an object that can be expressed as binary relations. (True or False)
3. Frames on their own are not particularly helpful but frame systems are a powerful way of encoding information to support reasoning. (True or False)

5.3 FRAMES
Frames can also be regarded as an extension to Semantic nets. Indeed it is not clear
where the distinction between a semantic net and a frame ends. Initially, semantic
nets were used to represent labelled connections between objects. As tasks became
more complex, the representation needs to be more structured. The more structured
the system, the more beneficial to use frames. A frame is a collection of attributes
or slots and associated values that describe some real-world entity. Frames on their
own are not particularly helpful but frame systems are a powerful way of encoding
information to support reasoning. Set theory provides a good basis for understanding
frame systems. Each frame represents:
• a class (set), or
• an instance (an element of a class).
Instead of properties a frame has slots. A slot is like a property, but can contain
more kinds of information, sometimes called facts of the slot:
• The value of the slot
• A procedure that can be run to compute the value if needed (an ‘if-needed’ procedure)
• Procedures to be run when a value is put into the slot or removed
• Data type information, constraints on possible slot fillers
• Documentation
Frames can inherit slots from parent frames. For example, man might inherit
properties from Ape or Mammal (A parent class of man).
Properties
• Frames implement semantic networks.
• They add procedural attachment.
• A frame has slots and slots have values.
• A frame may be generic, i.e. it describes a class of objects.
• A frame may be an instance, i.e. it describes a particular object.
• Frames can inherit properties from generic frames.
Frames are a variant of nets that are one of the most popular ways of
representing non-procedural knowledge in an expert system. In a frame, all the
information relevant to a particular concept is stored in a single complex entity,
called a frame. Superficially, frames look pretty much like record data structures.
However frames, at the very least, support inheritance. They are often used to capture
knowledge about typical objects or events, such as a typical bird or a typical restaurant
meal.
We could represent some knowledge about elephants in frames as follows:
Mammal:
    subclass: Animal
    warm_blooded: yes

Elephant:
    subclass: Mammal
    * colour: grey
    * size: large

Clyde:
    instance: Elephant
    colour: pink
    owner: Fred

Nellie:
    instance: Elephant
    size: small
A particular frame (such as Elephant) has a number of attributes or slots such
as colour and size where these slots may be filled with particular values, such
as grey. We have used a ‘*’ to indicate those attributes that are only true of a typical
member of the class, and not necessarily every member. Most frame systems will
let you distinguish between typical attribute values and definite values that must be
true.
[Rich & Knight in fact distinguish between attribute values that are true of
the class itself, such as the number of members of that class, and typical attribute
values of members]
In the above frame system, we would be able to infer that Nellie was small,
grey and warm blooded. Clyde is large, pink and warm blooded and owned by Fred.
Objects and classes inherit the properties of their parent classes UNLESS they have
an individual property value that conflicts with the inherited one.
Inheritance is simple where each object/class has a single parent class, and
where slots take single values. If slots may take more than one value it is less clear
whether to block inheritance when you have more specific information. For example, if you know that a mammal has_part head, and that an elephant has_part trunk, you may still want to infer that an elephant has a head. It is therefore useful to label slots according to whether they take single values or multiple values.
If objects/classes have several parent classes (e.g. Clyde is both an elephant
and a circus animal), then you may have to decide which parent to inherit from
(maybe, elephants are by default wild, but circus animals are by default tame).
There are various mechanisms for making this choice, based on choosing the most
specific parent class to inherit from.
In general, both slots and slot values may themselves be frames. Allowing
slots of be frames means that we can specify various attributes of a slot. We might
want to say, for example, that the slot size always must take a single value of type
size-set (where size-set is the set of all sizes). The slot owner may take multiple
values of type person (Clyde could have more than one owner). We could specify
this in the frames:
Size:
    instance: Slot
    single_valued: yes
    range: Size-set

Owner:
    instance: Slot
    single_valued: no
    range: Person
The attribute value Fred (and even large and grey, etc.) could be represented
as a frame, e.g:
Fred:
    instance: Person
    occupation: Elephant-breeder
One final useful feature of frame systems is the ability to attach procedures to
slots. So, if we don’t know the value of a slot, but know how it could be calculated,
we can attach a procedure to be used if needed, to compute the value of that slot.
Maybe, we have slots representing the length and width of an object and sometimes
need to know the object’s area—we would write a (simple) procedure to calculate
it, and put that in place of the slot’s value. Such mechanisms of procedural attachment
are useful, but perhaps should not be overused, or else our nice frame system would
consist mainly of just lots of procedures, interacting in an unpredictable fashion.
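A minimal sketch of such an if-needed attachment, using a callable slot filler for the area example (the Frame class and slot names are illustrative):

```python
class Frame:
    """Minimal frame whose slots may hold plain values or attached procedures."""
    def __init__(self, **slots):
        self.slots = dict(slots)

    def get(self, name):
        filler = self.slots.get(name)
        # Procedural attachment: a callable filler acts as an
        # 'if-needed' demon, run only when the value is requested.
        return filler(self) if callable(filler) else filler

# length and width are stored; area is computed on demand.
table = Frame(length=4, width=2,
              area=lambda f: f.get("length") * f.get("width"))
```

A caller cannot tell whether `table.get("area")` was stored or computed, which is exactly the uniform treatment of stored and derived data that frames aim for.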
Frame systems, in all their full glory, are pretty complex and sophisticated
things. More details are available in [Rich & Knight, sec 9.2] or [Luger & Stubblefield,
sec 9.4.1] (or less detail in [Bratko]). The main idea to get clear is the notion of
inheritance and default values. Most of the other features are developed to support
inheritance reasoning in a flexible but principled manner. As we saw for nets, it is
easy to get confused about what slots and objects really mean. In frame systems we
partly get round this by distinguishing between default and definite values, and by
allowing users to make slots ‘first class citizens’, giving the slot particular properties
by writing a frame-based definition.
Advantages of frames
1. A frame collects information about an object in a single place in an organized
fashion.
2. By relating slots to other kinds of frames, a frame can represent typical
structures involving an object; these can be very important for reasoning based
on limited information.
3. Frames provide a way of associating knowledge with objects.
4. Frames may be a relatively efficient way of implementing AI applications.
5. Frames allow stored and computed data to be treated in a uniform manner. For example, a student's class might be stored directly while a grade or marks are computed on demand.
Disadvantages of frames
1. Slot fillers must be real data. For example, it is not possible to say that Ram
is a player or a student, since there is no way to deal with a disjunction in a
slot filler.
2. It is not possible to quantify over slots. For example, there is no way to
represent ‘some students make 100 on the Examination’.
3. It is necessary to repeat the same information to make it usable from different
viewpoints, since methods are associated with slots or particular object types.
For example, it may be easy to answer, ‘Whom does Sam love?’, but hard to
answer ‘Who loves Nisha?’
Frames are good for applications in which the structure of the data and the
available information are relatively fixed. The procedures or methods associated
with slots provide a good way to associate knowledge with a class of objects. Frames
are not particularly good for applications in which deep deductions are made from
a few principles as in theorem proving.
Object oriented programming and frame systems have much in common:
• Class / Instance structure
• Inheritance from higher classes
• Ability to associate programs with classes and call them automatically
Because of these similarities it is not possible to draw a hard distinction
between object oriented systems and frames. However, there are some differences
that are typically found.
Frame systems typically have the following features in contrast to the typical
object oriented system:
1. Richer Slots: A slot can contain more kinds of information. E.g.:
documentation, default value, restrictions on slot fillers.
2. Slot orientation: Usually there are no messages that are not associated
with slots.
3. More complex structure: A frame system could potentially have instance
values, default values, and if needed, methods for the same slot. When all
these are combined with multiple inheritance paths, the result can be
complex.
These are certain problems with frames:
• Inheritance causes trouble.
• Inference tends to become complex and ill structured.
• Frames are good for representing static situations; they are not so good
for representing things that change over time.
An example of a simple frame is shown below:
(Ravi
    (profession (value professor))
    (age (value 42))
    (wife (value Rani))
    (children (value A B))
    (address
        (street (value Nagarampalem))
        (city (value Guntur))
        (state (value AP))
        (PIN (value 522004))))
The general frame template structure is shown below:
(<frame name>
    (<slot 1> (<fact 1> <value 1> … <value k1>)
              (<fact 2> <value 1> … <value k2>))
    (<slot 2> (<fact 1> <value 1> … <value km>))
    …)
From the above general structure it is seen that a frame may have any number
of slots, and a slot may have any number of facts, each with number of values. This
provides a general framework from which to build a variety of knowledge structures.
The slots in a frame may hold either general or specific knowledge.
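Assuming a representation of our own choosing, the nested (slot (fact value…)) template maps naturally onto nested dictionaries, with a structured slot such as address simply nesting one level deeper:

```python
# The Ravi frame from above, held as nested dictionaries: each slot
# maps fact names (here just "value") to their fillers.
ravi = {
    "profession": {"value": "professor"},
    "age":        {"value": 42},
    "wife":       {"value": "Rani"},
    "children":   {"value": ["A", "B"]},
    "address": {                             # a structured slot
        "street": {"value": "Nagarampalem"},
        "city":   {"value": "Guntur"},
        "state":  {"value": "AP"},
        "pin":    {"value": "522 004"},
    },
}

# Retrieving a nested filler is just a chain of slot lookups.
city = ravi["address"]["city"]["value"]
```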
Now we will discuss:
• Frame knowledge Representation
• Interpreting frames
5.3.1 Frame Knowledge Representation
Consider the example first discussed in Semantic Nets:
Person
    isa: Mammal
    Cardinality:

Adult-Male
    isa: Person
    Cardinality:

Cricket-Player
    isa: Adult-Male
    Cardinality:
    Height:
    Weight:
    Position:
    Team:
    Team-Colours:

Back
    isa: Cricket-Player
    Cardinality:
    Tries:

Dhoni
    instance: Back
    Height: 6-0
    Position: Centre
    Team: India
    Team-Colours: Black

Cricket-Team
    isa: Team
    Cardinality:
    Team-size: 10
    Coach:

Indian Team
    instance: Cricket-Team
    Team-size: 10
    Coach: Amarnadh
    Players: {Sachin, Yuvaraj Singh, Srinath, Harbajan…}

Fig. 5.9 A Simple Frame System
Here, the frames Person, Adult-Male, Cricket-Player and Cricket-Team are all classes, and the frames Dhoni and Indian Team are instances.
Note:
• The isa relation is in fact the subset relation.
• The instance relation is in fact element of.
• The isa attribute possesses a transitivity property. This implies: Dhoni is a Back, and a Back is a Cricket-Player, who in turn is an Adult-Male and also a Person.
• Both isa and instance have inverses, called subclasses and all-instances respectively.
• There are attributes that are associated with the class or set such as cardinality
and on the other hand there are attributes that are possessed by each member
of the class or set.
5.3.2 Distinction between Sets and Instances
It is important that this distinction is clearly understood.
India can be thought of as a set of players or as an instance of an Indian-Team.
If India were a class then
• its instances would be players
• it could not be a subclass of Indian-Team; otherwise, its elements (players) would be members of Indian-Team, which we do not want.
Instead we make it a subclass of Cricket-Player; this allows the players to inherit the correct properties while letting the Indian Team inherit information about teams. This means that the Indian Team is an instance of Cricket-Team.
But there is a problem here:
• A class is a set and its elements have properties.
• We wish to use inheritance to bestow values on its members.
• But there are properties that the set or class itself has such as the manager
of a team.
This is why we need to view India both as a subset of the class of players and as an instance of the class of teams. We seem to have a Catch-22. Solution: metaclasses.
A metaclass is a special class whose elements are themselves classes.
Now consider our cricket teams as:
Class
    instance: Class
    isa: Class
    Cardinality: …

Team
    instance: Class
    isa: Class
    Cardinality: {the number of teams}
    Team-size: 10

Indian Team
    instance: Class
    isa: Team
    Cardinality: {the number of Indian teams}
    Team-size: 10
    Coach: Amarnath

India
    instance: Cricket-Team
    Team-size: 10
    Coach: Amarnath

Amarnath
    instance: Back
    Height: 6-0
    Position: Slip
    Team: India
    Team-colours: Blue

Fig. 5.10 A Metaclass Frame System
The basic metaclass is Class, and this allows us to
• define classes which are instances of other classes, and (thus)
• inherit properties from this class.
Inheritance of default values occurs when one element or class is an instance
of a class.
Slots as Objects
How can we represent the following properties in frames?
• Attributes such as weight, age to be attached and make sense
• Constraints on values such as age being less than a hundred
• Default values
• Rules for inheritance of values such as children inheriting parent’s names
• Rules for computing values
• Many values for a slot
A slot is a relation that maps from its domain of classes to its range of values. A relation is a set of ordered pairs, so one relation may be a subset of another. Since a slot is a set, the set of all slots can be represented as a metaclass, called SLOT, say.
Consider the following:
SLOT
    isa: Class
    instance: Class
    domain:
    range:
    range-constraint:
    definition:
    default:
    to-compute:
    single-valued:

Coach
    instance: SLOT
    domain: Cricket-Team
    range: Person
    range-constraint: λx (experience x.manager)
    default:
    single-valued: TRUE

Colour
    instance: SLOT
    domain: Physical-Object
    range: Colour-Set
    single-valued: FALSE

Team-Colours
    instance: SLOT
    isa: Colour
    domain: team-player
    range: Colour-Set
    range-constraint: not Pink
    single-valued: FALSE

Position
    instance: SLOT
    domain: Cricket-Player
    range: {Back, Forward, Reserve}
    to-compute: λx x.position
    single-valued: TRUE
Note the following:
• Instances of SLOT are slots.
• Associated with SLOT are attributes that each instance will inherit.
• Each slot has a domain and range.
• Range is split into two parts: one, the class of the elements and the other is a
constraint which is a logical expression; if absent it is taken to be true.
• If there is a value for default then it must be passed on unless an instance has
its own value.
• The to-compute attribute involves a procedure to compute its value. E.g., in
Position where we use the dot notation to assign values to the slot of a frame.
• A transfers-through list names other slots from which values can be derived via inheritance.
5.3.3 Interpreting Frames
A frame system interpreter must be capable of the following in order to exploit the
frame slot representation:
• Consistency checking—when a slot value is added, check that the frame lies in the slot’s domain and that the value is legal according to the range and range constraints.
• Propagation of definition values must be along isa and instance links.
• Inheritance of default values along isa and instance links.
• Computation of value of slot as needed.
• Checking that only the correct number of values is computed.
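A rough sketch of the first of these checks, with an illustrative slot table based on the Team-Colours slot defined earlier (the names and the ‘not Pink’ constraint follow that example):

```python
# Consistency check when filling a slot: the value must lie in the
# slot's range and satisfy the range-constraint, a predicate that
# defaults to "always true" when absent.
slot_defs = {
    "team_colours": {
        "range": {"black", "blue", "white", "green", "pink"},
        "range_constraint": lambda v: v != "pink",   # not Pink
        "single_valued": False,
    },
}

def legal_fill(slot, value):
    sd = slot_defs[slot]
    constraint = sd.get("range_constraint", lambda v: True)
    return value in sd["range"] and constraint(value)
```

A fuller interpreter would layer the other bullets on top of this: propagating definitional values, inheriting defaults, and running to-compute procedures when a slot is empty.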
5.4 SUMMARY
In this unit, you have learned the various concepts about weak slot and filler
structures. There are two attributes that are of very general significance, the use of
which you have already learned in the previous units. These attributes are important
because they support property inheritance. This unit also discussed in detail about
semantic nets and frames. The idea of a frame system as a way to represent declarative
knowledge has been encapsulated in a series of frame oriented representation
languages, whose features have evolved and been driven by an increased
understanding of the representation issues involved.
• Semantic networks start out by encoding relationships like isa and ako but
can represent quite complicated information on this simple basis.
• Frames organize knowledge around concepts considered to be of interest
(like person and patient in the example code).
• A frame can be a generic frame (or template) or an instance frame.
• Frames also allow procedural attachment—that is, demons can be attached to slots so that the mere fact of creating a frame or accessing a slot can cause significant computation to be performed.
5.5
KEY TERMS
• Monotonic inheritance: This can be performed substantially more efficiently
with such structures than with pure logic, and non-monotonic inheritance is
easily supported.
• Semantic net: It is a formal graphic language representing facts about entities
in some world about which we wish to reason.
• Frames: They can also be regarded as an extension to Semantic nets. Indeed
it is not clear where the distinction between a semantic net and a frame ends.
5.6
ANSWERS TO ‘CHECK YOUR PROGRESS’
1. False
2. False
3. True
5.7
QUESTIONS AND EXERCISES
1. Discuss in detail with proper illustrations semantic nets and scripts. What are
their advantages and limitations?
2. Express the following statements using semantic nets:
(i) Every student has been hit by every teacher (at least once).
(ii) Every student has been hit by some teacher (or the other).
(iii) Some students have been hit by some teachers.
3. Give semantic nets to describe the following:
Narayana is a writer.
Narayana lives in Hyderabad.
Jagan is a teacher.
Jagan lives in Mumbai.
Narayana sent a copy of his book to Jagan.
Jagan sent his thanks to Narayana.
5.8
FURTHER READING
1. www.cee.hw.ac.uk
2. www.cs.cardiff.ac.uk
3. www.cse.unsw.edu.ac
4. www.mm.iit.uni-miskolc.hu
5. www.cindy.cis.netu.edu.tw
6. www.uet.net.pk
7. Marinov, M. 2008. ‘Using Frames for Knowledge Representation in A CORBA
– Based Distributed Environment’. Knowledge Based Systems. July.
8. Cheng, H.C. 2006. ‘A Web-Based Distributed Problem Solving Environment for Engineering Application’. Advances in Engineering Software, February.
UNIT 6 STRONG SLOT AND FILLER
STRUCTURES
Strong Slot and Filler
Structures
Structure
6.0 Introduction
6.1 Unit Objectives
6.2 Conceptual Dependency
6.3 Scripts
6.4 CYC
    6.4.1 Motivation
6.5 CYCL
6.6 Global Ontology
6.7 Summary
6.8 Key Terms
6.9 Answers to ‘Check Your Progress’
6.10 Questions and Exercises
6.11 Further Reading

6.0 INTRODUCTION
In the previous unit, you learnt about slot and filler structures in very general outlines.
In this unit, you will learn about strong slot and filler structures. Individual semantic
networks and frame systems may have specialized links and inference procedures,
but there are no hard and fast rules about what kinds of objects and links are good
in general for knowledge representation. Such decisions are left up to the builder of
the semantic network or frame system.
Strong slot and filler structures typically:
• Represent links between objects according to more rigid rules
• Specific notions of what types of object and relations between them are
provided
• Represent knowledge about common situations
We have two types of strong slot and filler structures:
1. Conceptual Dependency (CD)
2. Scripts
For a specific need, we can place the restrictions on the structure, that is, we can
have a meaningful structure with specific structural elements and primitives. This
will enable us to derive specific methods for using the structure.
• Conceptual dependencies – for dealing with natural languages
Low-level representation capturing textual information
Primitives:
Actions – transfer, movement, mental, sensory
Objects – modifiers of actions, modifiers of objects
Tenses, directionality
Note that many sentences can map to the same CD because they convey the same information, just expressed differently:
– I gave the man a book.
– I gave the book to a man.
– That was the man that I gave the book to.
If a sentence is ambiguous, the Conceptual Dependencies must also contain
the ambiguity, or we could create multiple Conceptual Dependencies –
one for each interpretation.
CDs can be very lengthy for even a very simple idea.
• Scripts – for planning or reasoning over ‘stereotypical events’.
Captures the typical sequence of actions along with the actors and props
of a typical event.
Includes: Preconditions, Post conditions, roles, props, settings and
variations.
Used primarily for answering the questions and summarizing the input
text.
What about a typical action?
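The classic restaurant script from the literature can be sketched as such a structure; the field names below are our own, not a fixed format:

```python
# A script as a slot-and-filler structure: props, roles, entry
# conditions, results and an ordered sequence of scenes.
restaurant_script = {
    "name": "restaurant",
    "props": ["tables", "menu", "food", "money"],
    "roles": ["customer", "waiter", "cook", "cashier"],
    "entry_conditions": ["customer is hungry", "customer has money"],
    "results": ["customer is not hungry", "customer has less money"],
    "scenes": ["entering", "ordering", "eating", "paying", "leaving"],
}

def answer_order(script):
    """Question answering from the scene order: did paying precede leaving?"""
    s = script["scenes"]
    return s.index("paying") < s.index("leaving")
```

Because the scene sequence is explicit, a system can answer questions about events the input text never mentioned (e.g. that the customer paid) simply by consulting the script.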
• CYC – an experiment dealing with common sense knowledge
Most systems have no common sense reasoning abilities at all – this has been a criticism of AI.
CYC – an experiment to program common sense knowledge.
Uses simple inferencing strategies (e.g. resolution) to derive new common sense facts.
CYC systems consist of perhaps 10 million individual common sense facts.
An effort of 5 years and many people.
Uses a frame-based representation and is capable of generating new generalizations.
• Functional Representation – for reasoning over device functions and behaviour.
How do we represent functionality and device behaviour? How do we
reason about how things work?
We can use a ‘functional’ based representation to perform hypothetical
reasoning.
Elements of a functional representation:
i. Devices are described by a function, structure and behaviour.
ii. Function: the purpose of component X.
iii. Structure: how the components are physically connected.
iv. Behaviour: the sequence of states that X undergoes to bring about the function.
Examples of Functional Representation:
• Physical Objects:
Doorbell Ringer System
Chemical Processing Unit
Human Body subsystem (e.g. Gate)
• Abstract Objects:
Computer program
Combat Mission Plan
Diagnostic reasoning process
• Uses of functional representations
Given a functional representation of how something works, we can:
Generate a fault tree for diagnosis
Perform hypothetical reasoning
Produce predictions and simulations
Perform redesign
6.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Understand conceptual dependency
• Describe how to represent knowledge through conceptual dependency
• Discuss the role of scripts
• Understand the concept of CYC
• Explain CYCL
• Discuss the concept of global ontology

6.2 CONCEPTUAL DEPENDENCY (CD)
Conceptual Dependency (CD) was originally developed to represent knowledge acquired from natural language input.
The goals of this theory are:
• To help in the drawing of inference from sentences
• To be independent of the words used in the original input
• That is to say: For any 2 (or more) sentences that are identical in meaning
there should be only one representation of that meaning.
It has been used by many programs that claim to understand English. Conceptual
Dependency is a theory of Roger Schank that combines several ideas.
CD provides:
• a structure into which nodes representing information can be placed
• a specific set of primitives
• a given level of granularity
Sentences are represented as a series of diagrams depicting actions using both abstract
and real physical situations.
• The agent and the objects are represented.
• The actions are built up from a set of primitive acts which can be modified by
tense.
Examples of Primitive Acts are:
ATRANS
Transfer of an abstract relationship. e.g., give.
PTRANS
Transfer of the physical location of an object. e.g., go.
PROPEL
Application of a physical force to an object. e.g., push.
MTRANS
Transfer of mental information. e.g., tell.
MBUILD
Construct new information from old. e.g., decide.
SPEAK
Utter a sound. e.g., say.
ATTEND
Focus a sense on a stimulus. e.g., listen, watch.
MOVE
Movement of a body part by owner. e.g., punch, kick.
GRASP
Actor grasping an object. e.g., clutch.
INGEST
Actor ingesting an object. e.g., eat.
EXPEL
Actor getting rid of an object from the body.
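These primitives lend themselves to a small data structure. The following Python sketch is illustrative only (the class and field names are our own, not Schank's notation): a conceptualization pairs an actor with a primitive act, plus optional O (object) and R (recipient-donor) dependencies and a tense modifier, as in 'John gave Mary a book'.

```python
from dataclasses import dataclass

# The eleven primitive acts listed above.
PRIMITIVE_ACTS = {
    "ATRANS", "PTRANS", "PROPEL", "MTRANS", "MBUILD", "SPEAK",
    "ATTEND", "MOVE", "GRASP", "INGEST", "EXPEL",
}

@dataclass
class Conceptualization:
    """One CD conceptualization: an actor performing a primitive act."""
    actor: str
    act: str                  # must be one of the primitive acts
    obj: str = None           # O (object) dependency
    recipient: str = None     # R dependency: the 'to' part
    donor: str = None         # R dependency: the 'from' part
    tense: str = ""           # Schank's modifiers: p = past, f = future, ...

    def __post_init__(self):
        if self.act not in PRIMITIVE_ACTS:
            raise ValueError(f"unknown primitive act: {self.act}")

# 'John gave Mary a book' -- an ATRANS (transfer of an abstract
# relationship, here possession), marked p for past tense.
gave = Conceptualization(actor="John", act="ATRANS", obj="book",
                         recipient="Mary", donor="John", tense="p")
```

Because every verb of giving, donating, handing over and so on maps onto the same ATRANS structure, two sentences with the same meaning produce equal representations regardless of the words used.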
Conceptual Dependency introduced several interesting ideas:
1. An internal representation that is language free, using primitive ACTs instead.
2. A small number of primitive ACTs rather than thousands.
3. Different ways of saying the same thing can map to the same internal
representation.
There are some problems too:
1. The expansion of some verbs into primitive ACT structures can be complex.
2. Graph matching is NP-hard.
A small set of primitive conceptual categories provides the building blocks from
which the allowable dependencies among the concepts in a sentence are constructed:

ACTs    Actions
PPs     Objects (picture producers)
AAs     Modifiers of actions (action aiders)
PAs     Modifiers of objects, i.e. of PPs (picture aiders)

Action aiders (AAs) are properties or attributes of primitive actions. Picture
producers (PPs) are the actors or physical objects that perform the different acts.
How do we connect these things together?
Consider the example: John gives Mary a book.
• Arrows indicate the direction of dependency. Letters above the arrows indicate
particular relationships:
O    object
R    recipient-donor
I    instrument, e.g. eats with a spoon
D    destination, e.g. going home
• Double arrows (⇔) indicate two-way links between the actor (PP) and action
(ACT).
• The actions are built from the set of primitive acts (see above).
These can be modified by tense etc.
• Of these primitives, ATRANS indicates a transfer of possession (PTRANS a transfer of location, PROPEL an application of force).
• O indicates object relation.
• R indicates recipient relation.
A list of basic conceptual dependencies is shown below. These capture the
fundamental semantic structures of natural language. For example, the first
conceptual dependency below describes the relationship between a subject
and its verb, and the third describes the verb-object relation.
1. PP ⇔ ACT indicates that an actor acts.
2. PP ⇔ PA indicates that an object has a certain attribute.
3. ACT ←O– PP indicates the object of an action.
The use of tense and mood in describing events is extremely important, and Schank
introduced the following modifiers:

p      past
f      future
t      transition
ts     start transition
tf     finished transition
k      continuing
?      interrogative
/      negative
delta  timeless
c      conditional

The absence of any modifier implies the present tense.
4. Indicates the recipient and the donor of an object within an action.
5. Indicates the direction of an object within an action.
6. Indicates the instrumental conceptualization for an action.
7. Indicates that conceptualization X caused conceptualization Y.
8. Indicates a stage change of an Object.
9. PP1 ← PP2 indicates that PP2 is a part of PP1.
10. Describes the relationship between a PP and an attribute that has already been
predicated of it.
11. Represents the relationship between a PP and one state in which it started and
another in which it ended.
12. The two forms of the rule describe the cause of an action and the cause of a
state change.
(a)
(b)
13. Describes the relationship between a conceptualization and the time at which
the event it describes occurred.
These can be combined to represent a simple transitive sentence such as:
'John throws the ball.'

John ⇔ PROPEL ←O– Ball

The past tense of the earlier example, 'John gave Mary a book', is drawn the same
way with the past modifier p over the double arrow. The double arrow (⇔) is a
two-way link between an actor, PP, and an action, ACT, i.e. PP ⇔ ACT. The triple
arrow is a similar two-way link, but between an object, PP, and its attribute, PA,
i.e. PP ⇔ PA. It represents isa-type dependencies, e.g. Rao ⇔ lecturer
(Rao is a lecturer).
Primitive states are used to describe many state descriptions such as height,
health, mental state, physical state. There are many more physical states than
primitive actions. They use a numeric scale.
E.g. John ⇔ height (+10) John is the tallest
John ⇔ height (< average) John is short
Frank ⇔ health (-10) Frank is dead
Dave ⇔ mental_state (-10) Dave is sad
Vase ⇔ physical_state (-10) the vase is broken
You can also specify things like the time of occurrence in the relationship.
For example: John gave Mary the book yesterday

John ⇔(p) ATRANS ←O– Book ←R– [to: Mary, from: John]

(The p marks the past tense; the time 'yesterday' attaches to the whole
conceptualization.)
Now, let us consider a more complex sentence: Since smoking can kill you, I
stopped. Let’s look at how we represent the inference that smoking can kill:
• Use the notion of 'one' (a generic person) to whom the knowledge applies.
• Use the primitive act INGEST: one ingests smoke from a cigarette.
• Killing is a transition from being alive to being dead. We use triple arrows to indicate
a transition from one state to another.
• Add a conditional (c) causality link. The triple arrow indicates the dependency
of one conceptualization on another.
one ⇔ INGEST ←O– smoke ←R– [to: one, from: cigarette]
        |c
one ⇔ health(+10) → health(-10)    (alive → dead)
To add the fact that I stopped smoking:
• Use similar rules to imply that I (the speaker) smoked cigarettes.
• The qualification tfp attached to this dependency indicates that the instance
of INGESTing smoke finished in the past, i.e. has stopped.
one ⇔ INGEST ←O– smoke ←R– [to: one, from: cigarette]
        |c
I ⇔(tfp) INGEST ←O– smoke ←R– [to: I, from: cigarette]
Advantages of CD
• Using these primitives involves fewer inference rules.
• Many inference rules are already represented in the CD structure.
• The holes in the initial structure help to focus on the points still to be
established.
Disadvantages of CD
• Knowledge must be decomposed into fairly low-level primitives.
• It is difficult, perhaps impossible, to find the correct set of primitives.
• A lot of inference may still be required.
• Representations can be complex even for relatively simple actions. Consider:
Dave bet Frank five pounds that Wales would win the Rugby World Cup.
Complex representations require a lot of storage.
Applications of CD
MARGIE
(Meaning Analysis, Response Generation and Inference on English) – a model of
natural language understanding.
SAM
(Script Applier Mechanism) – uses scripts to understand stories (see the next section).
PAM
(Plan Applier Mechanism) – uses plans to understand stories.
Schank et al. developed all of the above.
6.3 SCRIPTS
A script is a structure that prescribes a set of circumstances which could be expected
to follow on from one another.
It is similar to a thought sequence or a chain of situations which could be
anticipated. It could be considered to consist of a number of slots or frames but with
more specialized roles.
A Script can be viewed as a frame for a sequence of actions. Understanding
natural language often requires knowledge of typical sequences.
Example:
Shyam went to a restaurant.
He ordered an exclusive dinner.
He had forgotten his wallet.
He had to wash dishes.
The sequence of sentences mentions only the parts of the story that are different
from what might otherwise be expected. Understanding such a sequence requires
knowledge of a restaurant script that specifies typical sequences of actions involved
in going to a restaurant.
In contrast to the relatively static slots of a frame, a script may have a directed
graph of events composing the script.
A script is a structure that describes a stereotyped sequence of events in a
particular context. A script consists of slots. Each slot contains some information
about kinds of values it may contain as well as a default value to be used if no other
information is available.
Scripts are frame-like structures used to represent commonly occurring
situations such as going to the movies, shopping in a supermarket, eating in a
restaurant, visiting a dentist, reading about baseball or politics. Scripts are used in
natural language understanding systems to organize a knowledge base in terms of
the situations that the system is to understand.
When people enter a restaurant they know what to expect and how to act.
They are met at the entrance by restaurant staff or by a sign indicating that they
should continue in and be directed to a seat. Either a menu is available at the table
or presented by the waiter or the customer asks for it. These are the routines for
ordering food, eating, paying and leaving.
In fact the restaurant script is quite different from other eating scripts, such as
the 'fast food' model or the 'formal family meal'. In the fast-food model the customer
enters, gets in line to order, pays for the meal (before eating), waits for a
tray with the food, takes the tray, tries to find a clean table, and so on. These are
two different stereotyped sequences of events, and each has a potential script.
Scripts are beneficial because:
• Events tend to occur in known runs or patterns.
• Causal relationships between events exist.
• Entry conditions exist which allow an event to take place.
• Prerequisites exist for events taking place, e.g. when a student progresses
through a degree scheme or when a purchaser buys a house.
The components of a script include:
Entry Conditions
Entry conditions or descriptors of the world that must be true for the script to be
called. From the above script, these include an open restaurant and a happy customer
that has some money, or conditions that must, in general, be satisfied before the
events described in the script can occur.
Results
Those results or facts that are true once the script has terminated. Example, the
customer is full and poorer, the restaurant owner has more money or conditions that
will, in general, be true after the events described in the script have occurred.
Props
Props or the ‘things’ that support the content of the script. These might include
table, waiters and menus. The set of props supports reasonable default assumptions
about the situation; a restaurant is assumed to have tables and chairs, unless stated
otherwise.
OR
These are the slots representing the objects that are involved in the events
described in the scripts. The presence of these objects can be inferred even if they
are not mentioned explicitly.
Roles
The actions that the individual participants perform. The waiter takes the
order, delivers the food and presents the bill. The customers order, eat and pay.
OR
These are the slots representing the people, who are involved in the events
described in the script. If specific individuals are mentioned, they can be inserted
into the appropriate slots.
Track
The specific variation on a more general pattern that is represented by this
particular script.
Scenes
The script is broken into a sequence of scenes, each of which presents a temporal aspect
of the script: the sequence of events that occur. Events are represented in conceptual
dependency form. In the restaurant script there is entering, ordering, eating, etc.
Example:
Restaurant Script Structure
SCRIPT : Restaurant
TRACK : Oberoi
PROPS : Tables, Chairs, Menu, Money, Food
ROLES : Customer, Waiter, Cashier, Owner, Cook
Entry Conditions
(a) Customer is hungry
(b) Customer has money
Scene 1:
Entering
(a) Customer PTRANS into Restaurant.
(b) Customer ATTEND eyes to the tables.
(c) Customer MBUILD where to sit.
(d) Waiter PTRANS customer to an empty table.
(e) Customer MOVES to sitting position.
Scene 2:
Ordering
(a) Waiter PTRANS the menu.
(b) Customer MBUILD choice of food.
(c) Customer MTRANS signal to waiter.
(d) Waiter PTRANS to table.
(e) Customer MTRANS I want food to waiter.
Scene 3:
Eating
(a) Cook ATRANS food to waiter.
(b) Waiter ATRANS food to customer.
(c) Customer INGESTS food.
(Option: Return to scene 2 to order more: otherwise, go to scene 4)
Scene 4:
Exiting
(a) Customer MTRANS to waiter.
(b) Waiter PTRANS to customer.
(c) Waiter ATRANS bill to customer.
(d) Customer ATRANS tip to waiter.
(e) Customer PTRANS to cashier.
(f) Customer ATRANS money to cashier.
(g) Customer PTRANS out of restaurant.
Results
(a) Customer is no more hungry.
(b) Customer is satisfied.
(c) Owner gets money.
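A structure like the restaurant script above can be captured directly in code. The following Python sketch is illustrative (the field names and the implied_events helper are our own invention, not part of any standard library): it records a script's components and shows the key inference a script licenses, namely filling in events a story leaves unstated.

```python
from dataclasses import dataclass

@dataclass
class Script:
    """A stereotyped sequence of events in a particular context."""
    name: str
    track: str
    props: list
    roles: list
    entry_conditions: list
    scenes: dict        # scene name -> ordered list of CD-style events
    results: list

restaurant = Script(
    name="Restaurant",
    track="Oberoi",
    props=["tables", "chairs", "menu", "money", "food"],
    roles=["customer", "waiter", "cashier", "owner", "cook"],
    entry_conditions=["customer is hungry", "customer has money"],
    scenes={
        "entering": ["customer PTRANS into restaurant",
                     "customer MBUILD where to sit"],
        "ordering": ["waiter PTRANS menu to customer",
                     "customer MTRANS 'I want food' to waiter"],
        "eating":   ["waiter ATRANS food to customer",
                     "customer INGEST food"],
        "exiting":  ["customer ATRANS money to cashier",
                     "customer PTRANS out of restaurant"],
    },
    results=["customer is not hungry", "owner has more money"],
)

def implied_events(script, mentioned):
    """Events the script supplies even though the story never states them."""
    return [event for scene in script.scenes.values()
            for event in scene if event not in mentioned]

# A story that mentions only ordering still implies the customer ate.
story = ["customer MTRANS 'I want food' to waiter"]
print("customer INGEST food" in implied_events(restaurant, story))  # True
```

This is exactly the inference used for the Shyam story above: the eating and paying events are assumed unless the story contradicts them.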
Example 2:
A Supermarket script structure:
SCRIPT : Food market
TRACK : Super market
PROPS : Shopping cart, market items, check-out stands, money
ROLES : Shopper, dairy attendant, seafood attendant, check-out
clerk, sacking clerk, cashier, other shoppers
Entry Condition:
(a) Shopper needs groceries.
(b) Food markets open.
Scene 1:
(a) Shopper enters into market.
(b) Shopper PTRANS from one location to another into market.
(c) Shopper PTRANS shopping_cart to shopper.
Scene 2:
(a) Shopper shops for items.
(b) Shopper moves (MOVES).
(c) Shopper focuses eyes on display items (ATTEND).
(d) Shopper transfers items to shopping_cart (PTRANS).
Scene 3:
(a) Check-out.
(b) Shopper MOVES shopper to check-out stand.
(c) Shopper WAITS for Shopper turn.
(d) Shopper ATTENDS eyes to charge.
(e) Shopper ATRANS money to Cashier.
(f) Sales Boy ATRANS bags to Shopper.
Scene 4:
(a) Exit market
(b) Shopper PTRANS shopper to exit market.
Results
(a) Shopper has less money.
(b) Shopper has grocery items.
(c) Market has less grocery items.
(d) Market has more money.
Scripts are useful in describing certain situations such as robbing a bank. This might
involve:
• Getting a gun
• Hold up a bank
• Escape with the money
Here the Props might be
• Gun, G
• Loot, L
• Bag, B
• Getaway car, C
The Roles might be:
• Robber, R
• Cashier, M
• Bank Manager, O
• Policeman, P
The Entry Conditions might be:
• R is poor.
• R is destitute.
The Results might be:
• R has more money.
• O is angry.
• M is in a state of shock.
• P is shot.
There are 3 scenes: obtaining the gun, robbing the bank and the getaway.
Script: Robbery
Track: Successful Snatch
Props:
G = Gun
L = Loot
B = Bag
C = Get away Car
Roles:
R = Robber
M = Cashier
O = Bank Manager
P = Police Man
Entry Conditions:
R is poor
R is destitute
Scene 1:
Getting a gun
R PTRANS R into Gun Shop.
R MBUILD R choice of G.
R MTRANS choice.
R ATRANS buys G.
Scene 2:
Holding up the bank
R PTRANS R into bank.
R ATTEND eyes M, O and P.
R MOVE R to M positions.
R GRASP G
R MOVE G to point M.
R MTRANS ‘Give me the money or else’ to M.
P MTRANS ‘Hold it Hands up’ to R.
R PROPEL Shoots G.
P INGEST bullet from G.
M ATRANS L to R.
R ATRANS L into bag B.
R PTRANS exit.
O ATRANS raises the alarm.
Scene 3:
The getaway
R PTRANS R to C.
Results
R has more money.
O is angry.
M is in a state of shock.
P is shot.
Some additional points to note on Scripts:
• If a particular script is to be applied it must be activated and the activating
depends on its significance.
• If a topic is mentioned in passing, then a pointer to that script could be held.
• If the topic is important then the script should be opened.
• The danger lies in having too many active scripts, much as one might have
too many windows open on the screen or too many recursive calls in a program.
• Provided events follow a known trail we can use scripts to represent the
actions involved and use them to answer detailed questions.
• Different trails may be allowed for different outcomes of Scripts (e.g. The
bank robbery goes wrong).
Advantages of Scripts:
• Ability to predict events.
• A single coherent interpretation may be built up from a collection of
observations.
Disadvantages:
• Less general than frames
• May not be suitable to represent all kinds of knowledge
6.4 CYC
What is CYC?
• An ambitious attempt to form a very large knowledge base aimed at capturing
commonsense reasoning.
• Initial goals to capture knowledge from a hundred randomly selected articles
in the Encyclopedia Britannica.
• Both implicit and explicit knowledge are encoded.
• Emphasis on the study of underlying information (assumed by the authors but
not explicitly stated for the readers).
Example:
Suppose we read that Sam learned of Napoleon’s death.
Then we (humans) can conclude that Napoleon never knew that Sam had died.
How do we do this?
We require special implicit knowledge or commonsense such as:
• We only die once.
• You stay dead.
• You cannot learn of anything when dead.
• Time cannot go backwards.
6.4.1 Motivations
Why should we build large knowledge bases at all? There are many reasons:
Brittleness
Specialized knowledge bases are brittle: it is hard to encode new situations, and
performance degrades non-gracefully. Commonsense-based knowledge bases should
have a firmer foundation.
Form and Content
Knowledge representation may not be suitable for AI. Commonsense strategies
could point out where difficulties in content may affect the form.
Shared Knowledge
Should allow greater communication among systems with common bases and
assumptions. Building an immense knowledge base is an overwhelming task;
however, we should ask whether there are any methods for acquiring this knowledge
automatically. Here there are two possibilities:
1. Machine Learning: Current techniques permit only modest extensions of a
program’s knowledge when compared with older techniques. In order for a
system to learn a great deal, it must already know a great deal. In particular
systems with a lot of knowledge will be able to employ powerful analogical
reasoning.
2. Natural Language Understanding: Humans can extend their knowledge
by reading books and talking with other humans. We now have online versions
of dictionaries and encyclopaedias, so why don't we feed these into an AI
program and have it assimilate all the information automatically? Although
there are many methods for building language understanding systems, these
methods are themselves very knowledge intensive.
The approach taken by CYC is to hand-code the 10 million or so facts that make up
commonsense knowledge. It may then be possible to develop more automatic methods.
CYC's knowledge is coded:
• By hand
• In a special CYC language that is:
1. LISP-like
2. Frame based
3. Supports multiple inheritance
4. Slots are fully fledged objects
5. Generalized inheritance along any link, not just isa and instance
There are multiple ways of representing knowledge. They are:
1. Simple relational knowledge
2. Inheritable knowledge
3. Inferential knowledge
4. Procedural knowledge
1. Simple relational knowledge: The simplest way to represent declarative facts
is as a set of relations of the same sort. This sort of representation provides very
weak inferential capabilities. The following table shows an example of simple
relational knowledge.
Player    Height in feet    Weight in pounds    Bats–Throws
ABC       6–0               180                 Left–Right
XYZ       6–1               170                 Right–Right
PQR       6–2               190                 Right–Left
EFG       5–10              215                 Left–Left
The advantage of this representation is that it can serve as input to powerful
inference engines. But the system cannot, by itself, answer a question such as
'Who is the heaviest player?'; it needs some procedure for finding the heaviest
player.
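The limitation can be made concrete in a few lines of Python (the tuples mirror the table above; the heaviest procedure is the extra machinery the bare relation lacks):

```python
# Simple relational knowledge: a flat set of tuples of the same sort.
players = [
    # (name, height_in_feet, weight_in_pounds, bats, throws)
    ("ABC", "6-0",  180, "Left",  "Right"),
    ("XYZ", "6-1",  170, "Right", "Right"),
    ("PQR", "6-2",  190, "Right", "Left"),
    ("EFG", "5-10", 215, "Left",  "Left"),
]

# The relation alone cannot answer 'Who is the heaviest player?';
# a separate procedure must scan the weight attribute.
def heaviest(rows):
    return max(rows, key=lambda row: row[2])[0]

print(heaviest(players))  # EFG
```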
2. Inheritable knowledge: The structure of knowledge representation must
correspond to the inference mechanism. The most useful form of inference is
property inheritance, in which elements of specific classes inherit attributes
and values from more general classes. Objects must be organized into classes
and classes must be arranged in a generalization hierarchy. Here the two
attributes, isa and instance, provide the basis for property inheritance; isa is
used for class inclusion and instance is used for class membership. Property
inheritance provides an inference technique for the retrieval of facts that
have been explicitly stored and of facts that can be derived from those that
are explicitly stored. Thus, network representations provide a means of
structuring and exhibiting the structure in knowledge. This gives us a pictorial
representation of objects, their attributes and the relationships that exist
between them and other entities.
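Property inheritance along isa and instance links can be sketched as a simple climb up the hierarchy. The frames and slot values below are invented for illustration:

```python
# A tiny generalization hierarchy: instance links an object to its class,
# isa links a class to a more general class.
frames = {
    "Person":     {"legs": 2},
    "Adult-Male": {"isa": "Person"},
    "Lecturer":   {"isa": "Adult-Male", "salary": "average"},
    "Rao":        {"instance": "Lecturer", "height": 72},
}

def get_value(frame, slot):
    """Look for the slot locally, then climb instance/isa links."""
    while frame is not None:
        slots = frames[frame]
        if slot in slots:
            return slots[slot]
        frame = slots.get("instance") or slots.get("isa")
    return None

print(get_value("Rao", "height"))  # 72, stored explicitly on Rao
print(get_value("Rao", "legs"))    # 2, inherited from Person
```

The lookup retrieves both explicitly stored facts (Rao's height) and derived ones (Rao's legs, inherited through Lecturer and Adult-Male from Person).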
3. Inferential knowledge: We may need traditional logic, such as first-order
predicate logic, to describe the inferences that are needed. We also need an
inference procedure to exploit the represented knowledge and draw
inferences. Many such procedures exist, reasoning in either the
backward or the forward direction. An example is resolution, in which
the proof is obtained by refutation.
4. Procedural knowledge: This knowledge specifies what to do and when to
do it. Procedural knowledge can be represented in programs.
6.5 CYCL
CYC’s knowledge is encoded in a representation language called CYCL. CYCL is
a frame-based system that incorporates most of the techniques like multiple
inheritance, slots as full-fledged objects, transfers-through, mutually-disjoint-with,
etc. CYCL generalizes the notion of inheritance so that properties can be inherited
along any link, not just isa and instance. For example:
1. All birds have two legs.
2. All of Mary’s friends speak Spanish.
We can encode the first fact using standard inheritance: any frame with Bird on its
instance slot inherits the value 2 on its legs slot. The second fact can be encoded in
a similar fashion if we allow inheritance to proceed along the friend relation.
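Both facts can be handled by one lookup routine that is told which link to follow. The frames below are illustrative; this is a sketch of the idea, not actual CYCL syntax:

```python
# Generalized inheritance: a value may be inherited along any
# designated link, not just isa or instance.
network = {
    "Bird":   {"legs": 2},
    "Tweety": {"instance": "Bird"},
    "Mary":   {"speaks": "Spanish"},
    "Sue":    {"friend": "Mary"},
}

def inherit(frame, slot, link):
    """Look up slot, following the given link whenever it is missing."""
    while frame is not None:
        if slot in network[frame]:
            return network[frame][slot]
        frame = network[frame].get(link)
    return None

print(inherit("Tweety", "legs", link="instance"))  # 2
print(inherit("Sue", "speaks", link="friend"))     # Spanish
```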
Check Your Progress
1. Who developed the theory of conceptual dependency?
2. A script is similar to a thought sequence or a chain of situations that could be anticipated. (True or False)
3. CYCL generalizes the notion of inheritance so that properties cannot be inherited along any link, not just isa and instance. (True or False)
In addition to frames, CYCL contains a constraint language that allows the
expression of arbitrary first-order logical expressions. The time at which the default
reasoning is actually performed is determined by the direction of the
slotValueSubsumes rules. If the direction is backward, the rule is an if-needed rule,
and it is invoked whenever someone inquires. If the direction is forward, the rule is
an if-added rule and additions are automatically propagated. While forward rules
can be very useful, they can also require substantial time and space to propagate
their values. If a rule is entered as backward then the system defers reasoning until
the information is specifically requested. CYC maintains a separate background
process for accomplishing forward propagations. A Knowledge engineer can continue
entering knowledge while its effects are propagated during idle keyboard time.
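The contrast between forward (if-added) and backward (if-needed) rules can be sketched with a toy knowledge base. This is our own simplification, not CYC's actual machinery, and the slotValueSubsumes mechanism itself is not modelled:

```python
class KB:
    """Toy KB: forward rules fire on add, backward rules fire on ask."""
    def __init__(self):
        self.facts = {}
        self.if_added = []    # forward rules: propagate immediately
        self.if_needed = []   # backward rules: defer until queried

    def add(self, key, value):
        self.facts[key] = value
        for rule in self.if_added:
            rule(self, key, value)

    def ask(self, key):
        if key not in self.facts:
            for rule in self.if_needed:
                rule(self, key)
        return self.facts.get(key)

def birds_have_legs(kb, key, value):
    # if-added: as soon as '<x> isa Bird' is stored, store <x>'s leg count.
    if key[1] == "isa" and value == "Bird":
        kb.facts[(key[0], "legs")] = 2

def default_diet(kb, key):
    # if-needed: compute a default value only when someone inquires.
    if key[1] == "diet":
        kb.facts[key] = "unknown"

kb = KB()
kb.if_added.append(birds_have_legs)
kb.if_needed.append(default_diet)

kb.add(("tweety", "isa"), "Bird")
print(kb.facts[("tweety", "legs")])   # 2, propagated at add time
print(kb.ask(("tweety", "diet")))     # 'unknown', computed on demand
```

The trade-off in the text is visible here: the forward rule spends work at entry time so queries are instant, while the backward rule costs nothing until a query arrives.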
Frame based inference is very efficient, while general logical reasoning is
computationally hard. CYC actually supports about twenty types of efficient inference
mechanism, each with its own truth maintenance facility. The constraint language
allows for the expression of facts that are too complex for any of these mechanisms
to handle.
The constraint language also provides an elegant, abstract layer of
representation. In reality, CYC maintains two levels of representation: the
epistemological level (EL) and the heuristic level (HL). The EL contains facts stated
in the logical constraint language, while the HL contains the same facts stored
using efficient inference templates. There is a translation program for
automatically converting an EL statement into an efficient HL
representation. The EL provides a clean, simple functional interface to CYC so that
users and computer programs can easily insert and retrieve information from the
knowledge base. The EL/HL distinction represents one way of combining the
formal neatness of logic with the computational efficiency of frames.
In addition to frames, inference mechanisms and the constraint language,
CYCL performs consistency checking and conflict resolution.
6.6 GLOBAL ONTOLOGY
Ontology is the philosophical study of what exists. All knowledge based systems
refer to entities in the world, but in order to capture the breadth of human knowledge,
we need a well-designed global ontology that specifies at a very high level what
kinds of things exist and what their general properties are.
The highest-level concept in CYC is called Thing. Everything is an instance
of Thing. Below this top-level concept, CYC makes several distinctions, including:
1. Individual Object versus Collection
2. Intangible, Tangible and Composite
3. Substance
4. Intrinsic versus Extrinsic properties
5. Event and Process
6. Slots
7. Time
8. Agent
6.7 SUMMARY
In this unit, you have learned about the various concepts related to strong slot and
filler structures. You have studied CYC, which is a multi-user system that
provides each knowledge enterer with a textual and graphical interface to the
knowledge base. Users' modifications to the knowledge base are transmitted to a
central server, where they are checked and then propagated to other users. We do not
yet have much experience with the engineering problems of building and maintaining
very large knowledge bases; in future, we should have many tools that check
consistency in the knowledge base. In this unit, you have also learned in detail
how conceptual dependency works in various kinds of scenarios, and
(with several examples) how to write scripts and scenes for various situations.
6.8 KEY TERMS
• Conceptual Dependency: It was originally developed to represent knowledge
acquired from natural language input. The goals of this theory are to help in
the drawing of inference from sentences and to be independent of the words
used in the original input.
• Script: A script is a structure that prescribes a set of circumstances that could
be expected to follow on from one another.
• CYC: It is an ambitious attempt to form a very large knowledge base aimed
at capturing commonsense reasoning.
6.9 ANSWERS TO 'CHECK YOUR PROGRESS'
1. Roger Schank
2. True
3. False
6.10 QUESTIONS AND EXERCISES
Short-Answer Questions
1. Suggest a semantic net to describe the main organs of the human body.
2. Convert the following statements to Conceptual dependencies.
I gave a pen to my friend.
Ram ate ice cream.
I borrowed a book from your friend.
While going home, I saw a car.
3. Construct a script for going to a movie from the viewpoint of the movie goer.
4. Would conceptual dependency be a good way to represent the contents of a
typical issue of National Geographic?
5. Construct CD representation of the following:
(i) John begged Mary for a pencil.
(ii) Jim stirred his coffee with a spoon.
(iii) Dave took the book from Jim.
(iv) On my way home, I stopped to fill my car with petrol.
(v) I heard strange music in the woods.
(vi) Drinking beer makes you drunk.
(vii) John killed Mary by strangling her.
Long-Answer Questions
1. Try capturing the differences between the following in CD:
John slapped Dave, John punched Dave.
Sue likes Prince, Sue adores Prince.
2. Rewrite the script given in the lecture so that the bank robbery goes wrong.
3. Write a script to allow for both outcome of the bank robbery: Getaway and
going wrong and getting caught.
4. Write a script for enrolling as a student.
5. Find out about how MARGIE, SAM and PAM are implemented. In particular
pay attention to their reasoning and inference mechanisms with the knowledge.
6. Find out how the CYCL language represents knowledge.
7. What are the two levels of representation in the constraints of CYC?
8. Find out the relevance of Meta-Knowledge in CYC and how it controls the
interpretations of knowledge.
6.11 FURTHER READING
1. www.egeria.cs.cf.ac.uk
2. www.cs.cf.ac.uk
3. www.cato.cl-ki.uni-osnabrueck.de
4. www.tip.psychology.org
5. www.csis.hku.hk
6. www.eil.utoronto.ca
7. www.cs.cardiff.ac.uk
8. www.rhyshaden.com
9. Levenston, E. A. 1974. ‘Teaching Indirect Object Structures in English – A
Case Study in Applied linguistics’. ELT Journal.
UNIT 7 NATURAL LANGUAGE PROCESSING
Structure
7.0 Introduction
7.1 Unit Objectives
7.2 The Problem of Natural Language
    7.2.1 Steps in the Process
    7.2.2 Why is Language Processing Difficult?
7.3 Syntactic Processing
7.4 Augmented Transition Networks
7.5 Semantic Analysis
7.6 Discourse and Pragmatic Processing
    7.6.1 Conversational Postulates
7.7 General Comments
7.8 Summary
7.9 Key Terms
7.10 Answers to 'Check Your Progress'
7.11 Questions and Exercises
7.12 Further Reading

7.0 INTRODUCTION
In the preceding unit, you learnt about the various concepts related to strong slot
and filler structures. In this unit, you will learn about the fundamental techniques of
natural language processing and evaluate some current and potential applications.
This unit will also help you to develop an understanding of the limits of these
techniques and of current research issues.
7.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Describe the problem of natural language
• Understand the concept of syntactic processing
• Discuss augmented transition networks
• Explain discourse and pragmatic processing
7.2 THE PROBLEM OF NATURAL LANGUAGE
Natural language systems are developed both to explore general linguistic theories
and to produce natural language interfaces or front ends to application systems. In
the discussion here we’ll generally assume that there is some application system
that the user is interacting with, and that it is the job of the understanding system to
interpret the user’s utterances and ‘translate’ them into a suitable form for the
application. (For example, we might translate into a database query). We’ll also
assume for now that the natural language in question is English, though of course in
general, it might be any other language.
In general, the user may communicate with the system by speaking or by
typing. Understanding spoken language is much harder than understanding typed
language—our input is just the raw speech signals (normally a plot of how much
energy is coming in at different frequencies at different points in time). Before we
can get to work on what the speech means we must work out from the frequency
spectrogram what words are being spoken. This is very difficult to do in general. To
begin with, different speakers have different voices and accents and individual
speakers may articulate differently on different occasions, so there’s no simple
mapping from speech waveform to word. There may also be background noise, so
we have to separate the signals resulting from the wind whistling in the trees from
the signals resulting from Fred saying ‘Hello’.
Even if the speech is very clear, it may be hard to work out what words are
spoken. There may be many different ways of splitting up a sentence into words.
Generally, in fluent speech, there are virtually no pauses between words, and the
understanding system must guess where word breaks are. As an example of this,
consider the sentence ‘how to recognize speech’. If spoken quickly, this might be
misheard as ‘how to wreck a nice beach’. And even if we get the word breaks
right, we may still not know what words are spoken—some words sound similar
(e.g. bear and bare) and it may be impossible to tell which was meant without
thinking about the meaning of the sentence.
Because of all these problems, a speech understanding system may come up
with a number of different alternative sequences of words, perhaps ranked according
to their likelihood. Any ambiguities as to what the words in the sentence are (e.g.
bear/bare) will be resolved when the system starts trying to work out the meaning
of the sentence.
Whether we start with speech signals or typed input, at some stage we’ll have
a list (or lists) of words and will have to figure out what they mean. There are three
main stages to this analysis:
Syntactic Analysis:
Where we use grammatical rules describing the legal structure of the language to
obtain one or more parses of the sentence.
Semantic Analysis:
Where we try and obtain an initial representation (or representations) of the meaning
of the sentence, given the possible parses.
Pragmatic Analysis:
Where we use additional contextual information to fill in gaps in the meaning
representation, and to work out what the speaker was really getting at.
(So, if we include speech understanding, we can think of there being four
stages of processing.)
We won’t be saying much more about speech understanding in this course.
Generally, it is feasible if we train a system for a particular speaker who speaks
clearly in a quiet room, using a limited vocabulary. State-of-the-art systems might
allow us to relax one of these constraints (e.g. single speaker), but no system has
been developed that works effectively for a wide range of speakers, speaking quickly
with a wide vocabulary in a normal environment. Anyway, the next section will
discuss in some detail the stage of syntactic analysis. The sections after that will
discuss rather more briefly what’s involved in semantic and pragmatic analysis. We
will present these as successive stages, where we first do syntax, then semantics,
then pragmatics. However, you should note that in practical systems it may NOT
always be appropriate to fully separate out the stages, and we may want to interleave:
for example, syntactic and semantic analysis.
To define and understand natural language processing, we should consider the
topics given below:
Introduction: Brief history of NLP research, current applications, generic
NLP system architecture, and knowledge-based versus probabilistic
approaches
Finite state techniques: Inflectional and derivational morphology, finite-state automata in NLP, finite-state transducers
Prediction and part-of-speech tagging: Corpora, simple N-grams, word
prediction, stochastic tagging, evaluating system performance
Parsing and generation I: Generative grammar, context-free grammars,
parsing and generation with context-free grammars, weights and probabilities
Parsing and generation II: Constraint-based grammar, unification, simple
compositional semantics
Lexical semantics: Semantic relations, WordNet, word senses, word sense
disambiguation
Discourse: Anaphora resolution, discourse relations
Applications: Machine translation, e-mail response, spoken dialogue systems
Natural language processing (NLP) can be defined as the automatic (or semi-automatic) processing of human language. The term ‘NLP’ is sometimes used rather
more narrowly than that, often excluding information retrieval and sometimes even
excluding machine translation. NLP is sometimes contrasted with ‘computational
linguistics’, with NLP being thought of as more applied. Nowadays, alternative
terms are often preferred, like ‘language technology’ or ‘language engineering’.
Language is often used in contrast with speech (e.g. Speech and Language
Technology).
In general, language understanding is a complex problem because it requires
programming to extract meaning from sequences of words and sentences. At the
lexical level, the program uses words, prefixes, suffixes and other morphological
forms and inflections. At the syntactic level, it uses grammar to parse a sentence.
Semantic interpretation (i.e., deriving meaning from a group of words) depends on
domain knowledge to assess what an utterance means. For example, ‘Let’s meet by
the bank to get a few bucks’ means one thing to bank robbers and another to weekend
hunters. Finally, to interpret the pragmatic significance of a conversation, the
computer needs a detailed understanding of the goals of the participants in the
conversation and the context of the conversation.
For example, to build a program that understands spoken language, we need
all the facilities to understand written language as well as enough additional
knowledge to handle all the noise and ambiguities of the audio signal. Thus it is
useful to divide the entire language processing problem into two tasks:
• Processing the written text using lexical, syntactic and semantic knowledge
of the language as well as the required real-world information
• Processing the spoken language using all the information needed above and
additional knowledge about phonology as well as enough added information
to handle further ambiguities that arise in speech
Examples:
1. The Problem: No natural language program can be complete because new
words, expressions and meanings can be generated liberally:
I’ll bring it to you.
Language can develop with the change in the experiences that we want to
communicate.
2. Same expressions means different things in different contexts:
Where’s the water? (in a chemistry lab, it must be pure)
Where’s the water? (when you are thirsty, it must be potable)
Where’s the water? (dealing with a leaky roof, it can be filthy)
Language lets us communicate about an infinite world using a finite number
of symbols.
3. There are lots of ways to say the same thing:
John was born on November 10.
John’s birthday is November 10.
When you know a lot, facts imply each other. Language is intended to be
used by agents who know a lot.
4. English sentences are incomplete descriptions of the information that they
are intended to convey:
Some dogs are outside.
↓
Some dogs are on the lawn.
Three dogs are on the lawn.
Don, Spot and Tiger are on the lawn.

I called Sonia to ask her to the movies. She said she’d love to go.
↓
She was home when I called. She answered the phone. I actually asked her.
Language allows speakers to be as vague or as precise as they like. It also allows
speakers to leave out things they believe their hearers already know.
We concentrate on written language processing which is usually called natural
language processing. Natural language processing includes understanding and
generation, as well as other tasks such as multilingual translation.
Natural languages are human languages like English, Telugu and Hindi. The
different areas of natural language processing are as follows:
1. Natural language understanding
2. Natural language generation
3. Natural language translations
The reasons for studying natural languages are:
1. To understand how language is structured
2. To understand the mental mechanisms necessary to support use of language,
e.g., memory
3. Easier communication with computers for humans
Talking is easier than typing.
Compact communication of complex concepts.
4. Machine translation, i.e., converting one language to another language
5. In future, intelligent systems may use natural language to talk to each other
Natural language evolved because humans needed to communicate complex concepts
over a bandwidth-limited serial channel, i.e., speech. All of our communication
methods are serial:
• A small number of basic symbols (characters).
• Basic symbols are combined into words.
• Words are combined into phrases and sentences.
Natural language is strongly biased toward minimality:
• Never say something that the listener already knows.
• Omit things that can be inferred.
• Eliminate redundancy.
• Shorten.
In general, natural language processing involves a translation from the natural
language to some internal representation that represents its meaning. The internal
representation might be predicate calculus, a semantic network, a frame or a
script representation.
There are many problems in making such a translation:
1. Ambiguity: There may be multiple ways of translating a statement.
Ex. I went to the bank.
Here ‘bank’ may carry different meanings (a riverbank or a financial institution).
2. Incompleteness: The statements are usually only the bare outline of the
message. The missing parts must be filled in.
Ex. I was late to my office.
My car wouldn’t start.
The battery was dead.
The study of language is traditionally divided into several broad areas:
1. Syntax: The rules by which words can be put together legally to form
sentences.
2. Semantics: Meaning of sentences.
3. Pragmatics: Knowledge about the world and the social content of language
use.
Before getting into detail on the several components of the natural language
understanding process, it is useful to survey all of them and see how they fit together.
7.2.1 Steps in the Process
1. Morphological Analysis: Individual words are analysed into their components
and non-word tokens, for example, punctuation marks, are separated from
the words.
2. Syntactic Analysis: Linear sequences of words are changed into structures
that show how the words relate to each other. Some word sequences may be
rejected if they violate the language’s rules for how words may be
combined. For example, an English syntactic analyser would reject the
sentence ‘Boy the go the to store.’
3. Semantic Analysis: The structures created by the syntactic analyser are
assigned meaning. In other words, a mapping is made between the syntactic
structures and objects in the task domain. Structures for which no such
mapping is possible may be rejected. For example, the sentence ‘Colourless
green ideas sleep furiously’ would be rejected as semantically anomalous.
4. Discourse Integration: The meaning of an individual sentence may depend
on the sentences that precede it and may influence the meaning of the sentences
that follow it. For example, the word ‘it’ in the sentence ‘Shyam wanted it’
depends on the prior discourse content, while the word ‘Shyam’ may influence
the meaning of the subsequent sentences (such as ‘He always had’).
5. Pragmatic Analysis: The structure representing what was said is reinterpreted
to determine what was actually meant. For example, the sentence ‘Do you
know what time it is?’ should be interpreted as a request to be told the time.
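The five steps above can be sketched as a staged pipeline. The sketch below is purely illustrative: the function names, the toy lexicon and the string-based meaning representation are all hypothetical, invented for this example rather than taken from any standard library.

```python
# An illustrative sketch of the five analysis stages as a pipeline.
def morphological(text):
    # Separate punctuation marks from words and lowercase everything.
    for mark in ".,?!":
        text = text.replace(mark, " " + mark)
    return text.lower().split()

def syntactic(tokens):
    # A stand-in for a real parser: refuse an obviously bad word order.
    if tokens[:2] == ["the", "go"]:
        raise ValueError("ungrammatical")
    return tokens

def semantic(tokens):
    # Map each word to a toy meaning symbol; drop punctuation tokens.
    lexicon = {"shyam": "PERSON:shyam", "wanted": "PRED:want", "it": "PRON:it"}
    return [lexicon[t] for t in tokens if t.isalpha()]

def discourse(meanings, context):
    # Resolve pronouns against the preceding discourse.
    return [context.get(m, m) if m.startswith("PRON") else m for m in meanings]

def pragmatic(meanings):
    # Decide what the utterance is doing; here, a plain assertion.
    return ("ASSERT", meanings)

tokens = morphological("Shyam wanted it.")
meanings = discourse(semantic(syntactic(tokens)), {"PRON:it": "OBJ:ball"})
print(pragmatic(meanings))
```

Each stage consumes the previous stage's output, which is why the ordering of the steps matters even when real systems interleave them.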
7.2.1.1 Syntax
The stage of syntactic analysis is the best understood stage of natural language
processing. Syntax helps us understand how words are grouped together to make
complex sentences, and gives us a starting point for working out the meaning of the
whole sentence. For example, consider the following two sentences:
1. The dog ate the bone.
2. The bone was eaten by the dog.
The rules of syntax help us work out that it’s the bone that gets eaten and not
the dog. A simple rule like ‘it’s the second noun that gets eaten’ just won’t
work.
Syntactic analysis allows us to determine possible groupings of words in a
sentence. Sometimes, there will only be one possible grouping, and we will
be well on the way to working out the meaning. For example, in the following
sentence:
3. The rabbit with long ears enjoyed a large green lettuce.
We can work out from the rules of syntax that ‘the rabbit with long ears’
forms one group (a noun phrase), and ‘a large green lettuce’ forms another
noun phrase group. When we get down to working out the meaning of the
sentence we can start off by working out the meaning of these word groups,
before combining them together to get the meaning of the whole sentence.
In other cases, there may be many possible groupings of words. For example,
in the sentence ‘John saw Mary with a telescope’, there are two different
readings based on the following groupings:
(i) John saw (Mary with a telescope), i.e., Mary has the telescope.
(ii) John (saw Mary with a telescope), i.e., John saw her with the telescope.
When there are many possible groupings, then the sentence is syntactically
ambiguous. Sometimes we will be able to use general knowledge to work out
which is the intended grouping. For example, consider the following sentence:
4. I saw the Forth Bridge flying into Edinburgh.
We can probably guess that the Forth Bridge isn’t flying! So, this sentence is
syntactically ambiguous, but unambiguous if we bring to bear general
knowledge about bridges. The ‘John saw Mary ..’ example is more seriously
ambiguous, though we may be able to work out the right reading if we know
something about John and Mary (is John in the habit of looking at girls through
a telescope?).
Anyway, rules of syntax specify the possible organizations of words in sentences.
They are normally specified by writing a grammar for the language. Of course, just
having a grammar isn’t enough to analyse the sentence; we need a parser to use the
grammar to analyse the sentence. The parser should return possible parse trees for
the sentence, indicating the possible groupings of words into higher-level syntactic
sections. The next section will describe how simple grammars and parsers may be
written, focusing on Prolog’s built-in definite clause grammar (DCG) formalism.
Language is meant for communicating about the world. If we succeed at
building a computational model of language, we should have a powerful tool for
communication about the world. So we can exploit knowledge about the world, in
combination with linguistic facts, to build computational natural language systems.
The largest part of human linguistic communication occurs as speech. Written
language is a recent invention and plays a lesser role than speech in most activities.
Communication
Intentional exchange of information brought about by the production and perception
of signs drawn from a shared system of conventional signs.
• Humans use language to communicate most of what is known about the world.
• The Turing test is based on language.
Communication as Action
• Speech act
• Language production viewed as an action
• Speaker, hearer, utterance
Examples:
Query: ‘Have you smelled the wumpus anywhere?’
Inform: ‘There’s a breeze here in 3 4.’
Request: ‘Please help me carry the gold.’
‘I could use some help carrying this.’
Acknowledge: ‘OK’
Promise: ‘I’ll shoot the wumpus.’
Fundamentals of Language
• Formal language: A (possibly infinite) set of strings
• Grammar: A finite set of rules that specifies a language
• Rewrite rules
non-terminal symbols (S, NP, etc.)
terminal symbols (he)
S → NP VP
NP → Pronoun
Pronoun → he
Chomsky’s Hierarchy
Every language is specified by a particular grammar. Therefore, the classification
of languages is based on the classification of the grammars used to specify them.
Consider a grammar G(VN, Σ, P, S), where VN is the set of variables (non-terminals),
Σ is the set of terminals and S ∈ VN. For classifying the grammar, you need to
look at P, the set of productions of the grammar.
Four classes of grammatical formalisms:
• Recursively enumerable (unrestricted) grammar: both sides of a rewrite rule can
contain any number of terminal and non-terminal symbols, e.g., AB → C
• Context-sensitive grammar: the RHS must contain at least as many symbols as
the LHS, e.g., ASB → AXB
• Context-free grammar (CFG): the LHS is a single non-terminal symbol, e.g.,
S → XYa
• Regular grammar: productions of the form X → a or X → aY
In terms of the forms of production, Chomsky classified the grammar hierarchy
into four types, which are as follows:
• Type 0 Grammar
• Type 1 Grammar
• Type 2 Grammar
• Type 3 Grammar
Type 0 Grammar
In formal language theory, type 0 grammar is the superset of grammars and is
defined as a phrase structure grammar that has no restrictions. In other words, in
type 0 grammar, no restrictions are there on the left and the right sides of the
productions of the grammar. Type 0 grammar is also known as unrestricted grammar.
In type 0 grammar, the productions used for generating strings are known as type 0
productions. Type 0 productions are not associated with any restrictions. The
languages specified by type 0 grammar are known as type 0 languages.
In a production of the form φAψ → φαψ with A being a variable, φ and ψ are
respectively known as the left context and the right context, while the string φαψ is
known as replacement string. Again, a production having the form, φAψ → φαψ is
known as a type 1 production when α ≠ Λ. If you consider a production abcAcd →
abcABcd, then abc is the left context, while cd is the right context. Similarly, in
AC → A, A is the left context, while Λ is the right context, and the production
here simply erases C.
Theorem 1: If G is a type 0 grammar, then you can find an equivalent grammar G1
in which each production is either of the form α → β or A → a. Here, α and β are
strings of variables, A is a variable and a is a terminal.
Proof: To construct G1, consider a production α → β in G with α or β containing
some terminals. In both α and β, replace each terminal a by a new variable Ca to
produce α′ and β′. Now, for every α → β where α or β contains terminals, there is
a corresponding α′ → β′ together with productions of the form Ca → a for each
terminal a that appears in α or β. The productions obtained from this construction
are the productions of G1. Also, the variables of G along with the new variables of
the form Ca are the variables of G1. Similarly, the terminals and the start symbol of
G1 are the same as those of G. Thus, G1 satisfies the required conditions for a
grammar and it is equivalent to G. Therefore, L(G) = L(G1).
Note: This theorem also holds for grammars of types 1, 2 and 3.
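The construction in the proof above can be sketched as a short program. The sketch is an illustrative assumption: productions are modelled as (LHS, RHS) pairs of symbol lists, and the helper name `normalize` is made up for this example.

```python
# Sketch of the Theorem 1 construction: wherever a production mixes
# terminals with other symbols, each terminal a is replaced by a new
# variable Ca and a production Ca -> a is added.
def normalize(productions, terminals):
    new_prods, added = [], {}
    for lhs, rhs in productions:
        # Productions already of the form A -> a are kept as they are.
        if len(lhs) == 1 and len(rhs) == 1 and rhs[0] in terminals:
            new_prods.append((lhs, rhs))
            continue
        lift = lambda side: [("C" + s) if s in terminals else s for s in side]
        for s in lhs + rhs:
            if s in terminals:
                added["C" + s] = s
        new_prods.append((lift(lhs), lift(rhs)))
    # Add Ca -> a for every terminal that was lifted.
    new_prods += [([var], [term]) for var, term in added.items()]
    return new_prods

print(normalize([(["a", "A"], ["a", "B", "b"])], {"a", "b"}))
```

For the production aA → aBb, the sketch produces CaA → CaBCb together with Ca → a and Cb → b, mirroring the proof.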
Type 1 Grammar
A grammar is termed as type 1 grammar when all of its productions are type 1
productions. In other words, you can define a grammar G (VN , ∑, P, S) , where all
the productions are of the form φAψ → φαψ. Type 1 grammar is also known as
context-sensitive or context-dependent grammar. The formal language that can be
generated using type 1 grammar is known as type 1 or context-sensitive language.
In type 1 grammar, a production of the type S → Λ is allowed; however, in that
case, S does not appear on the right-hand side of any production.
A production φAψ → φαψ in type 1 grammar does not decrease the length of the
working string, as |φAψ| ≤ |φαψ| and α ≠ Λ. However, if a production α → β is such
that |α| ≤ |β|, then it is not necessarily of type 1. For example,
AB → BC is not of type 1.
Again, a grammar G (VN, ∑, P, S) is called monotonic when every production in P
is of the form α → β with |α| ≤ |β|, or S → Λ. In the second case, S does not
appear on the right-hand side of any production of G.
Theorem: Every monotonic grammar G is equivalent to type 1 grammar.
Proof: For the grammar G, you can apply theorem 1 to get an equivalent grammar
G1. Now, consider a production A1A2…Am → B1B2…Bn, where n ≥ m in G1. For
m = 1, this production is of type 1. Now, consider m ≥ 2. Then you can introduce
new variables C1, C2, …, Cm to construct the following type 1 productions
corresponding to A1A2…Am → B1B2…Bn:
A1A2…Am → C1A2…Am
C1A2…Am → C1C2A3…Am
C1C2A3…Am → C1C2C3A4…Am
…
C1C2…Cm−1Am → C1C2…CmBm+1Bm+2…Bn
C1C2…CmBm+1Bm+2…Bn → B1C2…CmBm+1Bm+2…Bn
B1C2…CmBm+1Bm+2…Bn → B1B2C3…CmBm+1Bm+2…Bn
…
B1B2…Bm−1CmBm+1…Bn → B1B2…Bn
Now, explaining the above construction: since more than one symbol on the
left-hand side of the production A1A2…Am → B1B2…Bn would be replaced at once,
that production is not of type 1. In the chain of productions above, A1, A2, …, Am−1
are replaced by C1, C2, …, Cm−1, while Am is replaced by CmBm+1…Bn. Then
C1, C2, etc. are correspondingly replaced by B1, B2, etc. Since only one variable is
replaced at a time, these productions are of type 1.
Now, for each production in G1 that is not of type 1, you can repeat this construction
process. This gives rise to a new grammar G2 whose variables are those of G1
together with the new variables. The productions of G2 are of type 1, and the
terminals and the start symbol of G2 are those of G1.
Therefore, you can conclude that G2 is of type 1 or context-sensitive, i.e.,
L(G2) = L(G1) = L(G)
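To make the construction concrete, the following sketch generates the chain of type 1 productions for a given monotonic production. The function name and the symbol names (`A1`, `C1`, etc.) are illustrative assumptions.

```python
# Sketch of the chain construction above: given a monotonic production
# A1...Am -> B1...Bn (n >= m >= 2), produce type 1 productions that each
# replace exactly one symbol of the working string.
def type1_chain(A, B):
    m = len(A)
    C = ["C%d" % (i + 1) for i in range(m)]
    chain, cur = [], list(A)
    # Phase 1: replace A1..A(m-1) by C1..C(m-1), one symbol at a time.
    for i in range(m - 1):
        nxt = cur[:]
        nxt[i] = C[i]
        chain.append((cur, nxt))
        cur = nxt
    # Replace Am by Cm B(m+1)..Bn (the only step that lengthens the string).
    nxt = cur[:m - 1] + [C[m - 1]] + B[m:]
    chain.append((cur, nxt))
    cur = nxt
    # Phase 2: replace C1..Cm by B1..Bm, one symbol at a time.
    for i in range(m):
        nxt = cur[:]
        nxt[i] = B[i]
        chain.append((cur, nxt))
        cur = nxt
    return chain, cur

chain, final = type1_chain(["A1", "A2", "A3"], ["B1", "B2", "B3", "B4"])
print(len(chain), final)
```

For m = 3, n = 4 the chain has 2m = 6 productions and ends with the string B1B2B3B4, just as the proof requires.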
Type 2 Grammar
In the formal language, a grammar is called type 2 grammar, if it only contains type
2 productions. A type 2 production is of the form A → α, where A is in VN and α is
in (VN ∪ ∑)*. In type 2 production, the right-hand side of the production does not
have any right or left context. For example, S → Ba, A → ab, B → abc, etc. are the
examples of type 2 productions.
Type 2 grammar is often termed as context-free grammar and its productions are
called context-free productions. If a language is generated using type 2 grammar,
then it is known as type 2 language or context-free language.
Type 3 Grammar
A grammar is termed as type 3 grammar if it only contains the productions of type
3. Here, a production of type 3 is a production of the form A → a or A → aB, where
A and B are in VN, while a is in ∑. In type 3 grammar, the productions of the form
S → Λ are allowed; however, in such cases, the right-hand side of the productions
does not contain S. Type 3 grammar is generally known as regular grammar. The
languages specified by type 3 grammar are known as type 3 languages or regular
languages.
Example 1. Find the highest type number that can be applied to the following
productions:
1. S → A0, A → 1 | 2 | B0, B → 012
2. S → ASB | b, A → bA | c
3. S → bS | bc
Solution:
1. Here, S → A0, A → B0 and B → 012 are of type 2, while A → 1 and A →
2 are of type 3. Therefore, the highest type number is 2.
2. Here, S → ASB is of type 2, while S → b, A → bA and A → c are of type 3.
Therefore, the highest type number is 2.
3. Here, S → bS is of type 3, while S → bc is of type 2. Therefore, the highest
type number is 2.
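The classification used in Example 1 can be sketched as follows. The convention that a non-terminal is a single upper-case letter (anything else, including digits, is a terminal) is an assumption of this illustration, as are the function names.

```python
# Sketch of assigning the highest Chomsky type number to productions.
def production_type(lhs, rhs):
    # Highest type number (0-3) applicable to a single production.
    is_nt = lambda s: s.isupper()
    if len(lhs) == 1 and is_nt(lhs):
        if len(rhs) == 1 and not is_nt(rhs):
            return 3                              # A -> a
        if len(rhs) == 2 and not is_nt(rhs[0]) and is_nt(rhs[1]):
            return 3                              # A -> aB
        return 2                                  # any other A -> alpha
    if len(rhs) >= len(lhs):
        return 1                                  # length non-decreasing
    return 0

def highest_type(productions):
    # A grammar is of type k only if every production is, so the highest
    # applicable type number is the minimum over its productions.
    return min(production_type(l, r) for l, r in productions)

print(highest_type([("S", "A0"), ("A", "1"), ("A", "2"),
                    ("A", "B0"), ("B", "012")]))  # → 2
```

Running the same check on the grammars of parts 2 and 3 of the example also yields 2, matching the solutions above.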
Chomsky Normal Form
In the Chomsky Normal Form (CNF), there are certain restrictions on the length
and the type of symbols to be used in the RHS of the productions. A grammar G is
said to be in the CNF if the productions defined in the grammar are in one of the
following forms:
• A → a
• A → BC
• S → Λ, if Λ ∈ L(G)
For example, the grammar G with the productions S → AB, S → Λ, A → a and B →
b is in the CNF. The grammar G with the productions S → ABC, S → aC, A → b, B
→ a and C → c is not in the CNF because the productions S → ABC and S → aC
are not in the form as required by the definition of the CNF. You can reduce a CFG
into an equivalent grammar G2, which is in CNF, by performing the following steps:
• Step 1: Eliminate all the null productions and then eliminate all the unit
productions from the given grammar. Represent the grammar obtained after
the elimination of all the null and unit productions by G = (VN, Σ, P, S).
• Step 2: Eliminate all the terminals that appear with some non-terminals on
the RHS of the productions. For this, you need to define the grammar G1 =
(V′N, Σ, P1, S). P1 and V′N in G1 are constructed in the following ways:
o The productions in P which are of the forms A → a and A → BC
are placed in P1. Also, all the non-terminal symbols in VN are included
in V′N.
o Consider a production A → X1X2 … Xn with some terminal
symbols on the RHS. If Xi represents a terminal symbol, say ai,
then add a new non-terminal Cai to V′N and a new production Cai
→ ai to P1. Also, all the terminal symbols in the production of the
form A → X1X2 … Xn are replaced by the corresponding new
non-terminal symbols and the resulting production is included in P1.
• Step 3: Restrict the number of the non-terminals on the RHS of a production.
The RHS of each production in P1 consists either of a terminal symbol or of
two or more non-terminal symbols. Thus, you need to define the grammar
G2 = (V″N, Σ, P2, S) in the following ways:
o All the productions in P1 which are already in the required form are
included in P2. All the non-terminals in V′N are also included in V″N.
o For the productions of the type A → A1A2 . . . Am, where m ≥ 3, add new
productions of the following form to P2:
A → A1C1
C1 → A2C2
...
Cm − 2 → Am − 1Am
Where, C1, C2, . . . , Cm − 2 are the new non-terminal symbols added
to V″N.
Example 2. Reduce the grammar G to the CNF with the following productions:
S → bBaA, A → aA | a, B → bB | b
Solution: Since the grammar does not contain any null or unit productions, you
need to proceed with step 2.
Step 2: Let G1 = (V′N, Σ, P1, S)
Now, construct P1 and V′N as follows:
Add the productions A → a and B → b to P1.
Eliminate the terminal symbols from the productions S → bBaA, A → aA and B
→ bB by using the following productions:
S → CbBCaA
A → CaA
B → CbB
Ca → a
Cb → b
All these productions are added to P1 and V′N = {S, A, B, Ca, Cb}
Step 3: The length of the RHS of the production S → CbBCaA is four; therefore, it
should be replaced by the following productions:
S → CbC1
C1 → BC2
C2 → CaA
Therefore, the productions in P2 are as follows:
S → CbC1, C1 → BC2, C2 → CaA, A → CaA, B → CbB, Ca → a, Cb → b, A → a, and
B → b.
V″N = {S, A, B, Ca, Cb, C1, C2}
Σ = {a, b}
Thus, the grammar in the CNF is represented by G2 = (V″N, Σ, P2, S).
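Steps 2 and 3 of the reduction can be sketched for grammars that, as in Example 2, already contain no null or unit productions. The representation (productions as pairs, lower-case symbols as terminals, `Ca`/`C1`-style helper names) is an assumption of this sketch, chosen to follow the example's naming.

```python
# Sketch of CNF reduction steps 2 and 3 for a grammar with no null or
# unit productions. Productions are (lhs, rhs-list) pairs.
def to_cnf(productions):
    out, lifted = [], {}
    # Step 2: replace each terminal a on a long RHS by Ca, adding Ca -> a.
    for lhs, rhs in productions:
        if len(rhs) == 1:                       # already of the form A -> a
            out.append((lhs, rhs))
            continue
        new_rhs = []
        for sym in rhs:
            if sym.islower():                   # terminal
                lifted["C" + sym] = sym
                new_rhs.append("C" + sym)
            else:
                new_rhs.append(sym)
        out.append((lhs, new_rhs))
    out += [(var, [t]) for var, t in lifted.items()]
    # Step 3: split any RHS longer than two with fresh variables C1, C2, ...
    final, count = [], 0
    for lhs, rhs in out:
        while len(rhs) > 2:
            count += 1
            final.append((lhs, [rhs[0], "C%d" % count]))
            lhs, rhs = "C%d" % count, rhs[1:]
        final.append((lhs, rhs))
    return final

for prod in to_cnf([("S", list("bBaA")), ("A", list("aA")), ("A", ["a"]),
                    ("B", list("bB")), ("B", ["b"])]):
    print(prod)
```

For the grammar of Example 2, the sketch reproduces exactly the productions derived above: S → CbC1, C1 → BC2, C2 → CaA, and so on.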
Component Steps of Communication
SPEAKER:
• Intention
Know (H, ¬Alive (Wumpus, S3))
• Generation
‘The wumpus is dead’
• Synthesis
[thaxwahmpaxsihzdehd]
HEARER:
• Perception:
‘The wumpus is dead’
• Analysis (Parsing):
(Semantic Interpretation): ¬Alive (Wumpus, Now)
Tired (Wumpus, Now)
(Pragmatic Interpretation): ¬Alive (Wumpus1, S3)
Tired (Wumpus1, S3)
HEARER:
• Disambiguation:
¬Alive (Wumpus1, S3)
• Incorporation:
TELL (KB, ¬Alive (Wumpus1, S3))
Writing Grammar
Natural language grammar specifies allowable sentence structures in terms of basic
syntactic categories such as nouns and verbs, and allows us to determine the structure
of the sentence. It is defined in a similar way to a grammar for a programming language,
though it tends to be more complex, and the notations used are somewhat different.
Because of the complexity of natural language, a given grammar is unlikely to
cover all possible syntactically acceptable sentences.
[Note: In natural language we don’t usually parse language in order to check
that it is correct. We parse it in order to determine the structure and help work out
the meaning. But most grammars are just concerned with the structure of ‘correct’
English, as it gets much more complex to parse if you allow bad English.]
A starting point for describing the structure of a natural language is to use
context-free grammar (as often used to describe the syntax of programming
languages). Suppose we want a grammar that will parse sentences like:
1. John ate the biscuit.
2. The lion ate the schizophrenic.
3. The lion kissed John.
But we want to exclude incorrect sentences like:
1. Ate John biscuit the.
2. Schizophrenic the lion the ate.
3. Biscuit lion kissed.
A simple grammar that deals with this is the following:
1. sentence → noun_phrase, verb_phrase.
2. noun_phrase → proper_name.
3. noun_phrase → determiner, noun.
4. verb_phrase → verb, noun_phrase.
proper_name → [Mary]
proper_name → [John]
noun → [schizophrenic]
noun → [biscuit]
verb → [ate]
verb → [kissed]
determiner → [the]
The notation is similar to what is sometimes used for grammars of programming
languages. A sentence consists of a noun phrase and a verb phrase. A noun phrase
consists of either a proper noun (using rule 2) or a determiner (e.g., the, a) and a
noun. A verb phrase consists of a verb (e.g., ate) and a noun phrase. The rules at the
end are really like dictionary entries, which state the syntactic category of different
words. Basic syntactic categories such as ‘noun’ and ‘verb’ are terminal symbols in
grammar, as they cannot be expanded into lower-level categories. (We put words in
square brackets because that is the way it is generally done in Prolog’s built-in
grammar notation.)
If we consider the example sentences just stated, the sentence ‘John ate the
biscuit’ consists of a noun phrase ‘John’ and a verb phrase ‘ate the biscuit’. The
noun phrase is just a proper noun, while the verb phrase consists of a verb ‘ate’ and
another noun phrase (‘the biscuit’). This noun phrase consists of a determiner ‘the’
and a noun ‘biscuit’. The incorrect sentences will be excluded by the grammar. For
example, ‘biscuit lion kissed’ starts with two nouns, which is not allowed by the grammar.
However, some odd sentences will be allowed, such as ‘the biscuit kissed John’.
This sentence is syntactically acceptable, just semantically odd, so should still be
parsed.
For a given grammar, we can illustrate the syntactic structure of the sentence
by giving the parse tree, which shows how the sentence is broken down into different
syntactic constituents. This kind of information may be useful for later semantic
processing. Anyway, given the above grammar, the parse tree for ‘John ate the lion’
would have a sentence node splitting into a noun phrase (‘John’) and a verb phrase,
with the verb phrase splitting into the verb ‘ate’ and the noun phrase ‘the lion’
(itself a determiner plus a noun).
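The grammar and the parse tree just described can be sketched with a small recursive-descent parser. This is an illustrative Python sketch, not Prolog's DCG machinery; the data structures (dictionaries for rules, nested tuples for trees) are assumptions of the sketch.

```python
# A tiny recursive-descent parser for the toy grammar above. It returns a
# nested parse tree such as ('sentence', [('noun_phrase', ...), ...]),
# or None when no complete parse exists.
GRAMMAR = {
    "sentence":    [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["proper_name"], ["determiner", "noun"]],
    "verb_phrase": [["verb", "noun_phrase"]],
}
LEXICON = {
    "proper_name": {"mary", "john"},
    "noun":        {"schizophrenic", "biscuit", "lion"},
    "verb":        {"ate", "kissed"},
    "determiner":  {"the"},
}

def parse(cat, words, i):
    # Yield (subtree, next_position) for each way of parsing `cat` at i.
    if cat in LEXICON:
        if i < len(words) and words[i] in LEXICON[cat]:
            yield (cat, words[i]), i + 1
        return
    for rhs in GRAMMAR[cat]:
        def expand(syms, j, kids):
            if not syms:
                yield (cat, kids), j
            else:
                for kid, k in parse(syms[0], words, j):
                    yield from expand(syms[1:], k, kids + [kid])
        yield from expand(rhs, i, [])

def parse_sentence(text):
    words = text.lower().split()
    for tree, end in parse("sentence", words, 0):
        if end == len(words):
            return tree
    return None

print(parse_sentence("John ate the lion"))
print(parse_sentence("ate John biscuit the"))  # → None
```

As the text notes, the semantically odd ‘the biscuit kissed John’ still parses, while word orders that break the rules are rejected.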
Of course, the grammar given above is not really adequate to parse natural
language properly. Consider the following two sentences:
• Mary eat the lion.
• Mary eats the ferocious lion.
If we have ‘eat’ and ‘eats’ categorized as verbs, then, given the simple grammar
above, the first sentence will be acceptable according to the grammar, while the
second won’t—we don’t have any mention of adjectives in our grammar. To deal
with the first problem we need to have some method of enforcing number/person
agreement between subjects and verbs, so that things like ‘I am..’ and ‘We are ..’ are
accepted, but ‘I are ..’ and ‘We am ..’ are not. To deal with the second problem, we
need to add further rules to our grammar.
To enforce subject-verb agreement, the simplest method is to add arguments
to our grammar rules. If we’re only concerned about singular vs plural nouns (and
assume that we don’t have any first or second person pronouns), we might get the
rules and dictionary entries, which include the following:
sentence → noun_phrase(Num), verb_phrase(Num)
noun_phrase(Num) → proper_name(Num)
noun_phrase(Num) → determiner(Num), noun(Num)
verb_phrase(Num) → verb(Num), noun_phrase(_)
proper_name(sing) → [Mary]
noun(sing) → [lion]
noun(plur) → [lions]
determiner(sing) → [the]
determiner(plur) → [the]
verb(sing) → [eats]
verb(plur) → [eat]
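A minimal sketch of how the Num argument enforces agreement follows. The lexicon encoding and function names are invented for this illustration; the object noun phrase may take any number, playing the role of the ‘_’ argument in the rules above.

```python
# Number agreement via feature arguments: each lexical entry carries a
# 'sing' or 'plur' feature, and the sentence rule requires the subject NP
# and the verb to share the same value.
LEX = {
    "mary":  [("proper_name", "sing")],
    "lion":  [("noun", "sing")],
    "lions": [("noun", "plur")],
    "the":   [("det", "sing"), ("det", "plur")],
    "eats":  [("verb", "sing")],
    "eat":   [("verb", "plur")],
}

def noun_phrase(words):
    # Return the possible (Num, words-consumed) analyses of an initial NP.
    results = []
    for cat, num in (LEX.get(words[0], []) if words else []):
        if cat == "proper_name":
            results.append((num, 1))
        if cat == "det" and len(words) > 1:
            for cat2, num2 in LEX.get(words[1], []):
                if cat2 == "noun" and num2 == num:   # det and noun agree
                    results.append((num, 2))
    return results

def grammatical(text):
    # sentence -> noun_phrase(Num), verb(Num), noun_phrase(_)
    words = text.lower().split()
    for num, n in noun_phrase(words):
        rest = words[n:]
        if rest and any(c == "verb" and v == num
                        for c, v in LEX.get(rest[0], [])):
            obj = rest[1:]
            if any(used == len(obj) for _, used in noun_phrase(obj)):
                return True
    return False

print(grammatical("Mary eats the lion"))  # → True
print(grammatical("Mary eat the lion"))   # → False
```

Because ‘the’ lists both a singular and a plural entry, ‘Mary eats the lions’ is also accepted: the determiner unifies with whichever number the noun supplies.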
Note that, strictly speaking, we no longer have a context-free grammar once we add
these extra arguments to our rules. In general, getting the agreement right in a grammar
is much more complex than this. We need fairly complex rules, and we also need to put more
information in dictionary entries. A good dictionary will not state everything
explicitly, but will exploit general information about word structure, such as the
fact that, given a verb such as ‘eat’ the third person singular form generally involves
adding an ‘s’: hit/hits, eat/eats, like/likes, etc. Morphology is the area of natural
language processing concerned with such things.
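As a toy illustration of such a morphological rule, the sketch below derives regular third-person singular forms. The sibilant special case is a simplification made for this example; real morphology needs far more.

```python
# A toy sketch of the rule just mentioned: regular verbs form the
# third-person singular by adding 's', with a small special case for
# sibilant endings (e.g. 'kiss' -> 'kisses').
def third_singular(verb):
    if verb.endswith(("s", "sh", "ch", "x", "z")):
        return verb + "es"
    return verb + "s"

for verb in ("hit", "eat", "like", "kiss"):
    print(verb, "->", third_singular(verb))
```

A dictionary built this way needs to store only the base form ‘eat’, with ‘eats’ derived on demand.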
To extend the grammar to allow adjectives we need to add an extra rule or
two, e.g.:
noun_phrase(Num) → determiner(Num), adjectives, noun(Num).
adjectives → adjective, adjectives.
adjectives → adjective.
adjective → [ferocious].
adjective → [ugly].
etc.
That is, noun phrases can consist of a determiner, some adjectives and a noun.
Adjectives consist of an adjective and some more adjectives, OR just an adjective.
We can now parse the sentence ‘the ferocious ugly lion eats Mary’.
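To make the mechanics concrete, here is an illustrative sketch (ours, not from the original text) of the agreement-and-adjectives grammar as a small recursive-descent recognizer in Python; the category names follow the DCG rules above, but the coding details are our own.

```python
# A tiny recognizer for the agreement grammar above (illustrative sketch).
LEXICON = {
    "mary": ("proper_name", "sing"),
    "lion": ("noun", "sing"),
    "lions": ("noun", "plur"),
    "the": ("determiner", None),      # None: 'the' works for sing and plur
    "eats": ("verb", "sing"),
    "eat": ("verb", "plur"),
    "ferocious": ("adjective", None),
    "ugly": ("adjective", None),
}

def word(words, i, cat, num):
    """Match one word of category cat; None matches any number value."""
    if i < len(words) and words[i] in LEXICON:
        c, n = LEXICON[words[i]]
        if c == cat and (n is None or num is None or n == num):
            return i + 1              # position after the matched word
    return None

def adjectives(words, i):
    """Zero or more adjectives (covers both adjective rules above)."""
    j = word(words, i, "adjective", None)
    return adjectives(words, j) if j is not None else i

def noun_phrase(words, i, num):
    j = word(words, i, "proper_name", num)       # NP -> proper_name
    if j is not None:
        return j
    j = word(words, i, "determiner", num)        # NP -> det, adjs, noun
    if j is None:
        return None
    return word(words, adjectives(words, j), "noun", num)

def sentence(words):
    for num in ("sing", "plur"):                 # the Num argument of the DCG
        i = noun_phrase(words, 0, num)
        if i is None:
            continue
        i = word(words, i, "verb", num)          # subject-verb agreement
        if i is None:
            continue
        if noun_phrase(words, i, None) == len(words):
            return True
    return False

print(sentence("the ferocious ugly lion eats mary".split()))  # True
print(sentence("mary eat the lion".split()))                  # False
```

Note how the shared `num` value passed to both the subject and the verb plays exactly the role of the `Num` argument in the DCG rules.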
Another thing we may need to do to the grammar is to extend it so that we can
distinguish between transitive verbs that take an object (e.g. likes) and intransitive
verbs that don’t (e.g. talks). (‘Mary likes the lion’ is OK while ‘Mary likes’ is not.
‘Mary talks’ is OK while ‘Mary talks the lion’ is not). This is left as an exercise for
the reader.
Our grammar so far (if we put all the bits together) still only parses sentences
of a very simple form. It certainly wouldn’t parse the sentences I’m currently writing!
We can try adding more and more rules to account for more and more of English:
for example, we need rules that deal with prepositional phrases (e.g. ‘Mary likes
the lion with the long mane’), and relative clauses (e.g., ‘The lion that ate Mary
kissed John’). These are left as another exercise for the reader.
As we add more and more rules to allow more bits of English to be parsed
then we may find that our basic grammar formalism becomes inadequate, and we
need a more powerful one to allow us to concisely capture the rules of syntax.
There are lots of different grammar formalisms that have been developed (e.g.,
unification grammar, categorial grammar), but we won’t go into them here.
Formal Grammar
• The lexicon for εo:
Noun → stench | breeze | glitter | wumpus | pit | pits | gold | …
Verb → is | see | smell | shoot | stinks | go | grab | turn | …
Adjective → right | left | east | dead | back | smelly | …
Adverb → here | there | nearby | ahead | right | left | east | …
Pronoun → me | you | I | it | …
Name → John | Mary | Bush | Rocky | …
Article → the | a | an | …
Preposition → to | in | on | near | …
Conjunction → and | or | but | …
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• The grammar for εo:
S → NP VP                     I + feel a breeze
  | S Conjunction S           I feel a breeze + and + I smell a wumpus
NP → Pronoun                  I
  | Name                      John
  | Noun                      pits
  | Article Noun              the + wumpus
  | Digit Digit               3 4
  | NP PP                     the wumpus + to the east
  | NP RelClause              the wumpus + that is smelly
VP → Verb                     stinks
  | VP NP                     feel + a breeze
  | VP Adjective              is + smelly
  | VP PP                     turn + to the east
  | VP Adverb                 go + ahead
PP → Preposition NP           to + the east
RelClause → that VP           that + is smelly
• Parts of speech
Open class: noun, verb, adjective, adverb Closed class: pronoun, article,
preposition, conjunction …
• Grammar
Overgeneration: the grammar accepts strings that are not good English, e.g. ‘Me go Boston’
Undergeneration: the grammar fails to accept good English sentences, e.g. ‘I think the wumpus is smelly’
7.2.1.2 Parsers
Having a grammar isn’t enough to parse natural language; you need a parser. The
parser should search for possible ways the rules of the grammar can be used to
parse the sentence, so parsing can be viewed as a kind of search problem. In general,
there may be many different rules that can be used to ‘expand’ or rewrite a given
syntactic category, and the parser must check through them all, to see if the sentence
can be parsed using them. For example, in our mini-grammar above there were two
rules for noun_phrases—a parse of the sentence may use either one or the other. In
fact we can view the grammar as defining an AND-OR tree to search: alternative
ways of expanding a node give OR branches, while rules with more than one syntactic
category on the right hand side give AND branches.
So, to parse a sentence we need to search through all these possibilities,
effectively going through all possible syntactic structures to find one that fits the
sentence. There are good ways and bad ways of doing this, just as there are good
and bad ways of parsing programming languages. One way is basically to do a
depth-first search through the parse tree. When you reach the first terminal node in
the grammar (i.e. a primitive syntactic category, such as noun) you check whether
the first word of the sentence belongs to this category (e.g., is a noun). If it does,
then you continue the parse with the rest of the sentence. If it doesn’t, you backtrack
and try alternative grammar rules.
As an example, suppose you were trying to parse ‘John loves Mary’, given
the following grammar:
sentence → noun_phrase, verb_phrase.
verb_phrase → verb, noun_phrase.
noun_phrase → det, noun.
noun_phrase → p_name.
verb → [loves].
p_name → [john].
p_name → [mary].
You might start off expanding ‘sentence’ to a noun phrase followed by a verb phrase. Then
the noun phrase would be expanded to give a determiner and a noun, using the third
rule. A determiner is a primitive syntactic category (a terminal node in the grammar);
so we check whether the first word (John) belongs to that category. It doesn’t—
John is a proper noun—so we backtrack and find another way of expanding
‘noun_phrase’ and try the fourth rule. Now, as John is a proper name this will work
OK, so we continue the parse with the rest of the sentence (‘loves Mary’). We
haven’t yet expanded the verb phrase, so we try to parse ‘loves Mary’ as a verb
phrase. This will eventually succeed, so the whole thing succeeds.
It may be clear by now that in Prolog the parsing mechanism is really just
Prolog’s built-in search procedure. Prolog will just backtrack to explore the different
possible syntactic structures. The extra arguments which Prolog (internally) adds to
DCG rules allow it to parse a bit of the sentence using one rule, then the rest of the
sentence using another.
Note that this kind of simple top-down parser can be implemented reasonably
easily in other languages that don’t support backtracking by using an agenda-based
search mechanism. An item in the agenda may be the remaining syntactic categories
and bits of sentences left to parse, given this branch of the search tree (e.g. [[verb,
noun_phrase], [loves, Mary]]).
Simple parsers are often inefficient, because they don’t keep a record of all
the bits of the sentence that have been parsed. Using simple depth-first search with
backtracking can result in useful bits of parsing being undone on backtracking: if
you have a rule ‘a -> b, c, d’ and a rule ‘a -> b, c, e’ then you may find the first part
of the sentence consists of ‘b’ and ‘c’ constituents, go on to check if the rest is a ‘d’
constituent, but when that fails a Prolog-like system will throw away its conclusions
about the first half when backtracking to try the second rule. It will then duplicate
the parsing of the ‘b’ and ‘c’ bits. Anyway, better parsers try to avoid this. Two
important kinds of parsers used in natural language are transition network parsers
and chart parsers. A transition network parser makes use of a more flexible network
representation for grammar rules to partially avoid this problem. The above rules
would be represented as a single network in which the shared ‘b’ and ‘c’ arcs appear only once, with alternative ‘d’ and ‘e’ arcs leading to the final state (the original diagram is omitted here).
The transition network parser would basically traverse this network, checking
that words in the sentence match the syntactic categories on each arc.
A chart parser goes further in avoiding duplicating work by explicitly recording
in a chart all the possible parses of each bit of the sentence.
Parse Tree
Syntactic Analysis (Parsing)
• Parsing: The process of finding a parse tree for a given input string.
• Top-down parsing: Start with the S symbol and search for a tree that has the
words as its leaves.
• Bottom-up parsing: Start with the words and search for a tree with root S.
Trace of Bottom-up Parsing:
List of nodes            Subsequence       Rule
the wumpus is dead       the               Article → the
Article wumpus is dead   wumpus            Noun → wumpus
Article Noun is dead     Article Noun      NP → Article Noun
NP is dead               is                Verb → is
NP Verb dead             dead              Adjective → dead
NP Verb Adjective        Verb              VP → Verb
NP VP Adjective          VP Adjective      VP → VP Adjective
NP VP                    NP VP             S → NP VP
S
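The trace above can be mimicked with a naive ‘find a matching right-hand side and reduce’ loop. This Python sketch is our own illustration; it handles only this one small grammar and rule ordering, not bottom-up parsing in general:

```python
# Naive bottom-up reduction for the trace above (illustrative sketch).
RULES = [                     # (lhs, rhs) pairs, tried in this order
    ("Article", ["the"]), ("Noun", ["wumpus"]),
    ("Verb", ["is"]), ("Adjective", ["dead"]),
    ("NP", ["Article", "Noun"]),
    ("VP", ["Verb"]), ("VP", ["VP", "Adjective"]),
    ("S", ["NP", "VP"]),
]

def bottom_up(nodes):
    changed = True
    while changed:            # apply one reduction per pass until none fit
        changed = False
        for lhs, rhs in RULES:
            for i in range(len(nodes) - len(rhs) + 1):
                if nodes[i:i + len(rhs)] == rhs:
                    nodes = nodes[:i] + [lhs] + nodes[i + len(rhs):]
                    changed = True
                    break
            if changed:
                break
    return nodes

print(bottom_up(["the", "wumpus", "is", "dead"]))  # ['S']
```

A real bottom-up parser must be prepared to undo a reduction that turns out to be wrong; this greedy loop happens to succeed here only because the rule order suits this sentence.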
7.2.1.3 Semantics and pragmatics
The remaining two stages of analysis, semantics and pragmatics, are concerned
with getting at the meaning of a sentence. In the first stage (semantics), a partial
representation of the meaning is obtained based on the possible syntactic structure(s)
of the sentence, and on the meanings of the words in that sentence. In the second
stage, the meaning is elaborated based on contextual and world knowledge. To
illustrate the difference between these stages, consider the sentence:
He asked for the boss
From the knowledge of the meaning of the words and the structure of the sentence
we can work out that someone (who is male) asked for someone who is the boss.
But we can’t say who these people are and why the first guy wanted the second. If
we know something about the context (including the last few sentences spoken/
written) we may be able to figure these things out. Maybe the last sentence was
‘Fred had just been sacked.’, and we know from our general knowledge that bosses
generally sack people and if people want to speak to people who sack them it is
generally to complain about it. We could then really start to get at the meaning of
the sentence— Fred wants to complain to his boss about getting sacked.
Anyway, this second stage of getting at the real contextual meaning is referred
to as pragmatics. The first stage—based on the meanings of the words and the
structure of the sentence—is semantics and is what we’ll discuss a bit more next.
• Semantics
• Pragmatics
Semantics
In general, the input to the semantic stage of analysis may be viewed as being a set
of possible parses of the sentence, and information about the possible word meanings.
The aim is to combine the word meanings, given knowledge of the sentence structure,
to obtain an initial representation of the meaning of the whole sentence. The hard
thing, in a sense, is to represent word meanings in such a way that they may be
combined with other word meanings in a simple and general way.
It is not really possible even to present a simple approach to semantic analysis
in less than half a lecture—we can only outline some of the problems and show
what the output of this stage of analysis may be (though in general there are many
different semantic theories and representations, just as there are many different
grammar formalisms).
First, let’s go back to our syntactically ambiguous sentences and see how
semantics could help:
• Time flies like an arrow.
• Fruit flies like a banana.
If we have some representation of the meanings of the different words in the sentence
we can probably rule out the silly parse. We might look up ‘banana’ (maybe in some
frame or semantic net system) and find that it is a fruit, and fruits generally don’t
fly. We might then be able to throw out the reading ‘flies like a banana’ if we made
sure that sentences which mean ‘X does something like Y’ require that X and Y can
do that thing!
Sometimes ambiguity is introduced at the stage of semantic analysis. For
example:
John went to the bank
Did John go to the river bank or the financial bank? We might want to make this
explicit in our semantic representation, but without contextual knowledge we have
no good way of choosing between them. This kind of ambiguity occurs when a
word has two possible meanings, but both of them may, for example, be nouns. To
obtain a semantic representation it helps if you can combine the meanings of the
parts of the sentence in a simple way to get at the meaning of the whole (The term
compositional semantics refers to this process). For those familiar with lambda
expressions, one way to do this is to represent word meanings as complex lambda
expressions, and just use function application to combine them. To combine a noun
phrase ‘John’ and a verb phrase ‘sleeps’ we might have:
Verb phrase meaning: λX. sleeps(X)
Noun phrase meaning: john
Apply VP meaning to NP meaning: λX. sleeps(X) (john) = sleeps(john)
The output of the semantic analysis stage may be anything from a semantic net to
an expression in some complex logic. It will partially specify the meaning of the
sentence. From ‘He went to the bank’ we might have two possible readings,
represented in predicate logic as:
∃X ∃Y male(X) ∧ wentto(X, Y) ∧ financialbank(Y)
∃X ∃Y male(X) ∧ wentto(X, Y) ∧ riverbank(Y)
Good representations for sentence meaning tend to be much more complex than
this, to properly capture tense, conditionals, etc., but this gives the general flavour.
Semantic Interpretation:
• Semantics: Meaning of utterances
• First-order logic as the representation language
• Compositional semantics: Meaning of a phrase is composed of meaning of
the constituent parts of the phrase
Exp(x) → Exp(x1) Operator (op) Exp(x2)
{x = Apply (op, x1, x2)}
Exp(x) → (Exp (x))
Exp(x) → Number(x)
Number(x) → Digit(x)
Number(x) → Number(x1) Digit(x2) {x = 10 × x1 + x2}
Digit(x) → x {0 ≤ x ≤ 9}
Operator(x) → x {x ∈ {+, –, ×, ÷}}
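The attachments above can be read as executable instructions. As an illustration (ours, and deliberately ignoring operator precedence, just as the simple grammar does), the following Python evaluator computes the semantics of a token list left to right; we use ASCII operator names and treat ÷ as integer division, which is an assumption on our part:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def number(digits):
    x = 0
    for d in digits:              # Number -> Number Digit {x = 10*x1 + x2}
        x = 10 * x + int(d)
    return x

def exp(tokens):
    x = number(tokens[0])         # Exp -> Number
    i = 1
    while i < len(tokens):        # Exp -> Exp Operator Exp {Apply(op, x1, x2)}
        x = OPS[tokens[i]](x, number(tokens[i + 1]))
        i += 2
    return x

print(exp(["12", "+", "3"]))   # 15
print(exp(["2", "*", "10"]))   # 20
```

The `number` loop is precisely the attachment {x = 10 × x1 + x2} applied digit by digit.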
Example:
John loves Mary
Loves (John, Mary)
(λy λx Loves (x, y)) (Mary) ≡ λx Loves (x, Mary)
(λx Loves (x, Mary)) (John) ≡ Loves (John, Mary)
S (rel (obj)) → NP (obj) VP (rel)
VP (rel (obj)) → Verb (rel) NP (obj)
NP (obj) → Name (obj)
Name (John) → John
Name (Mary) → Mary
Verb (λy λx Loves (x, y)) → loves
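The same derivation can be run directly, with Python functions standing in for the lambda expressions (an illustrative sketch; the tuple ('Loves', x, y) is just our stand-in for the logical form Loves(x, y)):

```python
def loves(y):                      # Verb meaning: λy. λx. Loves(x, y)
    return lambda x: ("Loves", x, y)

vp = loves("Mary")                 # VP(rel(obj)) -> Verb(rel) NP(obj)
s = vp("John")                     # S(rel(obj)) -> NP(obj) VP(rel)
print(s)                           # ('Loves', 'John', 'Mary')
```

Each grammar rule’s semantic attachment is literally a function application, which is what makes the approach compositional.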
Pragmatics
Pragmatics is the last stage of analysis, where the meaning is elaborated based on
contextual and world knowledge. Contextual knowledge includes knowledge of
the previous sentences (spoken or written), general knowledge about the world and
knowledge of the speaker.
One important task at this stage is to work out referents of expressions. For
example, in the sentence ‘he kicked the brown dog’ the expression ‘the brown dog’
refers to a particular brown dog (say, Fido). The pronoun ‘he’ refers to the particular
guy we are talking about (Fred). A full representation of the meaning of the sentence
should mention Fido and Fred.
We can often find this out by looking at the previous sentence, e.g.:
Fred went to the park.
He kicked the brown dog.
We can work out from this that ‘he’ refers to Fred. We might also guess that
the brown dog is in the park, but to figure out that we mean Fido we’d need some
extra general or contextual knowledge—that the only brown dog that generally
frequents the park is Fido. In general, this kind of inference is pretty difficult, though
quite a lot can be done using simple strategies, like looking at who’s mentioned in
the previous sentence to work out who ‘he’ refers to. Of course, sometimes there
may be two people (or two dogs) that the speaker might be referring to, e.g.:
There was a brown dog and a black dog in the park. Fred went to the park
with Jim. He kicked the dog.
In cases like this we have referential ambiguity. It is seldom quite as explicit
as this, but in general it can be a big problem. When the intended referent is unclear, a
natural language dialogue system may have to initiate a clarification subdialogue,
asking, for example, ‘Do you mean the black one or the brown one?’
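The simple strategy mentioned earlier, of looking at who is mentioned in the previous sentences, can be sketched as follows (our illustration; the list of male names is hypothetical, and real reference resolution needs far more knowledge than this):

```python
# Naive pronoun resolution: take the most recently mentioned male name.
MALE_NAMES = {"Fred", "Jim", "John"}

def resolve_he(sentences):
    """Scan the previous sentences, most recent mention first."""
    for sent in reversed(sentences[:-1]):
        for token in reversed(sent.replace(".", "").split()):
            if token in MALE_NAMES:
                return token
    return None          # no candidate found

print(resolve_he(["Fred went to the park.", "He kicked the brown dog."]))
# Fred
print(resolve_he(["Fred went to the park with Jim.", "He kicked the dog."]))
# Jim -- with two candidates the heuristic just guesses the most recent one
```

The second call shows referential ambiguity in miniature: the heuristic silently picks Jim, which may well be the wrong referent.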
Anyway, another thing that is often done at this stage of analysis (pragmatics)
is to try and guess at the goals underlying utterances. For example, if someone asks
how much something is you generally assume that they have the goal of (probably)
buying it. If you can guess at people’s goals you can be a bit more helpful in
responding to their questions. So, an automatic airline information service, when
asked when the next flight to Paris is, shouldn’t just say ‘6 pm’ if it knows the flight
is full. It should guess that the questioner wants to travel on it, check that this is
possible, and say ‘6 pm, but it’s full. The next flight with an empty seat is at 8 pm.’
Pragmatic Interpretation
• Adding context-dependent information about the current situation to each
candidate semantic interpretation
• Indexicals: Phrases that refer directly to the current situation
Ex.: ‘I am in Boston today’
(‘I’ refers to speaker and ‘today’ refers to now)
Generation
We have so far only discussed natural language understanding. However, you should
be aware also of the problems in generating natural language. If you have something
you want to express (e.g. eats (John, chocolate)), or some goal you want to achieve
(e.g. get Fred to close the door), then there are many ways you can achieve that
through language:
He eats chocolate.
It’s chocolate that John eats.
John eats chocolate.
Chocolate is eaten by John.
Close the door.
It’s cold in here.
Can you close the door?
A generation system must be able to choose appropriately from among the
different possible constructions, based on knowledge of the context. If a complex
text is to be written, it must further know how to make that text coherent.
Anyway, that’s enough on natural language. The main points to understand
are roughly what happens at each stage of analysis (for language understanding),
what the problems are and why, and how to write simple grammar in Prolog’s DCG
formalism.
Language Generation
The same DCG can be used for parsing and generation
• Parsing:
Given: S (sem, [John, loves, Mary])
Return: sem = Loves (John, Mary)
• Generation:
Given: S (Loves (John, Mary), words)
Return: words = [John, loves, Mary]
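For the simplest case, realizing a logical form as a word list can be sketched directly (our illustration; a real DCG gets this bidirectionality for free from Prolog’s search, rather than from a hand-written mapping like this):

```python
VERB_FOR = {"Loves": "loves"}      # hypothetical relation-to-verb table

def generate(sem):
    """Map a logical form ('Loves', 'John', 'Mary') to a word list."""
    rel, subj, obj = sem
    return [subj, VERB_FOR[rel], obj]

print(generate(("Loves", "John", "Mary")))   # ['John', 'loves', 'Mary']
```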
• Ambiguity
Lexical ambiguity
‘the back of the room’ vs ‘back up your files’
‘In the interest of stimulating the economy, the government lowered the interest
rate.’
Syntactic ambiguity (structural ambiguity)
‘I smelled a wumpus in 2,2’
Semantic ambiguity
‘the IBM lecture’
Pragmatic ambiguity
‘I’ll meet you next Friday’
Discourse Understanding
• Discourse: Multiple sentences
• Reference resolution: The interpretation of a pronoun or a definite noun phrase
that refers to an object in the world.
‘John flagged down the waiter. He ordered a ham sandwich.’
‘He’ refers to ‘John’
‘After John proposed to Mary, they found a preacher and got married. For the
honeymoon, they went to Hawaii.’
‘they’? ‘the honeymoon’?
• Structure of coherent discourse: Sentences are joined by coherence relations
• Examples of coherence relations between S1 and S2:
Enable or cause: S1 brings about a change of state that causes or enables
S2
‘I went outside. I drove to school.’
Explanation: The reverse of enablement, S2 causes or enables S1 and is
an explanation for S1.
‘I was late for school. I overslept.’
Exemplification: S2 is an example of the general principle in S1.
‘This algorithm reverses a list. The input [A, B, C] is mapped to [C, B, A].’
Etc.
7.2.2 Why is Language Processing Difficult?
Consider trying to build a system that would answer e-mails sent by customers to a
retailer selling laptops and accessories via the Internet. This might be expected to
handle queries such as the following:
• Has my order number 4291 been shipped yet?
• Is FD5 compatible with a 565G?
• What is the speed of the 565G?
Assume the query is to be evaluated against a database containing product and
order information, with relations such as the following:
ORDER
Order number    Date ordered    Date shipped
4290            2/12/08         2/1/09
4291            2/12/08         2/1/09
4292            2/12/08

USER: Has my order number 4291 been shipped yet?
DB QUERY: Order (number=4291, date shipped=?)
RESPONSE TO USER: Order number 4291 was shipped on 2/1/09
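A pattern-based sketch of this exchange (ours; the regular expression and the in-memory schema are hypothetical) shows why such systems look deceptively easy to build:

```python
import re

# Hypothetical in-memory ORDER relation: number -> (ordered, shipped)
ORDERS = {4290: ("2/12/08", "2/1/09"),
          4291: ("2/12/08", "2/1/09"),
          4292: ("2/12/08", None)}

def answer(query):
    """Match the 'has my order shipped' pattern and look up the answer."""
    m = re.search(r"order number (\d+).*shipped", query, re.IGNORECASE)
    if not m:
        return "Sorry, I did not understand the question."
    number = int(m.group(1))
    _ordered, shipped = ORDERS[number]
    if shipped is None:
        return f"Order number {number} has not been shipped yet."
    return f"Order number {number} was shipped on {shipped}."

print(answer("Has my order number 4291 been shipped yet?"))
# Order number 4291 was shipped on 2/1/09.
```

The discussion that follows explains why this pattern approach breaks down as soon as similar strings start meaning different things.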
It might look quite easy to write patterns for these queries, but very similar strings
can mean very different things, while very different strings can mean much the
same thing. 1 and 2 below look very similar but mean something completely different,
while 2 and 3 look very different but mean much the same thing.
1. How fast is the 565G?
2. How fast will my 565G arrive?
3. Please tell me when I can expect the 565G I ordered.
While some tasks in NLP can be done adequately without having any sort of account
of meaning, others require that we can construct detailed representations that will
reflect the underlying meaning rather than the superficial string.
In fact, in natural languages (as opposed to programming languages),
ambiguity is ubiquitous, so exactly the same string might mean different things. For
instance in the query:
Do you sell Sony laptops and disk drives?
Check Your Progress
1. List the three main stages of speech analysis.
2. State the two important kinds of parsers used in natural language.
The user may or may not be asking about Sony disk drives. We’ll see lots of
examples of different types of ambiguity in these lectures.
Often, humans have knowledge of the world which resolves a possible
ambiguity, probably without the speaker or hearer even being aware that there is a
potential ambiguity. But hand-coding such knowledge in NLP applications has turned
out to be impossibly hard to do for more than very limited domains: the term AI-complete is sometimes used (by analogy to NP-complete), meaning that we’d have
to solve the entire problem of representing the world and acquiring world knowledge.
The term AI-complete is intended somewhat jokingly, but conveys what’s probably
the most important guiding principle in current NLP: we’re looking for applications
which don’t require AI-complete solutions, i.e., ones where we can work with very
limited domains or approximate full world knowledge by relatively simple
techniques.
7.3 SYNTACTIC PROCESSING
The most basic unit of linguistic structure appears to be the word. Morphology is
the study of the construction of words from more basic components corresponding
roughly to meaning units. In particular, a word may consist of a root form plus
additional affixes. For example, the word ‘friend’ can be considered a root form
from which the adjective ‘friendly’ is constructed by adding the suffix ‘-ly’. This form
can then be used to derive the adjective ‘unfriendly’ by adding a prefix. Noun
phrases (abbreviated as NP) are used to refer to things like objects, concepts, places,
events, qualities and so on. While noun phrases are used to refer to things,
sentences (abbreviated as S) are used to assert, query or bring about some
partial description of the world. You can build more complex sentences from smaller
sentences by allowing one sentence to include another as a sub-clause. To examine
how the syntactic structure of a sentence can be computed, we must consider two
things:
1. Grammar, which is a formal specification of the structures allowable in the
language, and
2. Parsing technique, which is the method of analysing the sentence to determine
its structure according to the grammar.
The most common way of representing a sentence structure is to use a tree-like
representation that outlines how a sentence is broken into its major subparts.
For example, consider the following sentence: ‘Shyam ate the bun’.
It can be represented as a parse tree (the diagram is omitted here), in which the node labels are:
S – Sentence
NP – Noun Phrase
VP – Verb Phrase
NAME – Name (proper noun)
ART – Article
To construct such descriptions, you must know what structures are legal for English.
A set of rules called rewrite rules describe the tree structures that are allowable.
These rules say that a certain symbol may be expanded in the tree by a sequence of
other symbols. For instance, a set of rules that would allow the tree structure
described above is the following:
S → NP VP
VP → VERB NP
NP → NAME
NP → ART NOUN
Here, S may consist of an NP followed by a VP, a VP may consist of a VERB followed by
an NP, and so on. Grammars consisting entirely of rules of the form
<Sym> → <Sym>1 … <Sym>n, for n >= 1
are called Context-Free Grammars (CFGs).
Symbols that cannot be further decomposed in a grammar, such as NOUN, ART
and VERB in the above example, are called terminal symbols. The other symbols
such as NP, VP, S are called non-terminal symbols.
Two common and simple techniques for CFGs are known as top-down parsing
and bottom-up parsing. Top-down parsing begins by starting with the symbol S and
rewriting it, say, to NP VP. These symbols may then themselves be rewritten as per
the rewrite rules. Finally, terminal symbols such as NOUN may be rewritten by a
word, such as rat, which is marked as a noun in the lexicon. Thus, in top-down
parsing you use the rules of the grammar in such a way that the right hand side of
the rule is always used to rewrite the symbol on the left hand side. In bottom-up
parsing, we start with the individual words and replace them with their syntactic
categories.
A possible top-down parse of ‘Shyam ate the bun’ is:
S → NP VP
→ NAME VP
→ Shyam VP
→ Shyam VERB NP
→ Shyam ate NP
→ Shyam ate ART NOUN
→ Shyam ate the NOUN
→ Shyam ate the bun.
A possible bottom-up parse of the sentence ‘Shyam ate the bun’ is:
→ NAME ate the bun
→ NAME VERB the bun
→ NAME VERB ART bun
→ NAME VERB ART NOUN
→ NP VERB ART NOUN
→ NP VERB NP
→ NP VP
→S
Another grammar representation is often more convenient for visualizing
the grammar. This formalism is based upon the notion of a transition network
consisting of nodes and labelled arcs. Consider a network named S whose arcs are
labelled NOUN, VERB and NOUN in sequence (the diagram is omitted here).
Each arc is labelled with a word category. Starting at a given node, you can traverse
an arc if the current word in the sentence is in the category on the arc. If the arc is
followed, the current word is updated to the next word. This network recognizes
the same set of sentences as the following context-free grammar:
S → NOUN S1
S1 → VERB S2
S2 → NOUN
Example:
The sentence ‘Shyam ate mango’ is accepted by the above grammar.
The Simple Transition Network (STN) formalism is not powerful enough to
describe all languages that can be described by a CFG. To get the descriptive power
of CFGs, we need a notion of recursion in the network grammar. A Recursive
Transition Network (RTN) is like a simple transition network, except that it allows
arc labels that refer to other networks rather than word categories.
7.4 AUGMENTED TRANSITION NETWORKS
An augmented transition network (ATN) is a top-down parsing procedure that allows
various kinds of knowledge to be incorporated into the parsing system so it can
operate efficiently.
Suppose we want to collect the structure of legal sentences in order to further
analyse them. One structure seen earlier was the syntactic parse tree that motivated
context-free grammars. It is more convenient, though, to represent structure using a
slot filler representation, which reflects more the functional role of phrases in
a sentence.
For instance, you could identify one particular noun phrase as the syntactic
subject (SUBJ) of a sentence and another as the syntactic object of the verb (OBJ).
Within noun phrases, you might identify the determiner structure, adjectives and
the head noun, and so on. Thus, the sentence ‘Shyam found a diamond’ might be
represented by the structure:
(S SUBJ (NP NAME Shyam)
   MAIN-V found
   TENSE PAST
   OBJ (NP DET a
        HEAD diamond))
Such a structure is created by the Recursive Transition network parser by allowing
each network to have a set of registers. Registers are local to each network. Thus
each time a new network is pushed, a new set of empty registers is created. When
the network is popped, the registers disappear. Registers can be set to values, and
values can be retrieved from registers. In this case, the registers will have the names
of the slots used for each of the preceding syntactic structures. Thus the NP
network has registers named DET, ADJS, HEAD and NUM. Registers are set by
actions that can be specified on the arcs. When an arc is followed, the actions
associated with it are executed. The most common action involves setting a register
to a certain value. When a pop arc is followed, all the registers set in the current
network are automatically collected to form a structure consisting of the network
name followed by a list of the registers with their values. An RTN with registers,
and tests and actions on those registers, is an Augmented Transition Network.
The registers can be used to make the grammar more selective without
unnecessarily complicating the network. For instance, in English, the number of
the first noun phrase must correspond to the number of the verb. Thus, you cannot
have a singular first NP and a plural verb, as in ‘The cat are sleeping’. More importantly,
number agreement is crucial in disambiguating some sentences. Consider the contrast
between ‘flying planes is dangerous’, where the subject is the activity of flying
planes, and ‘flying planes are dangerous’, where the subject is a set of planes that
are flying. Without checking the subject-verb agreement, the grammar could not
distinguish between these two readings. You could encode number information by
classifying all words as to whether they are singular or plural; then you could build
up a grammar for singular sentences and another for plural sentences. In an RTN, this
approach amounts to duplicating each network, one copy for each number value (the
figure is omitted here). But the method doubles the size of the grammar. In addition,
to check some other feature, you would need to double the size of the grammar again.
Besides the problem of size, this approach also forces you to make the singular/plural
distinction even when it is not necessary to check it.
A better solution is to allow words and syntactic structures to have features as
well as a basic category. We can do so by using the slot value list notation introduced
earlier for syntactic structure. This notation allows you to store number information
as well as other useful information about the word in a data structure called the
lexicon. The following table shows a lexicon with the root and number information
for some words, in which ‘3S’ means third person singular and ‘3P’ means third
person plural.
Word     Representation
cats     (NOUN ROOT cat NUM {3P})
cat      (NOUN ROOT cat NUM {3S})
the      (ART ROOT the NUM {3S 3P})
a        (ART ROOT a NUM {3S})
cried    (VERB ROOT cry NUM {3S 3P})
hates    (VERB ROOT hate NUM {3S})
hate     (VERB ROOT hate NUM {3P})
Shyam    (NAME ROOT Shyam)

Check Your Progress
3. List the two common techniques for CFGs.
4. The simple transition network (STN) formalism is not powerful enough to describe all languages that can be described by a CFG. (True or False?)
5. An augmented transition network (ATN) is a bottom-up parsing procedure that allows various kinds of knowledge to be incorporated into the parsing system so it can operate efficiently. (True or False?)
Now extend the RTN formalism by adding a test to each arc in the network. A test
is simply a function that is said to succeed if it returns a non-empty value, such as a
set or atom, and to fail if it returns the empty set or nil. If a test fails, its arc is not
traversed.
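Putting registers and tests together, here is an illustrative ATN-style sketch (ours, with a cut-down lexicon in the slot-value style above); each NUM test intersects feature sets, so the arc fails exactly when agreement fails:

```python
LEXICON = {                       # word -> (category, NUM feature set)
    "the":  ("ART",  {"3S", "3P"}),
    "cat":  ("NOUN", {"3S"}),
    "cats": ("NOUN", {"3P"}),
    "hates": ("VERB", {"3S"}),
    "hate":  ("VERB", {"3P"}),
}

def parse_np(words, i):
    """NP -> ART NOUN; returns (next position, structure) or None."""
    if i + 1 >= len(words):
        return None
    art_cat, art_num = LEXICON[words[i]]
    noun_cat, noun_num = LEXICON[words[i + 1]]
    agree = art_num & noun_num          # agreement test on the NOUN arc
    if art_cat != "ART" or noun_cat != "NOUN" or not agree:
        return None
    return i + 2, ("NP", {"DET": words[i], "HEAD": words[i + 1],
                          "NUM": agree})

def parse_s(words):
    """S -> NP VERB NP, with a subject-verb agreement test."""
    subj = parse_np(words, 0)
    if subj is None:
        return None
    i, subj_struct = subj
    if i >= len(words):
        return None
    verb_cat, verb_num = LEXICON[words[i]]
    if verb_cat != "VERB" or not (subj_struct[1]["NUM"] & verb_num):
        return None                      # test fails: arc not traversed
    obj = parse_np(words, i + 1)
    if obj is None or obj[0] != len(words):
        return None
    return ("S", {"SUBJ": subj_struct, "MAIN-V": words[i], "OBJ": obj[1]})

print(parse_s("the cat hates the cats".split()) is not None)   # True
print(parse_s("the cats hates the cat".split()) is not None)   # False
```

The dictionaries returned here play the role of the registers: popping a network collects them into the slot-filler structure, just as described for the NP network above.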
7.5 SEMANTIC ANALYSIS
Deriving the syntactic structure of a sentence is just one step toward the goal of
building a model of the language understanding process; a sentence’s meaning is
as important as its structure. While it is difficult to define precisely, the sentence
meaning allows you to conclude that the following two sentences are saying much
the same thing:
• I gave a contribution to the local party.
• The local party received a donation from me.
Sentence meaning, coupled with general knowledge, is also used in question
answering. Thus, if it is asserted that ‘Giri drove to the office’, then the answer to
the question, ‘Did Giri get into a car?’, is ‘yes’. To answer such a question, you
need to know that the action described by the verb ‘drive’ is an activity that must
be done in a car.
The approach taken here divides the problems of semantic interpretation into
two stages. In the first stage, the appropriate meaning of each word is determined and
these meanings are combined to form a logical form (LF). The logical form is then
interpreted with respect to contextual knowledge, resulting in a set of conclusions
that can be made from the sentence.
An intermediate semantic representation is desirable because it provides a
natural division between two separate, but not totally independent, problems. The
first problem concerns word sense and sentence ambiguity. Just as many words fall
into multiple syntactic classes, many have several different meanings or senses. The
verb ‘go’, for instance, has over 50 distinct meanings. Even though most words in a
sentence have multiple senses, a combination of words in a phrase often has only a
single sense, since the words mutually constrain each other’s possible interpretations.
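This mutual constraint can be sketched as a filter over candidate sense combinations. In the toy example below, every sense name and selectional restriction is invented purely for illustration; real sense inventories are far larger and richer.

```python
# Each word lists its candidate senses. A verb sense records the semantic
# type it expects of its object; a noun sense records its own type.
SENSES = {
    "go":   [{"sense": "go-travel",   "object_type": "place"},
             {"sense": "go-function", "object_type": "machine"}],
    "bank": [{"sense": "bank-river",  "type": "place"},
             {"sense": "bank-money",  "type": "institution"}],
}

def compatible_pairs(verb, noun):
    """Keep only the sense pairs in which the verb's expected object
    type matches the noun sense's type: the words constrain each other."""
    return [(v["sense"], n["sense"])
            for v in SENSES[verb]
            for n in SENSES[noun]
            if v["object_type"] == n["type"]]

print(compatible_pairs("go", "bank"))  # only the travel/river-bank pair survives
```

Out of the four possible sense combinations for ‘go’ and ‘bank’ here, only one satisfies the constraint, which is exactly how a phrase can end up with a single sense even though each word alone is ambiguous.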
The second problem involves using knowledge of the world and the present
context to identify the particular consequences of a certain sentence. For instance,
if you hear the sentence, ‘The principal has resigned’, you must use knowledge of
the situation to identify who the principal is (and what college he or she heads) and
to conclude that the person no longer holds a position in that college. Depending on
the circumstances, you might also infer other consequences, such as the need for a
replacement principal. In fact, the consequences of a given sentence could have far
reaching effects. Most attempts to formalize this process involve encoding final
sentence meaning in some form of logic and modelling the derivation of
consequences as a form of logical deduction.
The first process just described, which is known as word sense
disambiguation, is not easily modelled as a deductive process. The components of
meaning (word meanings, phrase meanings and so on) often correspond at best to
fragments of logic, such as predicate names, types and so on, over which deduction
is not defined. For example, when analysing the meaning of the following sentences,
you might use the predicate BREAK to model the verb meaning.
1. Sasi broke the window with a hammer.
2. The hammer broke the window.
3. Sasi broke the window.
4. The window broke.
Furthermore, assume that BREAK takes three arguments: the breaker, the thing
broken and the instrument used for breaking. To analyse these sentences, you must
utilize information about the semantics of the NPs Sasi, the hammer and the window,
together with the syntactic form of the sentence, to decide that the subject NP
corresponds to the third argument in sentence 2, the first argument in sentences 1
and 3, and the second argument in sentence 4.
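A sketch of this argument-mapping decision might look like the following. The role-assignment heuristics are deliberately simplified, and every name and the tiny ‘instrument’ check are assumptions made up for this illustration.

```python
def is_instrument(np):
    """Toy semantic check: hammers are tools, people and windows are not."""
    return np.lower() in {"the hammer"}

def break_args(subject=None, obj=None, with_pp=None):
    """Assign the NPs of a sentence to BREAK's three argument slots:
    (breaker, thing broken, instrument), based on the syntactic form."""
    if obj is not None:
        if subject is not None and with_pp is None and is_instrument(subject):
            # 'The hammer broke the window': the subject is the instrument.
            return {"breaker": None, "broken": obj, "instrument": subject}
        # 'Sasi broke the window (with a hammer)': subject is the breaker.
        return {"breaker": subject, "broken": obj, "instrument": with_pp}
    # 'The window broke': intransitive, the subject is the thing broken.
    return {"breaker": None, "broken": subject, "instrument": None}

print(break_args("Sasi", "the window", "a hammer"))   # sentence 1
print(break_args("The hammer", "the window"))         # sentence 2
print(break_args("The window"))                       # sentence 4
```

The point of the sketch is that the same surface position (subject) fills three different argument slots across the four sentences, so the mapping must consult the semantics of the NPs, not just their positions.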
Semantic Grammars
A semantic grammar is a context-free grammar in which the choice of non-terminals
and production rules is governed by semantic as well as syntactic function. There is
usually a semantic action associated with each grammar rule. The result of parsing
and applying all the associated semantic actions is the meaning of the sentence.
This close coupling of semantic actions to grammar rules works because the grammar
rules themselves are designed around key semantic concepts. Let us look at an
example of semantic grammar:
S → what is FILE-PROPERTY of FILE?
    {query FILE.FILE-PROPERTY}
S → I want to ACTION
    {command ACTION}
FILE-PROPERTY → the FILE-PROP
    {FILE-PROP}
FILE-PROP → extension | protection | creation date | owner
    {value}
FILE → FILE-NAME | FILE1
    {value}
FILE1 → USER’s FILE2
    {FILE2.owner: USER}
FILE1 → FILE2
    {FILE2}
FILE2 → EXT file
    {instance: file-struct, extension: EXT}
EXT → .init | .txt | .lsp | .for | .ps | .mss
    {value}
ACTION → print FILE
    {instance: printing, object: FILE}
ACTION → print FILE on PRINTER
    {instance: printing, object: FILE, printer: PRINTER}
USER → Bill | Susan
    {value}
A semantic grammar can be used by a parsing system in exactly the same way in
which a strictly syntactic grammar could be used. Several existing parsing systems
that have used semantic grammars have been built around an ATN parsing system,
since it offers a great deal of flexibility.
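To make the coupling of rules and semantic actions concrete, the following sketch implements a pared-down subset of the grammar above as a simple recursive-descent parser. The rule set, token format and meaning structures here are simplifications invented for illustration, not the design of any particular system.

```python
# Each non-terminal maps to alternatives; each alternative pairs a
# right-hand side (terminals and non-terminals) with a semantic action
# that builds the meaning from its children's meanings.
GRAMMAR = {
    "S": [
        (["what", "is", "the", "FILE-PROP", "of", "FILE"],
         lambda c: {"query": {"property": c[3], "file": c[5]}}),
    ],
    "FILE-PROP": [
        (["extension"], lambda c: "extension"),
        (["owner"], lambda c: "owner"),
    ],
    "FILE": [
        (["USER", "'s", "FILE2"], lambda c: {**c[2], "owner": c[0]}),
        (["FILE2"], lambda c: c[0]),
    ],
    "FILE2": [
        (["EXT", "file"],
         lambda c: {"instance": "file-struct", "extension": c[0]}),
    ],
    "EXT": [([e], lambda c: c[0]) for e in (".lsp", ".txt", ".for")],
    "USER": [(["Bill"], lambda c: "Bill"), (["Susan"], lambda c: "Susan")],
}

def parse(symbol, tokens, pos):
    """Try each alternative for `symbol`; return (meaning, new_pos) or None."""
    for rhs, action in GRAMMAR[symbol]:
        children, p, ok = [], pos, True
        for item in rhs:
            if item in GRAMMAR:                          # non-terminal
                result = parse(item, tokens, p)
                if result is None:
                    ok = False
                    break
                meaning, p = result
                children.append(meaning)
            elif p < len(tokens) and tokens[p] == item:  # terminal
                children.append(item)
                p += 1
            else:
                ok = False
                break
        if ok:
            # Parsing and applying the semantic action together
            # yield the meaning of this constituent.
            return action(children), p
    return None

tokens = "what is the extension of Susan 's .lsp file".split()
meaning, _ = parse("S", tokens, 0)
print(meaning)
```

Notice that the non-terminals (FILE, USER, EXT) are semantic categories rather than purely syntactic ones, so the result of a successful parse is already the meaning of the sentence.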
7.6 DISCOURSE AND PRAGMATIC PROCESSING
To understand even a single sentence, it is necessary to consider the discourse and
pragmatic context in which the sentence was uttered. There are a number of important
relationships that may hold between phrases and parts of their discourse contexts,
including:
• Identical entities.
Ex. — Ram had a red balloon.
Sam wanted it.
The word ‘it’ should be identified as referring to the red balloon. Such references
are called anaphoric references or anaphora.
• Parts of entities.
Ex. — Rani opened the book she just bought.
The title page was torn.
The phrase ‘the title page’ should be recognized as part of the book bought by Rani.
• Parts of actions.
Ex. — Sam went on a business trip to New York.
He left on an early morning flight.
Taking a flight should be recognized as part of going on a trip.
• Entities involved in actions.
Ex. — My house was broken into last week.
They took the TV and stereo.
The pronoun ‘they’ should be recognized as referring to the burglars who broke
into the house.
• Elements of sets.
Ex. — The decals we have in stock are stars, the moon, item and a flag.
I’ll take two moons.
To understand the second sentence, we must use the context of the first sentence
to establish that the word ‘moons’ means moon decals.
• Names of individuals.
Ex. — John went to the movie.
‘John’ should be understood to refer to some person named John.
• Causal chains.
Ex. — There was a big snow storm yesterday.
The schools were closed today.
The snow should be recognized as the reason that the schools were closed.
• Planning Sequences.
Ex. — Shalu wanted a new car.
She decided to get a job.
Shalu’s sudden interest in a job should be recognized as arising out of her
desire for a new car and thus for the money to buy a car.
• Illocutionary force.
Ex. — It sure is cold in here.
In many circumstances, this sentence should be recognized as having, as its
intended effect, that the hearer should do something like close the window or
turn up the thermostat.
• Implicit presuppositions.
Ex. — Did Sam fail CS101?
The speaker’s presuppositions, which include the facts that CS101 is a valid
course, that Sam is a student and that Sam took CS101, should be recognized by
the hearer.
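Even the simplest of these relationships, anaphoric reference, already requires knowledge beyond the sentence itself. The deliberately naive sketch below resolves a pronoun to the most recently mentioned entity with compatible features; the recency heuristic and the feature inventory are assumptions made for illustration, and real systems need far richer models of focus and world knowledge.

```python
# Entities mentioned so far in the discourse, in order of mention,
# each with a few semantic features.
DISCOURSE = [
    {"phrase": "Ram",           "animate": True,  "number": "sing"},
    {"phrase": "a red balloon", "animate": False, "number": "sing"},
]

# The features each pronoun requires of its antecedent.
PRONOUNS = {
    "it":   {"animate": False, "number": "sing"},
    "he":   {"animate": True,  "number": "sing"},
    "they": {"animate": True,  "number": "plur"},
}

def resolve(pronoun, discourse):
    """Return the most recent compatible antecedent, or None."""
    features = PRONOUNS[pronoun]
    for entity in reversed(discourse):       # most recent first
        if all(entity.get(k) == v for k, v in features.items()):
            return entity["phrase"]
    return None

print(resolve("it", DISCOURSE))   # 'a red balloon'
print(resolve("he", DISCOURSE))   # 'Ram'
```

For the Ram/balloon example above the heuristic works, but it says nothing about part–whole links, causal chains or illocutionary force, which is why the knowledge sources listed below are needed.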
In order to be able to recognize these kinds of relationships among sentences, a
great deal of knowledge about the world is required. Programs that can do multiple
sentence understanding rely either on large knowledge bases or on strong constraints
on the domain of discourse so that only a more limited knowledge base is necessary.
The way this knowledge is organized is critical to the success of the understanding
program. We should focus on the use of the following kinds of knowledge:
1. The current focus of the dialogue.
2. A model of each participant’s current beliefs.
3. The goal-driven character of dialogue.
4. The rules of conversation shared by all participants.
7.6.1 Conversational Postulates
Unfortunately, this analysis of language is complicated by the fact that we don’t say
exactly what we mean. We often use indirect speech acts, such as ‘It sure is cold in
here’. The regularity of such indirect usage gives rise to a set of conversational postulates, which are rules
about conversation that are shared by all speakers. Usually, these rules are followed.
Some of these conversational postulates are:
• Sincerity condition
Ex: — Can you open the door?
• Reasonableness conditions
Ex: — Can you open the door?
Why do you want it open?
• Appropriateness conditions
Ex: — Who won the race?
Someone with long, dark hair.
I thought you knew all the runners.
Assuming a cooperative relationship between the parties to a dialogue, the shared
assumption of these postulates greatly facilitates communication.
7.7 GENERAL COMMENTS
• Even ‘simple’ NLP applications, such as spelling checkers, need complex
knowledge sources for some problems.
• Applications cannot be 100 per cent perfect, because full real world knowledge
is not possible.
• Applications that are less than 100 per cent perfect can be useful (humans
aren’t 100 per cent perfect anyway).
• Applications that aid humans are much easier to construct than applications
which replace humans.
• It is difficult to make the limitations of systems which accept speech or
language obvious to naive human users.
• NLP interfaces are nearly always competing with a non-language based
approach.
• Nearly all applications either do relatively shallow processing on arbitrary
input or deep processing on narrow domains. Machine translation (MT) can
be domain-specific to varying extents: MT on arbitrary text isn’t very good,
but it has some applications.
• Limited domain systems require extensive and expensive expertise to port.
Research that relies on extensive hand-coding of knowledge for small domains
is now generally regarded as a dead-end, though reusable hand coding is a
different matter.
• The development of NLP has mainly been driven by hardware and software
advances and societal and infrastructure changes, not by great new ideas.
Improvements in NLP techniques are generally incremental rather than
revolutionary.
7.8 SUMMARY
In this unit, you have learned about the tough problem of language understanding.
One interesting way to summarize the natural language understanding problem is
to view it as a constraint satisfaction problem, although many different kinds of
constraints must be considered. Constraint satisfaction nevertheless provides
a reasonable framework in which to view the whole collection of steps that together
create a meaning for a sentence. Essentially, each of the steps described in the unit
exploits a particular kind of knowledge that contributes a specific set of constraints
that must be satisfied by any correct final interpretation of a sentence.
You have also learned about syntactic processing which contributes a set of
constraints derived from the grammar of the language. Semantic processing
contributes an additional set of constraints derived from the knowledge it has about
entities that can exist in the world. The unit also discussed discourse processing
which contributes a further set of constraints that arise from the structure of coherent
discourses. And finally, pragmatics contributes yet another set of constraints. There
are many important issues in natural language processing. By combining
understanding and generation systems, it is possible to attack the problem of machine
translation, by which we understand text written in one language and then generate
it in another language.
7.9 KEY TERMS
• Natural language systems: These are developed both to explore general
linguistic theories and to produce natural language interfaces or front ends to
application systems.
• Natural language processing: It can be defined as the automatic (or semiautomatic) processing of human language.
• Syntax: The stage of syntactic analysis is the best understood stage of natural
language processing. Syntax helps us understand how words are grouped
together to make complex sentences, and gives us a starting point for working
out the meaning of the whole sentence.
7.10 ANSWERS TO ‘CHECK YOUR PROGRESS’
1. Syntactic analysis, semantic analysis and pragmatic analysis.
2. Two important kinds of parsers used in natural language processing are
transition network parsers and chart parsers.
3. Two common and simple techniques for CFGs are top-down parsing and
bottom-up parsing.
4. True
5. False
7.11 QUESTIONS AND EXERCISES
1. What is natural language processing?
2. Write the production rules necessary to check the syntax of an English noun.
The grammar shall include both proper and common nouns.
3. What is a simple transition network (STN)?
4. Describe syntactic grammars and semantic grammars.
7.12 FURTHER READING
1. www.home.yawl.com.br
2. www.cee.hw.ac.uk
3. www.comp.nus.edu
4. www.gmitweb.gmit.ie
5. www.atn.ath.cx
6. www.cs.dartmouth.edu
7. www.aiexplained.com
8. www.cl.cam.ac.uk
9. www.kbs.twi.tudelft.nl
10. www.eecs.berkeley.edu
11. www.csic.cornell.edu
12. www.claes.sci.eg
13. www.bluehawk.monmouth.edu
14. www-ist.massey.ac.nz
15. www.min.ecn.purdue.edu
16. www.crg9.uwaterloo.ca
17. www.free-esl-blogs.com
18. De, Suranjan. 1986. ‘A Framework for Integrated Problem Solving in
Manufacturing’. IIE Transactions, 9th January.