Unit 3. Analysis and Specification
You may recall from Unit 2 that the specification document (or simply,
the specifications) acts as a contract between the client and the
developer. It sets out exactly what the software product that the client
is purchasing must do. Ideally, the client's requirements have been
elicited accurately, and the specifications make explicit the behavior
expected of the software product in all the circumstances that may arise
during its use.
As we mentioned earlier, the goal of the specification or system analysis
phase is to build a model of the software product that the client requires.
Pressman 1997 provides the following principles of analysis (page 278):
1. The information domain of a problem must be represented and
understood.
2. The functions that the software is to perform must be defined.
3. The behavior of the software (as a consequence of external events)
must be represented.
4. The models that depict information, function, and behavior must be
partitioned in a manner that uncovers detail in a layered (or
hierarchical) fashion.
5. The analysis process should move from essential information toward
implementation detail.
In this unit, we will look at various types of specification techniques
that address the above principles. These techniques are commonly used in
structured systems analysis, as opposed to object-oriented analysis,
which will be addressed in Unit 4. Not all systems are object-oriented,
however, nor should all systems be designed that way. Some of the
techniques and many of the ideas of the more traditional structured
systems analysis can still be valid for object-oriented analysis.


3.1 Structured Systems Analysis
3.2 Entity-Relationship Modeling
Assessments


Exercise 2
Multiple-Choice Quiz 3
© Copyright 1999-2004 iCarnegie, Inc. All rights reserved.
3.1 Structured Systems Analysis
3.1.1 Informal Specifications
3.1.2 Data Flow Diagrams
3.1.3 Process Logic
3.1.4 Data Dictionaries
3.1.5 Input Output Specifications
3.1.1 Informal Specifications
Informal specifications are, as the name says, the least formal type of
specification. They are written in a natural, human language, such as
English or French, and do not require the reader to understand any special
notation. On the positive side, this enables the most unsophisticated of
clients to understand the content of the specifications document; on the
negative side there are several potential hazards.
Readings:



Schach (4th Edition), sections 10.1–10.2.
Schach (5th Edition), sections 11.1–11.2.
Schach (6th Edition), sections 11.1–11.2.
One drawback of informal specifications is that except for the simplest
of software products, the text becomes long, verbose, and generally hard
to read and comprehend. Typically, natural language specifications are
written as a set of if-then clauses, according to the following pattern:
If some input or internal condition is met, then the software will produce
the corresponding output. It is difficult to assess whether all possible
circumstances are covered by the specifications, and, by the time the
reader has reached the end of the document, it is hard to detect whether
there are inconsistencies in the content simply because there is so much
content. To understand how this might happen, think of the directions for
filling out tax forms as a specification for how a software product for
computing taxes must operate. It is not easy to determine what one could
do when faced with so many rules and regulations, and it would be just
as hard to understand what the software should do!
Another risk related to informal specifications is that the language may
be ambiguous or vague, or may inaccurately portray what the client's
initial requirements were. Suppose you were building a simple
checkbook-balancing program and one of the clauses in the specification
reads, "When the balance in the account reaches 0, print out a big warning
and refuse to process any more debits." What does this clause actually
say about negative balances? How is the client likely to react if you
implement exactly what the specification says instead of what the
specification should have said about what the program was intended to do?
In general, informal specifications by themselves are neither a crisp nor
an accurate way of setting down the requirements for a software product.
They need, at the very least, to be augmented with more formal techniques.
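The checkbook clause can be made concrete with a short sketch. The Python below is only an illustration: the function names and the use of integer cents are assumptions, not part of any specification. It contrasts a literal reading of the clause (balance exactly 0) with the reading the client probably intended (balance at or below 0).

```python
def refuses_debits_literal(balance_cents: int) -> bool:
    """Literal reading: block debits only when the balance is exactly 0."""
    return balance_cents == 0


def refuses_debits_intended(balance_cents: int) -> bool:
    """Probable intent: block debits once the balance is 0 or negative."""
    return balance_cents <= 0


if __name__ == "__main__":
    # Compare both readings for a positive, zero, and negative balance.
    for balance in (500, 0, -250):
        print(balance,
              refuses_debits_literal(balance),
              refuses_debits_intended(balance))
```

The two functions agree for positive and zero balances and disagree only for negative ones, which is exactly the case the informal clause leaves open.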
3.1.2 Data Flow Diagrams
Data flow diagrams (DFDs) are a type of graphical notation for describing
how data flows into, out of, and within a system. The use of graphics as
a means of specifying software dates back to the 1970s.
Readings:



Schach (4th Edition), section 10.3.
Schach (5th Edition), section 11.3.
Schach (6th Edition), section 11.3.
One of the originators of data flow diagrams stated, "Graphics should be
used wherever possible," because graphics suffer less from the
ambiguities that arise in descriptive text (DeMarco 1978, ch. 10).
Different graphical schemes have been proposed, several of which are
essentially equivalent. We will use the graphical notation shown in your
textbook (Schach (4th Edition), figure 10.1 pg. 334 or Schach (5th
Edition), figure 11.1 pg. 324 or Schach (6th Edition), figure 11.1 pg.
308).
A data flow diagram captures how information or data enters and exits the
system, and how it is passed from component to component. It portrays the
logical data flow, as opposed to the control flow or process logic, which
we will discuss shortly. Note that the word "system" does not necessarily
imply a software system—one can just as easily use data flow diagrams
to describe a hardware system or an organizational system in which people
or departments are the components. In fact, a data flow diagram does not
make any commitment regarding the implementation of the system or any of
its components. The ability to "differentiate between the logical and the
physical" (DeMarco 1978, chapter 10) is a feature of data flow diagrams,
as well as other graphical representations used in specifying systems.
Pressman (Pressman 5th ed, 2000, chapter 11) says that software design
proceeds like an architect's design for a building. It starts by
expressing the totality of what is to be built. Then the details of each
piece are gradually filled in (e.g., details of dimensions come before
details about materials to be used, which in turn come before details of
lighting). Similarly, software design moves from the essential to the more
detailed. This gradual elaboration of details can be easily applied to
DFDs. A level 0 DFD (termed a fundamental system model or a context model)
just shows the entire software product as a process, with input and output
flowing into and out of it. For example, suppose you were specifying a
translation system that translates English input (text or speech) into
French. The level 0 DFD is described by the following diagram.
By partitioning the system a little more and showing an additional amount
of detail, one could imagine breaking down the translation system as
follows:
The system that is described by the level 1 DFD converts the English input,
through a process of interpretation, into an intermediate representation
of meaning that is language independent. This meaning representation is
then used to generate the corresponding meaning in French. Even though
this diagram is starting to make some assertions about how the process
of translation takes place, it still does not make any commitments to a
particular implementation. The DFD could be describing a software system
or a human interpreter.
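Read as a software system, the two processes of the level 1 DFD and the data flowing between them might be sketched as follows. This is a toy illustration under stated assumptions: the Meaning class, the function names, and the two-word lookup tables are hypothetical stand-ins and say nothing about how real interpretation or generation would work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """Language-independent intermediate representation (hypothetical)."""
    concept: str


# Toy lookup tables standing in for the real interpretation and generation steps.
ENGLISH_TO_CONCEPT = {"water": "WATER", "bread": "BREAD"}
CONCEPT_TO_FRENCH = {"WATER": "eau", "BREAD": "pain"}


def interpret(english: str) -> Meaning:
    """Process 1: English input -> meaning representation."""
    return Meaning(ENGLISH_TO_CONCEPT[english.lower()])


def generate(meaning: Meaning) -> str:
    """Process 2: meaning representation -> French output."""
    return CONCEPT_TO_FRENCH[meaning.concept]


def translate(english: str) -> str:
    """Level 0 view: the whole system as a single process."""
    return generate(interpret(english))
```

The data flow of the diagram appears here as the argument and return types: English text flows into interpret, a Meaning flows from interpret to generate, and French text flows out.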
A further refinement of the level 1 DFD might show more detail about the
interpretation process, by highlighting additional data sources and an
intermediate step in the processing of input.
You can imagine that substantially more refinement is possible, although
you will need at some point to start making some assumptions about the
actual implementation of the system. It is a significant advantage of data
flow and other types of diagrams that they can be incrementally refined
to show the workings of a system in more and more detail. For large systems,
the additional detail can give rise to extensive and very complex diagrams,
but even large diagrams will be clearer and easier to read than large
informal specifications. The levels of refinement shown in the diagrams
above are an example of in-place refinement. In addition, data flow
diagrams can show hierarchical refinement, with more general diagrams
containing placeholders for complex processes that are then expanded to
show greater detail in a separate data flow diagram. For example, the level
2 DFD could have been expanded hierarchically as shown in the following
diagram.
When do you stop refining a DFD? When you cannot decompose into
subprocesses any further without entering into algorithm design.
References
DeMarco, T. Structured Analysis and System Specification. New York:
Yourdon Press, 1978.
Pressman, Roger S. Software Engineering: A Practitioner's Approach. 5th
ed. New York: McGraw Hill, 2000.
3.1.3 Process Logic
Decision Trees
Processing Specifications (PSPECs)
Control Flow Diagrams (CFDs) for Real-Time Systems
Control Specifications (CSPECs)
Data Flow vs. Control Flow
While a dataflow diagram shows the input and output for each conceptual
component of a system, it does not specify the process logic of the system.
Process logic is how control flows within and between each of the component
processes of the system.
Readings:









Required: Schach (4th Edition), section 10.6.
Remark: This material on real-time systems is
required and is not fully covered in the
discussion below.
Required: Schach (5th Edition), section 11.6.
Remark: This material on real-time systems is
required and is not fully covered in the
discussion below.
Required: Schach (6th Edition), section 11.7.
Remark: This material on real-time systems is
required and is not fully covered in the
discussion below.
Optional: Schach (4th Edition), section 10.7.
Remark: Further reading on concurrent systems.
Optional: Schach (5th Edition), section 11.7.
Remark: Further reading on concurrent systems.
Optional: Schach (6th Edition), section 11.8.
Remark: Further reading on concurrent systems.
Optional: Schach (4th Edition), sections
10.8–10.15. Remark: Skim this material in order
to get an overview.
Optional: Schach (5th Edition), sections
11.8–11.15. Remark: Skim this material in order
to get an overview.
Optional: Schach (6th Edition), sections
11.9–11.16. Remark: Skim this material in order
to get an overview.
At this point, the actual architectural and detailed design of the
software has not yet been created, so the control information that is added
to the data flow diagram does not refer to specific conditional branching
and looping inside individual processes, but rather to how different input
or input states cause other processes to be activated.
Decision Trees
Different specifications of process logic are appropriate for different
types of software products. Some types of software compute output via a
multi-step decision based on different features of their input. Therefore,
the process logic can be depicted using a decision tree. An example of
this type of system is given in Schach (4th Edition), fig. 10.5 pg. 338
or Schach (5th Edition), fig. 11.6 pg. 329 or Schach (6th Edition), fig.
11.6 pg. 312. This type of specification would also apply to a software
product used by a parcel delivery service company. Such a software product
would determine fees for shipping parcels based on the sizes of the parcels,
the destinations, and the delivery times. The specification for the
product would need to include at least the following variables:





Parcel dimensions and/or weight
Origin and destination of the parcel
Time constraints for delivery (which will determine the means of
transport)
Extra insurance
Special handling requirements
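A decision tree over these variables might be sketched in Python as nested conditions. Everything specific here is an assumption made for illustration: the function name, the 10 kg weight band, and all fee amounts are made-up placeholders, and origin and destination are collapsed into a single international flag.

```python
def shipping_fee(weight_kg: float, international: bool, express: bool,
                 insured: bool = False, special_handling: bool = False) -> float:
    """Walk a small decision tree; every amount below is a made-up placeholder."""
    # First decision: destination.
    if international:
        # Second decision: delivery time determines the means of transport.
        base = 60.0 if express else 35.0
    else:
        base = 20.0 if express else 8.0
    # Third decision: weight band.
    if weight_kg > 10:
        base *= 2
    # Optional extras are independent yes/no branches.
    if insured:
        base += 5.0
    if special_handling:
        base += 12.0
    return base
```

Because every branch is written out, it is easy to check that each combination of the input variables leads to some fee, which is the completeness property a decision tree is meant to expose.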
Similarly, a decision tree can be applied to specify the control flow for
a translation system, such as the one described in the previous section.
The interpreter process could detect different styles of documents upon
input and utilize different subprocesses or templates for translation.
The translation templates are chosen based on the selected output language.
Therefore, the same basic processes for translating words and grammatical
structures might be used as a common resource for all documents; an
incoming letter-type document would activate the letter-translation
template, while an incoming journal-article-type document would activate
the journal-article-translation template.
A decision tree is a useful tool for specifying this kind of once-only
decision procedure, because it helps the reader realize whether all
possible combinations of input have been considered—whether the process
logic specification is complete or not. Other kinds of systems, however,
require a different type of specification.
Processing Specifications (PSPECs)
Processing specification or process specification (PSPEC) is another way
of specifying how control flows between components of the software product
based on data (input and input states). The PSPEC serves as a guide for
design of the program component that will implement the process. It is
attached to processes in a data flow diagram of the appropriate level.
It describes, in a general way, the logic of the process from input to
output. The contents of the PSPEC can consist of narrative text,
mathematical equations, tables, charts, diagrams, and/or a description
in a program design language (PDL). For example, assume your software
product had a component process that read a two-dimensional geometrical
figure and determined how many sides it had. The PSPEC written in a PDL
would look as follows:
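The following Python sketch stands in for a PDL description of that process. It assumes the figure is a simple polygon supplied as an ordered list of (x, y) vertices; the function name and the error behavior are illustrative choices, not taken from the textbook.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def count_sides(vertices: List[Point]) -> int:
    """Return the number of sides of a polygon given by its ordered vertices."""
    # Drop a repeated closing vertex, if the caller closed the outline explicitly.
    if len(vertices) > 1 and vertices[0] == vertices[-1]:
        vertices = vertices[:-1]
    # A closed figure with straight edges needs at least three vertices.
    if len(vertices) < 3:
        raise ValueError("not a closed two-dimensional figure")
    # Each vertex starts exactly one edge, so sides == vertices.
    return len(vertices)
```

Like a PSPEC, the sketch describes the logic from input to output in a general way and leaves open how the figure is actually read.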
Control Flow Diagrams (CFDs) for Real-Time Systems
Real-time systems, which monitor input continuously or semi-continuously,
iterate through different internal states of the system based on the input
received from the environment and other components of the system. In order
to specify how real-time systems process their input, the notation of data
flow diagrams must be augmented to show control flow and control
processing explicitly. Normally, a control flow diagram (CFD) is created
by stripping the data flow arrows off a data flow diagram and adding the
control information. In the diagram below, control flow for copy machine
software is superimposed onto the DFD for clarity. Solid lines are used
for data flow and dashed lines for control flow, according to Hatley and
Pirbhai's notation (Hatley & Pirbhai 1987, quoted in Pressman 5th ed. 2000,
Section 12.4.4). The notation also uses vertical bars to indicate the
presence of a control specification (CSPEC), and control flows or event
flows are shown flowing into and out of a CSPEC. All CSPEC bars in a control
flow diagram refer to the same CSPEC.
A CSPEC's contents would be similar to a PSPEC's contents with regard to
showing how the input is to be processed. For example, the CSPEC for events
start/stop, jammed, or empty would sound an alarm. The events jammed and
empty would also invoke the process perform problem diagnosis. A control
event can also be input directly into a process without going through
3.1.4 Data Dictionaries
A data dictionary is specifically used to describe the kinds of data that
are defined and must be processed within the product. The data dictionary
acts as a semi-formal grammar for defining the format of the data objects
identified during data flow analysis.
If the input data is very simple (if it contains very few items with little
internal structure) and the processing undergone by the data is
straightforward, there is no need for a data dictionary. You can just list
the operations and the numeric input. For example, for a calculator
program that processes the following kind of information, a data
dictionary is unnecessary:


Numbers: 0, positive and negative real and integer numbers
Operations: + = addition, - = subtraction, * = multiplication, /
= division, etc.
Many software products, however, need to perform more elaborate data
processing. In such systems, a data dictionary is a very useful tool for
organizing information about the data and its use in the software product.
Consider, for example, a database product used in an automatic machine
translation (AMT) system that translates text from English (the source
language in this case) to several other languages (the target languages).
Included in the data that the AMT software processes are the source words
that are input to the system, and the corresponding translations in the
target languages. If word strings such as "sleep" were the only input the
system needed, you would still not need a data dictionary, but things are
seldom as simple as they seem. For starters, the word "drink" in English
has two very different meanings:
1. drink, the noun, which is the thing you drink (a coke, coffee,
water)
2. drink, the verb, which is the action of drinking
Although in English there are many words for which the exact same string
of characters is used for nouns and verbs, in most other languages (and
often even in English, e.g. "food," "eat") the noun and the verb use
different strings of characters. For example, in Spanish:
1. drink (NOUN) = bebida
2. drink (VERB) = beber
Therefore, your representation of input words will at least need to
include, in addition to a string of characters such as "drink," the part
of speech (NOUN, VERB, etc.). In addition, the idea of "drink" (NOUN) and
"drinking" (VERB) will not always appear exactly as "drink" in the input
text. Sometimes you might find "drinks" meaning more than one drink, the
plural noun, and at other times meaning "he or she drinks," the third
person singular verb. In English, the plural of nouns and conjugation of
verbs is often regular, but you do find nouns with irregular plurals (e.g.,
"child" becomes "children") and verbs with irregular conjugations (e.g.,
"be" becomes "am," "are," "is"). Even the verb "drink" has an irregular
past ("drank" instead of "drinked").
In order to understand a variety of words used in different ways, the AMT
system will need to represent these irregularities and be able to process
them. Just to account for the type of variation in input described above,
a lot more information will be needed to represent a word than just a string.
Therefore, for each term in the English vocabulary that the AMT is expected
to process, at least the following data will be required:
Data Item Name             | Data Type            | Cardinality   | Modality
Word                       | String               | Single-valued | Mandatory
Part-of-speech             | NOUN, VERB, ADJ, ... | Single-valued | Mandatory
Plural (for NOUN)          | String               | Single-valued | Optional (if regular)
3rd person singular (VERB) | String               | Single-valued | Optional (if regular)
Past (for VERB)            | String               | Single-valued | Optional (if regular)
Transitivity (for VERB)    | TRANS, INTRANS       | Multi-valued  | Mandatory
The table above gives you an idea of the type of information that you might
want to put in a data dictionary for each data item. In addition to the
name of the data item itself, you will want to specify:



The type of the data
Its cardinality, that is, whether it can have one or more values.
In our example, you would indicate whether a verb is transitive
(must take a direct object, as in the example of "amend," because
you always amend something), intransitive (cannot take a direct
object, as in the example of "walk"), or can be used both ways (as
in the example of "move").
Its modality, that is, whether a value is mandatory (modality 1)
or optional (modality 0). In our example, you might want to omit
regular plurals for nouns and regular past tenses for verbs in order
to save space and because it's easy to generate them "on-the-fly"
by adding either an "s" or an "ed" to the noun or the verb
respectively.
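One possible machine-readable form of such a data dictionary entry is sketched below in Python. The class and method names are assumptions made for illustration. Optional items default to None, the multi-valued transitivity item is a set, and regular forms are generated on the fly by the naive suffix rule described above.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass
class WordEntry:
    word: str                                    # mandatory, single-valued
    part_of_speech: str                          # mandatory: "NOUN", "VERB", "ADJ", ...
    plural: Optional[str] = None                 # optional: stored only if irregular
    third_person_singular: Optional[str] = None  # optional: stored only if irregular
    past: Optional[str] = None                   # optional: stored only if irregular
    # Multi-valued: may hold "TRANS", "INTRANS", or both.
    transitivity: FrozenSet[str] = field(default_factory=frozenset)

    def plural_form(self) -> str:
        """Stored irregular plural, else the regular form generated on the fly."""
        return self.plural or self.word + "s"

    def past_form(self) -> str:
        """Stored irregular past, else the regular form generated on the fly."""
        return self.past or self.word + "ed"
```

For example, WordEntry("child", "NOUN", plural="children") stores its irregular plural, while WordEntry("walk", "VERB") stores no past form and relies on the suffix rule.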
In different types of software products, the data dictionary will contain
different types of items. For example, in a large software product, the
data dictionary may contain the names of all the variables, with their
types and locations, and the names of all the procedures, their types,
locations, and parameters. Depending on the application, other
information in the data dictionary might include aliases (different names
for the same item); preset values, if any; a content description, possibly
in a formal language; and manner of use (where and how the item is used,
whether as input or output, in which process). Depending on the
development environment, some of this information may be gathered
automatically.
While a data dictionary written in a human-readable format is already a
very useful input to the design phase, a data dictionary is most valuable
when it is also machine readable, and data dictionaries are usually
implemented within a Computer-Assisted Software Engineering (CASE) tool.
Other software can use a machine-readable data dictionary to check
consistency between the design/implementation and the specification, to
print out a report on the data, to check for duplicate names of data and
functional objects, or to determine display requirements for on-screen
display of the data. The information in a data dictionary can also be used
to create an entity-relationship model for object-oriented systems and
databases.
3.1.5 Input Output Specifications
The input output specifications define what input a software product must
accept and what corresponding output is expected. This is easier to
specify for some products than for others. Referring back to the
calculator example of the previous section, the input and output
specifications need to contain little more than statements of the
following sort:
INPUT:
Operator: multiplication (*)
Multiplicands: n1, n2, n3, ...
OUTPUT:
n1 * n2 * n3 * ...
On the other hand, when the product uses a forms interface to a database,
the input is more complex—many fields in the form may be changed at
once—and there may not be any visible output. The values typed in by the
user may be placed in a temporary memory store, and permanent output, like
changes in the database itself, may be delayed until the user submits the
entire form. The input output specifications only need to describe the
final effects of the input on the database, but the submit action will
also be part of the input. In contrast, if the user fills in a form and
then cancels instead of submitting the input, the combination of field
values and cancel action will give a different output—no changes to the
database.
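The submit and cancel behavior can be sketched as follows. This is a minimal illustration, assuming a plain Python dictionary stands in for the database; the class and method names are hypothetical.

```python
from typing import Dict


class FormSession:
    """Holds field edits in a temporary store until the user submits or cancels."""

    def __init__(self, database: Dict[str, str]) -> None:
        self.database = database            # stands in for the permanent database
        self.pending: Dict[str, str] = {}   # temporary memory store

    def set_field(self, name: str, value: str) -> None:
        self.pending[name] = value          # no visible output yet

    def submit(self) -> None:
        self.database.update(self.pending)  # final effect on the database
        self.pending.clear()

    def cancel(self) -> None:
        self.pending.clear()                # output: no changes to the database
```

The same field values lead to different outputs depending on the final action, which is why the specification has to treat submit and cancel as part of the input.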
As a third example, consider again the automatic machine translation
system (AMT) of the previous section. In addition to translating specific
words and phrases, the system will be expected to translate whole
sentences. Since each language (and even type of document) has its own
style of conveying the same basic content, you cannot always expect
sentences to be translated literally. So, while English may use a rather
personal and direct style to give commands in a manual, French may prefer
a more indirect rendition of the same idea. The input output
specifications would contain statements like the following:
English commands using the pronoun "you" will be translated in French
using the impersonal pronoun construction "on." For example,
ENGLISH INPUT:
"You must put the lever in position 'on.'"
FRENCH OUTPUT:
"On doit mettre le levier sur la position 'activé.'"
(Roughly equivalent to "One must put the lever in position 'activated.'")
The specification document should address both legal and illegal input.
In the case of an illegal input—for example, division by zero in a
calculator program—the product should avoid crashing if possible.
Instead, the specification should describe the error-reporting behavior
of the product. Illegal input is preferably detected before processing,
so it can be reported to be unacceptable in a graceful manner. For example:
INPUT:
Operator: division ( / )
Dividend: n1
Divisor: 0
OUTPUT:
ERROR: Illegal division by zero
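A function that satisfies this statement might look like the sketch below. The function name and the decision to return the message as a string (rather than raise an exception) are assumptions made for illustration.

```python
from typing import Union


def divide(dividend: float, divisor: float) -> Union[float, str]:
    """Return the quotient, or the specified error message for a zero divisor."""
    # Detect the illegal input before processing, as the specification asks.
    if divisor == 0:
        return "ERROR: Illegal division by zero"
    return dividend / divisor
```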
If illegal input cannot always be detected, then other types of
software-generated errors will be given as output. Preferably, cryptic
system errors are translated into language the user can understand, to
provide some information for diagnosing the source of the error.
As with all specification documents, the input output specifications
should be precise, unambiguous, complete, and consistent. This will make
it easier to trace the design document back to the specification document
and will therefore make it easier to verify the design.
3.2 Entity-Relationship Modeling
Like data dictionaries, entity-relationship (ER) modeling is a formal
technique that is oriented to specifying data as opposed to control
information. Entity-relationship modeling was used extensively, as far
back as the 1970s, for specifying databases, and as we shall explain in
Unit 4, it has more recently been adopted in object-oriented analysis.
Readings:



Schach (4th Edition), section 10.5.
Schach (5th Edition), section 11.5.
Schach (6th Edition), section 11.6.
Entity-relationship modeling is usually expressed graphically, through
an entity-relationship diagram (ERD). Like a data flow diagram and a
process description language, an entity-relationship diagram is a model
of objects and their relationships in the world and does not imply a
commitment to a specific implementation. An entity-relationship diagram
of a software product may be implemented as a relational database, as an
object-oriented system, or in other viable ways.
In an entity-relationship model, there are two types of components:
entities and relationships. Entities represent sets of distinguishable
objects in the world. In the airline database example below, the
entities are Passenger, Departure, and Flight. A relationship between
two entities describes the way in which they are associated. The
relationship between Passenger and Departure is that each passenger is
booked on one or more departures. Similarly, the relationship between
Departure and Flight is that a departure is a specific instance of a flight
on a given date.
The concept of cardinality, introduced in 3.1.4 Data Dictionaries for data
items, extends to relationships as well. In the partial
entity-relationship diagram shown below for an airline database, the
relationship between Departure and Flight has the cardinality many-to-one,
since each departure is an instance of a single flight but each flight
can have many departures. The inverse of this relationship would be
one-to-many. On the other hand, the relationship between Passenger and
Departure has the cardinality many-to-many because passengers may have
more than one booking—one for each leg of a round trip or for different
trips—and each departure will have one or more passengers. If you added
the entities Airline and Route, the relationship between them would be
another example of a many-to-many relationship because several airlines
would have flights between San Francisco and New York and each airline
would have several routes.
The cardinality one-to-one also exists, although it is rarer. For example,
in the ER model of a company where a manager manages a single department
and each department has only one manager, the relationship between the
entities Department and Manager would have the cardinality one-to-one.
The concept of modality, similarly introduced in 3.1.4 Data Dictionaries
for data items, also extends to entity-relationship modeling. An employee
can exist only if he or she works for a department, making participation
in a relationship WORKS_FOR mandatory. This is total participation. In
contrast, the relationship MANAGES, between Employee and Department, is
optional, because not every employee must manage a department. This case
demonstrates partial participation in a relationship.
In addition to showing entities and relationships, the
entity-relationship diagram might also show attributes—the properties
associated with entities. The choice between modeling an object as an
attribute of an entity or as an entity itself depends on whether you expect
such objects to participate in relationships or not. For example,
Departure is modeled as an entity above because, in addition to being
linked to Passenger via the Booked_On relationship, Departure also
participates in a relationship with Flight. On the other hand, the Name,
Address, and Phone number of the passenger do not participate in any other
relationship in this particular problem, so they can be modeled as
attributes. A subset of attributes that uniquely identifies an entity
is called the key. Sometimes a key is a single attribute, but often it
is a combination of attributes. In the above example, we cannot use the
attribute Name as a key for Passenger because there are many John Smiths
booking flights on airlines. On the other hand, the combination of Name
and Phone_Number is probably a good key.
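As one possible relational implementation of the airline model, the sketch below uses Python's standard sqlite3 module. Only the entity names, the Booked_On relationship, and the passenger attributes Name, Address, and Phone number come from the text; the column names flight_number and departure_date, the choice of TEXT types, and the function name are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE flight (
    flight_number TEXT PRIMARY KEY              -- hypothetical key attribute
);
CREATE TABLE departure (                        -- many-to-one with flight
    flight_number TEXT NOT NULL REFERENCES flight(flight_number),
    departure_date TEXT NOT NULL,               -- hypothetical attribute
    PRIMARY KEY (flight_number, departure_date)
);
CREATE TABLE passenger (
    name TEXT NOT NULL,
    phone_number TEXT NOT NULL,
    address TEXT,
    PRIMARY KEY (name, phone_number)            -- composite key from the text
);
CREATE TABLE booked_on (                        -- many-to-many relationship
    name TEXT NOT NULL,
    phone_number TEXT NOT NULL,
    flight_number TEXT NOT NULL,
    departure_date TEXT NOT NULL,
    PRIMARY KEY (name, phone_number, flight_number, departure_date),
    FOREIGN KEY (name, phone_number) REFERENCES passenger(name, phone_number),
    FOREIGN KEY (flight_number, departure_date)
        REFERENCES departure(flight_number, departure_date)
);
"""


def build_airline_db() -> sqlite3.Connection:
    """Create the four tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(SCHEMA)
    return conn
```

The many-to-one relationship becomes a mandatory foreign key in departure, while the many-to-many relationship needs its own table. With the composite key, two passengers named John Smith can coexist as long as their phone numbers differ.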
Take Assessment: Exercise 2
Entity-Relationship Modeling for WebOrder
In this exercise, you will be required to create the database using a database
management system of your choice (MySQL/PostgreSQL/Microsoft Access), and then
design and implement the database tables.
Logical Data Modeling: Create an entity-relationship model using the following steps:

Identify and model the entities.

Identify and model the relationships between the entities.

Identify and model the attributes.

Identify a primary key (and alternate keys, if necessary) for each entity.
Physical Database Design: Write physical table definitions for the database using the
following steps:

Normalize relations and define tables in the physical database using the
algorithm for mapping an ER model to the relational model. (See the Mapping
Algorithm on the Appendix A. Course Project page for more information.)

Attributes become columns in the physical database. You have to choose an
appropriate data type for each of the columns according to the data types
available in the database you select for the project. (If you choose MySQL, see
the MySQL tutorial for more information).

Primary keys are unique.

Relationships are modeled using foreign keys.

Entities become tables.
Table Creation: Implement the tables for your WebOrder system.

Create the database. (If you choose MySQL, read the MySQL Reference Manual
on the Appendix A. Course Project page).

Write table creation statements for your database tables.

Create the tables by using the table creation statements.
© Copyright 2004 iCarnegie, Inc. All rights reserved.