 Part II: Organisation and Structure of Data
Fields, records, and files
A field is a single piece of data. There are two types of field: fixed length and variable
length.
• A fixed length field always has the same size.
• A variable length field has a size that varies with the data it holds.
A record is a collection of fields related to one object that is treated as a unit for processing.
In addition, we can store our records as either fixed or variable length records.
Fixed length records have the same number of bytes (or characters) in each record, and the
same number of fields in each record. These are used when most data will fit in the field size
as they’ll be of a reasonably consistent length. This way, it’s easier for applications to process
as they can predict the amount of memory required to hold the data. Examples of these are:
• Customer records, such as customer name, address, telephone number, post code, DOB
• Car records, such as car make, model, registration number, colour
With variable length records, we use a different number of bytes in each record, and
possibly a different number of fields in each record. This gives more flexibility, but access
is slower because applications cannot predict where each record starts or how long it is.
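As a rough illustration of why fixed length records are easy to process, here is a minimal Python sketch using the standard struct module. The field widths (10, 10, 8 and 8 bytes) and the sample cars are assumptions made up for this example; they are not taken from any real file format.

```python
import struct

# Hypothetical layout: make (10 bytes), model (10), registration (8), colour (8).
# "<" means no padding is added between the fields.
CAR_FORMAT = "<10s10s8s8s"
RECORD_SIZE = struct.calcsize(CAR_FORMAT)  # every record is the same size


def pack_car(make, model, registration, colour):
    """Build one fixed length record; short fields are padded with zero bytes."""
    fields = (make, model, registration, colour)
    return struct.pack(CAR_FORMAT, *(f.encode("ascii") for f in fields))


def unpack_car(record):
    """Turn one fixed length record back into a tuple of strings."""
    raw = struct.unpack(CAR_FORMAT, record)
    return tuple(f.rstrip(b"\x00").decode("ascii") for f in raw)


def nth_record(data, n):
    """Jump straight to record n: its position is simply n * RECORD_SIZE."""
    start = n * RECORD_SIZE
    return unpack_car(data[start:start + RECORD_SIZE])


data = pack_car("Ford", "Fiesta", "AB12CDE", "Blue")
data += pack_car("Volkswagen", "Golf", "XY34ZZZ", "Red")
second_car = nth_record(data, 1)
```

Because every record has the same size, the program can calculate where any record starts without reading the records before it. With variable length records that calculation is not possible, which is one reason access is slower.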
Serial files are where the records are not stored in any particular order. To add a new record
to this file type, we simply add to the end of the file.
Sequential files are where records are stored in a particular order. This is usually sorted by
the primary key field. To add a new record to a sequential file, we create a new file, copying
the records into it until the correct point for the new record. At this point, the new record is
added, then the remaining records are appended. The file is saved, and this then replaces the
original file.
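The difference between adding to a serial file and adding to a sequential file can be sketched in Python as follows. This is only an illustration: the "files" are plain lists held in memory, each record is a hypothetical (key, name) pair, and the function names are invented for the example.

```python
def add_serial(records, new_record):
    """Serial file: no order is kept, so the new record just goes on the end."""
    records.append(new_record)


def add_sequential(old_file, new_record):
    """Sequential file: copy records into a new file, slotting the new one in place."""
    new_file = []
    inserted = False
    for record in old_file:
        # The correct point is just before the first record with a larger key.
        if not inserted and new_record[0] < record[0]:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)
    if not inserted:
        # The new key is larger than every existing key, so it goes last.
        new_file.append(new_record)
    return new_file  # this new file then replaces the original


serial = [(7, "Green"), (2, "Patel")]
add_serial(serial, (5, "Jones"))

sequential = [(2, "Patel"), (7, "Green")]
sequential = add_sequential(sequential, (5, "Jones"))
```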
Sometimes, master files are used along with transaction files. The master file is essentially a
database, containing all the records. The transaction file is a list of instructions that are to be
carried out on the database. Companies who have to run a monthly payroll usually work by
building up a transaction file of all the details that need to be altered (e.g. parking expenses,
deductions), which are all run through the database once a month before employees are paid.
Here are the steps to update a master file: (you need to remember this!)
1. transaction file is sorted into the same order as the master file
2. each transaction record is read and used to update the corresponding record in the
master file
3. new master record is written to the new master file
4. after the last transaction record is processed, remaining old master records are written
to the new master file
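The four steps above can be sketched in Python. This is a simplified illustration, not a real payroll system: the master file is assumed to be a list of (employee number, pay) records already sorted by employee number, each transaction is assumed to be an (employee number, adjustment) pair, and transactions with no matching master record are simply ignored.

```python
def update_master(old_master, transactions):
    """Return a new master file with every transaction applied."""
    # Step 1: sort the transaction file into the same order as the master file.
    transactions = sorted(transactions, key=lambda t: t[0])
    new_master = []
    t = 0
    for emp_id, pay in old_master:
        # Skip transactions that have no matching master record (sketch only).
        while t < len(transactions) and transactions[t][0] < emp_id:
            t += 1
        # Step 2: use each matching transaction to update the master record.
        while t < len(transactions) and transactions[t][0] == emp_id:
            pay += transactions[t][1]
            t += 1
        # Step 3: write the (possibly updated) record to the new master file.
        # Step 4 happens naturally: once the transactions run out, the
        # remaining old master records are copied across unchanged.
        new_master.append((emp_id, pay))
    return new_master


old_master = [(101, 2000), (102, 2500), (103, 1800), (104, 2200)]
transactions = [(103, -50), (101, 30), (101, -10)]
new_master = update_master(old_master, transactions)
```

Sorting the transactions first means both files can be read once from start to finish, which is what makes this approach practical for sequential files.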
[Diagrams: the master file update process in general, and the same process for the employee
monthly pay case, using a sorted transaction file.]
Examples of sequential files “in action” include:
• Sorted transaction file (of actions carried out on a database)
• Bank master file
• Payroll file
• Customers of a utility company
We use sequential files when records are required to be in sequential order for ease of
processing. The advantage of this is that the data file is easier (and therefore quicker) to
search through.
Database
A database is a collection of data that is organized so that its contents can easily be accessed,
managed, and updated. Computer databases typically contain collections of data records or
files, such as sales transactions, product catalogues and inventories, and customer profiles. In
our school, a database may contain all students’ details and those details can be searched and
accessed in many ways, such as to produce school reports.
A database organises data in tables that consist of multiple records. Each record in a table is
called a row. A row or record is made of multiple fields. Each field has a field name, field
length and data type, such as string/text, integer, etc.
Each record has a primary key to uniquely identify it in a database. Examples of primary
keys could be student ID, employee number, or catalogue number.
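To make the ideas of tables, fields, data types and primary keys concrete, here is a small sketch using Python's built-in sqlite3 module. The table name, field names and student details are all invented for illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each field has a name and a data type; student_id is the primary key.
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "form TEXT)"
)
conn.execute("INSERT INTO student VALUES (1001, 'A. Example', '12B')")
conn.execute("INSERT INTO student VALUES (1002, 'B. Sample', '12C')")


def try_insert(row):
    """Return True if the row was stored, False if the database rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# A second record with the same primary key is refused.
duplicate_accepted = try_insert((1001, "C. Other", "12A"))

# The primary key picks out exactly one record.
found = conn.execute(
    "SELECT name FROM student WHERE student_id = ?", (1002,)
).fetchone()
```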
For example, a table of employee records may look like this:
And a table of product inventory records may look like this:
Advantages of a database
• Reduced updating errors and increased consistency
• Greater data integrity by applying data validation and verification (see the next
sections)
• Improved data access for users through the use of queries and reports
• Improved data security by using backups, access rights restrictions and encryption
• Easier development of new application programs
Disadvantages of a database
• Database systems are complex, difficult, and time-consuming to design
• Substantial hardware and software start-up costs
• Damage to the database affects virtually all application programs
• Initial training is required for all programmers and users
Database management systems (DBMS)
The role of a DBMS is to control the creation, maintenance, and use of a database. A DBMS
provides facilities for controlling data access, enforcing data integrity, managing concurrent
access, recovering the database after failures (including restoring it from backup files), and
maintaining database security.
It is normally a database administrator’s job to maintain the databases and the DBMS.
Data verification
Data verification is performed to ensure that the data entered exactly matches the original
source.
There are three main methods of verification:
1. Double entry - entering the data twice and comparing the two copies. This effectively
doubles the workload, and as most people are paid by the hour, it costs more too.
2. Screen check/proofreading - this method involves someone checking the data
entered against the original document. This is also time consuming and costly.
3. Check digit - the last one or two digits in a code are calculated from the other digits, so
they can be used to check that the other digits are correct. Bar code readers in supermarkets
use check digits.
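As an illustration of a check digit, the sketch below uses the EAN-13 scheme found on many retail bar codes. Choosing EAN-13 is an assumption made for this example; the notes above do not say which scheme is meant. In EAN-13 the first twelve digits are weighted 1, 3, 1, 3 and so on, and the thirteenth digit is whatever brings the weighted total up to a multiple of 10.

```python
def ean13_check_digit(first12):
    """Calculate the check digit for the first 12 digits of an EAN-13 code."""
    # Digits in positions 1, 3, 5... are weighted 1; positions 2, 4, 6... are weighted 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """Check that a 13 digit code ends with the correct check digit."""
    if len(code) != 13 or not code.isdigit():
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


scanned_ok = is_valid_ean13("4006381333931")      # a correctly read code
scanned_wrong = is_valid_ean13("4006381333981")   # one digit misread
```

If a digit is misread or mistyped, the recalculated check digit no longer matches the one in the code, so the error is detected and the code can be scanned or entered again.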
Data validation
Validation is an automatic computer check to ensure that the data entered is sensible and
reasonable, but not necessarily correct. It does not check the accuracy of data.
There are a number of validation types that can be used to check the data that is being
entered.
• Format check – checks the data is in the right format. Example usage: a National Insurance
number is in the form LL 99 99 99 L where L is any letter and 9 is any digit.
• Length check – checks the data isn't too short or too long. Example usage: a password which
needs to be six letters long.
• Type check – checks the data is of the right data type. Example usage: Age is defined as
integer. Input “Twelve” instead of “12” will not be accepted.
• Lookup table – looks up acceptable values in a table. Example usage: there are only seven
possible days of the week.
• Presence check – checks that data has been entered into a field. Example usage: in most
databases a key field cannot be left blank.
• Range check – checks that a value falls within the specified range. Example usage: number of
hours worked must be less than 50 and more than 0.
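Each of the validation types above can be sketched as a short Python function. These are minimal illustrations based on the example usages given; the function names are invented, and a real system would usually report why the data was rejected rather than just returning True or False.

```python
import re

DAYS = {"Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"}


def format_check(ni_number):
    """Format check: LL 99 99 99 L (capital letters and digits)."""
    return re.fullmatch(r"[A-Z]{2} \d{2} \d{2} \d{2} [A-Z]", ni_number) is not None


def length_check(password, required=6):
    """Length check: the password must be exactly six characters long."""
    return len(password) == required


def type_check(text):
    """Type check: the text must be convertible to an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def lookup_check(day):
    """Lookup table: the value must be one of the seven days of the week."""
    return day in DAYS


def presence_check(value):
    """Presence check: the field must not be left blank."""
    return value.strip() != ""


def range_check(hours):
    """Range check: hours worked must be more than 0 and less than 50."""
    return 0 < hours < 50
```

Note that all of these only check that the data is sensible. A range check will happily accept 38 hours when the employee really worked 35.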
Part III: Data Type and Data Structures
In the world of software development, there are a number of important terms you need to be
familiar with.
In programming, there are a number of “primitive” data types, on which others are
constructed. For each, you need to be able to give examples of values it could hold, and
when you’d use it. These are...
Boolean – held as a true or false (sometimes a 0 or 1). These are the most space efficient
variables as they can exist in a single bit of RAM. Examples include “OwnsACar”,
“LateToSchool” or “LikesBiscuits”.
To store Boolean data type: 1 bit of storage is needed.
Character – a single letter, digit or punctuation mark. As with Boolean, these
are reasonably space efficient. Examples are “a”, “3” or “-”. They could be used to hold a
person’s initial, or act as a reference to a file type (e.g. j for JPEG).
To store Character data type: 1 byte of storage is needed.
String – an array of characters. This is probably one of the most common data types in
existence, and has a wide range of applications. Example strings are “Hello” or “This is an
example of a string”. This could be used to store a person’s name or a file name.
To store String data type: multiple bytes of storage are needed.
Integer – this refers to any whole number (e.g. 4, 65000 or -56). These could be used to store
a person’s age, or the number of years they’ve worked in a company.
Real – a number with a fractional part, stored in floating point form (e.g. 3.141592). These
use more memory than integers, but allow computers to store non-whole numbers for
mathematical operations.
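For comparison, here is how values of each of these types might look in Python. This is only a rough illustration: Python has no separate Character type (a single character is just a string of length one), its Real type is called float, and the variable names are either borrowed from the examples above or made up.

```python
owns_a_car = True        # Boolean: True or False
initial = "a"            # Character: in Python, a string of length one
greeting = "Hello"       # String: a sequence of characters
age = 16                 # Integer: a whole number
pi = 3.141592            # Real: called float in Python

# A string needs storage for each of its characters:
# "Hello" takes 5 bytes when stored as ASCII text.
greeting_bytes = len(greeting.encode("ascii"))
```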
If a standard variable could be thought of as a box that you can put a single value in, then an
array is a way of adding dividers to the box to put more into it. At AS level, you need to
understand arrays in both one and two dimensions. E.g.
Standard variable:
Dim x as Integer
x=6
Single dimension array:
Dim x(3) as Integer
x(0)=2
x(1)=4
x(2)=100
x(3)=289
x(0)   x(1)   x(2)   x(3)
2      4      100    289
Two dimensional array: (Don’t call them 2D arrays in the exam!)
Dim x(2,2) as Integer
X(0,0)=1
X(2,2)=67
Etc...
X(0,0)=1      X(0,1)=5      X(0,2)=7
X(1,0)=3434   X(1,1)=5432   X(1,2)=1
X(2,0)=       X(2,1)=0      X(2,2)=-67
Arrays are very useful for storing data within a program in a “virtual” table so that you can
manipulate them within your code. Many people expect them to be difficult to work with, but
are usually surprised at how easy they are. The other advantage is that you can loop through
the values they contain.
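Looping through an array can be sketched in Python, where lists play the role of arrays. This is an equivalent written for illustration, not a translation the original notes provide. The values match the Visual Basic declarations shown earlier, where Dim x(3) gives four elements (indexes 0 to 3) and Dim x(2,2) gives a grid of three rows and three columns.

```python
# Single dimension array with four elements, indexes 0 to 3.
x = [2, 4, 100, 289]

# Loop through the values to add them up.
total = 0
for value in x:
    total += value

# Two dimensional array: three rows of three columns, all starting at 0.
grid = [[0] * 3 for _ in range(3)]
grid[0][0] = 1
grid[2][2] = 67

# A loop inside a loop visits every row, then every column in that row.
grid_total = 0
for row in range(3):
    for column in range(3):
        grid_total += grid[row][column]
```

The same nested loop works however large the grid is, which is what makes arrays far more convenient than a separate variable for every value.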