Information systems and databases


Information systems organise data into information
They allow information to be analysed to give knowledge
There are different types of information systems for different purposes e.g.
 to process transactions
 to help decision making (e.g. risk analysis)
 to provide users with information about the organisation (e.g. stock inventories)
 to manage information used within an organisation (e.g. fax and email)
Database information systems
A database is an organised collection of data stored and retrieved to meet the needs of users. For example:
 school databases holding information on teachers, subjects, classrooms and students
 the Roads and Traffic Authority holding information on vehicles and holders of driver's licences
 video stores holding information on borrowers and videos
Organisation (structuring of data)
Non-computer methods of organising
 Telephone books which contain names, addresses and telephone numbers organised alphabetically in columns
 Card catalogues which contain files in folders, stored in different cabinets
Computer methods of organising
 Flat file systems which hold data in one table
 Database management systems which have multiple tables with links between the tables
 Hypermedia which links texts, graphics, audio and video to allow navigation within the system
Comparison of methods
Non-computer methods
Advantages
 A small amount of data can be quickly and easily retrieved
 No special training is required
Disadvantages
 Requires large amounts of storage space
 Data retrieval can take time for complex searches
Computer methods
Advantages
 Storage – vast amounts of data can be stored in a small space
 Retrieval – for complex searches, data retrieval is quicker
 Processing – manipulation of data is more accurate
 Networking – electronic transmission is cheaper
Disadvantages
 Special training is required
 Initial and maintenance costs of hardware and software are high
Logical organisation in a flat file database
 Files: a block of data stored as one unit (a table)
 Records: one entry in the file
 Fields: a specific aspect of a record
 Key fields: fields which hold data unique to each record (primary key)
 Characters: the smallest unit of data (one character = 1 byte)
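As a rough illustration of the file, record, field and key field hierarchy, the sketch below stores a flat file as CSV text and reads it back. The student data, the field names and the `find_by_key` helper are invented for the example and are not part of the original notes.

```python
import csv
import io

# A flat file: one table stored as a single block of text.
# The field names and records below are invented for illustration.
FLAT_FILE = """student_id,name,year
S001,Ana Lee,11
S002,Ben Wu,12
S003,Cal Roy,11
"""

def load_records(text):
    """Return each record (row) as a dict mapping field name to value."""
    return list(csv.DictReader(io.StringIO(text)))

def find_by_key(records, key_field, key_value):
    """Return the one record whose key field matches, or None if absent."""
    for record in records:
        if record[key_field] == key_value:
            return record
    return None

records = load_records(FLAT_FILE)
fields = list(records[0].keys())  # the field names come from the header line
```

Here `student_id` plays the role of the key field: a lookup such as `find_by_key(records, "student_id", "S002")` can return at most one record because the value is unique to each record.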
Logical organisation in relational database
 Schemas: an organised plan of the entire database consisting of:
o Entities (table)
o Attributes (field)
o Relationships (the way entities are related)
   One to One: where there is exactly one record in the first table that corresponds to exactly one record in the related table
   One to Many: where for each instance of table A, many instances of table B exist, but for each instance of table B, only one instance of table A exists
   Many to Many: where multiple rows in table A can correspond to multiple rows in table B
 Tables: each table holds data about one entity, consisting of:
o Attributes
o Records
 Linking tables: the primary key of one table is stored as a field in a second table, where it is called a foreign key, so that related data can be accessed across the two tables. The value in the foreign key must match a value in the primary key before the records can be related.
 User views for different purposes
o List view – to display all records in the entity
o Form view – a record is displayed as if it were on a paper form to assist with data entry
o Report view – displays only the information the user has chosen for printing
o Query view – displays the results when questions are asked of the database
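The linking of tables through a primary key and a foreign key can be sketched with Python's built-in `sqlite3` module. This is only a minimal example under assumed names: the `teacher` and `subject` tables, their columns and the `try_insert_orphan` helper are hypothetical. It models a one to many relationship (one teacher, many subjects).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,          -- primary key
    name       TEXT NOT NULL
);
CREATE TABLE subject (
    subject_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)  -- foreign key
);
INSERT INTO teacher VALUES (1, 'Ms Ng'), (2, 'Mr Day');
INSERT INTO subject VALUES (10, 'Maths', 1), (11, 'Physics', 1), (12, 'Art', 2);
""")

# Follow the link: match each subject's foreign key to a teacher's primary key.
rows = conn.execute("""
    SELECT subject.title, teacher.name
    FROM subject
    JOIN teacher ON subject.teacher_id = teacher.teacher_id
    ORDER BY subject.title
""").fetchall()

def try_insert_orphan(connection):
    """Try to add a subject whose teacher_id matches no teacher record."""
    try:
        connection.execute("INSERT INTO subject VALUES (13, 'Drama', 99)")
        return True
    except sqlite3.IntegrityError:
        return False
```

The orphan insert is rejected because the foreign key value 99 has no matching primary key, which is the "must match before they can be related" rule expressed as a database constraint.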
Flat-file v Relational databases
Similarities
 Both are computer methods of organising data
 Both give users the ability to store, retrieve and update data
 Issues for both include data security, integrity and redundancy
Differences
 One table in a flat file system; multiple tables in a relational system
 Flat file systems usually contain smaller amounts of data compared to relational systems
 Flat file systems have no relationships but relational systems have multiple relationships between entities
Data modelling tools for organising databases
 Data dictionaries (describe the characteristics of data)
 Schemas (show the relationship between entities)
 Normalisation (reduces data redundancy by dividing data into tables)
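To make the idea of normalisation concrete, the sketch below splits one flat table, in which a teacher's details repeat on every row, into two linked tables. The data, the field names and the `normalise` function are assumptions made for illustration; real normalisation follows formal rules (normal forms) that this small example does not cover.

```python
# Flat (unnormalised) rows: the teacher's details repeat on every subject row.
flat_rows = [
    {"subject": "Maths",   "teacher": "Ms Ng",  "teacher_room": "B2"},
    {"subject": "Physics", "teacher": "Ms Ng",  "teacher_room": "B2"},
    {"subject": "Art",     "teacher": "Mr Day", "teacher_room": "A1"},
]

def normalise(rows):
    """Split flat rows into a teacher table and a subject table linked by teacher_id."""
    teachers = {}   # teacher name -> teacher record, stored once only
    subjects = []
    for row in rows:
        if row["teacher"] not in teachers:
            teachers[row["teacher"]] = {
                "teacher_id": len(teachers) + 1,   # new primary key
                "name": row["teacher"],
                "room": row["teacher_room"],
            }
        subjects.append({
            "subject": row["subject"],
            # Foreign key pointing back to the teacher table.
            "teacher_id": teachers[row["teacher"]]["teacher_id"],
        })
    return list(teachers.values()), subjects

teacher_table, subject_table = normalise(flat_rows)
```

After the split, each teacher's room is stored once, so changing it requires one update instead of one per subject row.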
Logical organisation of hypermedia
 Nodes (where data is stored e.g. servers) and links (a connection between related material e.g. hypertext)
 Uniform Resource Locator (URL): address of a resource on the web e.g. http://www.cambridge.edu.au/sdd/html
o http:// – Hypertext Transfer Protocol: the rule for retrieving data from the web
o www.cambridge.edu.au – Domain name: the organisation (Cambridge) and the domain in which it operates (education)
o /sdd/html – Webpage: address and name of the webpage to be retrieved and the language it is written in (hypertext mark-up language)
 HTML tags: mark-up that describes how data in a hypermedia source is to be used. The result is viewed in a web browser e.g. <B>…</B> is a tag pair that indicates the start and end of text that will appear in bold
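The parts of a URL can also be pulled apart in code. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module on the example address from these notes; the `describe_url` helper and its dictionary keys are hypothetical names chosen to match the labels above.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the parts named in the notes."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,        # rule for retrieving the data
        "domain_name": parts.hostname,   # organisation and its domain
        "path": parts.path,              # address of the page on that server
    }

url_parts = describe_url("http://www.cambridge.edu.au/sdd/html")
```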
Tools for organising hypermedia
 Storyboards: each screen is represented as a frame. Each frame shows the media elements of the screen and the navigation links to other screens.
o Storyboard layouts can be linear, hierarchical or unstructured
 HTML editors: software that allows text, graphics, audio and video to be hyperlinked e.g. FrontPage
Comparison of data organisation
Databases
 Metadata structure is data dictionaries
 Method of organising is in tables
 Language used to manipulate data is SQL
 Type of data organised is mainly text
Hypermedia
 Metadata structure is HTML tags
 Method of organising is creating links between pages
 Language used to describe data is HTML
 Uses all data types as navigational links
Storage and Retrieval
Database management systems (DBMS):
 Handle access to the database
 Are only an interface – that is, the actual data is independent of the DBMS
 Can be centralised – where the data is stored in one location
 Can be distributed – where the data is stored on several computers across the network (so data is where it is most needed)
Access
 Sequential – data is retrieved in the order in which it was saved
 Direct – data is retrieved, using a data address, directly from the location where it is stored (quicker)
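The difference between the two access methods can be sketched with an in-memory file of fixed-length records. The record size, the names and both helper functions are assumptions for illustration; the count of reads stands in for the time each method takes.

```python
import io

RECORD_SIZE = 8  # assumed: every record occupies exactly 8 bytes

def build_file(names):
    """Write fixed-length records one after another into an in-memory file."""
    data = io.BytesIO()
    for name in names:
        data.write(name.encode("ascii").ljust(RECORD_SIZE))  # pad with spaces
    return data

def read_sequential(data, wanted_index):
    """Read records in stored order until the wanted one is reached."""
    data.seek(0)
    reads = 0
    record = b""
    for _ in range(wanted_index + 1):
        record = data.read(RECORD_SIZE)
        reads += 1
    return record.decode("ascii").strip(), reads

def read_direct(data, wanted_index):
    """Jump straight to the record's address (index * record size)."""
    data.seek(wanted_index * RECORD_SIZE)
    return data.read(RECORD_SIZE).decode("ascii").strip(), 1

store = build_file(["Ana", "Ben", "Cal", "Dee"])
```

Fetching the fourth record takes four reads sequentially but a single read with direct access, because the address is calculated rather than searched for.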
Storage
 Online – data is directly accessible for processing
 Offline – data is not directly accessible for processing (it needs to be loaded into the system)
Storage media
 Hard disks: fixed metal platters; magnetic; direct access
 CD-ROM: plastic disk with a thin metal coating; optical; direct access
 Tapes: thin strip of plastic wound on reels; magnetic; sequential access
Data security (from theft, destruction or alteration):
 Encryption: data is coded to render it unreadable by anyone who does not possess the key for decryption. But codes can be cracked.
 Back-up: where copies of the data are stored separately (full back-up, differential back-up and incremental back-up). But it does not guard against the loss of data in the main source.
 Firewall: software that prevents unauthorised access to data by validating requests for access from remote computers (e.g. based on passwords). But it can slow the system for users.
Tools for database storage and retrieval
 Sorting – arranging data (ascending or descending) to extract relevant information
 Searching – querying data
o Query by example: <Fieldname> <Operator> <Criteria>. Used by participants with no programming training
o Structured Query Language (SQL):
SELECT (field names)
FROM (table names)
WHERE (search condition: field name, operator and data)
ORDER BY (field name ASC or DESC)
o Logical operators: AND, OR, NOT
o Relational operators: >, <, =
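The SELECT, FROM, WHERE and ORDER BY clauses can be tried out with Python's built-in `sqlite3` module. The `student` table, its fields and its data below are made up for this sketch; the query combines a relational operator (`<`), the logical operators `AND` and `NOT`, and a descending sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT,
    year       INTEGER,
    mark       INTEGER
);
INSERT INTO student VALUES
    (1, 'Ana', 11, 82),
    (2, 'Ben', 12, 67),
    (3, 'Cal', 11, 45),
    (4, 'Dee', 11, 91);
""")

# Year 11 students whose mark is not below 50, highest mark first.
query = """
    SELECT name, mark
    FROM student
    WHERE year = 11 AND NOT mark < 50
    ORDER BY mark DESC
"""
results = conn.execute(query).fetchall()
```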
Tools for hypermedia search and retrieval
 Free text searching – search for specific characters on the screen
 Search engine – user types key words in the search box and the search engine returns a list of related links.
o Collected data from each webpage (an index) is used by search engines to do relevancy calculations about the keywords so only the most relevant information is retrieved.
o To expand the index, search robots are used to trawl the web and collect more sites.
o Searches can be refined using + symbols between keywords (to include all the words) or quote marks (to search phrases).
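A very simplified sketch of the index idea follows: each word is mapped to the set of pages that contain it, and a search returns only the pages containing every keyword (the behaviour described for + between keywords). The page addresses and text are hypothetical, and real search engines add relevancy ranking, which this example leaves out.

```python
from collections import defaultdict

# Hypothetical pages that a search robot might have collected.
PAGES = {
    "page1.html": "relational database tables and keys",
    "page2.html": "flat file database in one table",
    "page3.html": "hypermedia links between web pages",
}

def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word].add(address)
    return index

def search(index, query):
    """Return the pages that contain every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in keywords))
    return sorted(matches)

index = build_index(PAGES)
```

Because the index is built ahead of time, a search only has to look up the keywords rather than reread every page.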
Displaying in database information systems
 Reports – organised presentation of data stored in the database. A query can be used to generate a report so only needed information is displayed.
 Forms – used to enter and view data; must use design principles such as consistency of styles and clear headings
Issues related to information systems and databases
Acknowledgement of data sources
 Copyright Act requires users to acknowledge all data sources (because all data belongs to someone)
 Duplicating data is very easy (e.g. cutting and pasting from internet)
Freedom of Information Act
 Allows individuals to access data held about them by government so they have some control
Privacy principles
 Privacy Act gives individuals rights to control their personal data
 Confidential data should be available on a ‘need to know’ basis and with the approval of the individual
 Does the data belong to the individual or the organisation that collects it?
Quality of data (its reliability and integrity)
 Accuracy – a major issue with internet data (need to check the source of the data and then cross-check against a known reputable source)
 Currency – whether the data is up to date
 Validation – checking that data is entered in the correct format and range (input controls such as radio buttons minimise entry errors)
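Format and range checks of the kind described under validation might look like the sketch below. The rules themselves (an ID of "S" plus three digits, a school year from 7 to 12) and the `validate_student` function are assumptions chosen for the example, not rules taken from the notes.

```python
import re

def validate_student(record):
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []

    # Format check: the ID must be 'S' followed by exactly three digits.
    if not re.fullmatch(r"S\d{3}", record.get("student_id", "")):
        errors.append("student_id must look like S001")

    # Range check: the school year must be a whole number from 7 to 12.
    try:
        year_ok = 7 <= int(record.get("year", "")) <= 12
    except ValueError:
        year_ok = False
    if not year_ok:
        errors.append("year must be between 7 and 12")

    return errors
```

Validation only confirms that data is reasonable in form; a record such as year "11" for a year 12 student would still pass, so accuracy has to be checked separately.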
Access to data, ownership and control of data
 Data security allows for access control to prevent hacking and cracking
 News and entertainment controlled worldwide by a few companies. Therefore, they effectively own
the data and control access to it.
Data matching to cross link data across multiple databases
 This gives more accurate data but the individual lacks control over where their data is linked
New trends
 Data warehousing – electronic storage centre of historical data for future analysis
 Data mining – software which sifts through data to detect patterns e.g. find shopping habits of customers to target advertising
 Online Analytical Processing – quickly answers complex queries using a multidimensional database. It is used for data mining.
 Online Transaction Processing – manages data entry and retrieval for a large number of users who will be simultaneously performing transactions that change real-time data e.g. banking and ticketing systems
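One feature that transaction processing systems rely on is that a transaction either completes fully or not at all. The sketch below shows this with Python's built-in `sqlite3` module for a banking-style transfer; the `account` table, the balances and the `transfer` and `balances` functions are hypothetical, and a real system would also need to handle many simultaneous users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    balance    INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts allowed
);
INSERT INTO account VALUES (1, 100), (2, 50);
""")

def transfer(connection, source, target, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with connection:  # commits on success, rolls back on any exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE account_id = ?",
                (amount, target))
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed, so nothing was changed

def balances(connection):
    """Return the current balances as {account_id: balance}."""
    rows = connection.execute("SELECT account_id, balance FROM account").fetchall()
    return dict(rows)
```

If the second update would overdraw the source account, the first update is rolled back as well, so the data never shows money that was added to one account without being taken from the other.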