Files, Database, eCommerce
Session 10
Magister Teknik Elektro, Universitas Udayana

Managing Files

• Managing Files: Basic Concepts.
  – Data is organized in a data storage hierarchy of increasingly complex levels: bits, bytes (characters), fields, records, files, and databases.
    • A character is a letter, number, or special character.
    • A field consists of one or more characters (bytes).
    • A record is a collection of related fields.
    • A file is a collection of related records.
  – A database is, as mentioned, an organized collection of integrated files. Important to data organization is the key field, a field used to uniquely identify a record so that it can be easily retrieved and processed.

• Data storage hierarchy
  – The ranked levels of data stored in a computer: bits, bytes (characters), fields, records, files, and databases.
  – Why it’s important: Understanding the data storage hierarchy is necessary to understand how to use a database.

• Character
  – A single letter, number, or special character. Why it’s important: Characters, such as A, B, C, 1, 2, 3, #, $, %, are part of the data storage hierarchy.
  – Use an ASCII chart from the Web such as http://wls.wwco.com/ref/ascii.html to spell out your name in binary. Don’t forget the space between your first and middle names or the one between your middle and last names. How many bits were required?

• Field
  – Unit of data consisting of one or more characters (bytes). An example of a field is your name, your address, or your Social Security number.
  – Why it’s important: A collection of fields makes up a record. Also see key field.

• Record
  – Definition: Collection of related fields. An example of a record would be your name and address and Social Security number.
  – Why it’s important: Related records make up a file.

• File
  – Collection of related records.
    An example of a file is data collected on everyone employed in the same department of a company, including all names, addresses, and Social Security numbers.
  – Why it’s important: A file is the collection of data or information that is treated as a unit by the computer; a collection of related files makes up a database.
  – Often an unknown file’s extension will be the only way of finding out what application created it.

• Files are given names, called filenames.
  – Filenames also have extension names, three-letter additions such as .doc and .txt. Among the types of files are the following.
    • (1) Program files are files containing software instructions. The two most important are source program files, which contain instructions in the form written by the programmer, and executable files, which contain instructions that tell a computer how to perform a particular task.
    • (2) Data files are files that contain data.
    • (3) Other common files are ASCII files, which are text only; image files for digitized graphics; audio files, which contain digitized sound; animation/video files, used for conveying moving images; and Web files, which are files carried over the World Wide Web.

• Filename
  – The name given to a file.
  – Why it’s important: Files are given names so that they can be differentiated. Filenames also have extension names. These extensions of up to three letters are added after a period following the filename; for example, the .doc in Psychreport.doc is recognized by Microsoft Word as the extension for "document." Extensions are usually inserted automatically by the application software.

• Program files
  – Definition: Files containing software instructions. Why it’s important: Contrast data files.
  – For More Info: Below are the contents of an actual program file named bmi. This program is written in the Perl programming language, which is very popular with system administrators, programmers, and Web developers.
    The program asks the user for his or her height and weight, and then calculates the person’s BMI (Body Mass Index) based on those two inputs. Finally, the program outputs the user’s BMI, along with a statement indicating whether the user is normal weight, overweight, or obese. (You can find many pages relating to BMI on the WWW, some of which have built-in BMI calculators that do the equivalent of this Perl program.)

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Read height and weight from standard input, dropping the trailing newline.
    print "Enter your height in inches: ";
    chomp(my $height = <STDIN>);
    print "Enter your weight in pounds: ";
    chomp(my $weight = <STDIN>);

    # BMI in imperial units: weight (lb) / height (in)^2 * 703.
    my $bmi = ($weight / $height / $height) * 703;
    print "Your BMI is $bmi\n";

    # Classify the result using the program's three categories.
    if ($bmi < 25) {
        print "Normal weight: < 25 BMI\n";
    } elsif ($bmi < 30) {
        print "Overweight: 25 to 29.9 BMI\n";
    } else {
        print "Obese: 30 and above BMI\n";
    }

• Two main ways in which a storage device accesses stored data are sequential access and direct access.
  – Sequential storage means that data is stored and retrieved in sequence, as is the case with magnetic-tape storage.
  – Direct access storage means that a computer can go directly to the information you want, as in a CD player; hard disks and other types of disks are of this nature.

• Sequential storage
  – Definition: Storage system whereby data is stored and retrieved in sequence, such as alphabetically.
  – Why it’s important: An inexpensive form of storage, sequential storage is the only type of storage provided by tape, which is used mostly for archiving and backup. The disadvantage of sequential file organization is that searching for data is slow. Compare direct access storage.

Figure: Magnetic tape sequential storage system.

• Direct access storage
  – Storage system that allows the computer to go directly to the desired information. The data is retrieved (accessed) according to a unique data identifier called a key field.
    It also uses a file allocation table (FAT), a hidden on-disk table that records exactly where the parts of a given file are stored.
  – Why it’s important: This method of file organization, used with hard disks and other types of disks, is ideal for applications where there is no fixed pattern to the requests for data, for example in airline reservation systems or computer-based directory-assistance operations. Direct access storage is much faster than sequential access storage.

• Whether on magnetic tape or disk, data may be stored offline or online.
  – Offline storage means that data is not directly accessible for processing until the tape or disk has been loaded onto an input device.
  – Online storage means that stored data is randomly (directly) accessible for processing.

• Offline storage
  – System in which stored data is not directly accessible for processing until the tape or disk it’s on has been loaded onto an input device.
  – Why it’s important: The storage medium and data are not under the direct, immediate control of the central processing unit.
  – In addition to online storage and offline storage, more recently the term "near line storage" has come into existence. Like online storage, near line storage is directly accessible by the CPU. But like offline storage, users may have to wait awhile before their request for data is fulfilled.

• A database management system (DBMS) consists of programs that control the structure of a database and access to the data. The benefits of databases are file sharing, reduced data redundancy, improved data integrity, and increased security.
• Databases can be classified into four types.
  – (1) An individual database is a collection of integrated files used by one person. It could be a personal information manager, which helps people keep track of information they use daily.
  – (2) A shared database, or company database, is shared by users in one organization in one location.
  – (3) A distributed database is stored on different computers in different locations connected by a client/server network.
  – (4) A public databank is a compilation of data available to the public; many such databanks are Web sites.
  – The last three types of database should have a database administrator to coordinate activities and needs.

• Database management system (DBMS)
  – Also called a database manager; software that controls the structure of a database and access to the data. Allows users to manipulate more than one file at a time.
  – Why it’s important: This software enables sharing of data (same information is available to different users); economy of files (several departments can use one file instead of each individually maintaining its own files, thus reducing data redundancy, which in turn reduces the expense of storage media and hardware); data integrity (changes made in the files in one department are automatically made in the files in other departments); security (access to specific information can be limited to selected users).

Examples of Database Management Systems
• Oracle database
• IBM DB2
• Adaptive Server Enterprise
• FileMaker
• Firebird
• Ingres
• Informix
• Microsoft Access
• Microsoft SQL Server
• Microsoft Visual FoxPro
• MySQL
• PostgreSQL
• Progress
• SQLite
• Teradata
• CSQL
• OpenLink Virtuoso
• Daffodil DB

• Individual database
  – Collection of integrated files used by one person.
  – Why it’s important: Microcomputer users can set up their own individual databases using popular database management software; the information is stored on the hard drives of their personal computers. Today the principal database programs are Microsoft Access, Corel Paradox, and Lotus Approach. In addition, types of individual databases known as personal information managers (PIMs) can help users keep track of and manage information used on a daily basis, such as addresses, telephone numbers, appointments, to-do lists, and miscellaneous notes.
    Popular PIMs are Microsoft Outlook, Lotus Organizer, and Act.

• Shared database
  – Also called a company database; a database shared by users in one company or organization in one location. The organization owns the database, which may be stored on a server such as a mainframe. Users are linked to the database via a local area or wide area network; the users access the network through terminals or microcomputers.
  – Why it’s important: Shared databases, such as those you find when surfing the Web, are the foundation for a great deal of electronic commerce, particularly B2B commerce.

• Public databank
  – Compilation of data available to the public.
  – Why it’s important: The public databank is one of the basic types of database.
  – Web Exercise: One public database available on the Web is the Social Security Death Index.

• Database administrator
  – Person who coordinates all related activities and needs for an organization’s database.
  – Why it’s important: The DBA determines user access privileges; sets standards, guidelines, and control procedures; assists in establishing priorities for requests; prioritizes conflicting user needs; and develops user documentation and input procedures. He or she is also concerned with security: setting up and monitoring a system for preventing unauthorized access and making sure that the system is regularly backed up and that data can be recovered should a failure or disaster occur.

• Database Models. Databases can be organized in four ways.
  – (1) In a hierarchical database, fields or records are arranged in related groups resembling a family tree, with child (lower-level) records subordinate to parent (higher-level) records.
  – (2) A network database is similar to a hierarchical database but each child record can have more than one parent record.
  – (3) A relational database relates, or connects, data in different files through the use of a key field.
    Structured query language is an easy-to-use computer language for making queries to a relational database and for retrieving selected records. One feature of most query languages is query by example (QBE), which allows users to ask for information in a relational database by using a sample record to define the qualifications they want for selected records.
  – (4) An object-oriented database uses objects, software written in small, reusable chunks, as elements within database files. An object consists of data in any form and instructions on the action to be taken on the data.

• Hierarchical database
  – Database in which fields or records are arranged in related groups resembling a family tree, with child (lower-level) records subordinate to parent (higher-level) records. The parent record at the top of the database is called the root record.
  – Why it’s important: The hierarchical database is one of the common database structures.

• Network database
  – Database similar in structure to a hierarchical database; however, each child record can have more than one parent record. Thus, a child record, which in network database terminology is called a member, may be reached through more than one parent, which is called an owner.
  – Why it’s important: The network database is one of the common database structures.

• Relational database
  – Common database structure that relates, or connects, data in different files through the use of a key field, or common data element. In this arrangement there are no access paths down through a hierarchy. Instead, data elements are stored in different tables made up of rows and columns. In database terminology, the tables are called relations (files), the rows are called tuples (records), and the columns are called attributes (fields). All related tables must have a key field that uniquely identifies each row; that is, the key field must be in all tables.
  – Why it’s important: The relational database is one of the common database structures; it is more flexible than hierarchical and network database models.

• Structured Query Language (SQL)
  – Standard language used to create, modify, maintain, and query relational databases. Why it’s important: SQL further simplifies database use.
  – Historical Perspective: SQL is pronounced as "sequel." How did this acronym get such an unlikely pronunciation? The first structured query language was developed by IBM in the 1970s; its product name was "Sequel2."

• Query by example (QBE)
  – Feature of query-language programs whereby the user asks for information in a database by using a sample record to define the qualifications he or she wants for selected records.
  – Why it’s important: QBE further simplifies database use.

• Object-oriented database
  – Database that uses "objects," software written in small, reusable chunks, as elements within database files.
  – An object consists of (1) data in any form, including graphics, audio, and video, and (2) instructions on the action to be taken on the data.
  – Why it’s important: A hierarchical or network database might contain only numeric and text data. By contrast, an object-oriented database might also contain photographs, sound bites, and video clips. Moreover, the object would store operations, called methods, the programs that objects use to process themselves.

• Features of a Database Management System. A database management system may have a number of components.
  – (1) A data dictionary is a procedures document or disk file that stores the data definitions or a description of the structure of data used in the database.
  – (2) DBMS utilities are programs that allow you to maintain the database by creating, editing, and deleting data, records, and files.
  – (3) A report generator is a program for producing an on-screen or printed document from all or part of a database.
  – (4) Different users are given different user access privileges, as determined by the database administrator.
  – (5) A DBMS should have system recovery features, so the database administrator can recover the contents of the database in the event of hardware or software failure. Four approaches are: mirroring, with two copies of the database in different locations; reprocessing, in which the processing can be redone from a known past point; roll forward, a variant on reprocessing; and rollback, which is used to undo unwanted changes to the database.

• Data dictionary
  – File that stores data definitions and descriptions of database structure. It may also monitor new entries to the database as well as user access to the database. Why it’s important: The data dictionary monitors the data being entered to make sure it conforms to the rules defined during data definition. The data dictionary may also help protect the security of the database by indicating who has the right to gain access to it.

• DBMS Utilities
  – Programs that allow the maintenance of databases by creating, editing, and deleting data, records, and files. Why it’s important: DBMS utilities allow people to establish what is acceptable input data, to monitor the types of data being input, and to adjust display screens for data input.

• Report generator
  – In a database management system, a program users can employ to produce on-screen or printed-out documents from all or part of a database.
  – Why it’s important: Report generators allow users to produce finished-looking reports without much fuss.
  – Crystal Reports, http://www.businessobjects.com/product/catalog/crystalreports/ made by Crystal Decisions, is a very popular commercial report generator.

• Databases & the New Economy: E-Commerce, Data Mining, & B2B Systems.
  – Databases underpin the so-called New Economy of computer, telecommunications, and Internet companies in three ways: e-commerce, data mining, and business-to-business (B2B) systems.
  – E-commerce, or electronic commerce, is the buying and selling of products and services through computer networks; an example is Amazon.com.

• E-commerce
  – Electronic commerce; the buying and selling of products and services through computer networks. Why it’s important: By 2003, total U.S. e-commerce sales to consumers are expected to reach $108 billion, or 6% of consumer retail spending; online shopping is growing even faster than the increase in computer use, which has been fueled by the falling price of personal computers.

• Data mining
  – Computer-assisted process of sifting through and analyzing vast amounts of data in order to extract meaning and discover new knowledge.
  – Why it’s important: The purpose of DM is to describe past trends and predict future trends. Thus, data-mining tools might sift through a company’s immense collections of customer, marketing, production, and financial data and identify what’s worth noting and what’s not.
  – KDNuggets (short for "Knowledge Discovery Nuggets") provides a wealth of resources on data mining and Web mining: a free e-newsletter; lists of publications; course descriptions; info on companies and products; job listings; and much more.

• Data mining is the computer-assisted process of sifting through and analyzing vast amounts of data in order to extract meaning and discover new knowledge.
• Data mining begins with acquiring data and cleaning it of errors to yield cleaned-up data and a version of it called meta-data (which shows its origins and transformations), which are then sent to a data warehouse, a special database of cleaned-up data and meta-data.
• Three kinds of tools are used to perform data mining, or finding and analyzing tasks: query and reporting tools, multidimensional-analysis tools, and intelligent agents.
• Data mining is used in applications ranging from marketing to health to science.

• Data warehouse
  – A database containing cleaned-up data and meta-data (information about the data). Stored using high-capacity disk storage.
  – Why it’s important: Data warehouses combine vast amounts of data from many sources in a database form that can be searched, for example, for patterns not recognizable with smaller amounts of data.

• Business-to-business systems (B2B systems) allow businesses to sell to other businesses, using the Internet or a private network to cut transaction costs and increase efficiencies.

• The Ethics of Using Databases: Concerns about Accuracy & Privacy.
  – In morphing, a film image is altered pixel by pixel, so that the image becomes something else. This manipulation of digitized images and sounds raises some ethical issues. Sound performances can be misrepresented, photos may be manipulated, and video and TV images may be altered in undetectable ways and all stored in a database.
  – Databases are also limited in accuracy and completeness, since not all facts can be found in a database, nor are all data items true.
  – In addition, databases raise several concerns about privacy.
  – Finally, those who own databases may be in a position to monopolize information.

• Morphing
  – Altering a film or video image displayed on a computer screen pixel by pixel, or dot by dot.
  – Why it’s important: Morphing and other techniques of digital manipulation can produce images that misrepresent reality.
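
Worked example (relational database and SQL). The relational database slide describes tables (relations) connected through a key field, and the SQL slide describes a language for querying them. The sketch below is one possible illustration of that idea, not part of the original lecture. It uses Python's built-in sqlite3 module; the table names (employee, address), the key field name (emp_id), and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two relations sharing a key field.

    Table names, column names, and rows are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (
            emp_id INTEGER PRIMARY KEY,   -- key field: uniquely identifies a row
            name   TEXT NOT NULL
        );
        CREATE TABLE address (
            emp_id INTEGER PRIMARY KEY REFERENCES employee(emp_id),
            city   TEXT NOT NULL
        );
        """
    )
    # Each tuple (row) carries the key field so the two tables can be related.
    conn.executemany("INSERT INTO employee VALUES (?, ?)",
                     [(1, "Ayu"), (2, "Made")])
    conn.executemany("INSERT INTO address VALUES (?, ?)",
                     [(1, "Denpasar"), (2, "Ubud")])
    conn.commit()
    return conn


def name_and_city(conn, emp_id):
    """Use an SQL query to relate the two tables through the key field."""
    return conn.execute(
        "SELECT e.name, a.city "
        "FROM employee AS e JOIN address AS a ON a.emp_id = e.emp_id "
        "WHERE e.emp_id = ?",
        (emp_id,),
    ).fetchone()


if __name__ == "__main__":
    conn = build_demo_db()
    print(name_and_city(conn, 2))
```

The query never follows an access path through a hierarchy; it simply matches rows whose key field values are equal, which is the flexibility the relational model slide refers to. A lookup for a key that does not exist returns no row rather than an error.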