Chapter 05 - Data Resource Management

© 2013 by McGraw-Hill Education. This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any manner. This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.

5 Data Resource Management

CHAPTER OVERVIEW

Chapter 5: Data Resource Management emphasizes management of the data resources of computer-using organizations. This chapter reviews key database management concepts and applications in business information systems.

LEARNING OBJECTIVES

After reading and studying this chapter, you should be able to:
1. Explain the business value of implementing data resource management processes and technologies in an organization.
2. Outline the advantages of a database management approach to managing the data resources of a business, compared with a file processing approach.
3. Explain how database management software helps business professionals and supports the operations and management of a business.
4. Provide examples to illustrate each of the following concepts:
   a. Major types of databases.
   b. Data warehouses and data mining.
   c. Logical data elements.
   d. Fundamental database structures.
   e. Database development.

SUMMARY

• Data Resource Management. Data resource management is a managerial activity that applies information technology and software tools to the task of managing an organization’s data resources. Early attempts to manage data resources used a file processing approach in which data were organized and accessible only in specialized files of data records that were designed for processing by specific business application programs. This approach proved too cumbersome, costly, and inflexible to supply the information needed to manage modern business processes and organizations. Thus, the database management approach was developed to solve the problems of file processing systems.
• Database Management. The database management approach affects the storage and processing of data. The data needed by different applications are consolidated and integrated into several common databases instead of being stored in many independent data files. Also, the database management approach emphasizes updating and maintaining common databases, having users’ application programs share the data in the database, and providing a reporting and an inquiry/response capability so that end users can easily receive reports and quick responses to requests for information.

• Database Software. Database management systems are software packages that simplify the creation, use, and maintenance of databases. They provide software tools so that end users, programmers, and database administrators can create and modify databases; interrogate a database; generate reports; do application development; and perform database maintenance.

• Types of Databases. Several types of databases are used by business organizations, including operational, distributed, and external databases. Data warehouses are a central source of data from other databases that have been cleaned, transformed, and cataloged for business analysis and decision support applications. That includes data mining, which attempts to find hidden patterns and trends in the warehouse data. Hypermedia databases on the World Wide Web and on corporate intranets and extranets store hyperlinked multimedia pages on a Web site. Web server software can manage such databases for quick access and maintenance of the Web database.

• Data Access. Data must be organized in some logical manner on physical storage devices so that they can be efficiently processed. For this reason, data are commonly organized into logical data elements such as characters, fields, records, files, and databases.
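The levels of logical data elements named above can be pictured with a small Python sketch. It is only an illustration: the student entity, its field names, and the sample values are hypothetical and do not come from the chapter's data files.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    """A record: related fields that together describe one entity (a student)."""
    student_id: int        # field: describes a single attribute of the entity
    name: str              # field: a grouping of characters
    financial_status: str  # field


# A file (or table): a collection of related records treated as a unit.
student_file = [
    StudentRecord(1, "Ada Lovelace", "clear"),
    StudentRecord(2, "Alan Turing", "hold"),
]

# A database: a collection of logically related files or tables.
database = {"students": student_file}

first = database["students"][0]
field_names = [f.name for f in fields(first)]  # the record's fields
characters = list(first.name)                  # characters: the most basic element
```

Reading the sketch from the bottom up mirrors the hierarchy: characters make up a field, fields make up a record, records make up a file, and files make up a database.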
Database structures, such as the hierarchical, network, relational, and object-oriented models, are used to organize the relationships among the data records stored in databases. Databases and files can be organized in either a sequential or direct manner and can be accessed and maintained by either sequential access or direct access processing methods.

• Database Development. The development of databases can be easily accomplished using microcomputer database management packages for small end-user applications. However, the development of large corporate databases requires a top-down data planning effort that may involve developing enterprise and entity relationship models, subject area databases, and data models that reflect the logical data elements and relationships needed to support the operation and management of the basic business processes of the organization.

KEY TERMS AND CONCEPTS

1. Data dependence: The degree to which a given software application depends on the format of stored data. The greater the dependence on format, the greater the long-term program maintenance costs.
2. Data dictionary: A software application and database containing descriptions and definitions concerning the structure, data elements, interrelationships, and other characteristics of an organization’s databases.
3. Data integration: The degree to which necessary data is integrated into a single database for access.
4. Data integrity: Refers to the accuracy of the data stored within a database.
5. Data mining: A process where data in a data warehouse is searched and analyzed to discover useful, new insights.
6. Data modeling: A process where the relationships between data elements are identified and defined.
7. Data redundancy: The degree to which data has been duplicated between various files and databases. The greater the redundancy, the more difficult data maintenance tasks become.
8. Data resource management: A managerial activity that applies information systems technology and management tools to the task of managing an organization’s data resources. Its three major components are database administration, data administration, and data planning.
9. Database administrator (DBA): A specialist responsible for the development, maintenance, and security of an organization’s databases.
10. Database interrogation: The activities associated with retrieving data in a database.
11. Database management approach: An approach to the storage and processing of data in which independent files are consolidated into a common pool or database of records and made available to different application programs and end users for processing and data retrieval.
12. Database management system (DBMS): A set of computer programs that controls the creation, maintenance, and use of an organization's databases.
13. Database structure: The manner or format in which data are organized within files.
   a. Hierarchical structure: A logical data structure in which the relationships between records form a hierarchy or tree structure. The relationships among records are one-to-many, since each data element is related only to the parent element stored above it.
   b. Multidimensional model: A relational database containing summary information cross tabulated by various data categories.
   c. Network structure: Allows many-to-many or web-like relationships between data records.
   d. Object-oriented model: Uses objects as data elements, i.e., elements that include both data and the methods or processes that act on the data.
   e. Relational model: A logical data structure in which all data elements within the database are viewed as being stored in the form of tables. Applications can link records between various tables via common data elements.
14. Duplication: A process that copies one master database to multiple sites at pre-arranged times; also known as "mirroring".
15. File processing: Data organized, stored, and processed in independent files of data records and accessed and manipulated directly by one or more applications.
16. Logical data elements: A conceptual framework of several levels of data representing various data groupings.
   a. Attribute: A characteristic or quality of an entity.
   b. Character: The most basic logical data element, consisting of a single alphabetic, numeric, or other symbol.
   c. Database: A collection of logically related files or tables. A database consolidates files or tables into a common pool that serves many applications.
   d. Entity: A person, place, or thing.
   e. Field: A data element consisting of a grouping of characters that describe a single attribute of an entity.
   f. File: A collection of related records treated as a unit. Sometimes called a data set.
   g. Record: A collection of related data fields that taken together describe a single entity.
17. Metadata: Data describing the attributes, entities, relationships, and other characteristics of a database.
18. Replication: A periodic two-way exchange of information additions, deletions, and changes between databases so they once again contain identical information.
19. Structured Query Language (SQL): A high-level human-like language provided by a database management system that enables users to extract data and information from a database.
20. Types of databases: There are several major conceptual categories of databases that may be found in many organizations. These include:
   a. Data warehouse: A central store of data that has been copied from various organizational databases, standardized, and integrated for use throughout an organization.
   b. Distributed: The concept of mirroring or replicating databases or portions of a database to remote sites where the data is more easily accessed. Sharing is made possible through a network connecting the databases.
   c. External: Commercially operated databases that provide information for a fee and that can be accessed through the Internet. These are also known as "commercial databases".
   d. Hypermedia: A website stores information in a database consisting of a home page and other hyperlinked pages of multimedia or mixed media (text, graphics and photographic images, video clips, audio segments, and so on).
   e. Operational: Databases that support the major business functions of an entire organization, also called subject area databases, transaction databases, and production databases.

ANSWERS TO REVIEW QUIZ

Q.   A. Key Term                          Q.   A. Key Term
1    Database integration                 20   Relational model
2    Data integration                     21   Multidimensional model
3    Database administrator (DBA)         22   Operational
4    Structured Query Language (SQL)      23   Data warehouse
5    Data dictionary                      24   External
6    Database interrogation               25   Data dependence
7    Database management system (DBMS)    26   Database structures
8    Distributed                          27   Character
9    Object-oriented model                28   Replication
10   Hypermedia                           29   Data integrity
11   Data resource management             30   Metadata
12   Data mining                          31   Attribute
13   Data modeling                        32   Types of databases
14   Field                                33   Data redundancy
15   Record                               34   Duplication
16   File                                 35   Entity
17   Database                             36   Network structure
18   File processing                      37   Logical data elements
19   Hierarchical structure

ANSWERS TO DISCUSSION QUESTIONS

1. How should a business store, access, and distribute data and information about its internal operations and external environment?

Store
Businesses should take a database management approach and store vital information in the form of structured databases. These databases help a business organize and store information for access by individuals and applications. Typically these data are stored as relational tables. These tables may support transaction processing or information reporting applications. They may support summary data retrieval in the form of multi-dimensional tables. They may also contain an organization's vast transaction history in order to support data mining.

Access
People or applications should access information through the organization's database management system. This system keeps the information organized and manages access control.
Through this approach, many different applications can share rather than duplicate the data. By structuring their information in this way, organizations can increase its accuracy, reduce its redundancy, and make it available to the appropriate people both inside and outside the organization.

Distribute
Information can be accessed through a variety of applications over the internal network, an extranet, or the Internet. Where network connectivity poses limitations, database administrators may elect to mirror a database (if it's updated centrally) or replicate a database (if information is updated remotely).

2. What role does database management play in managing data as a business resource?

Data is an asset and should be treated as such. Some data poses a liability should it fall into the wrong hands. As a result, data is a critical organizational resource that requires professional management. This management helps ensure the information's reliability and availability. Database managers also help business managers learn about this resource's availability. Database managers support both data and application acquisition processes by providing expert advice regarding the impact these acquisitions will have on the organization's existing data structures.

3. What are the advantages of a database management approach to the file processing approach? Give examples to illustrate your answer.

Advantages: The database management approach consolidates data records and objects into databases that can be accessed by many different applications. The management software serves as an interface between applications and databases, thereby reducing data dependence. It also helps solve other problems inherent in file processing such as data redundancy, lack of data integration, and data integrity.

Example: University course registration systems should share the same databases which support the student financial aid processes.
Shared information would include a student's name, ID, and financial status. In that way, if a student changes his or her name or if the student's debits would prohibit course registration, both applications would "know" about it.

4. Refer to the Real World Challenge in the chapter. In the case, it is quite evident that data were either unavailable or inaccurate to the point that business decisions could not rely on them too much. Who was responsible for the company being in that state of affairs?

From the case as written, it is not evident exactly who was responsible for this state of affairs. Most probably the blame lay with several people in different areas. However, it was ultimately the responsibility of the IS/IT department to make the data “clean” and useable. When the systems were first set up it should have been evident to the persons in charge of developing the systems that the data were being stored in different databases/locations, on different hardware, that the infrastructure of data storage was very distributed and convoluted, and that the data would not be available across the organization. However, these same people may not have been aware of the importance of certain data. At the same time, these IT designers should have been aware of the importance of having any needed data available whenever and wherever it was needed. Again, the responsibility ultimately lay with the designers of the system. The importance of oversight for even the best persons in a field should be evident.
Another question should be asked here: if these data were needed and it took so long to access these data, why did IT management take so long to do something about the problem? This is not a new situation to IT, and IT management should have been able to see the problems and address them before they got completely out of hand. Communication between users, operations management, and IT is important to solving problems. A good question that should be asked is why this communication was not happening in this instance.

5. What is the role of a database management system in a business information system?

The DBMS is to data what an operating system is to a computer. It allows application developers to focus on the development tasks associated with the application's function while leaving routine data management tasks entirely to the DBMS. This greatly simplifies application development and maintenance tasks and enables rapid application development initiatives. As a result, the DBMS serves as a business information system's core.

6. In the past, databases of information about a firm’s internal operations were the only databases that were considered important to a business. What other kinds of databases are important for a business today?

Important databases:
• Competitors' products, prices, promotions, and markets
• Customer trends, preferences, attitudes, demographics
• Economics databases
• Legislation tracking systems
• Census databases

7. Refer to the Real World Solution in the chapter. Although trucking companies would not generally be considered part of the “new economy,” they are nonetheless heavily reliant on data. Are all companies, both old and new, going the way of becoming data-driven when it comes to running them? Was this always the case?
Even though trucking companies are not generally considered part of the “new economy” per se, their reliance on data makes their very success and profitability extremely reliant on the very thing that creates the new economy – information technology. The reason that most firms today adopt IT solutions is that they are more and more reliant on data for their success. Competing without this data is just not a possibility in most businesses in today’s data-driven business world. In the past this was also true, but was rarely recognized because of the lack of tools to work with the data at the speed and with the quality that we see today in the information technology that provides today’s solution. Once the tools became available it quickly became apparent how important access to the daily data was and that IT was the way to store it in a quality manner and access it in a timely fashion.

8. What are the benefits and limitations of the relational database model for business applications today?

Benefits:
• Most popular of the three main database structures.
• The data structures are logical and easy to understand.
• Provides significant flexibility over time.

Limitations:
• Cannot process large amounts of business transactions as quickly and efficiently as the hierarchical and network models.
• They depend on indexes that create memory and processing overhead.

9. Why is the object-oriented database model gaining acceptance for developing applications and managing the hypermedia databases on business Web sites?

Object-oriented databases are able to handle complex types of data (graphics, pictures, voice, and text) better than other structures.
They are also relatively easy for programmers to use.

10. How have the Internet, intranets, and extranets affected the types and uses of data resources available to business professionals? What other database trends are also affecting data resource management in business?

Effects: Networks allow business professionals to access and share information. Internet technologies have expanded this access to information sources outside the organization.

Trends:
• Larger datasets
• More powerful analytical tools better able to generate output textually, graphically, and as animations
• Increased adoption of a powerful yet free RDBMS called MySQL
• Improved replication technologies
• Offline data access enabled via mirroring or replication
• Web 2.0
• Mobile data access

ANSWERS TO ANALYSIS EXERCISES

1. Joining Tables

Notes: most businesses need the information a system provides and not the other way around. Encourage students to focus on their business needs. These needs form the basic assumptions behind any design and development exercise.

a. Using these data, design and populate a table that includes basic training rate information. Designate the “Technical” field type as “Yes/No” (Boolean).
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

b. Using these data, design and populate a course table. Designate the CourseID field as a “Primary Key” and allow your database to automatically generate a value for this field. Designate the “Technical” field type as “Yes/No” (Boolean).
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

c. Prepare a query that lists each course name and its cost per day of training.
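For instructors who want to demonstrate the join outside of Access, a query of this kind might be sketched with Python's built-in sqlite3 module. This is not the content of the solutions file: the table layout, the CourseName column, the assumption that the course and pricing tables are linked through the "Technical" field, and the sample prices are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical schema; names and prices are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pricing (
    Technical   INTEGER PRIMARY KEY,   -- 1 = Yes, 0 = No (Boolean)
    PricePerDay REAL
);
CREATE TABLE Course (
    CourseID   INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the database
    CourseName TEXT,
    Technical  INTEGER REFERENCES Pricing(Technical)
);
INSERT INTO Pricing (Technical, PricePerDay) VALUES (1, 1000.0), (0, 800.0);
INSERT INTO Course (CourseName, Technical) VALUES
    ('Intro to Databases', 1),
    ('Business Writing', 0);
""")

# Join the two tables on their common data element to list each course
# name with its cost per day of training.
rows = conn.execute("""
    SELECT c.CourseName, p.PricePerDay
    FROM Course AS c
    JOIN Pricing AS p ON c.Technical = p.Technical
    ORDER BY c.CourseName
""").fetchall()

for course_name, price_per_day in rows:
    print(course_name, price_per_day)
```

The point for students is that the price is stored once, in the pricing table, and reaches each course only through the shared field, which is what the relational model means by linking tables via common data elements.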
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

d. Prepare a query that lists the cost per student for each class. Assume maximum capacity and that you will schedule two half-day classes on the same day to take full advantage of HOTT’s per-day pricing schedule.
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

2. Training-Cost Management

a. Using the information provided in the sample below, add a course schedule table to your training database. Designate the ScheduleID field as a “Primary Key” and allow your database program to generate a value for this field automatically. Make the CourseID field a number field and the StartDate field a date field.
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

b. Using the information provided in the sample below, add a class roster table to your training database. Make the ScheduleID field a number field. Make the Reminder and Confirmed fields both “Yes/No” (Boolean) fields.
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

c. Because the Class Schedule table relates to the Course Table and the Course Table relates to the Pricing Table, why is it appropriate to record the Price per Day information in the Class Schedule table too?
The pricing table reflects current course costs. The PricePerDay cost in the Class Schedule table reflects the price paid for the course. If the negotiated rate changes over time, we would not want that price change reflected on courses already completed.

d. What are the advantages and disadvantages of using the participant’s name and e-mail address in the Class Roster table?
What other database design might you use to record this information?

Advantages:
• Allows the instructor to record a different or "custom" version of the student's name and e-mail address without affecting other users.
• A roster-based query for name and e-mail address information will run more quickly.

Disadvantages:
• All instructors would need to update their rosters with name or e-mail address changes.
• A background process to automatically propagate changes would require additional CPU time (and eliminate the benefit of "custom" names).

Alternative design:
• Hierarchical
• Network structure
• Object-oriented

Note: this is a relational database, so "relational database" would not be an appropriate answer. A multidimensional database would be great for tabulating student counts and grade averages by class, etc., but usually would not be used to store base-level records.

e. Write a query that shows how many people have registered for each scheduled class. Include the class name, capacity, date, and count of attendees.
See Analysis Exercise Data Solutions files: [Chapter 05 - Solutions.mdb].

3. Selling the Sawdust

a. What are your college’s or university’s policies regarding student directory data?
Policy manuals typically cover student and faculty behavior but rarely include operational policies, though this information may be found on-line. If not readily available, the provost's office might make a good starting point. Public universities are generally subject to freedom-of-information (public records) laws, so their policies should be obtainable on request.

b. Does your college or university sell any of its student data? If your institution sells student data, what data do they sell, to whom, and for how much?
Student privacy is a hot topic, so this information may be hard to find. On the other hand, this avenue of research may be well worth pursuing for that very reason.

c. If your institution sells data, calculate the revenue earned per student.
Would you be willing to pay this amount per year in exchange for maintaining your privacy?
This is an open-ended question with no right or wrong answer. It should also make an interesting in-class discussion topic. Along a similar line of discussion, many credit card companies purchase instant credit scoring services that compare a transaction in progress with previous transactions and incorporate this information into their transaction approval process. Students may have experienced their valid transactions being denied or delayed pending "verification." These services help reduce fraud rates and provide participating credit card companies with a competitive advantage. Competition between credit card companies forces them to pass along a substantial portion of their cost savings to their customers in the form of lower interest rates. Would students be willing to pay higher interest rates or annual fees for greater privacy?

4. Data Formats and Manipulation

Note: though the names and organization have been changed, this exercise represents a real-world problem. This exercise introduces students to the CSV format and familiarizes them with its structure and capabilities. Note that many database applications also use XML when uploading or downloading information. The CSV format is decades older than XML, is more common, and is far easier for humans to read. XML, on the other hand, permits more sophisticated data structures and enables greater programming flexibility. This exercise uses CSV because non-IT managers will more likely encounter it than they will raw XML files.

a. Download and save “partners.csv” from the MIS 9e OLC.
Open the file using Microsoft Word. Remember to look for the “csv” file type when searching for the file to open. Describe the data’s appearance.
The first line will contain field names. The CSV format doesn't require it, but it helps assure that users interpret the data correctly. The second line will contain the first record of the data set. Quotation marks surround the full name text field because it contains a comma. Commas separate each field. Each record comprises a single "paragraph" as it appears in a word processing document.

b. Import the “partners.csv” file into Excel. Remember to look for the “csv” file type when searching for the file to open. Does Excel automatically format the data correctly? Save your file as “partners.xls.”
Excel recognizes the CSV format and will import and format CSV data correctly, but it does so only as long as the CSV format is used consistently throughout the file.
See Data Solutions Disk for a sample spreadsheet: [partners.xls]

c. Describe in your own words why you think database manufacturers use common formats to import and export data from their systems.
By using a common format, database manufacturers ensure that their systems can exchange information with databases developed by other manufacturers. While database manufacturers would probably prefer the whole world use nothing but their own proprietary systems, their customers demand this interoperability.

5. Cloud Data Transience

AVOS Buys Yahoo's Delicious…1

Delicious, Yahoo's poorly named online bookmarking application, has a stagnant but loyal customer base. Delicious allows users to create an account, store their favorite URLs, mark them as private or shared, assign tags to them for easy retrieval, and access them from any computer or mobile device. This utility makes the standard browser-based, hierarchically organized bookmarking approach virtually obsolete.
Yahoo, however, found Delicious to be an unprofitable distraction from its core business and decided to sell it off to AVOS, an Internet startup by Chad Hurley and Steve Chen, YouTube's founders. After the big announcement, Delicious sent its users an e-mail explaining that to keep their accounts they must agree to the transition and accept both AVOS' new privacy policies and terms of service. Those who fail to agree will lose their accounts and data. Fortunately, Delicious also allows users to download a copy of all their bookmarks, though without the tag functionality, they're pretty useless.

1 http://www.delicious.com/

1) What types of data do people maintain online for their personal use? Make a list.

Types of data people maintain online include:
• E-mail – containing just about anything
• Contact information
• Calendars/schedules
• "Wish lists" or gift registries
• Bookmarks
• Blogs
• Personal websites
• Stock portfolios (for tracking stocks)
• Office automation documents (word processing, spreadsheets, etc.)
• Account names/passwords

2) Can you transfer your personal notes from MySpace to a new Facebook? What difficulties would you encounter?

Yes. MySpace has a "Sync with Facebook" tool. The tool has limitations. For example, it will sync profiles but not pages, and it's liable to break any time there's an upgrade or security change to either site. Another problem would involve figuring out how to set up the tool to work properly.

3) What are the advantages to keeping data online?

Advantages:
• Free or inexpensive storage
• Accessible anywhere via the Internet
• Automated backups
• Automatic updates/upgrades to service

4) What are the disadvantages to keeping data online?
Disadvantages
• May disappear any time
• May not be available offline when needed
• Security is not guaranteed
• May not integrate well with online/offline applications
• Policies, fees, and privacy settings may change with little or no notice

ANSWERS TO REAL WORLD CHALLENGE/SOLUTION

Real World Challenge

1. This case chronicles the many issues associated with the IT environment in which U.S. Xpress currently operates. How did U.S. Xpress get into this situation? Was this the result of different business priorities in the past? If so, which ones? What are the lessons for companies that frequently acquire other businesses?

a) Background
• early technology adopter (on-board satellite systems)
• rapid growth through acquisition
• resulting in a patchwork of disparate IT systems
• resulting in redundant applications
• resulting in duplicate data
• resulting in bad data
• making it difficult to acquire an up-to-date view of the whole organization

b) Past priorities
• growth through acquisition

c) Lessons
• you can't manage what you can't measure
• you can't manage on bad data
• include computing platform and data migration in the cost of an acquisition

2. Moving forward, what does U.S. Xpress need to do in the future regarding its IT infrastructure in general, and its data issues in particular? What do you think should be the next three steps the company should take?

Next steps
• migrate to a single platform
• build data quality control into the applications
• undertake a data cleaning initiative
• conduct periodic data quality audits
• improve employee training
• track the source of data quality problems
• automate data acquisition wherever possible

3. Should companies take a periodic (e.g., clean every so often) approach or a continuous, more expensive, approach to data quality? What are the advantages and disadvantages of each?

Recommendations
• conduct periodic data audits
• clean data as needed
• evaluate the cost of bad data in order to justify budget for improving data quality
• evaluate an acquisition target's IT systems and data quality before buying it

Continuous cleaning advantages
• consistent quality
• quickly identify and correct bad data sources
• improved decision making

Continuous cleaning disadvantages
• expensive (but is it really?)
• may introduce new errors

Periodic cleaning advantages
• least expensive solution
• may be completed immediately prior to major initiatives requiring clean data

Periodic cleaning disadvantages
• potentially disruptive
• fails to identify problems in a timely manner

Real World Solution

1. Once the technical aspects of data quality are put in place, who should be in charge of making decisions about these issues? Is this a technical responsibility or a business one? What are the advantages and disadvantages of either approach?

Who should be in charge of making decisions?
• business leaders with input from technology leaders

Technical responsibility - advantages
• efficiency
• insights gained can be applied to changes in the software
• gained appreciation for data quality priorities

Technical responsibility - disadvantages
• lack of resources
• lack of direct accountability to business managers using the data
• lack of business knowledge

Business responsibility - advantages
• direct accountability
• business insights can be applied to the data
• better able to prioritize cleanup tasks

Business responsibility - disadvantages
• lack of resources
• may not be able to identify the significance of technical challenges
• may not adequately communicate insights into data quality problems to IT

2. Are the benefits outlined in the case the result of better technology or improved decision making? Today, is it possible to clearly separate the two anymore? What are the implications for U.S. Xpress as it decides where to go next and how to invest in future projects?

Better technology
• better technology isn't a benefit in and of itself
• better technology (data, software, systems) enables better decision making

Improved decision making
• good decisions ultimately lower costs or increase revenue
• good decisions require ready access to good data

Are technology and business decision making separable?
• no

Implications
• Data quality considerations should be integral to new acquisitions.
• Data quality initiatives must come from business managers.

ANSWERS TO REAL WORLD CASES

RWC 1: Beyond Street Smarts

Case Study Questions

1. What are some of the most important benefits derived by the law enforcement agencies mentioned in the case?
How do these technologies allow them to better fight crime? Provide several examples.

Benefits
• Analyze historic patterns
• Assess risk
• Deploy resources efficiently

Crime fighting
• Position police in high risk areas during high risk times.
• Share information between agencies.

Examples
• New York City used insights to target tough sentencing against certain types of low level crimes.
• "Son of Sam" was caught by a lead generated from a parking ticket.
• Carriers use crime data to avoid trouble spots.
• Insurance companies use crime data to assess insurance risk.
• Police use crime data to efficiently allocate resources.

2. How are the data-related issues faced by law enforcement similar to those that could be found in companies? How are they different? Where do these problems come from? Explain.

Similarities
• Large data volumes
• Value in data mining
• Privacy concerns
• Security concerns

Differences
• Non-financial transactions
• Not well integrated between organizations

Problem source
• Inadequate funding
• Inadequate planning
• Privacy zealots
• Legal challenges

3. Imagine that you had access to the same crime-related information as that managed by police departments. How would you analyze this information, and what actions would you take as a result?

Analysis
• Track patterns to find persistent prolific offenders, high crime areas, and high crime targets.

Actions
• Communicate and cooperate more effectively with other agencies.
• Engage the public in crime awareness and reporting.
• Reassign resources to maximize their effectiveness.
• Determine which low-level criminal activities serve as indicators of high-level activity and shut these perpetrators down for longer periods by encouraging maximum sentences.
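For instructors who want to make the "track patterns" answer concrete, the following is a minimal Python sketch of one way to tally incidents by area and time of day. Everything in it is an assumption made for illustration: the incident records, the area names, the `top_hotspots` helper, and the day/night cutoffs are hypothetical and do not come from the case.

```python
from collections import Counter

# Hypothetical incident records: (area, hour of day, offense type).
# These values are invented for illustration and are not from the case.
incidents = [
    ("Downtown", 23, "burglary"),
    ("Downtown", 22, "vandalism"),
    ("Riverside", 14, "theft"),
    ("Downtown", 23, "theft"),
    ("Riverside", 15, "theft"),
    ("Riverside", 2, "burglary"),
    ("Downtown", 1, "burglary"),
]


def time_band(hour):
    """Map an hour (0-23) to a coarse band; the cutoffs are arbitrary."""
    return "night" if hour >= 20 or hour < 6 else "day"


def top_hotspots(records, n=2):
    """Return the n most frequent (area, time band) pairs with their counts."""
    counts = Counter((area, time_band(hour)) for area, hour, _ in records)
    return counts.most_common(n)


hotspots = top_hotspots(incidents)
for (area, band), count in hotspots:
    print(f"{area} ({band}): {count} incidents")
```

A tally like this only points to where and when incidents cluster. Deciding how to reassign resources in response remains a judgment call for the people using the data.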
Real World Activities

1. The case discusses many issues related to data quality, sharing, and accessibility that both government bodies and for-profit organizations face. Go online and research how these issues manifest themselves in companies, and some of the approaches used to manage them. Would those apply to police departments? Prepare a report to share your findings.

Data Quality
• Push data capture closer to the data source
• Automate data capture
• Enforce data entry coding standards to check for various error types

Data Accessibility
• Mobile computing
• ADA compliant programming
• User training
• Improve analysis and reporting tools

Data Sharing
• Structure data
• Standardize data structures industry-wide
• Implement a trustworthy user authentication system

Applicability
These issues are fully applicable to police departments. For example, many police vehicles are equipped with laptop computers that securely connect to police systems, dashboard video, and even microphones worn by police officers.

2. The case discusses the large volume of very detailed information collected daily by law enforcement agencies. Knowing this, how comfortable do you feel about the storing and sharing of that data? What policies would you put in place to assuage some of those concerns? Break into small groups with your classmates to discuss these issues and arrive at some recommendations.

Comfort, in this case, is a matter of personal opinion. Law and order advocates take the position that the authors of the U.S. Constitution and its amendments did not see fit to specifically mention "privacy". Our privacy rights derive from the U.S. Supreme Court's interpretation of various amendments. Privacy rights advocates, meanwhile, fear that governmental agencies might abuse personal information to curtail other personal rights.
The general discussion could take two different paths:
• Preventing governmental abuse
• Preventing theft or misuse by individuals [2]

Policy areas
• Data retention
• Data access
• Audit trails
• Legal barriers to communication [3]
• User software training
• User policy training
• Privacy rights advocacy as part of all policy and software development initiatives

[2] One topic directly related to this is sex offender registries. The registries were intended to alert the public and encourage public vigilance. However, some people have used this data to harass and even murder registered offenders.

[3] For example, while federal law bars the sale of weapons to people with certain mental illnesses, various state laws may prohibit communicating this information to any other agency.

Recommendations
This area probably requires not only a massive overhaul of local, state, and federal law, but also an equally massive effort to overhaul these disparate information systems.

RWC 2: Duke University Health System, Beth Israel Deaconess Medical Center, and Others

Case Study Questions

1. What are the benefits that result from implementing the technologies described in the case? How are those different for hospitals, doctors, insurance companies and patients? Provide examples of each from the case.

Benefits
• Real-time care
• Identify experimental treatment candidates
• Identify best medical tests to perform
• Identify best treatment options

Differences by user group
• Hospitals: greater efficiency, lower error rates
• Doctors: improved productivity, increased effectiveness
• Patients: improved safety
• Insurance: lower costs, better risk assessment

Examples
• Hospital: identified and notified best H1N1 vaccine candidates
• Doctor: warned about patient metabolic problems
• Patient: limited radiation exposure
• Insurance: reduced unnecessary testing

2. Many of the technologies described in the case require access to large volumes of data in order to be effective. At the same time, there are privacy considerations involved in the compiling and sharing of such data. How do you balance those?

Balancing privacy recommendations
• Limit access to patient identities
• Inform patients of how their records are maintained and used
• Track all accesses to patient records
• Train staff about privacy policies and laws
• Enforce privacy standards
• Audit patient record access
• Employ appropriate security measures to protect from unauthorized access
• Trigger alarms (notifications) when the system detects suspicious activity

3. What other industries that manage large volumes of data could benefit from an approach to technology similar to the one described in the case? Develop at least one example with sample applications.

All of them.

Valuable data with strong personal privacy issues
• Customer data
• Credit information
• Tax data
• Human resources data
• School records
• Financial aid information
• Police records
• Adoption records
• Genetic records

Real World Activities

1. The legal and regulatory environment of the health care industry has changed significantly in recent times. How does this affect technology development and implementation in these organizations? Go online and research new uses of information technology in health care motivated by these developments. Prepare a presentation to share your findings.
Regulatory effects
• Significant IT commitments
• Non-regulatory projects put on hold
• Increased user training costs
• Increased instability as new applications roll out
• Increased liability risk from failure to interpret rules, system design errors, and data errors
• Disruptions in service
• Improved patient privacy

Search terms
• health care information technology

Useful websites
• http://healthit.hhs.gov/
• http://www.medpac.gov/publications/congressional_reports/June04_ch7.pdf
• http://www.himss.org/
• http://www.healthcareitnews.com/

2. Some of the technologies described in the case verify the diagnostics made by doctors and can sometimes make recommendations of their own. Does this improve the quality of care, or are these organizations putting too much faith in a computer algorithm that did not attend medical school? Break into small groups to discuss this and provide some recommendations about what organizations should do before deploying these technologies in the field, if anything.

Faith in technology
At present, systems make recommendations to doctors. So long as doctors remain the final arbiters of treatment decisions, these systems pose little additional risk. While a system's programmers may not have graduated from medical school, the rules they program into their systems come from medical experts. As a result, these systems may help raise patient care quality. There is some risk that a careless doctor may simply do whatever the program recommends without applying his or her own knowledge and experience. More worrisome, at some future point hospitals might opt to delegate treatment decisions to less qualified people who follow only what the system recommends and who have no authority to override its decisions. While this might be useful in situations with no access to medical expertise (submarines, for example), it could lead to avoidable errors.