Data Mining and Business Intelligence:
Getting a Glimpse of the Future
To: Professor J.E. Boritz, University of Waterloo, ACC 626 Term Paper, July 27, 2007
Prepared By: Tony Chu
Executive Summary
Data mining (DM) is used to search for patterns and correlations within a database of
information. Business intelligence (BI) focuses on data integration and organization. DM aids
BI’s objectives.
DM and BI work together to process data and analyze it in a way that eases the workload
for the users and aids with the understanding of the materials/findings. This is accomplished
through recognizing relationships in the data and identifying opportunities and risks of the
company. It also allows users to manipulate the data to fulfil their specific user-oriented
objectives. Fields that benefit from DM and BI include marketing, corporate analysis and risk
management, fraud detection and management, E-Commerce, bioinformatics, and customer
relationship management.
C-Suite executives (CEO, CFO, COO, etc.) must weigh the benefits of DM and BI
systems against the costs and problems. These systems require accurate and timely data that is
accessible to the system. Moreover, cost overruns and unexpected problems are common in implementing a
system. Mitigation strategies include data risk management, data governance, and data
management technology.
Ideal candidates for DM and BI are usually found in the information-intensive industries.
Examples include credit card, transportation, large consumer packaged goods, and
pharmaceutical companies.
In public practice, DM and BI may help increase the efficiency of an audit. If combined
with computer-assisted auditing techniques (CAATs), auditors can obtain better reports and
execute more effective audit tests using the greater detail in the information. Consequently, they
can design better audits with the information. Public accountants can also get involved in
consulting to help with the implementation of a DM and/or BI system. Care must be taken to
ensure there are no self-review problems if conducting both the audit and the consulting.
For industry and government jobs, DM and BI can be used to aid in fraud detection,
inventory logistics, defect analysis and focused hiring.
DM and BI use historical data. If economic, social, or environmental conditions change,
the analysis may become incorrect. The analysis must be altered to fit the circumstances.
Consideration must be given to ensure DM and BI are not turned into “tech tools”
dominated by technological jargon, but remain focused on the needs of the decision makers.
Introduction
Using technology to gain an edge in business is not a new idea. Whenever there is
something new, entrepreneurs will be quick to try to find an application for it in the business
world to make money. Data mining (DM) and business intelligence (BI) are among the
information technology applications that have business value. This paper will first outline what
data mining and business intelligence are, then move on to practical usages in various business
contexts. It will then proceed to a section dealing with how C-Suite executives, like the CFO,
CIO, etc., will handle the choice of whether to implement a system and how to go about doing it.
Suggestions as to which industries are best suited for this technology are also given. Finally,
there is a section on how DM and BI will affect the accounting profession.
What are Data Mining and Business Intelligence?
Data mining is the process of searching through data using various algorithms to discover
patterns and correlations within a database of information. Business intelligence, on the other
hand, focuses more on data integration and organization. It combines data with analysis to help
managers make operational, tactical, or strategic business decisions. Data mining can be used to
aid the objectives of a business intelligence system.
Why are DM and BI important?
BI tools, in conjunction with DM, make the process of getting data and analyzing it less
onerous for users. BI software is usually flexible enough so that analysts can “slice and dice” the
data any way they want. Since the information comes from a centralized set of data (possibly
combining data from multiple databases), data extracts from the system are consistent with each
other, because the analysis is done on one set of data instead of on individual desktop computers
that may each have their own data sets or analysis tools (Burns).
In addition, DM and BI can give users the ability to spot patterns by putting the data in a
visual form. They can further enhance the usefulness of the information by enabling models to
identify or confirm relationships, and providing the tools to the user to drill-down and focus on
particular areas of interest. If the system is used regularly, comprehensive and timely information
can be utilized to spot technical, organizational, and behavioural problems within the entity in
time and with sufficient detail to correct the problem (Froelich).
These techniques are available due to the maturity of the necessary technologies (massive
data collection, powerful multiprocessor computers, and data mining algorithms) (Chaterjee).
Effective and beneficial uses of DM and BI include, but are not limited to: 1. Marketing –
relationships are discovered between certain customer characteristics and buying patterns. 2.
Corporate analysis and risk management – data is gathered and analyzed to aid in financial
planning both internally and externally. 3. Fraud detection and management – patterns or
irregularities are investigated, with past successes acting as heuristics to increase future
successful detection. 4. E-Commerce – tracking customer preferences to provide customized
content, products, and services. 5. Bioinformatics – finding sequencing and other relationships to
help further scientific research. 6. Customer Relationship Management – finding, reaching,
selling to, satisfying, and retaining customers through understanding their wants and needs (Hsu).
Together, these techniques and uses give companies a powerful tool for gaining a competitive
advantage.
C-Suite Executives’ Decisions and Obtaining a DM/BI System
C-Suite executives have expressed a number of concerns about DM and BI. These
concerns include whether these systems can provide relevant and useful information while
remaining cost-effective. As described in the section of this paper addressing the importance
of DM and BI, there are many benefits to these systems, but they must be weighed against the
negative factors. For example, DM and BI both depend on a database of
accurate and timely data that is accessible to the system. If the data is not formatted correctly (for
the system to read) and/or contains inaccurate information, the reports produced by the system
will be incomplete and inaccurate. If the data is not timely, the report will arrive too late:
by the time it is produced, the relevant decision may no longer be feasible to make or implement.
Furthermore, there will be industry- or company-specific hurdles that need to be analyzed. These
specific hurdles are outside the scope of this paper.
Implementation of DM and BI systems is also a major risk for a company. Often, there
are major cost overruns, and the information and reports produced still do not fulfil
the requirements of the users. A survey conducted at Gartner’s Business Intelligence Summit,
where leading BI, DM, and data warehousing professionals gather, by Teksouth Corp, a service
company specializing in implementing BI and data warehousing solutions, confirms this.
Two thirds of those surveyed were forced to scale back or ask for more funding due to
cost overruns. Scaling down a project may exclude some of the requirements of the
system’s users. Two thirds also stated they had run into unanticipated problems
when designing and implementing their BI systems, with 44% also indicating these problems
had delayed their projects. Of particular interest are small businesses, where reports
of cost and time overruns have discouraged them from implementing a system, despite
their understanding of its benefits (Havenstein).
There are a number of ways C-Suite executives, and their subordinates, can mitigate these
risks. Under Deloitte United Kingdom’s consulting methodology, there are three main categories
of techniques and methods used by successful businesses: data risk management, data
governance, and data management technology.
Data Risk Management
Data risk management is “providing data assurance in the form of investigations,
reconciliations, reviews, and assessments.” This could include work by the internal auditors to
ensure that the data is workable by the system and ensuring that the data is cleaned regularly. An
additional data risk management method includes having specialized software to monitor and
report on the data in the system. For example, SAS for Enterprise Risk Management boasts that
“they provide a unified, quantitative risk management framework” that includes “integrated and
comprehensive data management system, powerful predictive analytics, user-friendly and self-service reporting, and a transparent environment that lets you manage the entire process – from
identifying risk, to measuring, mitigating and monitoring it on an ongoing basis.” Using SAS
software, or similar software from other vendors, in conjunction with internal audit work,
decision makers will have the best possible data at their disposal to manage data risk.
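
As a minimal sketch of the kind of routine data check described above, the following Python snippet flags records that are incomplete, implausible, or stale before they reach a DM/BI system. The field names (customer_id, amount, updated), the 30-day freshness threshold, and the function name find_data_issues are assumptions made for illustration; they are not features of SAS or of any other vendor's product.

```python
from datetime import date, timedelta

REQUIRED_FIELDS = ("customer_id", "amount", "updated")  # hypothetical schema
MAX_AGE = timedelta(days=30)  # hypothetical timeliness threshold


def find_data_issues(records, today):
    """Return (record_index, problem) pairs for records that fail basic checks."""
    issues = []
    for i, rec in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        for field in REQUIRED_FIELDS:
            if rec.get(field) in (None, ""):
                issues.append((i, f"missing {field}"))
        # Basic plausibility: numeric amounts must not be negative.
        amount = rec.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            issues.append((i, "negative amount"))
        # Timeliness: flag records not updated within MAX_AGE.
        updated = rec.get("updated")
        if isinstance(updated, date) and today - updated > MAX_AGE:
            issues.append((i, "stale record"))
    return issues


records = [
    {"customer_id": "C1", "amount": 120.0, "updated": date(2007, 7, 20)},
    {"customer_id": "", "amount": 50.0, "updated": date(2007, 7, 21)},
    {"customer_id": "C3", "amount": -5.0, "updated": date(2007, 1, 2)},
]
issues = find_data_issues(records, today=date(2007, 7, 27))
for index, problem in issues:
    print(index, problem)
```

A report like this could feed the internal audit reviews mentioned above; in practice the checks would be driven by the entity's own data definitions.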
The bottom line may also be affected by DM and BI systems due to their effect on risk. This
could manifest itself in the form of out-of-control budgets and undelivered promises, as described
by Gruman in Rethinking Business Intelligence. He makes three recommendations. First, when
implementing a system, the company should focus on the core, meaning that there is a well-defined
business objective. Once the objective is clear, the data can be analyzed to see what useful information
can be gleaned from it. If the data is “dirty”, compensating methodologies might be used, but the
process will always have the business objective in mind. Second, the proposed system can be
downsized. If the objective can be achieved by using a smaller system, the smaller system should
be chosen. Lastly, push BI close to operations. Look for areas that already collect data, and put a BI
system there to help make sense of the data. This will help managers identify trends and anomalies
quickly without having to work on the data collection stage.
Data Governance
Data governance is “creating the policies and identifying the people who govern the
retention and disposition of all corporate information to build the framework for a data-driven
enterprise.” This is where an accountant’s strength in procedures, controls, and reporting can be
put to good use. Procedures can be designed and paired with a division of duties to form a
strong data governance policy that ensures the data the company has collected is retained and
disposed of only under predetermined circumstances. After all, data is a precious asset that must
be preserved, but it can also be expensive to maintain and retain due to network, server
processing, and storage demands. Bannan, in 12 Tips for Generating Rich Data, has a few
pointers that practitioners can use in data governance. She recommends that a balance be struck
between server space (in the current DM/BI system versus archives) and strong analysis. Her
recommendation is 13 months’ worth of data, plus 3 years’ worth of contact data and key points
(for example, when the customer became a customer and when the customer was last marketed to
or last bought from the entity).
This will help year-to-year analysis as well as keep a log of how to contact the customer and whether it
is time to contact them again. She goes on to say that data should not be deleted but rather
aggregated so that it is not lost. Other data governance practices include standardizing the
data set, talking to the users to see where the procedures can be improved, creating a continuity
plan to make sure the data is not lost, and finally, treating the employees like partners to increase
morale and lessen the incentive to circumvent the system.
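
The "aggregate, do not delete" guideline can be illustrated with a short Python sketch that rolls transactions older than a retention cut-off into monthly totals while keeping recent rows in full detail. The tuple layout, the function name archive_old_transactions, and the cut-off date are assumptions chosen for the example; Bannan's article does not prescribe an implementation.

```python
from collections import defaultdict
from datetime import date


def archive_old_transactions(transactions, cutoff):
    """Split transactions into detailed recent rows and monthly totals for older rows.

    transactions: iterable of (date, customer_id, amount) tuples (hypothetical layout).
    cutoff: rows dated before this are aggregated rather than kept in detail.
    """
    recent = []
    monthly_totals = defaultdict(float)
    for when, customer_id, amount in transactions:
        if when < cutoff:
            # Older data is summarised per customer per month instead of being deleted.
            monthly_totals[(customer_id, when.year, when.month)] += amount
        else:
            recent.append((when, customer_id, amount))
    return recent, dict(monthly_totals)


transactions = [
    (date(2006, 3, 5), "C1", 40.0),
    (date(2006, 3, 19), "C1", 60.0),
    (date(2006, 4, 2), "C2", 25.0),
    (date(2007, 5, 10), "C1", 80.0),
]
# Keep roughly 13 months of detail, counting back from July 2007.
recent, archived = archive_old_transactions(transactions, cutoff=date(2006, 6, 1))
print(recent)
print(archived)
```

The detailed rows stay available for year-to-year analysis, while the monthly totals preserve the history at a fraction of the storage cost.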
Froelich, Ananyan, and Olson in Business Intelligence Through Text Mining have
additional steps. They feel that the data should also be pre-processed into a format needed for
further analysis and then have the important concepts and terms extracted. These important
concepts should then be used to identify the patterns and co-occurrences. They feel structured
data logging and analysis is key to getting a working and effective system. It would seem that
data governance, in the eyes of these experts, is only achievable through a rigid structure to
ensure uniformity.
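
The steps described above (pre-process the text, extract terms, identify co-occurrences) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the stopword list, the sample service notes, and the function names are invented here, and real text mining tools use far richer linguistic processing.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative stopword list only; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "is", "was", "after"}


def extract_terms(text):
    """Pre-process: lowercase, split into words, drop stopwords and very short tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def count_cooccurrences(documents):
    """Count how many documents each pair of terms appears in together."""
    pair_counts = Counter()
    for doc in documents:
        terms = sorted(extract_terms(doc))  # sorted so each pair has one canonical order
        pair_counts.update(combinations(terms, 2))
    return pair_counts


notes = [
    "Brake noise reported after the rain",
    "Customer reported brake noise on the highway",
    "Engine stalls in the rain",
]
pairs = count_cooccurrences(notes)
print(pairs.most_common(3))
```

Pairs that co-occur in many documents, such as "brake" and "noise" in this sample, are candidates for the patterns an analyst would then investigate.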
Data Management Technology
The last dimension of the Deloitte approach is data management technology. Data
management technology is “selecting, implementing, integrating, and applying the technology
required to ensure effective data management.” This is where many experts have a lot to say and
have developed their own methodology as to how to implement the system. This paper will
attempt to integrate and describe the best of the approaches in the following paragraphs.
Generally, according to Burns in CA Magazine, there are three major flaws in the
implementation of BI systems. They are: 1. Many IT departments assume that once a
data warehouse (a database system containing the historical data of a company) is built, with no
other technology, users will immediately use it and see its benefits. 2. Spreadsheets are
relied on extensively, and possibly exclusively, instead of the BI systems. This has the
added problem that erroneous spreadsheets are used by executives. The errors stem from
a lack of quality control, as well as re-keying and calculation mistakes that go uncaught for lack of
testing. 3. Data quality is low, so no matter what data management technology is used, the
reports may not present reliable results due to the lack of good data to analyze. The old adage of
garbage in, garbage out will certainly apply in this case.
King’s Better Decisions extends Burns’ flaws with additional problems that BI systems
face. They are: 1. Business events are not defined in the context of their use, even though
different portions of the entity have different goals. 2. Users do not know how to work the system
and find data that is useful to the way business is done, even if the BI software is easy to use. 3. No
concrete BI goals are set, so the data is taken out of context. 4. Processes paralyze the analysis.
5. The balance of power is upset when the data is available to everybody; subordinates
may question their superiors’ decisions (particularly a problem in hierarchical organizations).
These issues should be addressed by having context-driven definitions for business events [e.g.
first defining what inventory is (does the company’s definition include scrap or not?) before acting
on a system indication that a particular item is overstocked and should no longer be purchased],
better training in the use of the new systems, setting concrete BI goals, taking the process steps
away from the user by programming them into the system, and educating staff and changing
corporate culture so that managers will be more receptive to the system.
A Deloitte Consulting principal, Griffin, from the United States, has some practical ways
to deal with some of the problems above. She recommends that companies “align every BI
initiative with the company’s strategic goals and objectives.” Thus, the system will not be built
before users have a chance to give their input. The system will fit the users’ needs instead of
having the users try to find what they require from a system. Next, she recommends that the right
information should be delivered to the right people. The technology used should also be right for
the information to be analyzed. This will help in reducing the usage of ad hoc spreadsheets.
Finally, data quality is addressed through understanding the business’ critical processes
and seeing how the strategic processes can align with the BI. It is important that processes
supporting BI provide the correct information, as well as a way to intelligently use the
information. This means that the data collected should be sorted and understood from the
viewpoint of how it fits in the business processes of the company, essentially adding context to
the data and increasing its value to the entity.
Deloitte UK itself describes a set of six data analysis techniques that these systems can perform for
its clients. They include the following: 1. Data Visualization: “dynamic graphical analysis to
facilitate the understanding of patterns and relationships in data” – e.g. interpreting complex
relationships within multidimensional data. 2. Cluster Analysis: “identifying distinct groups of
items within large data sets that display similar characteristics” – e.g. marketing to target groups
that are most likely to respond. 3. Factor Analysis: “identifies similar groups of characteristics” –
e.g. fraud detection (through models). 4. Propensity Modelling: “gives each customer a
probability score showing the likelihood of the customer behaving in a certain way” – e.g.
customer management to help with retention rates. 5. Decision Trees: “identify groups of
customers who behave in a similar way while at the same time showing the drivers of that
behaviour” – e.g. fraud detection (what drives fraud). 6. Artificial Neural Networks: “non-linear
predictive models that learn through training and resemble biological networks in structure” –
e.g. medical treatments, traffic flows, and detecting patterns in fraudulent credit card usage.
Management should consider these techniques and see which of them will work best for their
particular industry/company and the data they have on hand. Once that has been identified, a
specific system can be implemented.
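
To make one of the techniques above concrete, here is a minimal Python sketch of cluster analysis using k-means, a common clustering algorithm. It is not taken from Deloitte's material: the choice of k-means, the customer figures, and the starting centroids are all assumptions made for illustration.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, initial_centroids, iterations=10):
    """Very small k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step.
        assignments = [nearest_centroid(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical customers: (visits per month, average spend in dollars).
customers = [(1, 20), (2, 25), (1, 30), (9, 200), (10, 220), (8, 210)]
assignments, centroids = kmeans(customers, initial_centroids=[customers[0], customers[3]])
print(assignments)
print(centroids)
```

With this sample data the occasional low spenders and the frequent high spenders fall into separate groups, which is the kind of segmentation a marketing team could target. Production tools would also scale the variables and choose the number of clusters more carefully.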
An interesting alternative to make the DM and BI system work better is suggested by Lau,
Lee, Ho, and Lam in their article Mining the Web for Business Intelligence: Homepage Analysis
in the Internet Era. They propose that the business intelligence side of data mining is only
accomplishable through the construction of a “dictionary”. This dictionary will have “concepts”,
such as demographics, stage of life cycle, hobbies/interests, wealth/purchasing power, etc., which
will then be matched to desirable characteristics that the company is looking for. This use of
heuristics was applied to web trawling of homepages and other public domains in their study,
producing a dictionary of 80,750 keywords/phrases that yielded a significant correlation
between user-defined criteria of desirable characteristics and the corresponding concepts as
defined by the dictionary. This “dictionary” idea can be applied to other businesses, to help them
find useful relationships using DM techniques that will assist in decision making.
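
The "dictionary" idea can be sketched as a lookup from concepts to the keywords that signal them. The tiny dictionary, the sample homepage text, the two-keyword threshold, and the function name match_concepts below are hypothetical; the study's actual dictionary and matching method are not reproduced here.

```python
import re

# Hypothetical miniature "dictionary": concept -> keywords that signal it.
CONCEPT_DICTIONARY = {
    "golf enthusiast": {"golf", "handicap", "fairway"},
    "new parent": {"newborn", "stroller", "daycare"},
    "frequent traveller": {"airport", "passport", "itinerary"},
}


def match_concepts(page_text, dictionary=CONCEPT_DICTIONARY, min_hits=2):
    """Return concepts for which at least min_hits distinct keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    hits = {concept: len(words & keywords) for concept, keywords in dictionary.items()}
    return sorted(concept for concept, count in hits.items() if count >= min_hits)


homepage = "Welcome! I post my golf handicap, fairway photos, and the odd airport story."
matched = match_concepts(homepage)
print(matched)
```

Requiring more than one keyword per concept is a simple guard against chance matches; a company would tune both the dictionary and the threshold to the characteristics it is looking for.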
Possible vendors to use after the above points have been addressed include SAS, Oracle,
SPSS, Cognos, MicroStrategy, and Microsoft. Each of these vendors has its own strengths
and ideal system scale. Discussion of specific software is outside the scope of this paper.
The Ideal Candidate for DM/BI
DM and BI lend themselves naturally to information-intensive industries. These
industries, in general, have “large, well-integrated data warehouses and a well-defined
understanding of the business process within which data mining is to be applied (such as
customer prospecting, retention, campaign management, and so on)” (Chaterjee). Examples of
such industries include credit card, transportation, large consumer packaged goods, and
pharmaceutical companies. If a company within these industries wishes to determine whether it
would be worthwhile to implement a DM/BI system, it will have numerous
success stories and experienced vendors to help with the endeavour. This does not
preclude companies in other industries from obtaining a DM/BI system, but they may require
more specialist help in obtaining the data they require and then implementing a
system that fulfils their needs.
How DM and BI Impact the Accounting Profession
Public Practice
DM and BI offer promise to auditors in aiding them to run more efficient audits. The most
obvious usage would be to connect computer-assisted auditing techniques (CAATs) software to
DM and BI software so auditors can obtain the same metrics and reports management are getting,
and thus, be able to understand the business better and design a better audit. DM and CAATs can
also work together to allow auditors to better see the relationships between different events,
transactions, and accounts so management explanations can be corroborated and explored in
more detail. Predictive data can also be obtained from DM and BI systems, which will give
auditors yet another tool to corroborate management-produced, future-oriented reports.
Ellis, in his Data Mining and Business Intelligence: Where Will it Lead Us, identifies
some more detailed implications of DM and BI on audits. He feels that auditors will gain more
predictive capabilities and/or anomaly reporting by using these systems. They can use the system
to help find duplicate, large or similar payments from multiple vendors, missing
invoices/cheques, among other items. Moreover, if information formatted in XBRL, a standard for
reporting financial statement data, is put through a DM and BI system, more meaningful analysis
may result, since the analysis program has data with meaningful descriptions with which to find
relationships: higher quality data will yield better analysis. These relationships can then be
given to internal and external auditors to help them identify areas of strength and weakness.
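
Two of the audit tests mentioned above, finding duplicate payments and finding missing cheques, can be sketched in Python as follows. The record layout, the matching rule (same amount and same invoice number), and the function names are assumptions for illustration; they do not describe any particular CAATs package.

```python
from collections import defaultdict


def find_duplicate_payments(payments):
    """Group payments by (amount, invoice number) and return groups with more than one entry.

    payments: list of dicts with hypothetical keys 'id', 'vendor', 'invoice', 'amount'.
    """
    groups = defaultdict(list)
    for p in payments:
        groups[(p["amount"], p["invoice"])].append(p["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def find_missing_cheques(cheque_numbers):
    """Return cheque numbers absent from an otherwise continuous numeric sequence."""
    if not cheque_numbers:
        return []
    present = set(cheque_numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


payments = [
    {"id": 1, "vendor": "Acme Ltd", "invoice": "INV-100", "amount": 500.00},
    {"id": 2, "vendor": "Acme Limited", "invoice": "INV-100", "amount": 500.00},
    {"id": 3, "vendor": "Bolt Inc", "invoice": "INV-207", "amount": 75.50},
]
duplicates = find_duplicate_payments(payments)
missing = find_missing_cheques([1001, 1002, 1004, 1007])
print(duplicates)
print(missing)
```

Matching on amount and invoice number rather than vendor name lets the test catch the same invoice paid under two slightly different vendor records. The flagged items are leads for follow-up, not conclusions.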
There is yet another possibility for public accountants – helping companies implement
DM and BI systems. The consulting business is a lucrative business line for accountants. A
professional accountant’s know-how about entity controls, reporting, and decision making
requirements coupled with a working knowledge of IT systems make them invaluable to
companies who wish to have a DM and/or BI system. Care must be taken so that accounting
firms who provide these consulting services are not involved in the audit as well since it would
be a self-review and conflict of interest situation. It is against the rules of accounting institutes in
multiple jurisdictions to provide both assurance and consulting services for the same system.
Industry and Government Practice
Accountants are often in management positions, possibly in charge of the accounting
department or perhaps in a more general leadership position in the organization. As such, they are
instrumental in decision-making. That is where DM and BI make a significant impact on
accountants in industry. Wu, in his article Business Intelligence: The Value in Data Mining,
outlines a number of practical uses of DM. They include fraud detection, inventory logistics,
defect analysis, and focused hiring.
Fraud detection is useful to accountants in management positions to prevent fraud given
that it is an obvious issue in good financial stewardship. One DM technique is neural networks,
which base their analysis on known fraudulent activities/methodologies. These will in turn produce
reports that predict if, when, and where fraud has taken or will take place. Accountants can then
follow up on these findings to investigate whether a fraud has taken place and, if it is due to a
control weakness, how the weakness can be reduced or eliminated. This is an invaluable tool to
investigate fraud. Professor L. Robinson of the University of Waterloo, a forensic accountant, has
stated that fraud is hard to detect, and often missed, since relationships in the control environment
are hard to visualize when paired with relatively immaterial amounts that are involved in the
fraud. She states that sometimes, the most minute clue such as a $10 entry can be the only fraud
indicator accountants can pick up in their investigation. But more often than not, this $10 entry
can be traced to all the other fraudulent activities. It can be the key to unravelling the fraud. DM
provides accountants the tools to find this $10 entry to detect fraudulent activities and take the
appropriate action.
Inventory logistics is also an important area where DM can help accountants in
management positions. Customers’ wants and needs have to be fulfilled. One need they have is to
find the goods they want to buy on the shelf. Consequently, having incorrect merchandise on the
shelf costs retailers a lot of money since customers will not buy the product, which results in
the goods sitting on the shelf for long periods of time, costs being incurred to ship the
goods back to the warehouse/manufacturer, or the product being written off due to obsolescence.
DM can find the relationship between demographics, location, and buying patterns so companies
can identify hot items at a particular store and stock these items more frequently.
Searching for the source of an error can be a time consuming endeavour. Defect analysis
done by DM can be the answer. DM can help identify characteristics that defective products have
in areas such as the component used, individuals who have worked on them, the production run,
among other indicators. Once the causal characteristic has been identified, the problem can be
solved. Having good defect analysis programs will help the company’s reputation since
there will be fewer warranty claims and disgruntled customers due to defective products, as well as
fewer returns from dissatisfied customers.
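
A very simple form of the defect analysis described above is to compare defect rates across the values of each candidate characteristic. The sketch below does this in Python; the production records, attribute names, and the function name defect_rate_by are invented for illustration.

```python
from collections import defaultdict


def defect_rate_by(units, attribute):
    """Return {attribute value: share of units with that value that were defective}."""
    totals = defaultdict(int)
    defects = defaultdict(int)
    for unit in units:
        value = unit[attribute]
        totals[value] += 1
        if unit["defective"]:
            defects[value] += 1
    return {value: defects[value] / totals[value] for value in totals}


# Hypothetical production records.
units = [
    {"component": "A", "run": 1, "defective": False},
    {"component": "A", "run": 2, "defective": False},
    {"component": "B", "run": 1, "defective": True},
    {"component": "B", "run": 2, "defective": True},
    {"component": "B", "run": 2, "defective": False},
    {"component": "A", "run": 1, "defective": False},
]
by_component = defect_rate_by(units, "component")
by_run = defect_rate_by(units, "run")
print(by_component)
print(by_run)
```

In this sample the defect rate differs sharply by component but not by production run, which would point the investigation toward component B. A high rate shows an association only; confirming the cause still requires follow-up.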
An excellent example, given in Lamont’s Business Intelligence: The Text Analysis
Strategy, is the case of Honda (America). Using SAS Text Miner to monitor warranty claims,
Honda was able to detect early signs of engineering problems. The data sources for the program
were gleaned from technician feedback, call centres, and data from the dealer network. Both
top-down and bottom-up methods can be used in SAS Text Miner. Managers can use the bottom-up
approach to identify problems they should be looking at, versus the top-down approach, where
they are looking for a cause for the problem.
Human resources is a key competitive advantage in the current world. Thus, having good
staff is one of the things accountants are always striving for. Using DM techniques to find
characteristics of top performing individuals, accountants and other people in management
positions can increase the likelihood that their new hires will be star performers like the ideal
employees modelled by the DM system. Characteristics that might be used include education,
professional certification, experience, skills, and personality traits.
It is important to note that many of the applications for DM and BI stated above use
historical trends – looking at past and present characteristics and indicators that might predict
current and future performance. As such, changes in economic, social, or environmental conditions
may render the analysis provided by the DM and BI systems incorrect. Professionals who
choose to use these systems must consider the above changes and adjust their decisions
accordingly.
IT Professionals Making DM/BI a “Tech Tool”
Since DM and BI depend on information technology to function, many of these systems
have been turned into IT projects with technical feasibility and jargon dominating the job. Focus
was diverted to the “reporting, query tools, multidimensional analysis, and OLAP tools”
(Gruman). This in turn can have a negative impact on the end-users of the system, the decision
makers, since the system is focused on technology versus providing information to help choose a
more informed course of action. IT professionals cannot be the dominating force behind the
implementation and running of DM and BI systems. Decision makers, such as accountants, must
be the designers of the system, communicating what is required so that they can perform their job
better. The IT professionals should only be there to facilitate this vision.
General Comments
Assuming DM/BI does provide benefits to corporations, it provides interesting
implications for accountants in general. It is obvious that accountants in managerial positions will
benefit from timely, accurate, complete, and valid information from good DM/BI systems to help
them with their decision-making. But would DM/BI systems help auditors? Computer-Assisted
Auditing Techniques (CAATs) are often used to increase efficiency and to minimize costs in an
audit as well as provide assurance in areas where there are gaps in the audit program that only
computerized testing can fill. But with the advent of advanced BI/DM and CAATs systems, is it
possible that audits can be almost fully automated, assuming the CAATs program can identify all
the relevant information it requires from the client DM/BI system? Audits comprise inspection,
observation, enquiry, confirmation, recalculation, reperformance, and analytical review. Out of
the seven evidence methods mentioned above, only inspection and observation cannot be
done exclusively by a computer (assuming the DM/BI system has everything auditors need to
enquire about), though it should be noted that computers can assist in these two methods. It is
possible that auditors will be relegated to physically inspecting sites and observing the processes,
which is essentially a junior auditor’s job. Senior auditors may be primarily concerned with the
system’s integrity and making sure the CAATs system is properly set up to interface with the
DM/BI system.
Some of the data mining methodologies may also infringe on the privacy rights of the
consumers. Did the consumers consent to their information being used in such a way? Canadian
legislation in the form of PIPEDA and FIPPA protects consumer data from unauthorized use.
Does data mining from information obtained through sales transactions infringe on the right to
privacy? Do consent forms have to be obtained before the data is used? What about Lau’s idea of
data mining public domain sites (personal websites)? Is it allowed because the owner has chosen
to put the information in the public domain? DM/BI has many privacy issues that must be
resolved before corporations decide to use specific data that might not be legal to obtain and
analyze.
Conclusion
Many businesses, such as the airlines and the auto sector, depend on DM and BI
technologies to keep their businesses running efficiently. Accountants, often holding
decision-making roles, should make use of their competencies and experience to help their respective
organizations, whether in public practice, industry, or government, decide whether a DM and/or a BI
system is right for them. Many issues are still unresolved in this area. Care must be taken with these
sensitive issues to ensure they are resolved both ethically and legally.
Annotated Bibliography:
Author: Bannan, Karen
Title of Article: 12 Tips for Generating Rich Data
Periodical/website: Customer Relationship Management;
http://proquest.umi.com/pqdweb?did=887969611&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 9, Issue 9
Edition/Date accessed: September 2005 Edition; accessed May 13, 2007
Year: 2005
Pages: 34-39
Location, data base, website: Online; ABI Inform
Annotation: The 12 tips recommended by the author include:
1. Share data with caution.
2. Look beyond transactional data (buy demographic and psychographic data, or do your own
market research).
3. Clean your data regularly.
4. Distribute data at every level.
5. Fund training and relearning.
6. Balance server space with strong analysis (13 months’ worth, 3 years’ worth of contact data,
key points like when they became a customer, last marketed/bought from you).
7. Aggregate, do not delete.
8. Standardize whenever possible.
9. Talk to your users often.
10. Get executive buy-in.
11. Create a continuity plan for your data.
12. Treat your partners like employees (build good relationships with the vendors).

Author: Burns, Michael
Title of Article: Business Intelligence Survey
Periodical/website: CA Magazine;
http://proquest.umi.com/pqdweb?did=865601201&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 138, Issue 5
Edition/Date accessed: Jun/July 2005 Edition; accessed May 13, 2007
Year: 2005
Pages: 18
Location, data base, website: Online; ABI Inform
Annotation: BI tools take the mechanics out of the process of getting data and analyzing it.
They are usually flexible so that analysts can “slice and dice data any way they want.” There is
also only one version of the truth, since the data being analyzed is centralized and not on
individual desktop computers.
There are 3 major flaws to BI:
1. Many IT departments assume that once a data warehouse is built, users will immediately use
it and see its benefits.
2. Spreadsheets are relied on extensively and possibly exclusively (users do not use BI systems
but custom-made spreadsheets, possibly with re-keying and calculation mistakes; the
spreadsheets may also be outdated).
3. Data quality.

Author: Chaterjee, Jagadish
Title of Article: Using Data Mining for Business Intelligence
Periodical/website: MS SQL Server;
http://www.aspfree.com/c/a/MSSQLServer/UsingData-Mining-forBusinessIntelligence/
Vol./No.: Online
Edition/Date accessed: January 24; accessed May 13, 2007
Year: 2005
Pages: N/A
Location, data base, website: www.aspfree.com
Annotation: Data mining is ready to be used since the three technologies that support it are
mature enough: massive data collection, powerful multiprocessor computers, and data mining
algorithms.
Information-intensive industries are the ideal candidates for data mining, and they have jumped
at the opportunity to use it. They have been successful due to having “large, well-integrated
data warehouses and a well-defined understanding of the business process within which data
mining is to be applied (such as customer prospecting, retention, campaign management, and so
on).” Examples include credit card companies, transportation companies, large consumer packaged
goods companies, and pharmaceutical companies.

Author: Cullen, Michael and Allcock, Neil
Title of Article: Data Mining – Making Data Intelligent
Periodical/website: www.deloitte.co.uk; www.deloitte.co.uk/data (search function)
Vol./No.: N/A
Edition/Date accessed: N/A; accessed May 16, 2007
Year: 2007
Pages: N/A
Location, data base, website: www.deloitte.co.uk
Annotation: The Data Management Team uses the following techniques:
1. Data Visualisation: “dynamic graphical analysis to facilitate the understanding of patterns
and relationships in data” (e.g. interpreting complex relationships within multidimensional
data)
2. Cluster Analysis: “identifying distinct groups of items within large data sets that display
similar characteristics” (e.g. marketing to target groups that are most likely to respond)
3. Factor Analysis: “factor analysis identifies similar groups of characteristics” (e.g. fraud
detection, helping to make predictive models that detect fraud)
4. Propensity Modelling: “Gives each customer a probability score showing the likelihood of the
customer behaving in a certain way” (e.g. customer management to help with retention rates)
5. Decision Trees: “Identify groups of customers who behave in a similar way while at the same
time showing the drivers of that behaviour” (e.g. fraud detection, by seeing which groups are
affected and what drives the fraud)
6. Artificial Neural Networks: “nonlinear predictive models that learn through training and
resemble biological neural networks in structure” (e.g. useful for many things, including
medical treatments, traffic flows, and detecting patterns in fraudulent credit card usage)
The approach to data management falls into three main categories: data risk management, data
governance, and data management technology.

Author: Ellis, Doug
Title of Article: Data Mining and Business Intelligence: Where Will it Lead Us?
Periodical/website: Infotech Update;
http://proquest.umi.com/pqdweb?did=768176921&Fmt=3&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 13, Issue 6
Edition/Date accessed: Nov/Dec 2004 Edition; accessed May 13, 2007
Year: 2004
Pages: 1-3
Location, data base, website: Online; ABI Inform
Annotation: Reporting on data held in relational tables or databases using On-Line Analytical
Processing (OLAP) is a form of business intelligence. OLAP also allows users to look beyond the
summary level, and “drill up and drill down” to the level of detail required for the analysis.
Data mining tools, with or without BI, add predictive capabilities and/or anomaly reporting.
Data mining offers auditors an added bonus by helping them find duplicate, large, or similar
payments from multiple vendors and missing invoices/cheques, among other related items.
BI and data mining can be used in conjunction with XBRL, which will improve the quality of the
information analyzed since there are set standards for how the reports are created
(www.xbrl.org).
There is a movement toward products that standardize reporting solutions around a common
platform to “minimize data movement, decrease maintenance costs, lower training costs, minimize
duplication of data and its underlying data structures.”

Author: Froelich, Josh, Ananyan, Sergei, and Olson, David L.
Title of Article: Business Intelligence Through Text Mining
Periodical/website: Business Intelligence Journal;
http://proquest.umi.com/pqdweb?did=795851671&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 10, Issue 1
Edition/Date accessed: Winter 2005 Edition; accessed May 13, 2007
Year: 2005
Pages: 43-50
Location, data base, website: Online; ABI Inform
Annotation: Text mining software gives users the ability to spot patterns by putting the data
in visual form, forming models to identify or confirm relationships, and using drill-down query
tools to focus on specific areas. This is aided by report generation tools. Mechanical,
organizational, and behavioural problems can be spotted in a comprehensive and timely manner
using text mining. The example used in this article is the airline industry.
Steps to do meaningful text mining are:
1. Pre-process data to the format needed for further analysis
2. Extract important concepts and terms through initial text analysis
3. Write a narrative analysis to identify patterns and co-occurrences of identified concepts
4. Develop an automated solution
5. Build a taxonomy using Narrative Summaries (meaningful groups)

Author: Griffin, Jane
Title of Article: Putting the “Business” back into Business Intelligence Initiatives
Periodical/website: Deloitte Consulting LLP; www.deloitte.com (search function)
Vol./No.: N/A
Edition/Date accessed: February; accessed May 16, 2007
Year: 2007
Pages: N/A
Location, data base, website: www.deloitte.com (USA)
Annotation: Business Intelligence initiatives are veering off course because organizations lose
sight of the business objectives and let IT run things. Ways to fix this include:
1. “Align every BI initiative with the company’s strategic goals and objectives.” Information
is only useful if it improves business performance somehow.
2. Get the right information to the right people. This is done by picking the right technology
and integrating the information into it.
3. Examine the business’ critical processes, then see whether they align with the strategic
processes if BI is used. It is critical to have the “processes supporting providing the correct
information, as well as the intelligent use of that information.”

Author: Gruman, Galen
Title of Article: Rethinking Business Intelligence
Periodical/website: InfoWorld;
http://www.infoworld.com/archives/emailPrint.jsp?R=printThis&A=/article/07/04/02/14FEbizintel_3.html
Vol./No.: N/A
Edition/Date accessed: N/A; accessed May 13, 2007
Year: 2007
Pages: N/A
Location, data base, website: Online; http://infoworld.com
Annotation: IDC analyst Dan Vesset states that “BI’s hit-or-miss ROI lies not with the
technology itself but with the fundamental disconnect” between IT’s interpretation that BI is
“reporting, query tools, multidimensional analysis, OLAP tools, and maybe data mining” and the
end-users’ view that BI means anything that supports their decisions. The end-users are right.
Treating BI as a set of technologies will send most organizations off track. What organizations
need to do is to get “a better understanding of the underlying data and business requirements.”
Focusing on the Core
“The best strategy is to reduce the data sources to those that serve well-defined business
objectives.”
Some data will be dirty, so the analysis must take that into account and find a compensating
methodology to ensure the data remains meaningful.
Downsizing Solutions
Focus on one BI system before creating other ones. It will save time by reducing the effort
spent reconciling the differences between two BI systems.
Data can also be analyzed even if it’s not compiled into a data warehouse. The existing
relationships can be useful as well.
Pushing BI Closer to Operations
Add BI functionality to applications that already collect data, which will then produce trends
and anomaly listings for managers quickly.
Author: Havenstein, Heather
Title of Article: Survey: Cost Overruns, Delays Mar Most Data Warehousing Projects
Periodical/website: Computerworld;
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=data_warehousing&articleId=9013783&taxonomyId=55&intsrc=kc_top
Vol./No.: Online
Edition/Date accessed: March 20; accessed July 22, 2007
Year: 2007
Pages: N/A
Location, data base, website: Online; www.computerworld.com
Annotation: Teksouth Corp., a company that helps businesses implement data warehouses,
conducted a survey of professionals in that field at Gartner’s Business Intelligence Summit.
They found that efforts to eliminate time and cost overruns in business intelligence and data
warehousing projects are mostly unsuccessful.
As a precaution, 62% of those surveyed factor delays and cost overruns into budgets. 67% are
“forced to scale back their project or request additional funding to finish their project due
to cost overruns” and have run into unanticipated problems while “designing and implementing
their data warehouse”. Moreover, 44% cited these problems as one of the causes for the delay in
their projects.
Small firms are continually shying away from projects since there are widespread reports about
time and cost overruns. This is despite the fact that they understand the potential benefits.

Author: Hsu, Jeffrey
Title of Article: Data Mining and Business Intelligence: Tools, Technologies, and Applications
Periodical/website: Business Intelligence in the Digital Economy: Opportunities, Limitations
and Risks
Vol./No.: Compilation of Articles by Mahesh Raisinghani
Edition/Date accessed: Idea Group Publishing
Year: 2004
Pages: 141-191
Location, data base, website: Book
Annotation: Businesses do not use the full potential of their data for gaining insight into
their own business, customers, competition, and overall business environment. They should use
DM to extract critical and useful patterns, associations, relationships, and knowledge from
their data. Hsu’s article discusses benefits and capabilities of DM.

Author: King, Julia
Title of Article: Better Decisions
Periodical/website: Computerworld;
http://proquest.umi.com/pqdweb?did=901296701&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 39, Issue 38
Edition/Date accessed: September 2005 Edition; accessed May 13, 2007
Year: 2005
Pages: 48-50
Location, data base, website: Online; ABI Inform
Annotation: Frontline workers can make use of business intelligence knowledge, but there are
pitfalls that should be avoided. They are:
1. Business events (e.g. what is inventory? Do scraps count?) are defined in different ways
across the enterprise.
2. Users do not know how to work the system and find data that is useful to the way business is
done, even if the BI software is easy.
3. No concrete BI goals are set, so the data is taken out of context.
4. Processes paralyze the analysis.
5. The balance of power is upset when the data is available to everybody; subordinates may
question the superior’s decision (especially a problem in hierarchical organizations).

Author: Lamont, Judith
Title of Article: Business Intelligence: The Text Analysis Strategy
Periodical/website: KM World;
http://proquest.umi.com/pqdweb?did=1162460421&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 15, Issue 10
Edition/Date accessed: N/A; accessed May 13, 2007
Year: 2006
Pages: 8-9, 30
Location, data base, website: Online; ABI Inform
Annotation: Both structured and unstructured data can prove fruitful for decision-making. An
example of software that can analyze such data is SAS Text Miner.
Honda (America) uses SAS Text Miner to monitor warranty claims. It helps them detect early signs
of engineering problems. The data is gleaned from technician feedback, call centres, and other
data from the dealer network. This is an example of a bottom-up approach, where analysts look at
the data to see what it is telling them. This contrasts with the top-down approach, where users
have to set a direction for what they are looking for.

Author: Lau, Kin-nam, Lee, Kam-hon, Ho, Ying, and Lam, Pong-yuen
Title of Article: Mining the Web for Business Intelligence: Homepage Analysis in the Internet
Era
Periodical/website: Journal of Database Marketing & Customer Strategy Management;
http://proquest.umi.com/pqdweb?did=807279411&Fmt=4&clientId=16746&RQT=309&VName=PQD
Vol./No.: Vol. 12, Issue 1
Edition/Date accessed: N/A; accessed May 13, 2007
Year: 2004
Pages: 32-54
Location, data base, website: Online; ABI Inform
Annotation: Data mining information on websites is a good opportunity for marketers to gain
insight into customer preferences and acquire customers through this knowledge. This is
accomplished by using search engines to obtain the information through Internet searches (web
crawling) and then organizing the data into a database.
The authors of this document believe that the business intelligence side of the data mining can
only be accomplished through the construction of a dictionary. Their attempt at making one
totals 80,750 keywords/phrases. Even with such a dictionary, success is still uncertain, but
gaining “Key aspects of personal information (labelled as 'concepts'), (general demographics,
stage of life cycle, hobbies/interests, wealth/purchasing power, etc.)” can still be done. Note
that this is limited by the fact that many website owners do not post all pertinent information.

Author: Wu, Jonathan
Title of Article: Business Intelligence: The Value in Data Mining
Periodical/website: DMReview.com; http://www.dmreview.com/article_sub.cfm?articleId=4618
Vol./No.: N/A
Edition/Date accessed: February 1; accessed May 14, 2007
Year: 2002
Pages: N/A
Location, data base, website: Online; www.dmreview.com
Annotation: Practical uses of data mining include:
Fraud Detection:
Uses sophisticated data mining techniques called neural networks, which base their analysis on
known fraudulent activities (predictive result).
Inventory Logistics:
Incorrect merchandise on the shelf (not what the customer wants) costs retailers a lot of money.
Combining demographic data with information on which products the different groups buy can help
stores identify hot items and stock them more frequently.
Defect Analysis:
Helps identify characteristics that defective products have (production run, component used,
individuals working on it, etc.). Will also help with reputation and reduce return materials
allowances and field service recalls.
Focused Hiring:
Find characteristics of top-performing individuals, like education, years of experience, skills
and personality traits, and find similar individuals to hire. This is a history-based approach,
and may not be “indicative of future top-performing individuals due to changes in social,
economic and environmental conditions.”