Final Review and Study Guide
MIS2502, Spring 2011
Section 03
1. BI Mental Map
[Figure: BI mental map. Labels recovered from the diagram: Competitive Advantage, Performance, Better Understanding, Good Business Decision, Data Mining, External Source, Data Warehouse, Product No., Product Name, Price, MySQL, ERD, Customer No., Name, Address, Membership, Description]
Exam
• Be able to explain the BI mental map in your own words.
• Fully understand the role, usage, and importance of ERD, SQL, and data mining in this map.
2. OLAP
• OLAP (On-Line Analytical Processing)
– Multidimensional data analysis techniques
– Explains why particular business events have occurred and/or forecasts what may occur in the future
• Slice, Dice, Pivot, and Drill Down/Up
• Questions expected on transaction data from an operational system
– Who purchased a particular product?
– How much did an employee get paid?
– How many units of a product were manufactured?
VS.
• Questions expected on OLAP
– What are the total sales for each product?
– What are the total sales for each department?
– Which salesperson has sold the most?
– Which products does each salesperson sell the most of?
– In which month did most of the sales occur?
Multidimensional View of Sales
• Multidimensional analysis involves viewing data
simultaneously categorized along potentially many
dimensions
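The OLAP-style questions above can be sketched in a few lines of Python. This is only an illustration: the rows, names, and amounts below are made up, and the helper functions `total_by` and `pivot` are hypothetical, not part of any tool used in class.

```python
from collections import defaultdict

# Hypothetical transaction rows: (salesperson, product, month, amount)
sales = [
    ("Ann", "Laptop", "Jan", 1200),
    ("Ann", "Phone", "Jan", 600),
    ("Bob", "Laptop", "Feb", 1100),
    ("Bob", "Phone", "Feb", 650),
    ("Ann", "Laptop", "Feb", 1300),
]

def total_by(rows, key_index):
    """Roll up the amount along one dimension (e.g. product or month)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)

def pivot(rows, row_index, col_index):
    """Cross-tabulate the amount by two dimensions, like a pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_index]][row[col_index]] += row[3]
    return {r: dict(cols) for r, cols in table.items()}

# "What are the total sales for each product?"
sales_by_product = total_by(sales, 1)
# "In which month did most of the sales occur?"
sales_by_month = total_by(sales, 2)
best_month = max(sales_by_month, key=sales_by_month.get)
# "Which salesperson has sold the most?"
sales_by_person = total_by(sales, 0)
top_salesperson = max(sales_by_person, key=sales_by_person.get)
# "Which products does each salesperson sell the most of?"
person_by_product = pivot(sales, 0, 1)

print(sales_by_product)
print(best_month, top_salesperson)
print(person_by_product)
```

Each transaction-level question reads a single row, while each OLAP question aggregates many rows along one or two dimensions. The pivot is simply the two-dimensional version of the same roll-up.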
Exam
• The role of OLAP.
• The kinds of questions that OLAP can address.
• Be able to recall our pivoting practice.
• For a given pivoting result table, be able to explain the results in your own words.
3. Data Mining
• Seeks to discover patterns or relationships within the
data
• Data mining tools automatically search data for
patterns and relationships
• Data mining tools
– Analyze data
– Uncover problems or opportunities
– Form computer models based on findings
– Predict business behavior with models
– Require minimal end-user intervention
Exam
• Know what data mining can do
• Be able to differentiate data mining tasks from
other tasks.
– E.g.: computing the total sales of a company.
• Q: Is this a data mining task?
• A: No. This is a simple accounting task.
3.1. Data Exploration
• This is the first step of any data analysis task.
– Data understanding
– Data validation
• You should be familiar with some basic data exploration techniques:
– Descriptive analysis (mean, max, min…)
– Histogram
– Plot
– Pie chart
– …
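As a rough illustration of descriptive analysis and a histogram, here is a minimal Python sketch. The order amounts are invented for the example, and the `histogram` helper is a hypothetical name. The point is that a quick summary often exposes a value that needs validating.

```python
import statistics
from collections import Counter

# Hypothetical order amounts; the last value looks suspicious.
amounts = [12, 15, 11, 14, 13, 15, 250]

# Descriptive analysis: a handful of summary numbers.
summary = {
    "count": len(amounts),
    "mean": statistics.mean(amounts),
    "median": statistics.median(amounts),
    "min": min(amounts),
    "max": max(amounts),
}

def histogram(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

bins = histogram(amounts, 50)

print(summary)
# Text histogram: one '#' per value in the bin.
for start, n in bins.items():
    print(f"{start}-{start + 49}: {'#' * n}")
```

The mean is pulled far above the median by the single large value, and the histogram shows that value sitting alone in its own bin. Both are hints that the data should be checked before any further analysis.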
3.2 Clustering and Segmentation
• Unsupervised classification: grouping of cases based
on similarities in input values.
• For the exam
– You should be able to differentiate this technique from others.
– You should be able to explain the logic behind setting the
number of groups. This is a subjective decision, but the number
of groups shouldn't be higher than 10 in most cases. E.g.
• Three versions: Windows 7 Home, Windows 7 Pro, Windows 7 Enterprise
Vs.
Only one version: Mac OS X
– Segmentation helps us profile the clusters. You need to be able
to describe the features of each group based on the segmentation
results.
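The grouping-then-profiling idea can be sketched with a tiny one-dimensional k-means. This is an assumption-laden illustration, not the tool used in class: the spending figures, the starting centers, and the function name `kmeans_1d` are all made up. The number of starting centers is the subjective "number of groups" choice.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means: assign each value to its nearest center, then move
    each center to the mean of its members. len(centers) is the number
    of groups chosen by the analyst."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical yearly spending per customer.
spending = [100, 120, 110, 500, 520, 480, 1500, 1600]
centers, groups = kmeans_1d(spending, centers=[0, 700, 2000])

# Segmentation step: profile each cluster so it can be described in words.
profiles = [
    {"size": len(g), "min": min(g), "max": max(g), "mean": sum(g) / len(g)}
    for g in groups if g
]

for p in profiles:
    print(p)
```

No labels are given to the algorithm, which is what makes this unsupervised. The profiles are what you would describe on the exam, for example "a small group of customers who spend about 1,550 per year".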
3.3. Association Rules
• A legend: Beer and Nappies
– Men who have children and who (have to) do the shopping
on Saturdays often buy nappies for their little ones
alongside the beer for weekend evenings in front of the
television. The superstore therefore decided to
position the pallets of beer beside those of nappies on
Saturdays, and sales figures rose strongly.
• For the exam:
– Based on a given result, identify the best rule and
explain it in your own words.
– Whether the rule is symmetric is also important.
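To make "best rule" and "symmetric" concrete, the sketch below computes support, confidence, and lift for the beer and nappies rule in both directions. The baskets are invented, and these three measures are the commonly used ones; the tool used in class may report slightly different columns.

```python
# Hypothetical shopping baskets; each set is one transaction.
baskets = [
    {"beer", "nappies"},
    {"beer", "nappies", "crisps"},
    {"beer", "nappies"},
    {"beer", "crisps"},
    {"beer"},
    {"nappies", "milk"},
    {"milk"},
    {"milk", "bread"},
    {"bread"},
    {"crisps", "bread"},
]

def support(items):
    """Share of all baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)

def confidence(lhs, rhs):
    """Of the baskets containing `lhs`, the share that also contain `rhs`."""
    return support(lhs | rhs) / support(lhs)

def lift(lhs, rhs):
    """How much more often `rhs` appears with `lhs` than it does overall."""
    return confidence(lhs, rhs) / support(rhs)

beer, nappies = {"beer"}, {"nappies"}
conf_beer_to_nappies = confidence(beer, nappies)
conf_nappies_to_beer = confidence(nappies, beer)
lift_beer_nappies = lift(beer, nappies)

print(support(beer | nappies), conf_beer_to_nappies, conf_nappies_to_beer, lift_beer_nappies)
```

In this made-up data the two confidences differ, so the rule is not symmetric in confidence: nappies buyers pick up beer more reliably than beer buyers pick up nappies. Lift is the same in both directions, and a value above 1 means the two items appear together more often than chance would suggest.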
3.4. Regression
• Regression models the variance in our variable of interest (the dependent
variable) to find its relationship with the independent factors.
• For the exam
– How to explain R square?
• A critical criterion for model performance. R square lies between 0 and 1. It represents the
percentage of variance in our interest that the model can explain.
– How to explain P value?
• A critical criterion for factor influence. A p-value always lies between 0 and
1. When the p-value of a factor is smaller than 0.1, we say this factor has a
significant influence on our interest.
– How to explain Estimate Coefficient?
• You only need to consider the estimate coefficient when the corresponding p-value is
less than 0.1. To explain it: every one-unit increase in this factor results in a change of
[estimate coefficient] in our interest, holding the other factors constant.
– All factors used to estimate our interest should be independent of one another!
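The estimate coefficient and R square can be computed by hand for a single factor. The sketch below uses made-up advertising and sales figures and ordinary least squares. It deliberately leaves out the p-value, which needs a t-distribution and is normally read from the statistics tool's output.

```python
def fit_line(x, y):
    """Ordinary least squares for one factor: y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y that the fitted line explains."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total

# Hypothetical data: advertising spend (factor) and sales (our interest).
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 6, 9, 11]

intercept, slope = fit_line(ad_spend, sales)
r2 = r_squared(ad_spend, sales, intercept, slope)

print(f"estimate coefficient: {slope:.2f}")
print(f"R square: {r2:.3f}")
```

Here the estimate coefficient is the slope: each extra unit of advertising spend goes with about that many extra units of sales. R square is the fraction of the variation in sales that the line accounts for.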
4. Three Levels of Strategies
• To finish any task or solve a problem, you have
three levels of strategies:
– Manage to do it by yourself
– Hire someone to do it for you
– Design a mechanism and have a group of people do
it, possibly for FREE.
• Collective intelligence falls in the third level and
is critically important to IT business success.
Example
• TASK – to capture more customers. Design your three types of strategies:
– 1. Direct marketing by yourself.
• E.g., individual job seeking
– 2. Hire a salesperson or marketing agent.
• E.g., most companies
– 3. Viral marketing mechanism, word-of-mouth, affiliate program.
• E.g., Facebook friend invite, Ponzi scheme, Hotmail footer, Google word-of-mouth
• For the exam, you should be able to design three levels of mechanisms for
a specific task.
• In the future, for any recurring task, please try your best to design the
third level of strategy. If you can do it, you can succeed in your area.
This is not a dream or a legend, but the smartest choice, one that very few people
consider.
Collective Intelligence
• Explain why collective intelligence does or does not lead
to better decisions.
• Describe the key issues in implementing collective
intelligence.
• Provide an example of why managers need to consider
many key issues when designing a collective
intelligence tool, from loss of control to the balance
of diversity.
Resume Tips
• Entry level jobs:
– Business Analyst or Data Analyst. (SQL and analysis techniques. A major
in finance is a strong plus for this position.)
– Marketing Analyst. (SQL, clustering and segmentation, association rules,
regression. A major in marketing is a strong plus.)
– Database Management. (SQL, ERD. A major in MIS is a strong plus.)
• In addition, this course provides a very good knowledge base for future
– Skills: PHP, Java, Use Case
– Positions: product manager, project manager, database developer
• Skill Bullets:
– Business intelligence and data analytics:
ERD, SQL, OLAP, data mining (clustering, segmentation, association
rules, basket analysis, decision trees, regression)