Data Science Approach to Analysis Client Transactions in Telecommunication Marketing
Mitch Bos and Michael Emmerich
Leiden Institute of Advanced Computer Science
[email protected]

OVERVIEW
Quizzes, games, ringtones and adult content are all part of the telecommunication marketing business, in which thousands of transactions are made on a daily basis. What can be done with all this data to come up with pricing schemes, predict future profits and target specific markets? Using different data mining techniques, we will analyse these transactions and try to find out.

THE DATA
To find relations and patterns in the telecommunication marketing business, we use data from the company Telefuture. The data consists of millions of payment transactions. Customers subscribe to a specific service, and the information from their payments is stored. Every payment is a separate transaction. Each payment has the following parameters: the service, the time of payment, the status of the payment, the country in which the payment was made and the amount. The status tells us whether the payment was completed. At the time of writing, not all of the data I need is available to me yet, but it will be soon. The data will then be processed and ordered so that the analysis techniques can be applied to it.

RESEARCH QUESTION
Given large data volumes from past transactions of a telecommunication marketing firm: are there patterns in the transaction data that can be used to determine pricing schemes for clients? And how can we make the payment process more streamlined and efficient?

RELEVANT WORK
The concept of Association Rule Mining was first introduced by Agrawal, Imielinski and Swami. It was based on customers who fill a basket with goods from the supermarket. By analysing these baskets, patterns could be found.
This concept is also called Market Basket Analysis. Some time later, Agrawal and Srikant came up with new algorithms to find association rules quickly: Apriori and AprioriTid. A further algorithm, FP-Growth (frequent pattern growth), was introduced by Han, Pei and Yin. An advantage over the Apriori algorithm is that FP-Growth reads the data set only twice. This is the algorithm we will use in our research.

PREDICTIVE ANALYSIS
Predictive analysis is a concept that covers a variety of techniques from predictive modelling, data mining and machine learning. It uses historical data and transactions to forecast future opportunities.

Neural Networks
A neural network is a powerful analysis technique that mimics the neurons of the biological brain. Neural networks learn through training: the inputs and the desired outputs are known, but not the way to get from one to the other.

EXPLORATORY ANALYSIS

Association Rule Mining
The concept of Association Rule Mining was introduced to discover patterns between different products in a large-scale data set of transactions. The best-known example is the rule:

{onions, potatoes} ⇒ {burgers}

This rule means that a customer who buys onions and potatoes is also likely to buy hamburger meat. Businesses can use such rules for different marketing and pricing strategies. To find only the significant rules in the data set, two constraints are used: minimum support and minimum confidence. The support tells how often a specific itemset appears in the data set; the confidence tells how often a rule holds among the transactions that contain its left-hand side. Two steps are taken to find the rules:

1. The minimum support is used to find all of the frequent itemsets.
2. The minimum confidence is used to find the relevant sets and make rules.
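
The two steps of Association Rule Mining (frequent itemsets by minimum support, then rules by minimum confidence) can be sketched in a few lines of Python. This is a minimal brute-force illustration for toy data only; it is not FP-Growth and not the implementation used in this research. The service names, the five example transactions and the two thresholds are hypothetical and chosen only to make the measures concrete.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of services one customer paid for.
transactions = [
    {"quiz", "ringtone"},
    {"quiz", "game"},
    {"quiz", "ringtone", "game"},
    {"ringtone"},
    {"quiz", "ringtone"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)

def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate every itemset and keep those meeting min_support.
    Exponential in the number of items, so suitable for toy data only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result

def rules(frequent, transactions, min_confidence):
    """Step 2: split each frequent itemset into antecedent => consequent
    and keep the rules meeting min_confidence."""
    found = []
    for itemset in frequent:
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                consequent = itemset - set(antecedent)
                c = confidence(antecedent, consequent, transactions)
                if c >= min_confidence:
                    found.append((set(antecedent), set(consequent), c))
    return found

frequent = frequent_itemsets(transactions, min_support=0.4)
for antecedent, consequent, c in rules(frequent, transactions, min_confidence=0.7):
    print(sorted(antecedent), "=>", sorted(consequent), "confidence", c)
```

Confidence is computed from raw counts rather than as a ratio of two support values, which avoids floating-point rounding when a rule sits exactly on the threshold. On real transaction volumes the enumeration in step 1 would be replaced by an algorithm such as Apriori or FP-Growth.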
Figure: A schema of a neural network.

Fraud Detection
By using predictive analysis, it is possible to identify and track fraudulent transactions. In fraud detection, scores are given to different transactions and customers. This way, risky transactions can be identified and filtered out, which at the same time reduces the company's exposure to risk.

Clustering
Clustering is an analysis method that groups a set of items in such a way that items with the same characteristics are placed together. This way it might be possible to find similar characteristics for specific services and see what the different transactions have in common. The concept of clustering cannot be defined precisely, because there are many different goals for clustering. This is also the reason that there are so many clustering algorithms.

k-means clustering