Social Network Mining Tool for Target Advertising: Mining Association of User Profile in Social Network

Sunisa Apibansmut
Assoc. Prof. Kitsana Waiyamai, Ph.D.
Computer Engineering Department, Faculty of Engineering, Kasetsart University
Tel: (662) 942-8555 Ext. 1430 Fax: (662) 579-6245 E-mail: [email protected]

ABSTRACT
We have entered the Web 2.0 era, which emphasizes building communities to enhance information sharing and collaboration among users. This concept is related to "social networks", or virtual communities. Because many businesses have recognized the effectiveness of targeting a small portion of customers for advertising, an approach that can identify potential customers from social networks is needed. In this direction, we develop Mine-Ads, a software tool for mining social networks that helps advertisers search for new targeted customer groups. Mine-Ads is divided into three parts. The first part collects user profiles from social networks and parses them into a database. The second part uses an interactive graph to visualize user networks and provides a variety of options for showing relationships among users. The final part integrates two data mining techniques, clustering and association rule discovery, to discover new knowledge from social networks. The discovered knowledge, in the form of clusters and association rules, is used to determine where best to target customers.

Keywords: Social network, Target advertising, Association rules, Data mining, Clustering

1. INTRODUCTION
1.1 BACKGROUND AND PROBLEM STATEMENT
Target advertising is now a crucial method of advertising. In the past, surveys were the main way to obtain consumer information, but they were time-consuming and very expensive. Some marketing campaigns are unprofitable because they do not penetrate the market. This project therefore offers a possible new way to help businesses identify potential target customers.
Crucial marketing decisions depend on how much businesses know about their customers and potential customers. Since social networks became very popular, many developers have built software that uses social networks to find the relationships and properties of target consumers. However, most of this software is commercial and very expensive, so small businesses cannot afford it. For this reason we built a freeware tool, A Software Tool to Mining Social Network for Target Advertising, to help small businesses in our country find target consumers as efficiently as large businesses do.

1.2 SCOPE OF WORK
Data collection phase
- Crawl user profile information from www.myfri3nd.com and parse the data into a database.
- Use only MySQL for the database.
Analysis phase
- Mine association rules that represent the relationships among the characteristics of Myfri3nd users by using Weka, and present them as association rules.
- Search for prospective target groups from the website Myfri3nd.com.
- Represent interesting trends in user preferences as bar charts.

2. MATERIAL REVIEW AND SOLUTION TECHNIQUE
2.1 DATA MINING AND KNOWLEDGE DISCOVERY
Data mining is about analyzing data and finding hidden patterns using automatic or semiautomatic means. During the past decade, large volumes of data have been accumulated and stored in databases. As a result, organizations have become data-rich and knowledge-poor. The main purpose of data mining is to extract patterns from the data at hand, increase its intrinsic value, and turn the data into knowledge. The term data mining is often applied to two separate processes:
1. Knowledge discovery: provides explicit information that has a readable form and can be understood by a user (e.g., association rule mining).
2. Prediction (predictive modeling): provides predictions of future events and may be transparent and readable in some approaches (e.g., rule-based systems).

2.2 ASSOCIATION RULES DISCOVERY
Association [3] is one of the popular data mining tasks. It is also called market basket analysis. A typical association business problem is to analyze a sales transaction table and identify products that are often sold in the same shopping basket. The common use of association is to identify common sets of items (frequent itemsets) and rules for the purpose of cross-selling. In terms of association, each product or, more generally, each attribute/value pair is considered an item. Figure 1 shows that the association task has two phases: finding frequent itemsets and finding association rules.

Figure 1: The two-step process of the association rule discovery algorithm

Most association-type algorithms find frequent itemsets by scanning the dataset multiple times. The frequency threshold (support) is defined by the user before processing the model. For example, support = 2% means that the model analyzes only items that appear in at least 2% of shopping carts. A frequent itemset may look like {Product = “Pepsi”, Product = “Chips”, Product = “Juice”}. Each itemset has a size, which is the number of items that it contains. The size of this particular itemset is 3.

2.3 WEKA
Weka [8] is a popular suite of machine learning software written in Java, developed at the University of Waikato. Weka is free software available under the GNU General Public License. The Weka workbench contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to this functionality.

2.4 SOCIAL NETWORK
A social network is a social structure made of individuals or organizations that are connected through various familiarities, ranging from casual acquaintance to close familial bonds. On the Internet, social networking refers to a category of applications that connect friends, business partners, or other individuals together using a variety of tools. For examples of social networking sites, see [9] and Figure 2.

Figure 2: User interface of www.myfri3nd.com

2.4.1 SOCIAL NETWORK ANALYSIS
Social network analysis (SNA) is the mapping and measuring of relationships and flows between people, groups, organizations, computers, web sites, and other information/knowledge processing entities. The nodes in the network are the people and groups, while the links show relationships or flows between the nodes. SNA provides both a visual and a mathematical analysis of human relationships. Management consultants use this methodology with their business clients and call it Organizational Network Analysis [10].

3 ANALYSIS & DESIGN
3.1 SYSTEM ARCHITECTURE
The developed system is separated into two parts, as shown in Figure 3.
- Part I (Social Network Cohesive Subgroup Identification) discovers interesting subgroups in www.myfri3nd.com, which is a large social network.
- Part II (Mining Association of User Profile in Social Network) analyzes the relationships and interesting trends among social network users and displays the results in forms that are easy to understand, such as graphs and tables.
This work focuses only on Part II.

System process in Part II
1. Receive cluster groups from the Part I program, or receive cleansed data from the database.
2. Create an ARFF file from the selected database.
3. Analyze the ARFF file by using association rule discovery.
4. Display association rules and frequent itemsets.
5. Summarize user behavior in the form of graphs.

3.2 DESIGN
3.2.1 DESIGN ARCHITECTURE
The design of this project follows Figure 4 and is separated into 3 phases.
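
As a rough illustration of steps 2 to 4 of the Part II system process, the following Python sketch writes a small set of profile records as ARFF text and then counts frequent itemsets and association rules by brute force. It is only a sketch under stated assumptions: the attribute names, the sample records, and the helper names (`to_arff`, `frequent_itemsets`, `association_rules`) are hypothetical and are not taken from the Myfri3nd database or from Mine-Ads, and the counting loop stands in for, and is not, the algorithm that Weka runs.

```python
from itertools import combinations

# Hypothetical profile records: attribute names and values are illustrative
# only and do not come from the Myfri3nd database.
ATTRIBUTES = ["gender", "age_group", "likes"]
ROWS = [
    ("female", "young", "music"),
    ("female", "young", "music"),
    ("male", "young", "games"),
    ("female", "adult", "music"),
    ("male", "adult", "games"),
]


def to_arff(relation, attributes, rows):
    """Render rows as ARFF text, declaring every column as a nominal attribute."""
    lines = ["@relation %s" % relation, ""]
    for index, name in enumerate(attributes):
        values = sorted({row[index] for row in rows})
        lines.append("@attribute %s {%s}" % (name, ",".join(values)))
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


def frequent_itemsets(attributes, rows, min_support):
    """Return {itemset: support} for every itemset that meets min_support.

    An item is an (attribute, value) pair. Support is the fraction of rows
    that contain all items of the itemset.
    """
    transactions = [frozenset(zip(attributes, row)) for row in rows]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(attributes) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            hits = sum(itemset <= t for t in transactions)
            support = hits / len(transactions)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            # No frequent itemset of this size, so none of a larger size exists.
            break
    return result


def association_rules(freq, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    Confidence of X -> Y is support(X and Y together) / support(X).
    """
    rules = []
    for itemset, support in freq.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                left = frozenset(antecedent)
                confidence = support / freq[left]
                if confidence >= min_confidence:
                    rules.append((left, itemset - left, support, confidence))
    return rules
```

Calling `to_arff("profile", ATTRIBUTES, ROWS)` gives text that could be saved as a `.arff` file, while `frequent_itemsets(ATTRIBUTES, ROWS, 0.4)` followed by `association_rules(freq, 0.9)` lists rules such as gender=female implying likes=music for this toy data. A real run would read the records from the cleansed database instead of a literal list.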
Figure 4: System architecture

Data collection phase: A crawler is implemented and integrated to download data from the social network, the Myfri3nd website. The data are then parsed and stored in the database. The database consists of 4 tables: the User-Profile table, the Friend table, the Time table, and the Liking table. Only the Profile table and the Liking table are selected for cleansing before the ARFF files and the roll-up file are created, as shown in Figure 5.

Figure 3: System process
Figure 5: ER diagram of the cleansed database

Analysis phase: ARFF files are created and used with Weka to mine association rules that represent the relationships among the characteristics of Myfri3nd users, as shown in Figure 6. User behavior is then summarized in the form of graphs, as detailed in Figure 7.

3.2.2 DATABASE AND IMPORT FILE
The database in this project is built with MySQL and stores Myfri3nd user profiles, Myfri3nd likes, and Myfri3nd likes (roll up). See Table 3.1.

Table 3.1: Profile attributes table

3.2.3 ARFF FILE
The project has 3 ARFF files, as follows:
1) Profile.arff: an ARFF file created from all user profiles. See Figure 8 for more detail.

Figure 6: Association rule output
Figure 7: Summary of user behavior in the form of graph
Figure 8: Example of profile.arff

4 IMPLEMENTATION
This project uses Weka for mining association rules, as shown in Figure 9. This section emphasizes the back-end phase and shows an example of the process as follows.

Figure 9: This diagram shows the steps of association rules mining

5 EXPERIMENT AND EVALUATION
This section summarizes the results of a questionnaire evaluating the program. The questionnaire was answered by about 20 Software and Knowledge students (see Figure 10), and the tool is compared with other related software in Figure 11.

Figure 10: Benefit evaluation of Mine-Ads
Figure 11: Comparison of Mine-Ads with other related systems

6 ACKNOWLEDGEMENT
The author would like to thank the advisor, Assistant Professor Kitsana Waiyamai, Ph.D., and Mr. Eakasit Pacharawongsakda for their suggestions and guidance, which helped this project proceed smoothly. Thanks also go to the Faculty of Engineering, which provided the research fund, and to NSC (National Software Contest). Finally, thanks to my family for always giving me inspiration and taking care of me.

7 REFERENCE
[1] http://www.positioningmag.com/search/default.aspx?search=social%20network
[2] http://www.quantum3.co.za/CI%20Glossary.htm#S
[3] ZhaoHui T., and Jamie M., “Data Mining with SQL Server 2005,” Wiley Publishing, Inc., 2005. 2-8.
[4] http://technet.microsoft.com/en-us/library/ms174949.aspx
[5] Jiawei Han and Micheline Kamber, “Data Mining (Concepts and Techniques)”, 2nd Edition
[6] Pavel Berkhin, Accrue Software, Inc., “Survey of Clustering Data Mining Techniques”
[7] http://en.wikipedia.org/wiki/Data_clustering
[8] http://en.wikipedia.org/wiki/Weka_(machine_learning)
[9] http://opencontent.wgbh.org/report/glossary.html
[10] http://www.orgnet.com/sna.html
[11] http://en.wikipedia.org/wiki/Social_network
[12] Part II of project: Mining association among dimensions of user profile in social network