Social Network Mining Tool for Target Advertising:
Mining Association of User Profile in Social Network
Sunisa Apibansmut
Assoc. Prof. Kitsana Waiyamai, Ph.D.
Computer Engineering Department, Faculty of Engineering, Kasetsart University
Tel :(662)942-8555 Ext. 1430 Fax: (662)579-6245 E-mail: [email protected]
ABSTRACT
Nowadays, we have entered the Web 2.0 era, which
emphasizes creating communities to enhance
information sharing and collaboration among users.
This concept relates to the "social network", or
virtual community. Meanwhile, many businesses have
recognized the effectiveness of targeting a small
portion of customers for advertising, so an approach
that can identify potential customers from social
networks is needed. In this direction, we develop
Mine-Ads, a software tool for mining social networks
that can help advertisers search for new targeted
customer groups. Mine-Ads is divided into three
parts. The first part collects user profiles from
social networks and parses them into a database.
The second part uses an interactive graph to
visualize user networks and provides a variety of
options to show relationships among users. The final
part integrates two data mining techniques,
clustering and association rules discovery, to
discover new knowledge from social networks. The
discovered knowledge, in the form of clusters and
association rules, is used to determine where best
to target customers.
Keywords: Social network, Target advertising,
Association rules, Data mining, Clustering
1. INTRODUCTION
1.1 BACKGROUND AND PROBLEM STATEMENT
Nowadays, target advertising is a crucial method for
creating advertising. In the past, surveying was the
way to obtain consumer information, but it was
time-consuming and very expensive for businesses.
Some marketing campaigns are unprofitable because
they do not penetrate the market. This project may
therefore offer a new way to help businesses identify
potential target customers. Crucial marketing
decisions depend on how much businesses know about
their customers and potential customers. Since
social networks became very popular, many developers
have built software for finding the relationships
and properties of target consumers by using social
networks. However, most of this software is
commercial and very expensive, so small businesses
cannot access it. For this reason, we built a
freeware tool, A Software Tool to Mining Social
Network for Target Advertising, to support small
businesses in our country in finding target
consumers as efficiently as large businesses.
1.2 SCOPE OF WORK

Data collection Phase
- Crawl user profile information from
  www.myfri3nd.com and then parse the data into a
  database.
- Use only MySQL for the database.
Analysis Phase
- Mine association rules that represent the
  relationships among the characteristics of
  Myfri3nd users by using Weka, and present them as
  association rules.
- Search for prospective target groups on the
  Myfri3nd.com website.
- Represent interesting trends in user preference as
  bar charts.
2. MATERIAL REVIEW AND
SOLUTION TECHNIQUE
2.1 DATA MINING AND KNOWLEDGE
DISCOVERY
Data mining is about analyzing data and
finding hidden patterns using automatic or
semiautomatic means.
During the past decade, large volumes of
data have been accumulated and stored in
databases. The result of this data collection is that
organizations have become data-rich and
knowledge-poor.
The main purpose of data mining is to
extract patterns from the data at hand, increase its
intrinsic value, and transform the data into
knowledge. The term data mining is often applied to
two separate processes:
1. Knowledge discovery: provides explicit
information that has a readable form and can be
understood by a user (e.g., association rules
mining).
2. Prediction, or predictive modeling: provides
predictions of future events and may be transparent
and readable in some approaches (e.g., rule-based
systems).
2.2 ASSOCIATION RULES DISCOVERY
Association [3] is one of the popular data
mining tasks. Association is also called market
basket analysis. A typical association business
problem is to analyze a sales transaction table and
identify those products often sold in the same
shopping basket. The common usage of association
is to identify common sets of items (frequent
itemsets) and rules for the purpose of cross-selling.
In terms of association, each product, or
more generally, each attribute/value pair, is
considered an item. Figure 1 shows that the
association task has two phases: finding frequent
itemsets and finding association rules.
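The two phases can be illustrated with a minimal Python sketch. This is not the Weka implementation used in this project; it is a simple level-wise search written only for illustration. The basket data, the function names, and the threshold values are hypothetical, and the confidence measure used in the second phase (support of the whole itemset divided by support of the left-hand side) is the standard definition, which the text above does not spell out.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Phase 1: return {itemset: support} for every itemset whose
    support (fraction of transactions containing it) >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    size = 1
    while candidates:
        # One scan of the dataset per itemset size.
        level = {}
        for candidate in candidates:
            count = sum(1 for t in transactions if candidate <= t)
            if count / n >= min_support:
                level[candidate] = count / n
        result.update(level)
        # Build larger candidates only from items that are still frequent,
        # and keep those whose smaller subsets are all frequent.
        size += 1
        survivors = sorted({item for itemset in level for item in itemset})
        candidates = [
            frozenset(combo)
            for combo in combinations(survivors, size)
            if all(frozenset(sub) in level
                   for sub in combinations(combo, size - 1))
        ]
    return result


def association_rules(itemsets, min_confidence):
    """Phase 2: derive rules (lhs, rhs, support, confidence) from the
    frequent itemsets found in phase 1."""
    rules = []
    for itemset, support in itemsets.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = support / itemsets[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical shopping baskets, for illustration only.
baskets = [
    {"Pepsi", "Chips", "Juice"},
    {"Pepsi", "Chips"},
    {"Pepsi", "Juice"},
    {"Chips", "Juice"},
    {"Pepsi", "Chips", "Juice"},
]
itemsets = frequent_itemsets(baskets, min_support=0.4)
rules = association_rules(itemsets, min_confidence=0.6)
```

With these baskets, every single item, every pair, and the full three-item set meet the 40% support threshold, and the rules that survive the confidence threshold are the candidates for cross-selling.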
Figure 1: The two-step process of the association
rule discovery algorithm
Most association-type algorithms find
frequent itemsets by scanning the dataset multiple
times. The frequency threshold (support) is defined
by the user before processing the model. For
example, support = 2% means that the model
analyzes only items that appear in at least 2% of
shopping carts. A frequent itemset may look like
{Product = “Pepsi”, Product = “Chips”, Product =
“Juice”}. Each itemset has a size, which is the
number of items that it contains. The size of this
particular itemset is 3.
2.3 WEKA
Weka [8] is a popular suite of machine
learning software written in Java, developed at the
University of Waikato. Weka is free software
available under the GNU General Public License.
The Weka workbench contains a collection of
visualization tools and algorithms for data analysis
and predictive modeling, together with graphical
user interfaces for easy access to this functionality.
2.4 SOCIAL NETWORK
A social network is a social structure made of
individuals or organizations that are connected
through various familiarities ranging from casual
acquaintance to close familial bonds. On the
Internet, social networking refers to a category of
applications that connect friends, business
partners, or other individuals together using a
variety of tools. Examples of social networking
sites include [9]; see more detail in Figure 2.
Figure 2 : User Interface of www.myfri3nd.com
2.4.1 SOCIAL NETWORK ANALYSIS
Social network analysis (SNA) is the
mapping and measuring of relationships and flows
between people, groups, organizations, computers,
web sites, and other information/knowledge
processing entities. The nodes in the network are the
people and groups while the links show relationships
or flows between the nodes. SNA provides both a
visual and a mathematical analysis of human
relationships. Management consultants use this
methodology with their business clients and call it
Organizational Network Analysis [10].
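As a small illustration of the mathematical side of SNA, the Python sketch below stores people as nodes and friendships as links, then computes degree centrality, one common measure of how connected each node is. The names and friend links are hypothetical, and degree centrality is chosen here only as an example; the text above does not say which measures this project uses.

```python
from collections import defaultdict


def build_network(links):
    """Store an undirected network as {node: set of neighbouring nodes}."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def degree_centrality(neighbors):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical friend links between four users.
friend_links = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
network = build_network(friend_links)
centrality = degree_centrality(network)
most_connected = max(centrality, key=centrality.get)
```

Here "ann" is linked to all three other users, so she has the highest centrality and would be the first node to highlight in a visualization of this small network.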
3 ANALYSIS & DESIGN
3.1 SYSTEM ARCHITECTURE
The developed system is separated into
two parts, as shown in Figure 3.
- Part I (Social Network Cohesive Subgroup
  Identification) discovers interesting subgroups
  from www.myfri3nd.com, which is a large social
  network.
- Part II (Mining association of user profile in
  social network) analyzes the relationships and
  interesting trends of the social network users and
  displays the results in forms that are easy to
  understand, such as graphs and tables.
This work focuses only on Part II.
System process in Part II
1. Receive cluster groups from the first program, or
receive cleansed data from the database.
2. Create an arff file from the selected database.
3. Analyze the arff file by using association rule
discovery.
4. Display association rules and frequent
itemsets.
5. Summarize user behavior in the form of a
graph.
Figure 3 : System process
3.2 DESIGN
3.2.1 DESIGN ARCHITECTURE
This project is designed following Figure 4, which
separates the work into 3 phases.
Figure 4 : System architecture
- Data collection Phase: A crawler is implemented
  and integrated for downloading data from the
  social network, the MyFri3nd website; the data is
  then parsed and collected into a database. The
  database consists of 4 tables: the User-Profile
  table, Friend table, Time table, and Liking table.
  Only the Profile table and the Liking table are
  selected for cleansing before making the arff
  files and the roll-up file, as shown in Figure 5.
- Analysis Phase: Arff files are created and used
  for mining association rules that represent the
  relationships among the characteristics of
  Myfri3nd users by using Weka, as shown in
  Figure 6, and user behavior is summarized in the
  form of a graph. See more detail in Figure 7.
Figure 5 : ER-Diagram of cleansing database
3.2.2 Database and import file
The database in this project is built with MySQL
and stores Myfri3nd user profiles, Myfri3nd likes,
and Myfri3nd likes (roll up). See Table 3.1.
Table 3.1: Profile attributes table
3.2.3 Arff file
The project has 3 arff files, as follows:
1) Profile.arff: an arff file created from all user
profiles. See more detail in Figure 8.
Figure 8 : Example of profile.arff
Figure 6 : Association rule output
Figure 7 : Summary of user behavior in the
form of graph
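To make the arff step concrete, here is a minimal Python sketch that writes rows into the ARFF text format that Weka reads. It is an illustration under assumptions, not the project's actual exporter: the function name, the attribute names (gender, age_group), and the sample rows are hypothetical, and the real attributes are those listed in Table 3.1. All attributes are declared as nominal because Weka's association rule learners work on nominal data, and a missing value is written as "?".

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text.

    attributes: list of (name, list of allowed nominal values)
    rows: list of tuples, one value per attribute; None marks a missing value
    """
    lines = ["@relation " + relation, ""]
    for name, values in attributes:
        # Nominal attribute declaration, e.g. @attribute gender {male,female}
        lines.append("@attribute " + name + " {" + ",".join(values) + "}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if value is None else value for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical profile attributes and rows, for illustration only.
profile_attributes = [
    ("gender", ["male", "female"]),
    ("age_group", ["teen", "adult"]),
]
profile_rows = [("male", "teen"), ("female", None)]
arff_text = to_arff("profile", profile_attributes, profile_rows)
```

The resulting text can be saved as a .arff file and opened in Weka for the association rule discovery step.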
4 IMPLEMENTATION
This project uses WEKA for mining
association rules; see Figure 9.
Figure 9: This diagram shows the steps of
association rules mining
This section emphasizes the Backend
phase and shows an example of the process as
follows.
5 EXPERIMENT AND EVALUATION
This section summarizes the questionnaire
evaluation of the program. The questionnaire
results were collected from about 20 Software and
Knowledge students (see Figure 10) and are compared
with other related software (see Figure 11).
Figure 10: Benefit evaluation of Mine-Ads
Figure 11: Comparison of Mine-Ads with
other related systems
6 ACKNOWLEDGEMENT
The writer would like to thank the
advisor, Assistant Professor Kitsana Waiyamai,
Ph.D., and Mr. Eakasit Pacharawongsakda for their
suggestions and guidelines, which made this project
go smoothly. Thanks also go to the Faculty of
Engineering, which provided the research fund, and
to the NSC (National Software Contest). Finally,
thanks to my family, who always give me inspiration
and take care of me.
7 REFERENCE
[1] http://www.positioningmag.com/search/default.aspx?search=social%20network.
[2] http://www.quantum3.co.za/CI%20Glossary.htm#S
[3] ZhaoHui T., and Jamie M., “Data Mining with
SQL Server 2005,” Wiley Publishing, Inc., 2005.
2-8.
[4] http://technet.microsoft.com/enus/library/ms174949.aspx
[5] Jiawei Han and Micheline Kamber, “Data
Mining (Concepts and Techniques)”, 2nd Edition
[6] Pavel Berkhin, Accrue Software, Inc., “Survey
of Clustering Data Mining Techniques”
[7] http://en.wikipedia.org/wiki/Data_clustering
[8] http://en.wikipedia.org/wiki/Weka_(machine_learning)
[9] http://opencontent.wgbh.org/report/glossary.html
[10] http://www.orgnet.com/sna.html
[11] http://en.wikipedia.org/wiki/Social_network
[12] Part II of project: Mining association among
dimensions of user profile in social network