ISSN No: 2309-4893
International Journal of Advanced Engineering and Global Technology
I
Vol-04, Issue-03, May 2016
PREDICTION OF CRM USING REGRESSION MODELLING
Aroushi Sharma#1, Ayush Gandhi#2, Anupam Kumar#3
#1, 2 Students, Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
#3 Assistant Prof., Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
1 [email protected]
2 [email protected]
3 [email protected]
Abstract- Regression analysis can be applied to improve customer experience and
to retain customers for an organization. This predictive analytics technique is
proposed to predict laptop sales from various customer-related attributes such as
base, width, processor configuration, date of purchase, the individual's income
and the price of the product. This paper proposes an implementation of regression
analysis for prediction in CRM (Customer Relationship Management). Applying
regression analysis to this data identifies the attributes that actually drive
laptop sales, so that the business can focus on those attributes to improve its
quality and profits.
Keywords- CRM, Forecasting with Regression, Logistic Regression, Predictive
Analytics, Regression Analysis, Segmented Regression
I. INTRODUCTION
Customer Relationship Management (CRM) is an ERP (Enterprise Resource Planning)
component that deals with managing an organization's current and prospective
customers. CRM analyses the historical data of a customer to determine the
elements that could help retain the customer and therefore improve the business.
CRM also helps bring new business to the organization through a technique called
'lead management'. Leads are potential customers who can do business with the
organization. The historical data that helps CRM in customer retention is
obtained via various activities such as reviews, telephone, the company's
website, email feedback, social media and many more. Adopting this technique may
lead to favoritism towards a particular group of people, but this issue can be
resolved by efficiently utilizing the CRM technique and its related approaches.
The paper proposes a regression model for CRM in order to retain the customers
of any business.
II. MOTIVATIONS
Sales prediction is the backbone of a
business plan and it is a major
requirement for each and every
organization these days. Companies
measure a business and its growth by
sales, and sales prediction sets the
standard for expenses, growth and
profit.
III. OBJECTIVES
Here we apply the concepts of regression analysis to predict laptop sales by
analyzing the relationships among the various customer-related attributes. In
this paper we propose a regression modeling technique that deals with the
correlation and association between statistical variables; the variables taken
here are treated in a symmetric way. The steps of regression modeling are: first,
develop a CRM model that collects and analyzes data and targets the desired
customer by finding relationships between customer attributes; then generate the
regression model; then apply regression modeling for data mining; and finally
generate results. We follow this approach by discussing each step and analyzing
the results, predicting the future by examining relationships among the various
data sets.
Predictive analytics
Predictive analytics is one of several data mining strategies that can be used
to generate and gather information from data. The information collected can be
used to predict behavioral patterns or current trends. It has applications in
crime detection and investigation and in the identification of suspects; for
example, fraudulent credit card users can be tracked with this technique. It can
be applied to predict any unknown event, whether it belongs to the past, present,
or future. Predictive analytics uses information collected from past experience
to evaluate the relationships between the explanatory (predictor) variables and
the predicted variables. These relationships help to predict those unknown
events.
Predictive modeling, data mining and machine learning are important components
of predictive analytics. These components help analyze previous and present
facts to make predictions about future events. The accuracy and usefulness of
the results, however, depend on various assumptions, such as no outliers, no
multicollinearity, linearity and normality, and of course on the quality of the
analysis being done.
Regression Model
The issue above can be addressed by applying regression modeling, a statistical
data mining technique for CRM that analyses continuous-valued attributes. It is
used to estimate the probability values associated with the cells of a data
cube. The more attributes there are, the more dimensions the cuboid has; these
dimensions are mapped to the attributes of the collected data set. The
dimensions can further be reduced to 3-D and 2-D cuboids for a particular set of
attributes, and higher-order cuboids can thus be built from lower-order cuboids.
This is the building block of most of our regression model, which includes the
following steps:
Regression Analysis
The data can be analyzed with the
help of statistical analytic technique.
These techniques include linear regression, which is one of the simplest forms
of regression. It is used to model the relationship between a random variable Y,
known as the response variable, and another variable X, known as the predictor
variable, where this relationship is assumed to be linear. The linear regression
equation is therefore:
y = a + bx
where the variance of Y is assumed to be constant, and a and b are regression
coefficients that specify the Y-intercept and the slope of the line,
respectively. The coefficients can be solved for by the method of least squares,
which minimizes the error between the actual data and the estimated line, where
\[ b = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}, \qquad a = \frac{\sum Y - b\sum X}{N} \]
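As an illustration of the least squares formulas for the slope b and intercept a, the following minimal Python sketch computes both coefficients directly from the sums. The function name `least_squares_fit` and the sample data are hypothetical and are not taken from the paper's dataset.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line y = a + b*x using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = (sum_y - b * sum_x) / n                # intercept
    return a, b


# Hypothetical illustrative data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares_fit(xs, ys)
print(f"intercept a = {a}, slope b = {b}")
```

Because the sample points lie exactly on a line, the recovered coefficients match that line; with real, noisy data the same formulas give the line that minimizes the sum of squared residuals.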
Logistic Regression
Logistic regression is a type of regression model in which the dependent
variable (DV) is categorical. Logistic regression was developed by the
statistician David Cox in 1958 (although much work was done on the
single-independent-variable case almost two decades earlier). The binary
logistic model is used to estimate the probability of a binary response based on
one or more predictor (or independent) variables. As such it is not merely a
classification method; it could be called a qualitative response/discrete choice
model in the terminology of economics.
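To make the binary logistic model concrete, here is a minimal sketch that fits P(y = 1 | x) = sigmoid(w0 + w1 x) for a single predictor by batch gradient descent on the log-loss. The fitting method, the learning rate, the number of epochs and the income/purchase data are all assumptions chosen for illustration; the paper does not specify how a logistic model would be fitted.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1|x) = sigmoid(w0 + w1*x) by batch gradient descent."""
    w0, w1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w0 + w1 * x) - y   # gradient of log-loss w.r.t. the logit
            g0 += err
            g1 += err * x
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
    return w0, w1


# Hypothetical data: income (arbitrary units) and whether a laptop was bought
incomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bought = [0, 0, 0, 1, 0, 1, 1, 1]
w0, w1 = fit_logistic(incomes, bought)
p_low = sigmoid(w0 + w1 * 1.5)    # estimated purchase probability, low income
p_high = sigmoid(w0 + w1 * 7.5)   # estimated purchase probability, high income
print(f"P(buy | income=1.5) = {p_low:.3f}, P(buy | income=7.5) = {p_high:.3f}")
```

The output of the model is a probability rather than a hard class label, which is why the text describes it as more than a classification method; a label is obtained only after choosing a threshold such as 0.5.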
Segmented Regression
Segmented regression is a technique in regression analysis in which the
predictor variable is partitioned into intervals and a separate line segment is
fitted to each interval. Segmented regression is useful when the independent
variables, clustered into different groups, exhibit different relationships
between the variables in these regions. Segmented regression analysis can also
be performed on multivariate data by partitioning the various independent
variables present. The boundaries between the segments are known as breakpoints.
Segmented linear regression is the variant in which the relationships within the
intervals are estimated using linear regression.
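A minimal sketch of segmented linear regression with one breakpoint follows. It assumes the breakpoint is known in advance and fits an independent least squares line on each side; estimating the breakpoint itself is outside this sketch. The function names and the data are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares (intercept, slope) through a list of (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2 = sum(x * x for x, _ in points)
    slope = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_segmented(points, breakpoint):
    """Fit one line to points with x <= breakpoint and another to the rest."""
    left = [p for p in points if p[0] <= breakpoint]
    right = [p for p in points if p[0] > breakpoint]
    return fit_line(left), fit_line(right)


# Hypothetical data: slope 2 up to x = 4, then slope 1 afterwards
data = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 9), (6, 10), (7, 11), (8, 12)]
left_fit, right_fit = fit_segmented(data, breakpoint=4)
print("left segment (intercept, slope):", left_fit)
print("right segment (intercept, slope):", right_fit)
```

Fitting the segments independently is the simplest choice; it does not force the two lines to meet at the breakpoint, which some formulations of segmented regression require.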
Forecasting with Regression
We can use an equation to easily generate forecasts from a simple linear model:
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]
Here x is the value of the predictor variable for which we are trying to
forecast. If we provide some value of x in this equation, we obtain the
corresponding forecast \(\hat{y}\).
A 'fitted value' is the value of \(\hat{y}\) obtained when the calculation uses
an observed value of x from the dataset. This is not a genuine forecast, because
the actual value of y for that predictor value was used in estimating the model,
and so the value of \(\hat{y}\) is affected by the true value of y. For
\(\hat{y}\) to be a genuine forecast, x must take a new value, that is, one that
was not part of the data used to estimate the model.
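The distinction between fitted values and genuine forecasts can be sketched as follows. The helper names and the small dataset are hypothetical; the coefficients are estimated by ordinary least squares in mean-centered form.

```python
def fit_simple_linear(xs, ys):
    """Estimate (beta0, beta1) of y = beta0 + beta1*x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


def predict(beta0, beta1, x):
    """Evaluate the estimated line at x."""
    return beta0 + beta1 * x


# Hypothetical observations used to estimate the model
xs = [10, 20, 30, 40]
ys = [24, 46, 64, 86]
beta0, beta1 = fit_simple_linear(xs, ys)

# Fitted values: x was part of the estimation data
fitted = [predict(beta0, beta1, x) for x in xs]

# Genuine forecast: x = 50 was not part of the estimation data
forecast = predict(beta0, beta1, 50)
print("fitted values:", fitted)
print("forecast at x = 50:", forecast)
```

The fitted values are computed at the same x values that shaped the coefficients, so they reflect the observed y values; only the prediction at the unseen x = 50 is a forecast in the sense described above.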
IV. RESULTS
Screenshots
Normality condition
FIGURE 1 BEFORE NORMALIZATION
This is an exponential curve; no logarithmic or exponential transformation has
been applied here. An exponential function is any function in which the variable
is the exponent of a constant.
FIGURE 2 AFTER NORMALIZATION
Normalization here means transforming the predictor's distribution toward a
normal distribution (it is distinct from database normalization, the process of
organizing the columns and tables of a relational database to minimize data
redundancy). This is a bell-shaped curve, showing that a log transformation has
been applied to the predictor's exponential curve. The normal distribution is
the bell curve (or normal curve).
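The effect of a log transformation on an exponentially spread predictor can be illustrated by comparing skewness before and after the transformation. This is a minimal sketch: the synthetic values below are an assumption and are not the paper's data, and skewness is used here only as a simple numeric stand-in for the visual check in the figures.

```python
import math


def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical predictor whose values grow exponentially (strong right skew)
raw = [math.exp(k / 2) for k in range(-6, 7)]

# Log transformation of the same predictor
logged = [math.log(v) for v in raw]

print("skewness before log transform:", skewness(raw))
print("skewness after log transform:", skewness(logged))
```

A skewness close to zero after the transformation indicates a symmetric distribution, which is what the bell-shaped curve shows graphically. Symmetry alone does not establish normality, so this check is a necessary indication rather than a full test.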
Summary
VIF (Variance Inflation Factor) is used to check for the absence of
multicollinearity, an assumption that must hold before the linear regression
technique is applied.
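For predictor j, the VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below covers only the special case of exactly two predictors, where R_j^2 equals the squared Pearson correlation between them; this restriction, the function names and the price/income values are assumptions made to keep the example short.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return math.inf   # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)


# Hypothetical predictors: product price and customer income
price = [30, 35, 40, 45, 50, 55]
income = [20, 40, 30, 60, 50, 70]
vif = vif_two_predictors(price, income)
print(f"VIF = {vif:.3f}")
```

A VIF of 1 means the predictors are uncorrelated. Common rules of thumb treat values above roughly 5 or 10 as a sign of problematic multicollinearity, though the threshold is a convention rather than a strict rule.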
The summary of the estimation done by the Linear Modeling technique (applied using
the lm function) can be seen here.
V. FUTURE WORKS AND CONCLUSION
Linear regression techniques have been applied, and it can be seen how the
target customer can be reached by applying linear regression analysis. Further,
multiple regression analysis is applicable when the model is based on more than
one predictor variable; the additional constants can then be obtained by
applying the least squares method. Non-linear models can also be converted to
linear models. As seen in our approach, a log-linear model can be implemented
with the collected data sets, where the attributes are transformed from
categorical labels to continuous-valued attributes. An iterative technique can
thus be followed to build higher-order cubes from lower-order cubes.
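One common way to convert a non-linear model to a linear one is to take logarithms: the exponential model y = a e^{bx} becomes ln y = ln a + b x, which is linear in x. The sketch below illustrates this particular conversion only; it is an assumed example and not the log-linear model for categorical attributes mentioned above, and the function name and data are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b*x) by least squares regression of ln(y) on x."""
    log_ys = [math.log(y) for y in ys]        # requires every y > 0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)        # back-transform the intercept ln(a)
    return a, b


# Hypothetical data generated exactly from y = 2 * exp(0.5 * x)
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Note that least squares on ln y minimizes errors on the log scale, so with noisy data the result can differ from a direct non-linear fit on the original scale.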