The University of Manchester
Manchester Business School
BSc (Hons) Management (Marketing)
A Data Mining Approach to Enhance
Marketing Communications Practices for
Undergraduate Academic Programmes in the
U.K.
Mihai Rareș Ciurba
Supervisor: Dr Ilias Petrounias
2015
Declaration of Originality
This dissertation is my own original work and has not been submitted for any
assessment or award at University of Manchester or any other university.
Acknowledgements
Firstly, I would like to sincerely thank my supervisor, Ilias. His patience,
guidance and input throughout this dissertation were invaluable.
I would also like to thank Kate Scott for the time she set aside to discuss this
dissertation.
Abstract
Data mining and analytics have started to play a significant role in enhancing business
decision making in recent years, with examples ranging from improving direct marketing
campaigns to managing risk in a credit portfolio. However, their application in improving
marketing communications practices for undergraduate academic programmes is lacking. To
identify the potential benefits of data mining in a university’s student selection process, this
study analysed a data set containing 4,110 student applications for Manchester Business
School. This dissertation presents the relationships that were identified between various
candidate characteristics together with their implications on marketing communication
practices.
It was observed that marketing communication messages can be targeted at student segments
based on the applicant’s domicile, gender, the academic programme he/she applied for and the
response he/she gave the university based on the offer he/she received. The analysis further
revealed suboptimal data management practices which negatively impacted the university’s
ability to draw value from any data mining initiative. In conclusion, data mining provides
significant value in enhancing marketing communication practices for undergraduate
academic programmes. Future studies should consider other attributes, such as a candidate’s
personal statement, to generate further actionable information for higher education
institutions.
Table of Contents
1. Introduction
1.1. Background to the investigation
1.2. Structure of the dissertation
2. Literature review
2.1. Key concepts
2.1.1. Business intelligence
2.1.2. Data mining
2.2. Marketing in the higher education sector
2.3. Marketing communications in the higher education sector
2.3.1. Theoretical background
2.3.2. Existing marketing communications practices in the higher education sector
2.4. Conclusion
3. Methodology
3.1. Data understanding and data preparation
3.2. Data modelling
4. Result analysis and discussion
4.1. Findings – Cluster analysis
4.2. Findings – Association rules
5. Discussion
6. Conclusion
6.1. Summary of Findings
6.2. Wider implications
6.3. Limitations of Study
6.4. Future Research Opportunities
7. References
1. Introduction
1.1. Background to the investigation
Recent scholarship stresses the importance of targeted and personalised marketing
communications, regardless of the nature and scope of the organisation (Vesanen, 2007;
Roberts, 2003). The argument put forward is that by creating content that is relevant to a
receiver (Petty et al., 2000), attention and elaboration are enhanced (Tam and Ho, 2005),
more positive attitudes are developed (Kalyanaraman and Sundar, 2006) and response rates
are increased (Ansari and Mela, 2003).
In fact, personalised marketing communications have grown substantially in
popularity and complexity as a consequence of recent technological advances, such as
greater storage capacity, lower storage costs, advances in statistical sciences
and increased processing power (Linoff and Berry, 2011). These developments have
permitted the storage and processing of terabytes of data of customer activity, allowing
organisations to create an incredibly detailed picture of their marketing performance and
accurately reveal what actually works, in terms of marketing strategy, and what customers
really want (Nichols, 2013).
For instance, online advertising relies heavily on predictive modelling and data
analytics to target the individuals who are more likely to respond to offers with
personalised messages generated from several personal attributes. These
attributes can vary from demographic information to the user’s previous activity online, such
as web pages visited, searches made and clicking and purchasing behaviour (Einav and Levin,
2014; Athey, 2014; IAB, 2015).
More traditional advertising practices have also been transformed by the emergence of
advanced data analytics. Direct marketing campaigns can achieve higher response rates by
identifying the key attributes that led existing customers to respond to the company and
using that knowledge to select the prospective customers who are more likely to reply
(Linoff and Berry, 2011). Similarly, cross-selling can be
a more effective and efficient process through leveraging data analytics, not only to determine
what product to offer to whom, but also when is the right time to make that recommendation
(Kamakura et al., 2003).
Having said this, and building upon Gibbs and Knapp’s (2002, pg. 75) argument that
“every organisation is cast in the role of communicator and promoter”, the importance of
marketing communications for higher education institutions cannot be overlooked. With
multiple target audiences, ranging from prospective students to academic staff, universities
are competing not only on attracting high quality students, but also securing grants from
different research councils (Adams and Smith, 2014). In fact, marketing communications
have recently become a more pressing issue as the limit on the number of students an
institution can enrol on a yearly basis has been lifted (Morgan, 2013). Building upon this
recent development, and the fact that around 62% of the annual revenue a higher education
institution generates comes from tuition fees (Boffey, 2014), marketing communications have
become extremely relevant. Simply stated, a university’s research capabilities and success are
highly dependent on the number of tuition-paying students it can entice to enrol. However, for
those universities that already attract a large number of applicants, the issue is instead
one of securing high quality candidates.
Given that the importance of marketing communications has increased, together with
the fact that profitability, or financial performance, appears to be positively correlated with an
institution’s ability to personalise marketing communications and deliver individually relevant
content (Smith, 2005), why is Manchester Business School not leveraging technological
advances to generate targeted and personalised marketing communications?
The current practices at Manchester Business School in terms of marketing
communications targeted at prospective undergraduate students vary greatly. First of all, the
institution segments its target audience based on the academic programme the candidate is
applying for, such as ‘BSc (Hons) Management’ or ‘BSc (Hons) Accounting’, with no
segmentation taking place at a geographical level. That is to say that prospective students do
not receive different messages if they live in the U.K. or are applying from an overseas
country (e.g. China) (pers. comm. Kate Scott).
In terms of the actual communications that go out to different candidates, they receive
either a subject specific message, particular to their academic programme, or a general one,
catering to all programmes, depending on the month the communication is sent out. For
example, candidates receive an academic programme tailored message in December, and in
the following month, January, they receive a universal one, which is sent out to all applicants
(pers. comm. Kate Scott).
Before further evaluating existing marketing communication practices, it is important
to mention at this point the difference between targeted marketing communications and
personalised marketing communications. This dissertation acknowledges the existence of
personalisation when messages differ amongst each other based on an individual applicant’s
description, which is unique to that candidate. Similarly, if new information is inferred from
these distinctive characteristics and is used to alter marketing communications, that is also an
instance of personalisation. For example, having the name of a candidate included in the
communication is considered an instance of personalisation. This working definition of
personalisation ties in with the academic literature, which considers personalisation as the
process of using customer information to tailor interactions between an organisation and
individual prospects (Vesanen, 2007). At the same time, Roberts (2003) recognises that not
only stated characteristics are important, but also those which are implied or inferred.
On the other hand, targeted marketing communications are concerned with grouping
individuals based on common candidate characteristics and using that information to produce
and transmit messages, which are relevant to that particular segment only (Chaffey, 2011). It
does not look at a personal level per se, instead relying on clusters to identify and select an
appropriate marketing communication mix. For example, gender and age are attributes that
can be included in the grouping process.
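The distinction between the two approaches can be illustrated with a minimal Python sketch. The applicant records, field names and message wording below are hypothetical and are not taken from the Manchester Business School data. The sketch only shows that a targeted message is keyed on an attribute shared by a segment, whereas a personalised message draws on details unique to one applicant.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    # Hypothetical fields, chosen for illustration only.
    name: str
    programme: str
    domicile: str


def targeted_message(applicant: Applicant) -> str:
    """Return the same text for every applicant in a segment (here: programme)."""
    return f"Discover what makes our {applicant.programme} programme distinctive."


def personalised_message(applicant: Applicant) -> str:
    """Return text that uses details unique to the individual applicant."""
    return (
        f"Dear {applicant.name}, thank you for applying to "
        f"{applicant.programme} from {applicant.domicile}."
    )


applicants = [
    Applicant("Alex", "BSc (Hons) Management", "U.K."),
    Applicant("Li", "BSc (Hons) Management", "China"),
]

for applicant in applicants:
    print(targeted_message(applicant))
    print(personalised_message(applicant))
```

Both applicants belong to the same programme segment and therefore receive an identical targeted message, while their personalised messages differ.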
As previously mentioned, Manchester Business School engages in targeted marketing
communications on the basis of the academic programme the prospective student has
applied for. Another instance of targeted communications occurred when the university sent
a postcard wishing good luck with their assessments to U.K. high school students who had
applied to Manchester Business School and were about to take their A-level exams.
Outside these two instances, the university does not carry out any other targeted
marketing communications (pers. comm. Kate Scott).
With respect to the level of personalisation present in each piece of marketing
communications sent, practices seem to be equally limited. Hard copy materials have no
degree of personalisation, while the online version of the same content, which is sent through
email, has the name of each prospective student and the course they are applying for printed at
the beginning of the document. No further personalisation practices were observed (pers.
comm. Kate Scott).
As such, while evidence from the literature states that targeted and personalised
marketing communications are beneficial for an organisation, Manchester
Business School engages in such practices only to a limited extent. To further investigate this behaviour and
suggest recommendations, the author applied data mining techniques, namely cluster analysis
and association rules, to a data set containing 4,110 student applications for the Manchester
Business School.
This dissertation discusses the significance of data mining practices on improving
marketing communications for undergraduate academic programmes and shows how such an
approach can assist the university in developing more targeted, highly personalised, marketing
communications which are relevant to the prospective student. As a consequence the
institution can work towards one of its organisational objectives, to attract high quality
undergraduate students, by putting across its unique selling proposition in a manner which
links directly to the individual candidate.
1.2. Structure of the dissertation
The purpose of this chapter was to familiarise the reader with the research problem
and to set out the contrast between academic research and organisational behaviour.
Chapter 2 will critically evaluate existing literature on targeted and personalised marketing
communications in the higher education sector, with the aim of identifying potential gaps in
the literature. Chapter 3 is concerned with the methodology of the study, including issues such
as data understanding, data pre-processing and the techniques deployed to reveal potentially
meaningful patterns in the data. Chapter 4 presents the results obtained from the data mining
activity and contains the discussion, which elaborates on the individual findings
and their implications for Manchester Business School. Chapter 5 attempts to bring all the
findings together in order to suggest an overall set of recommendations that Manchester
Business School can follow. Finally, Chapter 6, the conclusion, will attempt to extend the
findings of this dissertation to other higher education institutions, but also identify limitations
in the study and potential avenues for future research.
2. Literature review
This chapter examines the relevant literature on marketing
communications in higher education institutions and evaluates existing theories and
practices concerning their degree of personalisation. In particular, it aims to critically analyse
existing research in the subject field, while simultaneously addressing insightful concepts
from within business intelligence and data mining. This dual approach will highlight potential
gaps in existing research, which this study seeks to reduce by applying
knowledge developed in the field of business intelligence. Before relevant
findings from the literature are presented, the key concepts on which the dissertation is based will be
discussed.
2.1. Key concepts
2.1.1. Business intelligence
The term business intelligence (BI) has been defined in multiple ways by different
academics (Stackowiak et al., 2007; Cui et al., 2007; Gorfarelli et al., 2004), but for the
purpose of this dissertation business intelligence represents the gathering of vast amounts of
data in order to provide insights that drive tactical and strategic business decisions
(Stackowiak et al., 2007; Zeng et al., 2006; Gorfarelli et al., 2004). Business intelligence
incorporates a broad category of technologies (Gangadharan and Swamy, 2004), which allow
business users to gather, store, access and analyse data to improve the business decision
making capabilities (Marr, 2014).
It is important to note that business intelligence encompasses a series of systems, such
as the data warehouse, and processes, such as data mining, performance management and
analytical reporting (Negash, 2004; Marr, 2014). However, these processes and technologies
will not be covered in great detail in this dissertation. Instead, the focus is on why an
organisation should carry out business intelligence and what the rewards for doing so are. The
motivation towards and rewards of business intelligence will help shed light on why
Manchester Business School should adopt such an approach.
Building upon Kahaner’s (1996) and Miller’s (2001) understanding, competitive
intelligence is a systematic, and when done responsibly, ethical programme for gathering,
analysing and managing information concerning the business environment an organisation
operates in. Moreover, using the data-information-knowledge-wisdom hierarchy (Weinberger,
2010), it can be argued that competitive intelligence does not deal solely with information, but
with a combination of data, information, knowledge and wisdom. When competitive
intelligence is acted upon, it can confer a competitive advantage on the organisation or improve
its decision making process (Qiu, 2007). However, competitive intelligence looks at the
whole environment, which includes the organisation itself, as opposed to business
intelligence, which focuses exclusively on the organisation. Nevertheless, this dissertation
will focus on the organisational level, without considering factors from the whole
environment. As such, the more Manchester Business School knows about the nature and
preferences of its customers, in this situation the students, the better the strategic
decisions it can make. It is the delivery of customer value and the development
of long term relationships, through carefully targeting students with marketing
communications, which contribute towards the performance of the university’s mission and
form the major driving force towards adopting business intelligence (Payne, 2005; Drucker,
2013).
Having said this, there are clear benefits to an organisation from adopting business
intelligence, and those specific to Manchester Business School will be discussed in the
subsequent section. According to Nucleus Research (2014), companies that spend one dollar
on analytics and business intelligence will see an average return of $13.01, or roughly 1300%.
It is important to also mention non-financial rewards, such as the possibility of making
strategic decisions faster and better, or working more effectively and efficiently towards the
key performance indicators of the firm (Lachlan, 2014). All these factors are important,
because they highlight the usefulness of business intelligence in an organisation and set the
premises for why analytics should be implemented in the process of marketing undergraduate
programmes by universities.
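The arithmetic behind the quoted return can be made explicit with a short Python sketch. The text does not state whether the $13.01 figure is gross of, or net of, the dollar originally spent, so both readings are computed. The function names are illustrative only and do not come from Nucleus Research.

```python
def gross_return_percent(returned: float, spent: float) -> float:
    """Total amount returned, expressed as a percentage of the amount spent."""
    return returned / spent * 100


def net_roi_percent(returned: float, spent: float) -> float:
    """Gain net of the original outlay, as a percentage of the amount spent."""
    return (returned - spent) / spent * 100


spent = 1.00      # one dollar spent on analytics
returned = 13.01  # figure reported by Nucleus Research (2014)

print(f"Gross return: {gross_return_percent(returned, spent):.0f}%")
print(f"Net ROI:      {net_roi_percent(returned, spent):.0f}%")
```

Under the first reading the figure is about 1,301%, and under the second about 1,201%. The rounded 1300% quoted above corresponds most closely to the first reading.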
Thus, the potential rewards that the university will gain by deploying business
intelligence tools in targeting potential students relate to the strategic goals
highlighted in ‘Manchester 2020 – The Strategic Plan for University of Manchester’ (The
University of Manchester, 2011). More specifically, carrying out more effective marketing
communications will enable the university to recruit students from various socio-economic
groups for whom the university’s offerings are relevant, ensuring
a closer match between the university’s value proposition and the student’s needs and wants.
2.1.2. Data mining
Data mining, according to Linoff and Berry (2011, pg. 2) is “a business process for
exploring large amounts of data to discover meaningful patterns and rules”. As such, it is an
ongoing process, which starts with an initial dataset, then through analysis informs action,
which, in turn, generates data that requires more data mining. The fundamental goal of data
mining, from a business standpoint, is to find patterns that are of real value for the
organisation or, put simply, to improve a metric that the company wants to
improve (Linoff and Berry, 2011).
While data mining has existed, at least as an academic construct, for decades, it has
grown in popularity over the last 20 years due to several factors, which include the fact that
increasingly vast amounts of data are collected daily, coupled with the rise in storage
capabilities and affordability of computational power (Han et al., 2012; Linoff and Berry,
2011). More importantly though, as Kumar and Bhardwaj (2011) argue, businesses operate in
increasingly dynamic environments, where the emphasis is put on the quality of services and
the speed at which those are delivered. As such, data mining constitutes a fundamental
technology in driving competitive advantage by delivering meaningful customer value and
building long term relationships, benefits which can also be developed in the higher education
sector (Fayyad et al., 1996; Shaw et al., 2001).
Because this dissertation is focused on the business perspective of data
mining and analytics, there is a need to identify a suitable model, or data mining
methodology, which will guide the data collection and processing of the research. The choice
of methodology is important as it will allow the translation of the business problem into a data
mining problem, but at the same time deliver results which are understandable, and
applicable, to managers in the organisation (Linoff and Berry, 2011).
There are several data mining methodologies, including CRISP-DM and SEMMA,
which were extensively reviewed by Marban et al. (2009) and Kurgan and Musilek (2006).
Their findings suggest that CRISP-DM is easy to understand, intuitive and constructed with
industrial input, that it sets out clear and understandable stages in the data mining process which
can be directly related to the business problem and, more importantly for this dissertation,
that it has previously been applied to direct marketing projects. On the other hand, Fayyad et al.’s (1996)
model and Anand and Buchner’s (1998) model are academic in nature, with limited usage in
industry, as also indicated by the results from KDnuggets (2007), which state that 42% of
respondents use the CRISP-DM methodology. Marban et al. (2009, pg. 11) identify
issues with the CRISP-DM model, such as ‘knowledge importation’, whereby the framework
does not consider the reuse of existing knowledge. Nevertheless, the CRISP-DM model will represent the
reference point by which the data mining project is carried out, a decision also supported by
the fact that the software used in this dissertation to carry out data analysis, IBM SPSS
Modeler 14.2, also uses the CRISP-DM methodology (IBM, 2012; Chapman et al., 2000).
Figure 1: The CRISP-DM reference model. This figure highlights the different phases
of the data mining project, their respective tasks, and the relationship which exists
between these tasks. The outer circle shows the fact that data mining is a cyclical model, in
which deploying a solution does not end the process, instead it gives rise to learning
which is reapplied to business questions. (Source: Chapman et al., 2000)
Figure 1 shows the phases of the CRISP-DM model. The stages presented above will
be briefly discussed, from a marketing communications standpoint, to contextualise the
model with respect to this dissertation. However, a detailed breakdown of the methodology
will be provided in a subsequent chapter.
Business understanding
Business understanding relates to identifying the specific business goal, or project
objective and requirements. The main goal of the data mining process is to assist the
university in recruiting prospective students. As such, the marketing communications
campaign will begin with the segmentation of students into groups based on common
characteristics, in order to best decide who to target and how to target them, through different
media and varying degrees of personalisation of marketing communications (Chapman et al.,
2000; Li et al., 2010).
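As a rough illustration of grouping students on common characteristics, the following Python sketch partitions a handful of invented application records by shared attribute values. This is plain grouping on stated attributes rather than the cluster analysis applied later in the dissertation, and the records and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical application records, for illustration only.
applications = [
    {"id": 1, "programme": "Management", "domicile": "U.K."},
    {"id": 2, "programme": "Accounting", "domicile": "Overseas"},
    {"id": 3, "programme": "Management", "domicile": "U.K."},
    {"id": 4, "programme": "Management", "domicile": "Overseas"},
]


def segment(records, keys):
    """Group record ids by the tuple of values they share on the given keys."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[key] for key in keys)].append(record["id"])
    return dict(groups)


segments = segment(applications, ["programme", "domicile"])
for characteristics, ids in sorted(segments.items()):
    print(characteristics, ids)
```

Each resulting segment could then be assigned its own media and level of personalisation.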
Data understanding
Data understanding is concerned with acquiring the data, exploring the data, through
queries and visualisation, and verifying the quality of the data. A dataset containing a mix of
continuous and discrete data values on general demographic characteristics, together with
specific information, such as academic programme, regarding students will form the basis of
the analysis. First insights into the data will be obtained through visualisation and data quality
problems will be identified and solved (Chapman et al., 2000).
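A simple data quality check of the kind described above can be sketched in Python. The records are invented, and the field `tariff_points` is a hypothetical stand-in for a continuous attribute; it is not a field confirmed by the source. The sketch counts missing values per attribute and tabulates the values of one discrete attribute.

```python
from collections import Counter

# Hypothetical records mixing discrete and continuous attributes.
records = [
    {"domicile": "U.K.", "gender": "F", "tariff_points": 360},
    {"domicile": "China", "gender": None, "tariff_points": 340},
    {"domicile": "", "gender": "M", "tariff_points": None},
]


def is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def missing_counts(rows):
    """Count missing values for every attribute that has at least one."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if is_missing(value):
                counts[field] += 1
    return dict(counts)


def value_frequencies(rows, field):
    """Tabulate the non-missing values of one attribute."""
    return Counter(row[field] for row in rows if not is_missing(row[field]))


print(missing_counts(records))
print(value_frequencies(records, "domicile"))
```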
Data preparation
This phase is concerned with constructing the final dataset, on which modelling will
be carried out, from the initial raw data. As Chapman et al. (2000, pg. 11) argue, it includes
“table, record and attribute selection, as well as transformation and cleaning of data for
modelling tools”.
Modelling
At this phase, several modelling techniques are chosen and applied. An important
task at this stage is assessing the models. The model that sheds the most
light on the problem at hand should be selected and form the basis of
decisions for the organisation (Chapman et al., 2000; Moro et al., 2011).
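Since association rules are one of the techniques this dissertation applies, a minimal Python sketch of how such a rule is commonly assessed may help. It computes the standard support and confidence measures on four invented application records. It is not the procedure or interface of IBM SPSS Modeler, and the attribute values are hypothetical.

```python
# Each record is a set of "attribute=value" items (hypothetical data).
records = [
    frozenset({"domicile=U.K.", "programme=Management", "response=Firm"}),
    frozenset({"domicile=U.K.", "programme=Accounting", "response=Firm"}),
    frozenset({"domicile=Overseas", "programme=Management", "response=Decline"}),
    frozenset({"domicile=U.K.", "programme=Management", "response=Decline"}),
]


def support(rows, items):
    """Fraction of records that contain every item in `items`."""
    return sum(1 for row in rows if items <= row) / len(rows)


def confidence(rows, antecedent, consequent):
    """Of the records matching the antecedent, the fraction that also match the consequent."""
    antecedent_support = support(rows, antecedent)
    if antecedent_support == 0:
        raise ValueError("antecedent never occurs in the data")
    return support(rows, antecedent | consequent) / antecedent_support


antecedent = frozenset({"domicile=U.K."})
consequent = frozenset({"response=Firm"})

print("support:", support(records, antecedent | consequent))
print("confidence:", confidence(records, antecedent, consequent))
```

Rules with higher support and confidence would, under this view, be stronger candidates for informing decisions.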
Evaluation
In this stage, the steps followed to build the model are reviewed and the business
factors taken into account to produce the model are carefully revised. For example, a certain
attribute will have to be reconsidered in the model if it is determined, at a later stage, that it
should have an impact on the business outcome (Chapman et al., 2000).
Deployment
With respect to this dissertation, the deployment phase will consist of presenting the
results of the data mining project in such a way that staff from the university can act upon the
information, without prior background in analytics and business intelligence (Chapman et al.,
2000).
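The six phases discussed above, and the cyclical nature of the model, can be summarised in a short Python sketch. It deliberately simplifies CRISP-DM to its outer cycle; the reference model also allows movement back and forth between individual phases, which is not represented here. The names `Phase` and `next_phase` are illustrative only.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELLING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(phase: Phase) -> Phase:
    """Return the following phase, wrapping from deployment back to the start."""
    phases = list(Phase)
    return phases[(phases.index(phase) + 1) % len(phases)]


phase = Phase.BUSINESS_UNDERSTANDING
for _ in range(len(Phase)):
    print(phase.name)
    phase = next_phase(phase)
```

After deployment, the loop returns to business understanding, reflecting the idea that deploying a solution gives rise to new business questions.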
2.2. Marketing in the higher education sector
So far, relevant terminology and key concepts have been discussed to
allow for the contextualisation of business intelligence and analytics to the marketing
communications of higher education institutions. The following section will briefly discuss
the importance of marketing in universities.
In the past, many higher education institutions took a passive approach to
student recruitment (Naude and Ivy, 1999). As Kotler and Fox (1995) argue, marketing was
perceived as an unnecessary expense, since the value of education was considered evident and
universities simply selected the most suitable candidates from those who had applied.
However, this view has since shifted. At first, institutions accepted that superior
products, in this case courses, do not sell themselves, and advertising and promotions became
key activities for many higher education institutions (Maringe and Mourad, 2012). Later on,
as universities intensified their promotion and recruitment activities, they started to realise
that they could not be the best university for all the students, and, as such, invested in
positioning the institutions on real attributes that prospective students will value (Kotler and
Fox, 1995).
More recently though, major environmental changes, such as increased student
mobility (Naidoo and Wu, 2011) and increased global competition among higher education
institutions, have driven universities to market themselves more aggressively (Helgesen,
2008). As Gibbs (2001) argues, higher education institutions had to steadily take on
financial responsibilities as governments reduced their spending. This transition has paved the
way for more flexibility for the organisation, a welcome development as common market-based
logic, such as differentiation, loyalty, value or customer centredness, emerged in the higher
education sector (Maringe and Mourad, 2012). As a result of these environmental shifts,
universities “have started to search for a unique definition of what they are” (Chapleo et al.,
2011, pg. 25).
The necessity of building the university as a distinctive organisation also stems from
the fact that higher education institutions face numerous challenges and issues in the current
era, as Cetin validly points out:
“Universities are being urged to provide high quality education, exist as a
well-reputed university, achieve enrolment success, improve competitive
positioning, provide contemporary and well-designed academic programs,
and maintain financial strength”. (Cetin, 2003, pg. 57)
This has forced universities to realise, as the previously
mentioned findings also suggest, that they need to operate more like a business, which, importantly for this
dissertation, also includes the use of sound marketing communication strategies and
techniques (Hancock and McCormick, 1996, cited in Beneke, 2011).
2.3. Marketing communications in the higher education sector
2.3.1. Theoretical background
In order to critically evaluate existing practices with respect to communicating courses
and institutions to prospective students, but also to provide the basis of recommendations in
later stages of this dissertation, there is a need to define what marketing communication
means and identify an appropriate framework for analysis.
Fill (2013) argues that marketing communications is an audience centred activity
designed to encourage engagement between participants and to provoke conversations. This
dissertation reasons that this definition is a suitable one for analysing marketing
communications in the higher education sector. First of all, it refers to audiences, rather than
customers, and this is appropriate as there is much debate in the literature about who the
customers of a university are and whether students can be viewed as such (Eagle and Brennan, 2007;
Svensson and Wood, 2007). Moreover, it does not emphasise selling products to customers,
focusing instead on other dimensions of marketing communications, such as engagement and
conversations, which is arguably a more suitable lens through which to view marketing
communications in higher education institutions; as Chapleo (2013, pg. 23) puts it,
student recruitment is “not a hard sell”.
The marketing communication mix proposed by Fill (2013) is an appropriate choice to
form the framework of analysis as it acknowledges the existence of elements beyond
marketing communication tools, such as advertising and sales promotion, and introduces
media, such as paid or owned media, and content, or what type of messages are delivered to
the target audience, such as informational or emotional. Figure 2 presents the marketing
communications mix.
[Figure 2 diagram: the marketing communications mix at the centre, linked to 5 communication tools, 3 categories of media and 4 core types of messages]
Figure 2: The marketing communications mix. This figure shows the 3 components of the
marketing communications mix: tools, media and content. (Source: Fill, 2013)
Expanding on Figure 2, the 3 components of the marketing communications mix
contain (Fill, 2013):
5 Communication tools
• Advertising
• Sales promotion
• Personal selling
• Public relations
• Direct marketing
3 Categories of media
• Paid media
• Owned media
• Earned media
4 Core types of messages (content)
• Informational
• Emotional
• User generated content
• Branded messages
The above representation of the marketing communications mix sets the theoretical
foundations to evaluate existing practices in the higher education sector across the three
dimensions, but at the same time identify elements in which business intelligence and
analytics can be used to enhance communications.
2.3.2. Existing marketing communications practices in the higher education
sector
Advertising
Research into marketing communications practices in universities reveals that one of
the predominant communication tools used is advertising (Durkin et al., 2012; James, 2011;
Mainardes et al., 2012). In fact, Chapleo (2010)
has identified that several universities spend a considerable part of their budget on above the
line marketing communications.
For example, Durkin et al. (2012) conducted an in-depth analysis of a brand repositioning exercise within Ireland’s larger university, University of Ulster. The campaign
was centred around the ‘Eddie’ TV advertisement, depicting an animated character that is
confused with respect to his future direction in the post-school environment, but ultimately
finds the correct path which leads to the University of Ulster. While the TV campaign was the
focal point, it was supported by a combination of paid media, namely outdoor and radio
broadcasting, and owned media, such as the university’s own website. The institution judged
the campaign a success, as choice commitment through UCAS for the University of
Ulster increased by 4% as a result of the marketing communications campaign (Durkin et
al., 2012).
An important point to mention is the change in focus that the campaign exemplified.
Traditional marketing communications were transmitting an informational message, with
content on student experience, future employment prospects or the research capabilities of the
university (Durkin et al., 2012). They were targeted at parents and other key influencers,
rather than prospective students. However, the ‘Eddie’ campaign focused on emotional
content and was directed to the students themselves. This shift is important, as it potentially
highlights the significance of emotions, or affect, in marketing communications as argued by
Vakratsas and Ambler (1999). In fact, De Chernatony (2003) recognises that in order to
communicate the ‘promised experience’ effectively, a combination of rational and emotional
values needs to be included. Applied to universities, according to Chapleo et al. (2011), a
blend between information on research and teaching, at a rational level, and information on
social responsibility, for example, at an emotional level needs to be achieved to communicate
the university brand effectively.
However, the usage of advertising by universities has been questioned by several
authors, such as Beneke (2011), who is sceptical towards it. He argues that internal resistance
within the organisation to resembling a for-profit organisation can hinder the potential value
gained from advertising, and poses the question of what it is that needs to be advertised in an
academic institution. This uncertainty over the usefulness of advertising is reflected in Jansen
and Brenn-White’s (2011) study, in which above the line advertising was not very popular
among university respondents. A more recent study conducted by Chapleo (2013) portrays a
shift within U.K. higher education institutions, away from traditional advertising tools and
towards digital communications, but that is not to say that there is no room at all for
advertising across various forms of media. An important point that the author mentions, and which is
also supported by Schuller and Rasticova (2011), is that personalised and tailored marketing
communications would be at the forefront in future campaigns. This statement acknowledges
that advertising is often a one-way communication tool with little prospect of targeting
particular segments in the market (James, 2011), and leads the discussion to another
marketing communications tool as outlined by Fill (2013), direct marketing.
Direct marketing
Several researchers have identified that direct marketing, in the form of printed
materials such as brochures, pamphlets and the prospectus, is an important element of a
university’s marketing communications mix (Ivy, 2008; Chapleo, 2013; Nedbalova et al.,
2014). According to Chapleo (2013), this could potentially be as a result of parents and other
key decision makers having an increasing influence in the decision making process.
Nevertheless, there is little evidence in the literature on how direct marketing can be
specifically tailored to particular segments in the market in order to have relevance to the
prospective student.
Cheung et al. (2010) argue that segmentation is required in order to deploy marketing
communications effectively, and suggest several generally applicable segmentation strategies,
such as benefit segmentation or demographic segmentation, but do not translate these general
practices in great detail to the context of the higher education sector. Similarly, James’s
(2011) findings suggest that universities are acknowledging that not every student is the same.
One example used is the difference between young and mature students and their respective
needs and wants. However, similarly to the previous study, it does not deal specifically with
what changes can be made to the marketing communications mix with respect to this
knowledge.
The ambiguity in the literature over how to identify prospective students and how to
personalise their marketing communications links to the concept of Customer Relationship
Management (CRM) (Kumar, 2010). Hughes (2002) argues that information about customers needs to
be used to generate the right communication at the right time to the right audience. In fact,
Chapleo (2013) covers this concept in the context of universities by highlighting that
marketing communications should move away from targeting the media, for example the
website, to targeting the user, the student, with the help of a CRM system.
While an elaborate discussion of CRM is beyond the scope of this dissertation, the concept does
provide insight into existing literature which has, as opposed to the studies mentioned above, tried
to define who the right customer is and how they can be approached with marketing
communications.
Schuller and Rasticova (2011, pg. 65) argue that information obtained from students
will “help achieve the correct focus for each activity of the communication mix selecting its
proper intensity”. What is important to mention is that their research gathers information on
issues such as the type of school attended before university,
reasons for applying, where the student heard about the university programme and previous
education. These initial parameters build towards the methodology of this dissertation,
whereby marketing communications can be targeted with respect to previous school,
geographic position, subjects taken for A-level exams and the grades obtained, to name a few.
A more comprehensive study was conducted by Ho and Hung (2008). The authors
were concerned with the formulation of the marketing mix in Taiwan’s higher education and
their starting point was student segmentation. Contrary to the research mentioned above, they
decided to segment prospective students based on their “expected educational benefits and
needs” (Ho and Hung, 2008, pg. 330) instead of focusing on demographic characteristics. In
order to do so, the authors established a hierarchy, through a series of interviews and theory,
of what students looked for when selecting a university, and their findings revealed that
among the most important attributes were learning, such as research and curriculum,
economy, with respect to tuition fees and employability, and reputation.
Based on this hierarchy and the information gathered from 602 prospective students,
they identified five discernible clusters: the ‘prominence’ group, the ‘less aware’ group, the
‘pragmatic’ group, the ‘austerity’ group and the ‘fastidious’ group. Each cluster scored
differently on what they valued most at a university. For example, the ‘austerity’ group valued
economic factors more than the ‘fastidious’ group, which emphasised university reputation
and excellent learning environments. The authors argue that this level of knowledge about
prospective students can help universities to select their appropriate target market and thus
position their institutions effectively (Ho and Hung, 2008).
The paper is a step forward in terms of how students are segmented and targeted,
acknowledging that there are several other, deeper, dimensions on which clustering can be
performed, instead of the more basic ones, such as young and mature (James, 2011) or home
and international (Furey et al., 2014). However, the research does not cover how to
specifically alter the marketing communications mix to engage with the suitable clusters.
Moreover, the marketing mix is limited to the knowledge extracted from the initial research
and based solely on the five value categories as identified by Ho and Hung (2008). A data
mining approach, as postulated in this dissertation, will potentially allow a greater degree of
flexibility in terms of the factors on which clusters are constructed and targeted.
So far the two main communications tools, advertising and direct marketing, have
been discussed with respect to the higher education sector, both through paid and owned
forms of media. The discussion will move towards other marketing communications tools,
such as public relations and personal selling.
Public relations
Chapleo’s (2010) findings suggest that public relations, especially press coverage but
also, for example, university staff appearing in the media (Schuller and Rasticova, 2011), is
one of the marketing communications tools preferred by many higher education institutions,
particularly in what Chapleo describes as ‘older’ universities, or those incorporated before
1950. Moreover, several other authors, such as Nicholls et al. (1995) and Kotler and Fox
(1995), identify public relations as a common communications tool in the higher education
sector. The popularity of public relations could be a result of its associated low costs and the
fact that it creates awareness for the university brand (Chapleo, 2010). Having said this, it can
be argued that analytics could play a diminished role in enhancing public relations with
respect to student recruitment, as public relations involve issues such as monitoring public
opinion and preparing for press releases (Fill, 2013), rather than targeting prospective
students, and as such will not be further discussed in this dissertation.
Personal selling
Another marketing communications tool, as outlined by Fill (2013), is personal
selling. At this point it is worth mentioning that personal selling is potentially not a suitable
name for this construct, as marketing in universities, as mentioned previously, is not a hard sell. As
such, this dissertation proposes that personal selling should be replaced by personal
communication, as outlined by Schuller and Rasticova (2011). Thus, specific practices that
fall under this category are open days, fairs or direct presentations done by the university’s
staff across various locations (Moogan, 2011).
The use of direct communications by universities has been documented by existing
literature (Schuller and Rasticova, 2011; Chapleo, 2013). Moreover, Cubillo et al. (2006)
highlight that personal contact between prospective students and graduates or members of
staff, at open days or across various presentations, can assist students in their decision making
process. A study conducted by Moogan (2011) reveals information with respect to when high
school students begin searching for universities, what types of information sources they value
the most and what attributes weighed the most in terms of choosing a university. For
example, “information searches on HEIs commenced within the first year of post-16
education” (Moogan, 2011, pg. 577). Analytics can build upon this knowledge of the
importance of marketing communications timing, and identify prospective students likely to apply for
particular degree programmes. As such, direct communications can be delivered at an
appropriate time and to the right and relevant audiences by the right people. For example, if a
high school has a large number of students traditionally applying for the Medical school, as
highlighted by data analytics, the university can include a Medical School graduate to deliver
the presentation to those students.
2.4. Conclusion
This chapter has examined the relevant literature regarding existing practices with
respect to marketing communications in the higher education sector and critically evaluated
their degree of personalisation. It has analysed a number of marketing communications tools,
as outlined by Fill (2013), including advertising, direct marketing, public relations and
personal selling. Findings suggest that universities use a combination of these to communicate
to prospective students, both through paid and owned forms of media. While the importance
of earned media and the impact of word of mouth communications have been acknowledged by
several authors (Mainardes et al., 2012; Chapleo, 2013), they were not examined because
they are beyond the scope of this dissertation. The predominant type of content
universities use in their marketing communications is informational; nonetheless, some
universities are beginning to deploy emotional content as a way of increasing awareness and
student enrolment.
The analysis carried out in this chapter has revealed that data mining is not widely
used in segmenting prospective students and delivering personalised communications to them.
As such, this dissertation argues that business analytics can contribute to marketing
communications by clustering prospective students around various attributes, and from those
attributes develop targeted content.
3. Methodology
This chapter examines the methods deployed to generate insight, which was used to
suggest ways in which more relevant marketing communications for prospective
undergraduate students at Manchester Business School can be created. The software that was
utilised to achieve this was IBM SPSS Modeler 14.2.
The chapter is divided into subsections covering data
understanding and data preparation, and modelling, the stage at which data mining exercises
were carried out. This approach follows the stages mentioned previously in the CRISP-DM
model (Chapman et al., 2000). Although that methodology treats data understanding and data
preparation as two individual steps, this
dissertation argues that a better understanding of the data set will be achieved if both of these
stages are brought together under the same section. This will allow the reader to obtain a
complete picture of each attribute so that at the end of the section a comprehensive knowledge
of the data set is built.
3.1. Data understanding and data preparation
A data set containing 4,110 entries, which represented anonymised undergraduate
student applications at Manchester Business School for the year 2013, was obtained. It
comprised 32 different attributes, ranging from the UCAS course code each applicant was applying for
to the age and gender of each applicant. However, not all attributes were relevant to the data
mining process in the sense that when carefully examined they contained either irrelevant or
duplicate information or unique information portraying each candidate. As such, a discussion
is necessary to evaluate whether or not each attribute is relevant to the data mining exercise
and provide justifications as to why that is the case.
It is important to mention at this point that any sensitive information, such as the full
name of each applicant, was removed from the data set prior to the data being provided to the
author. This was done to ensure the anonymity of each candidate and to comply with the
ethical requirements as set by the Manchester Business School.
Having said this, the discussion that follows critically evaluates the appropriateness of
each attribute to the data mining process. At the same time, the section will identify and
explain the range of values that each attribute takes and state any data preparation activities
that were taken.
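The screening criteria described above (attributes holding one constant value, or a value unique to each candidate) can be illustrated with a short sketch. It is not the procedure followed in IBM SPSS Modeler; the function name `screen_attributes` and the toy records are hypothetical and serve only to show the logic.

```python
def screen_attributes(records):
    """Label each attribute as 'constant', 'unique' or 'candidate'.

    'constant'  : every record shares the same value (adds no information)
    'unique'    : every record has a different value (acts as an identifier)
    'candidate' : anything else, worth examining further
    """
    verdicts = {}
    n_records = len(records)
    for name in records[0]:
        distinct_values = {row[name] for row in records}
        if len(distinct_values) == 1:
            verdicts[name] = "constant"
        elif len(distinct_values) == n_records:
            verdicts[name] = "unique"
        else:
            verdicts[name] = "candidate"
    return verdicts


# Hypothetical toy records: field names follow the text, values are invented.
applications = [
    {"UCAS Person ID": "1000000001", "School": "Manchester Business School", "Gender": "F"},
    {"UCAS Person ID": "1000000002", "School": "Manchester Business School", "Gender": "M"},
    {"UCAS Person ID": "1000000003", "School": "Manchester Business School", "Gender": "F"},
]
verdicts = screen_attributes(applications)
```

A 'candidate' verdict does not mean that an attribute must be kept; duplicate attributes, such as a code and its corresponding name, still need to be judged individually, as in the discussion that follows.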
1. UCAS Scheme
o A code which is used to identify the application of each candidate (UCAS, 2015a).
o This attribute does not provide any useful information that can be used to tailor
marketing communication as it is unique to each applicant and as such was removed
from the data set.
2. UCAS Person ID
o A 10-digit number that each candidate receives when deciding to apply through UCAS
(UCAS, 2015a).
o This attribute is meaningless from an analysis point of view as it is unique to each
individual, and as such was removed from the data set.
3. HO Fees
o The type of fee that each applicant needs to pay in relation to his/her geographic
location. Examples include home fees or overseas fees.
o Because this attribute links directly to the ‘HO Domicile’ field in the sense that the
geographic location of a candidate will dictate what type of fees he/she needs to pay, it
can be argued that the information contained in the ‘HO Domicile’ field subsumes the
one in ‘HO Fees’. This suggests that there is no need to keep both attributes for the
modelling stage, and as such the decision was made to remove ‘HO Fees’ from the
data set and keep ‘HO Domicile’.
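The argument that ‘HO Domicile’ subsumes ‘HO Fees’ amounts to saying that each domicile value maps to exactly one fee type. A minimal sketch of how such a claim could be checked is given below; the helper `determines` and the fee labels in the toy rows are assumptions made for illustration and are not taken from the data set.

```python
def determines(records, source, target):
    """Return True if every value of `source` maps to exactly one value of `target`."""
    mapping = {}
    for row in records:
        # Remember the first target value seen for this source value.
        expected = mapping.setdefault(row[source], row[target])
        if expected != row[target]:
            return False
    return True


# Hypothetical rows: the domicile codes follow the text, the fee labels are invented.
rows = [
    {"HO Domicile": "U.K.", "HO Fees": "Home"},
    {"HO Domicile": "EU", "HO Fees": "Home"},
    {"HO Domicile": "OS", "HO Fees": "Overseas"},
    {"HO Domicile": "U.K.", "HO Fees": "Home"},
]
domicile_determines_fees = determines(rows, "HO Domicile", "HO Fees")
fees_determine_domicile = determines(rows, "HO Fees", "HO Domicile")
```

In this toy example the domicile determines the fee type but not the reverse, which is the situation in which keeping only the more detailed attribute loses no information.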
4. School
o This attribute contains the name of the institution that each candidate is applying for.
o Because each of the 4,110 candidates has the same value for this attribute,
‘Manchester Business School’, it adds no value to the analysis and as such was
removed from the data.
5. UCAS Course Code
o This contains the course code, with respect to how UCAS labels their course offerings,
that each candidate is applying for.
o This attribute is irrelevant to the analysis as the same information is contained in the
‘Academic Programme’ field, which is directly related to how Manchester Business
School labels and advertises its courses, and thus the decision was made to remove it
from the data set.
6. UCAS Choice Number
o This contains the choice number that each applicant has selected for Manchester
Business School when applying through UCAS. Values range from 1 to 5, with a
value of 1 implying that Manchester Business School is the first choice, or preferred
academic institution, for the candidate.
Figure 3: Frequency distribution of ‘UCAS Choice Number’
o As seen from Figure 3, the majority of the candidates have attributed a choice between
1 and 5 to Manchester Business School. However, certain candidates were recorded in
the database with choice ‘0’ (n=361), choice ‘6’ (n=8), choice ‘9’ (n=35), choice ‘71’
(n=9) and choice ‘72’ (n=2). Because there are no corresponding academic
programmes or course offerings at Manchester Business School to match the above
mentioned choices, they were flagged as mistakes in the data set and were removed
using a Select node. It is important to note however that these mistakes should either
not have been allowed by UCAS during the application process or they are an error in
how The University of Manchester collects the data. After removal, 3,695 valid entries
remained with a ‘UCAS Choice Number’ value range from 1 to 5.
o The attribute is relevant to the analysis because, in combination with other attributes, it
can help tailor marketing communications. Because of this, the decision to keep it for
the analysis was made.
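The effect of the Select node on ‘UCAS Choice Number’ can be sketched as a simple filter that keeps values from 1 to 5 and tallies everything else. The snippet is an illustrative Python equivalent, not the SPSS Modeler stream itself; `split_by_choice` and the toy records are hypothetical.

```python
from collections import Counter

VALID_CHOICES = range(1, 6)  # UCAS choice numbers 1 to 5


def split_by_choice(records):
    """Keep records with a valid choice number and tally the discarded values."""
    kept = [r for r in records if r["UCAS Choice Number"] in VALID_CHOICES]
    discarded = Counter(
        r["UCAS Choice Number"]
        for r in records
        if r["UCAS Choice Number"] not in VALID_CHOICES
    )
    return kept, discarded


# Hypothetical toy records mixing valid choices with the invalid codes named in the text.
toy_records = [{"UCAS Choice Number": c} for c in (1, 0, 5, 71, 3, 9, 0, 72)]
kept, discarded = split_by_choice(toy_records)
```

Tallying the discarded values, rather than silently dropping them, makes it easier to report them back to the data owner as potential collection errors.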
7. Academic Year of Entry
o This field contains the year when each candidate will begin studies.
o Because there was one, universal, value for all candidates, ‘2013’, this characteristic of
applicants adds no value to the analysis and as such the decision was to remove it from
the data set.
8. Record Creation Date
o This feature is concerned with the date, in day, month and year format, when an
individual candidate’s application was created on the database of Manchester Business
School.
o This attribute is worthless from an analysis point of view as it is unique to each
applicant, and as such was removed from the data set.
9. Decision
o This characteristic contains the decision that Manchester Business School made with
respect to each candidate after they received the application. The working values for
this field are:
C the candidate received a conditional offer;
U the candidate received an unconditional offer;
R the candidate was rejected;
W the candidate withdrew his/her application;
Figure 4: Frequency distribution of ‘Decision’
o Figure 4 shows the occurrence of a particular decision. There are two important
aspects to mention here. First of all, decision ‘F’ (n=7) could not be attributed to any
action taken by Manchester Business School with respect to its candidates and as such
was considered an error and thus removed from the analysis with a Select node. After
removal, 3,688 valid records remained. The second remark is that 37 observations out
of the total 4,110, which represents the original data set prior to any data processing,
did not contain an entry for the ‘Decision’ field. Simply put, they were null values.
However, the removal of these null entries was not necessary at this stage because all
of them were removed together with the unwanted entries from ‘UCAS Choice
Number’. As such, the working data set at this point had 3,688 valid records.
o The decision was made to keep this characteristic for the analysis, as it can prove
potentially relevant in designing targeted marketing communications.
10. Response
o This field contains the response that Manchester Business School received from each
candidate. The working values in this field are:
D the candidate declined the offer, either a conditional or unconditional
one, that he/she received from the university;
F the prospective student communicated to the university that
Manchester Business School was a firm choice, or first option, as labelled
by the UCAS application system.
I the prospective student communicated to the university that
Manchester Business School was an insurance choice, or second option, as
labelled by the UCAS application system.
Figure 5: Frequency distribution of ‘Response’
o Figure 5 shows the occurrence of a particular response. An interesting behaviour needs
to be mentioned here. Although not graphically represented in Figure 5, 1,557 entries
had, as in the case of ‘Decision’, null values. However, these values were not removed
from the analysis due to the fact that all of the null values were associated with a reject
decision. Intuitively, a candidate who got rejected will not be in a position to
communicate any response to the university, and this is the reason why such a large
number of applicants have missing values for ‘Response’.
o Similar to the ‘Decision’ field, the decision was made to keep the attribute for the
analysis.
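The observation that every null ‘Response’ coincides with a reject decision is a consistency check that can be expressed directly. The sketch below assumes that null values are represented as `None`; the function name and the toy records are hypothetical.

```python
def nulls_explained_by_rejection(records):
    """Return True if every record with a missing Response has Decision 'R'."""
    return all(
        r["Decision"] == "R"
        for r in records
        if r["Response"] is None
    )


# Hypothetical toy records using the Decision and Response codes listed in the text.
consistent = [
    {"Decision": "C", "Response": "F"},
    {"Decision": "R", "Response": None},
    {"Decision": "U", "Response": "D"},
]
# One extra record with an offer but no response breaks the pattern.
inconsistent = consistent + [{"Decision": "C", "Response": None}]

all_nulls_are_rejections = nulls_explained_by_rejection(consistent)
has_unexplained_null = not nulls_explained_by_rejection(inconsistent)
```

If the check failed on the real data, the unexplained null values would need separate treatment instead of being kept as a meaningful ‘no response possible’ category.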
11. Country of Domicile Code
o It contains the code of the country of which the applicant is resident.
o This attribute is meaningless to the analysis as the same information is contained in the
‘Country of Domicile’ field, and as such was removed from the data set.
12. Country of Domicile
o This field contains the name of the country of which the applicant is resident.
o Similarly to the situation encountered for the ‘Decision’ field, the entire data set
contained 12 null values for this attribute, but their removal was not necessary at this
point as the entries containing these null values were removed when processing
‘UCAS Choice Number’.
o The information contained in this field can contribute to the objective of this data
mining exercise and thus was kept in the database.
13. HO Domicile
o Includes information about the geographic location of each applicant. The values in
this field, as represented in Figure 6, are:
U.K. the candidate has the domicile in The United Kingdom of Great
Britain and Northern Ireland.
EU the candidate has the domicile in a European Union country.
OS an overseas candidate, meaning that he/she has the domicile in a
country which is not part of the U.K. or EU.
Figure 6: Frequency distribution of ‘HO Domicile’
o This attribute was kept for the data mining exercise.
14. Address Postcode
o Includes the address postcode of each candidate, particularly for those who have
their domicile in the U.K.
o This attribute was removed from the analysis for several reasons. First of all, the entire
postcode, with a format such as M15 6PB, was unique to each candidate. Secondly, no
technical means were available to group the postcodes based on the first, or second,
letter to create a cluster of different regions from the U.K. Moreover, if such a
grouping had been achieved, the information would have duplicated that
contained in the ‘LEA Domicile’ field. It is important to mention that although this
attribute was removed for the purposes of this particular investigation, a more
comprehensive or different analysis utilising more extensive resources could make use
of the postcodes, though the resources and expenses required would be higher.
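For a future analysis with access to a scripting environment, grouping postcodes by their leading letters (the postcode area) could be approached along the following lines. This is only a sketch of one possible approach and was not part of this study; the regular expression, the function name `postcode_area` and every example postcode other than M15 6PB are assumptions made for illustration.

```python
import re
from collections import Counter

# One or two leading letters followed by a digit, e.g. 'M' in 'M15 6PB'.
AREA_PATTERN = re.compile(r"^\s*([A-Za-z]{1,2})\d")


def postcode_area(postcode):
    """Return the leading letters of a U.K. postcode, or None if it does not match."""
    if not postcode:
        return None
    match = AREA_PATTERN.match(postcode)
    return match.group(1).upper() if match else None


# Example postcodes chosen for illustration only.
examples = ["M15 6PB", "M13 9PL", "SW1A 1AA", "ol1 1aa", "12345", None]
areas = [postcode_area(p) for p in examples]
area_counts = Counter(a for a in areas if a is not None)
```

Entries that do not match the pattern, such as overseas addresses or missing values, are returned as `None` so that they can be inspected separately.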
15. LEA Domicile Code
o This field contains the identification code of the local education authority that the
applicant is part of.
o This attribute provides no value to the analysis as the same information is contained in
the ‘LEA Domicile’ field, and as such was removed from the data set.
16. LEA Domicile
o This attribute contains the name of the local education authority that the applicant is
part of. For students outside the U.K., this field was populated with their country of
domicile (eg. China). However, for students within the U.K., it contained information
that linked directly to a specific local education authority (eg. Manchester;
Lancashire).
o Similar to the situations discussed previously, 361 entries had a null value for this
attribute, but were removed when processing ‘UCAS Choice Number’. As such, the
data set still had 3,688 valid entries.
o This attribute was selected for the data mining exercise as it provides the means to
identify U.K. applicants from specific areas in the country.
17. Application Number
o Includes the application number of each candidate.
o Contains unique information to each individual and thus adds no value to the analysis.
As such, it was removed.
18. Gender
o Specifies the gender of each applicant. The values for this field are:
F the candidate is female.
M the candidate is male.
Figure 7: Frequency distribution of ‘Gender’
o As seen from Figure 7, there is not a great imbalance between the applicants in terms
of gender, suggesting that Manchester Business School receives an almost equal
number of applications from male and female candidates.
o This characteristic was kept for the data mining exercise.
19. Age
o Contains the age of each applicant.
Figure 8: Frequency distribution of ‘Age’
o From Figure 8 it appears that there are a number of extreme values in terms of age,
particularly those over 25. However, in terms of data understanding, these values are
not outliers and represent mature students at Manchester Business School. As such, no
values were removed.
o This attribute was kept in the analysis process.
20. Academic Programme Code
o This field contains the identification code of the academic programme that each
candidate is applying for, as labelled by Manchester Business School.
o This attribute is meaningless from an analysis point of view as the same information is
contained in the ‘Academic Programme’ field, and as such was removed from the data
set.
21. Academic Programme
o Encompasses the name of the academic programme, as advertised by Manchester
Business School, that an individual is applying for. The values and occurrences can be
seen in Table 1.
Table 1: Frequency distribution of ‘Academic Programme’
Academic Programme Name | Percentage out of entire population (%) | Count
BA(Hons) International Business, Finance and Economics | 7.35 | 269
BSc (Hons) Accounting | 7.82 | 286
BSc (Hons) Accounting, Management and Information Systems | 0.05 | 2
BSc (Hons) International Business, Finance and Economics | 7.87 | 288
BSc (Hons) International Management | 8.01 | 293
BSc (Hons) International Management with American Business Studies | 4.43 | 162
BSc (Hons) Management | 22.91 | 838
BSc (Hons) Management (Accounting and Finance) | 7.41 | 271
BSc (Hons) Management (Human Resources) | 3.64 | 133
BSc (Hons) Management (Innovation, Sustainability and Entrepreneurship) | 4.46 | 163
BSc (Hons) Management (International Business Economics) | 4.76 | 174
BSc (Hons) Management (International Studies) | 2.32 | 85
BSc (Hons) Management (Marketing) | 13.94 | 510
BSc(Hons) in Information Technology Management for Business | 2.71 | 99
BSc(Hons) in Information Technology Management for Business with Industrial Experience | 2.32 | 85
o The following academic programmes were part of the initial data set, but were
removed with the discarded entries from ‘UCAS Choice Number’: ‘BSc (Hons)
Management (Singapore)’ (n=133), ‘BSc (Hons) Management with Compliance’
(n=20), ‘BSc (Hons) Management with Trusts and Estates’ (n=18), ‘ERASMUS
Business School’ (n=65), ‘ERASMUS IBFE’ (n=3), ‘EXCHANGE Business (NonEU)’ (n=114). The discarded values were not essential to the analysis as they
represented students who do not attend Manchester Business School as full time
students and thus were outside the scope of this dissertation.
o This attribute was kept in the data mining exercise.
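The percentages in Table 1 can be reproduced from the counts. The sketch below assumes that the percentages are taken over 3,658 records, which is the sum of the counts listed in the table; the helper `share` is hypothetical, and only three of the rows are used.

```python
TOTAL_RECORDS = 3658  # assumption: the sum of the counts listed in Table 1

# Counts copied from Table 1 for three of the programmes.
counts = {
    "BSc (Hons) Management": 838,
    "BSc (Hons) Management (Marketing)": 510,
    "BSc (Hons) Management (Human Resources)": 133,
}


def share(count, total=TOTAL_RECORDS):
    """Percentage of the population, rounded to two decimals as in Table 1."""
    return round(100 * count / total, 2)


shares = {programme: share(n) for programme, n in counts.items()}
```

Under this assumption the computed shares agree with the tabulated values of 22.91, 13.94 and 3.64 for the three programmes.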
22. Academic Plan Code
o This characteristic contains the identification code of each academic plan that
an individual is applying for.
o This attribute provides no insight from an analysis point of view as the same
information is contained in the ‘Academic Plan’ field, and as such was removed from
the data set.
23. Academic Plan
o Encompasses the name of the academic plan for which a candidate is applying.
o The difference between this field and the ‘Academic Programme’ one lies in the
fact that this attribute distinguishes between first semester and second semester
students enrolled on the Exchange or Erasmus programmes. As previously mentioned,
the Exchange and Erasmus students will not be taken into consideration when carrying
out the analysis, and as such the decision made was to remove the ‘Academic Plan’
attribute and focus exclusively on the ‘Academic Programme’ one.
24. UCAS Course
o This field contains the course name as displayed by UCAS when the individual makes
his/her application.
o The field duplicates the information in the ‘Academic Programme’ field and
therefore was removed from the analysis.
25. Academic Level of Entry Point
o This field contains information with respect to the year number that the prospective
student enters if his/her application is successful. For example, a value of ‘1’ denotes
that the applicant will be enrolled in the first year of his/her academic
programme.
Figure 9: Frequency distribution of ‘Academic Level of Entry Point’
o The initial data set contained 361 null values and one entry with a value of ‘5’, which
was flagged as a mistake, for this field. These were removed with the entries from
‘UCAS Choice Number’. The data also included values of ‘0’ (n=19) and ‘3’ (n=11).
Some students from the Singapore programme, depending on the results from their
first 2 years, can come to Manchester Business School for their 3rd year, explaining the
value ‘3’ for ‘Academic Level of Entry Point’. However, they are irrelevant for this
particular analysis and were removed. In terms of the value ‘0’, there was no
justification for it and as such it was flagged as a mistake and removed. Both values
were removed with a Select node. As a consequence, 3,658 valid records remained.
o The attribute was kept in the data mining activity as it could potentially provide
insight into unexpected behaviour.
26. Mode of Attendance
o This field contains information regarding the type of attendance the student will have
according to the academic programme he/she is applying for.
o There were 38 records which had a value of ‘Distance Learning P/T’ for this field. All
values corresponded, in terms of academic programme, to either ‘BSc (Hons)
Management with Compliance’ or ‘BSc (Hons) Management with Trusts and Estates’,
both of which were previously removed. Similarly, 133 records had a value of ‘Part-time’, all corresponding to ‘BSc (Hons) Management (Singapore)’, the entries for
which were also removed earlier in the analysis.
o By removing ‘Distance Learning P/T’ and ‘Part-time’, only applicants with full-time
attendance remain. As such, the attribute adds no value to the data mining exercise and
was removed from further analysis.
27. Programme Action Code
o The field holds the identification code of the programme action that was taken for each
individual.
o The same information is contained in the ‘Programme Action’ field, so this attribute
was removed from the analysis.
28. Programme Action
o This field contains the action that was taken either by the candidate or the university
with regard to his/her application for the academic programme. The values are:
Administrative Withdrawal: Manchester Business School withdrew the student’s
application.
Applicant Withdrawal: the prospective student has withdrawn his/her application.
Deny: the student’s application was denied, for example as a consequence of not
having sufficiently high grades.
Matriculation: the student’s application was successful and he/she was enrolled on
his/her course.
Application: the student selected Manchester Business School as an insurance
choice.
Figure 10: Frequency distribution of ‘Programme Action’
o The information contained in this attribute is relevant and thus was kept for further
analysis.
29. Programme Action Reason Code
o Contains the identification code of the reason that was employed as a justification for
the programme action taken.
o The same information is contained in the ‘Programme Action Reason’ field, so this
attribute was removed from the analysis.
30. Programme Action Reason
o Contains the reason why a particular academic action was taken.
o This information was considered irrelevant to the analysis because knowing the reason
why a candidate was, for example, rejected will not aid the university in developing
more targeted marketing communications. It will help the institution in its candidate
selection process, but this area is beyond the scope of this dissertation. As such, the
attribute was removed from the analysis.
31. Withdrawn Status Code
o Contains the identification code for the status of a candidate’s withdrawn application.
o Because 3,983 values out of 4,110 did not contain an entry for this field, the attribute
was removed from the analysis.
32. SNC Flag
o This attribute records whether or not the candidate’s results for his/her
A-level examination, or equivalent, were below ‘ABB’. Applicants who performed
below this threshold were flagged as such in the data set, while those who performed
at or above it have no entry for this attribute.
o This attribute was removed from the analysis. For Manchester Business School to
work towards its objective of attracting high quality students, it requires information
on those who received ‘AAA’ or equivalent in their examinations, rather than on those
who could not meet a certain standard. In essence, this attribute indicates which
students the university should avoid enrolling, but does not allow the institution to
target those applicants who performed extremely well in their exams.
Following the above discussion, the final data set, on which data mining exercises will
be performed, has 3,658 valid records with 11 fields, or attributes, out of the initial 32. A
short extract of the data can be observed in Table 2.
Table 2: Extract from the entire data set
| UCAS Choice Number | Decision | Response | Country Of Domicile | HO Domicile | LEA Domicile | Gender | Age | Academic Programme | Academic Level Of Entry Point | Programme Action |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | U | F | United Kingdom | UK | Wandsworth | F | 20 | BSc (Hons) Management | 1 | Matriculation |
| 3 | U | D | Singapore | OS | Singapore | M | 20 | BSc (Hons) Management (International Business Economics) | 1 | Applicant Withdrawal |
| 4 | C | D | Singapore | OS | Singapore | M | 22 | BSc (Hons) Management | 1 | Applicant Withdrawal |
| 4 | C | D | United Kingdom | UK | Norfolk | F | 19 | BSc (Hons) Management (Marketing) | 1 | Applicant Withdrawal |
| 1 | C | D | Germany | EU | Germany | M | 19 | BSc (Hons) Management | 1 | Applicant Withdrawal |
| 4 | W | D | France | EU | France | F | 18 | BA(Hons) International Business, Finance and Economics | 1 | Applicant Withdrawal |
| 1 | U | F | United Kingdom | UK | Peterborough | F | 19 | BSc (Hons) Management | 1 | Matriculation |
| 1 | C | D | United Kingdom | UK | The Royal Borough of Windsor, Maidenhead | F | 19 | BSc (Hons) Management (International Business Economics) | 1 | Applicant Withdrawal |
| 1 | C | D | United Kingdom | UK | Wiltshire | F | 19 | BSc (Hons) Management | 1 | Applicant Withdrawal |
| 5 | C | D | United Kingdom | UK | Birmingham | M | 19 | BSc (Hons) Management | 1 | Applicant Withdrawal |
3.2. Data modelling
The three most common data mining techniques are association rules, classification
and cluster analysis (Hegland, 2001). Out of these three techniques, this dissertation utilises
only cluster analysis and association rules. As such, a discussion of what each technique does
and the justification for why it has been selected or omitted from the methodology is required.
Furthermore, a clear explanation of how these techniques have been deployed in IBM SPSS
Modeler 14.2 is essential.
Firstly, classification is concerned with building a model to predict future instances or
occurrences, for example future customer behaviour, by classifying each data item in the
target data into a number of specific and predefined classes (Ngai et al., 2009). For example,
consider a direct marketing campaign that aims to reduce costs by mailing the marketing
communication only to the set of customers who are more likely to buy the advertised
product. Presume that the company has introduced a similar product in the past and, as such,
knows which customers decided to purchase that product and which decided not to do so.
This {purchase, don’t purchase} outcome forms a class attribute. The company holds other
information on each customer, for example demographic characteristics or previous
company-customer interactions, such as calling the company to get additional product
information. A classification model built on this previous customer knowledge then assigns
each prospective customer to a class, {likely to purchase, unlikely to purchase}, based on
the values of attributes such as gender and age.
Put differently, if the company knows from the previous product, as a result of
classification analysis, that customers who are above 35 years old and married are more likely
to purchase the product, it will send out the new product offering to the same type of
customers as the likelihood of a purchase is increased (Linoff and Berry, 2011).
As previously mentioned, classification was not utilised in the data mining exercise.
This was due to the fact that classification is concerned with predicting future behaviour,
which is beyond the scope of this dissertation. The objective of this dissertation is to identify
patterns in the data that can be used to target and personalise marketing communications,
instead of determining whether or not, based on certain attributes, an applicant to Manchester
Business School will be enrolled in a course.
Cluster analysis is concerned with “segmenting a heterogeneous population into a
number of more homogeneous clusters” (Ngai et al., 2009, pg. 2595). Unlike classification,
these classes are not defined prior to the data mining exercise and instead emerge with the
creation of the clusters (Ngai et al., 2009). Cluster analysis is widely used in
marketing communications as it allows the marketer to identify populations with similar
characteristics and produce a message that is directly relevant to that segment
(Hegland, 2001). Put simply, cluster analysis identifies potential segments in the population
that the manager can target. As such, it links directly with the objective of this dissertation, to
produce relevant marketing communication to prospective students based on certain
characteristics. For example, one intended outcome of using this technique is to identify a
cluster of students who reside in the same geographic area, a good proportion of whom
apply to the same academic programme. This, from a marketing
perspective, translates to the possibility of sending a previous graduate from that
academic programme to schools in that geographic area to talk to students about the benefits
of studying at Manchester Business School.
Cluster analysis was implemented in IBM SPSS Modeler 14.2 through a ‘K-Means’ node
with the number of clusters set to 20. This value ensures that a wide range of
clusters is identified, which in turn facilitates the recognition of groups that are small in
size but offer interesting glimpses into how the data behaves. When experimenting
with different numbers of clusters, a value of 5
yielded only large clusters, which did not capture interesting patterns in the data. On the other
hand, identifying 25 clusters in the data yielded clusters with a size of 1, which were not
useful for the analysis. As a consequence, a setting of 20 was settled upon as it offered a
balance between clusters that were too large and clusters that were too small.
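The size criterion used above to compare candidate numbers of clusters can be sketched as a small check on the cluster labels that a model returns. This is an illustrative sketch only: the function name `size_report` and the sample labels are hypothetical, and the analysis itself was run with the ‘K-Means’ node in IBM SPSS Modeler 14.2 rather than with this code.

```python
from collections import Counter

def size_report(labels):
    """Summarise cluster sizes for one candidate number of clusters.

    `labels` holds one cluster identifier per record (a hypothetical input,
    for example exported from a clustering tool).
    """
    sizes = Counter(labels)
    total = len(labels)
    return {
        "clusters": len(sizes),
        # Clusters with a single member were judged not useful for the analysis.
        "singletons": sum(1 for n in sizes.values() if n == 1),
        # Share of the population held by the largest cluster, in percent.
        "largest_share": round(100 * max(sizes.values()) / total, 1),
    }

# Made-up labels for ten records: cluster "c" has a single member.
example = ["a"] * 6 + ["b"] * 3 + ["c"]
print(size_report(example))
```

A setting that reports many singletons, or one in which the largest cluster holds most of the population, would correspond to the two extremes that were rejected.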
An important observation to make here is that the software application offers other
clustering techniques, namely ‘Kohonen’ and ‘TwoStep’. ‘Kohonen’ was not
used because it determines the number of clusters automatically and, in testing, it
lost subtle patterns in the data that ‘K-Means’ captured.
The reasons for not selecting a ‘TwoStep’ analysis are twofold. First of all, the
‘TwoStep’ method excludes entries that have null values from the analysis (Horn and Huang,
2009), which for this particular analysis represents a disadvantage as the significance of null
values for the ‘Response’ attribute was previously highlighted. Secondly, the ‘TwoStep’
analysis produced clusters of similar sizes, whereas cluster sizes under ‘K-Means’ varied
more widely. From a segmentation perspective, clusters that vary greatly in
size have more face validity. For example, customers who have an extremely high income
comprise a smaller group than those who earn a middle level income (Horn and Huang,
2009).
As mentioned previously, 11 attributes were chosen for the data
mining exercise. When carrying out cluster analysis, a combination of these attributes was
selected as inputs for the model. For example, as seen in Figure 11, one instance of analysis
used ‘Academic Programme’ and ‘LEA Domicile’ as its two input attributes.
This was not the only attribute combination chosen for cluster analysis; the
other combinations are presented in the results section of the
report so as to avoid duplicate information.
Figure 11: Cluster analysis with ‘Academic Programme’ and ‘LEA Domicile’ selected as inputs
The second data mining technique used was association rules. According to Ngai et al.
(2009, pg. 2595), this technique “aims to establish relationships between items which exist
together in a given record”. More specifically, the focus is on predicting the occurrence of
items given that certain other items are present. For example, a customer who purchases pasta
is very likely to purchase a pasta sauce. This technique has applicability to this dissertation as
it can reveal high-quality, actionable information that the university can use.
Association rules were utilised in this dissertation in order to reveal and compare
relationships between various attributes. For example, one intended outcome of using this
technique was to understand the difference between academic programmes with respect to the
probability of them being chosen as firm choices by evaluating the likelihood of a particular
academic programme occurring together with a firm choice.
Two measures for association rules are used in this dissertation and as such need
further discussion: support and confidence. Consider the example below, which is one
association rule extracted from the analysis and which follows the presentation style of
the results section:
Table 3: Rules identified with ‘Academic Programme’ as input and ‘Response’ as target
| Consequence | Antecedent | Support (%) | Confidence (%) |
| --- | --- | --- | --- |
| Response = F | Academic Programme = BSc (Hons) in Information Technology Management for Business with Industrial Experience | 2.32 | 21.17 |
What Table 3 translates into is that, out of all the records that are present, 2.32% of
them contain ‘Academic Programme = BSc (Hons) in Information Technology Management
for Business with Industrial Experience’. This represents the support of the rule. However, out
of all transactions where ‘Academic Programme = BSc (Hons) in Information Technology
Management for Business with Industrial Experience’, 21.17% also contain ‘Response = F’,
representing the confidence of the rule.
In academic research, support is the percentage of data entries that contain both the
antecedent attribute and the consequence attribute (Rajak and Gupta, 2008). However, as
is evident from the above example and confirmed by the software’s manual (IBM,
2011), IBM SPSS Modeler 14.2 defines support as the percentage of data entries that contain
the antecedent attribute, regardless of the consequence, when utilising the Apriori node,
which is the model used in this
investigation. This difference does not influence the analysis as the study is concerned with
evaluating confidence levels rather than support levels.
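The two conventions can be written side by side. The notation below is introduced here purely for illustration: N is the total number of data entries, n(A) the number of entries containing the antecedent A, and n(A ∧ C) the number containing both the antecedent and the consequence C.

```latex
\begin{align*}
\text{support}_{\text{rule}}(A \Rightarrow C) &= \frac{n(A \wedge C)}{N} \\
\text{support}_{\text{antecedent}}(A \Rightarrow C) &= \frac{n(A)}{N} \\
\text{confidence}(A \Rightarrow C) &= \frac{n(A \wedge C)}{n(A)}
\end{align*}
```

The first line corresponds to the definition used in academic research and the second to the value reported by the Apriori node. Multiplying the antecedent support by the confidence recovers the rule support, so for the rule in Table 3 the rule support would be approximately 2.32% × 21.17% ≈ 0.49%.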
Confidence is the percentage of data entries that contain the antecedent attribute
which also contain the consequence attribute (Rajak and Gupta, 2008). For an example, please
refer to Table 4.
Table 4: Customer transactions
| Transaction number | Item purchased (Consequence) | Item purchased (Antecedent) |
| --- | --- | --- |
| 1 | Pizza | Water |
| 2 | Bread | Tea |
| 3 | Pizza | Tea |
| 4 | Bread | Coffee |
From Table 4, we are interested in looking at transactions containing ‘Pizza’ and
‘Tea’. ‘Pizza’ and ‘Tea’ occur together in one transaction out of the four, which is transaction
number 3. Thus, the support is 25%. Out of the two transactions that contain ‘Tea’,
numbers 2 and 3, one also contains ‘Pizza’, number 3. As such, confidence is 50%.
The same interpretation with respect to confidence applies to the results presented in this
dissertation.
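The arithmetic of the Table 4 example can be reproduced with a few lines of code. The sketch below is illustrative only; the function name `rule_measures` is hypothetical, and it reports both the rule support described above and the antecedent support that the Apriori node in IBM SPSS Modeler 14.2 displays.

```python
# Transactions from Table 4 as (consequence item, antecedent item) pairs.
transactions = [
    ("Pizza", "Water"),
    ("Bread", "Tea"),
    ("Pizza", "Tea"),
    ("Bread", "Coffee"),
]

def rule_measures(rows, antecedent, consequence):
    """Return (rule support, antecedent support, confidence) in percent."""
    with_antecedent = [row for row in rows if antecedent in row]
    if not with_antecedent:
        raise ValueError("antecedent does not occur in any transaction")
    with_both = [row for row in with_antecedent if consequence in row]
    rule_support = 100 * len(with_both) / len(rows)
    antecedent_support = 100 * len(with_antecedent) / len(rows)
    confidence = 100 * len(with_both) / len(with_antecedent)
    return rule_support, antecedent_support, confidence

print(rule_measures(transactions, antecedent="Tea", consequence="Pizza"))
# (25.0, 50.0, 50.0)
```

The first and third values match the 25% support and 50% confidence derived from Table 4, while the second value is the figure that would appear in a support column defined on the antecedent alone.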
Having said this, there is an important point to mention: a low confidence level
suggests that the rule is not reliable. However, the focus here is on comparing confidence
levels between different rules, as in the previously mentioned case of comparing how likely
each academic programme is to occur with a firm choice, and as such rules with low
confidence values are taken into consideration. By contrasting confidence levels, relevant
information can be gathered, as will be presented in the results section.
For association rules, an Apriori node was selected in IBM SPSS Modeler 14.2.
The settings were: a minimum antecedent support of 1.0%, a minimum rule
confidence of 10.0% and a maximum of 3 antecedents. Experimenting with
these values showed that a support setting higher than 1.0%, for instance 5.0%,
loses valuable insight, as several valuable rules presented later in the dissertation
have a support of around 1.0%. Similarly, a confidence setting of 20.0%, which would
ensure a higher degree of rule reliability than 10.0%, loses important rules used
for comparative purposes. These values were therefore chosen because they yielded a suitable
number of rules for this data mining exercise. As the focus was on contrasting confidence
levels instead of identifying reliable rules, low support and confidence values were selected so
as to yield a large number of rules that can be compared with each other. Moreover, as is
the case with cluster analysis, a wide array of input and target attributes was selected,
which will be presented in the results section.
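The effect of the two thresholds can be illustrated with the support and confidence values reported later in Table 8. The sketch below is not the Apriori algorithm itself; it is a simple after-the-fact filter, the function name `keep_rules` is hypothetical, and the programme names are shortened for readability.

```python
# (antecedent programme, antecedent support %, confidence %) for rules whose
# consequence is "Response = F"; the figures are copied from Table 8.
rules = [
    ("IT Management for Business with Industrial Experience", 2.32, 21.17),
    ("IT Management for Business", 2.70, 4.04),
    ("Management (International Studies)", 2.32, 16.47),
    ("Management (Human Resources)", 3.64, 8.27),
    ("Accounting", 7.81, 12.58),
]

def keep_rules(candidates, min_support, min_confidence):
    """Keep the rules that meet both thresholds (values in percent)."""
    return [
        name
        for name, support, confidence in candidates
        if support >= min_support and confidence >= min_confidence
    ]

print(len(keep_rules(rules, 1.0, 10.0)))  # 3
print(len(keep_rules(rules, 5.0, 20.0)))  # 0
print(len(keep_rules(rules, 1.0, 4.0)))   # 5
```

With the chosen settings three of the five rules survive, stricter settings discard all of them, and lowering the confidence threshold to 4.0% retains every rule needed for the comparison between programmes.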
4. Result analysis and discussion
This chapter presents the results obtained from carrying out the data mining analysis
on the data set following the procedure indicated in the methodology section. The findings
will be accompanied by relevant explanations and interpretations on how these results affect
marketing communication practices at Manchester Business School. If applicable, areas of
improvement on how the university collects its data and captures it will also be discussed.
The section commences with the results from the cluster analysis, as
these findings are easily extrapolated to marketing communication efforts. Following that, the
discussion focuses on the association rules that were discovered and what they imply for
the university.
4.1. Findings – Cluster analysis
The first instance of cluster analysis was carried out by selecting ‘Academic
Programme’ and ‘LEA Domicile’ as inputs for the model. As mentioned in the methodology
section, segmenting the population this way was one of the intended outcomes of this analysis
and as such was carried out first. The results are in Table 5.
Table 5: Cluster analysis results with 'Academic Programme' and 'LEA Domicile' as inputs
| Cluster size | Cluster size as a percentage of the population (%) | Input 1 | Input 2 |
| --- | --- | --- | --- |
| 99 | 2.7 | BSc (Hons) in Information Technology Management for Business (100%) | LEA Domicile: Lancashire (9.1%) |
| 85 | 2.3 | BSc (Hons) in Information Technology Management for Business with Industrial Experience (100%) | LEA Domicile: Manchester (4.7%) |
| 162 | 4.4 | BSc (Hons) International Management with American Business Studies (100%) | LEA Domicile: Lancashire (4.9%) |
| 807 | 22.1 | BSc (Hons) Management (100%) | LEA Domicile: China (15.4%) |
Table 5 suggests that 9.1% of all the applicants to ‘BSc (Hons) in Information
Technology Management for Business’ have their domicile in Lancashire. Similarly, 4.7% of
all candidates for ‘BSc (Hons) in Information Technology Management for Business with
Industrial Experience’ have their domicile in Manchester. Due to the fact that students, once
enrolled, can change to or from the course with industrial experience, it can be argued that a
notable share of applicants for these two academic programmes, 9.1% and 4.7%
respectively, apply from geographical regions close to Manchester Business School.
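The percentages in the ‘Input 2’ column of Table 5 can be read as the share of the most frequent value within each cluster. A minimal sketch of that calculation follows; the function name `dominant_share` and the sample cluster are hypothetical, constructed only so that 9 of 99 members share the same domicile.

```python
from collections import Counter

def dominant_share(values):
    """Return the most common value and its share of the cluster in percent."""
    value, count = Counter(values).most_common(1)[0]
    return value, round(100 * count / len(values), 1)

# Hypothetical cluster of 99 applicants: 9 from Lancashire, the rest spread
# over 90 distinct placeholder domiciles so that no other value dominates.
lea_domicile = ["Lancashire"] * 9 + [f"Other {i}" for i in range(90)]
print(dominant_share(lea_domicile))  # ('Lancashire', 9.1)
```

Because each share is calculated within its own cluster, shares from different clusters refer to different numbers of applicants.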
This finding suggests that there is an opportunity to make marketing communications
more relevant to this group of applicants, and in this situation it can be done by, for instance,
sending a graduate from ‘BSc (Hons) in Information Technology Management for Business’
to schools in the vicinity of Manchester Business School to deliver a presentation. By
communicating relevant course related information through an alumnus who has had hands-on
experience with the academic programme, the university positions itself as an institution
that understands the nature of its prospective students and responds adequately
to their needs.
Moreover, the above mentioned observation is not limited to just one academic
programme, as can be seen from Table 5. A good proportion of students applying for ‘BSc
(Hons) International Management with American Business Studies’ (4.9%) reside in
Lancashire, while the predominant candidates for the ‘BSc (Hons) Management’ programme
(15.4%) have their domicile in China. While it may be feasible to send a member of staff to
visit schools in Lancashire, it is not possible to do so for China, as the geographic region is
not specific enough. Instead, the university can include in its direct marketing
communications short extracts on, for example, the Chinese New Year celebrations that
take place in Manchester. This can potentially bring to light an event of which Chinese students
were not aware and, once they know about it, encourage them to continue with their
applications to Manchester Business School as they feel more connected to the city and the
university.
An important point to make here, and one that applies to all the cluster
analysis findings discussed in this chapter, is that while at first
sight the share of applicants in one input value seems small, this is in fact not the case. For
example, although the 9.1% of applicants to ‘BSc (Hons) in Information Technology
Management for Business’ who have their domicile in Lancashire can be regarded as a small
value, it is in fact a substantial group when considering that the majority of the other ‘LEA
Domicile’ values each account for only 1% of all applicants to ‘BSc
(Hons) in Information Technology Management for Business’, as is the case for ‘United
States’, for example.
The cluster analysis performed above has confirmed the opportunity arising from
using such a technique, which was hinted at in the methodology section. It identifies the
customer segments, and their respective characteristics, that the university can directly target
in order to communicate relevant information to that particular group. It is a move forward
from previous academic research, as the analysis identifies more specific and easily targetable
geographic locations than, for instance, the work of Furey et al. (2014), which only
distinguishes between home and overseas candidates. In essence, it steers academic institutions
towards practices more common in for-profit organisations (Sun, 2009).
The second instance of cluster analysis was done by selecting ‘Academic Programme’
and ‘Gender’ as inputs for the model. The inputs were selected to identify potential
abnormalities in gender distribution. While most of the academic programmes had an
almost even split between genders, two clusters showed a marked imbalance and
are presented in Table 6.
Table 6: Cluster analysis results with 'Academic Programme' and 'Gender' as inputs
| Cluster size | Cluster size as a percentage of the population (%) | Input 1 | Input 2 |
| --- | --- | --- | --- |
| 163 | 4.5 | BSc (Hons) Management (Innovation, Sustainability and Entrepreneurship) (100%) | Gender: M (68.1%) |
| 133 | 3.6 | BSc (Hons) Management (Human Resources) (100%) | Gender: F (69.9%) |
The first point to mention is that the clusters in Table 6 are small because the number of
applications for the two academic programmes presented in the table is low. Nonetheless,
Table 6 clearly shows that a large proportion, 68.1%, of prospective students for ‘BSc (Hons)
Management (Innovation, Sustainability and Entrepreneurship)’ are male. Similarly, 69.9% of
candidates for ‘BSc (Hons) Management (Human Resources)’ are female.
An important consideration to be made here is that the university wants a balance
between male and female candidates for each of its academic programmes. This objective is
particularly important for ‘BSc (Hons) Management (Innovation, Sustainability and
Entrepreneurship)’ as studies have shown that there are more male entrepreneurs in the world than
female ones (Chamorro-Premuzic, 2014). As such, it is in the interest of Manchester Business
School, as an institution that nurtures diversity and equality (Manchester Business School,
2015), to promote and encourage female entrepreneurship.
Given this, together with the information in Table 6, there is an opportunity for
the university to change its marketing communications to make each course more appealing to
the gender from which it receives fewer applications. As such, the course brochure for ‘BSc
(Hons) Management (Innovation, Sustainability and Entrepreneurship)’ can potentially
include a female alumna’s success story after graduation. In doing so, the university will
signal to potential female applicants that the course is also relevant for them by highlighting
the benefits arising from undertaking this academic programme. The same line of reasoning
can be applied to ‘BSc (Hons) Management (Human Resources)’, but this time the focus
could be on a male applicant’s testimonial.
This consideration becomes more important as students are able to change their
specialisms for the management courses during their academic studies, with the final choice
to be made in the final year. What this implies is that even if the university does not wish to
change its marketing communications prior to student enrolment, it can still communicate
the benefits of these courses to the less interested audiences after they have enrolled at
Manchester Business School. As a suggestion of how this can be done, the university can hold
a guest lecture, where a female graduate talks about the benefits she personally received from
enrolling on ‘BSc (Hons) Management (Innovation, Sustainability and Entrepreneurship)’,
while a male graduate can discuss his personal experience as a student enrolled on ‘BSc
(Hons) Management (Human Resources)’. Such a recommendation is based on academic
research on the spokesperson gender effect, which suggests that the gender of a spokesperson
can have a positive impact on the same gender target audience’s attitude towards the
promoted service (Wolin, 2003).
The third and final instance of cluster analysis was done by selecting ‘UCAS Choice
Number’ and ‘Response’ as inputs for the model. This attribute selection was done in order to
get a better understanding of what happens between when students select their UCAS choices
and when they have to select their firm and insurance choices, which happens later in the
application process. It is worth mentioning at this point that the UCAS choice number is
indicative of which university a student prefers, as most personal statements are tailored
to the academic programme offered by their number one university choice, with the exception
of those candidates who apply to the same academic course offered by different
universities. The results are in Table 7.
Table 7: Cluster analysis results with 'UCAS Choice Number' and 'Response' as inputs
| Cluster size | Cluster size as a percentage of the population (%) | Input 1 | Input 2 |
| --- | --- | --- | --- |
| 321 | 8.8 | UCAS Choice Number: 1 (100%) | Response: D (100%) |
| 146 | 4.0 | UCAS Choice Number: 1 (100%) | Response: F (100%) |
| 34 | 0.9 | UCAS Choice Number: 1 (100%) | Response: I (100%) |
The first point to mention is that out of the 321 applicants who selected
Manchester Business School as choice number 1 and then declined the offer made by the
university, as seen in Table 7, 290 received a conditional offer. This implies that,
even if Manchester Business School was their preferred choice, the marks required to be
enrolled on the course were too high for those particular candidates and, as such, they were
forced to decline. Such behaviour suggests that the university is a very competitive academic
institution with high entry requirements, which has important implications for marketing
communication practices. In essence, mass marketing techniques, such as advertising on TV
or on billboards, as was the case mentioned previously of University of Ulster, are not
appropriate for Manchester Business School as it is interested in high quality students, instead
of a large volume of them. Thus, the focus should be on one-to-one communication tools,
such as direct marketing.
The final observation that needs to be made with respect to the results presented in
Table 7 is concerned with applicants that choose Manchester Business School as choice
number 1, but after they receive the offer from the university, decide on an insurance
response. This implies that these particular students are confident in their ability to meet the
conditions put forward by the university and instead opt for another institution as their firm
choice, one which potentially has harder to meet conditions that the student is not sure he/she
will achieve. Ultimately, this suggests that these candidates are of high quality, as they are able
to meet the already tough conditions put forward by Manchester Business School, yet
the university loses them during the application process despite their having
initially listed it as UCAS choice number 1. As such, this cluster of students is of high
importance to the university and represents a group that should be directly targeted with
marketing communications in an effort to further persuade them of the benefits of
enrolling at Manchester Business School.
This final observation concludes the findings obtained from cluster analysis. The
discussion now turns to association rules, as these in particular have revealed
important, actionable information, even more so than cluster analysis.
4.2. Findings – Association rules
The first instance of association rules analysis was conducted by selecting ‘Academic
Programme’ as input, or antecedent, and ‘Response’ as target, or consequence. The choice
was made in order to reveal any differences between academic programmes with respect to
the type of responses they receive. Table 8 contains the results.
Table 8: Rules identified with ‘Academic Programme’ as input and ‘Response’ as target
| Consequence | Antecedent | Support (%) | Confidence (%)* |
| --- | --- | --- | --- |
| Response = F | Academic Programme = BSc (Hons) in Information Technology Management for Business with Industrial Experience | 2.32 | 21.17 |
| Response = F | Academic Programme = BSc(Hons) in Information Technology Management for Business | 2.70 | 4.04 |
| Response = F | Academic Programme = BSc (Hons) Management (International Studies) | 2.32 | 16.47 |
| Response = F | Academic Programme = BSc (Hons) Management (Human Resources) | 3.64 | 8.27 |
| Response = F | Academic Programme = BSc (Hons) in Information Technology Management for Business with Industrial Experience | 2.32 | 21.17 |
| Response = F | Academic Programme = BSc (Hons) Accounting | 7.81 | 12.58 |
*For this comparative analysis, a minimum confidence level of 4.0% was selected in IBM SPSS Modeler 14.2
as opposed to the agreed 10%, to successfully highlight the differences between academic programmes in
terms of responses received.
There are two important findings that need to be discussed. Firstly, comparing
applicants to a course with industrial experience with applicants to a course without it,
in relation to the share of firm choices
the university receives, reveals that applicants to a course with
industrial experience, in this case ‘BSc (Hons) in Information Technology Management for
Business with Industrial Experience’, are more likely (21.17%) to select Manchester Business
School as a firm choice than those not on an industrial experience course, for whom the
figure is only 4.04%, as is the case for ‘BSc (Hons) in Information Technology Management for
Business’. This behaviour is observable across other academic programmes, such as ‘BSc
(Hons) Management (International Studies)’, ‘BSc (Hons) Management (Human Resources)’
or ‘BSc (Hons) Accounting’, each experiencing a lower chance of being selected as a firm
choice than the course with industrial experience. However, it is important to note that for the
year 2013 the only course with industrial experience was ‘BSc (Hons) in Information
Technology Management for Business’. This is actionable information that the university can
use to potentially alter the marketing communication of courses without industrial experience
to make them more attractive. For instance, communications can reassure students of their
employability chances even when not taking a course with industrial experience. Such an
initiative by the university becomes even more essential with studies suggesting that graduates
with industrial experience are more likely to get jobs than those without (Peacock, 2012).
The second finding from Table 8 is that students who applied for a more specific
academic programme, such as ‘BSc (Hons) in Information Technology Management for
Business with Industrial Experience’, which according to UCAS is provided by only 3
institutions in the U.K. (UCAS, 2015b), are more likely to respond with a firm choice,
21.17%. In contrast, those applying for a more general course, such as ‘BSc (Hons)
Accounting’, provided by 143 institutions in the U.K. (UCAS, 2015c), respond with a firm
choice in only 12.58% of cases. This implies that the university should direct its
efforts, through marketing communications, towards further differentiating those courses for
which it experiences high competition, by focusing on those specific course
characteristics that no other provider has.
The second instance of association rules analysis was conducted by selecting ‘HO
Domicile’ as input and ‘Response’ as target. Similarly to the above analysis, attributes were
selected to showcase potential differences, but this time with respect to domicile. The results
are in Table 9.
Table 9: Rules identified with ‘HO Domicile’ as input and ‘Response’ as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| Response = F | HO Domicile = UK | 42.76 | 15.09 |
| Response = F | HO Domicile = EU | 9.81 | 12.26 |
As can be seen in Table 9, there is only a small difference in ‘firm’ responses based
on whether the student lives in the U.K. or the EU. The difference can be attributed to the
additional financial issues that an EU student might face when coming to study in the U.K.,
such as accommodation costs, or it can be caused by the ultimate decision to stay at home and
continue living with the family. While the difference is not substantial and does not warrant
any changes in marketing communication practices, it is a reminder to the university that
prospective students from different regions face different challenges in attending higher
education institutions.
Another set of rules that were discovered, as seen in Table 10, occurred when selecting
‘LEA Domicile’ as an antecedent and ‘Response’ as a consequence. This attribute choice
further breaks down the results from Table 9.
Table 10: Rules identified with ‘LEA Domicile’ as input and ‘Response’ as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| Response = D | LEA Domicile = Bulgaria | 1.04 | 76.31 |
| Response = D | LEA Domicile = Romania | 1.31 | 35.41 |
| Response = F | LEA Domicile = Surrey | 1.23 | 24.44 |
| Response = F | LEA Domicile = Lancashire | 2.32 | 15.30 |
| Response = F | LEA Domicile = Manchester | 2.93 | 10.28 |
First of all, a significantly larger proportion of students domiciled in Bulgaria decline
the offer received, 76.31%, than of those domiciled in Romania, 35.41%. As was the case
before, the university is looking for a balance between students from different countries, and
as such there is no justification for a disparity such as the one seen in Table 10. However,
based solely on the information provided by the analysis, there is no means to suggest
improvements for marketing communications. Instead, the analysis acknowledges the existence
of a substantial difference between the two countries, and it is in the interest of the
university to further investigate the behaviour.
Having said this, there are changes that can be suggested based on the second part of
the information contained in Table 10. As can be seen, 24.44% of students who have their
domicile in Surrey, a geographical region close to London, respond with a firm choice for
Manchester Business School. However, smaller proportions of students from Lancashire and
Manchester, 15.30% and 10.28% respectively, respond with a firm choice to the offer made by
the university.
What the research anticipated was opposite to the results presented above, in the sense
that it was expected that students from regions close to the university would be much easier to
acquire than those further away. This was due to the belief that local students will enrol
because of decreased living expenses, which could occur from staying with their families for
the duration of their studies. However, these contrasting results do highlight the necessity of
further emphasising the benefits of the university to these local students, which, as mentioned
previously, can be done through sending an alumnus to visit schools. Moreover, the content of
the direct marketing messages sent to these students needs to move away from general facts
about the city or university that, more than likely, they are familiar with, but more distant
students are not. Instead, messages can contain specific information that local applicants were
not aware of but which can persuade them to move forward with their application at
Manchester Business School. In fact, because they are familiar with most of the content the
university sends to them, there is a clear opportunity to move beyond this informational nature
of marketing communications to a more emotional one.
Another interesting finding was obtained from choosing ‘Academic Programme’ as an
antecedent and ‘UCAS Choice Number’ as the target. The intent with this analysis was to
identify academic programmes that are more likely to contain undecided students represented
here by a UCAS choice number 3. The results are in Table 11.
Table 11: Rules identified with ‘Academic Programme’ as input and ‘UCAS Choice Number’
as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| UCAS Choice Number = 3 | Academic Programme = BSc (Hons) International Business, Finance and Economics | 7.87 | 22.57 |
| UCAS Choice Number = 3 | Academic Programme = BSc (Hons) in Information Technology Management for Business with Industrial Experience | 2.32 | 15.29 |
What the above results suggest is that for certain academic programmes, such as ‘BSc
(Hons) in Information Technology Management for Business with Industrial Experience’,
students are less likely to select the university as UCAS choice number 3, 15.29%, than for
other academic programmes. UCAS choice number 3 is relevant to the university as these
students are at the threshold of selecting the university as a top choice but for various reasons
did not nominate Manchester Business School in the insurance or firm choices range, namely
UCAS choice number 2 or 1. Ideally, the university would thus like to elevate UCAS choice 3
to 1 or 2 so as to capture a wider applicant pool.
By contrast, there are academic programmes such as ‘BSc (Hons) International
Business, Finance and Economics’, where there is a high opportunity to shift the student base
from inferior UCAS choices to superior ones, such as 1 or 2. It is, thus, from the results
presented in Table 11 that the institution can further leverage marketing communications to
target applicants to academic programmes which exhibit the behaviour mentioned above, in
an effort to further emphasise the advantages of studying at Manchester Business School so as
to secure higher level UCAS choices.
Potentially one of the most significant findings can be seen in Table 12. These results
were obtained by selecting ‘Decision’ as input and ‘Response’ as a consequence. The scope
was to understand how students react with respect to what offer they receive from the
university.
Table 12: Rules identified with ‘Decision’ as input and ‘Response’ as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| Response = F | Decision = U | 21.02 | 69.44 |
| Response = I | Decision = U | 21.02 | 19.38 |
| Response = D | Decision = U | 21.02 | 11.18 |
We will begin the discussion by assuming that the university makes unconditional
offers to those students who are of extremely high quality and in whom it has a special
interest. As such, the institution is faced with a problem, as it can only secure firm choices
from these very valuable candidates in 69.44% of the cases. Put differently, the university
loses 11.18% of unconditional offers to those students who decline the offer, while 19.38% of
candidates select Manchester Business School as the insurance choice.
Having said this, there is a clear opportunity for marketing communications to convert
the applicants who either decline the offer or select it as insurance into candidates with firm
choices. Clearly, this group of students is important for Manchester Business School, and the
first step in increasing the likelihood of acquiring them is to develop a marketing
communication message which is directly, and solely, targeted at them. Until now, the
discussion has focused on targeting certain clusters through direct marketing, but with this
particular group of students, representing 30.56% (n = 235) of the entire population of 769
applicants that Manchester Business School makes unconditional offers to, there is the
opportunity to move beyond such practices.
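The figures quoted above can be checked against the confidence values in Table 12 with a short calculation. The sketch below uses only numbers stated in the text (769 unconditional offers and the three confidence values); the rounding to whole applicants and the variable names are assumptions made for illustration.

```python
# Figures quoted in the text and in Table 12.
unconditional_offers = 769
confidence = {"F": 69.44, "I": 19.38, "D": 11.18}  # percentages

# Approximate head counts implied by each confidence value.
counts = {
    response: round(unconditional_offers * pct / 100)
    for response, pct in confidence.items()
}

# Applicants who did not respond with a firm choice (insurance or decline).
not_firm_share = confidence["I"] + confidence["D"]
not_firm_count = counts["I"] + counts["D"]

print(counts)
print(f"{not_firm_share:.2f}% of unconditional offers, n = {not_firm_count}")
```

The insurance and decline shares add up to the 30.56% reported, and the implied head counts add up to 235 applicants.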
Thus personal communications, as outlined in the literature review section, would
form a suitable communication tool to target this particular student segment. More
specifically, there is the possibility to host an online presentation, including a question and
answer session, which is available only to students who received an unconditional offer. The
advantage of engaging in this type of activity is twofold: first of all, the university has a
chance to further emphasise the benefits of attending Manchester Business School, while
prospective candidates will have the opportunity to clarify any questions that they have and
convince themselves during the process that Manchester Business School is the appropriate
choice for them.
It is this dissertation’s belief that even with the difficulties that will be faced in setting
up this form of marketing communication, the university will benefit as not only will it be
able to attract a larger proportion of higher quality candidates, but it will also position itself as
an innovator in terms of engaging with its target audience. While other universities are
identifying their high quality applicants and targeting them by sending out unconditional
offers (University of Birmingham, 2013), the above mentioned proposition is a more
elaborate marketing communication strategy as it directly engages with those desired
candidates after the offer has been made in an attempt to secure them as future students.
Additionally, word of such behaviour by Manchester Business School may spread, attracting
other very good candidates that want to be part of a university which places itself at the
forefront of student engagement.
Finally, the most impactful result gained from the analysis was obtained when
selecting ‘Decision’ as input and ‘Response’ as a consequence, as for the rules presented in
Table 12, but this result is treated separately as its implications are significant. The result is
showcased in Table 13.
Table 13: The significant rule identified with ‘Decision’ as input and ‘Response’ as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| Response = D | Decision = C | 35.38 | 99.77 |
What this suggests is that almost all, 99.77%, of applicants that receive a conditional
offer from the university decline it, when in fact some of them should have either become
unconditional, once they met the offer, or moved into matriculation. Moreover, this rule has a
high support value and, as such, represents a significant rule in terms of size. Simply stated,
the data implies that every applicant that receives a conditional offer declines it.
This rule has such a strong implication on the way data is handled by Manchester
Business School that the results were cross checked with cluster analysis, but the outcome
remained the same.
In reality, such behaviour can never occur and implies a major error in the way the
university records data. This is because a significant proportion of applicants receive a
conditional offer, as seen in Figure 4 page 26, of whom a good percentage go on to enrol
on their course. If all candidates declined the conditional offer, Manchester Business
School would be left with a very small number of enrolled students, an implication that directly
contradicts real world behaviour. Moreover, when examining only those records which have
‘Programme Action = Matriculation’, the entire sample, 534 applicants, is recorded as having
received an unconditional offer, as seen in Table 14. Clearly, this cannot be correct, as
there are students who received a conditional offer and matriculated at the university.
Table 14: The rule identified with ‘Programme Action’ as input and ‘Decision’ as target

| Consequence | Antecedent | Support (%) | Confidence (%) |
|---|---|---|---|
| Decision = U | Programme Action = Matriculation | 14.60 | 100.0 |
The argument put forward is not that every candidate with a conditional offer should
be matriculated, but the expectations are that some candidates will be. In fact, a considerable
part of those that decline a conditional offer is rightly represented by those that did not meet
the conditions, did not communicate the results to the university and as such were
automatically declined, or those individuals that declined the offer, even if conditions were
met, to enrol at a different university.
What explains this behaviour is that the majority of candidates start out at the
beginning of the application process with a conditional offer. If a candidate meets these
conditions, he/she is then recorded as having received an unconditional offer. In essence,
the characteristics of a candidate change over the course of the application process. This
behaviour is supported by the relationship between ‘Programme Action’ and ‘Decision’
mentioned previously in Table 14.
In terms of data recording practices, it means that the people in charge of managing
the data set modify, or change, the contents of each data entry as the application process
progresses, instead of maintaining the initial value for each candidate and subsequently
appending the data set with new pieces of information. Such a practice is not advisable as
valuable information is lost during the process, information which can be used to track the
behaviour of candidates along the application process.
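The effect described above can be illustrated with a small sketch. It assumes, following the explanation given in the text, that a conditional offer is overwritten as unconditional once the conditions are met, and that a candidate who does not meet them ends up recorded as declined. The six histories, the function names and the field layout are invented for illustration and do not reproduce the university's system or data.

```python
# Invented application histories: (initial decision, response, conditions met?)
histories = [
    ("C", "F", True),
    ("C", "F", True),
    ("C", "F", False),
    ("C", "D", None),
    ("C", "D", None),
    ("U", "F", None),
]

def overwritten_snapshot(initial_decision, response, conditions_met):
    """Final state of a record when each change replaces the previous value."""
    decision = initial_decision
    if initial_decision == "C" and response == "F":
        if conditions_met:
            decision = "U"   # conditional offer is overwritten as unconditional
        else:
            response = "D"   # assumed: unmet conditions end up recorded as declined
    return decision, response

def confidence(pairs, decision, response):
    """Confidence (%) of the rule Decision = decision -> Response = response."""
    matching = [r for d, r in pairs if d == decision]
    return 100.0 * matching.count(response) / len(matching) if matching else 0.0

snapshot = [overwritten_snapshot(*h) for h in histories]
initial = [(d, r) for d, r, _ in histories]

print(confidence(snapshot, "C", "D"))  # computed on overwritten records
print(confidence(initial, "C", "D"))   # computed on the preserved initial values
```

On the overwritten records every remaining conditional offer appears declined, whereas the preserved initial values show that only some conditional offers were declined, which is the information that overwriting destroys.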
Recommendations as to what Manchester Business School can do to alleviate the
problem of flawed data recording practice are presented in the Discussion section, together
with a general set of suggestions to improve the marketing communication strategy.
5. Discussion
This chapter brings together all the different rules identified in the Results section in
order to recognise recurring suggested marketing communications practices and propose an
overall set of recommendations to Manchester Business School. The discussion will focus on
the overall picture obtained from the data mining analysis to develop a clear, actionable
plan that the university can follow in order to create more relevant marketing communications
messages for prospective students.
The most significant result obtained from the analysis is concerned with the way the
university saves the data it collects. As previously mentioned, the institution changes data
entries when new information is available, instead of, for example, appending this new
information to the existing data set and leaving previous information unmodified. Because
this practice is causing the university to lose valuable records, the primary recommendation of
this dissertation is for Manchester Business School to append a new entry every time a record
changes rather than overwrite existing data, and to put a timestamp on the new attributes
(Petrounias, 1997), as seen in Table 15.
Table 15: An example* of how the university can append entries to a student’s record

| UCAS Person ID | UCAS Choice Number | Response | Decision | Date |
|---|---|---|---|---|
| 1234567890 | 1 | | | 15/01/2015 |
| 1234567890 | 1 | | Conditional | 20/02/2015 |
| 1234567890 | 1 | Firm | Conditional | 24/04/2015 |
| 1234567890 | 1 | Firm | Unconditional | 03/05/2015 |

*This example is for demonstrative purposes only and does not contain any identifiable individuals nor is the
data an accurate representation of actual behaviour.
To further elaborate on the above point, the university will use the ‘UCAS Person ID’
together with the date/time at which the data was appended as the identifier for records
which represent a candidate at different stages of the application process (Petrounias, 1997).
For example, the initial record, represented by the orange colour in Table 15, is created at the
moment the university receives the application from the candidate. This record will include,
for example, the characteristics of that applicant (age, gender, LEA domicile) and their UCAS
choice number, but will not contain the response they gave to the university (firm, insurance
or decline), as the application process has not reached that stage at this point in time. As such,
these attributes will form the basis of a candidate’s record at the university and will be
timestamped accordingly by the individual responsible for managing the data.
Further along the application process, when the university has reached a decision with
respect to the applicant, represented by a green colour in Table 15, and again when the
candidate communicates his/her response to the university, in blue colour, the
institution will take these new attributes and append them to the initial data set, ensuring that a
timestamp is put on them. Finally, when the candidate has met his/her conditional offer and the
offer thus becomes unconditional, this will be appended as such in the data set together with the
corresponding timestamp, as seen in purple colour. The same procedure will be carried out for
any new attributes that are created during the application process.
The benefit of having different records, captured in the same data set, each with its
corresponding timestamp, is that a complete picture of how the candidate has behaved and
how his/her characteristics changed along the application process will be created. As a
consequence, the university will be in a position, for instance, to compare the UCAS choice
number with response received, and more significantly understand how the students react,
based on their response, to an offer obtained from the university. Moreover, no information
will be lost along the selection process, as no information will be overwritten by a new one.
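The append-only practice described above can be sketched as a small timestamped log that mirrors the demonstrative rows of Table 15. This is a minimal illustration under stated assumptions, not a description of the university's system: the functions `append_entry` and `state_on`, the field names, and the choice to carry earlier attributes forward into each new row are all hypothetical, and entries are assumed to be appended in chronological order.

```python
from datetime import date

# Append-only log: each change adds a new row; earlier rows are never modified.
# A row is identified by the pair (person_id, date), as suggested in the text.
log = []

def append_entry(log, person_id, on, **attributes):
    """Append a timestamped row that carries forward the candidate's earlier attributes."""
    previous = [e for e in log if e["person_id"] == person_id]
    base = dict(previous[-1]) if previous else {"person_id": person_id}
    base.update(attributes)
    base["date"] = on
    log.append(base)

def state_on(log, person_id, on):
    """Return the latest row for a candidate recorded on or before the given date."""
    entries = [e for e in log if e["person_id"] == person_id and e["date"] <= on]
    return max(entries, key=lambda e: e["date"]) if entries else None

append_entry(log, "1234567890", date(2015, 1, 15), ucas_choice=1)
append_entry(log, "1234567890", date(2015, 2, 20), decision="Conditional")
append_entry(log, "1234567890", date(2015, 4, 24), response="Firm")
append_entry(log, "1234567890", date(2015, 5, 3), decision="Unconditional")

print(state_on(log, "1234567890", date(2015, 3, 1))["decision"])
print(log[-1]["decision"])
```

Because the earlier rows survive, the candidate's original conditional offer can still be recovered after the offer has become unconditional, which is what allows the initial decision to be compared with the eventual response.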
To summarise, the first recommendation is concerned with how and when the
university saves student applications and decisions. The second one is concerned with what
information the university saves with respect to each candidate.
One important attribute not included in the data set is the grades, both predicted and
obtained, that a candidate has. This information will have a substantial positive impact on the
data mining exercise, and subsequently the institution, as the university can better understand
what the characteristics of its highest quality applicants are. While it is acknowledged that not
all students take A levels and there are different entry requirements for different qualifications
and different countries, this knowledge can be used to produce marketing communications
which are more relevant to this target segment as the university can try and predict what
grades the student will obtain and in what subject fields.
Perhaps a more important piece of information that the university does not save is the
personal statement that each candidate writes when applying. The content of the personal
statement can be extremely revealing about who a candidate is and what his/her interests are,
knowledge which cannot be inferred from a database such as the one used in this
dissertation, and which is lost if the information is not saved. Moreover, a personal
statement is a means through which a candidate directly communicates to the institution what
he/she considers relevant in supporting his/her application, and as such opens the possibility
to truly personalise, not just target, the marketing communications sent to the candidate. The
future research opportunities section will further elaborate on this point, by suggesting ways in
which data mining can be used to draw value from personal statements.
So far the focus was on suggesting recommendations on how Manchester Business
School should manage its data set. Subsequently, the discussion will emphasise general
marketing communication practices that can be implemented following the analysis
previously carried out.
First of all, there is a clear opportunity to target marketing communications based on
the geographic region in which the student is resident and the academic programme for which
the candidate is applying. As such, the content of the messages and the tools utilised will
differ based on the above-mentioned characteristics, with the outcome of producing more
relevant marketing communications.
This practice does not imply that the entirety of the content needs to change; it only
suggests that certain paragraphs can be modified to better suit a group’s characteristics. At the
same time, not every student segment that is identified should have a dedicated type of
content, as this can prove costly for Manchester Business School to produce. Instead, the
university can retain its existing practices, whereby it sends out the same form of message to
all applicants, with the exception of those groups which it has identified as valuable to the
institution and which justify the additional costs needed to produce a more relevant
marketing communication message.
The second recommendation is that marketing communications can
be used to secure highly valuable candidates who are lost during the application process, as
evidenced by the analysis carried out above. Data mining has shed light on the characteristics of
valuable applicants who decide to enrol at a different university after they have received an
unconditional offer from Manchester Business School. As such, marketing communications
can be sent out to this student group in an attempt to reinforce the position of Manchester
Business School as the desired academic institution, with the intended outcome of capturing a
bigger proportion of ‘must have’ candidates.
Finally, the analysis has revealed that there is a clear opportunity for the university to
utilise marketing communications in order to differentiate those courses for which it
experiences high competition. This can be done, at first, by emphasising those specific course
characteristics that no other higher education institution has, in an attempt to secure a
higher proportion of firm choices than currently experienced. However, the university should
attempt to move beyond these specific rational features and establish an emotional connection
with future applicants, further differentiating itself from other universities (Berry,
2000).
6. Conclusion
6.1. Summary of Findings
Data mining analysis in the context of higher education undergraduate enrolment can
be used to enhance marketing communication practices. Because academic institutions store a
wide range of information on candidates, it can be analysed to identify student groups and
applicant behaviour, which feed directly into developing more relevant messages for
prospective students.
Data mining has already been applied to develop marketing communications across
various industries, such as the banking or retail sectors. The results presented here
demonstrate that similar improvements can be obtained in the higher education sector. As
such, the benefit of this study is twofold. Firstly, it has direct implications on how universities
collect, store and use the information with respect to each candidate. Secondly, it expands the
realm of applicability of data mining by enunciating its benefits on the student selection
process carried out by higher education institutions.
6.2. Wider implications
One of the motivations behind this dissertation was to extend the findings of this study
to other higher education institutions within the U.K. Clearly there is an opportunity for other
universities to carry out the analysis done here, as the data required is available from UCAS
which supplies the application platform for almost all U.K. universities (UCAS, 2015d).
However, the factor that needs to be taken into consideration, which differs from
institution to institution, is what each organisation attempts to achieve from the data mining
activity. Some universities are interested in high quality applicants, while others will be
concerned with attracting a large volume of students. Because the business objective drives
the data mining activity, each higher education institution will need to use the results obtained
from the data analysis in such a way to work towards their organisational objective. In fact,
the analysis carried out here is not limited to just U.K. universities, as almost all higher
education institutions from around the world will have data portraying each candidate.
6.3. Limitations of Study
While the results presented in this study are an adequate representation of an
organisation’s challenges and opportunities, it is important to acknowledge the limitations of
the research.
First of all, the data set used was relatively small, containing 4,110 entries, in
comparison to those used in other data mining exercises (KDnuggets, 2013), but more
significantly it contained the applicants of only one enrolment year, 2013. A small sample
size can potentially overstate or understate the importance of certain attributes in the analysis.
Moreover, having data entries from just one intake year means that the results do not portray a
generalisable set of findings for Manchester Business School, but instead represent a
snapshot in time of the behaviour encountered in 2013. If the institution pursues the
recommendations presented in this dissertation and the findings turn out not to be
representative of events in other years, it can damage, rather than improve, marketing
communication practices. As such, future analysis should be done with more than one year’s
worth of data to increase the reliability of the conclusions presented in this dissertation. This
observation links well with the CRISP-DM methodology presented earlier, whereby data
mining is a cyclical process, suggesting that it should be carried out continuously. Specifically
for this investigation, it means that each enrolment year’s data should be analysed and included
in the decision making process, rather than utilising only a single data set to drive decisions.
Secondly, the data utilised to suggest future courses of action can contain inaccurate
information. This becomes a more pressing issue as certain errors were already discovered in
the data set. With the possibility of further errors unidentified, the findings of this study
should not be considered unquestionably true. However, once the proposed data management
recommendations are adopted, which should ensure fewer errors in the data set, the analysis
can be carried out again to confirm or refute the findings presented in this dissertation.
A third limitation of this study is that the understanding obtained from data mining on
its own is incomplete, being unable to explain, for example, why a certain outcome occurs in
the first place. In order to add value to the analysis, other research methods should be
included. For instance, a questionnaire can be sent out to students in their first year of study
with questions directed to the marketing communications they received during their
application process. A better understanding of what students thought about the content,
relevance and timing of the messages will be obtained which can be used in conjunction with
the data mining analysis to develop a more complete picture of the situation.
6.4. Future Research Opportunities
This study has focused on performing data mining analysis on Manchester Business
School’s prospective student application data set. It is anticipated, however, that further data
mining analysis will be carried out on other types of information collected during the
application process, information which is already available to universities, provided
that they capture and store it. As such, it is this study’s expectation that text mining will be
carried out on student-written personal statements, or teachers’ references, in an attempt to
better understand applicants.
Text mining is concerned with automatically extracting information from textual data
(Sumathy and Chidambaram, 2013) and has seen a wide range of applicability including
analysing customer feedback (Dörre et al., 1999), or carrying out sentiment analysis for
online forums (Li and Wu, 2010). From a university’s perspective, it allows the gathering of
information that a candidate transmits to the institution in order to firstly, better understand
the characteristics of the applicant, and secondly, develop personalised marketing
communications in an effort to make the university’s service offering more relevant to the
prospective applicant.
For instance, text mining is able to reveal the information that a prospective student is
interested in football, by identifying key words and phrases such as ‘football’ or ‘soccer’
followed or preceded by ‘like’ or ‘enjoy’ in his/her personal statement. The university can
then use this information to include a section on The University of Manchester’s football
society next time it sends out a marketing communication to that respective student. But the
results from text mining are not limited to only recommending the university’s societies, as
the analysis is able to reveal much more than this.
As a further example, text mining can also be utilised to propose university college
courses, available to all students regardless of the academic programme they are enrolled on.
For instance, consider an applicant that has the following sentence in his personal statement:
I have really enjoyed my time spent volunteering for my local charity.
Text mining will analyse this phrase and recognise ‘enjoyed’ and ‘volunteering’ as key
words. This insight will then be included in the next communication sent out to the applicant,
which can contain information about the ‘Manchester Leadership Programme: Leadership in
Action’ course, emphasising the requirement to complete a certain amount of volunteering
hours as part of the unit and in order to receive the Manchester Leadership Award (University
College for Interdisciplinary Learning, 2015). As a consequence of engaging in such
behaviour, the marketing communications that a university sends out will contain a high
degree of personalisation.
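The keyword matching described in the two examples above can be sketched with regular expressions. This is a deliberately simple illustration rather than a full text mining pipeline: the `INTERESTS` mapping, the pattern choices and the function `detect_interests` are hypothetical, and a topic is flagged only when it appears in the same sentence as a form of ‘like’ or ‘enjoy’.

```python
import re

# Hypothetical patterns: a liking verb plus a topic word in the same sentence,
# in either order, triggers the associated content section.
LIKING = r"(?:like[sd]?|enjoy(?:s|ed)?)"
INTERESTS = {
    "football society": r"(?:football|soccer)",
    "Manchester Leadership Programme": r"volunteer(?:ing|ed)?",
}

def detect_interests(statement):
    """Return the content sections whose topic co-occurs with a liking verb."""
    found = []
    for sentence in re.split(r"[.!?]+", statement.lower()):
        if not re.search(rf"\b{LIKING}\b", sentence):
            continue  # no expression of liking in this sentence
        for section, topic in INTERESTS.items():
            if re.search(rf"\b{topic}\b", sentence) and section not in found:
                found.append(section)
    return found

example = "I have really enjoyed my time spent volunteering for my local charity."
print(detect_interests(example))
```

A rule of this kind is easy to audit but crude: it would also react to phrases such as ‘I would like to’, so a practical application would need richer linguistic processing than this sketch provides.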
A future research opportunity lies in utilising other data mining techniques, such as
classification, to reveal insight. For example, classification could be useful in predicting
which prospective students are likely to apply to Manchester Business School, even before the
application process has started. It is important to note, however, and future research should
take this into consideration, that not all decisions made in this dissertation with respect to
attribute selection remain valid when other data mining techniques are selected.
For instance, in this particular investigation ‘SNC Flag’ was removed because it was
considered irrelevant to the objective of this dissertation. However, this attribute should
be kept in the analysis if classification is utilised or the objective of the research changes, as it
contains information about ‘not so good’ candidates, data which is required in order to
successfully classify ‘good’ candidates.
Finally, future research, but also university practices, will attempt to bring together all
the different data sources and data mining techniques in order to obtain the most complete
picture of the applicant possible. This is similar to what for-profit organisations are doing,
whereby they construct a 360-degree view of the customer which is built from all the various
data customers generate when engaging with the organisation, such as records of previous
transactions or social media behaviour (Forrest, 2014; Rouse, 2015). While companies have
the necessary resources to pursue such a strategy, it is acknowledged that universities cannot
fully pursue such an approach as they are constrained by resources, one of which is time.
However, while a complete picture of the candidate is unachievable, universities can still
engage in practices to understand their applicants as well as possible.
First of all, classification, as previously mentioned, can be used to predict which
applicants are more likely to apply to a particular higher education institution. This
information can then be coupled with the findings gathered from an analysis similar to the one
conducted in this investigation, namely using association rules and cluster analysis to learn
about the characteristics of applicants and consequently enhance marketing communication
practices. These techniques will develop an understanding of the applicant which will be built
slowly over time, with more learning taking place as candidates respond to any marketing
communication they have received. Ultimately, these techniques will be used to build
predictive models, predicting not only who is likely to enrol at a particular university, but also
how good each candidate is and how he/she will respond to the offers made by the institution.
Secondly, investigations such as the one discussed in this dissertation can be utilised to offer
support, for example, in text mining, by analysing personal statements and teachers’
references, as previously mentioned.
Thus, by bringing together the understanding provided by analysing a university’s data
set containing information on applicants with the insight generated from performing text
mining on personal statements and references, a higher education institution will be in a
position to develop an almost complete view of its prospective students, information which is
valuable for strategic and tactical purposes.
7. References
Adam, R. and Smith, D. (2014) ‘Universities spend more to attract clearing students’ [online],
Available from: http://www.theguardian.com/education/2014/aug/08/university-spending-clearing-limit-undergraduates
[Accessed: 18.02.2015].
Anand, S. S., and Büchner, A. G. (1998) Decision support using data mining. Financial Times
Management.
Ansari, A. and Mela, C. F. (2003) ‘E-customization’ Journal of Marketing Research, 40 (2),
pp. 131-145.
Athey, S. (2014) ‘Information, Privacy, and the Internet – An economic perspective’ [online],
Available from: http://www.cpb.nl/sites/default/files/CPB-Lecture-2014-InformationPrivacy-and-the-Internet-an-economic-perspective.pdf [Accessed: 11.02.2015].
Beneke, J. (2011) Marketing the institution to prospective students–a review of brand
(reputation) management in higher education. International Journal of Business and
Management, 6 (1), pp. 29-44.
Berry, L. L. (2000) Cultivating service brand equity. Journal of the Academy of Marketing
Science, 28 (1), pp. 128-137.
Boffey, D. (2014) ‘From freshers to focus groups: how universities are learning to advertise’
[online], Available from: http://www.theguardian.com/education/2014/may/18/universities-turn-to-ad-man
[Accessed: 14.02.2015].
Çetin, R. (2004) Planning and implementing institutional image and promoting academic
programs in higher education. Journal of Marketing for Higher Education, 13 (1-2),
pp. 57-75.
Chaffey, D. (2011) E-Business & E-Commerce Management – Strategy, Implementation and
Practice. 5th edition. Harlow: Pearson Education Limited.
Chamorro-Premuzic, T. (2014) ‘The Unnatural Selection of Male Entrepreneurs’ [online],
Available from: https://hbr.org/2014/03/the-unnatural-selection-of-male-entrepreneurs/
[Accessed: 24.03.2015].
Chapleo, C. (2010) What defines “successful” university brands?. International Journal of
Public Sector Management, 23 (2), pp. 169-183.
Chapleo, C. (2013) ‘The effect of increased tuition fees on Higher Education marketing in the
UK’ [online], Available from:
http://www.communicationsmanagement.co.uk/site/assets/files/1365/the_effect_of_increased_tuition_fees_on_he_marketing.pdf
[Accessed: 12.11.2014].
Chapleo, C., Carrillo Durán, M. V. and Castillo Díaz, A. (2011) Do UK universities
communicate their brands effectively through their websites?. Journal of Marketing
for Higher Education, 21 (1), pp. 25-46.
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C. and Wirth, R.
(2000) ‘CRISP-DM 1.0 Step-by-step data mining guide’ [online], Available from:
http://the-modeling-agency.com/crisp-dm.pdf [Accessed: 10.11.2014].
Cheung, A. C., Yuen, T. W., Yuen, C. Y. and Cheng, Y. C. (2010) Promoting Hong Kong's
higher education to Asian markets: market segmentations and strategies. International
Journal of Educational Management, 24 (5), pp. 427-447.
Cubillo, J. M., Sánchez, J. and Cerviño, J. (2006) International students' decision-making
process. International Journal of Educational Management, 20 (2), pp. 101-115.
Cui, Z., Damiani, E. and Leida, M. (2007) Benefits of ontologies in real time data access.
Digital Ecosystems and Technologies Conference, DEST '07., pp. 392-397.
De Chernatony, L. (2003) Brand-building on the internet. Birmingham: Birmingham
Business.
Dörre, J., Gerstl, P. and Seiffert, R. (1999) Text mining: finding nuggets in mountains of
textual data. In Proceedings of the fifth ACM SIGKDD international conference on
Knowledge discovery and data mining, pp. 398-401.
Drucker, P. (2013) Managing for the Future. Routledge.
Durkin, M., McKenna, S. and Cummins, D. (2012) Emotional connections in higher
education marketing. International Journal of Educational Management, 26 (2), pp.
153-161.
Eagle, L. and Brennan, R. (2007) Are students customers? TQM and marketing
perspectives. Quality Assurance in Education, 15 (1), pp. 44-60.
Einav, L. and Levin, J. D. (2014) ‘The data revolution and economic analysis’ [online],
Available from: http://web.stanford.edu/~leinav/pubs/IPE2014.pdf [Accessed: 11.02.2015].
Fayyad, U., Piatetsky-Shapiro, G. and Smyth, P. (1996) From data mining to knowledge
discovery in databases. AI magazine, 17 (3), pp. 37-54.
Fill, C. (2013) Marketing communications – brands, experiences and participation, 6th
Edition, Harlow: Pearson Education Limited.
Forrest, P. (2014) ‘Big Data and the 360 degree customer view’ [online], Available from:
http://www.mbnsolutions.com/big-data-and-the-360-degree-customer-view/
[Accessed: 24.04.2015].
Furey, S., Springer, P. and Parsons, C. (2014) Positioning university as a brand: distinctions
between the brand promise of Russell Group, 1994 Group, University Alliance, and
Million+ universities. Journal of Marketing for Higher Education, 24 (1), pp. 99-121.
Gangadharan, G. R. and Swami, S. N. (2004) Business intelligence systems: design and
implementation strategies. 26th International Conference on Information Technology
Interfaces, pp. 139-144.
Gibbs, P. (2001) Higher education as a market: a problem or solution?. Studies in Higher
Education, 26 (1), pp. 85-94.
Gibbs, P. and Knapp, M. (2002) Marketing higher and further education: An educator's guide
to promoting courses, departments and institutions. London: Kogan Press.
Golfarelli, M., Rizzi, S. and Cella, I. (2004) Beyond data warehousing: what's next in
business intelligence?. Proceedings of the 7th ACM international workshop on Data
warehousing and OLAP, pp. 1-6.
Han, J., Kamber, M. and Pei, J. (2012) Data mining: Concepts and Techniques, MA: Morgan
Kaufmann Publishers Inc.
Hegland, M. (2001) Data mining techniques. Acta Numerica 2001, 10, pp. 313-355.
Helgesen, Ø. (2008) Marketing for higher education: A relationship marketing
approach. Journal of Marketing for Higher Education, 18 (1), pp. 50-78.
Ho, H. F. and Hung, C. C. (2008) Marketing mix formulation for higher education: An
integrated analysis employing analytic hierarchy process, cluster analysis and
correspondence analysis. International Journal of Educational Management, 22 (4),
pp. 328-340.
Horn, B. and Huang, W. (2009) ‘Comparison of Segmentation Approaches’ [online],
Available from: http://www.decisionanalyst.com/publ_art/CompareSegmentation.dai
[Accessed: 15.03.2015].
Hughes, A. M. (2002) Editorial: The mirage of CRM. Journal of Database Marketing &
Customer Strategy Management, 9 (2), pp. 102-104.
IAB (2015) ‘Glossary of Interactive Advertising Terms’ [online], Available from:
http://www.iab.net/wiki/print/ [Accessed: 11.02.2015].
IBM (2011) ‘IBM SPSS Modeler 14.2 Modeling Nodes’ [online], Available from:
ftp://public.dhe.ibm.com/software/analytics/spss/documentation/modeler/14.2/en/ModelingNodes.pdf
[Accessed: 10.04.2015].
IBM (2012) ‘IBM SPSS Modeler CRISP-DM Guide’ [online], Available from:
ftp://public.dhe.ibm.com/software/analytics/spss/documentation/modeler/15.0/en/CRISP_DM.pdf
[Accessed: 06.11.2014].
Ivy, J. (2008) A new higher education marketing mix: the 7Ps for MBA
marketing. International Journal of Educational Management, 22 (4), pp. 288-299.
James, P. (2011) Issues and Perspectives Related to Mobile Brand Marketing in Thai Private
Higher Education. Journal of Management Research, 3 (1), pp. 1-25.
Jansen, I. and Brenn-White, M. (2011) ‘Overview of Current Marketing Initiatives by Higher
Education Institutions (HEI) and National Agencies Within the European Higher
Education Area (EHEA), Focusing on “Marketing the EHEA“’ [online], Available
from: http://www.ehea.info/Uploads/presentations/IPN%20Survey%20Report%2025%20March%202011.pdf
[Accessed: 08.11.2014].
Kahaner, L. (1996) Competitive intelligence. New York: Simon & Schuster.
Kalyanaraman, S. and Sundar, S. S. (2006) The psychological appeal of personalized content
in web portals: does customization affect attitudes and behavior?. Journal of
Communication, 56 (1), pp. 110-132.
Kamakura, W. A., Wedel, M., De Rosa, F. and Mazzon, J. A. (2003) Cross-selling through
database marketing: a mixed data factor analyzer for data augmentation and
prediction. International Journal of Research in Marketing, 20 (1), pp. 45-65.
KDnuggets (2007) ‘Data Mining Methodology’ [online], Available from:
http://www.kdnuggets.com/polls/2007/data_mining_methodology.htm
[Accessed: 03.11.2014].
KDnuggets (2013) ‘Poll Results: Largest Dataset Analyzed/Data Mined’ [online], Available
from: http://www.kdnuggets.com/2013/04/poll-results-largest-dataset-analyzed-data-mined.html
[Accessed: 04.04.2015].
Kotler, P. and Fox, K. F. (1995) Strategic marketing for educational institutions, 2nd Edition,
New Jersey: Prentice-Hall.
Kumar, D. and Bhardwaj, D. (2011) Rise of data mining: Current and future application
areas. IJCSI International Journal of Computer Science Issues, 8 (5), pp. 256-260.
Kumar, V. (2010) Customer relationship management. John Wiley & Sons, Ltd.
Kurgan, L. A. and Musilek, P. (2006) A survey of Knowledge Discovery and Data Mining
process models. The Knowledge Engineering Review, 21 (1), pp. 1-24.
Lachlan, J. (2014) ‘BI and analytics delivering over 1300% ROI according to Nucleus
Research: Do you believe it?’ [online], Available from:
http://www.yellowfinbi.com/YFCommunityNews-BI-and-analytics-delivering-over1300-ROI-according-to-Nucleus-Research-Do-you-b-175078 [Accessed: 22.10.2014].
Li, N. and Wu, D. D. (2010) Using text mining and sentiment analysis for online forums
hotspot detection and forecast. Decision Support Systems, 48 (2), pp. 354-368.
Li, W., Wu, X., Sun, Y. and Zhang, Q. (2010) Credit card customer segmentation and target
marketing based on data mining. Computational Intelligence and Security (CIS), pp.
73-76.
Linoff, G. S. and Berry, M. J. (2011) Data mining techniques: for marketing, sales, and
customer relationship management. Indianapolis: Wiley Publishing, Inc.
Mainardes, E. W., Alves, H., Raposo, M. and de Souza Domingues, M. J. C. (2012)
Marketing in higher education: A comparative analysis of the Brazil and Portuguese
cases. International Review on Public and Nonprofit Marketing, 9 (1), pp. 43-63.
Manchester Business School (2015) ‘Culture and structure’ [online], Available from:
http://www.mbs.ac.uk/about-mbs/culture/ [Accessed: 24.03.2015].
Marbán, Ó., Mariscal, G. and Segovia, J. (2009) A Data Mining & Knowledge Discovery
Process Model. Data Mining and Knowledge Discovery in Real Life Applications, pp.
1-17.
Maringe, F. and Mourad, M. (2012) Marketing for Higher Education in Developing
Countries: emphases and omissions. Journal of Marketing for Higher Education, 22
(1), pp. 1-9.
Marr, B. (2014) ‘What is Business Intelligence (BI)?’ [online], Available from:
http://www.ap-institute.com/Business%20Intelligence.html [Accessed: 18.10.2014].
Miller, S. H. (2001) Competitive Intelligence – an overview. Competitive Intelligence
Magazine, 1 (11), pp. 1-14.
Moogan, Y. J. (2011) Can a higher education institution's marketing strategy improve the
student-institution match?. International Journal of Educational Management, 25 (6),
pp. 570-589.
Morgan, J. (2013) ‘Undergraduate numbers cap ‘to be abolished’ – Osborne’ [online],
Available from: http://www.timeshighereducation.co.uk/news/undergraduate-numbers-cap-to-be-abolished-osborne/2009667.article
[Accessed: 05.02.2015].
Moro, S., Laureano, R. and Cortez, P. (2011) Using data mining for bank direct marketing:
An application of the crisp-dm methodology. In P. Novais et al. (Eds.), Proceedings of
the European Simulation and Modelling Conference, pp. 117-121.
Naidoo, V. and Wu, T. (2011) Marketing strategy implementation in higher education: A
mixed approach for model development and testing. Journal of Marketing
Management, 27 (11-12), pp. 1117-1141.
Naude, P. and Ivy, J. (1999) The marketing strategies of universities in the United
Kingdom. International Journal of Educational Management, 13 (3), pp. 126-136.
Nedbalová, E., Greenacre, L. and Schulz, J. (2014) UK higher education viewed through the
marketization and marketing lenses. Journal of Marketing for Higher Education, 24
(2), pp. 178-195.
Negash, S. (2004) Business intelligence. The Communications of the Association for
Information Systems, 13 (1), pp. 177-195.
Ngai, E. W., Xiu, L. and Chau, D. C. (2009) Application of data mining techniques in
customer relationship management: A literature review and classification. Expert
systems with applications, 36 (2), pp. 2592-2602.
Nicholls, J., Harris, J., Morgan, E., Clarke, K. and Sims, D. (1995) Marketing higher
education: the MBA experience. International Journal of Educational Management, 9
(2), pp. 31-38.
Nichols, W. (2013) ‘Advertising Analytics 2.0’ [online], Available from:
https://hbr.org/2013/03/advertising-analytics-20/ar/1 [Accessed: 18.02.2015].
Nucleus Research (2014) ‘Analytics pays back $13.01 for every dollar spent’ [online],
Available from: https://nucleusresearch.com/research/single/analytics-pays-back-1301-for-every-dollar-spent/ [Accessed: 22.10.2014].
Payne, A. (2006) Handbook of CRM: achieving excellence in customer management. London:
Butterworth-Heinemann.
Peacock, L. (2012) ‘Graduates with industry placement more likely to get jobs’ [online],
Available from: http://www.telegraph.co.uk/finance/jobs/9555036/Graduates-with-industry-placement-more-likely-to-get-jobs.html
[Accessed: 04.04.2015].
Petrounias, I. (1997) A conceptual development framework for temporal information systems.
In Embley, D. W. and Goldstein, R. C., Conceptual Modeling - ER '97, 16th
International Conference on Conceptual Modeling, Los Angeles, California, USA,
November 3-5, 1997, Proceedings, Springer, pp. 43-56.
Petty, R. E., Wheeler, S. C. and Bizer, G. Y. (2000) Attitude functions and persuasion: An
elaboration likelihood approach to matched versus mismatched messages. In G. R.
Maio and J. M. Olson (Eds.), Why we evaluate: Functions of attitudes, pp. 133–162,
Mahwah, NJ: Erlbaum.
Qiu, T. (2008) Scanning for competitive intelligence: a managerial perspective. European
Journal of Marketing, 42 (7/8), pp. 814-835.
Rajak, A. and Gupta, M. K. (2008) Association rule mining-applications in various areas. In
Proceedings of International Conference on Data Management, Ghaziabad, India, pp.
3-7.
Roberts, M. (2003) Internet Marketing: Integrating Online and Offline Strategies, Boston,
MA: McGraw-Hill.
Rouse, M. (2015) ‘360-degree customer view’ [online], Available from:
http://searchcrm.techtarget.com/definition/360-degree-customer-view
[Accessed: 24.04.2015].
Schuller, D. and Rasticova, M. (2011) ‘Marketing Communications Mix of Universities Communication With Students in an Increasing Competitive University Environment’
[online], Available from: http://www.cjournal.cz/files/67.pdf [Accessed: 10.11.2014].
Shaw, M. J., Subramaniam, C., Tan, G. W. and Welge, M. E. (2001) Knowledge management
and data mining for marketing. Decision support systems, 31 (1), pp. 127-137.
Smith, A. D. (2005) Exploring online dating and customer relationship management. Online
Information Review, 29 (1), pp. 18-33.
Stackowiak, R., Rayman, J. and Greenwald, R. (2007) Oracle data warehousing & business
intelligence Solutions. Indianapolis: John Wiley & Sons.
Sumathy, K. L. and Chidambaram, M. (2013) Text Mining: Concepts, Applications, Tools
and Issues–An Overview. International Journal of Computer Applications, 80 (4), pp.
29-32.
Sun, S. (2009) An analysis on the conditions and methods of market segmentation.
International Journal of Business and Management, 4 (2), pp. 63-70.
Svensson, G. and Wood, G. (2007) Are university students really customers? When illusion
may lead to delusion for all!. International Journal of Educational Management, 21
(1), pp. 17-28.
Tam, K. Y. and Ho, S. Y. (2005) Web personalization as a persuasion strategy: An
elaboration likelihood model perspective. Information Systems Research, 16 (3), pp.
271-291.
The University of Manchester (2011) ‘Manchester 2020 – The Strategic Plan for University of
Manchester’ [online], Available from:
http://documents.manchester.ac.uk/display.aspx?DocID=11953 [Accessed: 29.10.2014].
UCAS (2015a) ‘UCAS terms explained’ [online], Available from:
https://www.ucas.com/corporate/about-us/who-we-are/ucas-terms-explained
[Accessed: 11.03.2015].
UCAS (2015b) ‘UCAS Search tool’ [online], Available from:
http://search.ucas.com/search/providers?CountryCode=&RegionCode=&Lat=&Lng=&Feather=&Vac=1&AvailableIn=2015&Query=Information+Technology+Management+for+Business+with+Industrial+Experience&ProviderQuery=&AcpId=&Location=&SubjectCode=
[Accessed: 26.03.2015].
UCAS (2015c) ‘UCAS Search tool’ [online], Available from:
http://search.ucas.com/search/providers?CountryCode=&RegionCode=&Lat=&Lng=&Feather=&Vac=1&AvailableIn=2015&Query=accounting&ProviderQuery=&AcpId=&Location=&SubjectCode=
[Accessed: 26.03.2015].
UCAS (2015d) ‘Who we are’ [online], Available from:
https://www.ucas.com/corporate/about-us/who-we-are [Accessed: 04.04.2015].
University College for Interdisciplinary Learning (2015) ‘Manchester Leadership Programme:
Leadership in Action’ [online], Available from:
http://www.college.manchester.ac.uk/courses/?year=2015&semester=1&course=106
[Accessed: 24.04.2015].
University of Birmingham (2013) ‘University launches bold new initiative to attract brightest
students’ [online], Available from:
http://www.birmingham.ac.uk/news/latest/2013/03/8-Mar-University-launches-boldnew-initiative-to-attract-brightest-students.aspx [Accessed: 04.04.2015].
Vakratsas, D. and Ambler, T. (1999) How advertising works: what do we really know?. The
Journal of Marketing, pp. 26-43.
Vesanen, J. (2007) What is personalization? A conceptual framework. European Journal of
Marketing, 41 (5/6), pp. 409-418.
Weinberger, D. (2010) ‘The Problem with the Data-Information-Knowledge-Wisdom
Hierarchy’ [online], Available from: https://hbr.org/2010/02/data-is-to-info-as-info-isnot/ [Accessed: 23.10.2014].
Wolin, L. D. (2003) Gender issues in advertising—An oversight synthesis of research: 1970–
2002. Journal of advertising research, 43 (1), pp. 111-129.
Zeng, L., Xu, L., Shi, Z., Wang, M. and Wu, W. (2006) Techniques, Process, and Enterprise
Solutions of Business Intelligence. 2006 IEEE Conference on Systems, Man, and
Cybernetics, pp. 4722-4726.