PIZZA TO POLICY: COMPARING PRE AND POST LAUNCH TWITTER DATA OF A PRODUCT

1. PROBLEM STATEMENT/POLICY RELEVANCE

While many businesses use social media for outbound marketing and communication, leveraging social media for product innovation is a new concept for most. Recognizing the actual needs of customers is a very difficult process. User conversations available on social media can be useful for new product development and subsequent market success. Insights drawn from user generated content through various analyses give businesses an opportunity to understand the customer experience. This leads to new products that emerge from real customer experiences. On social media, customers share their experience based on their own judgement and preferences. Judgement is determined by customers’ emotions. It is therefore essential to consider customers’ emotions when developing a product, because emotions directly affect purchasing decisions. Social media platforms allow customers to share their emotions and feelings freely. Hence user generated data can play a significant role in developing new products that incorporate the decision making process of customers. These processes will be useful in reducing the failure rate of new products and will be more beneficial for end users. The process adopted by businesses in new product design can be replicated by governments in the policy making arena.

2. PRINCIPAL RESEARCH QUESTION

How can social media data help businesses to evaluate instant market reactions by monitoring users' reactions before and after the launch of a new product?

2.1 Sub Research Questions:
1. How does the usage of social media provide valuable insights to businesses in developing new products?
2. How can the patterns and trends emerging in Twitter data for a new product be identified?
3. How can network analysis help businesses in identifying lead users and their influence?

3. RESEARCH

3.1 Literature Review

Today social media give businesses a space in which to engage better with their customers and to monitor customers’ activities (He et al., 2014). With an effective strategy, social media has become an easy way to extend customer reach. It is not limited to Facebook and Twitter. There are many other social media platforms for various purposes, such as LinkedIn (mainly for professionals), blogs, Second Life (gaming), and Flickr and YouTube (content sharing) (Tuten & Solomon, 2013). Apart from customer engagement, social media also opens up other areas for businesses, such as recruitment, web-based training, plan updates, promotion of new offers, and customer involvement in product development. Businesses need to choose a social media platform carefully, based on its characteristics. With significant popularity, Twitter reached 18% of internet users and generates more than 500M tweets per day (Forbes, 2013). This vast amount of user generated data gives businesses a valuable opportunity in the various business functions where customer involvement is required (Kaplan & Haenlein, 2010; Verhoef et al., 2013).

Recognizing the actual needs of customers is a very difficult process. Social media helps to overcome the difficulty of capturing users’ voice, in the form of user-generated data. More customers are willing to share their experiences with others via social media (He et al., 2014; Sam & Cai, 2015). This user generated information is considered unbiased because the space allows users to express their real feelings and opinions. Further, social media offers a place for discussing various topics in various communities (Papadopoulos et al., 2012). Some extend this to knowledge sharing engagement (Du Plessis, 2007). Insights drawn from user generated content through various analyses give businesses an opportunity to understand the customer experience (Moe & Schweidel, 2012; Nambisan, 2013).
This leads to the development of policy for introducing new products based on real customer experience. On social media, customers share their experience based on their own judgement and preferences. Judgement is determined by customers’ emotions. It is therefore essential to consider customers’ emotions when developing a product, because emotions directly affect purchasing decisions (Noble & Kumar, 2008). Social media platforms allow customers to share their emotions and feelings freely. Hence user generated data can play a significant role in developing product design policies that take the decision making process of customers into account. These policies will be useful in reducing the failure rate of new products and will be more beneficial for end users.

In core marketing and promotional activities, monitoring users’ activities during the launch of a new product can help businesses to capture current reactions. It helps in capturing ideas for marketing strategies and in modifying launch policies (Banerjee et al., 2012). Further, it also helps to gain new insights for new product design and improvements (Marcus et al., 2011). Such work requires deep analysis, along with data that reflect the collective judgment about the product. Sentiment, for example, only measures a score for users’ feelings, not their preferences. Our contribution is therefore a deeper understanding of how to capture insights from data through an analytical approach. This leads to policy development for the early market success of a new product. For this, we present a methodology to analyze tweets around the launch of a new product that has been previously announced.

The use of social media in product development is very limited in the literature. Researchers have used different platforms for incorporating user conversation into product design, such as Facebook comments (Carr et al., 2015).
The use of tweets in developing new products is considered a particular case because the launch event is treated as the triggering event (Lipizzi et al., 2015). In the literature, user generated data are used for many other purposes that involve monitoring and capturing users’ reactions, such as TV audience reaction (Harrington et al., 2012) and stock market prediction (Evangelopoulos et al., 2012). These studies mine and analyse data, i.e. feedback and reviews, to understand user behaviour and preferences. The analytical approaches in all these studies are more or less similar, such as text mining and sentiment analysis based on semantic methods (Brown, 2012). Semantic analysis includes the identification of insights in a network. Moreover, these studies rest on a common assumption, i.e. one-way communication by users.

Various text mining and sentiment methods are used to extract emerging themes and the associated user reactions. Text mining techniques like word frequency involve data extraction, indexing, and classification to identify important terms based on their relevance. The focus of these approaches is on recognizing and evaluating the emotions and preferences shared by users about a specific product. Sentiment classification (i.e., positive, negative, and neutral) can be applied at different semantic levels based on polarity classification. In the case of Twitter, it can be applied to each single tweet or to a collection of tweets (Thelwall et al., 2011).

The available information visualization tools for tweets are not able to show the structure of users’ preferences and judgments. Tools for monitoring user activities on social media platforms present nice data visualization solutions but fail to dig deeper into user conversation. Specifically, most tools are not aimed at identifying conversational patterns in user generated data.
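
To make the distinction between tweet-level and collection-level sentiment classification concrete, the following is a minimal sketch in Python. It is not taken from any of the cited studies: the two word sets are a hypothetical toy lexicon invented for illustration, and the function names are assumptions. A real study would rely on a published sentiment lexicon or a trained classifier.

```python
import re

# Hypothetical toy lexicon, invented for illustration only.
POSITIVE = {"love", "great", "yummy", "happy", "easy"}
NEGATIVE = {"late", "cold", "bad", "wrong", "hate"}


def tweet_polarity(text):
    """Classify a single tweet as 'positive', 'negative', or 'neutral'."""
    words = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def collection_polarity(tweets):
    """Aggregate tweet-level labels into counts for a collection of tweets."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for tweet in tweets:
        counts[tweet_polarity(tweet)] += 1
    return counts
```

Applied to a single tweet, `tweet_polarity` returns one label; applied to a whole pre launch or post launch dataset, `collection_polarity` returns the label counts that a polarity chart would plot.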
For businesses, recognizing patterns in user conversation can make it easier to evaluate the customer experience of a new product. In addition, it builds a better understanding of what customers say and how they talk to others about a new product. In the next section we provide a theoretical framework for analyzing Twitter data based on conversational analysis.

3.2 Analytical Framework

An analytical framework (Figure 1) for data mining techniques in product development is proposed, based on a review of the literature on data mining techniques. Essentially, the literature on data mining in product development identifies co-creation dimensions and the data mining techniques applicable to them. Businesses are now familiar with the significance of user generated content from social network sites for modifying their products. An organization needs the capability to access all useful information about its products in the form of comments, opinions and reviews. This capability identifies what has happened and estimates what will happen in the immediate future. Many businesses, however, do not use social media applications and analysis because of a lack of awareness. It has become necessary to use data mining and analysis techniques to gain useful insights quickly from many textual documents (Liu, 2012). Some main applications of data mining include opinion analysis, clustering, opinion extraction (opinion summarization), and pattern analysis (Adedoyin-Olowe et al., 2013; Ngai et al., 2011).

Figure 1. Classification framework for data mining techniques in product design

3.3 Method & Data Sources

We selected the launch of a new product by the well-known fast food company ‘Dominos’ as the triggering event. The product was selected based on the popularity of the company on Twitter and the availability of data. Data play a critical role in the proposed approach, especially when a product launch is the triggering event.
In this paper, the case company was planning to introduce a new pizza product in mid July 2015. We therefore decided to collect Twitter data from February 2015 to October 2015. Tweets can be extracted from Twitter in two ways, using either ‘stream’ or ‘search’. The ‘stream’ option provides real-time traffic, while the ‘search’ option allows tweets to be downloaded, with geographical information, only for the past few days because of API limits. We opted for the ‘search’ approach because it covered a longer time span and gave a decent number of tweets over time; we ran it in R and saved the data in an Excel sheet. The output contains various metadata, including the tweet, date, user name with profile information, and geographical details. Not all of these details were required for the analysis in this study, so we cleaned the data, again in R, with a cleaning script. Further data preparation, such as tokenization, filtering and stemming, was done using natural language toolkit libraries in R. The methodology can be broadly divided into three major steps (Figure 2):
(1) Data pre-processing
(2) Data analysis
(3) Data visualization

Figure 2. Step-wise structure of analysis

3.4 Results, Analysis & Discussion

3.4.1 Basic Information

We carried out content analysis using text and network analysis. First we created word clouds for both datasets (pre launch and post launch) (Figure 3). In the pre launch word cloud, the most prominent words are order, easi, place, effect, tweet, job, hire, delivery, gift, card, sugar, sweet etc. These words give an idea of the major topics emerging in the pre launch period. Users are talking about pizza order time, ease of ordering, specific places, welcome notes from the company, gift cards, job hiring at Dominos, and sweet and sugar in relation to pizza. These are the basic themes for the Dominos Italian pizza. In the post launch word cloud, the most discussed words are similar, with a few new ones: tweet, order, job, welcome, free, delivery, custom, service, want, place, edit, return, get, hire, like, new, look etc.
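
The pre-processing behind such word clouds (tokenization, filtering and stemming, followed by term counting) can be illustrated with a short sketch. The authors worked in R; the Python below is only an assumed equivalent using the standard library. The stop-word list, the crude suffix-stripping stemmer (a rough stand-in for a real stemmer such as Porter's) and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "for", "and", "of", "in", "my", "so", "was"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweet):
    """Tokenize, filter, and stem one tweet."""
    text = re.sub(r"https?://\S+", " ", tweet.lower())  # drop URLs
    tokens = re.findall(r"[a-z]+", text)                # keep alphabetic tokens
    kept = [t for t in tokens if t not in STOP_WORDS and len(t) > 2]
    return [crude_stem(t) for t in kept]


def term_frequencies(tweets, search_terms=()):
    """Count stemmed terms over all tweets, skipping the (stemmed) search terms."""
    counts = Counter()
    for tweet in tweets:
        counts.update(t for t in preprocess(tweet) if t not in search_terms)
    return counts
```

The resulting counts are what a word cloud scales its words by. Passing the search keywords as `search_terms` keeps terms that occur in every tweet from dominating the picture.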
So some major themes are free delivery, start tweet, get free service, job hire, look rep, and return delivery. In summary, users are concerned about order delivery and services, the appearance of the new pizza, its taste, engagement through tweeting, and job opportunities.

Figure 3. Word clouds (panels: post-launch, pre-launch)

Figure 4. Frequency distributions for pre launch data

After the word clouds, we counted word frequency, that is, how many times each word was used in each of the two time spans. In the pre launch period, apart from Dominos, Italian, exot and pizza, the word order was used more than 200 times and the word tweet more than 150 times (Figure 4). Other words, such as now, new, deliver, welcome, gift, card and get, were used around 100 times. The frequency of words shows how strongly they are associated with the product keyword, i.e. Dominos Italian pizza, in the pre launch period.

Figure 5. Frequency distributions for post launch data

For the post launch period, we removed the words Italian and exotic from the dataset because these were the basic search terms. They were present in every tweet, which produced one very long frequency bar in the chart. Apart from word counts similar to those in the previous figure, here start, want, like, free, get, custom, find, job, just and look are used around 100 times (Figure 5). From the word cloud and word frequency analyses, we can identify a partial change in the topics discussed about the product. The pre launch data show what users expect from the product, and the post launch conversation shows whether those expectations were fulfilled. The tweets also contain some promotional activity, such as job hiring and gift cards.

In addition to the word frequency analysis, we built a hierarchy of words based on the importance of each word. We set five levels in the hierarchy (Figure 6). The first level shows the domain of the product; in our analysis this level contains one element, i.e. pizza.
The hierarchy also contains relationship edges, and the thickness of an edge indicates how strong the connection between two words is. Pizza is connected to Italian by a very thick edge. It is also connected to the third-level words free and hut, but not as strongly as to Italian. On the second level, tweet is linked with order and Dominos, and order is linked with can. This means that a Dominos order can be made by tweet. In this hierarchy, then, we can identify the importance level of each word and its links with words on other levels. This helps the company to focus on the highly connected words and on the themes emerging from their connections.

Figure 6. Word hierarchy

In addition, we performed domain clustering, which shows the most prominent domains in the datasets (Figure 7). In the pre launch dataset, there were two clusters, with red and black elements, meaning that the discussion covered two basic domains. In our dataset, these were Dominos Italian pizza and exotic Italian pizza. The two clusters overlap because the discussion in both domains is similar. In the post launch dataset, on the other hand, only one domain appeared, indicating that the whole discussion was about the exotic Italian pizza.

Figure 7. Domain clustering (panels: pre-launch, post-launch)

In summary, this clustering shows that in the pre launch data users discussed the earlier or existing similar product. They discussed the drawbacks of the existing product and expected more from the upcoming one. Based on this comparison of expectations, businesses can gain better insights into how to overcome the weak areas of the existing product.

3.4.2 Ideation

After identifying the most frequent words, we checked how words were connected to each other based on degree counts. Figure 8 shows the complex network of words in both datasets. To make the graph simpler and easier to understand, we added a directional component to the network (Figure 9).

Figure 8. Word degree network (panels: pre-launch, post-launch)

Figure 9. Word directional degree network (panels: pre-launch, post-launch)

This network now shows all the high-degree words in both cases. In the pre launch dataset, the high-degree words are Dominos, welcome, easyord and edit. All other words were connected to these words. In the post launch dataset, the high-degree words are Dominos, easi and edit. This shows that the high-degree words were largely the same in both cases. It means these words are very common, with high frequency, in the conversation about the Dominos Italian exotic pizza. To identify themes from words, we also built a word network based on their co-occurrence (Figure 10 and Figure 11).

Figure 10. Word network with clusters for pre launch data

In the figure above, we divided the words into clusters and drew edges according to their co-occurrence. There are eight major clusters, each with a central word. In the first cluster, for example, Dominos was the central word. The connections show themes like Dominos love, Dominos up, and now apply right job. There is also a comparison between Dominos pizza fare and train fare. In another cluster, Italian is the central word, along with themes like user liked YouTube video and large handmade pan. This means that users like the YouTube videos of the exotic Italian pizza. They are also happy with the handmade pan used for this pizza. The central word Italian is in turn connected to other central words like exotic and pizza. In the ‘pizza’ cluster, users were discussing gift cards and the KFC pizza. Another discussion is related to the hiring of drivers for delivery. In the next cluster, with ‘tweet’ as the central word, the discussion was about easy ordering by tweet. Dominos started to take orders by tweet with the use of emoji. The ‘experience’ cluster showed that users were expecting a yummy taste from the pizza. There are also smaller clusters, such as customer rep service and thinks real better Dominos cover the Taylor Swift’s album.
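
The co-occurrence edges and degree counts that underlie such word networks can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' R code: two words are assumed to be linked whenever they appear in the same tweet, the edge weight is assumed to be the number of tweets in which they co-occur, and the function names and the `min_weight` threshold are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(token_lists):
    """Count how often each unordered word pair appears in the same tweet."""
    edges = Counter()
    for tokens in token_lists:
        # sorted(set(...)) gives each pair once per tweet, in a canonical order.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def word_degrees(edges, min_weight=1):
    """Degree = number of distinct neighbours linked by edges of weight >= min_weight."""
    neighbours = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {word: len(linked) for word, linked in neighbours.items()}
```

Words with the highest degree play the role of the central words in the clusters, and raising `min_weight` prunes weak edges, which is one simple way to make a dense network easier to read.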
So this clustering helps businesses to make policies according to the customer-centric themes emerging from the conversation.

Figure 11. Word network with clusters for post launch data

The post launch word network contained the same central words, such as Dominos, pizza and order, along with a few new ones: exotic and delivery. In the ‘Dominos’ cluster, love to eat and order tracking are emerging. Another cluster, ‘pizza’, showed the themes of trip voucher win, comparison with Pizza Hut as a competitor, and looking best Italian pizza party. These two central words were connected to another one, ‘exotic’. This cluster contains themes like order delivery, tracking, chill, garlic creamery and engagement experience. It leads to the ordering and payment themes, with easy ordering through the PayPal payment gateway. These clusters are connected to each other, showing the connections among themes. In summary, these word networks show the themes associated with common origin words. We can see how themes arising from the same origin can change after the event, that is, the launch of the new product.

After identifying the themes, we need to understand users’ sentiment towards all themes in both time periods. For that, we did sentiment analysis based on polarities and emotions (Figure 12 and Figure 13). In the polarity analysis, we saw fewer negative sentiments pre launch than post launch. This indicates that Dominos needs to focus on these negative sentiments.

Figure 12. Sentiment based on polarity (panels: pre-launch, post-launch)

Figure 13. Sentiment based on emotions (panels: pre-launch, post-launch)

Then we categorized sentiments into emotions: anger, anticipation, disgust, fear, joy, negative, positive, sadness, surprise and trust. In both datasets, only anticipation and trust show higher sentiment. This shows that users were expecting less in the pre launch period than in the post launch period.
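
A lexicon-based emotion categorization of this kind can be sketched as follows. The paper does not state which lexicon or R package was used, so the word-to-emotion mapping below is a hypothetical toy lexicon invented for illustration, and the function name is an assumption. The sketch reports each emotion's share of all emotion hits, so that pre launch and post launch datasets of different sizes can be compared.

```python
import re
from collections import Counter

# Hypothetical word-to-emotion lexicon, invented for illustration only.
EMOTION_LEXICON = {
    "excited": {"anticipation", "joy"},
    "waiting": {"anticipation"},
    "reliable": {"trust"},
    "yummy": {"joy"},
    "late": {"anger", "sadness"},
    "wow": {"surprise"},
}


def emotion_shares(tweets):
    """Return each emotion's share of all emotion hits in a set of tweets."""
    hits = Counter()
    for tweet in tweets:
        for word in re.findall(r"[a-z]+", tweet.lower()):
            # A word may carry several emotions; each one counts as a hit.
            hits.update(EMOTION_LEXICON.get(word, ()))
    total = sum(hits.values())
    if total == 0:
        return {}
    return {emotion: count / total for emotion, count in hits.items()}
```

Running `emotion_shares` separately on the pre launch and post launch tweets gives two profiles that can be placed side by side; grouping the tweets by month instead would give the points of a time series.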
It indicates that this pizza was not able to fulfil their expectations. They are now looking for another product, with higher expectations. But the sentiments from these two time periods alone are not very convincing. That is why we decided to do a time series analysis of these sentiments over the year, as shown in Figure 14.

Figure 14. Time series sentiment analysis

We can see that the average sentiment scores are highest for anticipation, followed by trust and joy. Users were expecting more throughout the whole time period. At launch time, i.e. the end of July, anticipation is high and trust is low. Surprise is also slightly high because users are looking for a better product. This time series analysis provides a clear view of the change in sentiment but not of the reason for the change. For that, we need to look at the themes within each sentiment. We therefore created word clouds based on sentiment categorization (Figure 15).

In the pre launch data, the joy category shows good delivery, gift cards and the low cost of the previous product. These topics were making users joyful. In the sadness category, users complain about not getting the free garlic crust. Owing to the limitations of the algorithm, no themes emerge from the fear and anger categories. When words appear without any connection, it is difficult to develop a theme. The handmade pan for this specific Italian pizza was a real surprise for users. In the post launch data, the garlic dip taste, with happy great liking, is related to the joy category. The fear category contains the words smoke, drinking and chill, which indicate a scared and horrible environment. Again, the handmade pan is amazing for users in the surprise category. Easy ordering by tweet is another theme with positive sentiment from users for this particular product.

Figure 15. Word categorization based on emotions (panels: pre-launch, post-launch)

3.4.3 Community Detection

After this sentiment analysis, we need network analysis to see the major communities in the Dominos Italian exotic pizza dataset.

Figure 16. Communities identification in network (panels: pre-launch, post-launch)

As shown in the figure above, many communities exist in the whole network. These communities developed from users’ discussion about the product. A few communities merged into a larger community after the product launch. A few communities grew because more individual users joined them. There is also an increase in the connections from one community to another. This shows that before the launch of the product users have various concerns to discuss, but after experiencing the product they are limited to a few major concerns. Post launch, they are more willing to share their experience with many users. To understand this, we need to go deeper, as shown in Figure 17.

Figure 17. Community development

This is one major community from the whole network. Within this community there are many topics of discussion, shown in various colours using clustering. There are more clusters in the pre launch community than in the post launch community. Each cluster in the post launch community is bigger than in the pre launch community. Community detection over time is critical for identifying trends in product discussion. It also reveals the influential users and their individual networks within the whole network. These specific users and communities play an essential role in developing new ideas for product exploration. Community discussion helps to validate the primary impression and to form concrete insights.

CONCLUSIONS AND RECOMMENDATIONS

The approach proposed in this paper can provide valuable insights from customers’ experience. Data availability on Twitter is quite good, and customers express their ideas there without being conditioned in any way. This research proposes a longitudinal approach that compares user generated content before and after the launch of a new product. The reason for analysing both pre and post launch data is to find comparative differences in customers’ reactions.
In this way it fills the gap between the real requirements of customers and actual product features. It aims to develop a mechanism that can mine and analyze users' emotions regarding specific features. A closer analysis of the new product should show whether businesses have incorporated the ideas generated by customers. Data mining techniques such as clustering, topic modelling, sentiment analysis, and community detection are used to identify the themes (e.g. features) of the product, because the patterns around the product and the user segmentation change continuously. Moreover, this approach to collecting data has a relatively low cost. Customers’ preferences can be captured in a systematic way. The effect of promotional activity or campaigning can be measured using the same approach, by visualizing peak trends and sentiments.

The results show noticeable differences in the sharing and discussing behaviour of customers between the pre and post launch time periods. We consider that these differences can be used to support early adoption and market success. The approach covers the diverse and dynamic preferences of customers. Identifying the early adopters in various communities can help businesses to understand their target audience and its collective judgement. Conversation analysis based on theoretical perspectives strengthens the understanding of noisy Twitter data. It also enhances the capability to extract useful information from these data. Using classifiers to analyze customer conversation would help predict real reactions through effective visualization. In addition, businesses can shape and refine the policies involved in the product development process based on customers’ conversation. Policies for customer segmentation based on communities can also be developed using these results.

The above case is a social media analysis of Twitter data for a new product, pizza.
It compared the pre and post launch data of the product and offered insights into the reception of the product from the content created by customers on the social media platform. Similar experiments can be conducted for public policy as well. In transport systems, this approach may provide valuable insights for developing transport policy using travellers’ views and experiences. It has the potential to deliver and improve transport policy goals. Furthermore, in disaster situations, the adopted methodology offers the opportunity to harvest information for situation awareness and for taking action. It can consistently reveal the collective behaviour of users, based on actionable information and warnings, when the time to respond is very limited. In the same way, policy makers can use the output to develop policies for emergencies. Moreover, politicians can formulate their election policies using a similar approach, drawing on social media posts generated by users.

REFERENCES

Adedoyin-Olowe, M., Gaber, M. M., & Stahl, F. (2013). A survey of data mining techniques for social network analysis. Retrieved October 7, 2014, from http://jdmdh.episciences.org/18/pdf
Banerjee, N., Chakraborty, D., Joshi, A., Mittal, S., Rai, A., & Ravindran, B. (2012, May). Towards Analyzing Micro-Blogs for Detection and Classification of Real-Time Intentions. In ICWSM.
Brown, E. D. (2012). Will Twitter make you a better investor? A look at sentiment, user reputation and their effect on the stock market. Proc. of SAIS.
Carr, J., Decreton, L., Qin, W., Rojas, B., Rossochacki, T., & wen Yang, Y. (2015). Social media in product development. Food Quality and Preference, 40, 354-364.
Du Plessis, M. (2007). The role of knowledge management in innovation. Journal of Knowledge Management, 11(4), 20-29.
Evangelopoulos, N., Magro, M. J., & Sidorova, A. (2012). The dual micro/macro informing role of social network sites: can Twitter macro messages help predict stock prices? Informing Science: the International Journal of an Emerging Transdiscipline, 15(1), 247-268.
Forbes. (2013). Can Twitter save TV? (and can TV save Twitter?). Retrieved March 13, 2016, from http://www.forbes.com/sites/jeffbercovici/2013/10/07/can-twitter-save-tv-and-can-tv-savetwitter/#1fbf89f86419
He, W., & Yan, G. (2015). Mining blogs and forums to understand the use of social media in customer cocreation. The Computer Journal, 58(9), 1909-1920.
Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53(1), 59-68.
Lipizzi, C., Iandoli, L., & Marquez, J. E. R. (2015). Extracting and evaluating conversational patterns in social media: A socio-semantic analysis of customers’ reactions to the launch of new products using Twitter streams. International Journal of Information Management, 35(4), 490-503.
Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1-167.
Marcus, A., Bernstein, M. S., Badar, O., Karger, D. R., Madden, S., & Miller, R. C. (2011, May). Twitinfo: aggregating and visualizing microblogs for event exploration. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 227-236). ACM.
Moe, W. W., & Schweidel, D. A. (2012). Online product opinions: Incidence, evaluation, and evolution. Marketing Science, 31(3), 372-386.
Nambisan, S. (2013). Information technology and product/service innovation: A brief assessment and some suggestions for future research. Journal of the Association for Information Systems, 14(4), 215.
Ngai, E. W. T., Hu, Y., Wong, Y. H., Chen, Y., & Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decision Support Systems, 50(3), 559-569.
Noble, C. H., & Kumar, M. (2008). Using product design strategically to create deeper consumer connections. Business Horizons, 51(5), 441-450.
Papadopoulos, S., Kompatsiaris, Y., Vakali, A., & Spyridonos, P. (2012). Community detection in social media. Data Mining and Knowledge Discovery, 24(3), 515-554.
Sam, Y., & Cai, Y. (2015). A Study on the Use of Social Media to Understand Consumer Preference: The Case of Starbucks. International Journal of Management and Business Research, 5(3), 207-214.
Thelwall, M., Buckley, K., & Paltoglou, G. (2011). Sentiment in Twitter events. Journal of the American Society for Information Science and Technology, 62(2), 406-418.
Tuten, T. L., & Solomon, M. R. (2013). Social Media Marketing. Boston: Pearson.
Verhoef, P. C., Beckers, S. F., & van Doorn, J. (2013). Understand the perils of co-creation. Harvard Business Review, 91(9), 28.