Purchase Prediction from Social Media. Methodology, Limitations & Potentials

Seminar Paper, 2014

18 Pages, Grade: 1.3



1 Introduction to Social Network Recommendation

2 The History of Purchase Prediction
2.1 Early Work
2.2 Focusing on the Intentions of Online Shoppers
2.3 A Paradigm Shift - Social Networks in Online Shopping

3 Predicting Purchase Behavior from Social Media
3.1 Dataset
3.2 The Challenge of Data Sparsity
3.3 Users' Purchasing and Liking Focus
3.4 Demographic Differences
3.5 Correlation between Social Media Interests and Purchases
3.6 Predicting Purchase Behavior
3.6.1 Establishing Evaluation Metrics
3.6.2 Learning Models & Feature Families
3.7 Experimental Results

4 Assessment
4.1 Limitations of Purchase Prediction from Social Media
4.2 Potentials of Social Media Recommendation and Purchase Prediction
4.2.1 Collecting Additional Individual Data
4.2.2 Utilizing Social Network Information
4.2.3 Expanding the Scope - Recommendation vs Marketing

5 Summary


1 Introduction to Social Network Recommendation

With a predicted volume of €439.7bn in 2014 in Germany alone (according to statista.de, see footnote 1), the retail market bears large potential for generating additional revenue from marketing. As the effectiveness of classical marketing, and even of relatively new phenomena such as online ads, decreases, it becomes more and more important to find new ways to recommend products to customers. In e-commerce it is generally easier to target specific audiences, for example by selecting ad spaces on thematically fitting web pages. The fundamental difference from classical marketing approaches is the availability of data about the respective customer. Currently the most common approach is to mine frequent item sets from the purchase history of the customer population and to recommend products to customers based on what other customers bought. To obtain more specific product predictions for a particular customer, more data about that customer is needed. A natural choice is to draw on the rich pool of data generated by the customers themselves, by assessing their actions and the content they create, especially on social media websites. The data available there is much more user-specific than the general purchasing behavior of user groups and can potentially lead to very precise predictions about what the user is interested in and will buy.

This paper first gives a brief overview of the development of, and the research conducted on, social media recommendation and the behavior of online shoppers in general. Then the work of Y. Zhang and M. Pennacchiotti [8] is presented. Finally, several possibilities for subsequent research based on previous work and on the work of Zhang and Pennacchiotti are presented. Since the work presented in this paper is very foundational, some emphasis is put on the outlook in order to underline the relevance of Zhang and Pennacchiotti's work.

2 The History of Purchase Prediction

2.1 Early Work

As early as 2000, the first evaluations of recommender systems for e-commerce, or rather of recommender algorithms for e-commerce, took place, such as the work by Sarwar et al. [7]. Although it was conducted over a decade earlier, when e-commerce platforms were just beginning to rise, this work was already surprisingly similar to the work of Zhang and Pennacchiotti [8]: the authors evaluated user ratings of movies in order to predict other movies the users might like, and compared these ratings and predictions with purchase records of a large e-commerce company. Further research was done by Burke in 2002 [2] on the possibilities of combining knowledge-based recommendation with collaborative filtering methods; it concluded that combining several recommendation methods would indeed be advantageous. The work of Zhang and Pennacchiotti [8], as we will see later, used a similar approach by combining collaborative filtering methods with content-based methods.

2.2 Focusing on the Intentions of Online Shoppers

After the focus on recommendations based on the similarity between users, other methods began to emerge that tried to classify users according to typological attributes and focused on the motives and goals users had when shopping online, rather than solely considering the products they eventually bought. An important paper was the work Kau et al. presented in 2003 [6], which used a catalog of 24 questions to reduce users to six main factors that determine their online shopping behavior. There were similar works, such as that of Chiu et al. in 2005 [3], in which the authors assess the differences in intentions between genders when shopping online. They found that factors such as the perceived ease of purchasing online, the perceived usefulness and the personal awareness of security do indeed have a direct influence. As a concrete example, they found that women have a higher sensitivity to risk when buying online.

2.3 A Paradigm Shift - Social Networks in Online Shopping

While e-commerce already made large amounts of information available by tracking the purchase history of every single user, the rise of social networks introduced a whole new set of possibilities for researchers. The main difference is that the data available from social networks is created by the respective users themselves, rather than being a mere response to content they are exposed to. Therefore the causality that leads a user to buy certain products can be examined much more closely, for example by evaluating the user's opinion. In 2010, Bhatt et al. [1] showed that social network effects on product adoption are very significant. They showed that these effects are clearly induced by the social network through peer pressure, and not by single individuals.

Further, Guo et al. showed in 2011 [4] that information passing in social networks can enhance purchasing activity. They also showed that consumers' choice of transaction partners is in fact predictable. All these effects indicate the huge potential of social networks in terms of predictive power. The work of Zhang and Pennacchiotti [8] was dedicated to establishing a baseline for further research by finding basic connections between users' activities on social network websites (in this case Facebook) and their purchasing behavior on e-commerce platforms (in this case eBay).

3 Predicting Purchase Behavior from Social Media

The history of product recommendation and purchase prediction indicates that the information users provide about their preferences offers a rich source of data that potentially allows for very precise predictions of their purchasing behavior. This information comes from interacting in social networks and creating content, as well as from specific actions that indicate interest in particular topics, such as liking things on Facebook. The work of Zhang and Pennacchiotti [8] is one of the first attempts, if not the first, to examine the correlations between these Facebook likes and the purchasing behavior of users.

illustration not visible in this excerpt

Figure 1: Example of user data (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

3.1 Dataset

In order to evaluate these relations, they used a data set of 13,619 anonymized eBay users who connected to Facebook between June and August 2012. The goal was to build a cold start recommender system for users connecting to eBay from Facebook. Cold start refers to the fact that the user or customer is previously unknown and, in particular, that the retailer has no history or record about them. Since research in this field is at such an early stage, the authors limited the analysis and prediction to a category-based approach. This means that, instead of predicting which specific product a user will buy, the authors tried to determine whether it is possible to predict in which category the user will buy, based on the user's likes and posts on Facebook. The data for the research was selected to match this purpose:

- Basic demographic information (for example age and gender) from Facebook
- Users' Facebook likes and their categories
- Item names and categories of items purchased on eBay from January to August 2012

An example of a user data set is shown in figure 1.

As indicated in the data description above, the eBay and Facebook data were selected in order to determine category-level predictors. The users' likes were distributed over roughly 200 Facebook categories, while the purchases were spread over 35 eBay meta-categories.
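The three kinds of data listed above can be pictured as one record per user. The following is only a minimal sketch of such a record; all class names, field names and example values are hypothetical and are not taken from the paper or from any eBay or Facebook API.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Like:
    page_name: str   # hypothetical field: name of the liked Facebook page
    category: str    # one of the roughly 200 Facebook categories


@dataclass
class Purchase:
    item_name: str       # item name as listed on eBay
    meta_category: str   # one of the 35 eBay meta-categories


@dataclass
class UserRecord:
    user_id: str                   # anonymized identifier (assumed format)
    age: Optional[int] = None      # basic demographic information
    gender: Optional[str] = None
    likes: List[Like] = field(default_factory=list)
    purchases: List[Purchase] = field(default_factory=list)

    def like_counts(self) -> Counter:
        """Number of likes per Facebook category."""
        return Counter(like.category for like in self.likes)

    def purchase_counts(self) -> Counter:
        """Number of purchases per eBay meta-category."""
        return Counter(p.meta_category for p in self.purchases)


# Invented example values, for illustration only.
example = UserRecord(
    user_id="u-0001",
    age=29,
    gender="f",
    likes=[Like("Some Band", "Musician/Band"), Like("Some Camera Brand", "Camera/Photo")],
    purchases=[Purchase("Lens cap 52mm", "Cameras & Photo")],
)
```

Keeping only category-level counts, as the two helper methods do, mirrors the category-based approach described above: the specific page or item is retained but not needed for prediction.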

illustration not visible in this excerpt

Figure 2: Distribution of likes for users (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

illustration not visible in this excerpt

Figure 3: Distribution of purchases for users (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

illustration not visible in this excerpt

Figure 4: Distribution of likes for pages (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

illustration not visible in this excerpt

Figure 5: Number of purchases in all eBay meta-categories (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

3.2 The Challenge of Data Sparsity

The biggest challenge lay in the sparsity of the data for many users. Figure 2 shows the distribution of likes per user: most users have between 100 and 200 likes, with a median of 152 likes. These likes are distributed unevenly across pages, as shown in figure 4. Many pages have fewer than 10 likes, and very few pages have more than 100 likes. Moreover, the data is sparse for both Facebook and eBay. As shown in figure 3, almost half of all users have fewer than 10 purchases in total. Further, most purchases are made in a small number of categories, as shown in figure 5. As the authors expected, the sparsity of the Facebook data made their task inherently hard. To overcome the data sparsity problem, the authors applied collaborative filtering methods and content-based methods to construct feature families (as described in section 3.6.2) that would result in meaningful predictions.
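The sparsity figures quoted above (median likes per user, share of users with few purchases) are simple summary statistics. The sketch below shows how such numbers could be computed; the function names and the toy numbers are assumptions made for illustration and do not come from the paper's data set.

```python
from statistics import median


def sparsity_summary(likes_per_user, purchases_per_user, threshold=10):
    """Summarize how thin the per-user data is."""
    below = sum(1 for n in purchases_per_user if n < threshold)
    return {
        "median_likes": median(likes_per_user),
        "share_below_threshold": below / len(purchases_per_user),
    }


def matrix_density(observed_pairs, n_users, n_pages):
    """Fraction of all possible (user, page) pairs that actually occur."""
    return len(set(observed_pairs)) / (n_users * n_pages)


# Invented toy numbers, not taken from the paper's data set.
likes_per_user = [120, 152, 180, 90, 400]
purchases_per_user = [3, 8, 12, 40, 1, 9]
summary = sparsity_summary(likes_per_user, purchases_per_user)
density = matrix_density([(0, 0), (0, 2), (1, 1)], n_users=3, n_pages=4)
```

A low density of the user-page matrix is what makes direct page-level matching unreliable and motivates aggregating to categories or borrowing information from similar users.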

3.3 Users' Purchasing and Liking Focus

In order to find out whether there is an underlying pattern or whether users just buy randomly across categories, the authors first established how focused users are when buying online. To do this, they established the k-rank over the eBay categories. The k-rank is the a priori probability that a user u buys from category e. It is obtained by dividing the purchases of a user in category e by the user's purchases in all categories E:

illustration not visible in this excerpt

Figure 6: Distribution of purchases in eBay meta-categories (Source: Y. Zhang, M. Pennacchiotti: Predicting Purchase Behavior from Social Media [8]).

illustration not visible in this excerpt
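The formula itself is not visible in this excerpt. Based on the verbal definition of the k-rank, it can plausibly be written as follows. The symbol n_{u,e}, the number of purchases of user u in category e, is notation introduced here and may differ from the original.

```latex
P(u)_e = \frac{n_{u,e}}{\sum_{e' \in E} n_{u,e'}}
```

By construction, the values P(u)_e of one user sum to 1 over all categories in E.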

To give a more concrete example, say a user buys 2 items from one category and 8 items from another. Then P(u)_1 = 0.2 and P(u)_2 = 0.8. By applying a Kolmogorov-Smirnov (K-S) goodness-of-fit test, the authors check whether the k-rank is distributed uniformly. If that were the case, users would in fact buy randomly across categories. However, as figure 6 shows, the distribution is by no means uniform; hence the hypothesis of uniformity is rejected, and users do buy in a focused way in terms of categories. In fact, the average user is very focused when buying online: as can be derived from figure 6, the top 3 categories account for about 85% of all purchases of the average user.
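The worked example can be reproduced in a few lines. The sketch below computes the per-category probabilities for the user with 2 and 8 purchases and then the K-S statistic, that is, the largest gap between the cumulative shares and a uniform spread over all categories. It is a simplified illustration: the category names are invented, the discrete uniform reference is an assumption, and only the distance is computed, not the p-value of the test the authors ran.

```python
from collections import Counter


def category_probabilities(purchased_categories):
    """Share of a user's purchases that falls into each category."""
    counts = Counter(purchased_categories)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


def ranked_shares(purchased_categories, n_categories):
    """Shares in descending order, padded with zeros for unused categories."""
    shares = sorted(category_probabilities(purchased_categories).values(), reverse=True)
    return shares + [0.0] * (n_categories - len(shares))


def ks_distance_from_uniform(shares):
    """Largest gap between the cumulative shares and a uniform spread."""
    k = len(shares)
    cumulative = 0.0
    largest_gap = 0.0
    for rank, share in enumerate(shares, start=1):
        cumulative += share
        largest_gap = max(largest_gap, abs(cumulative - rank / k))
    return largest_gap


# The user from the example: 2 purchases in one category, 8 in another.
# "Books" and "Electronics" are placeholder names.
purchases = ["Books"] * 2 + ["Electronics"] * 8
probs = category_probabilities(purchases)            # 0.2 and 0.8
shares = ranked_shares(purchases, n_categories=35)   # 35 eBay meta-categories
top3 = sum(shares[:3])                               # share of the top 3 categories
distance = ks_distance_from_uniform(shares)
```

A distance close to 0 would indicate purchases spread evenly over all categories; a distance close to 1, as for this user, indicates strongly focused buying.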

The authors applied the same methodology to users' Facebook likes and reached the same conclusion. The level of focus is slightly lower when it comes to liking pages, but still very significant.

3.4 Demographic Differences

As described in section 2, The History of Purchase Prediction, the typology of online shoppers and especially the demographic differences have been studied extensively over time. The authors built upon previous work and analyzed whether men and women buy from different eBay meta-categories and whether there are differences between age groups. In order to get meaningful results, t-tests were conducted. The results were conclusive and promising: women buy significantly more than men in 10 categories, while men buy significantly more than women in 16 categories.
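To illustrate what such a comparison looks like for a single category, the sketch below computes a two-sample t statistic for the number of purchases per user in two groups. The excerpt does not say which variant of the t-test the authors used; Welch's unequal-variance form is an assumption made here, and the numbers are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Invented purchases per user in one hypothetical meta-category.
women = [4, 5, 6, 5, 4, 6]
men = [1, 2, 1, 3, 2, 3]
t_stat, dof = welch_t(women, men)
```

A large absolute t statistic relative to the t distribution with the computed degrees of freedom indicates a significant difference between the groups; repeating the comparison for every meta-category yields counts like the 10 and 16 categories reported above.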


Footnote 1: de.statista.com/statistik/daten/studie/70190/umfrage/umsatz-im-deutschen-einzelhandel-zeitreihe/

[1] Bhatt, Rushi, Vineet Chaoji and Rajesh Parekh. Predicting product adoption in large-scale social networks. In Proceedings of the 19th ACM international conference on Information and knowledge management, pages 1039-1048. ACM, 2010.

[2] Burke, Robin. Hybrid recommender systems: Survey and experiments. User modeling and user-adapted interaction, 12(4):331-370, 2002.

[3] Chiu, Yu-Bin, Chieh-Peng Lin and Ling-Lang Tang. Gender differs: assessing a model of online purchase intentions in e-tail service. International Journal of Service Industry Management, 16(5):416-435, 2005.

[4] Guo, Stephen, Mengqiu Wang and Jure Leskovec. The role of social networks in online shopping: information passing, price of trust, and consumer choice. In Proceedings of the 12th ACM conference on Electronic commerce, pages 157-166. ACM, 2011.

[5] Hill, Shawndra, Foster Provost and Chris Volinsky. Network-based marketing: Identifying likely adopters via consumer networks. Statistical Science, pages 256-276, 2006.

[6] Kau, Ah Keng, Yingchan E Tang and Sanjoy Ghose. Typology of online

[7] Sarwar, Badrul, George Karypis, Joseph Konstan and John Riedl. Analysis of recommendation algorithms for e-commerce. In Proceedings of the 2nd ACM conference on Electronic commerce, pages 158-167. ACM, 2000.

[8] Zhang, Yongzheng and Marco Pennacchiotti. Predicting purchase behaviors from social media. In Proceedings of the 22nd international conference on World Wide Web, pages 1521-1532. International World Wide Web Conferences Steering Committee, 2013.

University of Heidelberg  (Computer Science)
Seminar - Social Media Network Analysis
Philipp Güth (Author), 2014, Purchase Prediction from Social Media. Methodology, Limitations & Potentials, Munich, GRIN Verlag, https://www.grin.com/document/305215

