Big or Smart Data? Recent trends in Data Science for sales and marketing

An empirical analysis of the consumer-packaged goods industry

Bachelor Thesis, 2021

139 Pages, Grade: 1,0


Table of Contents

List of abbreviations

Table of figures

List of tables

1 Introduction

2 Fundamentals of big and smart data
2.1 Characteristics of big data
2.2 Development of smart data

3 Data Science
3.1 Evolution of the data economy
3.2 A data science definition
3.3 Data science techniques in sales and marketing
3.3.1 Introduction of the data mining process
3.3.2 Data modelling in sales and marketing
3.3.3 Model evaluation and deployment

4 Research Design
4.1 Conceptual framework
4.2 The selected industry

5 Case study results
5.1 Summary and validation of expert information
5.2 Data science framework

6 Overview of future approaches in data science
6.1 Challenges and opportunities in practice
6.2 Limitations

7 Conclusion




Big data keeps growing, market pressure to exploit existing data is increasing, and with it the number of companies that engage with data science. This dissertation identifies big and smart data science trends in marketing and sales within the consumer-packaged goods industry. The objective of this research is to examine current opportunities around data science in the selected focus area; analyzing those opportunities yields nine data science trends. In in-depth interviews, the experts are asked about their experiences and difficulties with data science, the emotions that arise in working with this science are recorded, and potential improvements are discussed. Subsequently, central quotations are analyzed with Mayring’s qualitative content analysis, reformulated into condensed codes, and summarized in eighteen overarching categories. The general findings of this analysis include the necessity of smart data insights within this low-margin industry, the dependence on consultancy support due to knowledge gaps, room to expand engagement in the B2B environment, the promotion of data-driven thinking and acting, the merging of sales and marketing to generate data science knowledge, and the extension of data science knowledge to maintain a competitive advantage in the market in the long run. The improvement proposals consist mainly of automated data cleaning, intelligent algorithms, the development of data handling knowledge, data democracy, and the combination of knowledge in project-dependent focus teams to broaden data science applications within the industry.

List of abbreviations

AI Artificial Intelligence

ARIMA Autoregressive Integrated Moving Averages

BI Business Intelligence

CPG Consumer-Packaged Goods

CRISP-DM Cross-Industry Standard Process for Data Mining

CRM Customer Relationship Management

IoT Internet of Things

ML Machine Learning

ROI Return on Investment

SEM Structural Equation Modelling

SVM Support Vector Machine

Table of figures

Fig. 1: Sequence of big data analytics

Fig. 2: Timeline of the data economy development

Fig. 3: Definition of data science

Fig. 4: The fusion of data analytics with data science

Fig. 5: The data mining process

Fig. 6: Overview of data mining techniques

Fig. 7: Structure of a back-propagation neural network

Fig. 8: The process of qualitative content analysis according to Mayring P

Fig. 9: Visualization of the coding process

Fig. 10: The merging of similar categories

Fig. 11: 18-by-4 reliability matrix

Fig. 12: Matrices of consistency

Fig. 13: Calculation of Krippendorff’s Alpha

Fig. 14: Identification of nine trends from 18 coding categories

Fig. 15: Data science framework

Fig. 16: Insights into the emotions of the interviewees

List of tables

Tab. 1: Compilation of big data sources

Tab. 2: Researchers addressing the term of big and smart data

Tab. 3: Data mining techniques in sales and marketing

Tab. 4: Overview of the conducted interviews

Tab. 5: The 20-category coding guide

Tab. 6: Data mining techniques in the CPG industry in marketing and sales

Tab. 7: Data science differences in the CPG industry

1 Introduction

How many units will I sell next month? How can I inform specific customers about my products? What do people think about my brand? Are they even talking about us? These questions and many more can be answered through proper analysis of big and smart data, although this is easier in some companies and industries than in others. Since around 2010, the volume of data created, captured, copied, and consumed has grown steadily every year, reaching a total of around 59 zettabytes worldwide in 2020 (cf. IDC, 2020). Converted, but still unimaginably large, this is approximately 59 trillion gigabytes. With a growth rate of 150 percent, the same source expects the volume of global information in the form of big data to reach 149 zettabytes in 2024. For these large amounts of data, cloud storage is gaining in importance as traditional data warehouses reach their limits. Today, the largest part of big data is generated by three tech companies, namely Amazon, Facebook, and Google (cf. Wedel and Kannan, 2016, p. 102). Big data forms the business foundation of any online enterprise (cf. Wang and Wang, 2020, p. 2). Of particular interest to companies are instantaneous real-time data, which are generated primarily through interactions with devices connected to the so-called Internet of Things. Big data enables businesses to improve customer management systems, forecast demand, sales, and turnover, optimize supply chains, increase product security and performance, or develop new business models. Consequently, all these possibilities offer potential to maintain a competitive advantage within the market. ‘Lidl’, for instance, is one of the first discounters in Germany to offer a smartphone app, which includes discount promotions, overviews of current in-store promotions, and digital receipt storage for users. 
On the one hand, the application aims to connect Lidl’s digital and brick-and-mortar business; on the other hand, the enterprise obtains its own valuable customer data. The ‘Payback’ bonus program has been collecting consumer data for years by offering its members extraordinary savings on their purchases. Other examples are well-known delivery services, such as ‘Lieferando’ or ‘Rewe liefert’, which have benefited greatly from the Covid-19 pandemic and also obtain first-hand consumer data to improve their business and customer experience. Recently, smart consumer products have also been gaining in popularity, such as the first smart electric toothbrush from ‘Oral-B’. The toothbrush can be connected to an associated app, allowing brushing behavior to be optimized. At the same time, this consumer-generated data provides a foundation for data-based business measures. Data analysis opens up many fields of application, whether in health care, the financial sector, entertainment streaming services, production, or increasingly in the consumer goods industry. This work focuses on the consumer-packaged goods (CPG) industry and in particular on the application area of sales and marketing. Its purpose is to identify the importance of big or smart data for businesses and the current trends around data science. The work extends the existing literature on the topic with insights into an industry that still holds a great amount of untapped potential in the field of data science. This makes the results of the conducted research interesting not only from an academic point of view, but especially for consumer goods companies that want to engage more actively in the field of big data.

The work is divided into seven chapters. The first chapter introduces the subject: it describes the motivation for selecting the topic, sets the objective, and gives a brief overview. The second chapter considers the two data characteristics, big and smart, and gives a brief account of how smart data is extracted. The third chapter focuses on data science, including its evolution, the definition of terms, and the data mining process. The fourth chapter opens the main part of the work and describes the research conducted within the selected CPG industry. The main body is chapter five, which contains the development of the data science framework and identifies nine trends from the collected interview material. Particular attention is given to the detailed description of the applied methodology, Mayring’s qualitative content analysis, and to the calculation of Krippendorff’s Alpha. Mayring’s methodology and the developed coding guide make it possible to extract valuable information from the interview material and to understand data science approaches in marketing and sales within the CPG industry. Additionally, similarities within the given information are compared, allowing dominant trends to be defined. Chapter six focuses on the identification of future opportunities around data science within this industry and takes a critical view of the limitations of this research. A compact summary of the work rounds off the dissertation.

2 Fundamentals of big and smart data

2.1 Characteristics of big data

Images or videos from online communities and social media accounts, retail scanner data, data from online reviews and the customer buying journey, sensor data such as GPS smartphone sensors, or in-store cameras and intelligent changing rooms (cf. Sivarajah et al., 2017, p. 263; Fisher and Raman, 2018, p. 1665) are all commonly used big data sources, compiled in more detail in table 1. The term big data, today the norm, first emerged with the creation of the internet. The development of the World Wide Web and the launch of the personal computer in the 1980s created unique possibilities for customer interaction, data collection, and storage (cf. Chen, Chiang, and Storey, 2012, p. 1167f; Wedel and Kannan, 2016, p. 99). Since then, companies have been able to analyze user-generated data from various channels and derive a competitive advantage or hidden potential from it (cf. Chong et al., 2017, p. 5142f). The work of Blattberg and Neslin (cf. 1989, pp. 82-84) is an early example, in which scanner data is analyzed to increase sales through different promotional activities and their interaction effects. Such data was automatically gathered by IBM supermarket scanners for the first time in 1972 (cf. Wedel and Kannan, 2016, p. 99). Increased data availability also raised the need for consumer analytics in marketing, from which the area of customer relationship management (CRM) emerged (cf. Shaw et al., 2001, p. 127). However, Fisher and Raman (cf. 2018, p. 1665) point out that the sales data available to companies around 1995 was still limited. They state that retail businesses, for instance, used point-of-sale data as their primary source of customer demographic insights. Around the same time, online retailers like Amazon or eBay developed their innovative online marketplaces and introduced advanced recommendation algorithms (cf. Chen, Chiang, and Storey, 2012, p. 1168f), which received their input from filtered content of preferences or similar consumer data (cf. Wedel and Kannan, 2016, p. 112). At the start of the 2000s, new text and video data from the social media platforms Facebook, YouTube, and Twitter became available at decreased cost (cf. Wedel and Kannan, 2016, p. 99f). Only around 10 years later did companies and industry, as well as Google searches and the literature, start to increase their focus on the generation and analysis of big data and on advanced big data technologies (cf. Cao, 2017, p. 5f; Wang and Wang, 2020, p. 1). According to the detailed research of Sivarajah et al. (2017, p. 264), literature on big data has existed only since 2002. As stated by Wedel and Kannan (2016, p. 103), connected devices, networks, or software, also called the Internet of Things (IoT), started to facilitate continuous and qualitative big data collection through newly developed technologies. Statistical methods that originated in the 1970s and 1980s form the basis for prevalent big data analyses (cf. Chen, Chiang, and Storey, 2012, p. 1166). With the accessibility of enormous, heterogeneous data volumes, the era of “the Data Deluge” (cf. Sivarajah et al., 2017, p. 263) was born. Today, the potential of big data is widespread, as consumers and the growing number of connected devices continuously generate big data (cf. Wang and Wang, 2020, p. 1).

Table 1: Compilation of big data sources

Illustration not included in this excerpt

(cf. Malthouse and Li, 2017, p. 228f and extended with Sivarajah et al., 2017, p. 263; Fisher and Raman, 2018, p. 1665)

Big data is a collective term that combines many areas, such as the management and collection, but also the analysis, of continuously growing, inconsistent data sets and their characteristics (cf. Fosso Wamba et al., 2015, p. 235; George et al., 2016, p. 1). Big data analytics provides input for Business Intelligence (BI) (cf. Lycett, 2013, p. 381), and with sensible organization, filtering, and clustering, initial insights can be visually presented and communicated (cf. Hajli et al., 2020, p. 5). Volume, velocity, and variety are the three V’s that characterize big data. Some authors, including Sivarajah et al. (cf. 2017, p. 269-273), extend the list to the following six V’s:

- Firstly, big data is defined by its continuously expanding volume (cf. Hashem et al., 2015, p. 100), since a higher data quantity provides better models and hence better results (cf. Lycett, 2013, p. 381). By drawing on a larger data scope, it is also possible to gain insight about populations rather than a sample (cf. George et al., 2016, p. 3). Nowadays, these data sets are measured in exabytes or zettabytes (cf. Erevelles, Fukawa, and Swayne, 2016, p. 898; Sivarajah et al., 2017, p. 269) and, given their complexity, exceptional storage systems are needed (cf. Chen, Chiang, and Storey, 2012, p. 1166; Leeflang et al., 2014, p. 5). Database management systems like SAP HANA are typically implemented for structured data processing and insightful extraction of information from continuously collected data (cf. Jacobs, 2009, p. 39f; Gupta and George, 2016, p. 1052). Cloud applications for structured and fast data storage and retrieval are increasingly well known (cf. Rust and Huang, 2014, p. 209).
- Velocity is the second characteristic of big data. This includes, among other things, the accuracy of the data flow and the latency, which determines the time until the latest data is accessible for analysis (cf. Lycett, 2013, p. 381) or personalized product offers (cf. Sivarajah et al., 2017, p. 273). In an example from Erevelles, Fukawa, and Swayne (cf. 2016, p. 898), the data permanently obtained and collected by a leading supermarket chain is a big data set, whereas the population data of Europe is merely a large data set. The velocity can vary between unique snapshots and continuously flowing data streams (cf. Wedel and Kannan, 2016, p. 102), which are created by complex networks (cf. Sivarajah et al., 2017, p. 273).
- Unstructured or structured big data is characterized by its variety (cf. Hashem et al., 2015, p. 100), resulting from the compilation of various sources and inconsistent formats (cf. Lycett, 2013, p. 381). Big data can also include sentiments or data on the physical condition of a person (cf. George et al., 2016, p. 4). Blogs or review platforms, for instance, are excellent sources for ascertaining the mood around a product or brand (cf. Netzer et al., 2012, p. 522). Certain software can transform the visual, numerical, textual, sensory, and audible (cf. Wedel and Kannan, 2016, p. 102; Sivarajah et al., 2017, p. 269) variety of data into semi-structured information (cf. Erevelles, Fukawa, and Swayne, 2016, p. 898). The software Stanford CoreNLP, for example, pre-processes textual data by eliminating irrelevant words and extracts valuable linguistic information (cf. Liu, 2020, p. 4). The described cleaning process of big data is also known as “Data staging” (cf. Hashem et al., 2015, p. 102).
- Value, the fourth ‘V’, emerged over time and denotes the importance of analyzing big data and utilizing its results for business (cf. Lycett, 2013, p. 381). In this context, researchers use the term datafication, which represents the process of extracting information from physical assets and resources to generate new insights (cf. Lycett, 2013, p. 382f). Sivarajah et al. (2017, p. 273) describe the process of uncovering valuable insights as sifting the gold out of big data.
- Over time, the fifth characteristic, veracity or validity, gained importance in big data. Ever-growing data sets are only useful once insignificant or low-quality information has been eliminated (cf. Erevelles, Fukawa, and Swayne, 2016, p. 898), which in turn yields reliable (cf. Wedel and Kannan, 2016, p. 102) and understandable big data (cf. Sivarajah et al., 2017, p. 269). By filtering out impulsive trends, data sets act as helpful warning systems and increase the benefit to a business (cf. Qi and Shanthikumar, 2018, p. 1682). Malthouse and Li (2017, p. 233) define a valid big data set as a comprehensive data set that describes the selected target group and not the complete customer database.
- Finally, Sivarajah et al. (2017, p. 273) add the characteristic variability to the list of V’s. It is not to be confused with the third characteristic, variety, where the data type plays the predominant role. Variability describes the various meanings and rapid transformation of one and the same data entry (cf. Sivarajah et al., 2017, p. 273). The word ‘large’, for instance, is perceived negatively with respect to incorrectly ordered pants but positively with respect to a TV screen for a home cinema (see also chapter 3.3, sentiment analysis).
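
Two ideas from the list above, the elimination of irrelevant words during pre-processing and the context-dependent meaning of a word such as ‘large’, can be illustrated with a minimal Python sketch. It does not reproduce Stanford CoreNLP or any method from the cited literature; the stop-word list, the polarity table, and the function names `preprocess` and `polarity` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for illustration);
# real pipelines rely on much larger, language-specific lists.
STOP_WORDS = {"the", "a", "an", "is", "are", "for", "my", "this", "too", "and"}

def preprocess(text):
    """Lower-case the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

# Hypothetical polarity table: the same word receives a different score
# depending on the product it refers to (the 'variability' of a data entry).
CONTEXT_POLARITY = {
    ("large", "pants"): -1,
    ("large", "tv"): 1,
}

def polarity(word, tokens):
    """Return the context-dependent polarity of `word`, or 0 if unknown."""
    if word not in tokens:
        return 0
    for (entry, context), score in CONTEXT_POLARITY.items():
        if entry == word and context in tokens:
            return score
    return 0

pants_review = preprocess("The pants are too large for my legs")
tv_review = preprocess("This TV is large and great for a home cinema")

print(pants_review, polarity("large", pants_review))
print(tv_review, polarity("large", tv_review))
```

The lookup is keyed on the pair of word and product, so the identical token ‘large’ receives opposite scores in the two reviews.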

The goal of collecting big data is its storage, its management through clustering, and with useful analysis, its transformation into insightful and readable information. Big data analysis enables companies to readjust strategies, realign resources, and revitalize business capabilities (cf. Fosso Wamba et al., 2015, p. 243).

2.2 Development of smart data

Erevelles, Fukawa, and Swayne (2016, p. 901) conclude that creativity-driven companies that implement fundamental innovations can extract more noteworthy value from big data. They define customer big data as an innovative asset for developing a sustainable competitive advantage (2016, p. 903). A frequent source of this advantage is online reviews, which capture sentiments (cf. Hu, Koh, and Reddy, 2014, p. 3) but also provide guidance on consumer needs (cf. Hajli et al., 2020, p. 5). However, the value lies not in the big data volume as such, but in uncovering relevant insights with tailored analysis techniques and statistical models (cf. Sivarajah et al., 2017, p. 264). Given that the composition of unstructured big data is unclear, the existing data is analyzed, interpreted, and converted into qualitative, actionable information (cf. Hashem et al., 2015, p. 100), so-called smart data. Wedel and Kannan (2016, p. 104) developed a framework for data analytics in marketing, which, combined with the work of Sivarajah et al. (2017, p. 266), results in figure 1. This figure visualizes the sequence of different big data analytics used to identify smart data.

Figure 1: Sequence of big data analytics

(Wedel and Kannan, 2016, p. 104; Sivarajah et al., 2017, p. 275f)

In the first sequence, structured and unstructured data are analyzed via descriptive, univariate, or bivariate statistics, such as variability measures, linear regression analyses (cf. Chen, Chiang, and Storey, 2012, p. 1166f), or probability measures based on maximum likelihood (cf. Reiss, 2011, p. 954). The visualization of the results in boxplots, histograms, or similar charts helps to gain an initial understanding of the data, identify outliers, summarize important information, and define focus areas for further analysis (cf. Liu, 2020, p. 5-7). Descriptive analytics of structured data from data warehouses, together with the mapping of its results, forms the basis of analytical or BI projects (cf. Cao, 2017, p. 4). Adding the marketing analytics approach of Wedel and Kannan (2016, p. 105) to this first step, the sequence also covers reviewing sales data in combination with interview sample data. While the first part of the framework focuses on managing and explaining the collected data, the second sequence tries to predict future performance and minimize risk. The analyses applied are Bayesian statistics and the identification of random events through well-known testing methods such as the chi-squared test or the t-test. Wedel and Kannan (2016, p. 105f) state that the methods applied are principal component analysis for the combination of related variables or basic Bayesian statistics, which focus on determining probabilities of input and output events through, for instance, hypothesis testing. Bayesian models can, for example, form heterogeneous consumer segments (cf. Reiss, 2011, p. 954). The segmentation can in turn be the basis for predicting preferable marketing measures and their effects on sales growth (cf. Wedel and Kannan, 2016, p. 104 and 107). The last sequence focuses on the development of prescriptive models that are based on multivariate statistics such as regression models, clustering analyses, or structural equation models. 
These models extract smart data and generate data-based recommendations for action. Finally, with the creation of prescriptive models and valuable insights gained through big data analytics, businesses can adapt their future (marketing) actions to those findings (cf. Wedel and Kannan, 2016, p. 104; Wang and Wang, 2020, p. 2). In conclusion, data analysis is a sequence of big data observation, information gathering through detailed data analysis, followed by data-based predictions and actions with extracted smart data (cf. Cao, 2017, p.17). With data sets becoming larger, predictive machine learning (ML) algorithms like support vector machine (SVM) or Bayesian networks gain in importance (cf. Chen, Chiang, and Storey, 2012, p. 1175). They facilitate the last framework sequence of uncovering feasible smart data insights. More detailed information concerning smart data analytics and the prescriptive sequence can be found in chapter 3.3 of this work. The framework and big data analytics can be regarded as the bridge between big and smart. Therefore, smart data is the extension of big data, and focuses on the comprehension, useful application, and even prediction of data. Smart data consists of converted, valuable business insights from big data (cf. Iafrate, 2015, p. 13). Big data, for example, records past and current sales data, while smart data uses real-time big data to forecast turnover (cf. Iafrate, 2015, p. 19). Supported by Jacobs (2009, p. 44), big and smart data analytics will be the new routine of the data world and only the businesses who look beyond conventional technologies and algorithms will be able to maintain a competitive advantage. An overview and summary about relevant researchers addressing the transformation from big data into smart data within this chapter, can be found in table 2. 
Particularly, smart data enables businesses to not only learn about their consumers, but also to analyze competitors, align pricing or sales strategies, collect reviews on the assortment and prototypes (cf. Zhenning, Frankwick, and Ramirez, 2016, p. 1563), and to review present business models and opportunities (cf. Fisher and Raman, 2018, p. 1666).
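
The three sequences described in this chapter (descriptive, predictive, prescriptive) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the framework of Wedel and Kannan: the monthly figures are invented, a single-variable least-squares line stands in for the predictive models, and the rule in `recommend_spend` is a hypothetical prescriptive step.

```python
from statistics import mean, stdev

# Invented monthly data (assumption): advertising spend and units sold.
ad_spend = [10.0, 20.0, 30.0, 40.0]
units_sold = [120.0, 210.0, 290.0, 380.0]

def describe(values):
    """Descriptive step: summarize what the collected data looks like."""
    return {"mean": mean(values), "stdev": stdev(values),
            "min": min(values), "max": max(values)}

def fit_line(x, y):
    """Predictive step: ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def recommend_spend(intercept, slope, options, target_units):
    """Prescriptive step: smallest spend whose predicted sales reach the target.

    Returns None if no option is predicted to reach the target.
    """
    feasible = [s for s in sorted(options)
                if intercept + slope * s >= target_units]
    return feasible[0] if feasible else None

summary = describe(units_sold)
intercept, slope = fit_line(ad_spend, units_sold)
recommendation = recommend_spend(intercept, slope, [30.0, 40.0, 50.0, 60.0], 400.0)

print(summary)
print(intercept, slope, recommendation)
```

Each function consumes the output of the previous one: the summary describes the data, the fitted line predicts sales for unseen spend levels, and the recommendation turns that prediction into an action. A linear fit extrapolated beyond the observed spend range is a strong simplification and would need validation in practice.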

Table 2: Researchers addressing the term of big and smart data

Illustration not included in this excerpt

(Sources are included in the table)

3 Data Science

3.1 Evolution of the data economy

The origins of data modelling in sales and marketing lie far in the past. After the second world war, effective models for the optimization of production and logistics were created as stated by Leeflang and Wittink (2000, p. 106). Shortly thereafter, the importance of predictive modeling in marketing was highlighted by Politz and Deming (1953, p. 51), as each decision in marketing defines a future expectation and is therefore based on a forecast. In the 1960s marketing managers, for instance, started to apply those successful mathematical models to predict consumer buying behaviors (cf. Leeflang and Wittink, 2000, p. 106). As an example, Winters (1960, p. 326f) applied moving averages for sales performance predictions of products containing seasonal structures. Among other things, the well-known growth model from Bass (1969) for sales of new consumer durables evolved. Bass was able to develop a model, which predicts a future phenomenon and contributes to its comprehension without having specific data for it (cf. Bass, 2004, p. 1834f). Throughout the 1970s many analytical models for consumer goods emerged which were increasingly realistic but also complicated (cf. Leeflang and Wittink, 2000, p. 107). These models were predominantly based on economic descriptions, whereby psychological analyses were also included (cf. Wedel and Kannan, 2016, p. 100). In 1975, Peter Naur was one of the first to introduce the term Data Science, by elaborating on language and methodology to handle sizeable data sets (cf. Belzer, 1976, p. 125f; Cao, 2017, p. 3). Around the same time, data analysis also gained importance in the sales environment. In the study of Rothe (1980, p. 114-116), sales and marketing was defined as one of the main areas where sales predictions are vital for the effectiveness of the business. In addition, this study illustrates that regression analyses or time series models were still applied by the minority of the respondents. 
The majority reported traditional statistical methods or managers’ experience as their main source of forecasting. Similar information is provided by Edmundson, Lawrence, and O'Connor (1988, p. 203), who identified product and industry knowledge as the major source of competitive advantage in early sales forecasting. Makridakis and Wheelwright (1977, p. 26-29), together with Mentzer and Cox Jr. (1984, p. 35), expanded the range of prevalent forecasting techniques to include moving averages, trends, decision trees, exponential smoothing, and also simple survey evaluations. Over time, journals summarizing marketing science and data analytics businesses such as Nielsen, which expanded their tracking services, increased the interest in marketing decision support systems (cf. Leeflang and Wittink, 2000, p. 108f). Additionally, CRM began to flourish with the development of supporting software for interviews and easier data accessibility via the World Wide Web (cf. Wedel and Kannan, 2016, p. 99). This implies that data science was already defined as a method to handle limited data in the 1970s, was revolutionized by the emergence of big data in conjunction with the World Wide Web, and transformed into today’s data-driven economy (cf. Cao, 2017, p. 25). The evolution of the data-driven economy pushed classical statistics to its limits, giving rise to a new kind of science for data, namely data science. The expansion of the internet in the early 2000s led to a marketing shift towards a customer-centric focus with abundant availability of marketing data and tailored campaigns for selected customers (cf. Leeflang and Wittink, 2000, p. 117; Wedel and Kannan, 2016, p. 99). Online retailers created platforms for user-generated reviews, serving as informational sources for future customers and for businesses researching their brand image (cf. Fan, Che, and Chen, 2017, p. 90). In this new economy, Cao (2017, p. 25f) defines data as the key driver around which new systems, methodologies, IT infrastructure, environments for data management, and professional communities are created. With the ability to store and analyze continuously flowing data, defined as digital data streams by Piccoli and Pigni (2013, p. 54f), data science becomes a key resource for businesses. The authors introduce the clickstream of a customer searching for a product on a website as a common example of a digital data stream, which can be harvested, enriched with additional valuable data, and analyzed for business insights. Digital data streams allow businesses to monitor and visualize their performance in real time, receive instant feedback, and modify their decisions (cf. Piccoli and Pigni, 2013, p. 57). In the area of marketing, tracking customer journeys or loyalty loops becomes a key feature to increase sales via higher customer engagement (cf. Leeflang et al., 2014, p. 5). Rust and Huang (2014, p. 213) accordingly define customer interaction as the fuel of the new era of marketing science. In addition, the granularity of marketing mix components, tailored to a single customer or a homogeneous segment, increased with the accessibility of customer-specific data (cf. Wedel and Kannan, 2016, p. 110f). Nowadays, businesses rely heavily on digital marketing data, such as clickstream data, collected through cookies (cf. Wedel and Kannan, 2016, p. 99). Each click offers an opportunity to generate targeted marketing content supported by real-time automated models (cf. Bucklin et al., 2002, p. 253). Big data science also enriches the area of customer-specific demand predictions, which allows for highly accurate sales forecasts (cf. Feng and Shanthikumar, 2018, p. 1672). Kulkarni, Kannan, and Moe (2012, p. 605), for example, use structured online search data to measure the pre-launch interest in a new product and forecast its sales. 
In addition, this information allows for precise inventory management and the alignment of production capacities, which in turn reduces variable costs (cf. Kuo and Xue, 1998, p. 105). The described development of the data economy is illustrated by the timeline in figure 2.
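
Two of the early forecasting techniques named in this chapter, moving averages and exponential smoothing, are simple enough to sketch directly. The Python below is a minimal illustration with invented sales figures; the function names and the smoothing factor `alpha` are assumptions for this example, and neither function handles seasonality, so it is not a reconstruction of the seasonal approach attributed to Winters (1960).

```python
def moving_average_forecast(history, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window < 1 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window

def exponential_smoothing_forecast(history, alpha):
    """Simple exponential smoothing; the final level is the next-period forecast.

    `alpha` in (0, 1] controls how strongly recent observations are weighted.
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # initialize the level with the first observation
    for observation in history[1:]:
        level = alpha * observation + (1 - alpha) * level
    return level

# Invented monthly unit sales (assumption for illustration).
monthly_sales = [100.0, 110.0, 120.0, 130.0]

ma_forecast = moving_average_forecast(monthly_sales, window=2)
ses_forecast = exponential_smoothing_forecast(monthly_sales, alpha=0.5)

print(ma_forecast, ses_forecast)
```

On a steadily rising series both forecasts lag behind the trend, which is one reason why trend and seasonal extensions of these methods were developed.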

Today, data science has become the engine of innovation and advantage creation as highlighted by Cao (2017, p. 15) or Feng et al. (2020, p. 1). Cao’s statement (2017, p. 15-17) is supported by the rapid development and growth of a professional community actively promoting this new science, the regular exchange of these experts at conferences or similar, and the extension of consumer interacting online platforms like Facebook or LinkedIn as hot spots for big data. In a data-driven economy, business decisions are based on continuously expanding, permanently accessible global data (cf. Cao, 2017, p. 27). The data analytics framework presented in chapter 2.2 can be extended to an ongoing and intelligent data science analytical cycle, by adding a data-driven innovation and optimization sequence proposed by Cao (2017, p. 27). Professionals, as well as Cao (2017, p. 33), see data science as a newly emerging discipline, which continues to develop its own systems, methods, and theories. One of the most interesting trends, recently influencing different industry sectors, is Artificial Intelligence (AI), which replaces humans or repetitive work with intelligent machines or algorithms.

Figure 2: Timeline of the data economy development

Illustration not included in this excerpt

(Sources are included in the figure)

3.2 A data science definition

Data science is an umbrella term for an interdisciplinary science. It describes the extraction of valuable insights and the development of forecasts from continuously growing big data (cf. George et al., 2016, p. 1) through the interaction of three fields. The combination of the three fields in data science and what they include is visualized in figure 3 below, based on input from George et al. (2016, p. 1f) and Cao (2017, p. 5-8). Data analytics, already discussed in chapter 2.2, is the cornerstone of data science and connects the fields of statistical methods and business or economic knowledge (cf. Cao, 2017, p. 5). Statistics is a mature discipline, in which selected data sets are analyzed and insights are rapidly extracted via statistical tests or representative sample selection (cf. Choi, Wallace, and Wang, 2018, p. 1869f). The second field, business knowledge, represents department-based knowledge in marketing or sales, for instance. This knowledge is important for drawing the right conclusions from the statistical analyses and algorithmic predictions (cf. George et al., 2016, p. 1). The area of data engineering is responsible for the data flow or pipeline (cf. Yaqoob et al., 2016, p. 1241). Lastly, computer science completes the three corners of data science. On the one hand, computer science supports the management and exchange of business insights by providing data systems to structure the data; on the other hand, it creates the basis for statistical ML models for big-data-driven decisions (cf. Cao, 2017, p. 8). In deep big data ML, intelligent computer-based algorithms are used to capture especially complex data relations, and these algorithms continuously learn from the input data (cf. Choi, Wallace, and Wang, 2018, p. 1870). Data mining is a computer-assisted methodology to classify, cluster, or separate similar big data; this process closes the triangular relation and links the ML models with the field of statistics (cf. Choi, Wallace, and Wang, 2018, p. 1870). The combination of all three fields is defined as data science, and depending on the focus area, definitions can differ from one another. While George et al. (2016, p. 1), for example, focus on the interaction of analytical statistics and computer science to define data science, Cao (2017, p. 8) highlights business knowledge as an important basis for extracting information in the new science.

Figure 3: Definition of data science

Illustration not included in this excerpt

(cf. George et al., 2016, p. 1f; Cao, 2017, p. 5-8)

The concept of data analytics, which focuses on understanding various data sets, their statistical analysis, and smart data extraction (cf. Cao, 2017, p. 17), can now be clearly distinguished from the term data science. Cao (2017, p. 19f) separates the two terms by assigning data analytics to the area of explicit analytics and data science to the area of implicit analytics. This extension of the data analytics framework, based on figure 1, is highlighted in blue in figure 4 below. The most significant difference is that, as data becomes more complex, data science methods are included in the prescriptive analysis of unknown variables and patterns to ensure a clear understanding of previously undiscovered actionable insights (cf. Cao, 2017, p. 19f).

Figure 4: The fusion of data analytics with data science

Illustration not included in this excerpt

(cf. Cao, 2017, p. 18-20)

To share the retrieved information with stakeholders and to exploit data science skills successfully, access to and knowledge of data science tools, including an adequate IT infrastructure, are required (cf. Fosso Wamba et al., 2015, p. 235). Provided these prerequisites are met, every data science project follows a similar process. The following chapter elaborates on the data science process and its diverse methodologies.

3.3 Data science techniques in sales and marketing

3.3.1 Introduction of the data mining process

As described by Bucklin et al. (2002, p. 253), historically developed statistical models were not designed for massive data sets but rather for incomplete data. This challenge has given rise to the process of data mining, in which efficient methods transform growing data sets into future predictions (cf. Bucklin et al., 2002, p. 253). In contrast to the data analytics framework, which focuses on the discovery of smart data, data science follows the prescriptive part of the same sequence in figure 1 or 4 and digs deeper into the data or even predicts future values with high accuracy. New discoveries from big data analyses, or a request from a colleague or from the management level of the company, give rise to a data science project, a so-called data science use case. The use case is then usually shaped and directed by the standardized data science process, referred to as data mining or as the cross-industry standard process for data mining, abbreviated as CRISP-DM (cf. Martinez-Plumed et al., 2020, p. 1). The following section focuses on the prescriptive part of big data analytics and describes the complete data mining process in more detail; the process is also visualized in figure 5.

Figure 5: The data mining process

Illustration not included in this excerpt

(cf. Martinez-Plumed et al., 2020, p. 2)

In the first step, the goal of the data science use case is identified, including the business knowledge about the project and its data (cf. Martinez-Plumed et al., 2020, p. 4). Typical use cases in sales and marketing are measuring the effectiveness of a bundle of marketing measures and its effect on sales, or optimizing social media ad spending. Thereafter, the necessary big data is defined, collected, and stored. In today’s data economy, large amounts of data are collected automatically by sensors in smart devices or by intelligent algorithms that extract data from websites via web crawling or clickstream tracking (cf. George et al., 2016, p. 10-12). In the same step, these large data sets are stored in an easily accessible way where large processing capacity is available, such as on storage distributed across multiple computers with the software Apache Hadoop, or in a cloud system (cf. George et al., 2016, p. 14). According to Jacobs (2009, p. 39), big data collection and its structured storage in so-called data warehouses are no longer an impediment to the process. Data warehouses are not to be confused with data lakes, where raw big data is stored in its original form and from which only the data relevant for data mining is extracted (cf. Kitchens et al., 2018, p. 542).

Data science goes further than pure big data analytics because it requires active engagement with the data, expertise on the data, and even the creation of new data needed for the use case (cf. Martinez-Plumed et al., 2020, p. 5). Building an understanding of the data also feeds back into the previous step. The original CRISP-DM model integrates the data as a fixed center; however, the increased variety in big data also requires flexibility in data collection, covering internal business data as well as external or newly developed data sets (cf. Martinez-Plumed et al., 2020, p. 5f). This is why the process in figure 5 includes the data as an invisible source that adds its input from different platforms, devices, and environments.

The third step of the data mining process includes data cleaning and processing, in which the relevant data is fished out of the data lake or warehouse (cf. Guo, Wong, and Li, 2013, p. 248). At the same time, the variety but not the value of the data is reduced through categorization, combination, or formatting of the variables (cf. George et al., 2016, p. 16). This is particularly relevant for highly heterogeneous marketing data sets, which are dominated by big data characteristics such as variety and veracity (cf. Rossi and Allenby, 2003, p. 316). In the same step, the relevant variables of the big data sets are analyzed via descriptive and multivariate statistics, such as regression analysis, to determine the relations between different variables (cf. George et al., 2016, p. 18-20). Interesting findings from this advanced data analytics process are visualized and reported to determine further data science methods (cf. George et al., 2016, p. 27f). If the problem to be investigated and its data are already known, descriptive analyses and visualization are omitted or have already been carried out in advance. Up to this point, the data science process is very similar to the data analytics process illustrated in figures 1 and 4. While data analytics methods focus on a sample of the data, data science applies its methodologies to the predefined but full-size data set, which can be summarized under the term data mining (cf. Shaw et al., 2001, p. 131).

In the next step, the features for the methodology are defined, which might require some redefinition of the needed inputs. For example, the competitor’s influence, established by combining several variables, is considered a feature (cf. Ferreira, Lee, and Simchi-Levi, 2016, p. 73f). In the retail sales prediction model of Guo, Wong, and Li (2013, p. 248f), a harmony-search-wrapper-based algorithm selects the optimal input features and excludes irrelevant variables. As soon as the features are identified, the data set is, depending on the methodology, either directly ready for modelling or split into a training and a testing set. The training set is fed into the prediction model (cf. Guo, Wong, and Li, 2013, p. 248), and the testing set is the basis for the accuracy check after the model training. In general, data models describe the complexity of a big data set (cf. Navathe, 1992, p. 112f), can make analytical forecasts, or solve problems with adapted models (cf. Cui and Curry, 2005, p. 597). The following section gives an overview of popular statistical and ML methodologies in this fourth step of the data mining process.
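The split into a training and a testing set can be illustrated with a minimal sketch. The function name `train_test_split`, the 25 percent test share, the fixed seed, and the weekly observations are assumptions made for illustration only; they are not taken from the cited studies.

```python
import random

def train_test_split(rows, test_share=0.2, seed=42):
    """Shuffle a copy of rows and split it into a training and a testing set."""
    if not 0.0 < test_share < 1.0:
        raise ValueError("test_share must lie strictly between 0 and 1")
    shuffled = list(rows)                  # copy, so the caller's data stays untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_share))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical weekly observations: (week, promotion flag, units sold)
observations = [(week, week % 4 == 0, 100 + 3 * week) for week in range(1, 21)]
train_set, test_set = train_test_split(observations, test_share=0.25)
print(len(train_set), "training rows,", len(test_set), "testing rows")
```

For time-ordered sales data, a chronological split (earlier weeks for training, later weeks for testing) may be more appropriate than a random shuffle, because it avoids training on observations that lie after the test period.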

3.3.2 Data modelling in sales and marketing

Illustration not included in this excerpt

Before applying data mining techniques, one needs to decide whether the input data is a cluster or a sample. While a sample is a randomly selected part of a larger population or customer base, a cluster, such as weekend promotion shoppers, combines data entries that are similar in a selected variable (cf. Bradlow et al., 2017, p. 27). Sample findings are mostly identified with statistical techniques and transferred to the entire database, which might not represent the segment of focus (cf. Malthouse and Li, 2017, p. 233). Clusters, in contrast, are mostly defined and analyzed via ML algorithms, as described below. For better orientation, the data mining methods addressed are organized visually in figure 6.

Figure 6: Overview of data mining techniques

Illustration not included in this excerpt

(Own figure; sources are cited in context in the following paragraphs)

Classic statistical data mining techniques can be subdivided into statistical models and time series analysis. The first group includes multivariate statistics, such as simple linear regressions, but also structural equation modelling (SEM) or generalized linear models. A maximum likelihood estimator is also part of the statistical modelling techniques. It is a computer-based model that can forecast a function of the observed phenomenon by describing the data in detail, such as an individualized online demand model that includes the return time of customers, their related sales volume, and substitute product sales in case of unavailability (cf. Feng and Shanthikumar, 2018, p. 1673). For example, continuously available online sales data allows businesses to nowcast market demand and respond to market changes (cf. See-To and Ngai, 2018, p. 428) even faster by applying such model estimations. Chong et al. (2016, p. 374-376), and again in their subsequent analysis (cf. Chong et al., 2017, p. 5152), analyzed various factors influencing the demand for electronic goods and identified discount offerings for items with high sales volumes as the most effective one. They also state that the combination of online feedback and discounts has a significant impact on revenue. The study by Feng et al. (2020, p. 10f) is one of the first to focus on mobile commerce data, which allows the discovery of behavioral trends of customers and required business partners. To reveal new targeting possibilities from the analyzed mobile application data, the authors applied Bayesian SEM, which consists of a measurement equation and a structural equation and assigns correlated variables to inferred or latent variables such as loyalty (cf. Feng et al., 2020, p. 5).

In addition, statistical modelling includes dependency analyses, which determine the correlation between variables. On this basis, product sets can be defined to align marketing measures efficiently and increase their sales performance (cf. Shaw et al., 2001, p. 129). Concept descriptions are mainly used to discover, summarize, or compare the data in large data sets (cf. Shaw et al., 2001, p. 129f). Also included are data visualizations supported by various colors or layers, which sounds simplistic but is essential to identify developments or intricate patterns for in-depth data mining (cf. Shaw et al., 2001, p. 130f). The methodology called multidimensional scaling visualizes the relationship between variables and compares different sets of categorized data in a colorful way (cf. Carroll and Green, 1997, p. 195).

The second category, time series analysis, comprises simpler sales forecasting techniques. These include averages or autoregressive integrated moving average (ARIMA) models, which develop fitted forecasts based on past values ordered in time (cf. Ansuj et al., 1996, p. 421). Regression-based trends or multiple linear regressions are also common time series methods; they treat past developments as the cause of changes in future values and use them to fit the model (cf. Carbonneau, Laframboise, and Vahidov, 2008, p. 1143).
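Two of the simpler time series techniques named above, an average-based forecast and a regression-based trend, can be sketched as follows. This is not an ARIMA implementation; the function names, the window size of three, and the monthly sales figures are assumptions for illustration.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window

def linear_trend_forecast(history, steps_ahead=1):
    """Fit y = a + b*t by least squares (t = 0, 1, ...) and extrapolate the line."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n - 1 + steps_ahead)

monthly_sales = [100, 110, 120, 130, 140, 150]           # hypothetical units sold
print(moving_average_forecast(monthly_sales, window=3))  # 140.0
print(linear_trend_forecast(monthly_sales))              # 160.0
```

On this steadily growing series the moving average lags behind the trend, whereas the fitted line continues it. This illustrates why the choice of technique depends on whether the past values show a clear direction.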

Especially in the online context, several contact points in marketing and sales create the basis for big-data-based decisions, such as the bidding price for the positioning of an online advertising banner (cf. Malthouse and Li, 2017, p. 228). These unstructured big data sets are not suited to structural statistical analyses, such as ANOVAs; instead, they are better handled by statistical ML models (cf. Malthouse and Li, 2017, p. 230). Carbonneau, Laframboise, and Vahidov (2008, p. 1143) define the methodologies described so far as linear and therefore classical, and the following ones as more sophisticated forecasting techniques due to their superior performance. Guo, Wong, and Li (2013, p. 248) expand this differentiation by matching the ML techniques presented below, including AI, with use cases that also consist of non-time-series data. These ML methods are applied to large data sets and when high processing capacity is needed (cf. Rust and Huang, 2014, p. 218). The algorithms can also compute missing data in a timeline by calculating the mean of the neighboring values (cf. Guo, Wong, and Li, 2013, p. 249).
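The filling of missing values from neighboring values can be sketched in a few lines. The sketch assumes that a gap is marked with `None` and that the nearest observed value on each side counts as a neighbor; both are illustrative choices and not necessarily the procedure of Guo, Wong, and Li (2013).

```python
def fill_gaps_with_neighbor_mean(series):
    """Replace each None by the mean of the nearest observed values on either side."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # Search outwards in the original series for the closest observed values.
        left = next((series[j] for j in range(i - 1, -1, -1) if series[j] is not None), None)
        right = next((series[j] for j in range(i + 1, len(series)) if series[j] is not None), None)
        known = [v for v in (left, right) if v is not None]
        if not known:
            raise ValueError("series contains no observed values")
        filled[i] = sum(known) / len(known)
    return filled

daily_sales = [20, None, 30, 28, None, None, 40]  # hypothetical data with gaps
print(fill_gaps_with_neighbor_mean(daily_sales))  # [20, 25.0, 30, 28, 34.0, 34.0, 40]
```

A gap at the very start or end of the series has only one neighbor and simply takes that value.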

In general, ML models are categorized into unsupervised and supervised techniques: the former independently develop output data, whereas the latter are trained with predefined data. Supervised data mining classification models differ in that they focus on only one dependent variable (cf. Spangler, May, and Vargas, 1999, p. 38). Data science in marketing, for example, scales individual and targeted marketing campaigns to the masses through automated classification of shopper data and round-the-clock CRM on online platforms (cf. Rust and Huang, 2014, p. 209). According to the study by Hallikainen, Savimäki, and Laukkanen (2020, p. 95f), successful CRM has a positive effect on sales performance. Furthermore, the efficiency of marketing measures and a business’s return on investment (ROI) are improved (cf. Wedel and Kannan, 2016, p. 98). Latent Dirichlet Allocation (LDA) is the most popular classification approach (cf. Spangler, May, and Vargas, 1999, p. 41) and an exemplary algorithmic model for uncovering topics in textual content (cf. Liu, 2020, p. 5). LDA is very beneficial for grouping the data set into different classes for further, class-related analysis (cf. Spangler, May, and Vargas, 1999, p. 44). Tirunillai and Tellis (2014, p. 464f) describe this algorithm as an efficient tool to process textual data, including its structure, grammar, and syntax. They state that LDA can differentiate word meanings in distinct contexts and reduce biases in the authors’ brand analysis by excluding non-informative words and resolving particular words such as pronouns into their associated nouns. In relation to textual analysis, the classification method known as the decision tree can also be a helpful tool. It automatically selects the variables that best describe the differences between the newly generated groups, even when an additional data set is included (cf. Efron and Tibshirani, 1991, p. 394). Single decision trees are combined in the classification ML algorithms random forest or bagging. Random forests use several decision trees as training input for the model and can increase how precisely the model matches reality (cf. Ferreira, Lee, and Simchi-Levi, 2016, p. 76).

Additionally, the support vector machine (SVM) is a well-suited supervised ML technique for marketing predictions, such as consumer buying behavior and the demand linked to it (cf. Cui and Curry, 2005, p. 607). The SVM can integrate various formats of data into the analysis or its kernel, performing distinct analyses including probability measures, classification, or pattern recognition methods, and summarizing them in a concluding matrix (cf. Kitchens et al., 2018, p. 560-562). In the work of Kitchens et al. (2018, p. 543), for example, the SVM is used to determine future consumer churn and lifetime rate. Carbonneau, Laframboise, and Vahidov (2008, p. 1144) describe risk-reducing SVMs as superior and well suited to forecasting future events, such as demand and its related sales.

Lastly, the work describes in more detail the popular back-propagation neural network (cf. Guo, Wong, and Li, 2013, p. 248), a supervised ML algorithm illustrated in figure 7. It consists of a minimum of two different layers, where each output signal becomes the input for the following layer (cf. Kumar, Rao, and Soni, 1995, p. 252; Carbonneau, Laframboise, and Vahidov, 2008, p. 1143). Every additional layer creates a new so-called hidden layer between the input and the output layer, which increases the complexity but also the accuracy of the neural network (cf. Carbonneau, Laframboise, and Vahidov, 2008, p. 1143; Lu, Lee, and Lian, 2012, p. 587). The layers consist of nodes or neurons that receive input and produce output signals, and these in turn are connected through adjustable weights (cf. Lu, Lee, and Lian, 2012, p. 587). The neurons receive their input signals from real-time sensors or from every neuron in the prior layer. The activation function then computes weighted output values, which are condensed and transferred to the next neuron or layer (cf. West, Brockett, and Golden, 1997, p. 374f; Spangler, May, and Vargas, 1999, p. 43; Chiang, Zhang, and Zhou, 2006, p. 516). By constantly feeding the algorithm with new sample data and reviewing the output values, the deviations from the desired output are backpropagated (cf. Kumar, Rao, and Soni, 1995, p. 253). Consequently, the algorithm learns and can adjust the signal-transmitting weights as well as the output patterns, which include valuable insights for business decisions (cf. Ansuj et al., 1996, p. 422; Chiang, Zhang, and Zhou, 2006, p. 516). A connection or pathway between neurons becomes stronger as the related weight is increased (cf. West, Brockett, and Golden, 1997, p. 373). Lu, Lee, and Lian (2012, p. 587) state that the number of nodes in the layers influences the learning rate and therefore also the accuracy of the model. The structure of the algorithm is very similar to the neural structure of the human brain (cf. Spangler, May, and Vargas, 1999, p. 43; Chiang, Zhang, and Zhou, 2006, p. 516). Just like the brain, the algorithm takes time to compute these complex structures but learns with each new data input (cf. Shaw, 1993, p. 81). Such algorithms are applied in diverse forecasting areas, such as demand forecasting (cf. Carbonneau, Laframboise, and Vahidov, 2008, p. 1143f), consumer behavior (cf. Chiang, Zhang, and Zhou, 2006, p. 516), and online sales predictions including reviews or discounts (cf. Chong et al., 2016, p. 367).

Due to their popularity, there are many variants of neural networks, such as the recurrent neural network described by Carbonneau, Laframboise, and Vahidov (2008, p. 1143f). It allows hidden layer outputs to be reused as inputs for the same layer. Unlike the network described previously, it can recognize patterns beyond the specified time frame through its recurrent linkages. Kuo and Xue (1998, p. 106) address so-called fuzzy neural networks, which are designed to learn from complex inputs and consequently also compute outputs that are difficult to interpret. They identify this algorithm as especially effective for computing promotion effects (cf. Kuo and Xue, 1998, p. 113).

Figure 7: Structure of a back-propagation neural network

Illustration not included in this excerpt

(Chiang, Zhang, and Zhou, 2006, p. 522)
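The layered structure and the weight adjustment described for the back-propagation neural network can be made concrete with a deliberately small sketch. Everything specific here is an assumption for illustration: the class name `TinyNetwork`, three hidden neurons, sigmoid activations, a learning rate of 0.5, and the toy data in which high sales occur only when a discount and positive reviews coincide. It is not the architecture used in any of the cited studies.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """One hidden layer, sigmoid activations, trained by back-propagation."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one neuron; the last entry is its bias.
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        """Pass the signals from the input layer through the hidden layer to the output."""
        hidden_out = [sigmoid(sum(w * x for w, x in zip(neuron, inputs)) + neuron[-1])
                      for neuron in self.hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.output, hidden_out)) + self.output[-1])
        return hidden_out, out

    def train_step(self, inputs, target, rate=0.5):
        hidden_out, out = self.forward(inputs)
        # Error signal at the output neuron (squared error times sigmoid derivative).
        delta_out = (out - target) * out * (1.0 - out)
        # Propagate the error signal back to every hidden neuron.
        delta_hidden = [delta_out * self.output[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden_out)]
        # Adjust the connection weights against the gradient.
        for j, h in enumerate(hidden_out):
            self.output[j] -= rate * delta_out * h
        self.output[-1] -= rate * delta_out
        for j, neuron in enumerate(self.hidden):
            for i, x in enumerate(inputs):
                neuron[i] -= rate * delta_hidden[j] * x
            neuron[-1] -= rate * delta_hidden[j]

    def mean_squared_error(self, samples):
        return sum((self.forward(x)[1] - y) ** 2 for x, y in samples) / len(samples)

# Hypothetical inputs: (discount offered, positive reviews) -> high sales yes/no
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
network = TinyNetwork(n_inputs=2, n_hidden=3)
loss_before = network.mean_squared_error(samples)
for _ in range(2000):
    for inputs, target in samples:
        network.train_step(inputs, target)
loss_after = network.mean_squared_error(samples)
print(f"MSE before training: {loss_before:.4f}, after training: {loss_after:.4f}")
```

The error after training should be clearly lower than before, which mirrors the statement that the algorithm learns by adjusting its weights with each new input. Real forecasting networks use far more data and neurons and are usually built with a dedicated ML library.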

Turning to the second sub-area of supervised ML, the work addresses regression analyses, which are also among the most popular ML techniques in data science. Ordinary and linear least squares regressions measure the difference between observed and modelled values, which in turn indicates the accuracy of the regression model and the connection between the examined variables (cf. Hsiao et al., 2020, p. 6). Archak et al. (2011, p. 1503) identified the logistic regression model as the best-performing one for predicting next week’s sales performance from consumer reviews, prices, and sell-out data. In the online retailer analysis of Ferreira, Lee, and Simchi-Levi (2016, p. 74f), regression models such as (partial) least squares or principal components, along with regression trees, modelled the demand and therefore also the revenue loss of sold-out products. Principal component and partial least squares regressions use components of the explanatory variables as the regression basis for the dependent variable (cf. George et al., 2016, p. 20f). Regression trees compare the average of a dependent variable across defined value ranges of an independent variable, such as the average customer churn for customers older than thirty versus younger ones (cf. George et al., 2016, p. 22). Regression trees are related to the decision trees described above and are applied to identify product sales relations or price adjustments with respect to product quality and demand (cf. Ferreira, Lee, and Simchi-Levi, 2016, p. 76).
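The churn example for regression trees can be sketched as a tree with a single split. The sketch searches for the age threshold that minimizes the summed squared deviations within the two resulting groups and reports the average churn on either side. The function names and the customer data are hypothetical and chosen so that the split falls at the age of thirty.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the split with the lowest total SSE."""
    best = None
    candidates = sorted(set(xs))
    # Candidate thresholds lie halfway between neighboring distinct x values.
    for lower, upper in zip(candidates, candidates[1:]):
        threshold = (lower + upper) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, threshold, mean(left), mean(right))
    return best[1:]

ages = [22, 25, 28, 29, 31, 35, 42, 50]                    # hypothetical customers
churn = [0.30, 0.28, 0.32, 0.30, 0.12, 0.10, 0.11, 0.09]   # hypothetical churn rates
threshold, churn_younger, churn_older = best_split(ages, churn)
print(f"split at age {threshold}: average churn {churn_younger:.3f} below, {churn_older:.3f} above")
```

A full regression tree repeats this search within each resulting group until a stopping rule is reached.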

As unsupervised ML techniques are not yet as advanced in sales and marketing, the work focuses only on the sub-area of clustering. Additional techniques are illustrated above in figure 6. Clustering is especially helpful for reducing the computing time and complexity of algorithms. Successful marketing content is designed for the right group of customers or products and creates valuable information for shareholders (cf. Hajli et al., 2020, p. 5f). To identify the right audiences, the unsupervised clustering methods from Shaw et al. (2001, p. 129) are advantageous. The authors refer to concept clustering for identifying similar variable characteristics and terminology, such as in-store or online customer, and to mathematical taxonomy algorithms for automated, data-driven categorization of future customer spending. In the work of Ferreira, Lee, and Simchi-Levi (2016, p. 74), hierarchical clustering on hourly sales proportions identified the loss of turnover. Clustering models, but also SVMs, can be used to model the in-store demand of new customers by matching data from the established customer base (cf. Feng and Shanthikumar, 2018, p. 1673). With those findings, businesses can influence sales volumes through promotions and the related store or online traffic, manage product listings and shelf life, and focus marketing measures on high-potential customers (cf. Feng and Shanthikumar, 2018, p. 1674). As already confirmed by the early analysis of Morwitz and Schmittlein (1992, p. 404), clustering the input data significantly improves prediction methodologies, including ML applications.
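How clustering groups similar customers without predefined labels can be shown with a short sketch. It uses plain k-means, which is a different technique from the hierarchical and concept clustering mentioned in the cited studies and was chosen only because it is compact. The customer data and the use of the first k points as starting centroids are assumptions for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iterations=100):
    """Plain k-means; the first k points serve as starting centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids no longer move
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical customers: (store visits per month, average basket value in EUR)
customers = [(2, 15), (3, 18), (2, 20), (10, 60), (11, 65), (12, 58)]
labels, centroids = k_means(customers, k=2)
print(labels)      # [0, 0, 0, 1, 1, 1]
print(centroids)
```

The two groups, occasional low-spend shoppers and frequent high-spend shoppers, emerge from the data alone. In practice the variables would be scaled first so that no single variable dominates the distance.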

The last concept is sentiment analysis, which combines several of the analyses described above. Sentiment analyses are used to capture the variability of words that are identical but carry distinct meanings (cf. Sivarajah et al., 2017, p. 273). Liu (2020, p. 7), for instance, categorized tweets into positive or negative sentiments with a text-mining ML model. Such models extract product terminology from websites, chunk the information into sections of interest, and identify word relationships or related sentiments (cf. Netzer et al., 2012, p. 524). This makes it possible to find out what consumers really think about a product; it enriches sales predictions and increases consumer gratification while stock-outs are prevented (cf. Lau, Zhang, and Xu, 2018, p. 1792). Chong et al. (2016, p. 371) applied a sentiment analysis to hierarchically classify online reviews, which later formed the basis for their neural network sales forecasting. Fan, Che, and Chen (2017, p. 93), in turn, integrate their sentiment analysis into the prominent Bass model and predict the number of shoppers, which corresponds to the sales performance of consumer durables. To conclude this subchapter, all data mining methods in marketing and sales are documented separately in table 3. In addition, this table provides an overview of the related sources for this chapter.

Table 3: Data mining techniques in sales and marketing

Illustration not included in this excerpt

(Sources are included within the table)
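As a small illustration of the sentiment analysis described at the end of this subchapter, the following sketch classifies short review texts as positive, negative, or neutral. It relies on a hand-made word list and a simple negation rule instead of the trained text-mining ML models used in the cited studies; the word lists, function names, and sample reviews are all hypothetical.

```python
import re

# Hypothetical mini lexicons; real applications use much larger, validated lists.
POSITIVE = {"good", "great", "love", "tasty", "excellent", "fresh"}
NEGATIVE = {"bad", "stale", "awful", "expensive", "disappointing", "bland"}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(text):
    """Count positive minus negative words; a directly preceding negation flips the sign."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score

def classify(text):
    score = sentiment_score(text)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

reviews = [
    "Great taste, I love the new flavour",
    "Not good, the cookies were stale",
    "Arrived on Tuesday",
]
for review in reviews:
    print(classify(review), "|", review)
```

The negation rule hints at why identical words can carry distinct meanings: the word "good" counts as negative in the second review because it follows "not".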

3.3.3 Model evaluation and deployment

During modelling, it is important to check the model for biases, as it may not consider all the relevant underlying causes (cf. Malthouse and Li, 2017, p. 233). Omitted variables, such as a wealth measure in a demand model, can distort the result unless regulating covariates are integrated (cf. Rossi, 2014, p. 657). This explains the presence of the fifth step, model evaluation, and its interconnection with the previous modelling step. The naïve forecast is often applied as a performance benchmark for the sophisticated forecasting techniques described above; it predicts values of interest by using the same variable’s past values (cf. Carbonneau, Laframboise, and Vahidov, 2008, p. 1143). Attention is also paid to another common challenge of ML techniques, which is called overfitting. In the case of overfitting, the model fits the extracted data too closely, which impairs its predictive accuracy. To monitor this, the statistics-based root mean squared error indicates the model accuracy and the mean absolute percentage error measures its performance (cf. Fan, Che, and Chen, 2017, p. 94). Neural networks are especially prone to overfitting and to errors in the training set. This is prevented by cross-validation procedures, in which the error on the training and validation sets is constantly observed; as soon as the validation error increases, the optimal result has been obtained (cf. West, Brockett, and Golden, 1997, p. 377). To make sure that the result of the exemplary neural network analysis by Chong et al. (2016, p. 373) is error-free, they applied the popular t-test to the split and training sets. Additionally, Google Analytics is an excellent tool for collecting online customer data, and it also enables businesses to review the efficiency of marketing measures and the related change in sales (cf. Järvinen and Karjaluoto, 2015, p. 6). To conclude, this penultimate step can be compared to a security check at the airport. The modelling results are checked one last time and adjustments are made to the data science use case, its data, or the data mining methods.
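The evaluation measures named in this step can be written down in a few lines. The sketch compares a hypothetical model against the naïve forecast using the root mean squared error (RMSE) and the mean absolute percentage error (MAPE); the sales figures and the model output are invented for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(series):
    """Predict each value as the previous observation; covers series[1:]."""
    return series[:-1]

sales = [100, 110, 105, 120, 125]   # hypothetical weekly sales
actual = sales[1:]                  # the values to be predicted
benchmark = naive_forecast(sales)   # last week's value as the forecast
model = [108, 107, 118, 126]        # hypothetical output of a trained model

print(f"naive: RMSE {rmse(actual, benchmark):.2f}, MAPE {mape(actual, benchmark):.2f} %")
print(f"model: RMSE {rmse(actual, model):.2f}, MAPE {mape(actual, model):.2f} %")
```

A model is only worth deploying if it beats the naïve benchmark on data that it has not seen during training. A large gap between the training error and the test error is a typical sign of overfitting.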

In the sixth and final step, the best-fitting model is selected and deployed in the respective environment. The model then remains under regular observation, and its results are interpreted and implemented in conjunction with expert business knowledge. Even the early study by Lawrence, Edmundson, and O'Connor (1986, p. 1530) identified positive effects of combining different statistical forecasting techniques. The same applies to data mining techniques, whereby in the end the best-fitting model delivers the result. In general, authors state that ML methodologies, in particular sales forecasting neural networks, perform better than traditional statistics (cf. Chiang, Zhang, and Zhou, 2006, p. 525; Carbonneau, Laframboise, and Vahidov, 2008, p. 1143; Feng and Shanthikumar, 2018, p. 1672). In the area of marketing, Zhenning, Frankwick, and Ramirez (2016, p. 1564) nevertheless suggest merging big data marketing algorithms with traditional marketing analytics to overcome obstacles caused by missing knowledge. Similar suggestions are made by Ferreira, Lee, and Simchi-Levi (2016, p. 86), who examine sales forecasts that combine ML and classical optimization methods. With this presentation of the data mining process, its methodologies, and their applications in sales and marketing, the theoretical part of this work is complete. The following chapters focus on the interview research conducted, in which current data science trends, challenges, and data mining methods of the CPG companies are examined in more detail.

4 Research Design

4.1 Conceptual framework

The research methods applied to obtain the findings consisted of qualitative, in-depth, semi-structured interviews to collect expert knowledge, the Zaltman metaphor elicitation technique (ZMET), and qualitative content analysis according to Mayring to create a data science framework. For the second half of this analysis, semi-structured interviews were chosen as the primary data source. Interviews of this kind resemble a conversation and are designed for deep, insightful information gathering, including desires and emotions (cf. Harrell and Bradley, 2009, p. 27). In total, six interviews, each representing a different international CPG company, were conducted over a five-week period from the 26th of March to the 3rd of May 2021. For convenience and because of the pandemic, every interview was held online via video call, with an average duration of 39 minutes. To simplify the analysis, all answers were audio-recorded after the interviewees had received a data protection notice and given their permission. For this work, the target interview sample was chosen strategically to represent the selected consumer goods industry and the focus area of marketing and sales. The selection methodology applied is known as non-probability snowball sampling, in which the first interviewee is selected randomly and then provides contact information for other potential respondents. This procedure was especially helpful, as it is difficult to find suitable interview partners within suitable companies from an external position. The network LinkedIn was also very helpful in overcoming this challenge. Most of the interview partners work as data scientists in a large CPG company and are mostly responsible for the area of sales in a B2C environment. Each company represented in the interviews operates internationally and has its headquarters in Germany. Table 4 gives an overview of the conducted interviews. A number has been assigned to each interview, which serves as the basis for citation in the following chapters.

Table 4: Overview of the conducted interviews

Illustration not included in this excerpt

(Derived from the interview material)

The interview guide for the semi-structured interviews consists of standardized questions, while the interviewer retains the flexibility to ask follow-up questions or adjust the order of the questions (cf. Harrell and Bradley, 2009, p. 27). The guide also ensures that all important topics are covered during the interview, that the most critical information is gathered, and that the interviewer stays on track. Having an interview guide further helps to establish a trusting environment for the interviewee in which more sensitive questions can be asked. In addition, the guide ensures that the information is free from non-response errors, such as failure to answer a question because of unwillingness, inability, unavailability, or ambiguity arising from misunderstood questions. As a result, the quality criterion of objectivity is met, higher-quality information is gathered, and the ability to uncover useful insights in the subsequent analysis is improved.

The interview guide, and therefore every interview transcript, is structured in four sections. The first section asks simple questions about the interviewee’s current touchpoints with data science. Thereafter, the international organization of the data science team(s) in the companies is discussed. In the third section, the participants discuss the ZMET collage prepared by the interviewee. The ZMET is known as an indirect, abstract research approach in which the interviewee prepares a collage of at most eight pictures in relation to a given question. Because the interviewees were asked to create this ZMET collage about their thoughts and feelings related to data science at their company, they had actively engaged with the interview topic upfront and were ready to talk about it. Many memories are saved in the form of stories, and the pictures are a rich source of insights into the interviewee’s experience. Consequently, deeper insights or solutions to the visualized issues, which the participant may not have been able to convey without the ZMET, can be revealed in the last section. Due to time constraints, not all interview partners were able to prepare a ZMET collage upfront. In the last part of the interview, more in-depth questions are asked to gain an understanding of how, when, and why the businesses use data science methods and of the related challenges. Finally, the interviewee was given the opportunity to share further information or ask open questions.

After the interviews had been conducted, the recordings were transcribed for further analysis with the help of the online transcription software ‘Trint’. The complete interview transcripts as well as the interview guide and the ZMETs are attached in the appendix of this work. Sensitive and business-identifying information was removed from the transcripts to ensure data protection. Afterwards, the qualitative content analysis developed by Mayring in 1983 was applied to analyze the information given by the interviewees (cf. Mayring, 2000; Mayring, 2014). The content analysis, illustrated in figure 8, enables the textual interview data to be integrated into a model defining the goal of the analysis (cf. Mayring, 2000, p. 2).

Figure 8: The process of qualitative content analysis according to Mayring

Illustration not included in this excerpt

(cf. Mayring, 2000, p. 4; Mayring, 2014, p. 80)

In the first step of the analysis according to Mayring, the research objective and a concrete research question are determined (cf. Mayring, 2014, p. 10). Mayring (2014, p. 10) regards this as essential for establishing the relevance of qualitative research. In chapter 3.3, this work reviewed the existing literature on data science applications in marketing and sales. However, as can be observed in table 3, up-to-date literature from the areas of sales and marketing on this fast-developing data topic is scarce. Hence, expert interviews were conducted and used to fill this gap in the literature and to illustrate current trends in data science. The research topic for this work is already given in its title and can be formulated as the following


Pforzheim University
Big data, Smart data, Data Science, Sales, Marketing, Trends
Quote paper
Julia Ertel (Author), 2021, Big or Smart Data? Recent trends in Data Science for sales and marketing, Munich, GRIN Verlag,

