Deriving a big data analytics framework. Approaching the project management process for big data initiatives

A case study


Master's Thesis, 2018
42 Pages, Grade: 7

Excerpt

Contents

Abstract

A. Introduction
Background
Industry partner
Project objective
Significance

B. Research methodology
Business Process Management Lifecycle
Method overview
Literature review
Phase 1 - Extraction of literature
Phase 2 – Organization and preparation for analysis of artefacts
Phase 3 – Coding and analysis
Phase 4 – Write up and presentation
Interviews
Project management approach for this research project

C. Results
Summary of literature review results
Waterfall approach
Agile
CRISP-DM
Hybrid agile and waterfall approach
Summary of interview findings
Big Data Analytics Framework
1. Business Understanding
2. Understand and Prepare Data
3. Validate Business Understanding
4. Design Solution
5. Evaluate Solution
6. Validate Business Understanding
7. Deployment

D. Discussion

E. Conclusion

Abbreviations

References

Appendix 1 – Reflection of learnings and project logs

Appendix 2 – Literature review details

Appendix 4 - Questionnaire

Appendix 5 – Transcript interview

Appendix 6 – Transcript interview

Appendix 7 – Transcript interview

Appendix 8 – Transcript interview

Appendix 9 – Transcript interview

Abstract

This case study investigated the project management approach for big data projects at the industry partner, Red Rocks Company. Big data is considered a key enabler for future decision making and process automation. However, the topic is still new and not yet well understood. As a result, 50% of big data projects do not deliver the expected benefits and cost more than initially planned. First, a brief literature review was undertaken to find out how big data projects are managed. From this, a Big Data Analytics Framework based on CRISP-DM was derived. Second, the framework was validated through interviews with stakeholders from the corporate sector. For this case study, the first three phases of the Business Process Management Lifecycle were applied: process discovery, analysis and design. A key finding is that the literature recommends an agile project management approach for big data initiatives. In contrast, the majority of interviewed industry stakeholders reported that a waterfall approach is used more often to deliver such projects. The developed Big Data Analytics Framework was validated and is expected to benefit Red Rocks Company by helping it to deliver big data initiatives successfully in future.

A. Introduction

Background

Big data has been a topic of high attention for organisations in the past 10 years. Heudecker defines big data as "high volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation" (2013, p. 2). A recent study confirmed that 82% of organisations believe that big data gives them a competitive advantage in today’s fast-paced business environment (Vanson Bourne, 2015). In a global and highly connected world, it enables organisations to identify patterns and to make smart business decisions based on facts delivered by big data platforms. However, 50% of organisations state that big data initiatives have been more expensive than originally expected (Hendershot, 2016). The reason is that the topic is rather new and not well understood; big data initiatives are therefore often underestimated and do not return the expected benefits to organisations.

The aim of this project is to understand best practice project management for big data initiatives and to develop a framework to help such projects to deliver the expected advantages.

Industry partner

With over 53,000 employees, the Red Rocks Company is one of the largest global mining operators with mines in Australia, Chile, the USA and many other countries. The organisation’s technology vision is “to enable a fully integrated and highly automated business from resource to market by 2025” (Red Rocks Company. 2018). Big data is a fundamental building block to achieve a “fully integrated business”.

Red Rocks Company is in the process of establishing a big data platform which will be used by at least 12 different customers (mine sites). Expertise in this area of the business is limited, so the organisation is seeking to understand the challenges and considerations in managing big data initiatives and the associated best practice in project management.

Project objective

This project seeks to understand best practice project management for big data initiatives. First, it will undertake a preliminary literature review to derive a framework for managing big data initiatives. Second, it will validate the framework through interviews with project team members, internal and external to the industry partner Red Rocks Company, who have been involved in successful big data projects within the corporate sector. The audience for this work is industry partners and academic researchers in the big data discipline.

Significance

Big data is a growing area in which organisations are investing to develop capabilities. Because the topic is novel, the project will deliver research of significance to both practitioners and researchers. It will provide insights into how to manage big data projects, which will help Red Rocks Company to achieve its 2025 vision of a “fully integrated and highly automated” business.

B. Research methodology

Business Process Management Lifecycle

The project predominantly followed the Business Process Management (BPM) Lifecycle as per Dumas (2016) to develop a standardised process. The aim of this model is to improve the performance of a process. The BPM Lifecycle is structured into five phases (see Figure 1 - The BPM Lifecycle model (Dumas. 2016)) and provides a simple and easy to follow structure to achieve continuous improvement in Business Process Management.

This model was selected because it is considered industry best practice (Bernado et al. 2017) and it aligns with methodologies currently applied within Red Rocks Company.

[Figure not included in this excerpt]

Figure 1 - The BPM Lifecycle model (Dumas. 2016)

One of the key stakeholders of Red Rocks Company identified this process and requested that it be reviewed as part of this work. As part of process discovery and analysis, a literature review and interviews were undertaken. An outcome of the review was the development of the new framework, which fits into the process redesign phase of the BPM Lifecycle model.

For this project, only the first three of the five phases are in scope (see Figure 2 - In scope phases for this project as per BPM Lifecycle). This is mainly due to the project execution time of 12 weeks which is too restrictive to implement a new process in a large organisation like Red Rocks Company and to receive performance insights via a feedback loop. Perhaps this will be part of a follow-on project.

[Figure not included in this excerpt]

Figure 2 - In scope phases for this project as per BPM Lifecycle model

Method overview

A combination of methods was applied in this project. Initially, a brief literature review was undertaken from which a Big Data Analytics Framework was derived. This was then validated through interviews.

[Figure not included in this excerpt]

Figure 3 - Method overview

Literature review

The project undertook a brief literature review of nine artefacts, following the guidelines by Bandara et al (2015). These guidelines were chosen because they provide a framework that has been validated within academia.

Phase 1 - Extraction of literature

The first phase is the extraction of literature. An online search was conducted in the QUT library and Google Scholar to find relevant literature. Both were used to ensure that no important articles were missed. Google Scholar is an online search engine for academic publications; the author had not used it before and was curious to compare its search results with those of the library. The Gartner, Project Management Institute (PMI) and McKinsey databases were also searched for industry-relevant literature that was not available in the QUT library or Google Scholar. The following search terms were used:

- Big data and project management
- Big data and project management approach

The initial search returned the results shown in Table 1. The relevant articles were then coded using their top 3 main ideas.

[Table not included in this excerpt]

Table 1 - Literature review search results

Phase 2 – Organization and preparation for analysis of artefacts

The top documents from each search were reviewed and categorised in the table below. Initially, each abstract was reviewed for paragraphs relevant to “big data” and “project management”. All reviewed artefacts had some content on big data, but not all presented insights on the project management approach. The top 3 main ideas were noted in the coding table for each article. Early on, a trend was identified that big data projects are suited to an agile project management approach and that CRISP-DM is a foundational process. Two additional criteria were therefore added to the evaluation table to mark the articles that supported these ideas. The criteria used for literature review success were:

- Contains relevant information and insights on big data projects
- Contains information on the project management approach
- Supports the idea of using an agile approach for big data projects

The literature analysis was limited to nine articles that contain content of value to the research topic. Other artefacts were reviewed by reading the abstract; however, most of them addressed the theme at a very high level and did not add new knowledge. Further, contemporary articles were chosen from different industries to ensure good questionnaire design.

Phase 3 – Coding and analysis

An initial coding and analysis was undertaken as per the table below. The data were classified by title, source, author, year of publication and top 3 main ideas. Two classifiers were established to note whether the article supported the idea that an agile approach is suited to big data projects and whether it supported the CRISP-DM process.

[Table not included in this excerpt]

Table 2 - Literature review coding and analysis table

The coding and analysis were undertaken in Microsoft Excel and can be found in Appendix 2 – Literature review details.
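As an illustration only, one row of the coding table described above could be represented as a small record. This is a minimal sketch, not the author's Excel workbook: the class name, field names and the placeholder values are assumptions chosen to mirror the columns named in the text (title, source, author, published year, top 3 main ideas and the two classifiers).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodedArtefact:
    """One row of a literature coding table (hypothetical field names)."""
    title: str
    source: str
    author: str
    published_year: int
    main_ideas: List[str] = field(default_factory=list)  # top 3 main ideas
    supports_agile: bool = False      # classifier 1: agile suits big data projects
    supports_crisp_dm: bool = False   # classifier 2: supports the CRISP-DM process

    def __post_init__(self) -> None:
        # The text codes each article by at most three main ideas.
        if len(self.main_ideas) > 3:
            raise ValueError("at most three main ideas per artefact")


# Placeholder row; the values are invented purely to show the shape of a record.
example = CodedArtefact(
    title="Placeholder title",
    source="Placeholder source",
    author="Placeholder author",
    published_year=2016,
    main_ideas=["idea 1", "idea 2", "idea 3"],
    supports_agile=True,
)
```

Keeping the two classifiers as separate boolean fields makes it straightforward to count how many artefacts support each idea once all rows are coded.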

Phase 4 – Write up and presentation

This phase was undertaken as part of Section C - Results.

Interviews

Interviews were selected as the method to validate the Big Data Analytics Framework. Validation of theoretical models is of significant importance because it shows that the model is sound, and it is required for the model's ongoing application (MacKenzie et al, 2011). Interviews provided objective answers to validate the framework within a short period of time. A survey with a questionnaire would have been too subjective. A questionnaire to guide the interviews was developed based on the literature review and stakeholder input. It can be found in Appendix 4 - Questionnaire. The questionnaire had eight focus areas:

1. Confirm the size and complexity of the big data environment to enable comparison
2. Confirm the initial goal of the big data project so that it can be categorised as operational or innovative
3. Discovery steps undertaken
4. Project management methodology used
5. Challenges encountered
6. Benefits derived
7. Stakeholder management
8. Application of the CRISP-DM framework.

Five face-to-face interviews were held at Red Rocks Company and elsewhere over a two-month timeframe in Brisbane, Australia. Interviews were recorded and later transcribed for research purposes. Despite the small number of interviewees, several disciplines were represented, including senior project managers, technical infrastructure staff and data scientists. All interviews related to big data projects in the corporate sector and varied in duration from 7 to 30 minutes.

[Table not included in this excerpt]

Table 3 - Summary of interviews

Project management approach for this research project

Scrum, an agile project management framework, was selected for this project. Schwaber et al (2017, p. 3) define Scrum as “a framework within which people can address complex adaptive problems, while productively and creatively delivering products of the highest possible value”.

There were five sprints, each two weeks long, over the course of the project. A product backlog was established to manage the requirements of the project.

[Figure not included in this excerpt]

Figure 4 - Sprint and task breakdown overview

Key deliverables of this project include:

- Brief literature review
- Big Data Analytics Framework
- Validation through interviews:
  - Questionnaire
  - Interviews and transcripts
- Case study report to present findings.

The main reason for selecting Scrum for this project is that it is simple to understand and provides an early feedback mechanism for stakeholders which is important for the success of the project. Hotle (2017) states that early stakeholder feedback is one of the key benefits of a Scrum agile approach.

C. Results

Summary of literature review results

Nine artefacts were reviewed in the initial literature review. Of these, five articles confirmed that agile is the recommended project management approach for big data projects, one referred to a waterfall-based approach and three had no information to confirm the project management approach. Although these three artefacts provided no evidence on the project management approach, they were still listed as they contained important insights into the significance of big data initiatives.

[Table not included in this excerpt]

Table 4 - Literature review results
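The split reported for the nine artefacts can be expressed as shares with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated in the text; the category labels are illustrative and are not taken from Table 4.

```python
from collections import Counter

# Counts as reported in the literature review summary; labels are illustrative.
approach_counts = Counter({"agile": 5, "waterfall": 1, "not stated": 3})

total = sum(approach_counts.values())

# Share of each category as a percentage of all reviewed artefacts.
shares = {
    approach: round(100 * count / total, 1)
    for approach, count in approach_counts.items()
}

for approach, share in shares.items():
    print(f"{approach}: {approach_counts[approach]} of {total} ({share}%)")
```

Read this way, agile is supported by a little over half of the reviewed artefacts, while a third give no indication of the project management approach.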

Two of the nine artefacts were case studies, and they provided contradicting recommendations. Frankova et al (2016) undertook a survey of conference participants which determined that agile is the preferred approach for big data projects. The article recommends starting with a small use case, learning from failures and continuing with an iterative approach. In contrast, Dutta et al (2015) suggest using a waterfall approach with a thoroughly prepared project plan to ensure successful execution, deployment and acceptance by the end users. Their reasoning is that big data projects are new to most organisations and more difficult to implement. Big data projects require significant change management to adjust the mindset of users so that they make decisions based on the data provided instead of intuition.

The different industries from which the case studies were drawn could explain the contrary opinions. The work undertaken by Frankova et al was based on a survey at a conference, whereas that of Dutta et al was based on a case study that implemented a big data solution at a cement manufacturer, which is culturally close to the resources sector in which Red Rocks Company operates. This may lead to the conclusion that the primary resources sector is more aligned to a waterfall project management approach because the industry establishes large, complex and long-lived assets like mine sites and plants. No evidence was found in the literature to support this conclusion, and further research into this particular topic may be required outside of this work.

Waterfall approach

The waterfall model is the traditional project management approach, which uses a sequential design flow. First applied as far back as the 1950s, it is still popular for engineering and construction projects. The Ramco case study (Dutta et al, 2015) confirmed that waterfall is a recommended approach for big data analytics projects. Further, the majority of the interviewed projects followed this approach. It was noted that the industry background of both the case study and the interviews is the resources sector, namely mining or oil and gas.

Agile

The Agile Manifesto was first defined in 2001 by a group of independent software developers (Schneider, 2017). As per Schneider (2017, p. 2), the Agile mindset is all about building the right solution today and acknowledging that this might not be the right solution tomorrow. Compared to the waterfall approach, it is rapid and iterative. Agile focuses on quality whilst applying continuous improvement principles. The literature review identified that the majority of the reviewed artefacts (five of nine) recommended an agile project management approach for big data analytics projects. In contrast, only one interview confirmed this, and another interview stated that a hybrid approach of waterfall and agile was applied.

CRISP-DM

The Cross-Industry Standard Process for Data Mining, also referred to as CRISP-DM, was first defined by a group of data scientists in 1999 (Chapman et al, 1999). Although not a project management methodology, it is a data analysis process that provides the basis for a significant number of the data analytics frameworks known today. The initial literature review did not recognise CRISP-DM as a process for big data projects. One of the interviewees uses the approach, and two other interviews confirmed that, although a waterfall approach was officially used, CRISP-DM was applied intuitively. An additional literature review confirmed that CRISP-DM still provides the fundamentals behind successful data analytics projects today (Mariscal et al, 2010).

Hybrid agile and waterfall approach

Hayata et al (2011) state that a hybrid agile and waterfall approach is an evolving trend within organisations. Organisational change takes time and, as technology teams are accustomed to their traditional waterfall way of working, the transition to an agile organisation can take many years. A hybrid approach allows the organisation to practise some of the agile techniques while remaining in a waterfall-based world (Hotle et al, 2018). The initial literature review did not find supporting evidence for a hybrid agile and waterfall project management approach; the majority of the documents clearly endorsed an agile approach. However, the interviews confirmed that a hybrid approach was used, although sometimes it was not prescribed and was applied unknowingly. Two of the interviewed projects operated in a waterfall approach. However, when the interviewees were questioned on the techniques and processes used for the data analytics, such as CRISP-DM, it was confirmed that this process had been followed unknowingly. Further, the project in interview 1 used a very pure agile approach, which enabled the team to quickly commission a working solution for the organisation. However, one of the challenges encountered was insufficient licensing, which could have been prevented by using a traditional waterfall approach. This suggests that a hybrid agile and waterfall approach would be more suitable for this organisation.

Summary of interview findings

The results of the interviews are rather broad. A correlation is evident between projects of an operational character (e.g. development of a Spotfire report for operational data) and the use of a waterfall project management approach. Further, projects that followed an agile or hybrid agile and waterfall approach had a more innovative, incubator character (e.g. predictive maintenance data analytics). Projects that were delivered in a waterfall approach still initiated a constant feedback loop with the customer, which is indicative of an agile approach. This suggests that project personnel intuitively seek constant stakeholder feedback to ensure the success of the project, although this is not prescribed in the waterfall approach. Overall, the majority of the individual steps of the Big Data Analytics Framework were validated during the interviews.

[Table not included in this excerpt]

Table 5 - Interview findings

Big Data Analytics Framework

The below Big Data Analytics Framework has been derived from the findings of the literature review and interviews as a suitable approach for the Red Rocks Company to undertake big data projects. The framework has seven steps and is based on a hybrid agile and waterfall approach. The foundations of the approach are based on CRISP-DM.

[Figure not included in this excerpt]

Figure 5 - Big Data Analytics Framework

1. Business Understanding

“Business Understanding” is the first step of the Big Data Analytics Framework. The key goal of this phase is to understand the objectives of the project and the business requirements. This is also the phase in which the current AS IS status is determined, including an inventory of resources. The interviews confirmed that business understanding is critical. Interview 5 in particular highlighted that “creating a gap analysis to help understand the business” was important for success. Brocchi et al (2016) state that a business-driven approach is one of the core principles of any digital transformation. Under this model, organisations create use cases and requirements whilst taking inventory of data associations for different use cases and opportunities. This step is followed by “Checkpoint 1” to determine whether the initiative should still progress or needs to be reprioritised. It is followed by step “2. Understand and Prepare Data”.
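The step sequence and the gate after step 1 can be sketched as a simple ordered list with a progression function. This is an illustrative sketch under stated assumptions: the step names are taken from the table of contents, only “Checkpoint 1” is modelled because it is the only checkpoint described in this excerpt, and the function name `next_step` and its parameters are hypothetical.

```python
from typing import Optional

# Step names as listed in the table of contents of the thesis.
FRAMEWORK_STEPS = [
    "Business Understanding",
    "Understand and Prepare Data",
    "Validate Business Understanding",
    "Design Solution",
    "Evaluate Solution",
    "Validate Business Understanding",
    "Deployment",
]


def next_step(current: int, checkpoint_passed: bool = True) -> Optional[int]:
    """Return the 1-based number of the next step.

    Returns None when the initiative stops: either it is reprioritised at
    Checkpoint 1 (after step 1) or the last step has been completed.
    """
    if not 1 <= current <= len(FRAMEWORK_STEPS):
        raise ValueError("step number out of range")
    if current == 1 and not checkpoint_passed:
        return None  # Checkpoint 1: initiative is reprioritised
    if current == len(FRAMEWORK_STEPS):
        return None  # Deployment is the final step
    return current + 1
```

For example, `next_step(1, checkpoint_passed=True)` yields 2, the number of “Understand and Prepare Data”, whereas `next_step(1, checkpoint_passed=False)` yields `None` to signal that the initiative does not progress.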

[...]


Details

Title
Deriving a big data analytics framework. Approaching the project management process for big data initiatives
Subtitle
A case study
College
Queensland University of Technology  (Faculty of Science and Engineering)
Course
Master in Business Process Management
Grade
7
Author
Year
2018
Pages
42
Catalog Number
V499625
ISBN (eBook)
9783346033536
Language
English
Notes
This project was marked with high distinction.
Tags
BigData, ProjectManagement, Agile, BigDataAnalytics, Waterfall, CRISPDM
Quote paper
Theres Mitscherling (Author), 2018, Deriving a big data analytics framework. Approaching the project management process for big data initiatives, Munich, GRIN Verlag, https://www.grin.com/document/499625
