
Process Mining and Network Protocols

Probing the application of process mining techniques and algorithms to network protocols


Thesis, 2015, 111 pages, grade: 1

Author: Matthias Leeb

Computer Science - IT-Security

Process mining is the link between computational intelligence, data mining, process modeling and analysis. The thesis shows how this research discipline can be applied to network protocols and what the benefits are. Process mining is based on event data, which almost every information system logs. This event data is extracted, transformed and loaded into the process mining tool in order to discover the underlying process, check its conformance or enhance it based on observed behavior. The main achievements are determining the significance of process mining in the field of network protocols and their control flow, finding the best possible algorithms and notation systems, clarifying the prerequisites and providing a proof of concept. Additionally, other reasonable and beneficial applications are investigated, such as mining an alternative protocol, dealing with a large amount of event data and estimating how much event data is necessary.

Excerpt


Table of Contents

1. Introduction

1.1. Process Mining

1.2. Business processes and network protocols

1.3. Vision

1.4. Idea, leading questions and strategy

1.5. Outcome

1.6. Structure of thesis

2. Process Mining and related topics

2.1. The BPM life-cycle

2.2. Process modeling notations

2.3. Positioning process mining

2.4. Process models, analysis and limitations

2.4.1. Model-based process analysis

2.4.2. Limitations

2.5. Perspectives of process mining

2.6. Types of process mining

2.6.1. Play-in

2.6.2. Play-out

2.6.3. Replay

2.7. Discussion

2.7.1. Discovery

2.7.2. Conformance

2.7.3. Enhancement

2.8. Findings

3. Properties and quality

3.1. Event data

3.1.1. Quality criteria and checks

3.1.2. Extensible event stream

3.2. Notation frameworks

3.3. Evaluation of algorithms

3.3.1. Problem statement

3.3.2. What “Disco” does

3.3.3. Challenges for algorithms and notation systems

3.3.4. Categorization of process mining algorithms

3.3.5. Algorithms and plug-ins for control-flow discovery

3.3.6. Fuzzy Miner

3.4. Process models

3.5. Findings - The weapons of choice

4. Prerequisites and -processing

4.1. Data extraction

4.2. Data transformation

4.3. Load data

4.4. Automating the ETL procedure for TCP

4.5. Findings

5. Proof of Concept

5.1. Mining TCP with Disco

5.1.1. Extracting relevant information

5.1.2. Results

5.2. Discussion

5.2.1. Recorded activities

5.2.2. Sequences

5.2.3. Limitations

5.3. Mining TCP with RapidMiner

5.3.1. Adjustments in the results perspective

5.4. Findings

6. Reasonable applications, adaptions and enhancements

6.1. Mining HTTP

6.1.1. Results

6.1.2. Discussion

6.2. Moving towards bigger captures

6.2.1. SplitCap

6.2.2. Adaptions to the ETL script

6.3. Protocol reverse engineering

6.3.1. Gathering data

6.3.2. Results

6.3.3. Discussion

6.4. Findings

7. Conclusion

A. Data

A.1. Example PNML file

A.2. Self-captured

A.2.1. tcpCapture.pcap

A.2.2. httpCapture.pcap

A.3. External

A.4. RapidMiner structure

B. Tools and software

B.1. Disco

B.2. Perl

B.3. ProM

B.4. R

B.5. RapidMiner and RapidProM

B.6. RStudio

B.7. Ruby

B.8. SplitCap

B.9. tshark

B.10. Wireshark

B.11. WoPeD

C. Sourcecode

C.1. Script tcp_pcap2xes.rb

C.2. Script http_pcap2xes.rb

C.3. Script tcp_splitPcaps2xes.rb

D. Glossary

Research Objectives and Focus Areas

This thesis explores the application of process mining techniques to network protocols to gain fact-based insights into communication behavior, performance, and conformance. The primary research goal is to bridge the operational gap between network traffic captures and process mining tools, enabling the automatic discovery and analysis of protocol control flows.

  • Application of process mining to network protocols (TCP, HTTP).
  • Development of automated ETL (Extract, Transform, Load) procedures for network traffic.
  • Evaluation of process mining algorithms and tools for protocol analysis.
  • Statistical methods to estimate sufficient event data for robust process models.

Excerpt from the Book

1.1. Process Mining

Van der Aalst et al. define process mining in their manifesto as follows:

“Process mining is a relatively young research discipline that sits between computational intelligence and data mining on the one hand, and process modeling and analysis on the other hand. The idea of process mining is to discover, monitor and improve real processes (i.e., not assumed processes) by extracting knowledge from event logs readily available in today’s (information) systems.”[79, p. 1]

Whether or not a BPM is already in place, process mining is the technology to discover or enhance processes, or to check them for conformance, based on event data. Several tools and algorithms support extracting and visualizing processes from event logs. Process mining can be used in a large variety of application domains. The techniques are based on event data written by information systems.
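To make the idea of discovery from event data concrete, the following is a minimal sketch of counting directly-follows relations, a basic building block that many discovery techniques start from. It is not taken from the thesis; the function name and the sample log (TCP flag combinations as activities, one trace per connection) are assumptions chosen for illustration.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair every event with its successor inside the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical event log: each trace is one connection,
# each activity is the flag combination of one packet.
log = [
    ["SYN", "SYN-ACK", "ACK", "FIN-ACK", "ACK"],
    ["SYN", "SYN-ACK", "ACK", "RST"],
]

dfg = directly_follows(log)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting counts can be read as a weighted graph: activities are nodes and each counted pair is an edge labeled with its frequency.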

Summary of Chapters

1. Introduction: Presents the motivation, vision, and core research questions regarding the application of process mining to network protocols.

2. Process Mining and related topics: Explains fundamental concepts, perspectives, and types of process mining within the BPM lifecycle.

3. Properties and quality: Details the requirements for event data, discusses various mining algorithms, and evaluates their suitability for protocol discovery.

4. Prerequisites and -processing: Describes the technical ETL pipeline used to extract and transform network traffic data into a process-mineable format.

5. Proof of Concept: Demonstrates the practical application of process mining on TCP traffic using tools like Disco and RapidMiner.

6. Reasonable applications, adaptions and enhancements: Explores mining HTTP and handling large network captures, and introduces a metric for data adequacy.

7. Conclusion: Summarizes the findings and discusses the overall effectiveness of using process mining for network protocol analysis.
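The transform and load steps summarized for chapter 4 can be illustrated with a small sketch that turns packet records into an XES event log. This is not the thesis's `tcp_pcap2xes.rb` script, whose contents are not shown here. It assumes the extraction step has already produced one record per packet, and the dictionary keys `stream`, `time` and `flags` as well as the function name are hypothetical. Only the standard XES attribute keys `concept:name` and `time:timestamp` are used.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime, timezone


def packets_to_xes(packets):
    """Group packet records by stream and emit a minimal XES log as a string."""
    by_stream = defaultdict(list)
    for pkt in packets:
        by_stream[pkt["stream"]].append(pkt)

    log = ET.Element("log", {"xes.version": "1.0"})
    for stream_id in sorted(by_stream):
        # One trace (case) per stream.
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string",
                      {"key": "concept:name", "value": str(stream_id)})
        # One event per packet, ordered by capture time.
        for pkt in sorted(by_stream[stream_id], key=lambda p: p["time"]):
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string",
                          {"key": "concept:name", "value": pkt["flags"]})
            ts = datetime.fromtimestamp(pkt["time"], tz=timezone.utc).isoformat()
            ET.SubElement(event, "date",
                          {"key": "time:timestamp", "value": ts})
    return ET.tostring(log, encoding="unicode")


# Hypothetical, deliberately unordered sample records.
sample_packets = [
    {"stream": 0, "time": 1.0, "flags": "SYN"},
    {"stream": 0, "time": 1.2, "flags": "ACK"},
    {"stream": 0, "time": 1.1, "flags": "SYN-ACK"},
    {"stream": 1, "time": 2.0, "flags": "SYN"},
]

xes = packets_to_xes(sample_packets)
print(xes)
```

Mapping each stream to a trace and each packet's flag combination to an activity is one possible design choice; other mappings, such as using protocol states as activities, would need a different transform step.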

Keywords

Process Mining, Network Protocols, TCP, HTTP, ETL, Event Logs, Fuzzy Miner, Data Extraction, Process Discovery, Conformance Checking, Performance Analysis, Network Traffic, XES, RapidMiner, Disco

Frequently Asked Questions

What is the core focus of this thesis?

The work investigates the applicability of process mining techniques to the domain of network protocols, specifically aiming to discover, monitor, and improve protocol-based communication through event data analysis.

What are the central thematic fields?

The thesis covers the intersection of information security, process modeling, network traffic analysis, and the development of specialized ETL (Extract, Transform, Load) procedures for converting packet captures into process logs.

What is the primary research goal?

The primary goal is to determine if and how process mining perspectives and types can be successfully applied to network protocols to gain fact-based insights without relying on manual, error-prone analysis.

Which scientific method is utilized?

The author performs systematic literature research followed by an empirical proof of concept, where automated scripts are developed to process real network traffic data using mining tools like Disco and RapidMiner.

What topics are discussed in the main part?

The main part covers the theoretical foundations of process mining, requirements for log quality, an evaluation of various discovery algorithms, and the development of an automated pipeline to handle network protocols like TCP and HTTP.

What are the defining keywords for this work?

The most important keywords include Process Mining, Network Protocols, TCP, HTTP, ETL, Event Logs, Fuzzy Miner, and Process Discovery.

How does the author handle large network captures?

The thesis proposes using specialized tools like SplitCap to decompose large packet captures (e.g., 17 GB) into individual streams in a single pass, significantly reducing processing time from days to hours.
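The benefit of a single pass can be sketched as follows. This is not SplitCap's implementation; it only illustrates the general idea of assigning every packet to its stream while reading the capture once, rather than scanning the whole capture again for each stream. The record keys `src`, `sport`, `dst` and `dport` and both function names are assumptions.

```python
from collections import defaultdict


def flow_key(pkt):
    """Direction-independent key so both directions of a connection match."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (min(a, b), max(a, b))


def split_by_flow(packets):
    """Assign each packet to its flow while iterating over the input once."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return flows


# Hypothetical packet records from two connections.
packets = [
    {"src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.2", "dport": 80},
    {"src": "10.0.0.2", "sport": 80, "dst": "10.0.0.1", "dport": 40000},
    {"src": "10.0.0.3", "sport": 40001, "dst": "10.0.0.2", "dport": 80},
]

flows = split_by_flow(packets)
print(len(flows), "flows")
```

With this approach the work grows with the number of packets only, whereas filtering the capture once per stream grows with the number of packets multiplied by the number of streams.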

Why is the "Fuzzy Miner" highlighted?

The Fuzzy Miner is identified as the algorithm of choice because it is implemented in most standard tools, handles noise effectively, and offers adjustable parameters that allow for granular control over process simplification.

What metric is introduced to measure data sufficiency?

The author introduces a metric based on the 'Information Value' (sum of activities and transitions) to calculate the 'Average Information Gain' (aig), which helps predict when enough event data has been collected to mine a stable process model.
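A rough sketch of how such a metric could behave is given below. The exact definitions are not reproduced in this excerpt, so two assumptions are made: the information value is taken as the number of distinct activities plus the number of distinct directly-follows transitions seen so far, and the average gain is taken as the mean increase of that value per added trace over a trailing window. All function names and the sample traces are hypothetical.

```python
def information_value(traces):
    """Assumed definition: distinct activities plus distinct transitions."""
    activities = set()
    transitions = set()
    for trace in traces:
        activities.update(trace)
        transitions.update(zip(trace, trace[1:]))
    return len(activities) + len(transitions)


def information_curve(traces):
    """Information value after the first 1, 2, ..., n traces."""
    return [information_value(traces[:n]) for n in range(1, len(traces) + 1)]


def average_gain(curve, window):
    """Assumed definition: mean increase per trace over the last `window` traces."""
    recent = curve[-(window + 1):]
    if len(recent) < 2:
        raise ValueError("need at least two points to compute a gain")
    return (recent[-1] - recent[0]) / (len(recent) - 1)


# Hypothetical traces: the second one adds new behavior, later ones do not.
traces = [
    ["SYN", "SYN-ACK", "ACK"],
    ["SYN", "SYN-ACK", "ACK", "FIN"],
    ["SYN", "SYN-ACK", "ACK"],
    ["SYN", "SYN-ACK", "ACK", "FIN"],
]

curve = information_curve(traces)
print(curve, average_gain(curve, 2))
```

Under these assumptions, a gain that stays at or near zero over a sufficiently long window suggests that additional traces no longer reveal new activities or transitions.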


Details

Title: Process Mining and Network Protocols
Subtitle: Probing the application of process mining techniques and algorithms to network protocols
University: St. Pölten University of Applied Sciences (Informatik & Security)
Grade: 1
Author: Matthias Leeb
Year of publication: 2015
Pages: 111
Catalog number: V308134
ISBN (eBook): 9783668066120
ISBN (Book): 9783668066137
Language: English
Tags: Process Mining, Event data, discovery, conformance, enhancement, process, prom, disco, algorithm, notation system
Product safety: GRIN Publishing Ltd.
Cite this work: Matthias Leeb (Author), 2015, Process Mining and Network Protocols, Munich, GRIN Verlag, https://www.grin.com/document/308134