Network measurements play an essential role in operating and developing today's
Internet. A variety of measurement applications require multipoint
network measurements: service providers need to validate the delay guarantees
in their Service Level Agreements, and network engineers want to
track where packets are changed, reordered, lost or delayed. Multipoint measurements
generate an immense amount of measurement data, which in turn demands a
resource-intensive measurement infrastructure. Data selection techniques, such as sampling
and filtering, reduce resource consumption efficiently while
still retaining sufficient information about the metrics of interest. But not all
selection techniques are suitable for multipoint measurements; only deterministic
filtering allows a synchronized selection of packets at multiple observation points.
However, a filter bases its selection decision on the packet content and hence
is susceptible to bias, i.e. the selected subset is not representative of the whole population.
Hash-based selection is a filtering method that tries to emulate random
selection in order to obtain a representative sample for accurate estimation of
traffic characteristics.
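The idea of hash-based selection can be illustrated with a minimal sketch. It assumes that each observation point hashes the same invariant packet bytes and selects the packet when the hash value falls below a threshold derived from the target sampling fraction. The function name `select_packet`, the synthetic packet strings, and the use of CRC32 are illustrative assumptions only; the abstract does not say which hash functions or packet fields the thesis evaluates or recommends.

```python
import zlib


def select_packet(invariant_bytes: bytes, sampling_fraction: float) -> bool:
    """Return True if the packet's hash value falls inside the selection range.

    CRC32 is only a stand-in hash for this sketch (an assumption, not a
    recommendation from the thesis).
    """
    hash_value = zlib.crc32(invariant_bytes)  # unsigned value in 0 .. 2**32 - 1
    threshold = int(sampling_fraction * 2**32)  # upper bound of the selection range
    return hash_value < threshold


# Synthetic "packets": hypothetical byte strings standing in for the packet
# content that does not change between observation points.
packets = [f"10.0.0.{i % 250}|192.0.2.1|id={i}".encode() for i in range(10000)]

# Two observation points that see the same bytes make the same decision,
# so they select the same subset without exchanging any state.
point_a = [p for p in packets if select_packet(p, 0.1)]
point_b = [p for p in packets if select_packet(p, 0.1)]
```

Because the decision depends only on the packet bytes and the threshold, the selection is deterministic and synchronized across observation points. Whether the selected subset is also representative depends on the hash function and the chosen packet content, which is the question the thesis investigates.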
This thesis assesses which hash function and which packet content
should be used for hash-based selection to obtain a seemingly random and
unbiased selection of packets. It empirically analyzes 25 hash functions
and different packet content combinations for their suitability for hash-based
selection. The experiments are based on a collection of 7 real traffic groups from
different networks.
Table of Contents
- Abstract
- Zusammenfassung (German summary)
- 1 Introduction
- 2 Related Work
- 3 Hash-Based Packet Selection
- 3.1 Hash Functions
- 3.2 Packet Content
- 4 Experimental Setup
- 4.1 Datasets
- 4.2 Metrics
- 5 Results
- 6 Conclusion
Objectives and Key Themes
This thesis aims to evaluate the effectiveness of different hash functions and packet content combinations for hash-based packet selection in multipoint network measurements. The goal is to identify methods that produce unbiased and representative samples, minimizing resource consumption while maintaining data accuracy.
- Evaluation of hash functions for multipoint sampling.
- Analysis of different packet content combinations for hash-based selection.
- Assessment of bias and representativeness of resulting samples.
- Empirical analysis using real-world network traffic data.
- Contribution to efficient and accurate multipoint network measurement techniques.
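One simple way to picture the bias assessment named above is to compare a traffic characteristic estimated from the selected subset with its value in the whole population. The sketch below uses mean packet length and a hypothetical helper `relative_error`; both the metric and the toy data are assumptions for illustration, not the evaluation metrics used in the thesis.

```python
def relative_error(population: list[float], sample: list[float]) -> float:
    """Relative deviation of the sample mean from the population mean."""
    pop_mean = sum(population) / len(population)
    sample_mean = sum(sample) / len(sample)
    return abs(sample_mean - pop_mean) / pop_mean


# Toy packet lengths in bytes (made-up values, not measurement data).
lengths = [40, 40, 1500, 576, 40, 1500]

# A content-based filter that correlates with the metric of interest is
# biased: selecting only small packets badly underestimates the mean length.
biased_sample = [length for length in lengths if length < 100]
biased_error = relative_error(lengths, biased_sample)

# A sample that happens to mirror the population shows no deviation.
unbiased_error = relative_error(lengths, lengths)
```

A selection that emulates random sampling well should keep such deviations small across many traffic characteristics, whereas a filter whose decision correlates with the measured property will not.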
Chapter Summaries
The Introduction sets the context of network measurements and the need for efficient data selection techniques in multipoint scenarios. Related Work reviews existing literature on network measurement techniques and data reduction methods. Hash-Based Packet Selection details the methodology of hash-based selection, exploring different hash functions and packet content choices. The Experimental Setup describes the datasets used (7 real traffic groups from different networks) and the metrics employed for evaluation. Results presents part of the findings on how well the various hash functions and packet content combinations achieve unbiased sampling. The Conclusion is not summarized here.
Keywords
Multipoint network measurement, hash-based sampling, packet selection, hash functions, bias, representativeness, traffic characteristics, network monitoring, data reduction, empirical analysis.
- Cite this work
- Christian Henke (Author), 2008, Evaluation of Hash Functions for Multipoint Sampling in IP Networks, Munich, GRIN Verlag, https://www.grin.com/document/186592