Excerpt

## Contents

1 Introduction and Motivation for the Thesis

1.1 Field of Research

1.2 Contribution and Outline of this Thesis

1.3 Model Definition and Classification

2 Related Work for Existing Mathematical P2P Models

2.1 Relating Query Popularity and File Replication in Gnutella .

2.2 Random Walk Search Algorithms in P2P Networks

2.3 Stochastic Fluid Theory for P2P Streaming Systems

2.4 Service Capacity of P2P Networks

2.5 An Analytic Framework for Modeling P2P Networks

2.6 Special Issue on P2P Networking and P2P Services

2.7 BubbleStorm

2.8 Controlling the Cost of Reliability in P2P Overlays

2.9 A Reference Architecture for Overlay Networks

2.10 Efficient Resource Virtualization and Sharing Strategies .

2.11 Summary

3 The New Mathematical Model of P2P Systems

3.1 Overlay Parameters

3.1.1 The Basic Notations

3.1.2 Overlay Functionality

3.1.3 Routing in P2P Overlays

3.1.4 The Random Walk Search Algorithm

3.1.5 The Service Capacity

3.1.6 The Identifier Space

3.1.7 The Super Peer Network Design

3.2 The Characteristics of Participating Peers

3.2.1 The Basic Notations

3.2.2 The Delay Experienced by Peers

3.2.3 The Request Processing

3.2.4 The Peers’ Heterogeneity

3.3 The Resource and Service Characteristics

4 Extending the Mathematical Model to the Application Layer

4.1 P2P Communication: Skype

4.1.1 Skype Users

4.1.2 The Overlay Operators

4.1.3 The Participating Peers

4.2 P2P Television: Joost

4.2.1 The Joost Users

4.2.2 The Joost Network Operators

4.3 P2P File Sharing: KaZaA

4.3.1 KaZaA Users

4.3.2 KaZaA Providers

4.3.3 Participating Peers

4.4 Internet Service Providers

4.4.1 Peering

4.4.2 Communication Quality

5 P2P System Layers

5.1 Introduction

5.2 P2P Reference Architectures

5.2.1 JXTA

5.2.2 Reference Model for P2P Overlays

5.2.3 A Reference Interface for Structured P2P Systems

5.3 User Level

5.4 Application Layer

5.5 P2P Service Layer

5.6 Key-Based Routing

5.7 Overlay Network

5.8 Network Layer

6 Case Study: Modeling the GLOBASE.KOM Overlay

6.1 Introduction

6.2 Basic Notations

6.3 The Overlay Tree Structure

6.3.1 The Simplified Model of a Complete Tree

6.3.2 Dependencies of Tree Structure on the Load Parameters: Implementation Issues

6.3.3 Evaluation Results

6.4 Interconnections and Their Effect on Overlay’s Performance

6.4.1 Super Peer Failures

6.5 GLOBASE.KOM Operations

7 Conclusion

7.1 Summary and Results of This Thesis

7.2 Future Work

7.3 Contribution in the Related Research Area

A GLOBASE.KOM Implementation

## Acknowledgements

First of all I would like to thank my supervisor Dipl.-Math, Dipl.-Inf. Kálmán Gy örgy Graffi for proposing this topic, his support, helpful discussions and creative ideas. I would also like to thank Prof. Dr.-Ing. Ralf Steinmetz for providing the possibility to work at this thesis. Thanks to Dipl.-Ing. Aleksandra Kovacevic for her support during working at GLOBASE.KOM. Special thanks to Alexander Fuchs for his encouragement and constant support during my studies and working at this thesis.

## List of Figures

1.1 The Model Classification According to the Transformation Medium and an Inves- tigation Approach [44]

5.1 The P2P Layer Architecture

6.1 GLOBASE.KOM Overlay Structure

6.2 GLOBASE.KOM Tree Structure

6.3 The Tree Structure before Inserting the New Peer

6.4 The Tree Structure after Insertion

6.5 The Depth of the Tree Depending on *L* 1/ *L* 2 Ratio. Analytical and Simulation Results.

6.6 The Breadth of the Tree Depending on *L* 1/ *L* 2 Ratio. Analytical and Simulation Results

6.7 The Number of Root’s Super Peer Children Depending on *L* 1/ *L* 2 Ratio. Analytical and Simulation Results

6.8 Number of Hops Made by the Lookup Query in Dependency on the *L* 1/ *L* 2 Ratio. .

6.9 Number of Hops in Dependency on the Number of Lookup Queries in a Tree with Interconnections

6.10 Number of Hops for Local and Distant Area Search. Analytical and Simulation Results

## List of Tables

2.1 Parameters Used in [1] (Described in Section 2.1)

2.2 System Characteristics Calculated in the Model from [1] (Described in Section 2.1).

2.3 Parameters Used in [2] (Described in Section 2.2)

2.4 System Characteristics Calculated in the Model from [2] (Described in Section 2.2).

2.5 Parameters Used in [3] (Described in Section 2.3)

2.6 System Characteristics Calculated in the Model from [3] (Described in Section 2.3).

2.7 Parameters Used in [4] (Described in Section 2.4)

2.8 Notation and Model Parameters for [4] (Described in Section 2.4)

2.9 Model Notations Deployed in [5] (Extended), Described in Section 2.5

2.10 System Characteristics Calculated in [5] (Extended), Described in Section 2.5 (the First Part)

2.11 System Characteristics Calculated in [5] (Extended), Described in Section 2.5 (the Second Part)

2.12 Notation and Model Parameters for [6] (Described in Section 2.6)

2.13 System Characteristics Calculated in [6] (Described in Section 2.6)

2.14 Notation and Model Parameters for [7] (Described in Section 2.7)

2.15 System Characteristics Calculated in [7] (Described in Section 2.7)

2.16 Notation and Model Parameters for [8] (Described in Section 2.8)

2.17 System Characteristics Calculated in [8] (Described in Section 2.8)

2.18 Characteristics Used in [15] (Described in Section 2.9)

2.19 System Characteristics Calculated in the Model from [15] (Described in Section 2.9).

2.20 Input Parameters [19] (Described in Section 2.10)

2.21 System Characteristics Calculated in the Model from [19] (Described in Section 2.10).

3.1 The Overlay Parameters (Section 3.1), the First Part

3.2 The Overlay Parameters (Section 3.1), the Second Part

3.3 The Calculated Overlay Parameters (Section 3.1)

3.4 The Characteristics of Participating Peers (Section 3.2)

3.5 The Calculated Characteristics of Participating Peers (Section 3.2)

3.6 The Service and Resource Characteristics (Section 3.3)

3.7 The Calculated Service and Resource Characteristics (Section 3.3)

6.1 The GLOBASE.KOM Basic Notations (Section 6.2)

## Chapter 1 Introduction and Motivation for the Thesis

illustration not visible in this excerpt

Abstract

*In this chapter the motivation for this thesis is presented. First the insights in the related research area are gained in order to provide some background for this work. An overview over contributions of this thesis in the research area is given next. Finally some essential terms like model and modeling process and existing model classifications are presented.*

### 1.1 Field of Research

Peer-to-peer (P2P) computing became increasingly important in the last years, providing new exciting opportunities for business, communication and entertainment. Deployed in the financial, content sharing, business-to-business, governmental and many other markets, P2P applications cause up to 80% of the traffic in the internet. Therefore improving the performance of P2P systems has become a desirable goal and a hot research topic.

In order to evaluate the results of P2P research various approaches like simulations, testing or mathematical models can be deployed. Due to the large amount of peers participating in P2P systems, simulation as the most scalable evaluation approach is often used. However simulations require a high time overhead while not all details can be considered. Testing enables highly detailed evaluations, but is expensive and not scalable. Thus mathematical models present a less expensive alternative for detailed evaluations of P2P systems. Furthermore mathematical models can be applied in order to analyze the P2P system performance and system parameters affecting it.

Numerous mathematical models considering diverse performance aspects of P2P systems like content downloading times, the resource search delay and overhead, load balancing, content availabilityand many others alreadyexist. Most models arefocused just on few specific characteristics of P2P systems and do not consider other significant aspects. Thus a model typically covers just one of the P2P functional layers like network, application, overlay, user, key-based routing or service layers. Therefore creating a more extensive model taking more characteristics of P2P systems and more functional layers into account would facilitate the evaluation and analysis of P2P systems considerably and can be regarded as a desirable goal.

The first step towards the construction of such an extensive model is creating an overview over already existing mathematical models for P2P systems, assessing their contribution and listing their benefits, drawbacks and incompletenesses. According to our knowledge such an essay giving an overview over existing mathematical P2P models does not exist yet.

In this thesis we provide an overview over already existing works in this area, presenting mathematical models considering various P2P systems’ characteristics. Thereafter a new model for the overlay, the network layers and the user behavior based on already existing models and covering the most essential P2P system characteristics is introduced. Furthermore the model is extended to the P2P application layer, and the functional layer architecture for P2P systems is presented. Finally a mathematical model of a specific overlay GLOBASE.KOM [38] is developed and verified by comparing the model statements with the simulation results.

### 1.2 Contribution and Outline of this Thesis

The contributions and goals of this thesis can be summarized as follows:

- Creating an overview over existing mathematical models focused on various aspects of P2P systems. Evaluation of described models, observing their advantages, drawbacks and extension possibilities.

- Establishing a set of essential P2P system characteristics considered in most observed math- ematical models. Creating a new mathematical model focused on these characteristics and therefore being valuable for analyzing the most significant aspects of P2P systems.

- Extending the created model to the P2P system application layer considering P2P file shar- ing, P2P communication and P2P television applications. Thereby the users and providers of P2P applications, their objectives, their behavior and collaboration and participating peers from the technical point of view are considered.

- Presenting ideas for the P2P system layer model, taking functionalities, inputs, outputs and dependencies between system layers into account. Six system layers (user, application,

### 1.3. MODEL DEFINITION AND CLASSIFICATION 15

service, key-based routing, overlay and network layers), the constitutive components of each layer and the operations provided by these components are presented.

- Creating a mathematical model of a specific P2P overlay GLOBASE.KOM [38], [46] and comparing mathematical with simulation results. The GLOBASE.KOM overlay is chosen for this goal, since this overlay is still in its early development stage and no mathematical models for GLOBASE.KOM exist yet.

In the next section definitions for models and a model building process are introduced. Con- stitutive components and the model classification are presented in order to clarify the notion of the mathematical model being the focus of this thesis. In Chapter 2 existing mathematical models for P2P applications and P2P overlays are presented. Observed characteristics and executed cal- culations are listed for each described model. In Chapter 3 the new mathematical model focused on the most essential P2P system characteristics is introduced. Thereby P2P overlays, participat- ing peers, resource and service characteristics are observed. In Chapter 4 the model extension to the P2P application layer is proposed. Skype, Joost and KaZaA are chosen as representatives for P2P applications. Furthermore it is explained how the P2P paradigm is used by Internet Service Providers (ISPs). In Chapter 5 the P2P six-layer architecture is presented and components and functionalities of each layer are explained individually. Chapter 6 introduces the new mathemat- ical model of the GLOBASE.KOM overlay. Efficiency of overlay operations and dependency of the hierarchical tree structure on the choice of load parameters are investigated. Finally Chapter 7 presents a summary of the thesis and the outlook for future work.

### 1.3 Model Definition and Classification

In this section the notions of *model* and *model building* are introduced and existing model tax- onomies are presented in order to procure a theoretical basis for the following chapters. As proposed in [31] a *model* is an abstract, simplified and often idealized representation of some real system and is a system itself. The *model building* is a problem dependent process, transforming the system in a model and abstracting from the system properties not relevant for a particular problem.

Thus the models are always created with a focus on a particular problem, so that for each problem a number of applicable models can be developed. Since the created models may present different levels of abstraction from the real system the required abstraction level can be predefined before the model building. Typically less abstract models lead to more precise solutions for the considered problems [43]. In order to create precise and less abstract models an iterative approach for the model building can be applied. Thereby the simple and highly abstract model can be developed first and after a number of iterations it can be further extended to a more complex model at the required abstraction level.

Since the model is a system itself [31] the system parameters can also be used for the model description. Thus the model contains a number of *entities*, having properties denoted as *attributes*. Dynamic properties are indicated as *state variables* changing their values in dependency of the entity attributes. The set containing the values of all state variables describes the current state of the system represented by the model.

Numerous existing model taxonomies classify models according to various criteria. Let us observe the classification according to the transformation medium and an investigation approach proposed in [44] and presented in figure 1.1. Thereby *informal* models can be created as an in- termediate step before *formal* models can be built. * Graphical-mathematical* models highlight the internal model structure, enabling construction and viewing of different models and develop- ment of new models [47]. *Mathematical* models apply the system representation via logical and quantitative expressions. In *simulation* models numerical approaches are deployed, so that the execution proceeds iteratively calculating the model states at successive time points. *Simulation * models are typically used if the considered problem is too complex to be solved via *analytical*

illustration not visible in this excerpt

Figure 1.1: The Model Classification According to the Transformation Medium and an Investigation Approach [44]. models. *Analytical* models present the system dependencies via equality systems, enabling the system state calculation by inserting the known system parameters.

This thesis is focused on * mathematical analytical* models, their extension and evaluation. In addition to that, simulations are applied in order to assess the quality of mathematical models.

## Chapter 2 Related Work for Existing Mathematical P2P Models

illustration not visible in this excerpt

Abstract

*In this chapter existing mathematical models for P2P systems are presented and evaluated. Thereby the search e* ffi *ciency in structured and unstructured P2P overlays, the features and restrictions in P2P streaming applications, the service capacity in P2P file sharing systems, content download and replication times in P2P networks and many other issues are investigated.*

### 2.1 Relating Query Popularity and File Replication in Gnutella

Efficient searching in P2P file sharing systems became a significant research area the last years. In order to improve the search efficiency, the system design has to be considered and evaluated. Thereby the user behavior is an important issue which also has to be taken into account.

An analysis of the user’s behavior in the P2P file sharing network Gnutella [9] is presented in [1]. Particularly it is investigated, which files the users provide and share and which queries they send. Query popularity and file replication are analyzed, and thereafter a mathematical model, based on a probability function is created. This model expresses the matching of queries to files in a P2P file sharing system. It is applied in order to associate the *passive* and *active* user behavior. Thereby the passive behavior is defined by issued queries, and the active behavior is defined by the files shared by the user.

The matching of queries to files proceeds correctly as it was shown by extensive simulations. Thereby the query popularity and the number of file replicas are taken into account. Model parameters used in [1] are presented in table 2.1. In table 2.2 the system characteristics calculatedviagiven parametersarepresented. The firstcolumn is usedto expresssome parameter characteristics in these and all subsequent tables. ’g’ or ’l’ reveal, if the parameter is declared globally in the whole system or locally just at one peer. The notation ’h’ is appended if the history has to be maintained in order to calculate the characteristic value.

The studied notions of query popularity and file replication are essential characteristics of the user’s behavior. The applicability for simulation and prototype-based studies is a significant benefit of the presented model in comparison to other models in this scientific area. However, the model could still be extended considering other aspects, for instance the time of the user being online or combining the users in communities according to their topics of interest.

### 2.2 Random Walk Search Algorithms in P2P Networks

For efficient search of resources and services in unstructured P2P networks search algorithms being able to achieve low latency and low overhead arerequired. Random walk searchalgorithms [10] are applied in order to fulfill these requirements. However, the parameters of the random walk search algorithm may influence the search performance considerably.

N.Bisnik and A.Abouzeid created a mathematical model addressing this issue [2]. This model allows the investigation of the random walk search performance in dependency on the resource popularity, the number of random walkers used during the search and their time-to-live (TTL). Furthermore an *Equation Based Adaptive Search (EBAS)* algorithm is presented. It allows an adaptive parameter choice for the random walk. The estimates for the resource popularity are obtained with the aid of a feedback-based algorithm [2].

The developed search mechanism EBAS is able to guarantee the desired performance to be achieved. For instance it can ensure the achievement of the target access rate on the resource, whereas the search overhead and the search delay are limited. Further benefits of the adaptive random walk approach are low memory requirements and the overhead influencing just the one-hop neighbor.

In order to confirm the benefits of the developed approach, analytic expressions for the search performance characteristics were derived. These are the probability of a successful search *p s*, the expected total overhead *E* [ *O* ] of the search with *k* random walkers and the expected search delay *E* [ *D* ]. These and other used notations are listed in table 2.3. The calculated performance characteristics and applied formulas can be found in table 2.4.

The main benefit of the presented EBAS algorithm is its ability to adapt the number and the TTL of random walkers in dependency on the resource population. This approach is realizable, since each node is required to maintain the information about the popularity estimates *p* of resources in the network.

illustration not visible in this excerpt

Table 2.1: Parameters Used in [1] (Described in Section 2.1).

illustration not visible in this excerpt

Table 2.2: System Characteristics Calculated in the Model from [1] (Described in Section 2.1).

illustration not visible in this excerpt

Table 2.3: Parameters Used in [2] (Described in Section 2.2).

illustration not visible in this excerpt

Table 2.4: System Characteristics Calculated in the Model from [2] (Described in Section 2.2).

### 2.3 Stochastic Fluid Theory for P2P Streaming Systems

A simple stochastic fluid model considering important features and restrictions of P2P streaming systems has been developed [3]. The model is focused on the peers’ real-time demand, peer churn, peers with heterogeneous upload capacity, limited server upload capacity, buffering and playback delay.

The following questions are aimed to be answered by the model: Which key parameters have effect on the P2P streaming system performance? Does a threshold value exist, such that the performance changes from weak to perfect after reaching this value? Why do large systems generally have a better performance than small ones? Which influence does the system size modification have on the performance? Can the performance be improved applying buffering and playback lag? How can the benefit of additional infrastructure resources be measured? How can access control be used in order to offer sufficient service for each peer and reduce the amount of denied requests?

The primary objective consists in the investigation of how the *universal streaming* in the stream- ing system can be achieved. The universal streaming stands for the system state in which all participating peers get data at the video rate *r*. Due to the heterogeneous upload capacity of the participating peers, the peers are categorized in two classes: super peers and ordinary peers. An increasing number of super peers in the system leads to a higher average upload capacity per peer. Therefore, it becomes easier to realize universal streaming if more super peers are participating in the system.

For the systems without peer churn the maximal streaming rate *r max* is obtained, such that universal streaming can be performed. Afterwards large and small P2P streaming systems with peer churn are analyzed. For these systems the existence of the critical threshold *c* and the corresponding critical region is proved. Universal streaming is more frequently achieved in the systems with the average number of super peers divided by the average number of ordinary peers (ρ1/ρ2) greater than the critical threshold *c*.

Thereafter the buffering and playback lag’s impacts on the performance were analyzed. It was shown that significant performance improvement can be achieved applying these mechanisms. This result is confirmed by the analytical expressions for the peer’s maximal download rate φ(*t*), the output time for the server queue at a particular time point *I* (*t*) and the content removing rate from the server queue *D* (*t*). These and other expressions can be found in table 2.6. System parameters used for the calculations are listed in table 2.5.

The presented model is the first mathematical model addressing the performance and prop-

illustration not visible in this excerpt

Table 2.5: Parameters Used in [3] (Described in Section 2.3).

erties of P2P streaming systems. Therefore [3] represents an essential contribution in the related research area.

### 2.4 Service Capacity of P2P Networks

The *service capacity* in P2P file sharing applications is explored in [4]. The notion of service capacity refers to the system’s capability of serving the requests. For instance, a network with a large number of peers sending requests for the same file and just one host uploading the desired file would have a small service capacity. As soon as several peers finished downloading this file, they will also be able to serve requests. Therefore the service capacity of this system is increasing.

The service capacity is modeled in two distinct states: transient and steady states. In the steady state each peer in the system has a constant throughput performance and service capacity increases exponentially. The system is restricted by the server capacity. In the steady state the service capacity is influenced by the demand variability, therefore the service capacity is restricted by demand. It was shown that if the offered loads are high and the participating users are acting cooperatively, the performance can be improved.

Data management, peer selection, traffic, admission and scheduling policy are factors affecting the service capacity. The influence of data management on service capacity was examined by dividing a shared file in *m* chunks and deriving analytical expressions for this scenario. The analytical formula for the average delay experienced by peers[Abbildung in dieser Leseprobe nicht enthalten]was derived. This and other formulas are listed in table 2.8.

The influence of peer behavior on service capacity was revealed applying the *branching process model* [4]. The mean service capacity *E* [ *N d* (*t*)] can be calculated as

illustration not visible in this excerpt

with service capacity growth characteristics δ and β. As long as a demand for resources persists in the system, an exponential growth of the service capacity is observed.

illustration not visible in this excerpt

Table 2.6: System Characteristics Calculated in the Model from [3] (Described in Section 2.3).

### 2.5. AN ANALYTIC FRAMEWORK FOR MODELING P2P NETWORKS 23

In order to mitigate the negative influence of uncooperative peer behavior on the service capacity, the parallel uploading can be deployed. Thereby for the new growth rate β′ and the growth constant δ′ the equations

illustration not visible in this excerpt

should hold. Parameter υ stands for a constant family size for the branching process, *m* ˜ is a mean for the probability distribution function and ζ captures the peer departure dynamics. In this way the data availability and the growth of system capacity can be improved, even though some peers leave the system after their downloads have been finished.

In order to analyze how the peers’ departure rate and the offered load affect the delay to service requests the *Markov Chain Model* [4] was applied. Arrival of new requests in the system, processing service requests and peer churn were modeled with Markov chain as follows:

illustration not visible in this excerpt

for new request,

illustration not visible in this excerpt

for processing the peer’s service request, and

illustration not visible in this excerpt

for peer’s departure.

Applied notations are explained in table 2.7. The Markov Chain Model is applied to show, that the average delay experienced by peers improves in the offered load if peers leave the system slowly. Thereby the service capacity is linearly increasing in the offered load. Other expressions derived in [4] are presented in table 2.8.

Since the user behavior affects the service capacity considerably, *incentive mechanisms* may be applied in order to prevent the ”free-riding” problem [13] and ”tragedy of the commons” [14]. These mechanisms encourage peers to stay online and share files, thereby improving system performance. For instance, the game-theoretic approach from [11] is adaptable and scalable, provides fairness in the system, helps to avoid waste of resources and maximizes the utility for all users.

### 2.5 An Analytic Framework for Modeling P2P Networks

The numerous mathematical models, addressing the performance issues in P2P networks typically do not consider the user’s point of view. Hence questions like ”How long does the user have to wait for the file to be downloaded, assumed this file is available in the network?” still have to be answered. The mathematical model presented in [5] aims to answer this question, considering the download and replication times of an arbitrary file in a P2P network as performance metrics. Furthermore *peer selection policies* and *user level policies* are developed.

Peers participating in the system are classified into core routers and end peers. Thereby queuing delays at the core routers and end peers are separately investigated in the model. Primarily a *router network model*, describing delays at the core routers was developed. The notations deployed in this and other models from [5] are listed in table 2.9. The router network model contains the fundamental traffic-rate (rate balance) equation, as well as equations for traffic variability and network latency. The total arrival rate at router *j* is calculated as a sum of the total external arrival rate λ0 *j* and the arrival rates λ *i q ij* from each of the *N R* neighboring routers.

Therefore the arrivals from all neighboring routes can be calculated as∑ *N R * Finally the

illustration not visible in this excerpt

illustration not visible in this excerpt

Table 2.7: Parameters Used in [4] (Described in Section 2.4).

illustration not visible in this excerpt

Table 2.8: Notation and Model Parameters for [4] (Described in Section 2.4).

26 CHAPTER 2. RELATED WORK FOR EXISTING MATHEMATICAL P2P MODELS

The total external arrival rate [Abbildung in dieser Leseprobe nicht enthalten] at router [Abbildung in dieser Leseprobe nicht enthalten] is calculated as a sum of the arrival rates [Abbildung in dieser Leseprobe nicht enthalten] from

illustration not visible in this excerpt

all streams *i* into the router *j*:

The number of packets [Abbildung in dieser Leseprobe nicht enthalten] at the router *i* is composed of the workload ρ *i* on the router and *i* the total expected waiting time [Abbildung in dieser Leseprobe nicht enthalten] for all packets arriving at the router *i*. Therefore the *i* expected value [Abbildung in dieser Leseprobe nicht enthalten] for the number of packets at the router *i* equals to

illustration not visible in this excerpt

The total external traffic rate λ0 results from the sum of all external rates of traffic λ0 *i* into each router *i* = 1, ..., *N R* and equals to

illustration not visible in this excerpt

Thereafter the queuing model expressing the *end peer* ’s behavior was created. The model is used to determine the time required for a file download. It is also shown, that the arrival process variability function at end peers equals to 1. Since the requests leaving the router with a rate of [Abbildung in dieser Leseprobe nicht enthalten] are sent to the peer *i* with probability [Abbildung in dieser Leseprobe nicht enthalten], the incoming rate[Abbildung in dieser Leseprobe nicht enthalten] at peer *i* is calculated as [Abbildung in dieser Leseprobe nicht enthalten].

The service rate at each peer is determined as the total service capacity divided by number of the currently served requests ratio: *C* / *current*. The workload ρ on peer *i* is calculated in dependency of the overall rate of the request arrival λ *di* and the average service time *X* required for serving the request:

illustration not visible in this excerpt

The maximal workload ρ *m* is reached, if the maximal number of requests *m* allowed by the peer has to be served. In this case the service limit is achieved, and the files might be blocked or lost with probability [illustration not visible in this excerpt]With probability 1 − *P l* neither block nor loss occurs and the file will be served by the peer. The throughput of the queue depends on the overall rate of request arrival λ *di* and the probability 1 − *P l* and is calculated as [Abbildung in dieser Leseprobe nicht enthalten].

Furthermore the time required for the resource search in the network (*Query Search Time*) is measured, since it is a significant aspect for system performance. Primarily an expression derived in [18] for the shortest path 〈 *d* 〉 between two randomly selected nodes in the router graph is presented:

illustration not visible in this excerpt

It can be observed, that 〈 *d* 〉 depends on the average number *z i* of the *i* hop neighbors in the router graph. The formulas for [Abbildung in dieser Leseprobe nicht enthalten]can be found in table 2.17.

Thereafter two groups of search architectures are observed: the centralized and the decentralized one. The formulas for calculating the average lookup time *T QS* passed between issuing the query and its termination for centralized and decentralized searching architectures were derived (see table 2.10). In the centralized architecture the average lookup time *T QS* depends only on the central server’s mean service time μ *Cs*. Therefore for the centralized architecture

illustration not visible in this excerpt

Equations describing the influence of multiple file copies on the transfer time were derived next. Furthermore formulas for the number of copies of the *i th* most popular file in the network, number of files served by each peer at any moment, and the service time required for data transfer at the worst peer were introduced. Notations and model parameters used for calculations are introduced in table 2.9. Tables 2.10 and 2.11 contain information about which system characteristics can be derived from the known ones and the formulas for calculations.

It was also investigated how the file replicas should be distributed between peers in order to minimize the total download time. As a result it was claimed that files should be located on the peers proportionally to the peers’ service rates. Simulation results revealed, that such an allocation strategy causes considerably lower delays. Simulations also showed, that the download times depend on the number of file copies and the file size. Thereby the rising number of file copies in the system leads to lower download times. However, the dependencies between download times and number of file copies are not linear. The number of downloads allowed by a peer at a certain time point also can influence the system performance considerably.

### 2.6 Special Issue on P2P Networking and P2P Services

The investigation of the network traffic and the content search rate in P2P networks became an essential aspect in the related research area. However the studies of certain performance parameters in P2P networks typically require extensive simulations, which still cannot ensure absolutely correct results. Therefore a mathematical model based on random graph theory was developed to address this problem [6].

The network is modeled as a random graph, representing the peers and connections between them. The graph is deployed for investigations of the peers’ connectivity in the network. Thereby the number of nodes, connected to any individual node in the graph is determined. The generation process of the connected peer set (giant component) in the P2P overlay graph is observed and analyzed at each hop *h*. Thereby aspects like the number of nodes *n f ree h* still not connected to any other node at hop *h*, the number of neighbors *z h* at the hop *h*, the average degree *z* 1 *h* of the remaining free nodes at hop *h* are considered. Moreover, this mathematical approach considers finite environments by taking the loop occurrences into account and calculating their probabilities. Further model notations, system characteristics and the formulas used for calculations are listed in tables 2.12 and 2.13.

The presented mathematical model enables the assessment of the data traffic and the success rate for content searchin P2Pnetworks with known network topology. Moreover traffic, searchef- ficiency and further performance aspects of various P2P protocols can be compared and analyzed. Thereby the model is appropriate even for large P2P networks, since the required computational power is considerably smaller than the computational power needed for simulations.

### 2.7 BubbleStorm

The aspect of an efficient solving of the *rendezvous problem* for popular data in heterogeneous networks is addressed in [7]. A P2P system *BubbleStorm* able to perform efficient exhaustive search avoiding hotspots and exploiting the heterogeneity of participating peers is presented .

Heterogeneity is introduced in the BubbleStorm by setting the degree of each node proportional to the node’s capacity. Thereby the network should stay well connected by keeping each node’s degree ≥ 4.

The simplified membership protocol and its invariant for the nodes joining and leaving the network are presented. An extended join protocol allows simple serialization of concurrent joins and leaves.

A communication component *Bubblecast* is created with the objective to guarantee a ren- dezvous between messages of various types (type-A and type-B messages). Bubblecast has a task of detecting a *rendezvous node r* located in intersection of the two message bubbles. The

illustration not visible in this excerpt

Table 2.9: Model Notations Deployed in [5] (Extended), Described in Section 2.5.

### 2.7. BUBBLESTORM

Variable Meaning Dependencies

illustration not visible in this excerpt

Table 2.10: System Characteristics Calculated in [5] (Extended), Described in Section 2.5 (the First Part).

illustration not visible in this excerpt

Table 2.11: System Characteristics Calculated in [5] (Extended), Described in Section 2.5 (the Second Part).

illustration not visible in this excerpt

Table 2.12: Notation and Model Parameters for [6] (Described in Section 2.6).

**[...]**

- Quote paper
- Inna Kotchourova (Author), 2007, Evaluation and Extension of Mathematical Models of P2P Systems, Munich, GRIN Verlag, https://www.grin.com/document/87906

Publish now - it's free

Comments