Investigating Distributed Approaches for Solving Discrete, Multistage Optimization Problems

Diploma Thesis, 2004

83 Pages, Grade: 1,3



1. Introduction

2. Optimization
2.1 Introduction
2.2 Simulation-Based Optimization
2.2.1 Background
2.2.2 Applications
2.3 Parallel and Distributed Computing
2.3.1 Background
2.3.2 Applications
2.4 Grid Computing
2.4.1 Background
2.4.2 Applications
2.4.3 Future Prospects
2.5 Performance Evaluation
2.5.1 Speedup
2.5.2 E ciency
2.5.3 Scalability

3. Discrete Multistage Processes
3.1 Introduction
3.2 Discrete Multistage Manufacturing Processes .
3.3 Challenges

4. Solution Approaches
4.1 Introduction
4.2 Simplex Approaches
4.3 Evolutionary Approaches
4.4 Mixed Integer Approaches

5. Implementation of Two Algorithms for Solving Multistage Problems .
5.1 Introduction
5.2 Implementation Structure
5.3 Scatter Search
5.3.1 Basic Heuristic
5.3.2 Parallelization
5.3.3 Overview
5.3.4 Method Description
5.3.5 Performance Evaluation .
5.4 Branch and Bound
5.4.1 Introduction
5.4.2 Basic Design
5.4.3 Basic Heuristic
5.4.4 Parallelization
5.4.5 Overview
5.4.6 Method Description
5.4.7 Performance Evaluation .

6. Applications to the Approaches
6.1 Introduction
6.2 Test Problems
6.2.1 N-Dimensional Rosenbrock Problem .
6.2.2 Multistage Test Problem .
6.3 Test Results
6.3.1 Scatter Search
6.3.2 Branch and Bound
6.3.3 Grid Environment

7. Summary and Future Work


2.1 Modeling a component with the FEM [Bar01]

2.2 Evaluation by a black box with the input parameters x and the result f (x)

3.1 Illustration of a multistage manufacturing process [AM04]

4.1 Re exion step with a simplex method in two-dimensional space .

5.1 High level UML-diagram of the implementation

5.2 Pseudo-code for sequential Scatter Search (comp. [LM03])

5.3 Pseudo-code for parallel Scatter Search

5.4 Schematic representation of the parallel Scatter Search approach (comp. [LM03])

5.5 UML-diagram of the Scatter Search implementation

5.6 Path Relinking recovering infeasible solution

5.7 Combination of a size three subset

5.8 Search tree of a Branch Bound approach solving a multistage problem

5.9 Discretization of dependent variables to achieve separability

5.10 Finding solutions using the search step

5.11 Generating solutions with the poll step

5.12 Constraint handling in Branch Bound

5.13 Branch Bound pseudo-code

5.14 Schematic representation of a distributed Branch Bound ap- proach for solving multistage problems

5.15 UML-diagram of the Branch Bound implementation

5.16 Isolated e ciency of the Branch Bound approach solving a mul- tistage problem

6.1 Speedup of the Scatter Search algorithm solving the n-dimensional Rosenbrock function

6.2 E ciency of the Scatter Search algorithm solving the n-dimensional Rosenbrock function

6.3 Development of the objective function value with Scatter Search and Polytope optimizing the 10-dimensional Rosenbrock function . .

6.4 Development of the objective function value with Scatter Search and Polytope optimizing the 100-dimensional Rosenbrock function .

6.5 Speedup of the Scatter Search algorithm solving the multistage test function

6.6 E ciency of the Scatter Search algorithm solving the multistage test function

6.7 Speedup of the Branch Bound algorithm solving the multistage test function

6.8 E ciency of the Branch Bound algorithm solving the multistage test function

6.9 Communication losses when employing the algorithms on the dis- tributed grid environment


Distributed nonlinear optimization becomes more and more important as demands for high quality results grow, and the tasks themselves become more complex by the day. Solving these problems on large mainframes rather than mass produced com- puters is expensive, because the hardware costs for large mainframes are greater. The distribution of computing needs is cost e ective because networks of existing workstations can be used, which operate quickly as long as the communication costs are low.

For the distribution of the optimization tasks, which will be introduced in Chapter 2, several approaches have been developed, but so far there is no stan- dard that ts every problem. New approaches are needed and new concepts for an intelligent splitting of the workload have to be developed. In this thesis, the prob- lem of discrete multistage optimization processes and possible distributed solution approaches will be discussed. These processes focus on nonlinearity, and a test function is used to compare the di erent approaches. The results are discussed in Chapter 6.

Two optimization approaches are implemented in this thesis; they are introduced in Chapter 4. The implementation is entirely distributed and scalable. Details on the implementation can be found in Chapter 5. One of the approaches is the well established meta-heuristic called "Scatter Search", which will be thoroughly discussed in Section 5.3.

The mixed integer approach "Branch and Bound", which is also a well known heuristic applied to many problems in the past, is modi ed and utilized to solve multistage problems. Its capability of solving these problems is examined, with the goal of greatly reducing computing costs. It will be discussed in Section 5.4.

Both algorithms will be employed to solve the same test problem introduced in Chapter 6, which models a multistage simulation-based optimization process, introduced in Chapter 3. Their results and performance will be compared. Scatter Search will additionally be tested for its performance of solving the n-dimensional Rosenbrock function, also introduced in Chapter 6. Note that in this thesis, much theoretical background is presented. However, the focus is placed on the implementation of the mentioned distributed algorithms.


2.1 Introduction

Optimization is a general term for extremum searching (maximizing or minimizing) of particular properties. An optimization problem can be one of any kind, such as every day things like e.g. how to spend money to achieve the best personal bene t from it, or, as in industrial applications, the most cost-e ective way to build a device.

Whatever the particular problem is, according to [WCS03], a typical optimization problem has these three main properties:

1. An ob jective function to evaluate a solution and provide means to compare di erent solutions. Optimization problems typically call for a minimization or a maximization of the objective function.
2. A set of variables which determine the objective function value. These can be variables of any kind (i.e discrete, continuous, etc.) that the objective function requires to evaluate the solution's value.
3. Constraints and bounds that restrict the variables so that they only take on certain values and exclude others. Optimization problems are sometimes unconstrained, however according to [CS70], real world problems are gen- erally constrained. Therefore, an algorithm designed for solving real world problems needs to include means to handle constraints.

Therefore, to optimize a problem, one needs to nd a strategy to alter the variables in such a way that the objective function value approaches an optimum but does not violate the constraints. A minimization problem can be expressed as:

illustration not visible in this excerpt

where f (x) denotes the objective function value, c(x) the constraint values and y the threshold to determine if x is a feasible solution to the problem.

2.2 Simulation-Based Optimization

2.2.1 Background

Whereas simulation has only recently been employed for the optimization of com- plex systems, it has been traditionally always been one of the most widely used tools for evaluating such systems [Gos03]. Increased computing resources and more sophisticated optimization algorithms have made simulation-based optimization feasible for complex optimization problems. Along with the increasing computing powers, the complexity that modern engineers have to deal with has increased as well. Traditional optimization tools rapidly break down on these problems [Gos03].

Naturally, as the problem complexity increases, it is more di cult to express an objective function in the form of an algebraic expression. In that case, one is forced to use a simulation to evaluate the objective function value.

A widely used tool for engineers to transform real world components and parts into a computer model is the Finite Element Method (FEM). With the FEM, the structure of the element to be modeled is broken down into a nite number of graphical elements in which each of them usually has three or four vertices and three or four edges, or, in a three-dimensional model, four to six vertices and six to nine edges.

Figure 2.1 shows how a real-world component can be modeled with the Finite Element Method. The quality of the model increases with the number of discrete elements used.

The enumerable structure of the nite elements make it possible to mathemat- ically describe its properties, thus evaluating its quality and comparing it to other models.

A major eld for these kind of processes is the concept of Virtual Prototyping. Virtual prototyping describes the process of virtually creating an element on a computer and evaluating its quality and characteristics without actually building it. It promises great reductions in production cost, since design aws can be detected before the actual production started.

In [Bar01], simulation-based optimization is described as being among the non- linear optimization problems with nonlinear constraints, for which no derivative information is available. Therefore, possible solution approaches need to optimize a problem solely by evaluating the objective function. These kinds of optimization procedures are often referred to as evaluated in a "black box", where parameters are entered beforehand and the simulation runs without the possibility of any in- teraction until it produces the result. This behavior is illustrated in gure 2.2.

2. Optimization 11

illustration not visible in this excerpt

Fig. 2.1: Modeling a component with the FEM [Bar01]

Due to these properties, most e cient optimization algorithms (i.e classical nonlinear programming methods, such as gradient descent, or classical algorithms in stochastic dynamic programming, such as value iteration and policy iteration) are not capable of solving these problems.

According to [Gos03], two main simulation-based problems can be distinguished:

1. Parametric Optimization (or Static Optimization).

2. Control Optimization (or Dynamic Optimization).

The key di erence between static and dynamic optimization problems is that a dynamic problem contains variables that are time-related, hence the values that describe the function change during the optimization process.

illustration not visible in this excerpt

2.2.2 Applications

Simulation-based optimization is commonly used, among other areas, in pharmaceutical design and sheet metal forming, i.e at the designing of parts for the automobile or aircraft industry. These simulations are included among the parametric optimization problems.

Two examples of applications of dynamic optimization problems are climate control and water management.

2.3 Parallel and Distributed Computing

As introduced in the previous chapter, simulation-based optimization can be com- putationally very expensive. In order to cope with complex problems, a concept of distributing the computing needs to di erent resources is necessary. In this section, the terms parallel and distributed computing will be introduced and their applications examined.

2.3.1 Background

According to [GH93], conventional sequential computers are all based on the design of John von Neumann from the late 1940s, and consist of three major components:

1. a central processing unit (CPU)

2. a main-memory system

3. an input / output system (I/O)

These machines execute instructions sequentially, one at a time, until the entire stack has been processed. Although many improvements have been applied to the early von Neumann machine to increase its performance, the underlying principle still remains the same in today's computers. An architecture like this is known as single instruction stream, single data stream (SISD).

To step from sequential to parallel machines, di erent types of machines for processing instructions in a parallel fashion are classi ed as follows [Fly66]:

- single instruction stream, multiple data stream (SIMD)
- multiple instruction stream, single data stream (MISD)
- multiple instruction stream, multiple data stream (MIMD)

SIMD machines can be further classi ed into two sub-categories: shared memory and local memory machines. Both machines are essentially similar, except that the shared memory machine does not have its own memory. All SIMD machines have n processors, n memories, a connecting network and a control unit, and all processors operate on di erent data sets. Common applications for the SIMD machine are target tracking, image processing and database operations.

In MISD machines, every processor has its own control unit and shares a common memory unit. Parallelism is accomplished by having each processor perform a di erent operation on the same datum at the same time.

According to [Fly66], MIMD machines are the most powerful among the above mentioned because they incorporate n processors, n streams of instructions, and n streams of data. The processors are the same as in SISD machines, therefore these processors are considered more powerful than the ones used in SIMD computers.

The architectures introduced above are classically known as Parallel Architectures, although for the last two decades, the ability to distinguish between the terms distributed and parallel processing has begun to fade [Zom96].

The next classi cation of machines is generally referred to as being distributed :

Clustering - This describes an architecture that enables multiple computers to operate simultaneously to solve a computationally expensive problem. The algorithm running on this architecture must be parallelizable. Interconnections between the machines are realized with networks, which are normally Local or Wide Area Networks (LANs or WANs).

The performance and, of course, the cost of clustered machines depend on a variety of factors [Zom96]:

- Communication Cost - The time needed to communicate between nodes of a cluster. The higher the communication cost, the more time is needed for interchanging information.
- Expandability - The amount of hardware that needs to be added in order to include an extra component. In an ideal architecture, this cost is minimal.
- Fault Tolerance - The capability of the system to handle partial failures, such as losing a network connection or the malfunctioning of a node.
- Bandwidth - The amount of data that can be sent throughout the network per unit time.

2.3.2 Applications

Aside from being used to solve the previously mentioned optimization problems, distributed computing is recognized as an important vehicle for the solution of a variety of problems. These include, climate modeling, pollution dispersion, water management, pharmaceutical design, and others. All of these require a computing power greater than that which is obtainable with conventional computers [FWM94] [Zom95].

2.4 Grid Computing

2.4.1 Background

The term Grid was originally used as an analogy to terms like electrical power grid or other utility grids and, in fact, their architectures can actually be considered similar [Fos02]. In a power grid, di erent nodes are connected to each other to provide every end with electrical power; in a computing grid, nodes are connected to each other to provide computing power to every end.

The idea of grid computing evolved from the e ort of distributing computing resources in the 1980s and 1990s. In contrast to the parallel machines introduced in Section 2.3, a grid infrastructure is geographically dispersed, thus having high communication latencies. Applications that are successfully operating on a grid are likely to either inhibit minimal communication and data transfer between program components, or be able to tolerate high latency [BFH03].

The vision of a future computing grid, which is often quoted, is that computing resources will be as accessible and normal in everyday life as other utilities, such as water and electricity. It does, however, fall short in a deeper observation, since electric resources (i.e power plants) are concealed from the user and no matter where the power is used, it is always of the same kind and strength. In a computing environment however, it does matter which resource is being utilized, as performance greatly depends on the type of the machine.

2.4.2 Applications

Applications need a stable infrastructure to rely on. Unfortunately, the computing grid is still in constant development and therefore current applications are gen-erally developed in absence of mature technologies. Yet some projects have been successful and a few will be introduced in this section.

Some of the fastest growing application areas in grid computing are the Life Sciences. These are, among others, computational biology, bioinformatics, genomics and computational neuroscience. They all use the grid infrastructure to access, collect and mine data [BFH03]. Two examples are the Protein Data Bank [PDB04] project and the Biomedical Information Research Network [BIR04].

Another important eld is engineering applications. Utilizing the grid for this resource-intensive eld is a cost-e ective alternative to expensive local systems. One project that deserves mention is the NASA Information Power Grid [NAS04].

In the near future, data will be of even higher importance, and data-applications for the grid will be used to collect, store and analyze that data. The DAME project [DAM04] is an industrial project from the UK to analyze in- ight data gathered by operational aircraft engines to improve safety and decrease maintenance costs.

2.4.3 Future Prospects

Grid computing can in many ways still be considered in the beginning stages. Grid technology itself, as well as applications and middleware, will be subject to changes within the next decade. Eventually, if development of the grid continues, sensors, health monitors, PDAs and many other devices will be connected to the grid and linked to high performance supercomputers and vast databases. The result will be an unprecedented heterogenous network with great performance variations. In order for this to work, immense research has to be done and software standards have to be created. Once operational, the grid will play a major role in the availability of modern day information. Sensors and sensor-nets embedded in aircrafts, roads, clothing, etc. will provide an immense amount of data, and real- time analysis will play a greater role in public safety and health [BFH03].

2.5 Performance Evaluation

There are many ways to determine the performance of a distributed solution approach. A few of them are speedup, e ciency, and scalability.

2.5.1 Speedup

Speedup appears to be the most natural way to evaluate the performance of a distributed algorithm. It is determined by comparing the runtime of the distributed algorithm with that of a sequential one. There are two kinds of speedup that can be distinguished: Absolute and relative speedup.

Absolute speedup is de ned as [Fos95]:

illustration not visible in this excerpt

where T (1) describes the runtime of the best available sequential algorithm and T (p) the runtime of the distributed algorithm for p processors.

Alternatively, relative speedup can be used to measure the performance. Rela- tive speedup allows the comparison of the distributed algorithm to itself, performed on a T(1) machine [Fos95, ERL98]. This is signi cantly important, as the best available sequential algorithm for a certain problem is not always easy to deter- mine.

An ideal distributed algorithm achieves a relative speedup of p for p utilized processors. In reality, however, interprocessor communication time adds to the overall execution time, which leads to speedup values that are well below the ideal ones [Bar01].

2.5.2 E ciency

E ciency describes the ratio of speedup to available processors, and is determined as [Bar01]:

illustration not visible in this excerpt

The ideal value for the e ciency is 1. However, this value is usually unattainable due to communication and other losses during the process. It could only be reached if no processors would ever be idle or do any unnecessary work, and if interprocessor communication time would be negligible.

2.5.3 Scalability

Scalability is a very important measurement for determining the exibility of an algorithm. It describes how well a procedure performs when faced with di erent amounts of available resources. Scalability is measured by comparing the speedup or the e ciency values for various processor numbers [Fos95].

An ideal scalable algorithm has linear increasing speedup values and constant e ciency values. Since scalability depends on speedup and e ciency, the ideal scalability is subject to the same restrictions.


3.1 Introduction

Multistage Processes involve n ≥ 2 stages for the complete process to nish, with each stage yielding a result that will either be further processed in subsequent stages or withdrawn as nished [Tse03]. Each stage relies on the information provided by the last stage and is therefore naturally sequential. This chapter discusses di erent discrete processes of this kind and outlines their possible appli- cations. Two algorithms used for solving these problem classes are implemented and discussed in Chapter 5.

The analysis of a chess game can be seen as an example for a multistage process. To predict the next best move, the player has to evaluate all possible moves and their corresponding e ect on the the next move. An experienced human playing the game would rule out a signi cant number of these possibilities, thus rapidly downsizing the search space.

However, a computer playing the game needs to analyze every possible move and all of its subsequent e ects to determine which move is the best. The quality of a possible move is given by the overall outcome of the next move and all of its possible successors. During this evaluation, the process has to obey the rules of the game (constraints). It becomes quite obvious that the search space increases exponentially with every move, as more possibilities become available.

Therefore, multistage processes are expensive to solve computationally, because experience is di cult to simulate and the search space grows rapidly. The next section introduces an industrial example of multistage processes.

3.2 Discrete Multistage Manufacturing Processes

Multistage manufacturing processes are those which involve n ≥ 2 stages for the complete process to nish, where each stage is dedicated to a speci c task. A car assembly line is an example for this kind of a process [DCS00]. However, on a smaller scale, many parts for the car manufacturing process itself have to be produced in a multistage process. Thus, the production of a car can be broken

illustration not visible in this excerpt

Fig. 3.1: Illustration of a multistage manufacturing process [AM04]

down into many multistage processes that eventually yield one nal product, the car.

As this example illustrates, multistage manufacturing processes are of tremen- dous importance for modern day engineering. E ciently optimizing these pro- cesses promises a great reduction in production costs, and is a task that is yet to be accomplished. According to [FM99], current procedures for accomplishing de- sign tasks, such as selecting the number of stages and determining the conditions for each operation, produce feasible solutions that are often far from optimal. Some of the challenges that make optimizing these processes so di cult are discussed in Section 3.3.

Figure 3.1 illustrates a multistage production process, where each stage is dedicated to a speci c task. It shows the forming process of a sheet metal, which is inserted on the right side of the model. The sheet metal is processed in six stages until the nal product is yielded on the left side.

According to [GBG03], multistage problems in engineering can be formulated as optimization problems with objective and/or constraint functions given implicitly by a numerical simulation. This implies that this problem class can be theoretically solved using the automated simulation-based optimization approaches introduced in Section 2.2.

3.3 Challenges

The main reason multistage optimization problems are so di cult to solve is because they deal with a variety of factors which need to remain exible. According to [GBG03], in simulation-based optimization these factors are:

- The number of stages involved is not xed and may be subject to change, such as an increase might be necessary i.e. to maintain feasibility, or a decrease i.e. to reduce production costs.
- The result of a stage directly or indirectly in uences the values of subsequent stages.
- Evaluating the success of a process often requires complex simulations.

A commonly used technique for solving multistage problems is to select a discrete number of stages and then determine the outcome for parameters based on expert knowledge. This procedure is repeated until a reasonable and feasible result is reached. The vast amount of possible solutions prevent nding an optimal solution without using sophisticated automated algorithms.

However, most dynamic programming techniques fail to work e ciently on a multistage problem due to the variable number of stages. The natural way to solve a problem (using common techniques), where the number of stages also has to be optimized, is to keep the number of stages xed to a certain amount and treat each of the stage amounts as a separate problem.

An additional di culty in solving these problems is the possibility that optimal decisions in stage n might not be optimal in terms of optimizing the entire process.

Two approaches for solving this problem class are implemented in this thesis and are discussed in Chapter 5.


4.1 Introduction

This chapter will discuss di erent solution approaches that can be used for nonlinear optimization. A general background will be provided about direct search, evolutionary, and mixed integer approaches, which are all among the class of derivative-free approaches. The two derivative-free approaches Scatter Search, an evolutionary one, and Branch and Bound, a discrete one, are thoroughly introduced and implemented in Chapter 5.

In [Bar01], direct search approaches, in contrast to gradient methods, are described as being based solely on the evaluation of the objective function values. In [Wri95], direct search methods are applied to problems where the objective function has the following characteristics:

- No derivative information is available.
- The approximation of the gradient is too expensive computationally.
- The determination of the objective function value is very time-consuming.

An important eld for direct search approaches is the class of simulation-based optimization problems (see Section 2.2), where the evaluation of the objective function is expensive and is done in a black box.

In the following sections, a brief introduction will be given to a selection of important approaches in the eld of direct search methods.

4.2 Simplex Approaches

The Simplex Method was originally proposed in [SH62] and is also the basis of the well known approach from Nelder and Mead [NM65]. The simplex methods are based on an initial design of k + 1 trial points, where k is the number of variables. A geometric gure of k + 1 trial points in a k-dimensional space is called a simplex.

A new variable is calculated by a re ection into the variable space opposite the worst solution or by contraction within the simplex. This new trial point replaces the least favorable trial in the simplex, thus producing a new least favorable trial. To prevent loops, the simplex approaches incorporate a rule that states that a simplex is never to return to previously rejected levels.

illustrates a re exion step with a simplex method.

Fig. 4.1: Re exion step with a simplex method in two-dimensional space

Simplex methods are generally designed for unconstrained optimization. How- ever, Box proposed an approach called Complex for constrained optimization. This approach incorporates a repair function for infeasible trial solutions. Please see [Box65] for further details on this method. The Complex method is the basis of the multidirectional search method, which extends the Complex approach for parallel execution. Similar approaches have been discussed in [Bar01, DT94, Gra84].

4.3 Evolutionary Approaches

Interest in approaches that solve problems based on evolution has greatly increased during the last 4 decades. Such approaches maintain a population of individual solutions which are all unique and have a method to generate new solutions that are superior to the existing ones. Furthermore, they incorporate a selection function that is usually based on the quality ( tness ) of the solutions.

In [Mic96], evolutionary approaches are described as being probabilistic algo-

rithms which maintain a population of individuals, P (t) = xt1,

···,xtn for iteration

t. Each of these individuals poses a solution to the problem at hand, and is eval- uated to give an indication about its tness. In each subsequent iteration (t + 1), a new population is formed by selecting the tter individuals and having them undergo transformations. These transformations are performed in many di erent ways. The most common are mutation, where a new solution is formed by ran- domly changing small parts of a single solution, and crossover, where two or more solutions are combined to form a single new one. Other techniques are the linear combination of solutions, where two or more solutions are combined to several solutions linear to the original ones, and the linking of solutions, where moves are performed to bring two or more solutions closer together while generating new ones along the way.

After a number of generations, the approaches converge, leaving the ttest individual as the best solution to the problem.

Scatter Search

The meta-heuristic Scatter Search was rst proposed by Glover in [Glo77]. Scatter Search, which is also an evolutionary approach, originated from strategies for creat- ing surrogate constraints and composite decision rules as described in [CDG99]. It uses strategies for search diversi cation and intensi cation that have been proved e ective in a variety of optimization problems (i.e. [Gra98, Gre99, RT95]).

The differences between Scatter Search and other evolutionary procedures such as genetic algorithms, are that Scatter Search provides unifying principles for joining solutions based on generalized path construction and the use of strategic designs instead of randomization. Further advantages come from intensi cation and diversi cation mechanisms that exploit adaptive memory.

Each of the Scatter Search elements can be implemented in a variety of ways and varying degrees of sophistication. This section will introduce basic Scatter Search designs as well as its parallel execution capabilities. A fully distributed Scatter Search implementation for solving nonlinear problems can be found in Section 5.3.

4.4 Mixed Integer Approaches

Mixed integer approaches represent a powerful framework for modeling many optimization problems that involve discrete as well as continuous variables. In [Gro02], two main categories of mixed integer approaches are distinguished: the "Mixed Integer Linear Programming" (MILP) and the "Mixed Integer Nonlinear Programming" (MINLP) technique.

A basic MINLP problem can be described as:

illustration not visible in this excerpt

where x and y are the discrete and continuous variables. The set X is assumed to be a convex, compact set, and Y is the enumerable set of discrete values.

While MILP techniques have been available and applied for over twenty years (see [JNS00] for an overview), MINLP codes have only recently been developed and applied to optimization problems. Since the focus of this thesis is on nonlin- ear problems, only the MINLP techniques will be further examined in this section. The discrete variables involved in these approaches are generally categorical, which means that they always have to take on values of a predetermined set. One tech- nique for solving MINLP is the Branch and Bound approach, which produces branches with di erent categorical variables and employs an NLP solver for opti- mizing the continuous variables at each node. An approach for solving multistage problems that are based on this principle is implemented and discussed in Section 5.4.

The Mixed Variable Pattern (MVP) technique, proposed in [AD00], combines unique ideas for decreasing the size of the search space with the well established "Generalized Pattern Search" (GPS) approaches for NLP. The GPS class described in [Tor97] uni es a wide class of useful derivative-free algorithms for unconstrained optimization. An extension for constrained optimization can be found in [LT99]. Although GPS itself is capable of solving nonlinear optimization problems, within the MVP approach it merely serves the purpose of an NLP solver. A GPS solver generates a sequence of iterates xk in ℜn with decreasing objective function values. MVP splits these optimization iterations up into three steps:

1. Search Step (GPS, MVP)

2. Poll Step (GPS, MVP)

3. Extended Poll Step (MVP)

The search step performs a global exploration of the variable space, while allowing the use of surrogate objectives and constraints (when available) to predict favorable areas. The purpose of this step is to narrow the search space, since the direct application to the actual problem is many cases prohibitively expensive. Search steps generally select candidate points that are independent of the incumbent.

When the search step fails to nd an improved solution, the poll step is applied. The poll step is a local exploration near an identi ed incumbent point xk, and ensures the convergence of the process.

In the case of discrete and continuous variables, an extended poll step is applied if both of the above procedures fail to produce an improved point. It searches the neighborhood of the discrete variables and, if necessary, adjusts the continuous variables [Abr04].

If at any point during these steps a feasible point xk+1 = xk is discovered which poses an improvement to preceding solutions, the iteration is complete and the process is reiterated [AD04a].

An example of an applied mixed integer approach is the "Mixed Integer Variable Algorithm Model" (MIVAM) proposed in [AD04b]. It iterates over a given problem by setting parts of the solution vector as xed, and performing a continuous search over the other parts. After each iteration, the discrete variables are newly determined until a termination criterion is reached.

Branch Bound Approach

Branch and Bound is a heuristic that works on the idea of successive partitioning of the search space [MF00]. It needs a lower bound on the cost for any particular solution (or an upper bound - depending on the problem). Therefore, the pro- cedure is: for a solution xa the objective function value can be determined and compared with the lower bound. If it is above that threshold, the branch can be pruned away and the optimization can concentrate on intensifying the other branches.

It is apparent that this process can only be applied to discrete values, as branches have to be distinguishable from each other and enumerable. Therefore, most applications for the Branch and Bound approach published in the last 2 decades are of this kind. However in recent years, strategies have been developed to solve nonlinear problems (NLP) with Branch and Bound and with the other mixed integer approaches that were previously mentioned.

A fully distributed Branch Bound implementation for solving multistage nonlinear problems can be found in Section 5.4.


5.1 Introduction

This chapter discusses the Scatter Search (ScS) and Branch and Bound (BB) approaches, which are implemented in this thesis. Each section discusses details of the procedures as well as details of the corresponding implementations. Both approaches will also be discussed in the context of distributed multistage optimization that was introduced in Chapter 3.

All implementations are made in JAVA v1.4. They are entirely focused on distributing computing needs. Therefore, both the Scatter Search as well as the Branch and Bound implementation need connectivity in the form of an interface to the distributing environment and the problem evaluator. The implementation provided with this thesis provides an environment that allows testing of the performance of the proposed algorithms in real and in virtual time. Test results for two di erent problems can be found in Chapter 6.

5.2 Implementation Structure

The implementation is organized so that it is easily extendable and scalable. Each algorithm will only receive the information that is necessary to perform the opti- mization. The problems and the corresponding constraint handlers are not visible to the algorithms and are only approachable through the IDistribute interface, which serves as a link to the distributing environment and the problem classes.

The algorithms themselves implement the interface IAlgorithm, which simpli es the adding of algorithms to the infrastructure. The main method can be found in the Startup class, which employs a console to start the algorithms. The Startup class implements the IStartup interface.

Optimization problems implement the IMathProblem interface, which allows the addition of other problems to the structure. These problems are administered and evaluated by the MathSolverExecutionHandler class which contains informa- tion about solving the problem and provides connectivity to the constraint handlers (IConstraintHandler ).


Excerpt out of 83 pages


Investigating Distributed Approaches for Solving Discrete, Multistage Optimization Problems
University of Siegen
Catalog Number
ISBN (eBook)
ISBN (Book)
File size
2238 KB
Investigating, Distributed, Approaches, Solving, Discrete, Multistage, Optimization, Problems, Scatter Search
Quote paper
Christian Kutsch (Author), 2004, Investigating Distributed Approaches for Solving Discrete, Multistage Optimization Problems, Munich, GRIN Verlag,


  • No comments yet.
Read the ebook
Title: Investigating Distributed Approaches for Solving Discrete, Multistage Optimization Problems

Upload papers

Your term paper / thesis:

- Publication as eBook and book
- High royalties for the sales
- Completely free - with ISBN
- It only takes five minutes
- Every paper finds readers

Publish now - it's free