## Discrete optimization problems. Heuristic algorithms.

### Abstract

In this lecture we present a survey of heuristic (approximate) methods of discrete optimization: the greedy algorithm, hill climbing, tabu search, and beam search. The notion of a neighborhood of problem solutions is also analyzed.

### Optimization problems

Let us recall a definition of discrete optimization problem:

Let X be an arbitrary finite set (the state space) and let f: X -> R be a function on X (the goal function). Then by an optimization problem we mean the problem of finding an element x0 in X such that (depending on the detailed formulation):

• f(x0) = max( f(x) ) over all x in X
• f(x0) = min( f(x) ) over all x in X

Every discrete and finite optimization problem may be solved by enumerative search through all possible solutions (all elements of the state space X). However, for many real-sized problems this is practically impossible. Thus we need approximate algorithms which give a satisfactory result in a short time. Such methods are called heuristics (heuristic algorithms).
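For tiny instances, exhaustive search is perfectly feasible. A minimal Python sketch for a toy item-selection instance (the weights, values, and capacity below are illustrative, not from the lecture):

```python
from itertools import product

# Exhaustive search over a tiny item-selection instance: the state space
# is all subsets of items, the goal function is the total value of a
# subset whose total weight fits the limit M.
weights = [3, 4, 2, 5]
values  = [4, 5, 3, 6]
M = 7

best_value, best_subset = 0, ()
for subset in product([0, 1], repeat=len(weights)):   # all 2^n subsets
    w = sum(wi for wi, s in zip(weights, subset) if s)
    v = sum(vi for vi, s in zip(values, subset) if s)
    if w <= M and v > best_value:
        best_value, best_subset = v, subset

print(best_value, best_subset)   # 9 (0, 0, 1, 1)
```

Since the loop inspects all 2^n subsets, even n around 40 would already be far out of reach in practice.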

We start with some exemplary optimization problems. They all have simple formulations and high practical importance, yet no fast exact algorithms are known for them.

Example: Traveling salesman problem (TSP).

Let G be a graph whose edges are labeled with lengths. Find the shortest path going exactly once through all vertices of G (if one exists).

State space: all possible paths going exactly once through all vertices. Size of the state space: at most n!, where n is the number of vertices.

Goal function: total path length. Our aim is to minimize the goal function.

The importance of good solutions of this problem is obvious for transport companies. Some production optimization tasks may also be reduced to TSP (e.g. designing the route of a welding robot arm).

Example: Binary matrix covering problem.

Let a binary matrix A = {aik} be given. Find a smallest subset B of columns such that every row i of A contains an entry aik = 1 in some column k belonging to B.

State space: all possible column coverings of A. Size of the state space: at most 2^n, where n is the number of columns.

Goal function: the size of B.

This problem arises when we have a set of items with various features and have to select the smallest subset of items that together possess all needed features. E.g. our company may sign contracts with many suppliers offering various goods, and we want to sign as few contracts as possible while covering all our needs.

Example: Knapsack problem.

Suppose we have n items, each labeled with a weight mi and a value wi. Select a subset of items of the highest total value that may be carried in a knapsack of capacity M (the maximal total weight of the items).

State space: all possible subsets of items. State space size: 2^n.

Goal function: total value of items in knapsack (assuming the weight limit is not violated).

### Greedy algorithm

General principle: the greedy algorithm builds the solution gradually, on each step trying to achieve maximal (local) benefit.

Example: Courses selection problem.

We have a schedule of n courses, each with a given start and finish time. Find a maximal subset of pairwise non-overlapping courses.

Greedy algorithm: on each step select the course which finishes earliest (excluding those which overlap with the previously selected ones). In this case the local benefit is the maximal amount of time left for further courses.

The greedy algorithm gives an optimal solution for this problem.
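The rule above can be sketched in Python (the course schedule below is hypothetical):

```python
# Greedy course selection: sort by finish time, then repeatedly take the
# earliest-finishing course that does not overlap the last selected one.
courses = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]

selected = []
last_finish = float("-inf")
for start, finish in sorted(courses, key=lambda c: c[1]):
    if start >= last_finish:          # no overlap with the previous choice
        selected.append((start, finish))
        last_finish = finish

print(selected)   # [(1, 4), (5, 7), (8, 11)]
```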

Example: Traveling salesman problem.

On each step go to the nearest unvisited vertex.
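A possible Python sketch of this nearest-neighbor rule (the distance matrix is illustrative):

```python
# Greedy (nearest-neighbor) heuristic for TSP on a distance matrix.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def greedy_tsp(dist, start=0):
    n = len(dist)
    path, visited = [start], {start}
    while len(path) < n:
        cur = path[-1]
        # go to the nearest not-yet-visited vertex
        nxt = min((v for v in range(n) if v not in visited),
                  key=lambda v: dist[cur][v])
        path.append(nxt)
        visited.add(nxt)
    return path

print(greedy_tsp(dist))   # [0, 1, 3, 2]
```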

Example: Matrix covering problem.

On each step add to the solution the column which covers the most not-yet-covered rows.
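A possible sketch in Python (the matrix A is illustrative):

```python
# Greedy covering of a binary matrix: repeatedly pick the column that
# covers the most not-yet-covered rows.
A = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

def greedy_cover(A):
    n_rows, n_cols = len(A), len(A[0])
    uncovered = set(range(n_rows))
    B = []
    while uncovered:
        # the column covering the most still-uncovered rows
        best = max(range(n_cols),
                   key=lambda k: sum(1 for i in uncovered if A[i][k] == 1))
        B.append(best)
        uncovered -= {i for i in range(n_rows) if A[i][best] == 1}
    return B

print(greedy_cover(A))   # [0, 1, 2]
```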

Example: Knapsack problem.

On each step we add to the knapsack the item with the highest value-to-weight ratio, provided the weight limit is not violated.
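A sketch in Python (item weights and values are illustrative). Note that on this very instance the greedy answer (total value 7) is worse than the optimum (value 9, achieved by the items of weights 2 and 5), which illustrates the suboptimality of the greedy method:

```python
# Greedy knapsack: take items in decreasing order of value-to-weight ratio,
# skipping any item that would exceed the capacity M.
items = [(3, 4), (4, 5), (2, 3), (5, 6)]   # (weight, value) pairs
M = 7

chosen, total_w, total_v = [], 0, 0
for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
    if total_w + w <= M:                   # weight limit not violated
        chosen.append((w, v))
        total_w += w
        total_v += v

print(total_v, chosen)   # 7 [(2, 3), (3, 4)]
```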

As we can see, the greedy method is simple, intuitive, and straightforward. On the other hand, it is in many cases suboptimal: a locally best choice may lead to a globally worse solution. It resembles a beginner chess player who analyzes the game only in terms of immediate benefits; such a player is easily trapped by a more experienced one.

### Hill climbing algorithm

The hill climbing algorithm does not build a solution step by step as the greedy algorithm does. Every step of the hill climbing algorithm deals with a complete solution (better or worse) of our optimization problem. Hill climbing means gradual improvement of a given solution of the problem.

General principle: start from a random solution of the problem, search through its neighbors and find one with the highest value of goal function ("climb the hill"), repeat until a local optimum is achieved (i.e. none of the neighbors has better value of f).

General scheme of the hill climbing algorithm:

```
x0 = Random(X)
do
    max = x0
    for x in N(x0)
        if f(x) > f(max) then max = x
    end for
    if max == x0 then break
    x0 = max
while(1)
```

The set N(x) above denotes the set of neighbors of a solution x. Note that this is neighborhood in the state space, i.e. a kind of similarity relation between complete solutions of our problem (the similarity criteria are defined by us). E.g. in the traveling salesman problem we should not connect the notion of neighborhood with the distance between vertices ("cities"), because a vertex is not a complete solution of TSP. Instead we should define a distance (and neighborhood) between full paths, e.g. by regarding two paths as neighbors if they differ by a transposition of two cities.
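The scheme above, combined with the transposition neighborhood for TSP, might be implemented as follows (function names and the random start are illustrative; since TSP is a minimization problem, we look for the neighbor with the smallest path length):

```python
import random

# Hill climbing for TSP: two paths are neighbors if they differ by a
# transposition (swap) of two cities.
def path_length(path, dist):
    return sum(dist[path[i]][path[i + 1]] for i in range(len(path) - 1))

def neighbours(path):
    # all paths obtained by transposing two cities
    for i in range(len(path)):
        for j in range(i + 1, len(path)):
            q = list(path)
            q[i], q[j] = q[j], q[i]
            yield q

def hill_climb(dist, rng=random):
    x0 = list(range(len(dist)))
    rng.shuffle(x0)                       # random starting solution
    while True:
        best = min(neighbours(x0), key=lambda p: path_length(p, dist))
        if path_length(best, dist) >= path_length(x0, dist):
            return x0                     # local optimum reached
        x0 = best
```

Calling `hill_climb` with a symmetric distance matrix returns a path that no single transposition can shorten; restarting from several random points is a common way to mitigate the dependence on the starting solution.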

The hill climbing method is easy to implement and fast. On the other hand, it easily gets stuck in local optima: it may find a solution much worse than the optimal one which nevertheless cannot be improved by a single local move. The final result also depends strongly on the starting point of the search.

### Tabu search

General principle of tabu search: as in hill climbing, we search the neighbors of the current solution and move to the best neighbor, but we exclude from the search a list of the k most recently visited solutions (e.g. k = 1000).

The difference between tabu search and hill climbing is that we allow "going down the hill", i.e. accepting a neighbor worse than the current solution (provided it is the best available neighbor). We can do this safely without looping: we cannot return to a solution analyzed a short while ago, because it is on our tabu list. Therefore we can escape from a local optimum and have a chance of finding a better one.
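A minimal sketch of the tabu mechanics in Python (the toy goal function and the parameters n, steps, k are illustrative; with f = sum there are no local optima, so the example only demonstrates the tabu list and the acceptance of the best non-tabu neighbor):

```python
from collections import deque
import random

# Tabu search over bit vectors: neighbors differ in one flipped bit.
# The tabu list remembers the k most recently visited solutions.
def tabu_search(n, f, steps=50, k=10, rng=random):
    x = [rng.randint(0, 1) for _ in range(n)]
    tabu = deque([tuple(x)], maxlen=k)
    best = list(x)
    for _ in range(steps):
        neigh = [x[:i] + [1 - x[i]] + x[i + 1:] for i in range(n)]
        # best neighbor not on the tabu list, even if worse than x
        candidates = [y for y in neigh if tuple(y) not in tabu]
        if not candidates:
            break
        x = max(candidates, key=f)
        tabu.append(tuple(x))
        if f(x) > f(best):
            best = list(x)
    return best
```

Note that we always remember the best solution seen so far, because the current solution is allowed to become worse.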

### Beam search

General principle: we build the final solution step by step (as in the greedy algorithm), but instead of keeping only the single locally best continuation (as in the greedy method), we keep k of them. At the next step we continue from these k partial solutions in parallel and again select the k best continuations. The method may be regarded as a parallel greedy algorithm restricted to the best k paths at every step.

Example: Traveling salesman problem for 7 cities (k=3).

1. Start from a random city (vertex).
2. Find k nearest cities.
3. Starting from each of them in parallel, check the distances to all unvisited cities. Keep the k partial paths for which the total length of the first two steps is minimal.
4. Repeat 3., remember the best k partial paths on each step.
5. On the last step, select the best of the k complete paths.

Beam search may be regarded as an improvement of the greedy algorithm: we do not simply select the locally best continuation but analyze several alternatives. At the same time, the limited size of the "search beam" preserves the time efficiency of the algorithm (we avoid exponential growth of the number of possible paths).
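The procedure above can be sketched in Python for a general distance matrix (the matrix and k are illustrative; ties between equally long partial paths are broken arbitrarily):

```python
# Beam search for TSP: keep the k best partial paths at each step.
def beam_search_tsp(dist, k=2, start=0):
    n = len(dist)
    beam = [(0, [start])]          # each entry: (length so far, path)
    for _ in range(n - 1):
        extensions = []
        for length, path in beam:
            for v in range(n):
                if v not in path:
                    extensions.append((length + dist[path[-1]][v], path + [v]))
        beam = sorted(extensions)[:k]   # keep the k shortest partial paths
    return min(beam)               # (length, path) of the best complete path

dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(beam_search_tsp(dist))   # (9, [0, 1, 3, 2])
```

With k = 1 this degenerates to the greedy nearest-neighbor algorithm; with k equal to the number of all partial paths it degenerates to exhaustive search.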

### The notion of the neighborhood

To use hill climbing or tabu search (or several other methods presented in the following lectures) we must define a notion of neighborhood on the state space X. For each x (a potential solution of our problem, i.e. an element of the state space) we define a finite set N(x) of its neighbors. The set N(x) may be defined arbitrarily, although in practice we should do it carefully, in connection with the optimization problem. The principle of all methods similar to hill climbing is to perform small, gradual changes to the solution, so the definition of neighborhood should reflect such small changes.

Moreover, the neighborhood of a solution should not be too numerous, otherwise the algorithms (which check all neighbors in a loop) become too slow. On the other hand, the neighborhood relation should be connected, i.e. it should be possible to reach any solution from any other by walking from one neighbor to the next. Otherwise our algorithms would not be able to search the whole state space and might miss an optimal solution.

Example: X is a plane (discretized with a given grid step, e.g. 0.01). As neighbors of a point x we may regard all points not further than 0.01 away in each direction.

Example: vertex covering problem.
Given a graph G, find the smallest set B of vertices adjacent to all edges of G. The state space X is the family of all potential coverings, i.e. all subsets of the vertex set. We may regard two solutions as neighbors when they differ by exactly one vertex. Note that we are talking about whole solutions, not single vertices of G!
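For illustration, this one-vertex-difference neighborhood can be generated as follows (a hypothetical sketch; `cover` is a set of vertex indices):

```python
# Neighborhood for the vertex covering problem: two subsets of vertices
# are neighbors when they differ by exactly one vertex (added or removed).
def neighbours(cover, n_vertices):
    for v in range(n_vertices):
        yield cover ^ {v}     # symmetric difference: add v or remove it

for n in neighbours({0, 2}, 4):
    print(sorted(n))          # [2], [0, 1, 2], [0], [0, 2, 3]
```

Note that a neighbor obtained this way need not be a valid covering; the goal function (or an additional validity check) has to penalize subsets that leave some edge uncovered.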

### Summary and references

The lecture was devoted to optimization techniques, which are based on one of the following principles:

Greedy principle: build a solution gradually, achieving as much as possible on each step.

Neighborhood search: define a neighborhood relation (i.e. which solutions we will regard as similar) and search through the state space by "jumping" from one solution to another (the better one, if possible).

References:

• T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein. Introduction to algorithms. MIT Press/McGraw-Hill, 2001.

### Glossary

See the text above for detailed definitions.

hill climbing - algorithm which improves a given solution by searching its neighbors
greedy algorithm - algorithm which builds a solution gradually, with the best (local) components
goal function - a real-valued function which we have to optimize
tabu search - a modification of the hill climbing algorithm which allows escaping from local optima
state space - a set of all possible solutions of the optimization problem
beam search - a modification of the greedy algorithm which allows checking many paths in parallel
neighbor - two different solutions of the optimization problem are regarded as neighbors if they meet some (arbitrarily defined) similarity criteria
optimization problem - a task of finding minimal or maximal value of a goal function on a discrete set (state space)

### Exercises

6.1. Find an example of traveling salesman problem (an instance of the distance graph), for which a greedy algorithm will fail to find an optimal path.

6.2. Solve the knapsack problem with the hill climbing method (i.e. define a neighborhood and a goal function).