
Imagine you were a salesman who had to call on a number of customers in different towns. The sooner you can finish your calls the sooner you can get home so, obviously, you will want to keep the travel between your customers as short as possible. Ignoring the problem of getting drawn into long conversations on route finding the shortest route between all the customers is what you need to do to get home as soon as possible. Let's say for simplicity that there is one customer that you have to visit in each town and you set out from home in the morning and must return there in the evening.
How would we find the shortest route that starts and ends at home? Well, if we had a computer we could tell it the distances between each pair of towns and let the computer generate every possible ''tour'' and print out the best one at the end.
So if we had towns A, B and C to visit then the first tour the computer might try could be ''start out from home (H) and visit A; from there go to B; then visit C and and then return home.'' We could abbreviate this tour by writing it ''HABC'' with the assumption that once the last town is visited (C) return back to the start, H. What are the other possibilities? Well, we could have ''HBAC'' or ''HCAB.'' There are 6 possible tours and the complete list is:
Let's see for a moment how you would calculate how many tours there can be. The first "town" is always home so that doesn't add or take from the number of tours. We could place any of the 3 towns in the second position and having decided on a choice we could place either of the remaining towns in the third position. We have no choice for the fourth and last position because there is only one town left unused. The number of ways of selecting the towns that go in the second and third positions, therefore, is 3 x 2 x 1, or 6 ways. This figure is consistent with our "exhaustive" listing of the tours above.
Making the problem a bit more general now, how many tours are there on n towns and our home base? Well for starters we'll ignore the home base since it is simply tacked on to the front of every tour. So what we want to find is the number of complete orderings or, permutations, of towns. As we reasoned before, any one of the towns can be put in the first position of the ordering, meaning we visit that town immediately when we leave our base. Now no matter what town we pick as the first choice, our second choice could be any one of the remaining n-1 cities. Therefore, there are n x (n-1) ways of choosing the first two towns of the tour. Similarly, no matter which of these n x (n-1) choices for the first two slots we make we can put any one of the remaining n-2 choices in the third position.
This pattern continues down along until we get to the last slot where there will be no choice since there will be only a single town left. The complete count of the n permutations is, therefore n x (n-1) x...x 2 x 1. You may know this by its more usual abbreviation n!
So our strategy for finding the shortest tour is to generate, in turn, each of these n! tours, adding the distance contributed by each link of the tour to the total for that tour and saving the tour with the lowest total. So for a salesman who has to visit the county towns of say, the 32 counties of the island, the total number of ways he could go about visiting them would be 32! A perfect job for a computer I hear you say: simply let it look at each of the possible tours and tell us the best one it sees. Well, maybe not.
The trouble is that the quantity n! grows at an alarmingly fast rate and very quickly the computer cannot possibly keep up with its explosive growth. Suppose we are able to look at a table of distances for the distance between any two towns. Now given a particular tour comprising say, 20 towns, we would need to add up the 20 distances in the tour in order to compute its length.
Now suppose we had a computer that could perform 1 billion additions per second. That's a lot higher than anything that you or I would be able to get our hands on today but let's just say we have one that can. Since it needs 20 additions to compute one tour, in one second our computer can compute the total lengths of 1 x 109, or 50 million tours. In scientific notation this is written as 5 x 107 . But, believe it or not, 20! (the total number of tours we will need to look at) is a staggering 2,432,902,008,176,640,000. This is roughly 2.4 x 1018 in scientific notation. At 5 x 107 tours per second it would take our lightning fast computer 2.4 x 1018 /5 x 107 or 4.87 x 1010 seconds to look at every single one. With 86,400 seconds in a day the number of days required to find the best tour exhaustively would be 563,172 days or 1,542 years. Now if we just wanted to add one more town so that we now had 21 towns to start with the answer would take 21 times longer.
So what are we to do? We have what seems like a very reasonable request namely, to find the shortest tour of Ireland's 32 county towns, yet a computer seems unable to produce the exact answer for even 20 towns. What we can do to deal with this problem is to resort to a heuristic solution. A heuristic is like a rule of thumb. It is a trick that will get you a good answer much of the time but by no means an exact answer all the time. The heuristic we will use here is one that, in one form or another, is useful for quite a large variety of problems. What we do is construct any tour we like and record its length. We then look at every pair of links of the tour and ask what would be the effect if we removed those two links and replaced them by their alternative (there will be only one alternative that will lead to a proper tour). If the tour that results from replacing a pair of links by their alternative yields a longer tour then we return the tour to what it was and we consider another pair of links for replacement.
If, on the other hand, the resulting tour was shorter then we keep it and begin the search from it in the hope that we will find another improvement somewhere amongst its links. Figure 1 illustrates a successful link replacement where the length of the new tour is improved. Note that the section of tour that gets isolated as a result of the link removals get reversed in the new tour. In the original tour the sequence of visitations was "..dghef.." but after the link replacements the sequence became "..dehgf..". Some times the order that we visit the towns in is important so we may have to discard link replacements if they cause towns to be visited in the wrong order. When we generate a tour in this way that cannot be improved upon we stop, declaring this tour to be "locally optimal." That is, this tour is optimal with respect to all the local changes (break 2 links and replace) that we could try..
The heuristic we have described is called "local optimization", because we are optimizing the tour by making small or local changes to it. Although it runs quite fast, people blame its short running time on it not being able to find many improvements for a given tour. To help improve matters what we could do would be to run it a number of times giving it a different, maybe random, solution to start with each time. We could then pick the best of the locally optimal solutions that the heuristic returns.
A more elaborate approach is to break more edges of the tour and reconnect the pieces. For example, we could break 3 edges and from the 3 "strings" that result we could piece the tour back together in different ways. If any of the different reconstructions improve things we're off again. This will usually return solutions that are better than the 2-opt approach that we talked about above but it runs a lot slower.
For the types of problems that researchers work on nowadays - approximately 1 million cities - 3-opt is unreasonably long-running in spite of the better solutions that it gives. You may be wondering why on earth anyone would be concerned with a 1 million-city travelling salesman problem. Well, while that problem may not arise in practise one almost identical to it does. During the fabrication of complex electronic circuits it is necessary to "zap" many points on the circuit with a laser to break a connection that might have previously existed. The faster we can zap all the points the faster we can turn out our electronic circuits. Finding the shortest path for the laser between the points is the same problem as what we have seen above.
It is quite common for this situation to arise in research: a problem that arises in one area of research may have been examined and solved in some other area earlier. This is an important reason for studying what seem at first to be problems with no practical importance. To paraphrase Frankie Byrne, the agony aunt of the Jacobs Biscuits programme on radio: "The problems we investigate may not be yours today, but they could be some day."
Patrick Healy is a Lecturer in Computer Science and Information Systems. His research interests are in Combinatorial Optimization and Graph Drawing.
