# IOI'94 - Day 2 - Solution 2: The Buses

## [ Introduction ] [ Problem Statement ] [ Test Data ]

## Problem Analysis

### Bus routes

A bus route is characterized by the time of its first arrival at the bus stop (in minutes after 12:00) and its interval (number of minutes between successive stops). These two numbers determine how often a bus route stops at the observed bus stop from 12:00 to 12:59. Because this number of stops plays an important role, it will be included with the other information to describe a bus route. Here is the definition of type`BusRoute`:

type BusRoute = record first : 0..29; interval: 1..59; { in fact, first < interval <= 59 - first } howoften: 2..60; { howoften = 1 + (59 - first) div interval } end;The lower bound on

`first`is zero by definition. The upper bound and the bounds on

`interval`are less straighforward and we will now explain them.

Observe that the problem statement implies
`first < interval`,
since buses are known to arrive *throughout the entire hour*.
If `first >= interval`, then there would have been
a stop earlier than `first` at time `first-interval >= 0`,
which contradicts the definition of `first`.
Also observe that the second bus arrives at time `first+interval` and
since buses are known to stop at least twice we therefore have
`first + interval <= 59`.
This explains the bounds on `interval`.
The upper bound on `first` can be obtained by adding the
inequalities for `first`:

first < interval first <= 59 - interval ------------------------ + 2*first < 59from which we infer

`first <= 29`. The number

`howoften`is determined by the two inequalities:

first + (howoften-1)*interval <= 59 first + howoften *interval > 59These can be rewritten into

59-first - interval < (howoften-1) * interval <= 59-firstfrom which we infer

`howoften = 1 + (59-first) div interval`. So much for the bounds. Question: How many bus routes are there?

Here is procedure for writing a bus route in a graphically appealing way. It is useful during devopment and will help understand the problem better.

procedure GraphBusRoute(var f: text; b: BusRoute); var i: integer; begin with b do begin write(f, 1:first+1) ; i := first + interval ; while (i <= 59) do begin write(f, 1:interval) ; i := i + interval end { while } ; write(f, ' ':62-i+interval) ; writeln(f, '[', first:2, ',', interval:2, ',', howoften:2, ']') end { with } end { GraphBusRoute } ;

`GraphBusRoute`writes a tally for each arrival of the bus route. The locations of the tallies on the output line correspond to the arrival times. At the end of the line the parameters are written. The three bus routes in the schedule appearing in the problem statement are shown by

`GraphBusRoute`(three calls) as follows (the first two lines with time labels are produced by

`WriteTimes`, which is too simple to include here):

000000000011111111112222222222333333333344444444445555555555 012345678901234567890123456789012345678901234567890123456789 1 1 1 1 1 [ 0,13, 5] 1 1 1 1 1 [ 3,12, 5] 1 1 1 1 1 1 1 [ 5, 8, 7]

### The input data

The input data is a sorted list of arrival times, possibly containing duplicates. These are most conveniently stored by counting for each arrival time how often it occurs. For this purpose we introduce variables`s`and

`a`:

var s: integer; { s = # unaccounted arrivals = sum a[0..59] } a: array[0..59] of integer; { a[t] = # unaccounted arrivals at time t }Procedure

`GraphUnaccounted`(listing not included) shows the arrival times in the same format as

`GraphBusRoute`, except that now the tallies may take on values from 0 upward (0 is displayed as a space, and numbers above 9 are displayed as letters from A upward). The input for the example with 17 arrival times in the problem statement would be written as

000000000011111111112222222222333333333344444444445555555555 012345678901234567890123456789012345678901234567890123456789 1 1 1 2 1 1 11 1 1 2 1 111 total = 17Compare this to the graphs of the bus routes shown above. The three rows of the bus routes nicely add up to the row of unaccounted arrival times in the input.

The input is read from file `inp` by procedure `ReadInput`:

procedure ReadInput; { read input into s and a } var i, j: integer; begin if Test then writeln('Reading input') ; readln(inp, s) ; if Test then writeln('Number of stops = ', s:1) ; for i:=0 to 59 do a[i] := 0 ; for i:=1 to s do begin read(inp, j) ; inc(a[j]) end { for i } ; readln(inp) ; if Test then begin GraphUnaccounted ; writeln end end { ReadInput } ;

The following function `Fits` determines whether a given
bus route `b` fits with the arrivals `a`,
that is, whether all stops of `b` occur in `a`:

function Fits(b: BusRoute): boolean; { check whether b fits with a, that is, all arrivals of b occur in a } { global: a } var i, j: integer; begin with b do begin i := first ; j := 60 ; { bounded linear search for earliest a[first + k*interval] = 0 } while i < j do if a[i] <> 0 then i := i+interval else j := i ; Fits := (i >= 60) end { with } end { Fits } ;

### Finding candidate bus routes

We will first make a list of all bus routes that fit with the arrival times in the input. These are called candidate bus routes. Observe that the total number of possible bus routes equals the number of pairs`(first,interval)`with

`0 <= first <= 29`and

`first+1 <= interval <= 59-first`, which is 59+57+...+3+1 = 60*30/2 = 900.

Candidate bus routes will be stored in global array `c`
and counted in integer `n`:

var n: integer; { # candidate bus routes } c: array[0..899] of BusRoute; { c[0..n-1] are candidate bus routes }Procedure

`FindBusRoutes`determines the candidate bus routes that fit with the given arrival times

`a`:

procedure FindBusRoutes; { post: c[0..n-1] are all bus routes fitting with a } { global: a, n, c } var f, i: integer; begin if Test then begin writeln('Finding candidate bus routes') ; WriteTimes end { if } ; n := 0 ; for f:=0 to 29 do begin if a[f] <> 0 then begin for i:=f+1 to 59-f do begin with c[n] do begin first := f ; interval := i ; howoften := 1 + (59 - f) div i end { with c[n] } ; if Fits(c[n]) then begin { another candidate } if Test then GraphBusRoute(c[n]) ; inc(n) end { if } end { for i } end { if } end { for f } ; if Test then writeln('Number of candidate bus routes = ', n:1) end { FindBusRoutes } ;Procedure

`FindBusRoutes`is quite straightforward. As usual we have included some diagnostic output. A few things might need further explanation.

First of all, the check `if a[f] <> 0` was inserted
to cut off impossible bus routes as early as possible
(otherwise, for values of `f` with `a[f]=0`,
all possible values of `i` would be tried in vain).

Second, one might be tempted to optimize a little more.
For instance,
the computation of `howoften` could have been done only
if `Fits(c[n])` turned out successful.
Also, the function `Fits` could be adapted to exploit
the fact that `a[f] <> 0` was already tested,
by starting `Fits` with `i := first+interval` instead of
`i := first`.
The main reasons for not doing so are that these changes do not
speed up things considerably (try it),
and that they may complicate later uses or changes of `Fits`
(such as using `howoften` in `Fits`).

As an example of `FindBusRoutes` consider the
diagnostic output produced when reading
the example input from the problem statement
and finding the candidate bus routes.
The 17 stops of this input give rise to 42 candidate bus routes,
of which only eight stop more than twice.

Here is an overview of the number of candidate bus routes in each test:

test number of number of case arrivals candidates ---- -------- ---------- 0 17 42 1 12 24 2 44 237 3 43 375 4 31 136 5 40 201 6 70 365

### Finding schedules

A schedule can be described as a list of bus routes (at most 17 according to the problem statement):type Schedule = array [0..16] of BusRoute;A schedule is written by the following procedure:

procedure WriteSchedule(var f: text; sc: Schedule; len: integer); var i: integer; begin for i:=0 to len-1 do with sc[i] do writeln(f, first:2, ' ', interval:2) ; if Test then writeln(f, '-----') end { WriteSchedule } ;Using the candidate bus routes we can do straighforward

*backtracking*to determine bus schedules that exactly account for the given arrival times. We are only interested in a bus schedule with as few bus routes as possible. For that purpose we introduce some global variables:

var t: longint; { # schedules found so far } freq: array [1..17] of longint; { freq[p] = # schedules with p bus routes } p: integer; { # buses in partial schedule so far } m: integer; { # buses in best schedule so far } sched: Schedule; { sched[0..p-1] is schedule so far } best: Schedule; { best[0..m-1] is best schedule so far }Variables

`t`and

`freq`are for diagnostic purposes only.

Note that, according to the problem statement, the order of bus routes in a schedule is irrelevant (``the order of the bus routes does not matter'') and that a bus route may occur more than once (``buses from different routes may arrive at the same time''). To avoid duplication of work we will put bus routes in a schedule in the same order as they appear in the list of candidates and we allow multiple occurrences of the same bus route.

The state of the backtracking process is captured by the variables
`s`, `a`, `p`, and `sched`.
The bus routes `sched[0..p-1]` account for part of the arrival times,
and the unaccounted arrival times are represented by `a`
(and `s`).
Procedure `Use` extends the current partial schedule with
a given bus route and updates the state variables:

procedure Use(b: BusRoute); { global: s, a, p, sched } var i: integer; begin sched[p] := b ; inc(p) ; with b do begin i := first ; while (i <= 59) do begin dec(a[i]) ; i := i+interval end { while } ; s := s - howoften end { with } ; if Trace then begin WriteSchedule(output, sched, p) ; GraphUnaccounted(output) end { if } end { Use } ;Similarly, procedure

`RemoveLast`removes the last bus route that was used in the current partial schedule:

procedure RemoveLast; { global: s, a, p, sched } var i: integer; begin dec(p) ; with sched[p] do begin i := first ; while (i <= 59) do begin inc(a[i]) ; i := i+interval end { while } ; s := s + howoften end { with } end { Remove } ;The recursive procedure

`FindSchedules`generates all schedules (with at most 17 bus routes):

procedure FindSchedules(k: integer); { global: s, a, n, c, p, sched, m, best, t, freq } { Find all schedules sched[0..r-1] with p <= r <= 17 such that bus routes sched[0..p-1] are as given, sched[p..r-1] accounts for a and uses only bus routes from c[k..n-1] } begin if s = 0 then { nothing left to account for } ScheduleFound else if p = 17 then { too many bus routes: ignore } else { try each candidate c[k..n-1] that fits } while k < n do begin if Fits(c[k]) then begin Use(c[k]) ; FindSchedules(k) ; RemoveLast end { if } ; inc(k) end { while } end { FindSchedules } ;

`FindSchedules`is called as follows in procedure

`ComputeAnswer`:

procedure ComputeAnswer; begin FindBusRoutes ; if Test then writeln('Finding schedules') ; for p:=1 to 16 do freq[p] := 0 ; t := 0 ; p := 0 ; m := 18 ; FindSchedules(0) ; if Test then begin writeln('Number of schedules = ', t:1) ; WriteFrequencies(out) end { if } end { ComputeAnswer } ;For the 17 arrival times in the problem statement,

`FindSchedules`produces 18 schedules, as shown by the diagnostic output. The shortest has three bus routes and is unique, the longest (of which there are three) has seven bus routes.

This method is incorporated in Program 1. It is too slow for the second test input with 44 arrival times. Program 1 quickly finds a schedule with four bus routes (the minimum) but then continues to look for (non-existing) improvements. Apparently there are many (longer) schedules for this test case.

## Program 2

One way of improving Program 1 is by cutting off the search for schedules in a `corner' of the search space where it is impossible to find improvements of the best schedule so far. More precisely, if`p`is the number of bus routes in the current partial schedule, then we will see to it that

`p < m`holds invariantly. If

`s=0`then we have a schedule that is also an improvement of the best schedule so far. If, however,

`s<>0`then we can stop searching in this corner if

`p+1=m`since at least one more bus route is needed to complete the schedule, and therefore it will never result in an improvement. Here is the adapted code:

procedure FindBestSchedule(k: integer); { global: s, a, n, c, p, sched, m, best } { Find all schedules sched[0..r-1] with p <= r < m such that bus routes sched[0..p-1] are as given, sched[p..r-1] accounts for a and uses only bus routes from c[k..n-1] } { pre: p < m } begin if s = 0 then { nothing left to account for } ScheduleFound else { try all candidates c[k..n-1] that fit } while (k < n) and (p+1 <> m) do begin if Fits(c[k]) then begin Use(c[k]) ; FindBestSchedule(k) ; RemoveLast end { if } ; inc(k) end { while } end { FindBestSchedule } ;Note that we have not written

else if p+1 = m then { too many bus routes: ignore } else { try all candidates c[k..n-1] that fit } while k < n do beginbecause

`m`may be changed inside the while-loop by the recursive call to

`FindBestSchedule`.

This idea is incorporated in Program 2. This program indeed only tries one schedule for input-1.txt and input-2.txt. However, on the other input files it still takes (too) long.

## Program 3

Yet another idea that might help improve performance is based on reordering the list of candidate bus routes. For Program 1, the order of the candidate bus routes does not matter, since this program generates*all*schedules. Using a different order of candidate buses simply means that all schedules are found in a different order.

Program 2 might benefit from another order for the candidates, because if it finds a good schedule early on, then the built-in cut-off mechanism is more efficient. Since we are interested in schedules with the fewest bus routes, each bus route in the schedule should account for as many arrivals as possible. Therefore, we might first try bus routes that make many stops.

Program 3 sorts the candidate bus routes
on `howoften` as they are found.
The diagnostic output for the input
in the problem statement shows the sorted
list of 42 candidate bus routes.
Sorting turns out to make only a small difference.
Program 3 still takes too long for input-3.txt.

## Program 4

Programs 2 and 3 aimed at reducing the number of schedules investigated. We can also directly aim at reducing the number of*partial*An adapted procedure

`FindBestSchedule`is here:

procedure FindBestSchedule(k: integer); { global: s, a, n, c, p, sched, m, best } { Find all schedules sched[0..r-1] with p <= r < m such that bus routes sched[0..p-1] are as given, sched[p..r-1] accounts for a and uses only bus routes from c[k..n-1] } { pre: p < m } begin if s = 0 then { nothing left to account for } ScheduleFound else begin { try all candidates c[k..n-1] that fit } while (k < n) {c}and (c[k].howoften > s) do inc(k) ; while (k < n) {c}and (p + 1 + (s-1) div c[k].howoften < m) do begin if Fits(c[k]) then begin Use(c[k]) ; FindBestSchedule(k) ; RemoveLast end { if } ; inc(k) end { while } end { else } end { FindBestSchedule } ;The first while-loop skips candidate bus route that make too many stops for the remaining unaccounted arrival times. The second while-loop breaks off as soon as the remaining candidate bus routes make so few stops that improvement is no longer possible. Note that I have written

`{c}and`to remind myself (and you) that this conjunction is

*conditional*(using short-circuit boolean evaluation). That is, if the first conjunct already evaluates to false then the second conjunct is not evaluated (in our case it is then even undefined).

Observe that this technique only works if the list of candidate bus routes is
sorted on `howoften`.
In that case a *lower bound* can be given on the number of bus routes
needed to complete the schedule.
The technique is sometimes called *branch-and-bound*.
It is used in Program 4,
which is so effective that all six input files are done in an instant.
For all of them only one or two (complete) schedules are considered.

## Variants of this problem

What changes should be made to the programs if all bus routes in a schedule should be*different*? What about generating

*all*minimal schedules?

Write an auxiliary program that generates all bus routes, sorts them on how often they stop, and then puts this data in a file `routes.dat'. Investigate whether using this file, instead of generating them inside the main program, can speed up the generation of all candidates.