NOTE: This is an informatics problem because it deals with a parallel computation involving a number of communicating processors (the scientists).
Suppose S is a sequence of N , N> 0 , elements. The elements of S are denoted by S[i] with 0<=i<N , and are sorted in increasing order: S[i-1]<S[i] for 0<i<N .
Both search algorithms have as input a value x occurring in S , and as output the value i such that 0<=i<N and S[i]=x . Both algorithms have local integer variables j and h . Here is algorithm A :
The idea behind algorithm A is that invariantly S[i]<=x<=S[j-1] holds, that is, x occurs in the segment from S[i] to S[j-1] . If this segment consists of one element ( i=j-1 ), then the search terminates (see step 2). Otherwise, the segment is split into two parts, by choosing h halfway between i and j (see step 3a). Depending on whether x lies in the left half or the right half, either j or i is adjusted (see step 3b).
Algorithm B is obtained from algorithm A by expanding step 3b into
The idea behind algorithm B is that the search can be terminated immediately when the middle element S[h] equals x .
There are two questions concerning algorithms A and B that you should investigate.
In a parallel program, access to shared variables must be controlled. For instance, what happens when two processors write different values to the same shared variable at the same moment? The result could be garbage. In our problem, this is avoided because the small machine room guarantees mutual exclusion: only one scientist can inspect or modify the number on the blackboard.
In a sequential program (executed by one processor), 10,000 increments by 37 starting from zero would result in the value 370,000. Parallel execution of these same increments, however, may result in different (smaller) values. The reason is that the increments are split into three actions (read, increment local copy, write), which may be executed in an interleaved fashion. For instance, scientist S[0] reads zero, scientist S[1] also reads zero, both increment their zero to 37, S[0] writes back 37, S[1] writes back 37. This illustrates that two such increments may have the same effect as one. Obviously, one hundred of them executed in parallel may also act as a single increment. Therefore, another possible final value after the 10,000 increments is 3700.
This is not yet the whole story, however. A careful analysis of the possiblities reveals that any multiple of 37 in the range from 74 to 370,000 can be obtained. Here is an interleaving yielding 74:
This can be confirmed by appropriate experiments. (An analytic treatment is possible but tedious. Sequence lengths that are a power of two are easy to deal with; the complications arise for other lengths). There are several ways to set up an experiment. A naive way is to repeat the following experiment a number of times. Choose a random increasing sequence of strings (also its length is chosen randomly), apply the search algorithms to a (random) number of random input values x and measure the actual search times. Collect enough data and summarize the statistics. A disadvantage of this approach is that it is unclear how much data must be collected for a reliable result. It also difficult to summarize the results.
A more controlled experiment is this. Note that attention may be restricted to INTEGER sequences with S[i] = i , because it is not interesting WHAT is searched, only HOW it is done. Considering various sequence lengths, invoke the algorithms ONCE for EACH allowed input value x , keeping track of the total number of repetitions and elementary operations (that is, comparisons and assignments). When all input values have been searched, divide the total counts by the sequence length to obtain the mean counts. A broad range of sequence lengths can be covered in a relatively small number of experiments by taking some powers of ten, say from one to one million. (Because of the nature of the algorithms, sequence lengths that are a power of two are a special case, and these would not be representative.) Table 1 below illustrates the results.
Algorithm A Algorithm B Sequence Mean Mean Mean Mean Length # Rep # Op # Rep # Op ------------------------------------------- 1 0,00 3,00 0,00 3,00 10 3,40 16,60 2,80 17,00 100 6,72 29,88 5,79 31,95 1000 9,98 42,90 8,99 47,93 10000 13,36 56,45 12,36 64,81 100000 16,69 69,76 15,69 81,45 1000000 19,95 82,81 18,95 97,76Table 1: Mean repetition and operation counts for the two algorithms
You can find below a Turbo Pascal program for these experiments. Note that in Pascal, step 3a can be programmed as h := (i+j) div 2 .
program Problem2 (input, output) ; { Comparing two search algorithms } const MaxE = 6 ; MaxN = 1000000 ; { MaxN = 10 ^ MaxE } type index = 0..MaxN ; var N: index ; { the sequence consists of S[i] = i for 0 <= i < N } procedure AlgorithmA (x: longint; var i: index; var repcount, opcount: longint); var j, h: index ; begin i := 0 ; j := N ; opcount := opcount+3 ; while i < > pred(j) do begin h := (i+j) div 2 ; if x < h { x < S[h] } then j := h else i := h ; repcount := repcount+1 ; opcount := opcount+4 end { while } end { AlgorithmA } ; procedure AlgorithmB (x: longint; var i: index; var repcount, opcount: longint) ; var j, h: index ; begin i := 0 ; j := N ; opcount := opcount+3 ; while i < > pred(j) do begin h := (i+j) div 2 ; if h = x { S[h] = x } then begin i := h ; j := succ(h) end { if h = x } else if x < h { x < S[h] } then j := h else { x > S[h] } i := h ; repcount := repcount+1 ; opcount := opcount+5 end { while } end { AlgorithmB } ; procedure Experiment ; var k, r: index ; rcA, ocA, rcB, ocB: longint ; begin write(N: 7, ' |') ; rcA := 0 ; ocA := 0 ; rcB := 0 ; ocB := 0 ; for k := 0 to pred(N) do AlgorithmA(k {S[k]}, r, rcA, ocA) { assert(r=k) } ; write(rcA/N:7:2, ocA/N:7:2, ' |') ; for k := 0 to pred(N) do AlgorithmB(k {S[k]}, r, rcB, ocB) { assert(r=k) } ; writeln(rcB/N:7:2, ocB/N:7:2) end { Experiment } ; procedure DoExperiments ; var i: index ; e: integer ; begin writeln('Experiments for comparing two search algorithms') ; writeln ; { for i := 0 to MaxN do S[i] := i ;} writeln(' | Algorithm A | Algorithm B') ; writeln(' Seq. | Mean Mean | Mean Mean') ; writeln(' Length | # Rep. # Op. | # Rep. # Op.') ; writeln(' ------ | ------ ------ | ------ ------') ; for e := 0 to MaxE do begin if e = 0 then N := 1 else N := 10*N ; { N = 10^e } Experiment end { for e } ; end { DoExperiments } ; begin DoExperiments end.