CS494 -- Computational Geometry

Heather Booth

Algorithm Analysis Practice

Goals: There are practice problems for each type of goal above and solutions are now included at the end of the document.

Section 1: Analyzing non-recursive Algorithms:

In non-recursive code we can often analyze the running time by determining the number of times selected statements are executed. I will assume we are looking for worst-case running time using big-O notation. Since we don't care about constants, we can usually get a good estimate of the running time by considering one type of statement such as key comparisons, or some statement in a for loop.

Some examples:

for (i = 0; i < n; i++)
   for (j = 0; j < n;  j++)
      scanf("%d", &(A[i][j]));
Analysis: Here there is no key comparison so we will count the "scanf". In a nested for loop with fixed lower and upper limits (e.g. "0" and "n") the number of times the insider part is executed is just n*n. So this runs in O(n2) time.
for (i = 0; i < n; i++)
  for (j = 0; j < i; j++)
     A[i][j] = 0;
Analysis:This code zeroes out the lower triangular part of an array. Since the inner loop is executed a variable amount of times, we can't just multiply. When i = 0 the inner loop is executed 0 times.
When i = 1 the inner loop is executed 1 time.
When i = 2 the inner loop is executed 2 times.
So the number of times the statement "A[i][j] = 0" is executed is: 1 + 2 + 3 + ... + n
This sum comes up a lot in algorithm analysis. You should memorize the formula:
   1 + 2 + 3 + ... + n = n(n+1)/2
So this algorithm runs in O(n2) time.
main()
{
    int i, query, A[100], n;

    i = 0;
    scanf("%d", &query);
    while ((A[i] != query)&& (i < n)) i++;
    if (i > n) printf("query %d not found.\n", query);
    else printf("query %d found at location %d.\n", query,  i);
}
Analysis:This code has a key comparision: "A[i] != query". We will count key comparisons to determine the running time. This means the running time is essentially the number of times the while loop is executed. This depends on the input and the query which we do not know. So we must consider the worst possible case. In the worst case the query is not found and the while loop is executed n times. Therefore the worst-case running time is O(n).

All the previous examples have been actual code. You should be able to analyze pseudo-code too, although this can be tricky because pseudo-code is more vague. In pseudo-code the last example above would be:

   search the array starting from location 0 
   comparing each element to the query until it is either found
   or the end of the array is reached.
Analysis: The worst-case will be when the query is not found and is compared to every element of the array. If there are n elements this takes O(n) time.

Section 1 Practice Problems

Give the big-O running time for the following pieces of code or pseudo-code.
  1. main()
    {
        int i, A[100], n, j;
    
        read_in(A, &n);  /* reads n integers into array A */
    
        for (i = 0; i < n; i++){
            j = find_min(A, i, n);  /* scans elements i, i+1, ..., n to find index of min element */
            swap(A[i], A[j]);       /* swaps A[i] and A[j] */
        }
    }
    
  2. Consider this modification of sequential search. Is it any better?
      Input: Given an array of n integers and a query integer.
      Output: The index of the array location containing the query, if found.  Otherwise -1.
      Algorithm: i = 0; j = n;
                 while A[i]!= query and A[j] != query  and i < j
    	  	increment i
    		decrement j
                 if A[i] == query output i
       	     if A[j] == query output j
                 else output -1
    
  3.  Input: An array A of n integers in no particular order.
     Output: The integers in sorted order (increasing).
     Algorithm:
          Initialize a balanced binary search tree to contain A[0];
          for i = 1 to n-1 do
             Insert A[i] into the tree
          print out the sorted list by traversing the tree in preorder order.
    
    Hint: You shold know that inserting into a balanced binary tree takes O(log2 n) time on a tree with n nodes and that traversing a binary tree in preorder order takes O(n) time.

Section 2: Analyzing Recursive Algorithms by Writing Recurrence Relations

A recursive function is a function that calls itself. The best way to analyze the running time of such a function (or algorithm) is to write a recurrence relation that represents the running time.
For example consider the following function:
int two_to_the_power(int n)
{
    if (n == 0) return 1;
    else return 2*two_to_the_power(n-1);
}
This function computes 2n for n an integer >= 0 recursively. If we count the if-else statement as taking 1 time unit, then how much time does this take? If we "unravelled" the execution, we would see that the function is called 1 time when n = 0, 2 times when n = 1, 3 times when n = 2, etc. Generalizing, we guess that the if-else statement is executed n+1 times when the function is called with n.
We can write a recursive function that expresses the running time of this function. Let t(n) denote the number of times the if-else is executed when the function two_to_the_n is called with a value of "n".
By looking at the code we can say that t(0) = 1. This is called the "base case". Also: t(n) = t(n-1) + 1. This is called the recursive formula. Together this is a recursive function. By plugging in a few small values of n we can see that t(n) = n+1. We can then prove this by induction which we will do later. For now we are concerned with determining the recurrence relation from code or pseudo-code.

Section 2 Practice Problems.

Write recurrence relations for the following:
  1.    int mystery(int n)
       {
         if (n <= 0) return 1;
         else return n*mystery(n-1);
       }
    
  2.    void sort(int A[], int start, int end)
       {
           int i;
    
           i = find_min(A, start, end);
           swap(A[start], A[i]);
           sort(A, start+1, end);
       }
    
    In the code above, function find_min scans each element betweeen locations start and end to find the minimum element in this range. This takes time O(end - start). Function swap swaps two array elements and takes O(1) time.
  3. Consider the pseudo-code for quicksort and answer the questions below.
    Quicksort(A, start, end)
    {
     if (n > 1) {
      1.  pick a pivot value p
      2.  reorder the array A so that it looks like:
          [---elements less than p-- elements equal to  p -- elements > than p]
          That is, all elements less than p are before all elemetns equal to p
            which are before all elements greater than p
          let j = index of the first element equal to p
      3.  Quicksort(A, start, j)
      4.  Quicksort(A, j+1, end)
     }
    }
    
    Assume that steps 1 and 2 take n time units. Is the following recurrence relation correct? Why or why not?
       t(n) = 2t(n/2) + n
       t(1) = 0
    

Section 3: Solving recurrence relations by guessing

Consider a recurrence relation such as:
     t(n) = 2t(n/2) + n
     t(1) = 0
This is not a "closed-form". A closed-form is a function like t(n) = nlog n (which is the answer for the recurrence relation above). The recurrence relation is an intermediate step in the process of determining the big-O running time of an algorithm. After determining the recurrence relation we need to find a closed-form.

There are mathematical ways to do this, but they are beyond the scope of this class. The simplest way to solve easy recurrence relations it to guess a solution and prove it by induction.

How do you guess a solution? There are two ways. One is to plug in small values of n and deduce a pattern. This is the one we will discuss here.

Examples:

1.  t(n) = t(n-1) + 1  (eqn 1)
    t(1) = 1           (eqn 2)
Plugging in:
    t(2) = t(2-1) + 1    (by eqn 1)
	 = t(1) + 1 = 
         = 1 + 1 = 2     (by eqn 2)
    t(3) = t(2) + 1  = 2 + 1 = 3
Now we guess that t(n) = n, which is correct.  Usually we need more
values to see a pattern.

2. t(n) = t(n-1) + n
   t(1) = 1
Plugging in:
   t(2) = t(1) + 2
        =  1 + 2 = 3
   t(3) = t(2) + 3
        = 3 + 3 = 6
   t(4) = t(3) + 4
        = 6 + 4 = 10
This one is actually tricky unless you notice that the sum is 1 + 2 + ... + n
=n(n+1)/2.

3. t(n) = t(n/2) + n
   t(1) = 1
When dealing with n/2 you may consider only values that are powers of 2
in this class.

Plugging in:
   t(2) = t(1) + 2 = 1 + 2 = 3
   t(4) = t(2) + 4 = 3 + 4 = 7
   t(8) = t(4) + 8 = 7 + 8 = 15
   t(16) =t(8) + 16 = 15 + 16 = 31
                  ------^
Notice that in the 3rd column we find that t(n) = (n-1) + n = 2n-1.
This is correct.

Section 3 Practice Problems:

Guess solutions for the following recurrence relations.
  1. t(n) = t(n-1) + n; t(1) = 1
  2. t(n) = 2t(n/2) + 1; t(1) = 0
  3. t(n) = t(n/2) + n; t(1) = 1

Section 4: Proving the solution.

After you guess the closed form for a recurrence relation you need to prove it correct by induction. The proof will work out if and only if your guess is correct (assuming you are doing inductive proofs correctly). If it doesn't work you need to change your guess. Example 1: t(n) = t(n-1) + n; t(1) = 1
Claim: For this recurrence relation, t(n) = n(n+1)/2
Proof (by induction):
Basis: We claim that t(1) = 1(2)/2 = 1, which is the value given by the recurrence relation so the claim is true for n = 1.
Inductive Hypothesis:Assume t(n-1) = (n-1)*n/2 for some n.
Inductive Step: We must show that t(n) = n(n+1)/2.
   t(n) = t(n-1) + n 
        = (n-1)*n/2 + n     by I.H.
        = (n*n - n + 2n)/2
        = (n^2 + n)/2
        = n(n+1)/2
Therefore the claim is true.  
Example 2: t(n) = n*t(n-1); t(1) = 1
Claim: t(n) = n! = n*(n-1)*(n-2)*....*1
Proof: By induction on n.
Basis: We claim t(1) = 1! = 1. This is true by definition.
I.H. Assume t(n-1) = (n-1)! = (n-1)*(n-2)*(n-3)*....1 for some n.
Inductive Step: Will show t(n) = n!.
   t(n) = n*t(n-1)
	= n*(n-1)!    by I.H.
  	= n*(n-1)*(n-2)*....*2*1
        = n!
Therefore the claim is true for all n.

Section 4 Practice Problems

For each of the recurrence relations below, prove the solution given is correct by induction.
  1. t(n) = t(n-1) + 2; t(1) = 1     Show t(n) = 2n-1.
  2. t(n) = t(n/2) + n; t(1) = 1;     Show t(n) = 2n-1.
  3. t(n) = 2*t(n/2) + 3n; t(2) = 1    Show t(n) = 3nlog n - 5n/2 (Here is the minus term that Maura asked about).

Solutions:

Section 1
  1. The running time of find_min(A, i, n) is O(n-i) because each element A[i], A[i+1], ..., A[n] must be checked. Function swap takes O(1) time. So the running time of the inner part of the loop is:
       n-0 + 1   = n+1 when i is 0
       n - 1 + 1 = n   when i is 1
       n - 2 + 1 = n-1 when i is 2
       .
       .
       .
       n - (n-1) + 1 = 2 when i is n-1
    Thus the running time is: 2 + 3 + ... + n+1.
    To analyze this we need to remember that 1 + 2 + ... + n = n(n+1)/2.
    Thus 2 + ... + n+1 = (1 + 2 + ... + n) - 1 + n+1 = n(n+1)/2 + n = (n^2 +3n)/2 = O(n^2).
    
  2. This version of sequential search simultaneously searches from the front and back of the array. In the worst case the item is not found. In this case i is decremented in each iteration fo the loop and j is incremented until i < j, which must happen when i = n/2 + 1 and j = n/2 - 1 (assuming n is even). Thus the loop is executed n/2 + 1 times and each iteration of the loop does 2 key comparisons (A[i] != query, A[j] != query). If we count the running time exactly, it is n + 2, which is worse than normal sequential search. The running time is also O(n), which is the same as normal sequential search.

    This answers the question I asked, but it is interesting to ask whether in some cases this would be better. In what cases would this be better? It is quicker to look at th eelements near the end of the list so it would be better if a lot of queries were near the end of the list (but not in the middle).

  3. The loop takes log 1 + log 2 + log 3 + ... + log n time because after inserting i elements the tree has i nodes and it takes log i time to insert into a tree with i elements. We don't have a formula for the sum: log 1 + ... + log n. We know it is at most n log n since the highest term is log n and there are n terms. Therefore the loop takes O(n log n) time. Printing the list out by traversing the tree in preorder order takes O(n) time. This is less than O(n log n) therefore the entire algorithm takes O(n log n) time.
Section 2
  1. Let t(n) denote the number of times the if statement is executed when function mystery is called with value "n". Then we can see that the base case is when n = 0 and the if is evaluated 1 time in order to determine that n = 0. So t(0) = 1. When n > 0, the function calls mystery(n-1) (after evaluating the if to see that we need to do the else part). This means when n > 0 that the # if's executed is 1 (during this call to mystery with value "n") + t(n-1), which is the # of if's executed in all the other calls to mystery starting with value "n-1". Therefore t(n) = t(n-1) + 1. So the recurrence relation is:
       t(n) = t(n-1) + 1   for n > 0
       t(0) = 1
    
  2. Let t(n) denote the number of comparisons done in call: sort(A, start, end) where end - sort = n. Note that the comparisons are "hidden" in the call to function find_min. We know that find_min(A, start, end) must compare each A[i] for i = start, ..., end to a current minimum element in order to determine the minimum. Therefore find_min does end- start + 1 comparisons. This code is actually incomplete because there is not base case. We would have to add an if statement like: if (end - start > 1) before all the other code. With this change we can say that t(1) = 0. No key comparisons are done.
    Now consider a call to sort(A, start, end) when end - start + 1 = n.
    Then find_min does n key comparisons.
    Swap does no key comparisons.
    Then the recursive call sort(A, start+1, end) does t(n-1) key comparisons.
    Therefore the recurrence is:
       t(n) = t(n-1) + n   for n > 1
       t(1) = 0
    
  3. No the recurrence relation is not correct. The algorithm splits the list into items < pivot and items > pivot (assuming no items equal to pivot). If the pivot were the median, then these two sets would be of size n/2 and the recurrence relation would be correct. The problem with quicksort is that we do not know how to pick a good pivot quickly. If we accidentally pick one that is smaller than the smallest element, then the recurrence relation would look like: t(n) = t(n-1) + n instead of t(n) = 2 t(n/2) + n. This is important because the former gives an O(n^2) running time and the latter is O(n log n). The correct recurrence for quicksort is complicated and I will not ask about it.
Section 3
  1. t(n) = n(n+1)/2
  2. t(n) = n - 1
  3. t(n) = 2n - 1
Section 4
  1. t(n) = t(n-1) + 1; t(1) = 1. Claim t(n) = 2n-1 for all n >= 1.
    Proof: By induction on n.
    Basis: When n = 1, we are given that t(1) = 1. By the formula, t(1) = 2*1 - 1 = 1. Therefore the claim is true for n = 1.
    I.H. Assume that t(n) = 2n-1 for some n> 1.
    Inductive Step: We will show that t(n+1) = 2(n+1) - 1 = 2n + 1.
    By the recursive formula for t(n), we know that:
       t(n+1) = t(n) + 2
              = 2n - 1 + 2    by I.H.  (must say this)
              = 2n + 1.
    
    Therefore the claim is true for all n > 1.
  2. t(n) = t(n/2) + n; t(1) = 1. Claim: t(n) = 2n-1 for all integers n >= 1 that are powers of 2.
    Basis: When n = 1, by the recursive formula we have t(1) = 1. The other formula gives t(n) = 2*1 - 1 =1 also.
    I.H. Assume that t(k) = 2k-1 for k = 2, ..., n-1.
    Inductive Step: Will show that t(n) = 2n-1 for n a power of 2.
    By the recursive formula we know:
       t(n) = t(n/2) + n
            = 2(n/2) - 1 + n    by I.H. 
            = n - 1 + n = 2n - 1.
    
    Therefore the claim is true for integers n that are powers of 2 and greater than or equal to 1.
  3. t(n) = 2t(n/2) + 3n; t(2) = 1. Claim: t(n) = 3n log n - 5n/2 for all n a power of 2 and n >= 2.
    Proof: By induction on n.
    Basis: For n = 2, the recursive formula specifies t(2) = 1. The other formula says: t(2) = 3*2*log(2) - 5(2)/2 = 6 - 5 = 1. Therefore the claim is true for n = 2. (whew).
    I.H. Assume that t(k) = 3k log k - 5k/2 for k = 3, ...., n-1.
    Inductive Step: We will show that t(n) = 3n log n - 5n/2, assuming n is a power of 2 and n > 2. By the recursive formula t(n) = 2t(n/2) + 3n.
    By the I.H. we can get the value of t(n/2) by plugging "n/2" into the formula 3n log n - 5n/2 giving:
       t(n/2) = 3*(n/2) log (n/2) - 5(n/2)/2
              = 3n/2 * (log n - 1) - 5n/4
              = (3n/2)log n - 3n/2 - 5n/4
              = 3n/2 log n - 11n/4
    Therefore:
        t(n) = 2*t(n/2) + 3n
             = 2*(3n/2 log n - 11n/4) + 3n   by IH
             = 3n log n - 11n/2 + 3n
             = 3n log n - 11n/2 + 6n/2
             = 3n log n - 5n/2, as claimed.
    
    Therefore the claim is true for all values of n where in is a power of 2 and n is >= 2.