#include <iostream>
#include <vector>
using namespace std;
void sort_doubles(vector <double> *v, int print);
|
The file sort_driver.cpp contains a main() routine that lets you perform a variety of sorting examples. It is called with five command line arguments:
UNIX> sort_driver size iterations seed double-check(yes|no) print(yes|no)
This runs iterations tests, each of which sorts a randomly created vector of size size by calling sort_doubles(). It uses the given seed as the seed for srand48(). If double-check is "yes", it double-checks the results of sort_doubles() to make sure that the vector was sorted correctly. The last command line argument specifies the value of the print parameter that is passed to sort_doubles().
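The double-check step amounts to one linear scan for an adjacent pair that is out of order. Here is a minimal sketch of such a check. The name first_out_of_order is hypothetical, and this is not the code from sort_driver.cpp, which is not shown in these notes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from sort_driver.cpp): returns the index i of the
// first adjacent pair with v[i] > v[i+1], or -1 if v is in nondecreasing order.
int first_out_of_order(const std::vector<double> &v)
{
  for (std::size_t i = 0; i + 1 < v.size(); i++) {
    if (v[i] > v[i+1]) return static_cast<int>(i);
  }
  return -1;
}
```

A driver could call this after each sort_doubles() call and report v[i] and v[i+1] when the result is not -1.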
In this lecture (and your next lab), we will implement a large host of sorting algorithms and link them in with sort_driver.o so that we can test their correctness and speed.
The first such implementation is null_sort.cpp, which does nothing except print out the vector size times if print is equal to one:
#include <iostream>
#include <vector>
using namespace std;
#include "sorting.h"
void sort_doubles(vector <double> *v, int print)
{
int i, k, sz;
sz = v->size();
for (i = 0; i < sz; i++) {
if (print) {
for (k = 0; k < v->size(); k++) cout << (*v)[k] << " ";
cout << endl;
}
}
return;
}
|
As such, it does not run correctly. For example, if we double-check it on a three-element vector, it will fail:
UNIX> null_sort 3 1 0 yes yes
0.170828 0.749902 0.0963717
0.170828 0.749902 0.0963717
0.170828 0.749902 0.0963717
Sorting Error during iteration 0: v[1] = 0.749902 and v[2] = 0.0963717
UNIX>
Although it doesn't sort properly, null_sort.cpp is useful because we can use it as a base case for timing other sorting algorithms.
bubble_sort.cpp
#include <iostream>
#include <vector>
using namespace std;
#include "sorting.h"
void sort_doubles(vector <double> *v, int print)
{
int i, j, k;
int sz;
double tmp;
sz = v->size();
for (i = sz; i > 0; i--) {
for (j = 1; j < i; j++) {
if ((*v)[j] < (*v)[j-1]) {
tmp = (*v)[j];
(*v)[j] = (*v)[j-1];
(*v)[j-1] = tmp;
}
}
if (print) {
for (k = 0; k < v->size(); k++) {
cout << (*v)[k] << " ";
}
cout << endl;
}
}
return;
}
|
selection_sort.cpp
#include <iostream>
#include <vector>
using namespace std;
#include "sorting.h"
void sort_doubles(vector <double> *v, int print)
{
int i, j, k;
int sz;
double min;
int index;
sz = v->size();
for (i = 0; i < sz-1; i++) {
index = i;
min = (*v)[index];
for (j = i+1; j < sz; j++) {
if ((*v)[j] < min) {
min = (*v)[j];
index = j;
}
}
(*v)[index] = (*v)[i];
(*v)[i] = min;
if (print) {
for (k = 0; k < v->size(); k++) {
cout << (*v)[k] << " ";
}
cout << endl;
}
}
return;
}
|
insertion_1_sort.cpp
#include <iostream>
#include <vector>
#include "sorting.h"
using namespace std;
void sort_doubles(vector <double> *v, int print)
{
int i, j, sz, k;
double tmp;
sz = v->size();
for (i = 1; i < sz; i++) {
for (j = i-1; j >= 0 && (*v)[j+1] < (*v)[j]; j--) {
tmp = (*v)[j];
(*v)[j] = (*v)[j+1];
(*v)[j+1] = tmp;
}
if (print) {
for (k = 0; k < v->size(); k++) {
cout << (*v)[k] << " ";
}
cout << endl;
}
}
}
|
One of the nice things about insertion sort is that it sorts pre-sorted input in linear time rather than quadratic. To see this, I've implemented a second driver called sort_sorted.cpp which generates sorted input and sorts it. Note the difference between sorting random and sorted input with insertion sort:
UNIX> time insertion_1_sort 50000 1 0 no no
5.247u 0.012s 0:05.28 99.4% 0+0k 0+0io 0pf+0w
UNIX> time insertion_1_sorted 50000 1 0 no no
0.002u 0.004s 0:00.05 0.0% 0+0k 0+2io 0pf+0w
UNIX>
The "user time" is the first word printed -- 5.247 seconds for unsorted input as opposed to 0.002 seconds for sorted input.
We can speed up insertion sort if we observe that the implementation above performs too much data movement. Think about the following output:
UNIX> insertion_1_sort 6 1 0 no yes
0.170828 0.749902 0.0963717 0.870465 0.577304 0.785799
0.0963717 0.170828 0.749902 0.870465 0.577304 0.785799
0.0963717 0.170828 0.749902 0.870465 0.577304 0.785799
0.0963717 0.170828 0.577304 0.749902 0.870465 0.785799
0.0963717 0.170828 0.577304 0.749902 0.785799 0.870465
UNIX>
Specifically, look at the third line, which is printed after the iteration where i equals 3. At that point we have the sorted list (0.0963717, 0.170828, 0.749902, 0.870465), and the next iteration inserts 0.577304. It swaps 0.577304 with 0.870465 and then with 0.749902 to put it in place. That is more data movement than necessary, because each swap rewrites the value being inserted as well as moving an element over. Instead, it's more efficient if we simply move 0.870465 and 0.749902 one place over, and then put 0.577304 where it belongs. This change is done in insertion_2_sort.cpp -- I only put the main loop here without the print statements.
for (i = 1; i < sz; i++) {
tmp = (*v)[i];
for (j = i-1; j >= 0 && tmp < (*v)[j]; j--) {
(*v)[j+1] = (*v)[j];
}
(*v)[j+1] = tmp;
}
|
Note how it is faster than insertion_1_sort:
UNIX> time insertion_1_sort 50000 1 0 no no
4.977u 0.011s 0:05.00 99.6% 0+0k 0+0io 0pf+0w
UNIX> time insertion_2_sort 50000 1 0 no no
2.916u 0.009s 0:02.94 98.9% 0+0k 0+1io 0pf+0w
UNIX>
However, when we plot sorting times as a function of i (using the shell script time_sorting_algorithms -- I'm not explaining this one in class), we'll see that selection sort is the fastest, despite what textbooks and web pages say:
![]() |
(Oh -- all timings are on my MacBook Pro).
If you look at that inner loop, there is one place where it can be improved: it is always checking to make sure that j >= 0. We can fix this by traversing the vector before sorting, and putting the minimum element in index 0. This is done in insertion_3_sort.cpp -- here's the relevant part:
tmp = (*v)[0];
index = 0;
for (i = 1; i < sz; i++) {
if ((*v)[i] < tmp) {
tmp = (*v)[i];
index = i;
}
}
if (index != 0) {
(*v)[index] = (*v)[0];
(*v)[0] = tmp;
}
...
for (j = i-1; tmp < (*v)[j]; j--) {
...
|
This is a big help:
![]() |
I have a last insertion sort in insertion_4_sort.cpp that does the same thing, except that it checks whether the ith element is the smallest inside the loop rather than finding the minimum at the beginning. Its performance is pretty much identical to insertion_3_sort.cpp (check the two timing files timing_insertion_3.txt and timing_insertion_4.txt).
void sort_doubles(vector <double> *v, int print)
{
multiset <double> s;
multiset <double>::iterator dit;
int i;
for (i = 0; i < v->size(); i++) s.insert((*v)[i]);
i = 0;
for (dit = s.begin(); dit != s.end(); dit++) {
(*v)[i] = *dit;
i++;
}
}
|
As you can see, this blows away the other algorithms in performance. This is because the others are O(n^2) algorithms, and STL's sets are implemented with a balanced binary tree structure (e.g. AVL or Red-Black trees), which results in O(n lg(n)) sorting:
![]() |
That's much faster. However, using multisets is overkill, since they contain all that structure (internal nodes, pointers, etc). There are algorithms that sort in O(n lg(n)) time without the extra overhead. One of these is used by the sort() routine in the STL's algorithm library. We include that in stl_sort.cpp:
#include "sorting.h"
#include <algorithm>
void sort_doubles(vector <double> *v, int print)
{
sort(v->begin(), v->end());
}
|
The graph below shows how it destroys the others (note the X-axis has been greatly expanded).
![]() |
![]() |
Quicksort and merge sort are both recursive. I used the following recursive calls with each:
|
Quicksort: void recursive_sort(vector <double> *v, int start, int size, int print)
Mergesort: void recursive_sort(vector <double> *v, vector <double> *temp, int start, int size, int print)
|
The difference between merge sorts 1 and 2 is that in #2, any list of size 27 or less is sorted with insertion sort. That makes a big difference because all of those recursive calls for small lists are avoided. I determined the value of 27 experimentally.
All the Quicksort lines are roughly in the same place. In all versions of quicksort, I used the "version with in-place partition," (from the Wikipedia notes). However, my version is slightly different. I start with the pivot in (*v)[start], and a left pointer at start+1 and a right pointer at start+size-1. While the left pointer is less than the right pointer, I do the following:
- Move the left pointer to the right until it points to an element ≥ the pivot.
- Move the right pointer to the left until it points to an element ≤ the pivot.
- If the left pointer has not passed the right pointer, swap the two elements, increment the left pointer and decrement the right pointer.
When that is done, swap the pivot in element (*v)[start] with the last element of the left set. Then you may recursively sort the left and right sets, omitting the pivot, since it is already in the right place.
Here's an example. Suppose our vector has 12 elements:

And suppose we call:
recursive_sort(v, 5, 7, 0);
This means we need to sort the last seven elements of the vector (indices 5 through 11). Below, I've colored those elements light orange. Suppose we use (*v)[start] as our pivot. Then, our pivot is 13. We start partitioning with our left pointer equalling 6 and the right pointer equalling 11:

Next, we move the left pointer to the right until it is pointing to an element ≥ 13. That moves it to equal 7. The right pointer is already pointing to an element ≤ 13:

Now, we swap them, increment the left pointer and decrement the right pointer:

Since the left pointer is pointing to an element ≥ 13 and the right pointer is pointing to an element ≤ 13, we swap them, increment the left pointer and decrement the right pointer:

We increment the left pointer again, and now it is pointing past the right pointer, so we are almost done:

Our final move is to swap the pivot with the rightmost element in the left set (element #9 with a value of 10). Then we are going to recursively sort the left and right sets (colored very light blue below) with the two calls:
recursive_sort(v, 5, 4, 0);
recursive_sort(v, 10, 2, 0);

UNIX> time quick_1_sorted 100000 1 0 no no
5.173u 0.012s 0:05.21 99.4% 0+0k 0+0io 0pf+0w
UNIX> time quick_2_sorted 100000 1 0 no no
0.006u 0.003s 0:00.01 0.0% 0+0k 0+0io 0pf+0w
UNIX>
Quicksort #3 sorts lists of size 7 and smaller with insertion sort. That improves things, but not nearly as much as with merge sort.
A first pass at using this information is in bucket_1_sort.cpp:
// ... headers & insertion sort defined here
void sort_doubles(vector <double> *v, int print)
{
int sz;
int index;
double val;
double *v2;
int hind, lind, done, i;
sz = v->size();
v2 = (double *) malloc(sizeof(double)*sz);
for (i = 0; i < sz; i++) v2[i] = -1;
for (i = 0; i < sz; i++) {
val = ((*v)[i] * sz);
index = (int) val;
if (v2[index] == -1) {
v2[index] = (*v)[i];
} else {
hind = index+1;
lind = index-1;
done = 0;
while(!done) {
if (hind < sz && v2[hind] == -1) {
v2[hind] = (*v)[i];
done = 1;
} else {
hind++;
}
if (!done && lind >= 0 && v2[lind] == -1) {
v2[lind] = (*v)[i];
done = 1;
} else {
lind--;
}
}
}
}
for (i = 0; i < sz; i++) (*v)[i] = v2[i];
insertion_sort(v);
free(v2);
}
|
What this code does is predict where each value is going to go, and then put it into that index of v2 so long as that entry is empty (-1). If the entry is not empty, it looks at adjacent entries, moving farther out until it finds an empty slot, and puts the value there. Once that process is done, it copies v2 back to v and uses insertion sort to sort v. Since v is nearly sorted (or should be), insertion sort should sort it very quickly.
As it turns out, this process is quite slow. The reason is that as v2 fills up, it takes longer to find an empty slot, and the slot that is found is often quite far from where the value belongs. We fix this in bucket_2_sort.cpp, where we double the size of v2 so that there are more empty cells and a much smaller chance of having to move to adjacent cells. As you can see, the results are amazing -- much better than the Standard Template Library!
![]() |
As I said in class, if you can characterize the probability distribution, you can use the CDF to sort any input in this way. Think about it.
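To make the CDF idea concrete: if F is the CDF of the input distribution, then F(x) is uniformly distributed between 0 and 1, so F(x) times the table size predicts the slot for x, just as val * sz does for uniform input. The sketch below uses an exponential distribution purely as an example. The function names and the rate parameter lambda are hypothetical.

```cpp
#include <cmath>

// CDF of an exponential distribution with rate lambda (an example choice).
double exponential_cdf(double x, double lambda)
{
  return (x <= 0) ? 0.0 : 1.0 - std::exp(-lambda * x);
}

// Predicted slot for x in a table with the given number of slots.
int predicted_index(double x, double lambda, int slots)
{
  int index = static_cast<int>(exponential_cdf(x, lambda) * slots);
  if (index >= slots) index = slots - 1;   // the CDF can round up to 1.0
  return index;
}
```

With this in place of the index computation, the rest of the bucket approach is unchanged: place each value near its predicted slot, then finish with insertion sort.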