Benchmarking
------------

Below are the timings for taken for the basic operations on the data structures
provided in the dict library. The keys and data used were ~235,000 words from
Webster's dictionary, located in /usr/share/dict/words on BSD systems. The
comparison function used was C's "strcmp" function. The first data set was the
entire set of words in sorted order. The second data set was the entire set of
words in random order. All words were inserted, then all words were searched
for in the same order in which they were inserted, and finally all words were
deleted from the data structure in the same order they were searched for. The
individual time taken for insertion of all keys, search for all keys, removal
of all keys, and total for all three are listed, as reported by the Unix system
facility getrusage(2).

These tests are in no way conclusive and are not meant to be interpreted as
empirical data relating the performance of these various theoretical structures
and algorithms. They are to be interpreted with due caution because they
reflect hardware, compiler, operating system, data set, implementation of data
structure, etc, and not just the theoretical data structure or algorithm in
question. Also, one of the data sets is in sorted order, which is rarely the
case in real life, and second, and each data set is inserted, then searched
for, and then removed, in *exactly* the same order, a condition which is highly
unlikely in real world situations. Nevertheless, the timings do reveal some
interesting characteristics about these structures. They also take into account
"real world" implementation issues which are usually not discussed in theory,
such as cost of allocation, etc. It is also useful for someone who plans to
make use of this library in determining which structure to use for a specific
application.

These tests were performed on a 500Mhz PIII system running FreeBSD 4.3 with 128
MB of memory. The compiler used was GCC 2.95.3 with the -O optimization
switch. The tests were performed with a minimum of other programs running.

+-----------------------------+-------------------+-------------------+
| Data Structure              |       Set 1       |       Set 2       |
|                             |   Time      Cmps  |   Time      Cmps  |
+-----------------------------+---------+---------+---------+---------+
| Height-Balanced Tree Insert |  1.574s |  3.984M |  2.534s |  3.949M |
|                      Search |  0.671s |  3.984M |  1.496s |  4.058M |
|                      Remove |  1.157s |  3.019M |  2.760s |  3.915M |
|                       Total |  3.401s | 10.987M |  6.790s | 11.921M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |     16        17  |     13        21  |
|        Internal Path Length |      3747852      |      3822219      |
+-----------------------------+---------+---------+---------+---------+
| Path-Reduction Tree  Insert |  1.618s |  3.984M |  2.693s |  3.935M |
|                      Search |  0.891s |  3.984M |  1.497s |  4.044M |
|                      Remove |  0.749s |  2.131M |  2.253s |  4.059M |
|                       Total |  3.258s | 10.098M |  6.443s | 12.038M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |     16        17  |     13        21  |
|        Internal Path Length |      3796564      |      4057732      |
+-----------------------------+---------+---------+---------+---------+
| Red-Black Tree       Insert |  1.626s |  7.129M |  1.808s |  3.961M |
|                      Search |  0.894s |  4.073M |  1.489s |  4.069M |
|                      Remove |  0.957s |  3.659M |  2.019s |  3.604M |
|                       Total |  3.477s | 14.860M |  5.315s | 11.634M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |     16        32  |     13        21  |
|        Internal Path Length |     3836678       |      3833099      |
+-----------------------------+---------+---------+---------+---------+
| Splay Tree           Insert |  0.365  |  0.236M |  0.365  |  0.236M |
|                      Search |  0.652  |  1.277M |  0.652  |  1.277M |
|                      Remove |  0.689  |  0.992M |  0.689  |  0.992M |
|                       Total |  1.706  |  2.505M |  1.706  |  2.505M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |      0    235880  |      0        46  |
|        Internal Path Length |     2050001364    |      5392608      |
+-----------------------------+---------+---------+---------+---------+
| Treap                Insert |  0.689  |  2.642M |  0.689  |  2.642M |
|                      Search |  0.830  |  5.320M |  0.830  |  5.320M |
|                      Remove |  0.597  |  2.913M |  0.597  |  2.913M |
|                       Total |  2.116  | 10.876M |  2.116  | 10.876M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |      6        43  |      5        43  |
|        Internal Path Length |      4985572      |      5276239      |
+-----------------------------+---------+---------+---------+---------+
| Weight-Balanced Tree Insert |  1.986  |  4.633M |  1.986  |  4.633M |
|                      Search |  0.654  |  3.991M |  0.654  |  3.991M |
|                      Remove |  0.612  |  2.096M |  0.612  |  2.096M |
|                       Total |  3.252  | 10.719M |  3.252  | 10.719M |
|                             +-------------------+-------------------+
|      Min / Max Path to Leaf |     16        22  |     12        22  |
|        Internal Path Length |      3754838      |      3848532      |
+=============================+=========+=========+=========+=========+
| HashTable (~1K)      Insert |  8.198  |  8.258  |  8.198  |  8.258  |
|                      Search |  5.319  |  5.368  |  5.319  |  5.368  |
|                      Remove |  8.146  |  8.167  |  8.146  |  8.167  |
|                       Total | 21.663  | 21.792  | 21.663  | 21.792  |
|                             +---------+---------+---------+---------+
|                  Slots Used |                                       |
+-----------------------------+---------+---------+---------+---------+
| HashTable (~5K)      Insert |  1.989  |  1.997  |  1.989  |  1.997  |
|                      Search |  1.302  |  1.308  |  1.302  |  1.308  |
|                      Remove |  1.933  |  1.935  |  1.933  |  1.935  |
|                       Total |  5.225  |  5.240  |  5.225  |  5.240  |
|                             +---------+---------+---------+---------+
|                  Slots Used |                                       |
+-----------------------------+---------+---------+---------+---------+
| HashTable (~30K)     Insert |  0.638  |  0.661  |  0.638  |  0.661  |
|                      Search |  0.453  |  0.450  |  0.453  |  0.450  |
|                      Remove |  0.597  |  0.608  |  0.597  |  0.608  |
|                       Total |  1.689  |  1.719  |  1.689  |  1.719  |
|                             +---------+---------+---------+---------+
|                  Slots Used |                                       |
+-----------------------------+---------+---------+---------+---------+
| HashTable (~60K)     Insert |  0.501  |  0.530  |  0.501  |  0.530  |
|                      Search |  0.369  |  0.367  |  0.369  |  0.367  |
|                      Remove |  0.475  |  0.483  |  0.475  |  0.483  |
|                       Total |  1.347  |  1.379  |  1.347  |  1.379  |
|                             +---------+---------+---------+---------+
|                  Slots Used |                                       |
+-----------------------------+---------+---------+---------+---------+
| HashTable (~120K)    Insert |  0.469  |  0.493  |  0.469  |  0.493  |
|                      Search |  0.322  |  0.362  |  0.322  |  0.362  |
|                      Remove |  0.457  |  0.465  |  0.457  |  0.465  |
|                       Total |  1.248  |  1.284  |  1.248  |  1.284  |
|                             +---------+---------+---------+---------+
|                  Slots Used |                                       |
+-----------------------------+---------+---------+---------+---------+

* The hashtables were created with the smallest prime number of slots greater
  than the multiple of 1000 they are listed with. For example, the hashtable
  with "~5K" next it had 5003 slots. Hash table sizes were chosen to be prime
  numbers because a hash value is more likely to have congruences with
  non-prime numbers than it is with prime numbers. Thus a prime hash table size
  will result in fewer congruences, which in turn results in fewer collisions.

The first thing that that one notices about the results for both tree
structures is that sorted data sets cause them to perform far better than
unsorted. The splay tree takes more than 3.5 times as much time for the third
data set than for either two sorted sets. The disparity is not so large for the
other tree structures. My theory is that inserting in sorted order results in
more "perfect rotations" - that is, rotation(s) that are performed after an
insertion or deletion which violates the balance rule of a tree in which the
resulting nodes have "perfect" balances. For example, for height-balanced (AVL)
trees, this means after adjusting rotations more nodes have balance 0 then +1
or -1. This is verified by the fact that, for the height-balanced tree, the
maximum path length from a root to a leaf is far closer to the minimum path
length from the root to leaf for trees built by inserting in sorted order than
for those in random order. For example, for the height-balanced tree with the
sorted data set, the maximum path length was 17, and the minimum path length
was 16. For the same data set in "random" order, however, the maximum path
length was 21 and the minimum path length was 13.

Secondly, we find that the ordering of the data set makes nearly no difference
upon the performance of the hash table, and nearly identical timings are
produced for all three data sets. These timings are also a reflection upon the
hash function used to produce a hash value from a key. In choosing the hash
function for this benchmark, I looked for one that is both fast, and produces a
reasonably even distribution of hash values. I chose:

unsigned
s_hash(const char *str)
{
	const unsigned char *p = (const unsigned char *)str;
	unsigned hash = 0;

	while (*p)
		hash = hash * 33 + *p++;
	return(hash);
}

We also find that performance improves rapidly when we increase the size of
the hashtable when it is obviously too small (at ~1000 slots, the expected load
factor would be upwards of 235), yet hardly improves when we expand a hash
table is this is already "adequately" large. When the hash table was sized to
235,889 (the largest prime greater than 235,881, which is the exact number of
words in the data set), only 149,031 slots were being used. An entire 86,858
slots were going entirely unused (and thus there were about as many
collisions). In this situation, performance would improvely only slightly for
large increases in table size, which would consume a large amount of memory. A
better solution would be to employ a superior hash function. The drawback to a
better hash function is that it may take longer to compute than the simple one
used here.
