The tests below measure the speed of basic operations: creating a database, adding a record, inserting values, performing scans and queries. The timings are obtained on a laptop with an Intel Core i7-2820QM and 8 gigabytes of 1333 MHz RAM running 32-bit Gentoo Linux. Both the library and the tests are compiled with gcc 4.2.4, using the optimization level -O2.
We run these tests on a 32-bit system where the field size in a record is also 32 bits. Using 64-bit fields would probably make the operations slightly slower.
All the tests are run as a single process without threads, hence utilizing a single processor core and not requiring locking. As the Yahoo benchmark results indicate, you can get higher throughput using multiple processes/cores. On the other hand, multiple processes doing writes would require locking, i.e. inserting locking calls such as wg_start_write and wg_end_write from the C API into the test code. Locking speed depends on both where you call the lock functions and the locking strategy set during WhiteDB configuration. As said before, the following tests do not require and do not use locking.
2. How to run the tests
The tests are available as small standalone source files speed1.c, speed2.c etc.
Compile and run them yourself by
gcc speed1.c -o speed1 -O2 -lwgdb
time ./speed1
Tests 1-8 create a new database, run the operations and then delete the database. Tests 10 and later will not delete the database: do it yourself by
wgdb 10 free
where 10 is the test number (look at the test source).
Make sure that you have sufficient shared memory before running the tests: all the tests assume that you have at least one gigabyte of shared memory available. Check the available amount and increase as necessary:
less /proc/sys/kernel/shmmax
su
sysctl kernel.shmmax=1000000000
Alternatively, replace the wg_attach_database(name, 1000000000) calls with wg_attach_local_database(1000000000) to use ordinary non-shared memory, accompanied by wg_delete_local_database later: this is ok for tests 1-9.
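The local-memory variant can be sketched as follows, assuming WhiteDB is installed with its header available as <whitedb/dbapi.h> (the path may differ on your system):

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  /* Use ordinary process-local memory instead of a shared segment:
     no shmmax configuration is needed. */
  void *db = wg_attach_local_database(1000000000);
  if (!db) { fprintf(stderr, "attach failed\n"); return 1; }
  /* ... run the test operations on db ... */
  wg_delete_local_database(db);  /* release the local memory */
  return 0;
}
```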
Below we will always present the wall-clock time as reported by time, covering both user and system time.
3. Creating and deleting a database
speed1.c creates a new 1 gigabyte database and then immediately deletes the database, repeating the whole process 1000 times.
The time spent is ca 10 seconds, meaning that you can create and then delete a 1 gigabyte database 100 times per second.
Creating/deleting a smaller database is faster: it takes 13 seconds to do 100 000 creations/deletions of a 10 megabyte database (the total amount of memory involved is the same as in the previous test).
The database creation time increases roughly linearly with the database size. Database creation splits the whole space into a number of subareas and data structures, mostly organized into free memory chunk lists for various sizes.
4. Inserting and deleting records
speed2.c creates a new 1 gigabyte database and then creates 10 million records of length 5, with record values initialized to NULL.
The time spent is ca 0.6 seconds, meaning that you can create one 5-field record in ca 60 nanoseconds.
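A minimal sketch of this kind of insertion loop, assuming the header is installed as <whitedb/dbapi.h>; the database name "1000" is arbitrary:

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  /* Attach to (or create) a 1 GB shared-memory database. */
  void *db = wg_attach_database("1000", 1000000000);
  if (!db) { fprintf(stderr, "attach failed\n"); return 1; }
  /* Create 10 million records of 5 fields, initialized to NULL. */
  for (int i = 0; i < 10000000; i++) {
    void *rec = wg_create_record(db, 5);
    if (!rec) { fprintf(stderr, "record creation failed\n"); return 1; }
  }
  wg_delete_database("1000");  /* free the shared memory segment */
  return 0;
}
```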
In comparison, on the same system it takes ca 0.08 seconds to allocate 1 GB with conventional malloc and then just sequentially write 50 million integer 0 values: the same number of integers as the NULL-initialized fields in the database.
speed3.c performs a very similar task to speed2.c: it creates a new 1 gigabyte database and then creates 10 million records of length 5, this time not initializing the field values.
The time spent by this faster version is ca 0.349 seconds.
speed4.c, again, creates a new 1 gigabyte database and then creates 10 million records of length 5, with record values initialized to NULL, this time deleting each newly created record immediately after creation.
The time spent by this create/delete version is ca 1.1 seconds.
5. Inserting and setting field values
In real life you insert a record in order to fill it with actual values, not just NULLs.
speed5.c creates a new 1 gigabyte database, then creates 10 million records of length 5, with field values then set to integers of increasing value.
Observe that the integers stored are encoded. Also, we are using wg_set_new_field, which assumes the field does not yet contain an allocated or indexed value. The conventional wg_set_field checks for the previous value (to deallocate it or remove it from an index as necessary) and is slightly slower.
The time spent by this proper insert is 1.16 seconds. The slightly slower wg_set_field version takes 1.55 seconds.
Now, when we add code to read back each freshly written value from the base, decode it and add it to a counter, the running time grows to 1.77 seconds.
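The encode / set / read-back / decode cycle can be sketched as below, assuming the header is installed as <whitedb/dbapi.h> (a local database is used here only to keep the sketch self-contained):

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_local_database(1000000000);
  if (!db) return 1;
  long long sum = 0;
  for (int i = 0; i < 10000000; i++) {
    void *rec = wg_create_record(db, 5);
    if (!rec) return 1;
    for (int j = 0; j < 5; j++) {
      /* Integers must be encoded before storing; wg_set_new_field
         skips the previous-value check that wg_set_field performs. */
      wg_set_new_field(db, rec, j, wg_encode_int(db, i));
      /* Read the value back and decode it. */
      sum += wg_decode_int(db, wg_get_field(db, rec, j));
    }
  }
  printf("sum: %lld\n", sum);
  wg_delete_local_database(db);
  return 0;
}
```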
Next we modify the code to create just 1000 records, but fairly long ones: 50 thousand fields each, again filling them with growing integers. This gives us the same total number of fields as in the previous test.
The time spent by the long-record version is 0.94 seconds, slightly faster than the initial proper insertion test.
So far we have used integers, which should be expected to be the fastest value type.
The next step is to experiment with strings. WhiteDB uses two different encoding methods for strings:
- Strings shorter than 32 bytes are always separately allocated as 32-byte chunks, with the field containing a pointer to the string. BTW, the 32-byte limit can be modified by changing the SHORTSTR_SIZE macro in dballoc.h
- Longer strings (and also short strings with an added language/namespace attribute) are allocated uniquely: a single copy is kept of each string. This saves space and allows very fast string equality comparison: just pointers have to be compared. However, encoding a long string involves computing a hash of a string, looking it up from the hash table and inserting into hash chain, which takes some time.
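Both string encodings can be sketched as follows, assuming the header is installed as <whitedb/dbapi.h>; the string contents are illustrative:

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_local_database(100000000);
  if (!db) return 1;
  void *rec = wg_create_record(db, 2);
  if (!rec) return 1;
  /* A string under 32 bytes is copied into its own 32-byte chunk. */
  wg_set_field(db, rec, 0, wg_encode_str(db, "short string", NULL));
  /* A longer string is stored uniquely: encoding the same string
     again references the single existing copy via its hash. */
  wg_set_field(db, rec, 1, wg_encode_str(db,
    "a considerably longer string exceeding the 32-byte limit", NULL));
  wg_delete_local_database(db);
  return 0;
}
```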
speed6.c creates a new 1 gigabyte database, then creates 1 million records of length 5, with field values then set to a 30-byte string. NB! We have decreased the number of records from 10 million in previous tests to 1 million for a short string: otherwise the strings won't fit into our meager 1 GB database.
The time spent by the string-fields test is 0.4 seconds. This would translate to 4 seconds when compared with the 10 million records of the integer test, making the string-record insertion ca 3.5 times slower.
Testing with a 10-byte string instead of a 30-byte string has a very small effect.
speed7.c performs the same operations as the previous speed6, but this time using a 100-byte string as a field value. The string is kept uniquely, essentially just one copy for the whole database.
The time spent by the test is 1.5 seconds, nearly four times slower than the 30-byte string test. The memory requirements are a lot lower for this test than for the previous one, but this would change if we saved a different string each time.
speed8.c performs the same operations as the last string tests, but using a double value. Again, just one million records of five fields are created. All of these are allocated as 8-byte values. The field is assigned a pointer to the allocated value.
The time spent by the test is 0.19 seconds, which is significantly faster than the string tests. When compared to the integer tests of 10 million records, it would correspond to 1.9 seconds, which is a bit slower than the integer test.
6. Scanning, queries and pointers
The first test scans the database, i.e. goes through all the records sequentially, looking for specific values in a given field. This is what happens when you search over a non-indexed field. The second test queries an indexed field and demonstrates, as expected, a much higher speed.
Scanning records over a non-indexed value
speed10.c builds 10 million records of five fields, with the value of field 3 assigned an integer taken from the last five digits of the record number, thus storing each number between 0 and 99 999 into 100 different records.
speed11.c takes the previously built database and scans through all the records, checking whether field 3 contains the value 123. Matching records are counted: there are 100 of these.
The scanning test takes 0.2 seconds. In other words, you can expect to scan through 50 million records in a second.
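A full scan of this kind can be sketched with the record-iteration functions, assuming the header is installed as <whitedb/dbapi.h> and the database (named "1000" here, arbitrarily) holds integers in field 3:

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_database("1000", 1000000000);
  if (!db) return 1;
  /* Go through all records sequentially, counting those whose
     field 3 holds the integer 123. */
  long count = 0;
  void *rec = wg_get_first_record(db);
  while (rec) {
    if (wg_decode_int(db, wg_get_field(db, rec, 3)) == 123)
      count++;
    rec = wg_get_next_record(db, rec);
  }
  printf("matches: %ld\n", count);
  wg_detach_database(db);
  return 0;
}
```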
Querying records over an indexed value
speed12.c creates the T-tree index for the (integer) field 3 in the previously created database of 10 million records with 100000 different values in the field.
The indexing test takes 6.5 seconds.
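Index creation can be sketched as below, assuming the index API header is installed as <whitedb/indexapi.h> alongside <whitedb/dbapi.h>:

```c
#include <stdio.h>
#include <whitedb/dbapi.h>
#include <whitedb/indexapi.h>

int main(void) {
  void *db = wg_attach_database("1000", 1000000000);
  if (!db) return 1;
  /* Build a T-tree index over field (column) 3. */
  if (wg_create_index(db, 3, WG_INDEX_TYPE_TTREE, NULL, 0))
    fprintf(stderr, "index creation failed\n");
  wg_detach_database(db);
  return 0;
}
```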
speed13.c performs a query over the previously indexed field 3, again looking for the value 123 and counting all the matching records (exactly 100). In order to make the time measurable, we repeat the whole query building / searching / fetching / deallocating process one million times.
The time to perform a million queries is 3.2 seconds. In other words, you can make ca 300 000 queries per second. For smaller databases the query time decreases logarithmically: running the same test on a 1000-record database takes 1.1 seconds, i.e. ca one million queries per second is the maximum speed you could get.
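One build / fetch / deallocate cycle of such a query can be sketched as follows, assuming the header is installed as <whitedb/dbapi.h>:

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_database("1000", 1000000000);
  if (!db) return 1;
  /* Build an equality condition on field 3, run the query and
     count the fetched records. */
  wg_query_arg arg;
  arg.column = 3;
  arg.cond = WG_COND_EQUAL;
  arg.value = wg_encode_query_param_int(db, 123);
  wg_query *query = wg_make_query(db, NULL, 0, &arg, 1);
  if (!query) return 1;
  long count = 0;
  while (wg_fetch(db, query))
    count++;
  printf("matches: %ld\n", count);
  wg_free_query(db, query);          /* deallocate the query */
  wg_free_query_param(db, arg.value);
  wg_detach_database(db);
  return 0;
}
```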
Record pointers: the foundation of a graph database
Searching for a related record through an index using the record id (the standard SQL way) is neither the fastest nor the simplest way of going through complex data structures.
It is much faster and easier to store direct pointers to records.
speed15.c builds a database of 10 million records and stores a pointer to the previously created record to the field 3 of each record. Essentially, all the 10 million records will form a long chain from the last record back to the first. Additionally, we store a pointer to the last record into field 2 of the very first record, to directly access the last record later.
The building test takes 0.66 seconds.
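Building such a pointer chain can be sketched with wg_encode_record, assuming the header is installed as <whitedb/dbapi.h> (a local database keeps the sketch self-contained):

```c
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_local_database(1000000000);
  if (!db) return 1;
  void *first = NULL, *prev = NULL;
  for (int i = 0; i < 10000000; i++) {
    void *rec = wg_create_record(db, 5);
    if (!rec) return 1;
    if (prev)  /* store a direct pointer to the previous record */
      wg_set_field(db, rec, 3, wg_encode_record(db, prev));
    else
      first = rec;
    prev = rec;
  }
  /* Point field 2 of the very first record at the last record. */
  wg_set_field(db, first, 2, wg_encode_record(db, prev));
  wg_delete_local_database(db);
  return 0;
}
```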
speed16.c traverses through the whole chain of backward pointers and counts the 10 million records in a list.
The traversal test takes 0.15 seconds: in other words, you can traverse roughly 65 million records in a linked list in a second.