Introduction
Constant Database PlusPlus (CDB++) is a C++ implementation of a hash database specialized for serializing and retrieving static associations between keys and their values. The database provides several features:
- Fast look-ups. This library implements the data structure of the Constant Database proposed by Daniel J. Bernstein.
- Low footprint. A CDB++ database consists of a chunk header (16 bytes), hash tables (2048 bytes and 16 bytes per record), and actual records (8 bytes plus key/value size per record).
- Fast hash function. CDB++ incorporates the fast and collision-resistant hash function for strings (MurmurHash 2.0) implemented by Austin Appleby.
- Chunk format. The structure of CDB++ is designed to store the data in a chunk of a file; a CDB++ database can be embedded into a file alongside other arbitrary data.
- Simple write interface. CDB++ can serialize a hash database to C++ output streams (std::ostream).
- Simple read interface. CDB++ can prepare a hash database from an input stream (std::istream) or from a memory block into which a database image has been read or memory-mapped from a file.
- Cross platform. The source code can be compiled with Microsoft Visual Studio 2008, the GNU Compiler Collection (gcc), etc.
- Very simple API. The CDB++ API exposes only a few functions; one can use this library just by looking at the sample code.
- Single C++ header implementation. CDB++ is implemented in a single header file (cdbpp.h); one can use the CDB++ API simply by including cdbpp.h in the source code.
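The footprint figures in the feature list can be turned into a rough size estimate. The following is a minimal sketch based only on the numbers quoted above; the constant names and the function estimate_cdbpp_size are hypothetical and are not part of the CDB++ API, and the size of an actual database file may differ from this estimate.

```cpp
#include <cstdint>

// Sizes quoted in the feature list (names are hypothetical).
constexpr std::uint64_t kChunkHeaderBytes    = 16;    // chunk header
constexpr std::uint64_t kTableFixedBytes     = 2048;  // fixed part of the hash tables
constexpr std::uint64_t kTableBytesPerRecord = 16;    // hash-table space per record
constexpr std::uint64_t kRecordOverheadBytes = 8;     // per-record overhead besides key/value

// Estimates the database size in bytes for `records` records whose keys
// occupy `total_key_bytes` and whose values occupy `total_value_bytes`.
constexpr std::uint64_t estimate_cdbpp_size(std::uint64_t records,
                                            std::uint64_t total_key_bytes,
                                            std::uint64_t total_value_bytes)
{
    return kChunkHeaderBytes
         + kTableFixedBytes
         + records * (kTableBytesPerRecord + kRecordOverheadBytes)
         + total_key_bytes
         + total_value_bytes;
}
```

For example, 100,000 records with 6-byte keys and 4-byte values (assuming a 4-byte int) give 16 + 2,048 + 100,000 x 24 + 600,000 + 400,000 = 3,402,064 bytes under this estimate.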
For simplicity, CDB++ does not support the following:
- modifying associations
- checking collisions in keys
- compatibility of the database format on different byte-order architectures
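The byte-order limitation can be illustrated without the library. The sketch below assumes (this is an assumption, not a statement about the CDB++ source) that integers are written in the host's native in-memory layout; under that assumption the same 32-bit value produces different byte sequences on little-endian and big-endian machines. The function names are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Returns true when the host stores the least significant byte first.
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);  // inspect the first byte in memory
    return first == 1;
}

// Copies the in-memory bytes of a 32-bit integer unchanged, as a writer
// that performs no byte-order conversion would.
inline std::array<unsigned char, 4> native_bytes(std::uint32_t value)
{
    std::array<unsigned char, 4> out{};
    std::memcpy(out.data(), &value, out.size());
    return out;
}
```

A database image written on one kind of machine should therefore be read on a machine with the same byte order.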
Sample code
This sample code constructs a database "test.cdb" with 100,000 string/integer associations, "000000"/0, "000001"/1, ..., "100000"/100000 (in the build function). The code then issues the string queries "000000", ..., "100000" and checks whether the returned values are correct (in the read function).
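The keys in this sample are six-digit, zero-padded decimal strings. As a minimal sketch of how such keys could be generated (the helper name make_key is hypothetical and is not taken from the CDB++ distribution):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a non-negative integer as a decimal string that is zero-padded
// to at least six characters, e.g. 1 becomes "000001".
inline std::string make_key(int i)
{
    std::ostringstream oss;
    oss << std::setfill('0') << std::setw(6) << i;
    return oss.str();
}
```

The build step would store each generated key together with its integer value, and the read step would regenerate the same keys and compare each retrieved value with the integer it was built from.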
Download
CDB++ is distributed under the terms of the modified BSD license.
History
- Version 1.1 (2009-07-14):
- Fixed a compile issue (a patch submitted by Takashi Imamichi).
- Replaced SuperFastHash with MurmurHash 2.0 (a patch submitted by Takashi Imamichi).
- Classes cdbpp::builder_base and cdbpp::cdbpp_base now take a template argument to configure a hash function. Classes cdbpp::builder and cdbpp::cdbpp are now synonyms of cdbpp::builder_base<cdbpp::murmurhash2> and cdbpp::cdbpp_base<cdbpp::murmurhash2>, respectively.
- Split the sample code into build and read functions.
- Version 1.0 (2009-07-09):
Documentation
Acknowledgements
The data structure of the constant database was originally proposed by Daniel J. Bernstein.
The source code of CDB++ includes the MurmurHash 2.0 implemented by Austin Appleby.
The CDB++ distribution contains "a portable stdint.h", which is released by Paul Hsieh under the terms of the modified BSD license, to address a compatibility issue with Microsoft Visual Studio 2008. The original code is available at: http://www.azillionmonkeys.com/qed/pstdint.h
References