Overview

BPlusTree is a implementation of the generic IDictionary<TKey, TValue> interface backed by a disk-based B+Tree. There is no practical size limit on the total amount of data that be stored or on the size of the key or value data used. Keys and values can be any class, struct, or built-in primitive written in .NET provided that you can supply an implementation of ISerializer<T> to read and write the object. There are also several available storage, concurrency, and caching strategies available to choose from.

Storage

All storage systems implement a single interface, INodeStorage. This interface is responsible for the creation, reading, writing, and deletion of a node in the tree. You can implement your own storage subsystem and provide it to the options class via the StorageSystem property. Most of the time you will want to use one of the two forms of storage BPlusTree supports by default each of which have numerous options. File Storage - The first and most common of these storage types is the file based storage. To use the file store simply specify the FileName property and the CreateFile policy on the Options class. The file storage uses a fixed block sub-allocation scheme to allocate blocks within the specified file. This block size can be specified or left as the 4kb default. The other important setting here is the FileOpenOptions setting which allows you to specify a combination of the System.IO.FileOptions flags. Both WriteThrough and NoBuffering (FILE_FLAG_NO_BUFFERING = 0x20000000, see File Buffering) are supported. Using the WriteThrough and NoBuffering options provides you with the most durable storage at a significant performance cost. Memory Storage - Most of the time this is fairly useless as the whole B+Tree goes since you would ordinarily just use a regular Dictionary<TKey, TValue>. It is possible the BPlusTree can outperform a Monitor locked Dictionary; however, real world use cases of the excessive contention required seem unlikely. If you simply must have an IDictionary that is completely thread safe you are welcome to give it a try, just set the StorageType property on the options class to Memory.

Concurrency

Concurrent reads and writes are completely safe. By default writers do not block readers so a reader is only guaranteed to read a modification if the read operation began after the completion of the write. In addition the use of localized node level locking for writers allows multiple concurrent writers to co-exist in different locations of the tree. The BPlusTree uses an implementation of an IFactory<ILockStrategy> to construct locks for each node read from the store. The default ILockStrategy uses a Monitor TryEnter/Exit to control access only for writers. If you are using the BPlusTree from a single thread, or you are opening it in ReadOnly mode, then you can disable the locking by setting the LockingFactory property of the Options class to an instance of the IgnoreLockFactory class. With a few exceptions all public members and enumerators are completely thread safe by default. Those exceptions are diagnostics methods like Print, and Validate, as well as the methods EnableCount, Clear, and UnloadCache. All of these methods may be made thread safe by specifying a valid reader/writer lock (i.e. SimpleReadWriteLocking) on the CallLevelLock property of the options class. In addition once the CallLevelLock is set you can obtain a Write lock on this to obtain exclusive access to the entire tree.

Caching

The cache is fairly baked into the implementation at present and is therefore not extensible; however, there are several options to control its behavior. Setting the CachePolicy of the Options class is the primary control allowing for None (nothing is cached, everything is read from disk), Recent (high utilization nodes are kept in memory), and All (everything is kept in cache). The default is Recent and this cache option has a few options of its own. The CacheKeepAliveMinimum/MaximumHistory properties control the range of items allowed to be cached while the CacheKeepAliveTimeout defines how quickly the cache should be reduced back to the minimum value. It is important to note that these numbers do not reflect unique nodes, but rather include each visit to a node. By default the cache holds at least the most recent 10 nodes visited and at most the last 100 nodes visited. With mild activity this cache will reduced itself after 1 minute. Please note that the BPlusTree does not create or manage threads so without some level of activity the cache may remain at its maximum capacity.

Fault Tolerance

What can go wrong will go wrong. Every effort has been made to ensure that the BPlusTree continues to behave even when things don’t go as expected. All modifications to the tree and the file are transacted internally to ensure modifications to the data or tree remain atomic and either complete or fail as a single unit. Because of the fact that threads may be killed by unmanaged code or other catastrophic thread event (stack overflow, stack corruption) there may be a case when .NET’s Monitor becomes confused about who owns it. If this happens every call to the BPlusTree may throw a TimeoutException when attempting to acquire a lock. This is the other reason apart from regaining memory for the UnloadCache method (see Concurrency and the CallLevelLock property). Though I’ve never seen it happen under normal circumstances, if your file does become corrupt you can attempt to recover the data with the RecoverFile/RecoveryScan.

Multiple Instances

Multiple instances of a BPlusTree using the same file is not supported unless the ReadOnly property on the Options class is set to true. Though it is possible to build this into or on top of the BPlusTree I personally prefer to use a single instance and IPC/RPC to allow cross-domain or cross-process access.

Exception-less Errors

Although BPlusTree supports the IDictionary interface the public methods available directly on the class do not always mirror those semantics. You should note that the public Add method returns a boolean result. This is also true for the Update, AddOrUpdate, and Remove methods. If you like exceptions using the IDictionary method Add with an existing key will raise one for you.

Simple Example:


    BPlusTree<string,Guid>.Options options = new BPlusTree<string, Guid>.Options(
        PrimitiveSerializer.String, PrimitiveSerializer.Guid, StringComparer.Ordinal)
         {
             CreateFile = CreatePolicy.IfNeeded,
             FileName = @"C:\Temp\Storage.dat"
         };

    using(BPlusTree<string, Guid> map = new BPlusTree<string,Guid>(options))
    {
        Assert.IsTrue(map.Add("foo", Guid.NewGuid()), "Failed to add foo");
        Assert.IsTrue(map.ContainsKey("foo"), "Failed to find foo");
    }

Online Resources