Continued from: BPlusTree 2.0 Changes Part 4 – Rewriting the Storage Format

Getting Started

To utilize the new file format in the BPlusTree you will need to use OptionsV2 class. The previous Options class is still there and continues to use the same file format.

// Create the OptionsV2 class to specify construction arguments:
var options = new BPlusTree<string, Guid>.OptionsV2(
    PrimitiveSerializer.String, PrimitiveSerializer.Guid, StringComparer.Ordinal)
    {
        CreateFile = CreatePolicy.IfNeeded,
        FileName = @"C:\Temp\Storage.dat",
        StoragePerformance = StoragePerformance.Default,
        TransactionLogLimit = 100 * 1024 * 1024, // 100 MB
    }
    .CalculateBTreeOrder(10, 16);

// Create the BPlusTree using these options
using(var map = new BPlusTree<string, Guid>(options))
{
    map.Add("foo", Guid.NewGuid());
    Assert.IsTrue(map.ContainsKey("foo"), "Failed to find foo");
}

Choosing a StoragePerformance Option

I’ve attempted to simplify the multitude of options available into this simple enumeration. The default (LogFileInCache) should be acceptable to most uses, but read the following carefully to understand what you are getting. The StoragePerformance has the following possible values and their implications:

StoragePerformance.Fastest
This option is safe from corruption and provides the fastest possible throughput. This is essentially the same behavior as LogFileInCache without the ability to recover any data written since the last successful call to Commit() or Dispose(). For some operations, for example, a temporary index of data that will not fit entirely in memory, this is going to be the best option. The performance again can vary but you should be able to see 100~200k writes/second.

StoragePerformance.LogFileInCache (Default)
This is the Default option and fastest durable option. The transaction log is written to system cache and in the event of a power failure or OS crash some records may be lost. This option should not corrupt state but may take some time to recover from a crash. The performance, even on slow disks, is quite good. You should expect anywhere from 30~60k writes/second or more depending the size of the data.

StoragePerformance.CommitToCache
This is the least durable option and the only one that may result in data file corruption. Corruption may occur in the event of power failure or OS crash with a partially flushed system cache. Provided the OS cache is flushed entirely or not at all, the file will not be corrupted. The performance of this option will usually reach around 6~8k writes/second. This can be a viable option if you do not want a transaction log and need to keep every write on disk in the case of a process crash.

StoragePerformance.LogFileNoCache
This option is intended to give you an ACID BPlusTree at the least cost in performance. The downside is that should the process crash or the power fail the file will need to be recovered.
See the discussion below on using transaction log files. Generally this option will perform about 2~4k writes/second.

StoragePerformance.CommitToDisk
Using this options should give you a fully ACID compliant BPlusTree. To be completely ACID (as with any database) means you need to disable cache and write behind features on the hardware as well. Performance will suffer, like with any real database. Depending on the speed of the disk and size of the data you can expect 1~2k writes/second or less.

Using Transaction Logs

Recovering the file from log can be expensive depending on the volume of data.
This option will open a transaction log with FileOptions.WriteThrough to cause every write to go through system cache to disk. The data file will remain in it’s current state until a call to Commit() or Dispose(). Calling Rollback() will, if using transaction logs, revert the storage file to it’s last committed state. Calling Commit/Rollback are exclusive operations on the entire tree, so all readers and writers must exit prior to processing.

For long-running processes (i.e. services) you may want to provide a thread to perform periodic calls to Commit() when idle. If you do not have a way to detect idle conditions, or don’t want to bother, set the OptionsV2.TransactionLogLimit to the max allowable log size. Keep in mind that calls to Commit require a OS-level flush of all file buffers, and as such, are expensive. My recommendation is to use a large constant value, 100MB or so.

Transaction Log Recovery

Transaction logs will by default be auto-recovered the next time the BPlusTree is opened after a crash. This comes at a cost of performing the operation as an ‘exclusive’ task which is generally what you are going to want. There are times when controlling this recovery can be useful. To this end there is a new option OptionsV2.ExistingLogAction which can be used to control the activity performed when opening a BPlusTree that has a existing log data. For instance you can choose to use the “Truncate” option to discard all changes in the transaction log.

Upgrading Existing Data

Upgrading from one BPlusTree to another is easily accomplished via the BulkInsert operation. The following example leverages the fact that the existing tree will enumerate in key order:

using Btree = CSharpTest.Net.Collections.BPlusTree<int, string>;
static void UpgradeDatabase(Btree.Options source, Btree.OptionsV2 target)
{
    source = source.Clone();
    source.ReadOnly = true;
    using(Btree input = new Btree(source))
    using (Btree output = new Btree(target))
    {
        output.BulkInsert(input,
                          new BulkInsertOptions
                              {
                                  ReplaceContents = true
                                  InputIsSorted = true,
                                  CommitOnCompletion = true,
                                  DuplicateHandling = DuplicateHandling.RaisesException,
                              }
        );
    }
}
Comments