Before I go much farther talking about the BPlusTree I want to address one thing. Does the BPlusTree support ACID operations? (definitions below from wikipedia)

Atomicity

Atomicity requires that database modifications must follow an “all or nothing” rule. Each transaction is said to be atomic. If one part of the transaction fails, the entire transaction fails and the database state is left unchanged.

All modifications to the B+ tree and the disk are transacted internally to ensure that if an exception occurs the tree and storage file are left in a valid state. So while this is mostly supported this one is a No see Durability below.

Consistency

Consistency states that only consistent (valid according to all the rules defined) data will be written to the database. Quite simply, whatever rows will be affected by the transaction will remain consistent with each and every rule that is applied to them.

The only consistency constraint imposed by the BPlusTree is that keys are unique, and that the serializer provided can serialize both the key and value. So as far as those simple rules are concerned the BPlusTree does support consistency.

Isolation

Isolation refers to the requirement that no transaction should be able to interfere with another transaction at all. In other words, it should not be possible that two transactions affect the same rows run concurrently, as the outcome would be unpredicted and the system thus unreliable.

The BPlusTree fully supports isolation when using the default configuration.

Durability

Durability means that once a transaction has been committed, it will remain so. In other words, every committed transaction is protected against power loss/crash/errors and cannot be lost by the system and can thus be guaranteed to be completed.

Each modification to the BPlusTree amounts to a single ‘node’ that is being modified. In the case of reorganization of the hierarchy (split/join) copies are made for the new child nodes but the parent is ‘modified’. The modification of a node essentially rewrites the file block associated with it in a single write call (followed by a flush). During this write (usually a 4kb block) if the power is interrupted or another God ordained event happens, like a BSOD, it is possible to get only part of the 4kb block written. Since the BPlusTree does not currently use a transaction file, this can cause corruption in the store. This is detected by the BPlusTree via a CRC32 checksum written into every file block so you will receive a specific exception. In short, the BPlusTree is not a Durable store by this definition.

Updated for V2

The BPlusTree version 2 provides several durability options that will bring it inline with ACID requirements. See the post “BPlusTree 2.0 Changes Part 5 – Using the v2 Storage Format” for more information about those options and how they affect the durability of the storage.

Comments