I’ve recently seen some comments about various aspects of btree/b+tree structures that are troublesome. They are, IMHO, just plain wrong. Let’s look at a few of these and see if we can shed some light on b+trees and their implementations.

**1. Items in a tree node must be presorted.**

*FALSE* This is simply not true, a btree can decide to sort the items in a node, or to perform a linear search. Neither approach change more than the small constants in the big-O, it’s still O(n log(n)). The BPlusTree keeps it’s children sorted since the memory-move cost is nothing compared to the disk IO.

**2. Split/Join of a node requires reorganizing it’s ancestors**

*FALSE* While this is the most common approach it is also the most naive. Reorganization of the tree from the bottom-up is not only expensive but more complicated than the top-down approach. The BPlusTree uses a top-down reorg that says simply “If I’m inserting and this node is full, split” all the way down the tree. This means that at any given time I only need two nodes ‘locked’ for write, me and my parent. Once I’m sure that this node is not full I can release my parent’s lock and move down the tree. Not only does this strategy allow multiple concurrent writers, but it also prevents deadlocks. By using this top-down approach locks are always acquired at the root and traverse down so that a deadlock is simply impossible.

**3. Nodes can have at most order-1 items**

*FALSE* This is also a naive implementation detail where leaf nodes (or all nodes) have a vacant slot. It’s most commonly used when using a bottom-up reorganization so that you always have room to insert a split node. It is simply not necessary with either bottom-up or top-down reorganization, it’s just easier. BPlusTree uses all the space available in a node, only when inserting and already full will it split the node.

**4. Nodes can have no fewer than order/2 items**

*FALSE* The minimum items in a node is 1. When you calculate the lookup time with a minimum of 1 (*let b = 1*) you get *O*(log_{b}*n*) or *O*(*n). So you never **want* a minimum value of 1, but you can do it. Based on my own testing the best value for the minimum node count in BPlusTree is 1/3 of the order (or max).

**5. The order of a btree is the same at all levels**

*FALSE* The BPlusTree supports having a different order for hierarchy nodes than that of it’s leaf nodes. The reason is simple. Disk-based B+Trees optimize disk access based on what will fit in a ‘page’. Using 8kb as a page size, at the leaf node this is 8192 / (sizeof(key) + sizeof(value)). At all other locations in the hierarchy this is 8192 / (sizeof(key) + c) where c is some constant size of data used to reference a child node. This means that if I am storing 1kb records with a 4-byte key I want no more than 8 records in a leaf, whereas I can have as many as 680 children in a non-leaf node. Now we can *let b = 680* for *O*(log_{b}*n*), meaning we can find the leaf node of a 1,000,000 item tree in approximately 3 operations.