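Last week, while going through MIT's <<Introduction to Algorithms>>, I felt that a B-tree looked a bit tricky to implement, which made it a good exercise.
It took me a whole day of leisurely work to finish, which is not very efficient.
The logic is not complicated, but there are many branches, and the indices are easy to get wrong. The book's description of deletion is not very clear, so that part took some extra effort. Before writing any code, the key is to understand how a B-tree actually works.
I implemented insert first, because it is simpler and it helped me understand B-trees better. Then I implemented delete.
The important comments are all in the source code (download). Halfway through insert, I realized I had made a rookie mistake: the parameter passing is muddled. For example, insert() is defined in Node, so it must be called on a Node, yet the method also requires a Node as a parameter. As a result, one node can modify another. If one node really needs to modify another, the method could be made static; if not, the node is passed redundantly. I never fixed this, though, so all the later methods follow the same pattern: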
node.method(node, ...)
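To make the problem with this pattern concrete, here is a minimal Python sketch. The post does not say which language the original implementation uses, and the class and method names below are hypothetical; the method bodies are placeholders rather than real B-tree logic.

```python
class Node:
    def __init__(self):
        self.keys = []

    # The pattern described above: called on one node, but it works on the
    # node passed as an argument, so the receiver (self) is never used.
    def insert_redundant(self, node, key):
        node.keys.append(key)
        node.keys.sort()

    # Alternative 1: a static method that takes the node explicitly.
    @staticmethod
    def insert_static(node, key):
        node.keys.append(key)
        node.keys.sort()

    # Alternative 2: an instance method that only touches its own receiver.
    def insert(self, key):
        self.keys.append(key)
        self.keys.sort()


a, b = Node(), Node()
a.insert_redundant(b, 3)   # modifies b, not a
Node.insert_static(a, 1)   # the target node is explicit
a.insert(0)                # the target node is the receiver
```

Either alternative removes the ambiguity about which node a call is allowed to modify.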
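I only did simple testing. For insert, I inserted the values 1 through 18 in order and, after each insertion, compared my tree with the one shown by reference [1]. For delete, I first built a tree containing the values 1 through 18, then deleted from 18 down to 1, verifying after each step.
My hands were sore by the end.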
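Comparing every tree by hand is tedious. One alternative, which is not what I did here, is to check the B-tree invariants in code after each operation. The sketch below assumes a hypothetical Node class with a `keys` list and a `children` list (empty for a leaf), and a minimum degree `t` as in the book.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # an empty list means this is a leaf


def check(node, t, lo=None, hi=None, is_root=True):
    """Return the height of the subtree; raise AssertionError on a violation."""
    n = len(node.keys)
    assert n <= 2 * t - 1, "too many keys"
    assert is_root or n >= t - 1, "too few keys"
    assert node.keys == sorted(node.keys), "keys out of order"
    # Every key must lie strictly between the separators inherited from above.
    assert all((lo is None or lo < k) and (hi is None or k < hi)
               for k in node.keys), "key outside the range allowed by the parent"
    if not node.children:
        return 0
    assert len(node.children) == n + 1, "wrong number of children"
    bounds = [lo] + node.keys + [hi]
    heights = {check(c, t, bounds[j], bounds[j + 1], False)
               for j, c in enumerate(node.children)}
    assert len(heights) == 1, "leaves are not all at the same depth"
    return 1 + heights.pop()


tree = Node([2], [Node([1]), Node([3, 4])])
height = check(tree, t=2)
```

Calling `check` after every insert or delete would catch most index mistakes without looking at a picture.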
References:
1. Animation of 2-3-4 tree, source code also available. This site provides a graphical demo of 2-3-4 trees.
2. Collection of BTree info
3. MIT <<Introduction to Algorithms>>
Appendix: the methods given in reference 3, with some comments of my own added:
Insert
B-TREE-SPLIT-CHILD(x, i, y)
    z ← ALLOCATE-NODE()
    leaf[z] ← leaf[y]
    n[z] ← t - 1
    for j ← 1 to t - 1
        do key_j[z] ← key_{j+t}[y]
    if not leaf[y]
        then for j ← 1 to t
                 do c_j[z] ← c_{j+t}[y]
    n[y] ← t - 1
    for j ← n[x] + 1 downto i + 1
        do c_{j+1}[x] ← c_j[x]
    c_{i+1}[x] ← z
    for j ← n[x] downto i
        do key_{j+1}[x] ← key_j[x]
    key_i[x] ← key_t[y]
    n[x] ← n[x] + 1
    DISK-WRITE(y)
    DISK-WRITE(z)
    DISK-WRITE(x)
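As a rough illustration of the split, here is a Python sketch of the same step. It is not my original source: the Node class is hypothetical, the disk reads and writes are left out, and indices are 0-based, so the median key_t[y] of the pseudocode becomes `y.keys[t - 1]`. List slicing replaces the copy loops, which removes most of the index arithmetic that is so easy to get wrong.

```python
class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.children = []


def split_child(x, i, t):
    """Split the full child x.children[i] (2t - 1 keys) around its median key."""
    y = x.children[i]
    z = Node(leaf=y.leaf)
    median = y.keys[t - 1]
    z.keys = y.keys[t:]          # the upper t - 1 keys move to the new node z
    y.keys = y.keys[:t - 1]      # the lower t - 1 keys stay in y
    if not y.leaf:
        z.children = y.children[t:]   # the upper t children move to z
        y.children = y.children[:t]
    x.children.insert(i + 1, z)  # z becomes the right neighbour of y
    x.keys.insert(i, median)     # the median moves up into the parent x


# Usage with t = 2: a full leaf [1, 2, 3] under an empty parent.
parent = Node(leaf=False)
full = Node()
full.keys = [1, 2, 3]
parent.children.append(full)
split_child(parent, 0, t=2)
```

After the call, the parent holds the key 2 and has two children holding 1 and 3.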
B-TREE-INSERT(T, k)
    r ← root[T]
    if n[r] = 2t - 1
        then s ← ALLOCATE-NODE()
             root[T] ← s
             leaf[s] ← FALSE
             n[s] ← 0
             c_1[s] ← r
             B-TREE-SPLIT-CHILD(s, 1, r)
             B-TREE-INSERT-NONFULL(s, k)
        else B-TREE-INSERT-NONFULL(r, k)
B-TREE-INSERT-NONFULL(x, k)
    i ← n[x]
    if leaf[x]
        then while i ≥ 1 and k < key_i[x]
                 do key_{i+1}[x] ← key_i[x]
                    i ← i - 1
             key_{i+1}[x] ← k
             n[x] ← n[x] + 1
             DISK-WRITE(x)
        else while i ≥ 1 and k < key_i[x]
                 do i ← i - 1
             i ← i + 1
             DISK-READ(c_i[x])
             if n[c_i[x]] = 2t - 1
                 then B-TREE-SPLIT-CHILD(x, i, c_i[x])
                      if k > key_i[x]
                          then i ← i + 1
             B-TREE-INSERT-NONFULL(c_i[x], k)
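Putting the insert procedures together, a single-pass insert might look like the Python sketch below. Again this is an illustration under assumptions, not my original source: the BTree and Node classes are hypothetical, disk I/O is omitted, indices are 0-based, and the standard `bisect` module replaces the backward scanning loops. The split helper is included only so that the block runs on its own.

```python
import bisect


class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.children = []


class BTree:
    def __init__(self, t=2):
        self.t = t              # minimum degree; t = 2 gives a 2-3-4 tree
        self.root = Node()

    def _split_child(self, x, i):
        t, y = self.t, x.children[i]
        z = Node(leaf=y.leaf)
        x.keys.insert(i, y.keys[t - 1])      # the median moves up into x
        x.children.insert(i + 1, z)
        z.keys, y.keys = y.keys[t:], y.keys[:t - 1]
        if not y.leaf:
            z.children, y.children = y.children[t:], y.children[:t]

    def insert(self, k):
        r = self.root
        if len(r.keys) == 2 * self.t - 1:    # full root: the tree grows upward
            s = Node(leaf=False)
            s.children.append(r)
            self.root = s
            self._split_child(s, 0)
            self._insert_nonfull(s, k)
        else:
            self._insert_nonfull(r, k)

    def _insert_nonfull(self, x, k):
        if x.leaf:
            bisect.insort(x.keys, k)         # shift larger keys right, place k
            return
        i = bisect.bisect_right(x.keys, k)   # index of the child to descend into
        if len(x.children[i].keys) == 2 * self.t - 1:
            self._split_child(x, i)
            if k > x.keys[i]:                # k belongs right of the new median
                i += 1
        self._insert_nonfull(x.children[i], k)


def inorder(x):
    """Return all keys under x in sorted order."""
    if x.leaf:
        return list(x.keys)
    out = []
    for child, key in zip(x.children, x.keys):
        out += inorder(child) + [key]
    return out + inorder(x.children[-1])


# The same test as in the post: insert 1 through 18 in order.
tree = BTree(t=2)
for value in range(1, 19):
    tree.insert(value)
```

Because every full node is split on the way down, the recursion never has to back up to fix a parent.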
Deletion
There are two special cases to consider when deleting an element:
- the element in an internal node may be a separator for its child nodes
- deleting an element may put its node under the minimum number of elements and children.
Algorithm
1. If the key k is in node x and x is a leaf, delete the key k from x.
2. If the key k is in node x and x is an internal node, do the following.
   a. If the child y that precedes k in node x has at least t keys, then find the predecessor k′ of k in the subtree rooted at y. Recursively delete k′, and replace k by k′ in x. (Finding k′ and deleting it can be performed in a single downward pass.) In other words, replace k with the largest key of the left subtree. (My question: if y is a leaf with exactly t keys, it has t - 1 keys after this deletion. Could a later deletion from y then leave it with only t - 2 keys? See rule 3: a node with only t - 1 keys is topped up before a deletion descends into it.)
   b. Symmetrically, if the child z that follows k in node x has at least t keys, then find the successor k′ of k in the subtree rooted at z. Recursively delete k′, and replace k by k′ in x. (Finding k′ and deleting it can be performed in a single downward pass.) In other words, replace k with the smallest key of the right subtree.
   c. Otherwise, if both y and z have only t - 1 keys, merge k and all of z into y, so that x loses both k and the pointer to z, and y now contains 2t - 1 keys. Then free z and recursively delete k from y. In other words, merge the two children.
   Note: borrow an element from a child when possible, and merge otherwise. This keeps the work done on delete to a minimum: from the internal node's point of view, only the key appears to be replaced (special case 1).
3. If the key k is not present in internal node x, determine the root c_i[x] of the appropriate subtree that must contain k, if k is in the tree at all. If c_i[x] has only t - 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x. (This happens while traversing down.)
   a. If c_i[x] has only t - 1 keys but has an immediate sibling with at least t keys, give c_i[x] an extra key by moving a key from x down into c_i[x], moving a key from c_i[x]'s immediate left or right sibling up into x, and moving the appropriate child pointer from the sibling into c_i[x].
   b. If c_i[x] and both of c_i[x]'s immediate siblings have t - 1 keys, merge c_i[x] with one sibling, which involves moving a key from x down into the new merged node to become the median key for that node.
   Note: borrow an element from a sibling when possible; otherwise merge c_i[x] with a sibling, together with the key of x that lies between them. This maintains the lower bound on the number of keys in every node (special case 2).
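The deletion rules above can be sketched in Python as follows. This is an illustration under assumptions rather than my original source: the Node class and all function names are hypothetical, keys are assumed to be distinct, and disk I/O is omitted. For simplicity it finds the predecessor or successor in a separate walk instead of the single downward pass the book mentions. The comments name the rule each branch follows.

```python
import bisect


class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # an empty list means this is a leaf


def inorder(x):
    """Return all keys under x in sorted order."""
    if not x.children:
        return list(x.keys)
    out = []
    for child, key in zip(x.children, x.keys):
        out += inorder(child) + [key]
    return out + inorder(x.children[-1])


def _extreme(x, idx):
    """Return the smallest (idx=0) or largest (idx=-1) key under x."""
    while x.children:
        x = x.children[idx]
    return x.keys[idx]


def _merge(x, i):
    """Fold x.keys[i] and x.children[i + 1] into x.children[i]; return it."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children
    return y


def _delete(x, k, t):
    i = bisect.bisect_left(x.keys, k)
    if i < len(x.keys) and x.keys[i] == k:
        if not x.children:                    # k is in a leaf: just remove it
            x.keys.pop(i)
            return
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:                  # replace k by its predecessor
            x.keys[i] = _extreme(y, -1)
            _delete(y, x.keys[i], t)
        elif len(z.keys) >= t:                # replace k by its successor
            x.keys[i] = _extreme(z, 0)
            _delete(z, x.keys[i], t)
        else:                                 # both children minimal: merge
            _delete(_merge(x, i), k, t)
        return
    if not x.children:
        return                                # k is not in the tree
    c = x.children[i]
    if len(c.keys) < t:                       # top up c before descending
        left = x.children[i - 1] if i > 0 else None
        right = x.children[i + 1] if i + 1 < len(x.children) else None
        if left is not None and len(left.keys) >= t:      # 3a: borrow from left
            c.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if left.children:
                c.children.insert(0, left.children.pop())
        elif right is not None and len(right.keys) >= t:  # 3a: borrow from right
            c.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if right.children:
                c.children.append(right.children.pop(0))
        elif right is not None:                           # 3b: merge with right
            c = _merge(x, i)
        else:                                             # 3b: merge with left
            c = _merge(x, i - 1)
    _delete(c, k, t)


def delete(root, k, t):
    """Delete k and return the root, which changes when the tree shrinks."""
    _delete(root, k, t)
    if not root.keys and root.children:       # the root was emptied by a merge
        return root.children[0]
    return root


def example_tree():
    """A hand-built 2-3-4 tree (t = 2) holding the keys 1 through 18."""
    leaves = [Node(ks) for ks in
              ([1, 2], [4, 5], [7, 8], [10, 11], [13, 14], [16, 17, 18])]
    return Node([9], [Node([3, 6], leaves[:3]), Node([12, 15], leaves[3:])])


# The same test as in the post: delete from 18 down to 1, checking each step.
root = example_tree()
for k in range(18, 0, -1):
    root = delete(root, k, t=2)
    assert inorder(root) == list(range(1, k))
```

The wrapper returns the root because a merge at the top level can leave the old root empty, which is the only way the tree loses a level.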