12
votes

I have explored the definitions of T-trees and B-/B+ trees. From papers on the web I understand that B-trees perform better in hierarchical memory, such as disk drives and cached memory.

What I can not understand is why T-trees were/are used even for flat memory?

They are advertised as space efficient alternative to AVL trees.

In the worst case, all leaf nodes of a T-tree contain just one element and all internal nodes contain the minimum amount allowed, which is close to full. This means that on average only half of the allocated space is utilized. Unless I am mistaken, this is the same utilization as the worst case of B-trees, when the nodes of a B-tree are half full.

Assuming that both trees store the keys locally in the nodes, but use pointers to refer to the records, the only difference is that B-trees have to store pointers for each of the branches. This would generally cause up to 50% overhead or less (over T-trees), depending on the size of the keys. In fact, this is close to the overhead expected in AVL trees, assuming no parent pointer, records embedded in the nodes, keys embedded in the records. Is this the expected efficiency gain that prevents us from using B-trees instead?

T-trees are usually implemented on top of AVL trees. AVL trees are more balanced than B-trees. Can this be connected with the application of T-trees?

2

2 Answers

3
votes

I can give you a personal story that covers half of the answer, that is, why I wrote some Pascal code to program B+ trees some 18 years ago.

my target system was a PC with two disk drives, I had to store an index on non volatile memory and I wanted to understand better what I was learning at university. I was very dissatisfied with the performance of a commercial package, probably DBase III, or some Fox product, I can't remember.

anyhow: I needed these operations:

  • lookup
  • insertion
  • deletion
  • next item
  • previous item

  • maximum size of index was not known

  • so data had to reside on disk
  • each access to the support had high cost
  • reading a whole block cost the same as reading one byte

B+-trees made that small slow PC really fly through the data!

the leafs had two extra pointers so they formed a doubly linked list, for sequential searches.

2
votes

In reality the difference lies in the system you use. As my tutor in university commented it : if your problem lies in memory shortage, or in hdd shortage will determine which tree and in which implementation you will use. Most probably it will be B+ tree.

Because there are hundreds of implementations, for instance with 2direction queue and one directional queues where you need to loop thought elements, and also there are multiple ways to store the index and retrieve it will determine the real cons and mins of any implementation.