The techniques of DBMS that enable us to process queries efficiently

A collection of my notes

  1. Query Execution

  • indexing-based

  • hashing-based

  • sorting-based

They can be used for different operators: duplicate elimination, join, select, union, etc. And all of these three methods can reduce the number of I/Os or memory size we need.

2. indexing in DBMS

the different categories in indexing:

  • dense indexes vs sparse indexes

  • clustering indexes vs non-clustering indexes

B+ Tree

A B+ tree is a height-balanced multi-level tree. It has internal nodes and leaf nodes. Each node(internal or leaf) is stored as one disk block. For an internal node, at least half of the pointers in an internal node must be used. Though there is an exception: for the root node, it needs at least one key and two pointers. Similar to the internal nodes, at least half of the search keys (and corresponding record pointers) in a leaf node must be used.

Hashing

We use dynamic hashing to tackle overflow for the same key. There are two different ways: extensible hashing and linear hashing.

Multi-dimensional Indexes

Multi-dimensional indexes are mainly for these four problems: partial match queries, range queries, nearest neighbour, where-am-I queries. And there are many different indexes:

  • grid files

  • partitioned hashing

  • multiple-key indexes

  • KD-trees

  • quad trees

  • R-trees

  • bitmap indexes

My teammates and I have done a bitmap-related project, we build the bitmap indexes and compress the bitmap indexes, then we also use the index to sort and merge two files.

References

Last updated