Multi-dimensional Data
We use various tree structures to represent data in 1, 2, 3, ..., k dimensions.
High Level Overview
Each node divides the current region. Each children represents a subregion.
We care about the 5 essential operations that define the characteristics of that data structure.
- Insertion
- Search for the right position to insert, determine which subregion this node would fall into
- Search
- Very similar to insert, except when if not found, return return null
- Deletion
- Remove the node is easy, just do a search operation
- The key here is (recursively) finding the best replacement that keep the properties of the subtrees
- Range Query
- Find all points within a range/region
- Check if the region(node) intersections with the query region
- Then check node.point and recursively search on the children nodes
- Nearest Neighbor Query
- Keep two global values, bestDist and bestSol
- need a function to compute dist(region(node), query point)
- If dist(region(node), query point) < bestDist, update bestSol, visit all children
- Else prune this subtree
TBC