CST370 - Week 6
This was our sixth week in CST 370 - Design and Analysis of Algorithms.
AVL Tree
An AVL Tree is a balanced binary search tree where the difference in height between the left and right subtrees is either -1, 0, or 1. The difference between subtrees is called a balance factor or balancing factor.
What do we do if a Binary Tree is unbalanced (not an AVL Tree)? We can perform a R, L, RL, or LR rotation to make it balanced.
This AVL Tree Visualizer may be helpful.
2-3 Tree
A 2-3 Tree is a type of search tree where each node can have one or two values.
If a node has three values, we promote the middle value to the parent node (or make it the new root node).
Visualizations:
Heaps
A heap is a type of Complete Binary Tree. There are two types of heaps: a Max Heap and a Min Heap.
- Max Heap: The number of each node is greater than or equal to the numbers of its children.
- Min Heap: The number of each node is smaller than or equal to the numbes of its children.
- Complete Binary Tree: a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.
A heap can be represented as an array using the following conventions:
- Left child of node j is at 2j.
- Right child of node j is at 2j + 1.
- Parent of node j is at j/2.
- Parental nodes are represented in the first n/2 locations.
Visualizations:
Heapsort
Heapsort is a sorting algorithm with a time efficiency of O(n*logn). It is a two stage algorithm:
- Construct a heap for a given array.
- Apply the root-deletion operation n-1 times to the remaining heap.
Hashing
This week we discussed hashing. The basic principle of hashing is that we use the modulo of a number, such as an ID number, as the index of an array. Hashing is very effecient, with a O(1).
But what if two IDs have the same modulo value? (1234 % 100 = 34 & 3434 % 100 = 34) When two different values have the same hash, we call it a collision. We have two solutions for this problem, Separate Chaining and Linear Probing:
Visualizations:
Separate Chaining
Linear Probing
With Linear Probing we insert a value into the next available entry if there is a collision. Then when we lookup a value we start a sequential search from the original hash value.
Load Factor
The load factor of a hash table is the number of keys divided by the size of the hash table (n/m). With Separate Chaining, each index has about n/m keys if the keys are evenly distributed. With Linear Probing, the load factor should be less than 1.
Example:
- Number of Keys: 50, Hash Table Size: 100
- Load Factor: 0.5 (50/100)
Load Factor is used to determine the efficiency of hashing.
- Too High: May result in too many collisons and less efficient in searching a key.
- Too Low: May not be efficient use of space.
- Load Factor increases as more elements/keys are added to a hash table. Efficiency may decrease.
Rehashing
The objective of Rehashing is to reduce the load factor to be within a pre-defined load factor.
To rehash we first increase the hash table's size, often to be the first prime number that is twice as large as the previous table size. Then the values in the old table are hashed and moved to the new table.
Comments
Post a Comment