Hashing and its Implementation

Shruti Jha
4 min readDec 31, 2021

A hash table is a data structure which is used to store data in the form of key-value pairs. Hash tables are significantly used in competitive programming because operations such as insert, search and delete in hash tables do not go beyond linear time complexities. Hence programmers can have an optimized approach while using hash tables.

So how are the keys stored in the hash table?

Hash tables are based on the concept of hashing, which means mapping a larger value to a smaller value with the help of a hash function.

What is a hash function?

The hash function is of the form –

index = f(key, ht_len)

where key is the value to be inserted and ht_len is the size of the hash table.

One of the most popular hash functions is given by –

index = key%ht_len

Suppose we need to insert the following keys in our hash table –

{14, 43, 42, 45, 6, 89….}

Let the size of the hash table be 10 (it is an array of size 10) for the entire article, i.e., ht_len = 10.

The keys can be inserted as follows-

However, if the keys were as follows-

{42, 22, 35, 25……}

In this scenario, two different key values may point to the same index value. This is known as collision.

Collision resolving techniques-

1. Separate Chaining

2. Open addressing

Separate Chaining

In Separate Chaining, every index value has a list associated with it. The keys pointing to the same index keep getting added to the list.

For the above example the implementation would be as follows-

The time complexity for insertion would be O(1), but the time complexity for search and delete operations would be O(k) where k is the maximum number of keys in a list.

To improve the time complexity, a balanced binary search tree can be used in place of the list at every index.

Open Addressing

Open addressing is based on the concept that whenever a collision occurs, move to the next available slot. The location of the available slot is decided by the hash function. There are three types of Open Addressing techniques-

1. Linear Probing

2. Quadratic Probing

3. Double Hashing

1. Linear Probing

The hash function in case of linear probing is –

f(key, ht_len)= (key+i)%ht_len

where i = number of times collision occurs for a particular index

Example, the keys are as follows –

{22, 42, 44, 72, 94..}

22 will be placed at index 2.

For 42, there is a collision with 22 so it will be placed at index 3 according to the hash function for linear probing.

44 will be placed at index 4.

For 72, there will be 2 collisions with 22 and 42 and another collision with 44 so it will be placed at index 5.

For 94, there will be one collision with 44 and another collision with 72.

In case of linear probing there are too many collisions.

The second technique of Open Addressing is Quadratic probing.

2. Quadratic Probing

To reduce the collisions in Linear probing, we introduce Quadratic probing.

The hash function in case of quadratic probing is –

f(key, ht_len)= (key+i*i)%ht_len

where i = number of times collision occurs for a particular index

For placing 22, 42 and 72-

22 will be placed at index 2

42 will be placed at index 3 ((42+1*1)%10)

72 will be placed at index 6 ((72+2*2)%10)

Example of keys to be inserted — {22, 42, 44, 72, 94..}

Collisions are lesser in quadratic probing than in linear probing and the difference will continue to increase as more keys will be added.

If the size of the hash table is double or more than double the number of keys to be added, then the number of collisions are minute, hence the time complexity of insertion, search and deletion is almost constant.

Operations in Open Addressing-

Search(key) — For searching a key, keep probing until you get the index with key stored in it or an empty slot. If a deleted marked slot is reached then continue searching.

Delete(key) — After finding the key from search, you can delete it. But after deletion, the slot cannot just be left empty. It must be marked as deleted.

Insert(key) — For inserting a key, keep probing until an empty or deleted slot is found. Then insert the key. If it was a deleted slot then it will no longer be marked as deleted.

The third technique in Open Addressing is known as double hashing.

3. Double Hashing

In Double Hashing, the hash function is a combination of two hash functions.

--

--