How ConcurrentHashMap works and ConcurrentHashMap interview questions.

ConcurrentHashMap Interview Questions In Java.

Question 1.  What is the need of ConcurrentHashMap when there is HashMap and Hashtable already present?

Performance and Thread safety are 2 parameter on which ConcurrentHashMap is focused. Imagine a scenario where we have frequent reads(get) and less writes(put) and need thread safety,

Can we use Hashtable in this scenario?
No. Hashtable is thread safe but give poor performance in case of multiple thread reading from hashtable because all methods of Hashtable including get() method is synchronized and due to which invokation to any method has to wait until any other thread working on hashtable complete its operation(get, put etc).

Can we use HashMap in this scenario?
No. Hashmap will solve performance issue by giving parallel access to multiple threads reading hashmap simultaneously.
But Hashmap is not thread safe, so what will happen if one thread tries to put data and requires Rehashing and at same time other thread tries to read data from hashmap, It will go in infinite loop.

Infinite loop problem discussed in detail: Infinite loop in HashMap
 

ConcurrentHashMap combines good features of hashmap and hashtable and solves performance and thread safety problem nicely.



Question 2.  HashMap and Hashtable uses Array and Linkedlist as datastructure to store data, How is it different in ConcurrentHashMap?

If you are not familiar with HashMap and Hashtable, Please go through it first:
How HashMap works

Below diagram shows how hashtable/hashmap look like,

ConcurrentHashMap added one Array on top of it and each index of this additional array represents complete HashMap. Additional array is called Segment in ConcurrentHashMap.

Architecture of ConcurrentHashMap looks like below,


Putting key-value pair: 

1. Putting key-value pair in ConcurrentHashMap requires first identifying exact index in 
    Segment array. 
    (Once Segment array index is identified, Now flow will be exactly same as putting the data in 
    hashmap/hashtable.) 
2. After identifying index in Segment array, next task is to identify index of internal bucket/array 
    present in internal hashmap as shown in figure above.
 3. After identying bucket(internal array index), iterate key-value pairs and check each key with key 
    to store, wherever match is found replace stored value with value to store.
    If there is no match, store key-value pair at the last of list.


Getting key-value pair:

1. Getting key-value pair in ConcurrentHashMap requires first identifying exact index in 
    Segment array. 
    (Once Segment array index is identified, Now flow will be exactly same as getting the data from
    hashmap/hashtable.) 
2. After identifying index in Segment array, next task is to identify index of internal bucket/array 
    present in internal hashmap as shown in figure above.
3. After identying bucket(internal array index), iterate key-value pairs and match each key with 
    given key, wherever match is found return value stored against key.
    If there is no match, return null.



Question 3.  How ConcurrentHashMap is efficient in terms of Performance and Thread safety?

ConcurrentHashMap provides better Performance by replacing the Hashtable's map wide lock to Segment level lock.

Hashtable is not efficient beacause it uses map wide lock, it means lock is applied on map object itself, 

So if 2 threads tries to call hashtable.get(key), 
Thread T1 calls to get() method will acquire a lock on hashtable object and then execute get() method. (Lock is on complete 'hashtable object')

Now if Thread T2 calls hashtable.get(key) method, then it will also try to acquire lock on hashtable object, but T2 will not able to acquire lock as lock on 'hashtable' is currently held by T1, 

So T2 waits until T1 finishes get() operation and release lock on hashtable object.


ConcurrentHashMap works bit different here and instead of locking complete map object it Locks per Segment. 
It means instead of single map wide lock, it has multiple Segment level lock.

So 2 Threads can execute put operation simultaneously by acquiring lock on different Segments.

Thread T1 calls concurrentHashMap.put(key, value), It acquires lock on say Segment 1 and invokes put method.
Thread T2 calls concurrentHashMap.put(key, value), It acquires lock on say Segment 4 and invokes put method.

Both threads doesn't interfere with each other and both can proceed simultaneously as they are working on separate Segment locks.


This is how ConcurrentHashMap improves Performance and provide Thread safety as well. 



Question 4.  Can multiple threads read and write from same or different Segments of ConcurrentHashMap simultaneously?

Read Operation: get(key)
Same Segment/Different Segment : Yes. 
Two threads T1 and T2 both can simultaneously read data from same Segment or different Segment of CHM simultaneously without blocking each other.


Write Operation: put(key, value)
Different Segment :Yes
Multiple threads can write data to different Segment of CHM simultaneously without blocking each other.
Same Segment : No  
Multiple threads CANNOT write data to same Segment of CHM simultaneously and need to wait for one thread to come write operation and then only other write operation can be proceed.
  

Read-Write Operation: get and put
Say T1 is writing data in Segment 1 and T2 is reading data from same Segment 1, can read be allowed while writing operation is going on?
YES. 
Both operation that is T1 writing and T2 reading can be done parallely

What data will T2 read if T1 is updating same data?
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). 
Latest updated value present will be returned by get operation that is value updated by most recently completed update operationswill be returned.

Note: Get operations are lock free and can be performed simulateneously irrespective of other thread writing data on same or different Segment of CHM.



Question 5.  What is the default size of Segment array? how it is tuned? What is ConcurrenyLevel in case of CHM?

Default size of Segment array is 16. 
 
ConcurrentHashMap differes from Hashtable in terms of Performance by introducing Segment array.
Each index of Segment array is guarded by a lock for put operation.


Threads working on separate Segments index doesn't affect each other. 
By default Segments array size is 16, So maximum 16 threads can simultaneously put data in map considering each thread is working on separate Segment array index.

How Segment array size is tuned?

Segment size decides the number of Threads that can paralley write to a map.
Segment array size is configured using ConcurrencyLevel parameter as shown below,
ConcurrentHashMap m = new ConcurrentHashMap(initialCapacity, loadFactor, concurrencyLevel)


It takes 3 parameters,   

Example:
ConcurrentHashMap m = new ConcurrentHashMap(200 , 0.75f, 10);
Initial capacity is 200, it means CHM make sure it has space for adding 200 key-value pairs after creation.

Load factor is 0.75, it means when average number of elements per map exceeds 150 (intital capacity * load factor = 200 * 0.75 = 150) at that time map size will be increased and existing items in map are rehashed to put in new larger size map.
For more details on Load Factor: Load factor in Map

Concurrency level is 10, it means at any given point of time Segment array size will be 10 or greater than 10, so that 10 threads can able to parallely write to a map.




Question 6.  What will happen if the size of Segment array is too small or too large?

Choosing correct ConcurrencyLevel is very important because ConcurrencyLevel decides what will be the size of Segment array.

Segment array size will decide how many parallel Threads will be able to execute put operation on map parallely.

So Segment array size should not be too big or should not be too small because,
Using a significantly higher value than we will waste space and time, and a significantly lower value can lead to thread competition.



Question 6.  If we choose ConcurrenyLevel as 10 then what will be size of Segment array? Is Segment array size exactly same as concurrenyLevel? If No, then how is the Segment array size calculated?

Segment array size is calculated based on concurrenyLevel specified but it doesn't mean it will be exactly same as concurrenyLevel.

If concurrenyLevel is 10 then Segment array size will be 16.

Segment array size = 2 to the power x, where result should be >= concurrenyLevel(in our case it is 10)
Segment array size = 2 to the power x >= 10

Segment array size = 2 ^ 1 = 2   >= 10 (False)
Segment array size = 2 ^ 2 = 4   >= 10 (False)
Segment array size = 2 ^ 3 = 8   >= 10 (False)
Segment array size = 2 ^ 4 = 16 >= 10 (True) 

Segment array size is 16.

Example: 2
concurrenyLevel = 8 then Segment array size = ?
Find 2 ^ x >= 8 

2 ^ 1 >= 2 
2 ^ 2 >= 4
2 ^ 3 >= 8
Segment array size will be 8.



Question 7.  What is HashEntry array and how is the size of HashEntry decided?

Default initial capacity of CHM is 16. It means CHM make sure there is sufficient space to accomodate 16 key-value pairs after CHM is created.

What is HashEntry[]?

We saw that each index of Segment array itself is complete HashMap, 
 

In HashMap the bucket/array is of class Entry[] and in CHM the array is of class HashEntry[].
 
static final class Segment<K,V> extends ReentrantLock implements Serializable {
 transient volatile HashEntry<K,V>[] table;
}

static final class HashEntry<K,V> {
 final int hash;
 final K key;
 volatile V value;
 volatile HashEntry<K,V> next;
}


Lets see how internal architecture of CHM looks like, 



How HashEntry[] array size will is calculated?
ConcurrentHashMap m = new ConcurrentHashMap(initialCapacity, loadFactor, concurrencyLevel)

HashEntry[] array size  =   2 ^ x   >=  (initialCapacity / concurrenyLevel)

Eg: ConcurrentHashMap(32,   0.75f,   4);
HashEntry[] array size  =  2 ^ 1 = 2   >=  8(32/4) (False)
HashEntry[] array size  =  2 ^ 2 = 4   >=  8 (False)
HashEntry[] array size  =  2 ^ 3 = 8   >=  8 (True)

HashEntry[] array size = 8
It means there will always be capacity of 8 key-value pairs that can be put in CHM after its creation.




Question 8.  How does ConcurrentHashMap handle rehashing while another thread is still writing on another segment/partition?

For understanding rehashing, please understand Load Factor first.
Load Factor is a measure, which decides when exactly to increase the HashMap/CHM capacity(buckets) to maintain get and put operation complexity of O(1).

Default load factor of Hashmap/CHM is 0.75f (i.e 75% of current map size).

For more details on Load factor, Please refer: Load Factor in HashMap


In CHM, Every segment is separately rehashed so there is no collision between Thread 1 writing to Segment index 2 and Thread 2 writing to Segment index 5.

Example: If say Thread 1 which is putting data in Segment[] array index 2 finds that HashEntry[] array needs to be rehashed due to exceed Load factor capacity then it will rehash HashEntry[] array present at Segment[] array index 2 only.
HashEntry[] array at other Segment indexes will still be intact, unaffected and continue to serve put and get request parallely.



MORE QUESTIONS COMING SOON...


You may also like to see


Exception Handling Interview Question-Answer

Method Overloading - Method Hiding Interview Question-Answer

Advanced Multithreading Interview Questions-Answers In Java

Type Casting Interview Questions-Answers In Java

How Thread.join() in Java works internally

How is ambiguous overloaded method call resolved in java

 

Enjoy !!!! 

If you find any issue in post or face any error while implementing, Please comment.

Java Interface interview questions and answers

Interface Interview Questions In Java.


Interview questions on Java Interface

Question 1. Super class of all classes is java.lang.Object class, does this applies to Interface as well? What is super class of all Interface in Java?

java.lang.Object class is the super class of all Java classes. All non-primitive types (including arrays) inherit either directly or indirectly from Object class. 
For Interface that is not the case, Super class of Interface is null.

From below program that will be clear.

interface ITest{}
class CTest{}

class Test{ 
 public static void main(String[] args) {
  System.out.println(CTest.class.getSuperclass()); // class java.lang.Object
  System.out.println(int[].class.getSuperclass()); // class java.lang.Object
  System.out.println(ITest.class.getSuperclass()); // null
  System.out.println(int.class.getSuperclass());   // null
 }
}
Output:
 class java.lang.Object
 class java.lang.Object
 null
 null

The Knapsack Problem

0/1 Knapsack Problem solved using Iterative and Dynamic Programming


Given weights and values(profits) of N items, put these items in a knapsack of max capacity W to get the maximum total value(profit) in the knapsack.

Advanced Multithreading Interview Questions In Java

Multithreading Tutorial.
Interview Questions on Threads in Java


Question 1. What is the use of Threads in Java? why Thread is required? What are Threads in Java?

Lets try to understand this with simple scenario and it will be more clear: 

Scenario:
Suppose you want to count the population of a India, how will you approach? 
Note: There are 29 states in India. 

Approach 1:
First approach is, you start with first state and count population of that state then you will start second state and so on for all 29 states.
Once you have population of all the states, just sum the population count of all States. 

Imagine the time it will take for you to do this as you are alone and you have to count population state by state. 

Approach 2:
Second approach is, you called 29 people to help you out and you distributed the task of population count to 29 person, each person taking care of individual state. 
So, person 1 will take care of population count for State 1. 
Person 2 will take care of population count for State 2 and so on. 

Once you have population count of all the states, just sum the population count received from all 29 person. 

Imagine the time it will take for you to do this as compared to Approach 1, surely it will be much less. 
   
So that is what Thread does. In above scenario, you can consider 29 persons as 29 Threads who are doing their respective task of population count. 
It is possible that Person 1 may finish population count for State 1 assigned to it much early than Person 2 doing population count for State 2 because State 1 was small. 
Person 2 will continue doing his task even after Person 1 finished early. 

In the similar way, Say If you have 2 Threads say Thread 1 and Thread 2. Thread 1 may complete its job early and Thread 2 will continue doing its job even after Thread 1 is done and they both execute separately. 

Now to relate it with Threads:
When you have task like above that needs to be run in parallel for faster processing at that time Threading will come in picture.
You can say, Java Threads helps creating multiple independent path of execution within a program which can run parallely.
Application Example:
In Java, when a program requires more than one task to execute in parallel, say for example,
  1. Reading a data from a local file.
  2. Reading a data from remote connection.
When both of above task need to be executed in parallel at that time Threading will come in picture.
So Java Threads helps creating multiple independent path of execution within a program which can run in parallel.