There are various LFU caching solutions. This solution is to keep cache operations optimized and bring in LFU algorithm (discussed here) only when cache size reaches to its threshold.
What is this solution about?
1. Cache add and read operations in O(1).
2. If cache size reaches to its threshold, LFU (Least Frequently Used) elements from cache are removed.
3. To find LFU elements, frequencies of read operations from cache are determined.
4. Frequencies of read operations are maintained in different data structure.
5. To determine LFU elements, frequencies are sorted using temporary data structure like red-black tree for sorting.
6. Temporary data structure for sorting and LFU determination are used only when cache size reaches to its threshold limit. This ensures O(1) add/read operations to/from cache, till size of cache is below threshold.
1. We will start with two data structures (can be HashMap, for O(1) solution)
First data structure is CACHE to hold key value pairs and other is to hold frequencies of read operations from cache.
2. Put or Add Operation -
If not already present in cache,
If not already present in frequency map,
3. Read Operation -
If required key (data) is present in cache, read and return value, in O(1).
If required key (data) is present in frequency map, read [key, value] pair, increment frequency number (to record, read once more), in O(1).
If required key (data) is not present in cache, read data from backend/DB (actual source of your data), add this data to cache and return this data. Also, add this data to frequency map with value 1 (read once).
4. Cache size reached to its threshold -
Here we will use temporary data structure like TreeSet (in java).
We need to add [key, value] pairs from frequency map to this TreeSet.
Note: TreeSet should sort values based on [value i.e. frequencies]. This sorted keys (sorted on frequencies) would allow us to iterate, read keys in sequence and remove these keys from cache and frequency map.
Trick here is to keep duplicate values (frequencies), as we are thinking of tree[SET].
When TreeSet is ready with sorted frequencies, read keys one by one in sequence and remove these keys from cache and frequency map.
Number of keys to be removed from cached can be pre-configured.
Now when our cache has some space, we are ready to go back to normal operation.
We can remove/delete temporary data structure, TreeSet (as garbage collected in java).