International Core Journal of Engineering 2020-26 | Page 63
move it to L1 at the corresponding position as described
above, and reset its k according to Eq. 2. When k is less than
Kmax, update it according to the LRU algorithm.
(3) If Rrequest is found in L2, update L2 according to the
LRU algorithm.
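The three situations of the lookup flow can be sketched as follows. This is a simplified illustration, not the paper's implementation: class and parameter names such as `DLK`, `k_max`, `cap1`, and `cap2` are assumptions, and the L2-to-L1 demotion here omits the timestamp-ordered insertion and the Eq. 2 averaging of k.

```python
from collections import OrderedDict

class DLK:
    """Simplified sketch of the DLK lookup flow (names are illustrative)."""

    def __init__(self, k_max, cap1, cap2):
        self.k_max = k_max          # promotion threshold Kmax
        self.cap1, self.cap2 = cap1, cap2
        self.l1 = OrderedDict()     # RowKey -> k, least recently used first
        self.l2 = OrderedDict()

    def get(self, key):
        if key in self.l2:                      # situation (3): hit in L2
            self.l2.move_to_end(key)            # LRU update of L2
        elif key in self.l1:                    # situation (2): hit in L1
            self.l1[key] += 1
            if self.l1[key] >= self.k_max:      # k reached Kmax: promote to L2
                k = self.l1.pop(key)
                if len(self.l2) >= self.cap2:   # L2 full: demote its LRU victim
                    victim, vk = self.l2.popitem(last=False)
                    self.l1[victim] = vk        # (Eq. 2 k-reset omitted here)
                self.l2[key] = k
            else:
                self.l1.move_to_end(key)        # LRU update of L1
        else:                                   # situation (1): cache miss
            if len(self.l1) >= self.cap1:
                self.l1.popitem(last=False)     # evict LRU record from L1
            self.l1[key] = 1                    # new records start with k = 1
```

Using an ordered map for each queue keeps every LRU update constant-time, matching the O(1) retrieval claim made for DLK later in the paper.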
Fig. 1 Cache architecture.
B. Double Queue K-frequency Cache
First of all, this paper defines a row in the data set as a
record. The BlockCache strategy uses blocks as the basic unit
for reading data in order to improve read performance. However, the
BlockCache strategy aims to improve system-oriented read
and write performance; consequently, when records are read
randomly from the data set, the BlockCache strategy does not
achieve a high hit rate.
In this paper, we propose an LRU-based double-queue
K-frequency cache method, named DLK. Unlike LRU, DLK
uses two LRU queues (L1 and L2). Records in the cache have two
special properties: timestamp and k. timestamp is the last
access time of the record; k is the number of times the record
has been accessed during its lifetime in the cache, with a minimum
value of 1. The record, its timestamp, and k together constitute the
basic unit of cached data within the cache space, as shown below.
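As a concrete illustration, the basic caching unit described above (record, timestamp, k) could be represented as follows; the class and field names are assumptions for this sketch, not the paper's definitions.

```python
import time
from dataclasses import dataclass, field

@dataclass
class CacheEntry:
    """One basic unit of cached data: the record plus its two properties."""
    record: bytes                                        # the serialized record
    timestamp: float = field(default_factory=time.time)  # last access time
    k: int = 1                                           # access count; minimum is 1

    def touch(self):
        """Update on each access: refresh the timestamp and increment k."""
        self.timestamp = time.time()
        self.k += 1
```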
Fig. 2 The operation flow of cache layer.
C. Organization Architecture of Records in Memory
The organization architecture of data in memory determines
retrieval efficiency. The LRU algorithm uses a queue to store data,
whose retrieval time complexity is O(log(n)). In order to achieve
efficient retrieval, this paper divides the memory into a data
area and a list area, as depicted in Fig. 3.
When a record is accessed for the first time in a time slot,
it is put into L1. If, within a certain time period, the record is
retrieved Kmax times, it is moved into L2. The value of
Kmax is specified by the user. Records in L1 and L2 are all
evicted according to the LRU algorithm. Different from LRU,
a record evicted from L2 moves into L1 in time-sequential
order, and its k value is reset to the average of the k values of the
previous and next records in L1. For example, when record Rm needs to
be evicted from L2, it is inserted into L1 according to its
timestamp Tm, ensuring that the timestamps of all records in
L1 remain in descending order. Assuming that Rm is
inserted at position i in L1, the k of Rm is calculated
according to Eq. 2.
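The L2-to-L1 demotion step can be sketched like this, assuming L1 is kept as a list sorted by descending timestamp; the function and variable names are hypothetical, and the boundary fallback (k = 1 when a neighbour is missing) is an assumption the paper does not spell out.

```python
import bisect

def demote_to_l1(l1, key, timestamp):
    """l1: list of (timestamp, key, k) tuples sorted by descending timestamp.
    Insert the record evicted from L2 at the position given by its timestamp,
    resetting k to the average of the neighbouring records' k (Eq. 2)."""
    ascending = [-t for (t, _, _) in l1]      # negate so bisect sees ascending order
    i = bisect.bisect_left(ascending, -timestamp)
    if 0 < i < len(l1):                       # both neighbours exist in L1
        k = (l1[i - 1][2] + l1[i][2]) / 2     # Eq. 2: average of previous and next k
    else:
        k = 1                                 # assumed fallback: minimum k at the ends
    l1.insert(i, (timestamp, key, k))
    return k
```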
Fig. 3 Data organization structure.
Fig. 2 shows the operation flow of caching a record. When a
record Rrequest is retrieved, there are three situations, described below.

The data area uses a key-value structure to store data. Every
record has a unique identifier, denoted as RowKey. When
constructing the key-value pair, the RowKey serves as the key and the
serialized record as the value. Because a key-value structure is used to
store the data, retrieval depends only on the RowKey, so the time
complexity of DLK is O(1). k and timestamp are stored
in the same way.
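A minimal sketch of the data area under these assumptions follows; the serialization format is not specified by the paper, so JSON is used here purely for illustration.

```python
import json

# Data area: a hash map keyed by RowKey gives O(1) retrieval.
data_area = {}

def put_record(row_key, record):
    """Store the serialized record under its RowKey."""
    data_area[row_key] = json.dumps(record).encode()

def get_record(row_key):
    """O(1) lookup by RowKey; returns the deserialized record or None."""
    raw = data_area.get(row_key)
    return None if raw is None else json.loads(raw)
```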
(1) If Rrequest misses in the cache, put Rrequest into L1 and set its
k to 1. If L1 is full, before storing Rrequest, the corresponding
data in L1 must first be evicted based on the LRU algorithm to
leave enough space.

In the list area, two LRU queues are used to store the record
information of all cached records; they store the RowKeys
instead of entire records. Organizing records with LRU queues
makes it efficient to determine which record to evict.
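Under these assumptions, the list area can be sketched as two LRU queues that hold only RowKeys, with the records themselves living in the data area; `ListArea` and its method names are illustrative, not the paper's API.

```python
from collections import OrderedDict

class ListArea:
    """Sketch of the list area: two LRU queues holding RowKeys only."""

    def __init__(self):
        self.l1 = OrderedDict()   # RowKey -> None, least recently used first
        self.l2 = OrderedDict()

    def touch(self, queue, row_key):
        """Mark row_key as most recently used in the given queue."""
        queue.move_to_end(row_key)

    def evict(self, queue):
        """Pop and return the RowKey of the least recently used record."""
        row_key, _ = queue.popitem(last=False)
        return row_key
```

Keeping only RowKeys in the queues keeps the eviction bookkeeping small, while the full serialized records stay in the data area.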
k_i = (k_{i-1} + k_{i+1}) / 2    (2)
(2) If Rrequest is found in L1, its k is increased by 1. When k reaches
Kmax, Rrequest is evicted from L1 and then inserted into L2. If L2 is
full, select a record to evict according to the LRU algorithm, and