International Core Journal of Engineering 2020-26 | Page 63

B. Double Queue K-frequency Cache

First of all, this paper regards a row in the data set as a record. The BlockCache strategy uses blocks as the basic unit for reading data in order to improve read performance. However, BlockCache aims at system-oriented read and write performance; consequently, when records in the data set are read randomly, it does not achieve a high hit rate.

In this paper, we propose an LRU-based double-queue K-frequency cache method, named DLK. Unlike LRU, DLK uses two LRU queues (L1 and L2). Records in the cache have two special properties: timestamp and k. timestamp is the last access time of the record; k is the number of times the record has been accessed during its lifetime in the cache, with a minimum value of 1. Record, timestamp and k constitute the basic unit of cached data within the cache space, as shown in Fig. 1.

Fig. 1 Cache architecture.

(1) When a record is accessed for the first time in a time slot, it is put into L1. If, within a certain period, the record is retrieved Kmax times, it is moved into L2. The value of Kmax is specified by the user. Records in L1 and L2 are both evicted according to the LRU algorithm. Different from LRU, a record evicted from L2 is moved back into L1 at its time-sequentially ordered position, and its k value is reset to the average of the k values of the previous and next records in L1.

For example, when a record Rm needs to be evicted from L2, it is inserted into L1 according to its timestamp Tm, ensuring that the timestamps of all records in L1 remain in descending order. Assuming that Rm is inserted at position i in L1, k of Rm is calculated according to Eq. 2:

k_i = (k_{i-1} + k_{i+1}) / 2    (2)

Fig. 2 shows the operation flow of caching a record. When a record Rrequest is retrieved, there are three situations:

(1) If Rrequest misses in the cache, it is put into L1 and its k is set to 1. If L1 is full, corresponding data must first be evicted from L1 according to the LRU algorithm to leave enough space before storing Rrequest.

(2) If Rrequest is found in L1, its k is increased by 1. When k reaches Kmax, Rrequest is evicted from L1 and inserted into L2. If L2 is full, a record is selected for eviction according to the LRU algorithm, moved to L1 at the corresponding position as described above, and its k is reset according to Eq. 2. When k is less than Kmax, L1 is updated according to the LRU algorithm.

(3) If Rrequest is found in L2, L2 is updated according to the LRU algorithm.

Fig. 2 The operation flow of cache layer.

C. Organization Architecture of Records in Memory

The organization architecture of data in memory determines retrieval efficiency. The LRU algorithm uses a queue to store data, whose retrieval time complexity is O(log(n)). To achieve efficient retrieval, this paper divides the memory into a data area and a list area, as depicted in Fig. 3.

The data area uses a key-value structure to store data. Every record has a unique identifier, denoted RowKey. When constructing a key-value pair, the RowKey serves as the key and the serialized record as the value. Because a key-value structure is used, retrieval depends only on the RowKey, so the time complexity of DLK is O(1). k and timestamp are stored in the same way.

In the list area, two LRU queues store the information of all cached records; they hold the RowKeys instead of the entire records. Organizing records with LRU queues makes it efficient to determine which record should be evicted.

Fig. 3 Data organization structure.
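The three retrieval situations and the L2-to-L1 demotion with Eq. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name DLKCache, the capacity parameters, and the logical-clock timestamps are all invented for the sketch, and the data area and list area are merged into per-queue key-value maps for brevity.

```python
import itertools
from collections import OrderedDict

class DLKCache:
    """Sketch of a DLK-style double-queue K-frequency cache (illustrative).

    Both queues are LRU queues kept oldest-first: the front of each
    OrderedDict is the least recently used record. Because timestamp is
    the last access time, LRU order and ascending-timestamp order in L1
    coincide, which the demotion step relies on.
    """

    def __init__(self, l1_cap, l2_cap, k_max):
        self.l1_cap, self.l2_cap, self.k_max = l1_cap, l2_cap, k_max
        self.l1 = OrderedDict()          # RowKey -> entry, LRU at the front
        self.l2 = OrderedDict()
        self._clock = itertools.count()  # logical clock for timestamps

    def access(self, row_key, record):
        now = next(self._clock)
        if row_key in self.l2:                       # situation (3): hit in L2
            entry = self.l2[row_key]
            entry["k"] += 1
            entry["timestamp"] = now
            self.l2.move_to_end(row_key)             # LRU update of L2
            return entry["record"]
        if row_key in self.l1:                       # situation (2): hit in L1
            entry = self.l1[row_key]
            entry["k"] += 1
            entry["timestamp"] = now
            if entry["k"] >= self.k_max:             # k reached Kmax: promote
                del self.l1[row_key]
                if len(self.l2) >= self.l2_cap:
                    self._demote_l2_victim()         # make room per Eq. 2
                self.l2[row_key] = entry
            else:
                self.l1.move_to_end(row_key)         # LRU update of L1
            return entry["record"]
        # situation (1): cache miss -> insert into L1 with k = 1
        if len(self.l1) >= self.l1_cap:
            self.l1.popitem(last=False)              # evict LRU record of L1
        self.l1[row_key] = {"record": record, "timestamp": now, "k": 1}
        return record

    def _demote_l2_victim(self):
        """Move the LRU record of L2 back into L1 at its timestamp
        position and reset k to the neighbour average (Eq. 2)."""
        victim_key, victim = self.l2.popitem(last=False)
        items = list(self.l1.items())                # ascending timestamp
        pos = next((i for i, (_, e) in enumerate(items)
                    if e["timestamp"] > victim["timestamp"]), len(items))
        # Fall back to the victim's own k when a neighbour is missing
        # (a boundary case the paper does not specify).
        prev_k = items[pos - 1][1]["k"] if pos > 0 else victim["k"]
        next_k = items[pos][1]["k"] if pos < len(items) else victim["k"]
        victim["k"] = (prev_k + next_k) / 2          # Eq. 2
        items.insert(pos, (victim_key, victim))
        if len(items) > self.l1_cap:
            items.pop(0)                             # keep L1 within capacity
        self.l1 = OrderedDict(items)
```

Looking up `row_key` in each OrderedDict corresponds to the O(1) key-value retrieval of the data area; in the paper the RowKey-only LRU queues of the list area and the serialized records of the data area would be separate structures.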