redis内存淘汰机制审深入分析

/ 默认分类 / 0 条评论 / 211浏览

redis内存淘汰机制审深入分析

一.引入

redis是基于内存的缓存,当存储的缓存数据超过了设置的可用内存最大限制,那么就会触发内存淘汰机制,来根据配置的不同策略删除key,释放内存

二.配置

redis.conf中可以进行相关配置

//配置最大可用内存
maxmemory <bytes>

//配置内存淘汰机制  
# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached. You can select one from the following behaviors:
#
# volatile-lru -> Evict using approximated LRU, only keys with an expire set. //只在设置了过期时间的key中使用lru淘汰
# allkeys-lru -> Evict any key using approximated LRU.  //在所有的key中使用lru淘汰
# volatile-lfu -> Evict using approximated LFU, only keys with an expire set. //只在设置了过期时间的key中使用lfu淘汰
# allkeys-lfu -> Evict any key using approximated LFU. //在所有的key中使用lfu淘汰
# volatile-random -> Remove a random key having an expire set. //在设置了过期时间的key中随机淘汰
# allkeys-random -> Remove a random key, any key. //在所有key中随机淘汰
# volatile-ttl -> Remove the key with the nearest expire time (minor TTL) //删除过期时间最近的key
# noeviction -> Don't evict anything, just return an error on write operations. //不做任何淘汰,内存满了还添加key直接报错 
#
# LRU means Least Recently Used
# LFU means Least Frequently Used
maxmemory-policy noeviction

三.分析

redis中的LRU并没有维护一个数据结构用来获取淘汰的key,使用的是时钟比较的方法,来获取最久没有被使用的key,并且实现的是一种近似的LRU,因为没有对所有的key进行LRU,而是从随机获取的样本数据中进行LRU.

关于LRU和LFU的算法介绍参考我的另一篇博客

在redis中键和值都是redisObject结构体

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

其中会记录当前key的LRU时间值,按照每个key的最后一次访问时间和当前系统时钟时间比较来判断空闲时间
在redis中不同的过期策略最终都是通过使用不同的逻辑计算出idle空闲时间,然后比较空闲时间来进行淘汰;

redis中每次执行获取key的时候,都会执行下面的查找方法:

/* Low level key lookup API, not actually called directly from commands
 * implementations that should instead rely on lookupKeyRead(),
 * lookupKeyWrite() and lookupKeyReadWithFlags(). */
robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (server.rdb_child_pid == -1 &&
            server.aof_child_pid == -1 &&
            !(flags & LOOKUP_NOTOUCH))
        {
            //维护更新淘汰策略数据
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                updateLFU(val);
            } else {
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}

主要的redis中的过期淘汰策略逻辑在evict.c中,大致的逻辑如下图
evict.c

redis中使用的LRU为近似LRU,按照配置的不同的过期策略,抽样分为对所有的key数据进行抽样以及对所有设置了缓存失效时间的key进行抽样,抽样的数量可以在redis配置文件中配置maxmemory-samples进行配置

    int maxmemory_policy;           /* Policy for key eviction */
    int maxmemory_samples;          /* Pricision of random sampling */

可以看出,随机抽样的数目越大,LRU的准确度会更高,但是抽样数目越大会导致LRU的性能越低,redis中默认值为5,这样可以在准确度和性能中得到平衡.

下面我截取了redis源码中的缓存过期淘汰evict.c的一部分逻辑:

int freeMemoryIfNeeded(void) {
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    //如果当前使用的内存小于maxmemory限制,那么直接返回C_OK
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;

    mem_freed = 0;

    //如果配置的淘汰策略是禁止释放内存,则直接报错,返回C_ERR
    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */

    latencyStartMonitor(latency);
    //不断循环释放key,直到释放的内存达到需要的内存大小
    while (mem_freed < mem_tofree) {
        int j, k, i, keys_freed = 0;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            //过期淘汰池的大小默认为16
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                 //遍历每一个库,从中抽样获取key数据
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    //按照策略得到筛选的源数据,要么是所有key,要么只是设置了过期时间的key
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                    //向淘汰key池中插入数据
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                //遍历过期淘汰池
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

过期淘汰池子中的数据是怎样插入进去的呢?

/**
sampledict是根据淘汰策略得到的原数据(allkeys  ||  expire keys)
keydict是当前库的所有key
**/
void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    //抽样获取指定数目的键的位置,存储在samples中
    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    for (j = 0; j < count; j++) {
        unsigned long long idle;
        sds key;
        robj *o;
        dictEntry *de;

        de = samples[j];
        key = dictGetKey(de);

        /* If the dictionary we are sampling from is not the main
         * dictionary (but the expires one) we need to lookup the key
         * again in the key dictionary to obtain the value object. */
        if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
        }

        //按照不同的过期策略计算得到每一个随机抽样出的key的空闲时间idle
        /* Calculate the idle time according to the policy. This is called
         * idle just because the code initially handled LRU, but is in fact
         * just a score where an higher score means better candidate. */
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            /* When we use an LRU policy, we sort the keys by idle time
             * so that we expire keys starting from greater idle time.
             * However when the policy is an LFU one, we have a frequency
             * estimation, and we want to evict keys with lower frequency
             * first. So inside the pool we put objects using the inverted
             * frequency subtracting the actual frequency to the maximum
             * frequency of 255. */
            idle = 255-LFUDecrAndReturn(o);
        } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
            /* In this case the sooner the expire the better. */
            idle = ULLONG_MAX - (long)dictGetVal(de);
        } else {
            serverPanic("Unknown eviction policy in evictionPoolPopulate()");
        }

        /* Insert the element inside the pool.
         * First, find the first empty bucket or the first populated
         * bucket that has an idle time smaller than our idle time. */
         //将当前key按照idle插入到pool中
         // 左=============右
         //  1 2 3 4 5 6 7  (idle)
        k = 0;
        while (k < EVPOOL_SIZE &&
               pool[k].key &&
               pool[k].idle < idle) k++;
//        k=0,只有两种情况 1.当前key的idle比池子最左边的idle还要小(池子中有数据,可能是有部分数据,也可能是满的) 2.当前池子是空的
        //下面这个if是指当前key的idle比池子最左边的idle还要小,并且池子是满的
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            /* Can't insert if the element is < the worst element we have
             * and there are no empty buckets. */
            continue;
        }
        //池子不是满的(1.有数据但是不满 2.池子是空的),那么直接插入第一个空位置
        else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            /* Inserting into empty position. No setup needed before insert. */
        }
        //池子有数据,但是需要插入的位置不是空,即idle在现有的key的idle之间,那么进行pool中数据移位 
        else {
            /* Inserting in the middle. Now k points to the first element
             * greater than the element to insert.  */
             if (pool[EVPOOL_SIZE-1].key == NULL) {
                /* Free space on the right? Insert at k shifting
                 * all the elements from k to end to the right. */

                /* Save SDS before overwriting. */
                sds cached = pool[EVPOOL_SIZE-1].cached;
                memmove(pool+k+1,pool+k,
                    sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                pool[k].cached = cached;
            } else {
                /* No free space on right? Insert at k-1 */
                k--;
                /* Shift all elements on the left of k (included) to the
                 * left, so we discard the element with smaller idle time. */
                sds cached = pool[0].cached; /* Save SDS before overwriting. */
                if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                memmove(pool,pool+1,sizeof(pool[0])*k);
                pool[k].cached = cached;
            }
        }

四.总结

redis中的LRU是遍历每一个DB,然后从每个数据库中随机获取抽样的个数,然后对这些抽样key按照不同的淘汰策略计算出idle空闲时间,然后将它们放入池子中,池子是按照idle时间长短排列的,idle时间越长的排在越右边,越短的排在越左边,默认池子大小为16,并且放入池子也是有逻辑的:

  1. 当待插入的key的idle比最左边的idle还要小,并且池子是满的,那么就不插入,这重点是池子是满的;
  2. 如果池子有空位置,并且待插入的key的idle比现有的最右边的还要大,那么直接插入在空位置上;
  3. 如果待插入的数据在现有的key的idle之间,那么就按照顺序插入;