Caching MyBatis with Redis


How to enable L2 cache

See http://www.mybatis.org/mybatis-3/sqlmap-xml.html#cache
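
In short, keep cacheEnabled at its global default of true and declare a cache inside every mapper that should use the L2 cache. A minimal, illustrative example (attribute values are placeholders):

<settings>
  <setting name="cacheEnabled" value="true"/>
</settings>

<!-- inside the mapper XML -->
<mapper namespace="com.xxx.studentMapper">
    <cache eviction="LRU" flushInterval="60000" size="512" readOnly="true"/>
    <!-- select/update statements here -->
</mapper>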

How cache works

Construction steps

We can read the source code here:

// Impl: PERPETUAL (permanent): org.apache.ibatis.cache.impl.PerpetualCache
// Decorator: LRU (least recently used): org.apache.ibatis.cache.decorators.LruCache
org.apache.ibatis.builder.MapperBuilderAssistant#useNewCache
  1. Construct a new instance of the cache using reflection.
  2. Set individual properties using SystemMetaObject.
  3. Call org.apache.ibatis.builder.InitializingObject#initialize for customization, see #816.
  4. Add the decorator chain around the cache (CacheBuilder#build).
  5. Put the cache into a Map<namespace, Cache> on the Configuration.
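
As a concrete illustration of those five steps, here is a minimal sketch of a custom cache. The class name, package and the host property are made up for this example; only the Cache and InitializingObject interfaces come from MyBatis.

package com.example;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import org.apache.ibatis.builder.InitializingObject;
import org.apache.ibatis.cache.Cache;

public class MyCustomCache implements Cache, InitializingObject {

    private final String id;            // the mapper namespace, also the map key of step 5
    private String host = "localhost";  // filled in by step 2 from <property name="host" .../>
    private final Map<Object, Object> store = new ConcurrentHashMap<>();

    // Step 1: MyBatis instantiates the cache reflectively through a (String id)
    // constructor and passes in the mapper namespace.
    public MyCustomCache(String id) {
        this.id = id;
    }

    // Step 2: properties declared in the XML/annotation are injected via setters.
    public void setHost(String host) {
        this.host = host;
    }

    // Step 3: called once after all properties are set (see #816),
    // e.g. to open a connection to `host`.
    @Override
    public void initialize() {
    }

    // Returning the namespace here is also what makes LoggingCache log under the mapper's name.
    @Override
    public String getId() {
        return id;
    }

    @Override public void putObject(Object key, Object value) { store.put(key, value); }
    @Override public Object getObject(Object key) { return store.get(key); }
    @Override public Object removeObject(Object key) { return store.remove(key); }
    @Override public void clear() { store.clear(); }
    @Override public int getSize() { return store.size(); }

    // Not called by the framework since 3.2.6; returning null is fine.
    @Override public ReadWriteLock getReadWriteLock() { return null; }
}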

If you are using a standard cache, you will get

// see org.apache.ibatis.annotations.CacheNamespace
// It may **produce dirty data** when the application is deployed on multiple nodes.
SynchronizedCache -> LoggingCache -> LruCache -> PerpetualCache

And if you are using a customized cache, you will get

LoggingCache -> CustomCache

If you want logging output, override getId and return the mapper's namespace as the id (LoggingCache uses it as the logger name).
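
Wiring it up (reusing the hypothetical MyCustomCache from the sketch above); the <property> value is injected through setHost() before initialize() runs:

<mapper namespace="com.xxx.studentMapper">
    <cache type="com.example.MyCustomCache">
        <property name="host" value="redis.internal"/>
    </cache>
</mapper>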

Cache Process flow

By default (cacheEnabled=true), the framework creates a CachingExecutor as a proxy (this is the second-level cache layer) around the actual database executor.

// Query -> CachingExecutor -> SimpleExecutor
org.apache.ibatis.session.Configuration#newExecutor

Here is a brief process flow diagram demonstrating how MyBatis caches when a query comes in (a simplified code sketch follows the list below).

  • L1 cache implementation: a Java HashMap, a.k.a. the localCache in BaseExecutor.
  • L2 cache implementation: Redis. HGET and HSET are commands for the Redis hash data type, and the id is the mapper's namespace.
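
A self-contained, heavily simplified model of that lookup order (MyBatis types are replaced by plain maps and a function here; the real CachingExecutor/BaseExecutor methods also carry MappedStatement, RowBounds, ResultHandler and so on):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class CachingQuerySketch {
    private final Map<Object, List<?>> l2Cache = new HashMap<>();    // stands in for the namespace Cache (e.g. Redis)
    private final Map<Object, List<?>> localCache = new HashMap<>(); // stands in for BaseExecutor's L1 HashMap
    private final Function<Object, List<?>> database;                // stands in for the real JDBC query

    CachingQuerySketch(Function<Object, List<?>> database) {
        this.database = database;
    }

    List<?> query(Object cacheKey, boolean flushCacheRequired) {
        if (flushCacheRequired) {            // statements declared with flushCache="true"
            l2Cache.clear();                 // the whole namespace is dropped first
        }
        List<?> result = l2Cache.get(cacheKey);      // 1. CachingExecutor: look in L2 (HGET)
        if (result == null) {
            result = localCache.get(cacheKey);       // 2. BaseExecutor: look in L1 (localCache)
            if (result == null) {
                result = database.apply(cacheKey);   // 3. miss everywhere: hit the database
                localCache.put(cacheKey, result);    // fill L1
            }
            l2Cache.put(cacheKey, result);           // fill L2 (in reality staged until commit, see below)
        }
        return result;
    }
}

On an L2 hit the delegate executor is never called, so the L1 map is not even touched.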

TransactionalCacheManager

For the L2 cache, only putObject, getObject and clear are actually called, even though all methods of the Cache interface are implemented.
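
This is because CachingExecutor never writes to the namespace cache directly; it goes through the TransactionalCacheManager, which wraps every cache in a transactional buffer. A rough model of that wrapper (the real org.apache.ibatis.cache.decorators.TransactionalCache also tracks missed keys and handles rollback):

import java.util.HashMap;
import java.util.Map;
import org.apache.ibatis.cache.Cache;

class TransactionalCacheSketch {
    private final Cache delegate;   // the real namespace cache, e.g. backed by Redis
    private final Map<Object, Object> entriesToAddOnCommit = new HashMap<>();
    private boolean clearOnCommit;

    TransactionalCacheSketch(Cache delegate) {
        this.delegate = delegate;
    }

    Object getObject(Object key) {
        return clearOnCommit ? null : delegate.getObject(key);   // reads go straight to L2
    }

    void putObject(Object key, Object value) {
        entriesToAddOnCommit.put(key, value);                     // writes are buffered, not yet visible
    }

    void clear() {
        clearOnCommit = true;                                     // the real clear happens on commit
        entriesToAddOnCommit.clear();
    }

    void commit() {
        if (clearOnCommit) {
            delegate.clear();
        }
        entriesToAddOnCommit.forEach(delegate::putObject);        // only now do results become visible to other sessions
        clearOnCommit = false;
        entriesToAddOnCommit.clear();
    }
}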

Redis caching improvements

In addition to the LinkedHashMap-based LRU cache, we also use Redis for distributed caching. Of course, there is already a Jedis-based open source project called Redis-cache.

However, there are some improvements to be made.

  • It creates a connection pool on each construction; a singleton instance is better.
  • It doesn't support Redis Sentinel mode.
  • The JDK-based serializer is risky when deployed across different platforms; JSON, XML or Parcelable is preferred.
  • There is no namespace for the Redis keys. cache:com.xx.mapper is more maintainable and debuggable when you DEL keys by prefix.

You need to fork the project and create your own cache, roughly as sketched below.
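
A sketch of what such a fork could look like, applying the points above: one shared JedisPool instead of a pool per cache instance, keys prefixed with cache:<namespace>, and one Redis hash per namespace so clear() is a single DEL. The class name, pool settings and serializer are illustrative (plain JDK serialization is used here to keep the sketch short; swap in a JSON serializer as argued above):

package com.example;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReadWriteLock;
import org.apache.ibatis.cache.Cache;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class RedisNamespaceCache implements Cache {

    // One pool for the whole application instead of one per cache construction.
    private static final JedisPool POOL = new JedisPool(new JedisPoolConfig(), "localhost", 6379);

    private final String id;      // mapper namespace
    private final byte[] hashKey; // "cache:" + namespace, easy to inspect and DEL by prefix

    public RedisNamespaceCache(String id) {
        this.id = id;
        this.hashKey = ("cache:" + id).getBytes(StandardCharsets.UTF_8);
    }

    @Override public String getId() { return id; }

    @Override
    public void putObject(Object key, Object value) {
        try (Jedis jedis = POOL.getResource()) {
            jedis.hset(hashKey, field(key), serialize(value));   // HSET cache:<ns> <cacheKey> <result>
        }
    }

    @Override
    public Object getObject(Object key) {
        try (Jedis jedis = POOL.getResource()) {
            byte[] raw = jedis.hget(hashKey, field(key));        // HGET cache:<ns> <cacheKey>
            return raw == null ? null : deserialize(raw);
        }
    }

    @Override
    public Object removeObject(Object key) {
        try (Jedis jedis = POOL.getResource()) {
            jedis.hdel(hashKey, field(key));
            return null;
        }
    }

    @Override
    public void clear() {
        try (Jedis jedis = POOL.getResource()) {
            jedis.del(hashKey);                                  // one DEL drops the whole namespace
        }
    }

    @Override
    public int getSize() {
        try (Jedis jedis = POOL.getResource()) {
            return jedis.hlen(hashKey).intValue();
        }
    }

    @Override public ReadWriteLock getReadWriteLock() { return null; }

    private static byte[] field(Object key) {
        return String.valueOf(key).getBytes(StandardCharsets.UTF_8);   // CacheKey#toString is stable for equal keys
    }

    // JDK serialization, kept short for the sketch; replace with JSON or another portable format.
    private static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Cannot serialize cache value", e);
        }
    }

    private static Object deserialize(byte[] raw) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Cannot deserialize cache value", e);
        }
    }
}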

Handle multiple tables with cache-ref

Suppose there is a student with lessons, and both mappers turn the cache on.

<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
    <cache type="REDIS"/>
    <select id="selectLessonWithStudent" resultType="map">
        SELECT s.name, s.age, l.name as L_NAME 
        from student s left join lesson l
        on s.lesson_id = l.id
    </select>
</mapper>
<!-- com.xxx.studentMapper -->
<mapper namespace="com.xxx.studentMapper">
    <cache type="REDIS"/>
    <update id="updateStudentName">
        UPDATE student SET name = #{name} WHERE id = #{id}
    </update>
</mapper>

When the student's name is updated, the cached result of selectLessonWithStudent is not flushed, and dirty data will be fetched.

This can be fixed by sharing the namespace:

<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
-    <cache type="REDIS"/>
+    <cache-ref namespace="com.xxx.studentMapper"/>
    <select id="selectLessonWithStudent" resultType="map">
        SELECT s.name, s.age, l.name as L_NAME 
        from student s left join lesson l
        on s.lesson_id = l.id
    </select>
</mapper>

When data in lesson or student is updated, flushCache is triggered and ALL cached entries in the shared namespace are flushed; there is no eviction by a specific cacheKey, so the hit ratio will be noticeably lower.

// ALL cache in the same namespace will be flushed.
org.apache.ibatis.executor.CachingExecutor#flushCacheIfRequired
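
The body of that method is essentially the following (paraphrased from the MyBatis source); flushCacheRequired defaults to true for insert/update/delete statements, and the clear goes through the TransactionalCacheManager:

private void flushCacheIfRequired(MappedStatement ms) {
    Cache cache = ms.getCache();
    if (cache != null && ms.isFlushCacheRequired()) {
        tcm.clear(cache);   // clears the ENTIRE namespace cache; there is no per-key eviction
    }
}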

Conclusion

  • It's better to use the cache on a single table only.
  • When caching joined tables, use cache-ref to share the namespace, or turn the cache off manually.
  • It's better to handle caching in business code with your own cache key (e.g. store the joined result in Elasticsearch as a document).

Alternative performance improvements

  • Analyse the SQL AST in an interceptor and flush only the affected entries -> too complex.

  • Do static analysis on the XML and SQL -> also too complex.

APPENDIX

Avoiding the risk of the L1 cache

In most situations, having the L1 cache on is risky if you don't have full control over the project: two cached results may refer to the same object (e.g. repeated queries in a for loop).

// e.g. inside one SqlSession / @Transactional service method
List<Student> list1 = mapper.select();
// modify the returned objects in place
list1.get(0).setName("Modified");
// the second query hits the L1 cache and returns the SAME list instance,
// so list2 also contains the modified data
List<Student> list2 = mapper.select();
assert list1 == list2;

To fix the problem:

  • Avoid issuing the same query twice inside a @Transactional method, and always remove repeated queries.

  • Turn the L1 cache off (see issue #482) and hit the DB directly:

<settings>
  <!-- BaseExecutor flushes its local HashMap after each statement. -->
  <setting name="localCacheScope" value="STATEMENT"/>
</settings>