Introduction: Java Cache

From Resin 4.0 Wiki

Revision as of 00:00, 29 January 2012 by Ferg (Talk | contribs)

Java caching can speed up an application by saving the results of long calculations and reducing database load. The Java caching API is being standardized as JCache (JSR 107). In combination with Java Contexts and Dependency Injection (CDI), you can use caching in a completely standard fashion in the Resin Application Server. You'll typically want to look at caching when your application starts slowing down, or when your database or another expensive resource starts getting overloaded. Caching is useful when you want to:

  • Improve latency
  • Reduce database load
  • Reduce CPU use

Cache Performance Benefits

Since reducing database load is a typical cache benefit, it's useful to create a micro-benchmark to see how a cache can help. This is a simple test with MySQL running on the same server and a trivial query. In other words, it's not trying to exaggerate the value of the cache: almost any real cache use will have a longer "doLongCalculation" than this simple example, so the cache will help even more.

The micro-benchmark has a simple JDBC query in the "doLongCalculation" method:

"SELECT value FROM test WHERE id=?"

Then, to get useful data, the cached call to "doStuff" is repeated 300k times and compared with 300k direct calls to "doLongCalculation".
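The shape of this comparison can be sketched in plain Java. This is only an illustration under loud assumptions: an in-memory map stands in for Resin Cache, a counter stands in for the JDBC query, and the class name CacheBenchmarkSketch is hypothetical. It is not the benchmark code behind the numbers on this page.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: a plain map stands in for the cache and a counter
// stands in for the database, so no JDBC driver or Resin is required.
class CacheBenchmarkSketch {
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    private final AtomicLong backendCalls = new AtomicLong();

    // Stand-in for the JDBC query "SELECT value FROM test WHERE id=?".
    String doLongCalculation(int id) {
        backendCalls.incrementAndGet();
        return "value-" + id;
    }

    // Cached path: doLongCalculation only runs when the id is not cached yet.
    String doStuff(int id) {
        return cache.computeIfAbsent(id, this::doLongCalculation);
    }

    long backendCalls() {
        return backendCalls.get();
    }

    public static void main(String[] args) {
        int iterations = 300_000;

        // Direct path: every call reaches the backend.
        CacheBenchmarkSketch direct = new CacheBenchmarkSketch();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            direct.doLongCalculation(1);
        }
        long directNanos = System.nanoTime() - start;

        // Cached path: only the first call reaches the backend.
        CacheBenchmarkSketch cached = new CacheBenchmarkSketch();
        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            cached.doStuff(1);
        }
        long cachedNanos = System.nanoTime() - start;

        System.out.println("direct: " + direct.backendCalls()
            + " backend calls, " + directNanos / 1_000_000 + " ms");
        System.out.println("cached: " + cached.backendCalls()
            + " backend calls, " + cachedNanos / 1_000_000 + " ms");
    }
}
```

Because the stand-in backend does no real work, the timings printed here say nothing about a real database. The point is the call counts: the direct loop reaches the backend on every iteration, while the cached loop reaches it once.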

The change is realistic (the 100x is a measured result with Resin Cache), but it is also an ideal situation, where the item is always in the cache. In other words, it's an actual cache with a 0% miss ratio.

Ideal-cache.png

Type     Time    Requests per millisecond    MySQL CPU
JDBC     30s     10.0 req/ms                 35%
Cache    0.3s    1095 req/ms                 0%

Even this simple test shows how caches can win: the cached path is much faster and takes the load off the database.

  • 100x faster
  • Removes the MySQL load

To get more realistic numbers, you'll need to benchmark the difference on a full application. Micro-benchmarks like this are useful to explain concepts, but real benchmarks require testing against your own application, in combination with profiling. For example, Resin's simple profiling capabilities in /resin-admin or in the pdf-report can give you quick and simple data on your application's performance.

Improving Cache Performance

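The average time for a request that goes through the cache follows the basic cache performance equation:

  t = p_miss * t_miss + (1 - p_miss) * t_hit

where
  t is the total (average) time per request
  p_miss is the miss rate
  t_miss is the time taken for a miss (e.g. database time)
  t_hit is the time taken for a hit (cache implementation overhead)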

20% miss, 100ms miss time, 1ms hit time

Even with a fairly high 20% miss rate, the cache already improves performance by about 5x (from 100ms to 20.8ms per request). Even a terrible miss ratio of 50% improves performance by a factor of about 2x. That might be good enough for you, in the spirit of the 80/20 rule. If improving the database performance by a factor of 5x is good enough, then you can move on to a different performance problem.
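The figures in this paragraph can be checked by writing the equation as a small Java helper. The class and method names (CacheModel, averageTime, speedup) are hypothetical, chosen only for this sketch. The speedup is measured against an uncached request that always costs t_miss.

```java
// Hypothetical helper class: a direct transcription of the equation above.
class CacheModel {
    // t = p_miss * t_miss + (1 - p_miss) * t_hit
    static double averageTime(double pMiss, double tMiss, double tHit) {
        return pMiss * tMiss + (1 - pMiss) * tHit;
    }

    // Speedup relative to having no cache, where every request costs tMiss.
    static double speedup(double pMiss, double tMiss, double tHit) {
        return tMiss / averageTime(pMiss, tMiss, tHit);
    }

    public static void main(String[] args) {
        // 100ms miss time and 1ms hit time, as in the example on this page.
        System.out.printf("20%% miss: %.1f ms, %.1fx%n",
            averageTime(0.20, 100, 1), speedup(0.20, 100, 1));
        System.out.printf("50%% miss: %.1f ms, %.1fx%n",
            averageTime(0.50, 100, 1), speedup(0.50, 100, 1));
    }
}
```

With these inputs the 20% case comes out at 20.8ms, a speedup of roughly 4.8x, and the 50% case at 50.5ms, a speedup of just under 2x.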

But suppose you do need better performance than the 5x improvement. What changes will help? After all, there's no sense spending time trying to improve something that doesn't matter. We can take the basic cache performance equation and try some experiments:

  • improve the cache implementation (by asking Caucho to speed up Resin Cache)
  • improve the miss ratio (typically by increasing the cache size, but possibly by refactoring)
  • improve the miss time (by speeding up the database code, optimizing queries, etc.)

For each experiment, we'll see what happens if we can get a 50% improvement.

Cache-20-miss-changes.png

Change            Performance
no change         20.8ms
0.5ms hit time    20.4ms
10% miss rate     10.9ms
50ms miss time    10.8ms
disable cache     100ms

1% miss, 100ms miss time, 1ms hit time

Cache-1-miss-changes.png

Change            Performance
no change         1.99ms
0.5ms hit time    1.5ms
0.5% miss rate    1.5ms
50ms miss time    1.49ms
disable cache     100ms
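The what-if experiments in the two tables can be reproduced with a few lines of Java. This is a sketch under the same equation. The names CacheWhatIf and experiments are hypothetical, and each experiment halves exactly one input while leaving the others unchanged.

```java
// Hypothetical sketch: halve one input at a time and recompute the average time.
class CacheWhatIf {
    // t = p_miss * t_miss + (1 - p_miss) * t_hit
    static double averageTime(double pMiss, double tMiss, double tHit) {
        return pMiss * tMiss + (1 - pMiss) * tHit;
    }

    // Returns {no change, hit time halved, miss rate halved, miss time halved}.
    static double[] experiments(double pMiss, double tMiss, double tHit) {
        return new double[] {
            averageTime(pMiss, tMiss, tHit),
            averageTime(pMiss, tMiss, tHit / 2),
            averageTime(pMiss / 2, tMiss, tHit),
            averageTime(pMiss, tMiss / 2, tHit),
        };
    }

    public static void main(String[] args) {
        String[] labels = {
            "no change", "half hit time", "half miss rate", "half miss time"
        };
        for (double pMiss : new double[] {0.20, 0.01}) {
            System.out.printf("miss rate %.0f%%%n", pMiss * 100);
            double[] times = experiments(pMiss, 100, 1);
            for (int i = 0; i < times.length; i++) {
                System.out.printf("  %-15s %.3f ms%n", labels[i], times[i]);
            }
        }
    }
}
```

At a 20% miss rate, halving the hit time barely matters (20.8ms to 20.4ms), while halving either the miss rate or the miss time roughly halves the total. At a 1% miss rate the three changes are nearly equivalent, each landing close to 1.5ms, because hits and misses now contribute about equally to the total.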

Improving the Miss Ratio

Cache-hit-graph.png


The Resin ClusterCache implementation

Since Resin's ClusterCache is a persistent cache, the entries you save will be stored to disk and recovered. This means you can store lots of data in the cache without worrying about running out of memory. (LocalCache is also a persistent cache.) If the memory becomes full, Resin will use the cache entries that are on disk. For performance, commonly-used items will remain in memory.
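The idea of keeping commonly-used items in memory while every entry also lives in a persistent store can be illustrated with a two-tier sketch. This is not Resin's implementation: the class TwoTierCacheSketch is hypothetical, a HashMap stands in for the disk store, and least-recently-used eviction is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, NOT Resin's ClusterCache: a bounded in-memory LRU tier
// in front of a map that stands in for the on-disk store.
class TwoTierCacheSketch<K, V> {
    private final Map<K, V> disk = new HashMap<>(); // stand-in for disk
    private final LinkedHashMap<K, V> memory;

    TwoTierCacheSketch(final int memoryCapacity) {
        // accessOrder = true keeps entries ordered from least to most
        // recently used, so the eldest entry is the LRU candidate.
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > memoryCapacity;
            }
        };
    }

    void put(K key, V value) {
        disk.put(key, value);   // every entry is stored persistently
        memory.put(key, value); // and kept in memory while recently used
    }

    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            value = disk.get(key); // memory tier missed: fall back to "disk"
            if (value != null) {
                memory.put(key, value); // promote back into memory
            }
        }
        return value;
    }

    // Reports whether the key is currently held in the memory tier.
    boolean inMemory(K key) {
        return memory.containsKey(key);
    }
}
```

With a memory capacity of 2, putting "a", "b", and "c" pushes "a" out of the memory tier, but a later get("a") still finds it in the backing store and promotes it again, pushing out the least recently used entry instead.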
