The Fast, the Slow and the Ugly: A Magento Cache Showdown Colin Mollenhour
[email protected] github.com/colinmollenhour @colinmollenhour
Began working with Magento in early 2009 at Business Services & Solutions, LLC • Small team based in Knoxville, TN • Serves a small number of clients in which we own an interest • Highly distributed team of coders • Integrates with third-party solution providers • We're hiring engineers!
Life Without Cache?! • Config.xml + core_config_data • Block output • Zend_* components (DDL, Locale, Date, etc..) • Remote API data • Expensive queries • Custom uses
Cache Without Tags What would happen without tagging?
No Tags == No invalidation == Cache thrashing
Invalidation event occurs → Match tagged cache keys → Invalidate layouts, blocks, etc.
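The tag-matching flow above can be sketched as a minimal in-memory tag index (a Python illustration of the idea, not Magento's actual implementation; the class and key names are invented):

```python
class TaggedCache:
    """Minimal tag-aware cache: save() records key->tags, and clean()
    drops every key carrying any of the given tags."""
    def __init__(self):
        self.data = {}        # cache key -> stored value
        self.tag_index = {}   # tag -> set of cache keys

    def save(self, key, value, tags=()):
        self.data[key] = value
        for tag in tags:
            self.tag_index.setdefault(tag, set()).add(key)

    def load(self, key):
        return self.data.get(key)

    def clean(self, *tags):
        # Invalidation event: match tagged cache keys, then invalidate them
        for tag in tags:
            for key in self.tag_index.pop(tag, set()):
                self.data.pop(key, None)

cache = TaggedCache()
cache.save("layout_home", "<html>...</html>", tags=("LAYOUT", "CMS"))
cache.save("block_footer", "<div>...</div>", tags=("BLOCK_HTML",))
cache.clean("LAYOUT")  # invalidates only the LAYOUT-tagged entries
```

Without the tag index, the only options on an invalidation event are flushing everything (cache thrashing) or flushing nothing (stale data).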
Zend_Cache_Backend_? Out-of-the-box
• Files • Database • Memcached • APC/Xcache/eAccelerator • ZendServer_Disk/ShMem • Sqlite • TwoLevels combinations
Alternative
• Redis • Indexed Files • Simplified TwoLevels • and more..
Special Needs
• Tags support — lacking: Memcached, APC, Xcache, eAccel, Zend*; supported: File, Database, Sqlite
• Cluster-friendly — no: File, APC, Xcache, eAccel, Zend*, Sqlite; yes: Database, Memcached
• High concurrency — poor: Sqlite
• Persistent — no: Memcached, APC, Xcache, eAccel, ZendServer_ShMem; yes: File, Database, Sqlite, ZendServer_Disk
Problems with TwoLevels • Redundant – Hits: fast or fast+slow – Misses: fast+slow – Writes: fast+slow – Cleans: fast+slow
• Buggy – Synchronization – Priorities – Expire times
[Diagram] TwoLevels: the fast backend stores DATA; the slow backend stores DATA + TAGS.
TwoLevels: Simplified
• No data written to the slow backend (Varien's database backend has done the same since CE 1.5)
• Removes features: priority/filling, write-through, read-through
http://goo.gl/o92Zr
[Diagram] TwoLevels (simplified): the fast backend stores DATA; the slow backend stores only TAGS.
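The simplified scheme can be sketched with plain dicts standing in for the fast and slow backends (names and structure are invented for illustration, not the Varien/Zend implementation):

```python
class SimplifiedTwoLevels:
    """Sketch of the 'simplified TwoLevels' idea: data lives only in
    the fast backend; the slow backend keeps just the tag index so
    that tag-based cleans still work."""
    def __init__(self, fast, slow_tags):
        self.fast = fast            # dict: key -> value (e.g. memcached)
        self.slow_tags = slow_tags  # dict: tag -> set of keys (e.g. database)

    def save(self, key, value, tags=()):
        self.fast[key] = value                              # data: fast only
        for tag in tags:
            self.slow_tags.setdefault(tag, set()).add(key)  # tags: slow only

    def load(self, key):
        return self.fast.get(key)   # hits never touch the slow backend

    def clean(self, tag):
        for key in self.slow_tags.pop(tag, set()):
            self.fast.pop(key, None)
```

The payoff is that the redundant data writes and double reads of the standard TwoLevels backend disappear: only cleans consult the slow backend.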
Redis: REmote DIctionary server
• Advanced key-value store
• In-memory, with persistence to disk
• Automatic key expiration
• Configurable maxmemory limit
• Master/slave(s) replication
http://redis.io
• Strings (set, get, incr, append, …)
• Lists (push, pop, trim, index, sort, …)
• Hashes (set, get, del, keys, vals, …)
• Sets (add, rem, union, inter, diff, …)
• Sorted Sets (rank, score, union, sum, …)
• Transactions
• Publish/Subscribe
• Lua Scripting
Cm_Cache_Backend_Redis
loadCache()
• HGET('zc:k:'.$id, 'd')
saveCache()
• HMSET($id, {$data, $tags, $time})
• EXPIRE($id, $lifetime)
• SADD('zc:tags', $tags)
• SADD('zc:ti:'.$tags[$i++], $id)
cleanCache($tags)
• $ids = SUNION($tags)
• DEL($ids)
• DEL('zc:ti:'.$tags[$i++])
• SREM('zc:tags', $tags)
removeCache()
• $tags = HGET('zc:k:'.$id, 't')
• DEL($id)
• SREM('zc:ti:'.$tags[$i++])
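The command sequences above can be mirrored with plain Python structures standing in for the Redis hashes and sets (a sketch of the key layout only; no expiry, compression, or locking):

```python
# zc:k:<id> is a hash holding the data ('d') and tags ('t');
# zc:ti:<tag> is a set of ids carrying that tag; zc:tags lists all tags.
hashes = {}   # 'zc:k:'+id -> {'d': data, 't': tags}
sets = {}     # 'zc:tags' and 'zc:ti:'+tag -> set of members

def save(cid, data, tags):
    hashes['zc:k:' + cid] = {'d': data, 't': list(tags)}   # HMSET
    sets.setdefault('zc:tags', set()).update(tags)         # SADD zc:tags
    for t in tags:
        sets.setdefault('zc:ti:' + t, set()).add(cid)      # SADD zc:ti:<tag>

def load(cid):
    h = hashes.get('zc:k:' + cid)
    return h['d'] if h else None                           # HGET ... 'd'

def clean(tags):
    ids = set().union(*(sets.get('zc:ti:' + t, set()) for t in tags))  # SUNION
    for cid in ids:
        hashes.pop('zc:k:' + cid, None)                    # DEL data hashes
    for t in tags:
        sets.pop('zc:ti:' + t, None)                       # DEL zc:ti:<tag>
        sets.setdefault('zc:tags', set()).discard(t)       # SREM zc:tags
```

Because the tag index is a native Redis set, cleaning by tag is a single SUNION plus deletes instead of a scan over every key.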
http://goo.gl/1ThM8
Cm_Cache_Backend_File
• Writes tags in append-only mode
• Randomly compacts large tag files
• Locks tag files for safe operation
• Fixes broken subdirectory distribution
• Unit tested
http://goo.gl/WWyM4
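The append-only tag file plus compaction idea can be sketched as follows (file layout and function names are invented for illustration; the real backend's on-disk format and locking differ):

```python
import os
import tempfile

# Each tag gets a file of cache ids. Saves append one line (cheap, no
# rewrite); compaction occasionally rewrites the file with duplicates
# and stale ids dropped.
tag_dir = tempfile.mkdtemp()

def tag_file(tag):
    return os.path.join(tag_dir, tag + '.txt')

def add_id(tag, cache_id):
    # Append-only write: no read-modify-write cycle on the hot path
    with open(tag_file(tag), 'a') as f:
        f.write(cache_id + '\n')

def compact(tag, live_ids):
    # Rewrite the tag file keeping only ids that still exist, deduplicated
    path = tag_file(tag)
    with open(path) as f:
        ids = [line.strip() for line in f]
    kept = [i for i in dict.fromkeys(ids) if i in live_ids]
    with open(path, 'w') as f:
        f.write('\n'.join(kept) + ('\n' if kept else ''))
```

The trade-off is that tag files grow between compactions, which is why large files are compacted randomly rather than on every write.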
Testing Methodology
Generate test data once:
Generate random data
• 32-byte keys/tags, Base64 encoded
• Random data size
• Random tags per key
Generate random ops
• N clients, X ops per client
• 1 in 1000 chance for clean
• 1 in 1000 chance for save
Repeatable Execution
Load cache data
• Bash script cleans cache and loads pregenerated cache data
Start N clients
• Read pregenerated ops into memory • Output reads, writes, cleans
Sum results from each client
• Awk script sums reads, writes, cleans
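The op-generation step can be sketched as follows (a Python sketch; the real benchmark is the PHP shell script shown in this deck, and the function name here is invented):

```python
import random

def generate_ops(n_ops, keys, seed=1):
    """Pregenerate a repeatable op stream: mostly reads, with a
    1-in-1000 chance each for a clean and for a save. A fixed seed
    means every run replays the same operations."""
    rng = random.Random(seed)
    ops = []
    for _ in range(n_ops):
        roll = rng.randint(1, 1000)
        if roll == 1:
            ops.append(('clean', rng.choice(keys)))
        elif roll == 2:
            ops.append(('save', rng.choice(keys)))
        else:
            ops.append(('read', rng.choice(keys)))
    return ops
```

Pregenerating the stream (rather than rolling dice during the run) is what makes executions repeatable across backends.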
Choosing Parameters $ php shell/cache-benchmark.php analyze
[Histogram: Data Size distribution]
$ php shell/cache-benchmark.php init \
    --name basic --clients 4 --ops 30000 --seed 1 \
    --keys 20000 --tags 5000 --min-tags 1 --max-tags 10 --max-rec-size 32768
$ bash var/cachebench/basic/run.sh
[Histograms: Tags per Key and Keys per Tag distributions]
Benchmarks!!
Machine Specs
Backends Tested
• Dual quad-core Xeon E5620 (2.4 GHz Gulftown) • 12 GB RAM • 2x 250 GB SATA in RAID 1 • Debian 6.0 • dotdeb.org packages • Magento CE 1.6.2.0
• Files • Database • Memcache + Files • Memcache + Database • Memcache + Redis* • Redis (lzf compression) • Cm_Cache_Backend_File • * = simplified two-levels
Basic Comparison (ops/sec)

Backend      Reads   Writes  Cleans
files        20,672  7,415   1.3
database     6,517   988     986
memc-files   16,494  2,877   1.2
memc-db      17,109  2,103   1,470
memc-redis   15,360  2,370   259
memc-redis*  17,320  2,593   1,263
redis-lzf    14,957  4,371   6,173
cm-files     52,008  4,198   3,391

(* = simplified TwoLevels)
Concurrency Scaling (Number of Clients)
[Charts: Reads, Writes, Cleans in thousands of ops/sec vs. 2, 8, 16, 32, and 64 concurrent clients, for files, database, memc-db, memc-redis*, redis-lzf, and cm-files]
Capacity Scaling (Keys and Tags)
[Charts: Reads, Writes, Cleans in thousands of ops/sec vs. thousands of keys/tags, for files, database, memc-db, memc-redis*, redis-lzf, and cm-files]
The Ugly
Files
• Unbearably slow clean
• Tmpfs is moot: negligible clean gains, no read gains
Database
• Poor read latency
• Capacity doesn't scale
• Would contend with Magento queries
Redis Pros
Redis Cons
Redis: Gotchas & Tips maxmemory • Recommend ‘volatile-lru’ policy • LRU algorithm evicts one of N (default=3) random keys
Compression • Saves ~69% with gzip, ~50% with lzf • Use Google’s “snappy” or PECL’s “lzf” library for best performance • Set `compress_threshold` lower to compress more often
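The compress_threshold behavior can be sketched like this (zlib stands in for gzip/lzf/snappy, and the one-byte flag prefix is an illustration, not the backend's actual storage format):

```python
import zlib

def maybe_compress(data: bytes, threshold: int = 1024) -> bytes:
    """Only bodies at or above the threshold get compressed; a
    one-byte prefix records which path was taken. Lowering the
    threshold compresses more records."""
    if len(data) >= threshold:
        return b'1' + zlib.compress(data)
    return b'0' + data

def decompress(blob: bytes) -> bytes:
    # Inspect the flag byte to decide whether to inflate
    return zlib.decompress(blob[1:]) if blob[:1] == b'1' else blob[1:]
```

A fast codec like lzf or snappy makes it cheap to set the threshold low, trading a little CPU for memory and network savings.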
phpredis native extension • Only use db 0 with phpredis due to reconnect bug
Redis Compression Options (thousands of ops/sec)

Option    Reads  Writes  Cleans
gzip-1k   33.76  15.48   8.62
gzip-4k   34.24  16.02   8.84
gzip-8k   33.73  15.25   8.64
gzip-16k  32.37  14.04   9.08
lzf-1k    31.15  14.74   8.29
lzf-4k    30.39  14.12   7.46
lzf-8k    31.33  14.55   7.53
lzf-16k   33.03  14.38   7.50
none      31.00  11.34   7.60
Cm_Cache_Backend_File
Pros
Cons
Memcache + ?
Redis (w/ simplified TwoLevels)
• Slightly better performance
• Very low resource utilization
• No effect on the SQL database
Database
• Contention with Magento queries
• Lots of overhead: tag indexes, durable writes
Recommendations Small (single web node) • Cm_Cache_Backend_File • Cm_Cache_Backend_Redis
Medium (1-5 web nodes) • Cm_Cache_Backend_Redis
Large (6+ web nodes) • Memcached + Redis (w/ simplified TwoLevels)
More possibilities
github.com/AntonStoeckl/Zend_Cache_Backend_Mongo • Blazing fast • MongoDB's memory usage can't be capped • No automatic online compaction, so long-term use may require maintenance
Sharding with Redis to overcome single-threaded problem • 1 shard for tags, N shards for data
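The tag/data split could be sketched as follows (the hash function and shard numbering are illustrative assumptions, not a published layout):

```python
import zlib

TAG_SHARD = 0  # all tag-index sets live on one dedicated shard

def shard_for(cache_id: str, n_data_shards: int) -> int:
    """Route a data key onto one of shards 1..N; shard 0 is reserved
    for the tag index so tag unions stay on a single server."""
    return 1 + zlib.crc32(cache_id.encode()) % n_data_shards
```

Keeping every tag set on one shard preserves set operations like SUNION across tags, while data reads and writes spread over N single-threaded Redis processes.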
Magento Session Handlers
Files (PHP implementation)
• Good for single servers
Database (SQL)
• Lacks a locking mechanism
Memcache
• Not persistent
eAccelerator
• Not persistent
• Lacks a locking mechanism
• Contends with the opcode cache
Cm_RedisSession
• Persistent
• Cluster-friendly
• No garbage collection needed
• Optimistic locking
• Compression supported
• Includes online migration script
http://goo.gl/D2Dyw
Optimistic Locking
1. Increment 'lock'
2. If lock == 1, take the lock
3. Else, if timed out, break the lock
4. Fetch session data

Common case read: HINCRBY → HMSET, EXPIRE → HGET(id, 'data')
Common case write: HGET(id, 'pid') → HMSET, EXPIRE
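The locking flow can be sketched with an in-memory stand-in for the Redis session hash (a simplification: the break condition here counts increments rather than elapsed time, and the names are invented):

```python
# session_id -> {'lock': int, 'data': ..., 'pid': ...}
sessions = {}

def acquire(session_id, pid, break_after=10):
    """Optimistic lock: the first requester to push 'lock' to 1 owns
    the session (HINCRBY in Redis); a stale lock is eventually broken."""
    h = sessions.setdefault(session_id, {'lock': 0, 'data': None, 'pid': None})
    h['lock'] += 1                      # HINCRBY <id> lock 1
    if h['lock'] == 1 or h['lock'] > break_after:
        h['lock'] = 1                   # take (or break) the lock
        h['pid'] = pid                  # HMSET pid (plus EXPIRE in Redis)
        return True
    return False                        # someone else holds the lock

def write(session_id, pid, data):
    h = sessions[session_id]
    if h['pid'] == pid:                 # HGET pid: is the lock still ours?
        h['data'] = data                # HMSET data (plus EXPIRE in Redis)
        h['lock'] = 0                   # release
        return True
    return False                        # lock was broken; discard the write
```

The optimistic part is that no lock is checked before the increment: the common case costs a single HINCRBY, and only contended requests spin or break the lock.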
Questions?
Colin Mollenhour
[email protected] github.com/colinmollenhour @colinmollenhour