Ruby Memory Management Hacks (Ruby memory management and so on)
SASADA Koichi Department of Creative Informatics, Graduate School of Science and Technology, The University of Tokyo
2009/11/23

Summary of this talk
- Ruby 1.9.2 may improve performance in several areas
  - Memory management efficiency
  - Fiber performance
  - Memory profiler feature
- Research is progressing
  - Memory Profiler
  - MVM

Agenda
- (1) Memory Management Hacks
  - OS dependent Memory Management
  - Memory profiler (including student's work)
- (2) VM performance improvement
  - Fiber performance improvement (student's work)
  - Other progress
- (3) Multi-VM (MVM) Progress

(1) Memory Management Hacks
1. Trivial Optimizations
2. Memory Profiler
  - "objspace" library
  - New T_DATA representation
3. OS dependent Memory Management

Memory Management Importance
- Memory is VERY BIG, EXPENSIVE!
  - malloc()/free() hell
  - Garbage collection overhead
- Long running services
  - e.g. 30 years continuous running
    • 20/Nov/1979 JST

Memory Management Problems
- CRuby/MRI has several issues with memory management (M/M)
  - Naive garbage collection
    • Unlike the JVM, .NET, or other VMs, it does not include a smart garbage collector
    • Programmer friendly but GC unfriendly C extension interfaces
  - No memory profilers
  - Object fragmentation

Solution (1) Reduce GC Pressure
- CRuby's GC is not a generational GC
  - Old, long-lived objects can add GC pressure (increase GC frequency)
  - NODE, InstructionSequence
    • Inline cache entries (ICE) are NODEs.
  - Modules, Classes
- [1.9.2 Feature] An ICE is no longer a GC-managed object (it is managed explicitly).
  - Reduces the number of long-lived NODE objects.
    → Reduces GC pressure.
  - Planning to apply the same approach to other NODE objects.

Solution (2) Memory Profiler
- Memory Profiler Support
  - ObjectSpace.memsize_of(obj)
  - Experimental Memory Profiler
- [NOTE] These methods are not expected to work outside CRuby.

Memory Profiler: ObjectSpace.memsize_of
- [1.9.2 Feature] 1.9.2 introduces the "objspace" library:
  - This library adds the following methods
    • ObjectSpace.memsize_of(obj)
    • ObjectSpace.count_objects_size
    • ObjectSpace.count_nodes
    • ObjectSpace.count_tdata_objects

Memory Profiler: ObjectSpace.memsize_of example
    require 'objspace'
    ObjectSpace.memsize_of('.'*100)      #=> 100
    ObjectSpace.memsize_of(Time.now)     #=> 60
    ObjectSpace.memsize_of(Thread.new{}) #=> 524888
    ObjectSpace.memsize_of(STDOUT)       #=> 124
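The counting methods listed for the "objspace" library can be tried in the same way. The following is a minimal sketch, assuming a CRuby where ObjectSpace.count_objects_size and ObjectSpace.count_tdata_objects are available; the variable names are illustrative, and the numbers depend on the Ruby version and platform, so none are shown.

```ruby
require 'objspace'

# Bytes used per internal type (T_STRING, T_ARRAY, ...), plus a :TOTAL entry.
sizes = ObjectSpace.count_objects_size

# Number of live T_DATA objects, grouped by class.
tdata = ObjectSpace.count_tdata_objects

puts "total bytes: #{sizes[:TOTAL]}"

# Show the three types that account for the most memory.
top = sizes.reject { |type, _| type == :TOTAL }.max_by(3) { |_, bytes| bytes }
top.each { |type, bytes| puts "#{type}: #{bytes} bytes" }

puts "T_DATA classes seen: #{tdata.size}"
```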

Memory Profiler: New T_DATA Representation
- T_DATA: wraps a C structure
- The traditional T_DATA representation includes:
  - pointer to the memory object that the T_DATA wraps
  - pointer to mark function
  - pointer to free function
- Make a T_DATA value with Data_Wrap/Make_Struct(klass, mark, free, sval)
- T_DATA lacks:
  - Type identity
  - Memory size function

Memory Profiler: New T_DATA representation (2)
- [1.9.2 Feature] We introduce the type rb_data_type_t:
  - wrap_struct_name
  - pointer to mark function
  - pointer to free function
  - pointer to memsize_of function
- Make a T_DATA object with TypedData_Wrap/Make_Struct(klass, type, data_type, sval)

Memory Profiler: New T_DATA representation (3)
- Example:
    static const rb_data_type_t time_data_type = {
        "time", time_mark, time_free, time_memsize,
    };
    …
    obj = TypedData_Make_Struct(klass, struct time_object, &time_data_type, tobj);

Memory Profiler: New T_DATA representation (4)
- New C APIs
  - rb_objspace_data_type_memsize(obj)
  - rb_objspace_data_type_name(obj)
  - RTYPEDDATA_DATA(obj)
  - RTYPEDDATA_P(obj)
  - RTYPEDDATA_TYPE(obj)

Memory Profiler: Statistic Profiler
- Samples memory status periodically
- We already have tools
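Periodic sampling of memory status can be sketched with nothing more than ObjectSpace.count_objects. This is an illustration of the idea only, not the profiler tool mentioned in the talk; the method name sample_object_counts and its parameters are hypothetical.

```ruby
# Take a few snapshots of the object counts at a fixed interval.
# samples:  how many snapshots to take
# interval: seconds to wait between snapshots
def sample_object_counts(samples:, interval:)
  Array.new(samples) do
    snapshot = ObjectSpace.count_objects
    sleep interval
    # Keep only a few entries of the snapshot for the report.
    { total: snapshot[:TOTAL], free: snapshot[:FREE], strings: snapshot[:T_STRING] }
  end
end

history = sample_object_counts(samples: 3, interval: 0.01)
history.each_with_index { |s, i| puts "sample #{i}: #{s}" }
```

A real statistical profiler would sample from a separate thread or a timer while the application runs; the loop above only shows the shape of the collected data.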

Statistic Profiler Examples
[Figures: profiler output per T_xxx + stack frame; per class; T_xxx / count; NODE; T_DATA]

Memory Profiler: Deterministic Profiler
- [Research] We are working on a deterministic memory profiler
  - Explicit realtime memory usage
  - Explicit object lifetime
  - Explicit object allocation site (file:line)
- Prepare probe points and some APIs
- Inspired by Java realtime memory profilers and the valgrind/massif profiler
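The idea of recording where an object was generated (file:line) can be tried on later CRuby releases. The sketch below is not the research profiler described on this slide; it assumes Ruby 2.1 or later, where the "objspace" library provides ObjectSpace.trace_object_allocations.

```ruby
require 'objspace'

file = nil
line = nil

# Allocation sites are recorded only for objects created inside the block.
ObjectSpace.trace_object_allocations do
  obj  = Object.new
  file = ObjectSpace.allocation_sourcefile(obj)
  line = ObjectSpace.allocation_sourceline(obj)
end

puts "allocated at #{file}:#{line}"
```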

Solution (3) OS dependent Memory Management
- Problem: Object Fragmentation
  - A memory block cannot be deallocated if at least one live object remains in it.
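The fragmentation problem can be illustrated with a toy model. This is a sketch under simplifying assumptions, not CRuby's heap: the heap is a flat array of slots (true means a live object), and releasable_blocks is a hypothetical helper that counts the blocks containing no live object.

```ruby
# Count the blocks that could be returned to the OS:
# a block is releasable only if none of its slots holds a live object.
def releasable_blocks(live_slots, slots_per_block)
  live_slots.each_slice(slots_per_block).count { |block| block.none? }
end

# 16 slots; after GC only slots 0 and 9 still hold live objects.
live = Array.new(16, false)
live[0] = true
live[9] = true

puts releasable_blocks(live, 8)  # => 0 (two large blocks, each pinned by one object)
puts releasable_blocks(live, 2)  # => 6 (eight small blocks, two of them pinned)
```

With the same two surviving objects, the large blocks cannot be freed at all, while most of the small blocks can.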

1.9 Memory Management Data Structure
[Figure: a variable-length structure with fixed 16KB blocks]

Memory Management: OS dependent Memory Management
- Make blocks smaller
- Use OS dependent memory management APIs
  - mmap(), munmap(), madvise() on UNIX
  - Similar APIs on Windows

1.9.2 Memory Management Data Structure
[Figure: an arena (128 MB) divided into fixed 4KB pages]

Memory Management: OS dependent Memory Management
- Many advantages
  - Low object fragmentation
  - Easy to manage with OS M/M
  - 4KB-aligned pages help bitmap marking
  - 128MB bulk allocation helps the is_pointer_to_heap() check (reduces GC overhead)
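One way to read the last two advantages is as simple address arithmetic. The sketch below models addresses as integers; the base address and all helper names are hypothetical, and the real is_pointer_to_heap() is C code inside CRuby that may work differently.

```ruby
# Illustrative values, not CRuby's actual layout.
arena_size = 128 * 1024 * 1024   # one 128 MB arena, as on the slide
page_size  = 4 * 1024            # 4 KB pages, as on the slide
arena_base = 0x7f00_0000_0000    # hypothetical page-aligned start address

# With one contiguous arena, the heap check is a single range comparison.
def pointer_to_heap?(addr, base, size)
  addr >= base && addr < base + size
end

# With page-aligned pages, the page start is found by masking the low bits;
# a per-page mark bitmap could be located from this address.
def page_base(addr, page_size)
  addr & ~(page_size - 1)
end

# Index of the page inside the arena.
def page_index(addr, base, page_size)
  (addr - base) / page_size
end

addr = arena_base + 5000
puts pointer_to_heap?(addr, arena_base, arena_size)          # => true
puts page_index(addr, arena_base, page_size)                 # => 1
puts page_base(addr, page_size) == arena_base + page_size    # => true
```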

Memory Management: OS dependent Memory Management Evaluation
- Not yet
- HELP!

(2) VM Performance Improvement
1. Change VM instruction set a little
2. Fiber performance improvement

(2.1) VM instruction set changes
- Cache the instance variable index in an inline cache entry
  → Needs an instruction cache entry as an operand of getinstancevariable/setinstancevariable
- Change instruction cache treatment
- NOTICE: this point affects YARV assembler users (… who uses it?)

(2.2) Fiber Performance Improvement
- Fiber is:
  - 1.9 Feature
  - Coroutine, Semi-Coroutine
  - Primitive for concurrent programming
  - Used in Enumerator#next
- Problem is:
  - Slow context switching
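For readers new to Fiber, here is a minimal sketch of the semi-coroutine style and of Enumerator#next. The variable names are illustrative.

```ruby
# A fiber used as a semi-coroutine: each resume runs the block
# until the next Fiber.yield (or the end of the block).
gen = Fiber.new do
  Fiber.yield 1
  Fiber.yield 2
  3
end

values = [gen.resume, gen.resume, gen.resume]
p values  # => [1, 2, 3]

# Enumerator#next performs external iteration: each call advances
# the underlying each loop by one element.
e = [10, 20, 30].each
first  = e.next
second = e.next
p [first, second]  # => [10, 20]
```

Every resume and every Fiber.yield is a context switch, which is why the switching cost discussed here matters for code built on these primitives.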

Fiber: Current implementation
- Fiber context switching needs a machine stack copy
[Figure: on each switch, the machine stacks of Fiber A and Fiber B are copied between the machine stack and buffers on the heap]

Fiber: Current implementation
- Why such an expensive method?
  - The reason is portability
  - There is no general method to switch machine stacks
  - An assembler approach has portability and maintainability problems.
    • e.g. [ruby-core:14149]

Fiber performance improvement: Solution
- Use OS dependent features
  - OS features are well-maintained
  - getcontext/setcontext on UNIX
    • Linux (2.6-), Solaris, FreeBSD, Mac OS X
  - Fiber API on Windows
  - Machine stack copy for other OSs

Fiber performance improvement: Switching SP with OS features
- [1.9.2 Feature] Lightweight fiber context switching
[Figure: Fiber A and Fiber B each keep their own machine stack on the heap; a switch only changes sp]

Fiber performance evaluation: Context switch cost (sec)

Platform           Machine stack depth
                   0          1          2          3
Windows (1.9.1)    5.304      5.226      5.694      6.021
Windows (1.9.2)    1.872      1.887      1.903      1.887
Linux (1.9.1)      4.882      5.587      6.398      7.392
Linux (1.9.2)      1.675      1.675      1.673      1.661
Solaris (1.9.1)    130.995    137.903    144.062    151.819
Solaris (1.9.2)    42.507     42.408     42.056     42.357

Fiber performance evaluation: Context switch cost (speedup ratio)
[Figure: bar chart of the speedup ratio (y axis, 0 to 5) at machine stack depths 0 to 3 for Windows, Linux, and Solaris]
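The speedup ratios can be recomputed from the measured context switch costs. The sketch below assumes the ratio is the 1.9.1 time divided by the 1.9.2 time at the same machine stack depth; the hash layout is illustrative, and the numbers are the ones from the cost table.

```ruby
# Context switch cost in seconds, for machine stack depths 0 to 3.
costs = {
  "Windows" => { "1.9.1" => [5.304, 5.226, 5.694, 6.021],
                 "1.9.2" => [1.872, 1.887, 1.903, 1.887] },
  "Linux"   => { "1.9.1" => [4.882, 5.587, 6.398, 7.392],
                 "1.9.2" => [1.675, 1.675, 1.673, 1.661] },
  "Solaris" => { "1.9.1" => [130.995, 137.903, 144.062, 151.819],
                 "1.9.2" => [42.507, 42.408, 42.056, 42.357] },
}

# Speedup = old time / new time, rounded to two decimals.
speedup = costs.transform_values do |versions|
  versions["1.9.1"].zip(versions["1.9.2"]).map { |before, after| (before / after).round(2) }
end

speedup.each { |platform, ratios| puts "#{platform}: #{ratios.inspect}" }
```

The 1.9.2 cost stays nearly flat as the stack gets deeper, while the 1.9.1 cost grows with depth, so the ratio rises with depth on every platform.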

Fiber performance: Proc vs. Fiber
    # Proc
    pr = Proc.new{}; 10000000.times{pr.call}
    # Fiber
    f = Fiber.new{ Fiber.yield while true}; 10000000.times{f.resume}
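To reproduce this comparison on a current Ruby, the two loops can be timed with the standard Benchmark library. This is a sketch with the iteration count scaled down from the slide's 10,000,000 (an assumption, to keep the run short); the timings vary by machine and Ruby version, so none are shown.

```ruby
require 'benchmark'

n = 100_000  # scaled down from 10,000,000 on the slide

pr = Proc.new {}
f  = Fiber.new { Fiber.yield while true }

# Time n Proc calls and n fiber resumes (each resume is one switch in
# and one switch back out through Fiber.yield).
proc_time  = Benchmark.realtime { n.times { pr.call } }
fiber_time = Benchmark.realtime { n.times { f.resume } }

puts format("Proc : %.4f s", proc_time)
puts format("Fiber: %.4f s", fiber_time)
```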

Fiber performance: Proc vs. Fiber
[Figure: bar chart of execution time for Proc, Fiber (1.9.1), and Fiber (1.9.2)]

Fiber performance: Fiber generation speed
    3000000.times do
      Fiber.new{}.resume
    end
- Ruby 1.9.1 (in sec):
  - User time: 13.649
  - System time: 0.008
  - Real time: 13.657
- Ruby 1.9.2 (in sec):
  - User time: 10.541
  - System time: 1.136
  - Real time: 11.677
- Most of the time is consumed by GC overhead. This is future work.

(3) Multi-VM (MVM)
- Overview
  - Motivation
  - Model
  - Challenges
  - API Design
- Progress
  - Features
  - Evaluations

(3) Multi-VM (MVM)
[Figure quoted from Matz's keynote, with MVM marked "This one!"]

MVM Motivation
- Parallel execution
  - Multi-core, many core, …
  - Parallel threads seem to be a bad idea
  - Isolated VMs can run in parallel
- Application embedded VM
  - Easy to handle VMs from your application

Model of MVM
[Figure: one Ruby process contains several Ruby VMs, each under its own GVL and running RTs; the RTs run on native threads (NT), which the system thread scheduler (S/W) assigns to the hardware processor elements (PE)]
PE: Processor Element, NT: Native Thread

MVM Challenges
- Compatibility
- Process global information
  • C global variables
  • File related information (e.g. working directory)
  • Process information (including signals)
  • C extensions
- Communication interface design

MVM API
- C API /* include/ruby/vm.h */
  - ruby_vm_create(…)
  - ruby_vm_run(…)
  - ruby_vm_start(…)
- Ruby API # not mature
  - vm = RubyVM.new(…)
  - vm.push(…) / vm.pop() # comm.
- Now under consideration

MVM C Extension
- Introduce new convention
  - Init_foo(void)
  - InitVM_foo(void) ← New!
- Do not use global variables
- If a C extension doesn't have an InitVM_foo() function, it can be used only by the main (first) VM

MVM Progress: Current Features
- Create VMs by Ruby/C API
- Run VMs in parallel
- Simple VM communication API
  - Not mature

MVM - Preliminary evaluation: Evaluation environment
- MacBook
  - Intel Core 2 Duo 2.1 GHz, 2 cores
  - L2 cache 3 MB
  - Memory 4 GB
  - Bus speed 800 MHz

MVM - Preliminary evaluation: VM creation time
[Figure: bar chart of creation time (sec) for vm, fork, system, thread, and fiber]

MVM Progress: Evaluation (ping/pong)
    require 'benchmark'
    N = ARGV[0] ? ARGV[0].to_i : 100
    puts "N = #{N}"
    W = 10
    Benchmark.bm(W) {|bm|
      vm = RubyVM.new("ruby", "-e", %q{
        vm1 = RubyVM.current
        vm2 = RubyVM.parent
        #{N}.times {vm2.send(vm1.recv)}
      }).start
      data = :"foo"
      bm.report("send/recv") {
        N.times {|i| vm.send(data)}; vm.join
      }
    }
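The RubyVM.new API used in this benchmark belongs to the experimental MVM work and is not available in a stock Ruby. As a rough analogue only, the same ping/pong pattern can be sketched with a Thread and two Queue objects. It does not measure MVM or inter-VM communication; the queue names are hypothetical.

```ruby
require 'benchmark'

n = 100

# Two queues stand in for the send/recv channel between parent and child.
to_child  = Queue.new
to_parent = Queue.new

# The child echoes every message it receives back to the parent.
child = Thread.new { n.times { to_parent.push(to_child.pop) } }

elapsed = Benchmark.realtime do
  n.times do
    to_child.push(:foo)  # ping
    to_parent.pop        # pong
  end
  child.join
end

puts format("send/recv x %d: %.6f s", n, elapsed)
```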

MVM Progress: Evaluation (symbol ping/pong)
[Figure: bar chart of execution time (sec) for MVM and dRuby]
MVM Progress Evaluation (parallel execution) fibdef =