The Art of Java Performance Tuning

14/05/2013

© Ed Merks | EDL V1.0


Java Performance is Complex
• Write once run everywhere
  – Java is slow because it’s interpreted
    • No, there are Just In Time (JIT) compilers
  – Different hardware and platforms
  – Different JVMs
    • Different tuning options
  – Different language versions


Faster is Better


Smaller is Better


Measuring


Benchmarking


Profiling


Don’t Trust Your Friends
• Your friends are stupid


Don’t Trust Yourself
• You know nothing


Don’t Trust the Experts
• The experts are misguided


Definitely Don’t Trust Me!


Don’t Trust Anything
• Everything that’s true today might be false tomorrow
• Whatever you verify is true today is false somewhere else


Where Does That Leave You?
• Don’t worry
• Be happy
• Write sloppy code and place blame elsewhere
  – Java
  – The hardware
  – The platform
  – JVM
  – Poor tools


Algorithmic Complexity
• How does the performance scale relative to the growth of the input?
  – O(1) – hashed lookup
  – O(log n) – binary search
  – O(n) – list contains
  – O(n log n) – efficient sorting
  – O(n^2) – bubble sorting
  – O(2^n) – combinatorial explosion
• No measurement is required
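
The scaling classes above can be checked by counting steps rather than by timing anything. The following is a minimal sketch, assuming a sorted list of integers; the class and method names are illustrative and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

public class ComplexityDemo {
    /** Linear scan, in the manner of List.contains: returns how many elements were examined. */
    public static int linearSteps(List<Integer> sorted, int target) {
        int steps = 0;
        for (int value : sorted) {
            steps++;
            if (value == target) {
                break;
            }
        }
        return steps;
    }

    /** Binary search over a sorted list: returns how many elements were probed. */
    public static int binarySteps(List<Integer> sorted, int target) {
        int steps = 0;
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            steps++;
            int mid = (low + high) >>> 1;
            int value = sorted.get(mid);
            if (value == target) {
                break;
            }
            if (value < target) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return steps;
    }

    /** Builds the sorted list 0, 1, ..., n - 1. */
    public static List<Integer> range(int n) {
        List<Integer> list = new ArrayList<Integer>(n);
        for (int i = 0; i < n; ++i) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        // Search for the last element, the worst case for the linear scan.
        for (int n = 1000; n <= 1000000; n *= 10) {
            List<Integer> list = range(n);
            System.out.println(n + ": linear=" + linearSteps(list, n - 1)
                + ", binary=" + binarySteps(list, n - 1));
        }
    }
}
```

Each tenfold growth of the input multiplies the linear step count by ten but adds only three or four probes to the binary search, which is the O(n) versus O(log n) distinction in the list.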


Loop Invariants
• Don’t do something in a loop that you can do outside the loop

  public NamedElement find(NamedElement namedElement) {
    for (NamedElement otherNamedElement : getNamedElements()) {
      // namedElement.getName() never changes, yet it is called on every iteration.
      if (namedElement.getName().equals(otherNamedElement.getName())) {
        return otherNamedElement;
      }
    }
    return null;
  }

• Learn to use Alt-Shift-↑ and Alt-Shift-L


Generics Hide Casting
• Java 5 hides things in the source, but it doesn’t make that free at runtime

  public NamedElement find(NamedElement namedElement) {
    String name = namedElement.getName();
    for (NamedElement otherNamedElement : getNamedElements()) {
      if (name.equals(otherNamedElement.getName())) {
        return otherNamedElement;
      }
    }
    return null;
  }

• Not just the casting is hidden but the iterator too
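
To make the hidden iterator and cast visible, the sketch below writes the same search twice: once with the enhanced for loop, and once in roughly the form the compiler produces after type erasure. The NamedElement class here is a hypothetical stand-in for the type on the slide, and the class and method names are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class HiddenCastDemo {
    /** Hypothetical stand-in for the NamedElement type used on the slide. */
    public static class NamedElement {
        private final String name;

        public NamedElement(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /** The loop as it is written in source code. */
    public static NamedElement findSugared(List<NamedElement> elements, String name) {
        for (NamedElement element : elements) {
            if (name.equals(element.getName())) {
                return element;
            }
        }
        return null;
    }

    /** Roughly what runs after erasure: an explicit iterator and a checked cast per element. */
    @SuppressWarnings("rawtypes")
    public static NamedElement findDesugared(List elements, String name) {
        for (Iterator iterator = elements.iterator(); iterator.hasNext();) {
            NamedElement element = (NamedElement) iterator.next(); // the cast generics hide
            if (name.equals(element.getName())) {
                return element;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<NamedElement> elements =
            Arrays.asList(new NamedElement("a"), new NamedElement("b"));
        System.out.println(findSugared(elements, "b") == findDesugared(elements, "b"));
    }
}
```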


Overriding Generic Methods
• Overriding a generic method often results in calls through a bridge method
  – That bridge method does casting which isn’t free

  new HashMap<String, Object>() {
    @Override
    public Object put(String key, Object value) {
      return super.put(key == null ? null : key.intern(), value);
    }
  };
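
The bridge method can be observed with reflection. This is a small sketch, assuming a named subclass (InterningMap, a name invented here) in place of the anonymous class on the slide; it relies only on java.lang.reflect.Method.isBridge().

```java
import java.lang.reflect.Method;
import java.util.HashMap;

public class BridgeMethodDemo {
    /** Named equivalent of the anonymous HashMap subclass; the name is illustrative. */
    @SuppressWarnings("serial")
    public static class InterningMap extends HashMap<String, Object> {
        @Override
        public Object put(String key, Object value) {
            return super.put(key == null ? null : key.intern(), value);
        }
    }

    /** Counts the put methods declared by InterningMap that are (or are not) bridge methods. */
    public static int countPutMethods(boolean bridge) {
        int count = 0;
        for (Method method : InterningMap.class.getDeclaredMethods()) {
            if (method.getName().equals("put") && method.isBridge() == bridge) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The compiler adds put(Object, Object), which casts the key and delegates
        // to the put(String, Object) written in the source.
        for (Method method : InterningMap.class.getDeclaredMethods()) {
            System.out.println(method + " bridge=" + method.isBridge());
        }
    }
}
```

Callers that hold the map as a plain Map or HashMap reference go through the bridge, so the cast is paid on each such call.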


Accessing Private Fields
• Accessing a private field of another class implies a method call

  public static class Context {
    private class Point {
      private int x;
      private int y;
    }

    public void compute() {
      Point point = new Point();
      point.x = 10;
      point.y = 10;
    }
  }


External Measurements
• Profiling
  – Tracing
    • Each and every (unfiltered) call in the process is carefully tracked and recorded
    • Gives detailed counts and times, but is slow and intrusive, and doesn’t reliably reflect non-profiled performance
  – Sampling
    • The running process is periodically sampled to give a statistical estimate of where the time is being spent
    • Fast and unintrusive, but unreliable beyond hot spot identification


Call It Less Often
• Before you focus on making something faster, focus on calling it less often
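
One common way to call something less often is to remember earlier answers. The sketch below is only an illustration under that assumption: normalize is a hypothetical expensive operation, and a counter stands in for its cost so that the saving can be seen without timing anything.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CallItLessOften {
    /** Counts how often the expensive operation actually runs. */
    public int expensiveCalls = 0;

    private final Map<String, String> cache = new HashMap<String, String>();

    /** Hypothetical expensive operation; imagine parsing or I/O here. */
    public String normalize(String name) {
        expensiveCalls++;
        return name.trim().toLowerCase(Locale.ROOT);
    }

    /** Remembers earlier answers so that repeated inputs skip the expensive call. */
    public String normalizeCached(String name) {
        String result = cache.get(name);
        if (result == null) {
            result = normalize(name);
            cache.put(name, result);
        }
        return result;
    }

    public static void main(String[] args) {
        CallItLessOften demo = new CallItLessOften();
        String[] inputs = { " Alpha", "Beta ", " Alpha", "Beta ", " Alpha" };
        for (String input : inputs) {
            demo.normalizeCached(input);
        }
        System.out.println(inputs.length + " lookups, " + demo.expensiveCalls + " expensive calls");
    }
}
```

Whether a cache pays off depends on how often inputs repeat and on the memory it holds, so it is a trade-off to verify rather than a rule.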


External Measurements
• Consider using YourKit
  – They support* open source


Internal Measurements
• Clock-based measurements
  – System.currentTimeMillis
  – System.nanoTime (since Java 1.5)
• Accuracy versus Precision
  – Nanoseconds are more precise than milliseconds
  – But you can’t trust the accuracy of either
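
A quick way to see the gap between precision and accuracy is to spin until each clock reports a new value and look at the size of the step. This is a rough sketch with illustrative names; the observed step includes the overhead of the calls themselves, so it indicates the clock's behavior on one machine and is not a measurement of its true resolution.

```java
public class ClockGranularity {
    /** Spins until System.nanoTime() changes and returns the observed step in nanoseconds. */
    public static long nanoTimeStep() {
        long start = System.nanoTime();
        long now;
        do {
            now = System.nanoTime();
        } while (now == start);
        return now - start;
    }

    /** Spins until System.currentTimeMillis() changes and returns the observed step in milliseconds. */
    public static long currentTimeMillisStep() {
        long start = System.currentTimeMillis();
        long now;
        do {
            now = System.currentTimeMillis();
        } while (now == start);
        return now - start;
    }

    public static void main(String[] args) {
        // Print several samples: the steps vary from run to run and from platform to platform.
        for (int i = 0; i < 5; ++i) {
            System.out.println("nanoTime step: " + nanoTimeStep() + " ns, "
                + "currentTimeMillis step: " + currentTimeMillisStep() + " ms");
        }
    }
}
```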


Micro Benchmarks
• Measuring small bits of logic to draw conclusions about which approach performs best
  – These are fraught with problems
  – The same JIT will produce very different results in isolation from what it does in real life
  – The hardware may produce very different results in isolation from what it does in a real application
  – You simply can’t measure threading reliably


Micro Benchmarks
• The JIT will turn your code into a very cheap no-op
  – Your benchmark must compute a result visible to the harness
• Because the clocks are inaccurate you must execute for a long time
  – That typically implies doing something in a loop and then of course you’re measuring the loop overhead too
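
The two points above can be sketched as follows, with illustrative names throughout. The first method discards its result, which leaves an optimizing JIT free to remove the work; the second returns it, and the timing loop folds every result into a checksum that the caller can inspect or print.

```java
public class VisibleResult {
    /** Risky to time: the total is never used, so the loop may be optimized away. */
    public static void discarded(int count) {
        int total = 0;
        for (int i = 0; i < count; ++i) {
            total += i;
        }
    }

    /** Safer to time: the total escapes to the caller. */
    public static int returned(int count) {
        int total = 0;
        for (int i = 0; i < count; ++i) {
            total += i;
        }
        return total;
    }

    /** Runs the work repeatedly; returns { elapsed nanoseconds, checksum of all results }. */
    public static long[] time(int repetitions, int count) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; ++i) {
            checksum += returned(count);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { elapsed, checksum };
    }

    public static void main(String[] args) {
        long[] result = time(100000, 1000);
        // Printing the checksum keeps the computed values observable.
        System.out.println("elapsed ns: " + result[0] + ", checksum: " + result[1]);
    }
}
```

The elapsed time still includes the overhead of the outer loop, as the slide warns.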


Micro Benchmarks
• Do as much as possible outside the benchmark and outside the loop
• You want to know the performance of the compiled code, not the interpreted code
  – You need a warmup
    • Use -XX:+PrintCompilation
  – Beware the garbage collector
    • Use -verbose:gc


Micro Measurements
• I wrote a small benchmark harness
  – http://git.eclipse.org/c/emf/org.eclipse.emf.git/tree/tests/org.eclipse.emf.test.core/src/org/eclipse/emf/test/core/BenchmarkHarness.java
  – Write a class that extends Benchmark and implements run
  – The harness runs the benchmark to determine how many times it must run to use approximately a minimum of one second
  – Then it runs it repeatedly, gathering statistics
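
The calibrate-then-measure idea described above might look roughly like the sketch below. This is not the EMF BenchmarkHarness: MiniHarness, its nested Benchmark class, and every method here are hypothetical, written only to illustrate the two phases, and the real harness may work differently.

```java
public class MiniHarness {
    /** Hypothetical benchmark contract: subclasses implement run() and return a result. */
    public abstract static class Benchmark {
        protected final int count;

        protected Benchmark(int count) {
            this.count = count;
        }

        public abstract int run();
    }

    /** Results are accumulated here so the JIT cannot discard the benchmarked work. */
    public static volatile int sink;

    /** Times one batch of repetitions and returns the elapsed nanoseconds. */
    public static long timeBatch(Benchmark benchmark, int repetitions) {
        int result = 0;
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; ++i) {
            result += benchmark.run();
        }
        long elapsed = System.nanoTime() - start;
        sink += result;
        return elapsed;
    }

    /** Phase 1: doubles the repetition count until one batch lasts at least targetNanos. */
    public static int calibrate(Benchmark benchmark, long targetNanos) {
        int repetitions = 1;
        while (timeBatch(benchmark, repetitions) < targetNanos && repetitions < (1 << 30)) {
            repetitions *= 2;
        }
        return repetitions;
    }

    /** Phase 2: runs several batches; returns { min, mean, max, CV% } of nanoseconds per call. */
    public static double[] measure(Benchmark benchmark, int repetitions, int batches) {
        double min = Double.MAX_VALUE;
        double max = 0;
        double sum = 0;
        double sumOfSquares = 0;
        for (int batch = 0; batch < batches; ++batch) {
            double perCall = (double) timeBatch(benchmark, repetitions) / repetitions;
            min = Math.min(min, perCall);
            max = Math.max(max, perCall);
            sum += perCall;
            sumOfSquares += perCall * perCall;
        }
        double mean = sum / batches;
        double variance = Math.max(0, sumOfSquares / batches - mean * mean);
        double cv = mean == 0 ? 0 : 100 * Math.sqrt(variance) / mean;
        return new double[] { min, mean, max, cv };
    }
}
```

The calibration runs also act as a warmup, since the code under test executes many times before the measured batches begin. A one-second target would be passed as 1000000000L nanoseconds.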


Platform
• Hardware
    Intel Core i7-2920XM CPU @ 2.5GHz
• OS
    Windows 7 Professional Service Pack 1
• JVM
    java version "1.6.0_32"
    Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
    Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode)


The Simplest Micro Measurement
• This is the simplest thing you can measure

  public static class CountedLoop extends Benchmark {
    public CountedLoop() {
      super(1000000);
    }

    @Override
    public int run() {
      int total = 0;
      for (int i = 0; i < count; ++i) {
        total += i;
      }
      return total;
    }

    @Override
    public String getLogic() {
      return "total += i;";
    }
  }

• 0.348 < 0.348 < 0.350 CV%: 0.00 CR 95%: 0.348