databene GmbH in Gründung. Software Architect. & Performance Engineer.
JBoss User Group München. 2. April 2009. Hibernate. Performance Tuning ...
Hibernate Performance Tuning JBoss User Group München
2. April 2009
Volker Bergmann databene GmbH in Gründung Software Architect & Performance Engineer
databene.org
Volker Bergmann
Volker Bergmann Passionate Software Architect and Troubleshooter Specialized in Performance Assessment and Tuning 10 years of experience with enterprise application performance JBoss Certified Consultant Author of benerator, the leading open source performance test data generation tool
2
databene.org
Volker Bergmann
There is no Silver Bullet
High-Performance Hibernate applications require obeying some basic rules, but mainly a continuous diagnosis and optimization process
Understand what you are doing - how to do it can be looked up!
3
databene.org
Volker Bergmann
Agenda The Patient‘s Anatomy Prophylaxis Vitamins Handling serious Cases Therapies and adverse Effects Therapies for the Desperate
4
databene.org
Volker Bergmann
Understanding the patient Application architecture
5
databene.org
Volker Bergmann
Bird‘s eye view The slowest things you can do: application (server) network roundtrip
database disk access
hard disk
network connection creation Disk access network communication (esp. in WAN)
They account for: Call latency (speed) bandwidth consumption (capacity)
6
databene.org
Volker Bergmann
Bird‘s eye view Workload Effects: application (server) network roundtrip
CPU overload Locking
database disk access
hard disk
7
databene.org
Volker Bergmann
performance tuning There is no performance=200% switch Performance tuning is about trading resources Problems
Measures
work load
latency
bandwidth
locking (wait times)
work load
memory
reusing resources
In most cases we incur additional resource use for alleviating a serious bottleneck on another critical resource Example: Caching is trading client memory and work load for network latency and database workload 8
databene.org
Volker Bergmann
frog‘s eye view Check operating system, e.g. Compression
Java application (server)
Encryption Fragmentation Tune the database
network roundtrip
indexes!
database
cache size! the bigger the better!
database cache
buffer sizes
operating system disk access
table spaces
9
hard disk
databene.org
Volker Bergmann
frog‘s eye view Choose an appropriate JDBC driver (e.g. Oracle 11 driver for Oracle 10) Make use of JDBC features: fetch size & batch size A database connection pool (e.g. DBCP) reuses database connections A PreparedStatement cache reuses formerly created PreparedStatements
Hibernate application
prepared stmt cache connection pool jdbc driver network roundtrip
database database cache operating system disk access
hard disk
10
databene.org
Volker Bergmann
frog‘s eye view Hibernate client
The session is the facade of Hibernate
Caches keep copies of DB data for saving DB calls 1st level cache keeps POJOs of the current transaction 2nd level cache keeps data shared among sessions
Be careful about the 2nd level cache, it‘s does plenty of work! 11
databene.org
session 1st level cache 2nd level cache prepared stmt cache connection pool jdbc driver network roundtrip
database database cache operating system disk access
hard disk
Volker Bergmann
Prophylaxis Decisions in early project phases mostly not viable to change in ex-post optimization
12
databene.org
Volker Bergmann
Architectural decisions Important decisions in early project phases Which operations require transactional behavior at all? Which transaction isolation is sufficient? usually Read Commited consequential design guidelines Is optimistic locking an option? Risky to change ex-post if test coverage is low
13
databene.org
Volker Bergmann
Design Decisions Inheritance bears the cost of joins Do not worry too much about using inheritance at all - it‘s OK! Have flat hierarchies Have small discriminator columns for indexing: char(1) Avoid Composite or String keys (uuid, guid): they are large All references to an entity have the type of the key All primary keys and most foreign keys must be indexed Composite and String indexes are more difficult to organize 14
databene.org
Volker Bergmann
Vitamins Easy to apply with little risk
15
databene.org
Volker Bergmann
Optimize the database Do not rely on the DDL generated by Hibernate - optimize! Hibernate-generated DDLs are not intended to be productionready Index structure Index compression table space characteristics Index-organized tables (e.g. for discriminator) Ask your DBA for more
16
databene.org
Volker Bergmann
Database Indexing Consider indexing all columns involved in queries FK columns Indexing means improving select performance at the cost of inserts and updates
You can define indexes in the hibernate configuration: @Table(appliesTo="tableName", indexes = { @Index(name="index1", columnNames={"column1", "column2"} ) } )
17
databene.org
Volker Bergmann
Basic Performance Checklist Optimize the database ...the DBA is your friend! Have the database cache as big as possible Check operating system options Tune JDBC fetch size (e.g. 1000) and batch size (e.g. 30) Use Connection Pooling and PreparedStatement Caching Initially try to cope without 2nd level cache Otherwise do not overflow Hibernate caches to disk Always use parametrized queries Reduce logging to the minimum amount acceptable 18
databene.org
Volker Bergmann
Primary Keys Use an efficient key generation strategy: (seq)hilo Do not use sequence: It issues an extra query for each object to create
19
databene.org
Volker Bergmann
Diagnosing Performance Fundamental performance process guidelines
20
databene.org
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
Load Generator
Hard- and software setup Client load and concurrency
Application Under Test
Database content
Neglecting any of these makes performance test results useless for production performance prediction!
21
databene.org
Database Under Test
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
Load Generator
Hard- and software setup Client load and concurrency Database content
Production Application
Neglecting any of these makes performance test results useless for production performance prediction!
21
Application Under Test
databene.org
Database Under Test Production Database
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
Load Generator Load Generator 800 concurrent users
Hard- and software setup Client load and concurrency Database content
Production Application
Neglecting any of these makes performance test results useless for production performance prediction!
21
Application Under Test
databene.org
Database Under Test Production Database
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
JMeter, Grinder, LoadRunner
Load Generator Load Generator 800 concurrent users
Hard- and software setup Client load and concurrency Database content
Production Application
Neglecting any of these makes performance test results useless for production performance prediction!
21
Application Under Test
databene.org
Database Under Test Production Database
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
JMeter, Grinder, LoadRunner
Load Generator Load Generator 800 concurrent users
Hard- and software setup Client load and concurrency Database content
Production Application
Neglecting any of these makes performance test results useless for production performance prediction!
21
Application Under Test
databene.org
Database Under Test Production Database 10M customers
Volker Bergmann
Performance Testing Essentials Have a test setup that resembles production in
JMeter, Grinder, LoadRunner
Load Generator Load Generator 800 concurrent users
Hard- and software setup Client load and concurrency
Application Under Test
Database content
Production Application
Neglecting any of these makes performance test results useless for production performance prediction!
21
databene.org
Production Data, Benerator
Database Under Test Production Database 10M customers
Volker Bergmann
Concurrency matters We expect 100 calls per second... ...so let‘s write a unit test that calls the application 100 times and check that it takes less than a second... Do you agree?
22
databene.org
Volker Bergmann
What are 100 calls per second? Scenario 1 1 User does 100 Calls per second
uniform behavior single-threaded
23
databene.org
Volker Bergmann
What are 100 calls per second? Scenario 2 100 Users do 1 Call per second
uniform behavior single-threaded
variate behavior multithreaded rare locking relative frequency
Scenario 1 1 User does 100 Calls per second
Latency
23
databene.org
Volker Bergmann
What are 100 calls per second? Scenario 1 1 User does 100 Calls per second
Scenario 2 100 Users do 1 Call per second
Scenario 3 360.000 Users do 1 Call per hour Internet
uniform behavior single-threaded
erratic behavior high concurrency massive locking cache gets worthless
relative frequency
relative frequency
variate behavior multithreaded rare locking
Locking
Latency
23
databene.org
Latency
Volker Bergmann
Initializing the test database Manually defined database setups (e.g. with DbUnit) are to small: Full table scans are faster than index lookups Queries and updates require much less work than with large tables
Programmed data generation is challenging. You need to create large volumes of valid and variate data quickly
24
databene.org
Volker Bergmann
Initializing the test database Production data extraction can be problematic availability (new application or outsourcing) secrecy concerns DBA competence required possibly obsolete data transformation needed
Hardly any test data generation tool is applicable for real-world applications. Major challenge: data that complies to strong validity constraints defining and reusing customizations 25
databene.org
Volker Bergmann
Benerator The leading open source tool for test data generation Written by Volker Bergmann Strongly customizable Can completely synthesize data Can extract and anonymize production data
26
databene.org
Volker Bergmann
Benerator Architecture
Benerator Core
27
databene.org
Volker Bergmann
Benerator Architecture Generators, Business Domain Packages
Benerator Core
27
databene.org
Volker Bergmann
Benerator Architecture Generators, Business Domain Packages
Benerator Core
Data Import/Export, Metadata Import
27
databene.org
Volker Bergmann
Benerator Architecture Generators, Business Domain Packages
Core Extensions
Benerator Core
Core Extensions
Data Import/Export, Metadata Import
27
databene.org
Volker Bergmann
Benerator Architecture Generators, Business Domain Packages
Benerator Core
Core Extensions
Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird
27
File Text CSV XML Flat Excel(TM) DbUnit Custom
databene.org
Core Extensions
System EJB JMS Web Service JCR
Volker Bergmann
Benerator Architecture Generators, Business Domain Packages
Weight Function
Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird
27
Benerator Core
File Text CSV XML Flat Excel(TM) DbUnit Custom
databene.org
Task Validator
K
Sequence
O
1,2,3,...
System EJB JMS Web Service JCR
Volker Bergmann
Benerator Architecture
Sequence
Weight Function
Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird
27
Finance
Benerator Core
File Text CSV XML Flat Excel(TM) DbUnit Custom
databene.org
@ Net
Task Validator
K
1,2,3,...
Address
O
Person
System EJB JMS Web Service JCR
Volker Bergmann
(User) Interfaces ruise control
Eclipse
benclipse
Ant
Telnet
Benerator Plugin
Benerator
28
databene.org
Volker Bergmann
Benerator Tool Demo
29
databene.org
Volker Bergmann
Always remember: That‘s Production! data volume, distribution & variety
user load & concurrency
different tasks application under test 30
databene.org
processes competing for resources Volker Bergmann
Process define performance requirements early
measure performance early
met requirements ?
have a performance goal have reproducible tests don‘t optimize without need
find & remove biggest bottleneck
prove performance impact of each change
profile application
expect that removing a major bottleneck shifts minor ones 31
databene.org
Volker Bergmann
Profiling Tools client
JVM CPU Load Java Method Latency
MBean console
session 1st level cache profiler
Cache Hit Ratio
2nd level cache
statistics / logger
prepared stmt cache connection pool
SQL Latency
jdbc driver
jdbc monitor
VM
Network Traffic DB CPU Load
network roundtrip
database os tools
database cache
db monitor
operating system
Disk Traffic
disk access
hard disk
32
databene.org
Volker Bergmann
Profilers A profiler enables you to gather most relevant data in a single application and with correlated time series Wide range from open source profilers to high-priced ones: Open Source: NetBeans Low Price: JProfiler ($X00) High End - High Price: dynaTrace, PerformaSure ($X0,000)
The higher the price, the higher the value you get from these tools If the application does not have challenging performance requirements, a low price profiler does well for an experienced user. 33
databene.org
Volker Bergmann
Therapies and adverse effects Performance Tuning measures and when not to apply them
34
databene.org
Volker Bergmann
Latency & Bandwidth
Bandwidth
Let‘s switch to another analogy for understanding bandwidth and latency: Car traffic
Latency 35
databene.org
Volker Bergmann
Handling Latency & Bandwidth How to copy with small bandwidth and high latency?
36
databene.org
Volker Bergmann
Handling Latency & Bandwidth How to copy with small bandwidth and high latency?
Have a better network
36
databene.org
Volker Bergmann
Handling Latency & Bandwidth How to copy with small bandwidth and high latency?
Have a better network
Save database calls
36
databene.org
Volker Bergmann
Handling Latency & Bandwidth How to copy with small bandwidth and high latency?
Have a better network Use bulk/batch invocation A * Bus Tours
Get more payload from a trip: • join • fetch size Save database calls
36
databene.org
Volker Bergmann
The standard problem WYSILTYG: What you see is less than you get! Imagine setting the prices of all products in a category to 1! (or 1$): Category cat = (Category) session.get(Category.class, 1L); Set products = cat.getProducts(); for (Product prod : products) product.setPrice(1);
tx.commit();
37
databene.org
Volker Bergmann
The standard problem WYSILTYG: What you see is less than you get! Imagine setting the prices of all products in a category to 1! (or 1$): Category cat = (Category) select ... from CATEGORY C session.get(Category.class, 1L); where C.ID = ? Set products = cat.getProducts(); for (Product prod : products) product.setPrice(1);
tx.commit();
37
select ID from PRODUCT P where P.CATEGORY = ? N x select ... from PRODUCT P where P.ID = ? N x update PRODUCT P set ... where P.ID = ? commit databene.org
Volker Bergmann
Saving DB Calls What causes a call?
How to save it?
HQL / Criteria Queries
Join Queries (2.L) Query+entity cache
get(pk)
(1./2.L) Entity cache
flush(), commit()
FlushMode.MANUAL
Navigation!, e.g.
Eager Fetching!
product.getCategory()
(2.L) Association Cache
LOB access
In-line storage ( additional work! 44
databene.org
Volker Bergmann
DML Operations HQL supports UDPATE and DELETE operations Syntax: (UPDATE | DELETE) FROM? EntityName (WHERE conditions)?
Optimal solution: String hqlUpdate = "update Product p " + "set p.price = :newPrice where c.category.id = :catId"; session.createQuery( hqlUpdate ) .setString( "newPrice", newPrice ) .setLong( "catId", categoryId ) .executeUpdate(); tx.commit(); update PRODUCT set ... where category.id = 1 commit; 45
databene.org
Volker Bergmann
Therapies for the desperate (or ruthless) Kids don‘t try this at home
46
databene.org
Volker Bergmann
Flushing Hibernate delays the execution of inserts and updates They are sent bulk in a flush() Issuing a query may cause Hibernate to flush before (for assuring consistent query results) Reorder operations: early queries, late inserts/updates You can take control of flushing by session.setFlushMode(FlushMode.MANUAL); The only automatic flush will then be issued at commit() you can manually issue a flush with session.flush(); 47
databene.org
Volker Bergmann
Database Fine-Tuning Stored procedures can reduce latency significantly! Use (if necessary proprietary) SQL Use SQL hints, e.g. select /*+ index(user name_idx) */ ... from user ...
48
databene.org
Volker Bergmann
Constraints Database Constraints are your friend they can help you a lot (in realizing software defects) They rarely cause trouble (mainly insert and update performance - seldomly a bottleneck) mostly worth the performance overhead so make friends! Foreign key, unique, check Drop constraints only for tables with massive update quotas
49
databene.org
Volker Bergmann
Detached Object Caching The Hibernate 2nd Level Cache does a lot of work caching relational structures, not objects! mapping structures to objects, relations and queries creating a copy of each object for each session (thread) synchronizing updates
EHCache can easily be adopted for direct use: caching detached objects delivering them by reference objects may not be attached to the session or a persistent object! defining custom cache organization (e.g. effective-dated objects) defining custom cache eviction algorithms 50
databene.org
Volker Bergmann
Conclusions Some performance-relevant decisions must be taken in design stage - they are not a topic for ex post tuning Some no-brain recipes assure good base performance Tuning is not an end in itself - you might regret it If performance is not sufficient you need to first profile, then optimize Tuning always trades resources for alleviating the current most critical bottleneck Diagnose all participating systems and layers within! Not each tuning measure is useful in each context! 51
databene.org
Volker Bergmann
Performance Tuning is no checklist processing! Some concepts are always useful, other ones shall not be applied blindly. Always know what you are doing!
52
databene.org
Volker Bergmann
Q&A ..questions left?
53
databene.org
Volker Bergmann
Thanks for your attention!
http://databene.org
[email protected]
54
databene.org
Volker Bergmann