Hibernate Performance Tuning - databene

33 downloads 1332 Views 2MB Size Report
databene GmbH in Gründung. Software Architect. & Performance Engineer. JBoss User Group München. 2. April 2009. Hibernate. Performance Tuning ...
Hibernate Performance Tuning JBoss User Group München

2. April 2009

Volker Bergmann databene GmbH in Gründung Software Architect & Performance Engineer

databene.org

Volker Bergmann

Volker Bergmann Passionate Software Architect and Troubleshooter Specialized in Performance Assessment and Tuning 10 years of experience with enterprise application performance JBoss Certified Consultant Author of benerator, the leading open source performance test data generation tool

2

databene.org

Volker Bergmann

There is no Silver Bullet

High-Performance Hibernate applications require obeying some basic rules, but mainly a continuous diagnosis and optimization process

Understand what you are doing - how to do it can be looked up!

3

databene.org

Volker Bergmann

Agenda The Patient‘s Anatomy Prophylaxis Vitamins Handling serious Cases Therapies and adverse Effects Therapies for the Desperate

4

databene.org

Volker Bergmann

Understanding the patient Application architecture

5

databene.org

Volker Bergmann

Bird‘s eye view The slowest things you can do: application (server) network roundtrip

database disk access

hard disk

network connection creation Disk access network communication (esp. in WAN)

They account for: Call latency (speed) bandwidth consumption (capacity)

6

databene.org

Volker Bergmann

Bird‘s eye view Workload Effects: application (server) network roundtrip

CPU overload Locking

database disk access

hard disk

7

databene.org

Volker Bergmann

performance tuning There is no performance=200% switch Performance tuning is about trading resources Problems

Measures

work load

latency

bandwidth

locking (wait times)

work load

memory

reusing resources

In most cases we incur additional resource use for alleviating a serious bottleneck on another critical resource Example: Caching is trading client memory and work load for network latency and database workload 8

databene.org

Volker Bergmann

frog‘s eye view Check operating system, e.g. Compression

Java application (server)

Encryption Fragmentation Tune the database

network roundtrip

indexes!

database

cache size! the bigger the better!

database cache

buffer sizes

operating system disk access

table spaces

9

hard disk

databene.org

Volker Bergmann

frog‘s eye view Choose an appropriate JDBC driver (e.g. Oracle 11 driver for Oracle 10) Make use of JDBC features: fetch size & batch size A database connection pool (e.g. DBCP) reuses database connections A PreparedStatement cache reuses formerly created PreparedStatements

Hibernate application

prepared stmt cache connection pool jdbc driver network roundtrip

database database cache operating system disk access

hard disk

10

databene.org

Volker Bergmann

frog‘s eye view Hibernate client

The session is the facade of Hibernate

Caches keep copies of DB data for saving DB calls 1st level cache keeps POJOs of the current transaction 2nd level cache keeps data shared among sessions

Be careful about the 2nd level cache, it‘s does plenty of work! 11

databene.org

session 1st level cache 2nd level cache prepared stmt cache connection pool jdbc driver network roundtrip

database database cache operating system disk access

hard disk

Volker Bergmann

Prophylaxis Decisions in early project phases mostly not viable to change in ex-post optimization

12

databene.org

Volker Bergmann

Architectural decisions Important decisions in early project phases Which operations require transactional behavior at all? Which transaction isolation is sufficient? usually Read Commited consequential design guidelines Is optimistic locking an option? Risky to change ex-post if test coverage is low

13

databene.org

Volker Bergmann

Design Decisions Inheritance bears the cost of joins Do not worry too much about using inheritance at all - it‘s OK! Have flat hierarchies Have small discriminator columns for indexing: char(1) Avoid Composite or String keys (uuid, guid): they are large All references to an entity have the type of the key All primary keys and most foreign keys must be indexed Composite and String indexes are more difficult to organize 14

databene.org

Volker Bergmann

Vitamins Easy to apply with little risk

15

databene.org

Volker Bergmann

Optimize the database Do not rely on the DDL generated by Hibernate - optimize! Hibernate-generated DDLs are not intended to be productionready Index structure Index compression table space characteristics Index-organized tables (e.g. for discriminator) Ask your DBA for more

16

databene.org

Volker Bergmann

Database Indexing Consider indexing all columns involved in queries FK columns Indexing means improving select performance at the cost of inserts and updates

You can define indexes in the hibernate configuration: @Table(appliesTo="tableName", indexes = { @Index(name="index1", columnNames={"column1", "column2"} ) } )

17

databene.org

Volker Bergmann

Basic Performance Checklist Optimize the database ...the DBA is your friend! Have the database cache as big as possible Check operating system options Tune JDBC fetch size (e.g. 1000) and batch size (e.g. 30) Use Connection Pooling and PreparedStatement Caching Initially try to cope without 2nd level cache Otherwise do not overflow Hibernate caches to disk Always use parametrized queries Reduce logging to the minimum amount acceptable 18

databene.org

Volker Bergmann

Primary Keys Use an efficient key generation strategy: (seq)hilo Do not use sequence: It issues an extra query for each object to create

19

databene.org

Volker Bergmann

Diagnosing Performance Fundamental performance process guidelines

20

databene.org

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

Load Generator

Hard- and software setup Client load and concurrency

Application Under Test

Database content

Neglecting any of these makes performance test results useless for production performance prediction!

21

databene.org

Database Under Test

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

Load Generator

Hard- and software setup Client load and concurrency Database content

Production Application

Neglecting any of these makes performance test results useless for production performance prediction!

21

Application Under Test

databene.org

Database Under Test Production Database

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

Load Generator Load Generator 800 concurrent users

Hard- and software setup Client load and concurrency Database content

Production Application

Neglecting any of these makes performance test results useless for production performance prediction!

21

Application Under Test

databene.org

Database Under Test Production Database

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

JMeter, Grinder, LoadRunner

Load Generator Load Generator 800 concurrent users

Hard- and software setup Client load and concurrency Database content

Production Application

Neglecting any of these makes performance test results useless for production performance prediction!

21

Application Under Test

databene.org

Database Under Test Production Database

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

JMeter, Grinder, LoadRunner

Load Generator Load Generator 800 concurrent users

Hard- and software setup Client load and concurrency Database content

Production Application

Neglecting any of these makes performance test results useless for production performance prediction!

21

Application Under Test

databene.org

Database Under Test Production Database 10M customers

Volker Bergmann

Performance Testing Essentials Have a test setup that resembles production in

JMeter, Grinder, LoadRunner

Load Generator Load Generator 800 concurrent users

Hard- and software setup Client load and concurrency

Application Under Test

Database content

Production Application

Neglecting any of these makes performance test results useless for production performance prediction!

21

databene.org

Production Data, Benerator

Database Under Test Production Database 10M customers

Volker Bergmann

Concurrency matters We expect 100 calls per second... ...so let‘s write a unit test that calls the application 100 times and check that it takes less than a second... Do you agree?

22

databene.org

Volker Bergmann

What are 100 calls per second? Scenario 1 1 User does 100 Calls per second

uniform behavior single-threaded

23

databene.org

Volker Bergmann

What are 100 calls per second? Scenario 2 100 Users do 1 Call per second

uniform behavior single-threaded

variate behavior multithreaded rare locking relative frequency

Scenario 1 1 User does 100 Calls per second

Latency

23

databene.org

Volker Bergmann

What are 100 calls per second? Scenario 1 1 User does 100 Calls per second

Scenario 2 100 Users do 1 Call per second

Scenario 3 360.000 Users do 1 Call per hour Internet

uniform behavior single-threaded

erratic behavior high concurrency massive locking cache gets worthless

relative frequency

relative frequency

variate behavior multithreaded rare locking

Locking

Latency

23

databene.org

Latency

Volker Bergmann

Initializing the test database Manually defined database setups (e.g. with DbUnit) are to small: Full table scans are faster than index lookups Queries and updates require much less work than with large tables

Programmed data generation is challenging. You need to create large volumes of valid and variate data quickly

24

databene.org

Volker Bergmann

Initializing the test database Production data extraction can be problematic availability (new application or outsourcing) secrecy concerns DBA competence required possibly obsolete data transformation needed

Hardly any test data generation tool is applicable for real-world applications. Major challenge: data that complies to strong validity constraints defining and reusing customizations 25

databene.org

Volker Bergmann

Benerator The leading open source tool for test data generation Written by Volker Bergmann Strongly customizable Can completely synthesize data Can extract and anonymize production data

26

databene.org

Volker Bergmann

Benerator Architecture

Benerator Core

27

databene.org

Volker Bergmann

Benerator Architecture Generators, Business Domain Packages

Benerator Core

27

databene.org

Volker Bergmann

Benerator Architecture Generators, Business Domain Packages

Benerator Core

Data Import/Export, Metadata Import

27

databene.org

Volker Bergmann

Benerator Architecture Generators, Business Domain Packages

Core Extensions

Benerator Core

Core Extensions

Data Import/Export, Metadata Import

27

databene.org

Volker Bergmann

Benerator Architecture Generators, Business Domain Packages

Benerator Core

Core Extensions

Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird

27

File Text CSV XML Flat Excel(TM) DbUnit Custom

databene.org

Core Extensions

System EJB JMS Web Service JCR

Volker Bergmann

Benerator Architecture Generators, Business Domain Packages

Weight Function

Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird

27

Benerator Core

File Text CSV XML Flat Excel(TM) DbUnit Custom

databene.org

Task Validator

K

Sequence

O

1,2,3,...

System EJB JMS Web Service JCR

Volker Bergmann

Benerator Architecture

Sequence

Weight Function

Data base Oracle MySQL HSQL Derby Postgres DB2 SQL Server Firebird

27

Finance

Benerator Core

File Text CSV XML Flat Excel(TM) DbUnit Custom

databene.org

@ Net

Task Validator

K

1,2,3,...

Address

O

Person

System EJB JMS Web Service JCR

Volker Bergmann

(User) Interfaces ruise control

Eclipse

benclipse

Ant

Telnet

Benerator Plugin

Benerator

28

databene.org

Volker Bergmann

Benerator Tool Demo

29

databene.org

Volker Bergmann

Always remember: That‘s Production! data volume, distribution & variety

user load & concurrency

different tasks application under test 30

databene.org

processes competing for resources Volker Bergmann

Process define performance requirements early

measure performance early

met requirements ?

have a performance goal have reproducible tests don‘t optimize without need

find & remove biggest bottleneck

prove performance impact of each change

profile application

expect that removing a major bottleneck shifts minor ones 31

databene.org

Volker Bergmann

Profiling Tools client

JVM CPU Load Java Method Latency

MBean console

session 1st level cache profiler

Cache Hit Ratio

2nd level cache

statistics / logger

prepared stmt cache connection pool

SQL Latency

jdbc driver

jdbc monitor

VM

Network Traffic DB CPU Load

network roundtrip

database os tools

database cache

db monitor

operating system

Disk Traffic

disk access

hard disk

32

databene.org

Volker Bergmann

Profilers A profiler enables you to gather most relevant data in a single application and with correlated time series Wide range from open source profilers to high-priced ones: Open Source: NetBeans Low Price: JProfiler ($X00) High End - High Price: dynaTrace, PerformaSure ($X0,000)

The higher the price, the higher the value you get from these tools If the application does not have challenging performance requirements, a low price profiler does well for an experienced user. 33

databene.org

Volker Bergmann

Therapies and adverse effects Performance Tuning measures and when not to apply them

34

databene.org

Volker Bergmann

Latency & Bandwidth

Bandwidth

Let‘s switch to another analogy for understanding bandwidth and latency: Car traffic

Latency 35

databene.org

Volker Bergmann

Handling Latency & Bandwidth How to copy with small bandwidth and high latency?

36

databene.org

Volker Bergmann

Handling Latency & Bandwidth How to copy with small bandwidth and high latency?

Have a better network

36

databene.org

Volker Bergmann

Handling Latency & Bandwidth How to copy with small bandwidth and high latency?

Have a better network

Save database calls

36

databene.org

Volker Bergmann

Handling Latency & Bandwidth How to copy with small bandwidth and high latency?

Have a better network Use bulk/batch invocation A * Bus Tours

Get more payload from a trip: • join • fetch size Save database calls

36

databene.org

Volker Bergmann

The standard problem WYSILTYG: What you see is less than you get! Imagine setting the prices of all products in a category to 1! (or 1$): Category cat = (Category) session.get(Category.class, 1L); Set products = cat.getProducts(); for (Product prod : products) product.setPrice(1);

tx.commit();

37

databene.org

Volker Bergmann

The standard problem WYSILTYG: What you see is less than you get! Imagine setting the prices of all products in a category to 1! (or 1$): Category cat = (Category) select ... from CATEGORY C session.get(Category.class, 1L); where C.ID = ? Set products = cat.getProducts(); for (Product prod : products) product.setPrice(1);

tx.commit();

37

select ID from PRODUCT P where P.CATEGORY = ? N x select ... from PRODUCT P where P.ID = ? N x update PRODUCT P set ... where P.ID = ? commit databene.org

Volker Bergmann

Saving DB Calls What causes a call?

How to save it?

HQL / Criteria Queries

Join Queries (2.L) Query+entity cache

get(pk)

(1./2.L) Entity cache

flush(), commit()

FlushMode.MANUAL

Navigation!, e.g.

Eager Fetching!

product.getCategory()

(2.L) Association Cache

LOB access

In-line storage ( additional work! 44

databene.org

Volker Bergmann

DML Operations HQL supports UDPATE and DELETE operations Syntax: (UPDATE | DELETE) FROM? EntityName (WHERE conditions)?

Optimal solution: String hqlUpdate = "update Product p " + "set p.price = :newPrice where c.category.id = :catId"; session.createQuery( hqlUpdate ) .setString( "newPrice", newPrice ) .setLong( "catId", categoryId ) .executeUpdate(); tx.commit(); update PRODUCT set ... where category.id = 1 commit; 45

databene.org

Volker Bergmann

Therapies for the desperate (or ruthless) Kids don‘t try this at home

46

databene.org

Volker Bergmann

Flushing Hibernate delays the execution of inserts and updates They are sent bulk in a flush() Issuing a query may cause Hibernate to flush before (for assuring consistent query results) Reorder operations: early queries, late inserts/updates You can take control of flushing by session.setFlushMode(FlushMode.MANUAL); The only automatic flush will then be issued at commit() you can manually issue a flush with session.flush(); 47

databene.org

Volker Bergmann

Database Fine-Tuning Stored procedures can reduce latency significantly! Use (if necessary proprietary) SQL Use SQL hints, e.g. select /*+ index(user name_idx) */ ... from user ...

48

databene.org

Volker Bergmann

Constraints Database Constraints are your friend they can help you a lot (in realizing software defects) They rarely cause trouble (mainly insert and update performance - seldomly a bottleneck) mostly worth the performance overhead so make friends! Foreign key, unique, check Drop constraints only for tables with massive update quotas

49

databene.org

Volker Bergmann

Detached Object Caching The Hibernate 2nd Level Cache does a lot of work caching relational structures, not objects! mapping structures to objects, relations and queries creating a copy of each object for each session (thread) synchronizing updates

EHCache can easily be adopted for direct use: caching detached objects delivering them by reference objects may not be attached to the session or a persistent object! defining custom cache organization (e.g. effective-dated objects) defining custom cache eviction algorithms 50

databene.org

Volker Bergmann

Conclusions Some performance-relevant decisions must be taken in design stage - they are not a topic for ex post tuning Some no-brain recipes assure good base performance Tuning is not an end in itself - you might regret it If performance is not sufficient you need to first profile, then optimize Tuning always trades resources for alleviating the current most critical bottleneck Diagnose all participating systems and layers within! Not each tuning measure is useful in each context! 51

databene.org

Volker Bergmann

Performance Tuning is no checklist processing! Some concepts are always useful, other ones shall not be applied blindly. Always know what you are doing!

52

databene.org

Volker Bergmann

Q&A ..questions left?

53

databene.org

Volker Bergmann

Thanks for your attention!

http://databene.org [email protected]

54

databene.org

Volker Bergmann