10 New Problem-Solving Tips Using ASH & AWR - aioug

Oracle Performance Tuning Boot Camp: 10 New Problem-Solving Tips Using ASH & AWR

Graham Wood, Architect


Copyright © 2012, Oracle and/or its affiliates. All rights reserved.


3 Types of Performance Management

• Reactive Performance Management
• Proactive Performance Management
• Preventive Performance Management


Reactive Performance Management

1. Comparing Performance Across Two Time Periods
2. Database Hang Analysis
3. SQL Performance Analysis


Comparing Performance Across Two Periods

• “Performance was fine yesterday, but today my application is really slow.”
• Inconsistent performance
  – Overutilization of system resources
  – High-load ad hoc query consuming resources
  – Change in the execution plan of a query
  – Parallel execution downgrade


Compare Period ADDM

[Diagram labels: AWR Snapshot Period 1, AWR Snapshot Period 2, Compare Period ADDM, Analysis Report; findings shown: SQL Commonality, Regressed SQL, I/O Bound, Undersized SGA]

• Full ADDM analysis across two AWR snapshot periods
• Detects causes, measures effects, then correlates them
  – Causes: workload changes, configuration changes
  – Effects: regressed SQL, reaching resource limits (CPU, I/O, memory, interconnect)
• Makes actionable recommendations along with quantified impact

Compare Period ADDM: Method

Identify what changed
• Configuration changes, workload changes

Quantify performance differences
• Uses DB Time as the basis for measuring performance

Identify root cause
• Correlate performance differences with changes

Example findings:
• 30% smaller buffer cache
• 10% new SQL
• Top SQL increased 45%
• Read I/O up 55%
• Buffer cache reduction caused the read I/O increase

Reactive Performance Management

1. Comparing Performance Across Two Time Periods: Compare Period ADDM
2. Database Hang Analysis
3. SQL Performance Analysis

Database Hang Analysis

• “My database has hung. I do not want to bounce it again.”
• Database hung state
  – Blocking sessions
  – Memory allocation issues
  – Library cache issues
  – Unresponsive storage (ASM)
  – Interconnect problems


Real-Time ADDM – Architecture

[Diagram labels: Enterprise Manager, EM Agent, Database, Diagnostic Connection, JDBC Connection, Unresponsive DB, Real time analysis, ADDM Analysis; problems shown: Deadlocks, Hangs, Latches]

• Uses a pre-established diagnostic connection for unresponsive systems
• Initiates a standard JDBC connection for real-time analysis
• Diagnostic connection collects data without holding latches or running SQL
• First intelligent advisor to diagnose problems in real time as they occur, no matter how sick the system is


Real-Time ADDM

• Real-time analysis of hung or slow database systems
• Holistically identifies global resource contention and deadlocks
• Quantified performance impact
• Precise, actionable recommendations
• Provides cluster-wide analysis for RAC

Reactive Performance Management

1. Comparing Performance Across Two Time Periods: Compare Period ADDM
2. Database Hang Analysis: Real-Time ADDM
3. SQL Performance Analysis

SQL Performance Analysis

• “I enabled parallel query, yet this query is taking so long. Can you take a look?”
• Parallel downgrades
  – Uncontrolled parallel execution
  – Parallel server availability
  – Object-level settings
  – Session-level settings


Real-Time SQL Monitoring

Insert executed with parallel hint.


Real-Time SQL Monitoring: Parallel Tab

• Parallel coordinator busy for the entire duration!


Real-Time SQL Monitoring: Enabled Parallel DML

• Parallel slaves busy for the entire duration!


Reactive Performance Management

1. Comparing Performance Across Two Time Periods: Compare Period ADDM
2. Database Hang Analysis: Real-Time ADDM
3. SQL Performance Analysis: SQL Monitoring


Proactive Performance Management

4. Proactively Monitoring Long Running Programs
5. Analyzing Transient Performance Problems
Understanding Workload Profile
6. Correlating ASH & AWR
7. Using ASH Analytics


Reactive Tracing of Long Running Programs?

• “Can you trace my program?”
• What is wrong with tracing?
  – A very reactive way of looking at problems
  – Overhead of writing data to trace files
  – Programs we want to trace are usually the ones with issues
  – Impacts the performance of the production system


Real-Time Database Operation Monitoring (NEW)

• Database Operation (DBOP)
  – Simple DBOP (already supported in 11g)
    • A SQL statement (e.g. SQL for DSS, batch/report SQL, runaway SQL)
    • A PL/SQL procedure/function
  – Composite DBOP (new in 12c)
    • Session(s) activity between two points in time, defined by application code or the DBA
    • For example, a SQL*Plus script, batch job, ETL processing, …
    • At most one DBOP per DB session


Naming a Database Operation

– Naming or tagging
– Bracketing

IMPLICIT: DBOP (Tag) → SQL, PL/SQL blocks, …, SQL, SQL
EXPLICIT: BEGIN_OPERATION → SQL, PL/SQL blocks, …, SQL, SQL → END_OPERATION
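The bracketing idea can be pictured with a small model. The sketch below is plain Python and does not call any Oracle API: `Session`, `execute`, and `database_operation` are hypothetical names, and only the BEGIN_OPERATION / END_OPERATION labels and the "at most one DBOP per session" rule come from the slides.

```python
from contextlib import contextmanager


class Session:
    """Hypothetical stand-in for a database session; not an Oracle API."""

    def __init__(self):
        self.current_dbop = None  # at most one DBOP per session
        self.monitored = []       # (dbop_name, statement) pairs

    def execute(self, statement):
        # Statements run inside a bracket are attributed to the active DBOP.
        if self.current_dbop is not None:
            self.monitored.append((self.current_dbop, statement))


@contextmanager
def database_operation(session, name):
    """Bracket a block of work: begin on entry, end on exit (even on error)."""
    if session.current_dbop is not None:
        raise RuntimeError("session already has an active DBOP")
    session.current_dbop = name      # models BEGIN_OPERATION
    try:
        yield session
    finally:
        session.current_dbop = None  # models END_OPERATION


session = Session()
with database_operation(session, "nightly_etl"):
    session.execute("INSERT INTO sales_stage ...")
    session.execute("MERGE INTO sales ...")
session.execute("SELECT 1 FROM dual")  # outside the bracket: not tagged
print(session.monitored)
```

Everything executed between the two bracket points is grouped under one operation name, which is what lets a batch job or ETL script be monitored as a unit rather than statement by statement.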

Real-Time Database Operations Monitoring

• Database monitoring of application jobs
  – Grouping of SQL statements and sessions for the application job
  – Key scenarios: ETL operations, quarter-end close job
• Real-time monitoring driven by application-specified tagging
  – Automatically tags Data Pump jobs
  – Tagging ability in PL/SQL, OCI, JDBC
• Avoids the overhead of SQL Trace
• Visibility of top SQL statements, system and session performance metrics


Proactive Performance Management

4. Reactive Tracing of Long Running Programs: Database Operations
5. Analyzing Transient Performance Problems
Understanding Workload Profile
6. Correlating ASH & AWR
7. Using ASH Analytics

Analyzing Transient Performance Problems

• “What happened last night? The batch job took twice the time to finish.”
• No way to detect transient issues
  – We look at AWR data
    • Averaged out over the snapshot window
  – On-disk ASH data
    • Sampled every 10 seconds
  – Very difficult to detect such issues in the “past”
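A quick calculation shows why averaging over the snapshot window hides a short spike. The sketch assumes a 60-minute snapshot window and invented load figures (average active sessions per minute); none of these numbers come from the slides.

```python
# Hypothetical workload: average active sessions (AAS) per minute over an
# assumed 60-minute AWR snapshot window. All numbers are illustrative.
baseline_aas = 2.0
spike_aas = 30.0
per_minute = [baseline_aas] * 55 + [spike_aas] * 5  # 5-minute spike at the end

window_average = sum(per_minute) / len(per_minute)
peak = max(per_minute)

print(f"average over the window: {window_average:.2f}")
print(f"peak during the spike:   {peak:.2f}")
```

A spike fifteen times the baseline moves the window average only from 2 to about 4.3, which is easy to overlook in a report that shows one figure per snapshot.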


Automatic Performance Diagnostics

ADDM
• Diagnose persistent performance issues
• Uses AWR snapshots
• Regular interval
• Automatic or manual

Compare Period ADDM
• Coarse-grained performance comparison across two periods
• Relies on AWR data
• Manual

Real-Time ADDM
• Hung or extremely slow databases
• Uses a normal and a diagnostic mode connection
• Manual

Enhanced Real-Time ADDM (NEW)
• Proactively detect and diagnose transient high-impact problems
• Built inside the database
• Automatic
• Runs every 3 seconds

Real-Time ADDM (NEW)

• Automatic real-time problem detection and analysis
  – Looks for triggering conditions every 3 seconds
• Database self-monitors for serious performance issues
  – Recognizes bad performance trends and triggers analysis: high CPU, I/O spikes, memory, interconnect, hangs, deadlocks
  – Identifies a problem before it threatens application performance
• Short-duration (5-minute spikes) ADDM analysis
  – Actionable advice for critical issues
  – Richer data set available for analysis
• Reports (analysis and data) stored in AWR for historical analysis
  – ADDM, SQL Monitoring reports

Triggering Conditions

#  Rule                    Condition
1  High Load               Average active sessions greater than 3 times the number of CPU cores
2  I/O bound               Impact on active sessions based on single block read performance
3  CPU bound               Active sessions greater than 10% of total load and CPU utilization greater than 50%
4  Over-allocated memory   Allocation over 95% of physical memory
5  Interconnect bound      Based on single block interconnect transfer time
6  Session Limit           Session limit close to 100%
7  Process Limit           Process limit close to 100%
8  Hung Session            Number of hung sessions greater than 10% of total sessions
9  Deadlock Detected       Any deadlock detected by hang analyzer
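The rules with explicit thresholds can be read as simple predicates over a metrics sample. The sketch below encodes rules 1, 4, 8 and 9 only, since the other rules do not state a precise threshold; the `Snapshot` fields and `triggered_rules` are hypothetical names, and this is not how the database implements the checks.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical metrics sample; field names are illustrative only."""
    avg_active_sessions: float
    cpu_cores: int
    memory_allocated_pct: float  # percent of physical memory
    hung_sessions: int
    total_sessions: int
    deadlock_detected: bool


def triggered_rules(s):
    """Return the names of the rules (1, 4, 8, 9) that this sample trips."""
    rules = []
    if s.avg_active_sessions > 3 * s.cpu_cores:
        rules.append("High Load")
    if s.memory_allocated_pct > 95:
        rules.append("Over-allocated memory")
    if s.total_sessions > 0 and s.hung_sessions > 0.10 * s.total_sessions:
        rules.append("Hung Session")
    if s.deadlock_detected:
        rules.append("Deadlock Detected")
    return rules


sample = Snapshot(avg_active_sessions=40, cpu_cores=8, memory_allocated_pct=80,
                  hung_sessions=3, total_sessions=200, deadlock_detected=False)
print(triggered_rules(sample))
```

With 8 cores the High Load threshold is 24 average active sessions, so the sample above trips only that rule.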


Real-Time ADDM Report


Proactive Performance Management

4. Reactive Tracing of Long Running Programs: Database Operations
5. Analyzing Transient Performance Problems: Real-Time ADDM
Understanding Workload Profile
6. Correlating ASH & AWR
7. Using ASH Analytics

Understanding Workload Profile

• “The SQL Response metric crossed the warning threshold. What is wrong?”
• Several factors can impact SQL response time
  – Increased or unusual load on the system
  – Hardware issues
  – Runaway queries consuming system resources
  – Changes in execution plans
  – Missing or stale object statistics
• Need a mechanism to quickly analyze in-memory performance data


DB Response Time Analysis – AWR

• The AWR Top 5 section shows the wait class that contributes most to DB wait time
• The Foreground Wait Class section in AWR shows the distribution of DB waits over wait classes
• Objects involved in TX row lock contention can be identified in the Segment Statistics section of AWR

From AWR to ASH

• The ASH report for the period of increased Application waits will show the same waits as AWR
• Can I get the application module that suffered from this type of contention?

Extracting More Data from ASH

• Identify SQL statements and sessions impacted by waits on the “Application” wait class


Extracting More Data from ASH

• Get a list of blocking sessions and DB objects
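The kind of aggregation described here can be sketched over a handful of in-memory rows. The rows are invented; the keys are modelled on V$ACTIVE_SESSION_HISTORY column names (the real object column is CURRENT_OBJ#, written `current_obj` here), and a real analysis would query ASH rather than a Python list.

```python
from collections import Counter

# Hypothetical ASH-style samples; every value below is made up.
samples = [
    {"session_id": 101, "sql_id": "a1", "module": "ORDERS",
     "wait_class": "Application", "blocking_session": 77, "current_obj": 5001},
    {"session_id": 102, "sql_id": "a1", "module": "ORDERS",
     "wait_class": "Application", "blocking_session": 77, "current_obj": 5001},
    {"session_id": 103, "sql_id": "b2", "module": "BILLING",
     "wait_class": "Application", "blocking_session": 77, "current_obj": 5001},
    {"session_id": 104, "sql_id": "c3", "module": "REPORTS",
     "wait_class": "User I/O", "blocking_session": None, "current_obj": 6002},
    {"session_id": 105, "sql_id": "b2", "module": "BILLING",
     "wait_class": "Application", "blocking_session": 88, "current_obj": 5002},
]

# Keep only samples that were waiting in the "Application" wait class.
app_waits = [row for row in samples if row["wait_class"] == "Application"]

# Count samples per (blocking session, object) and per (SQL, module).
by_blocker = Counter((r["blocking_session"], r["current_obj"]) for r in app_waits)
by_sql = Counter((r["sql_id"], r["module"]) for r in app_waits)

print("top blocker/object:", by_blocker.most_common(1))
print("impacted SQL/module:", by_sql.most_common())
```

Counting samples per group approximates how much time each blocker, object, SQL statement, or module accounts for, which answers the question of which application module suffered from the contention.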


Understanding Workload Profile: ASH Analytics

• Graphical ASH report for advanced analysis
• Different visualizations: stacked chart or tree map
• Provides visual filtering for recursive drill-downs
• Collaborate with others using Active Reports
• Select any time period for analysis
• Analyze performance across many dimensions


Proactive Performance Management

4. Reactive Tracing of Long Running Programs: Database Operations
5. Analyzing Transient Performance Problems: Real-Time ADDM
Understanding Workload Profile
6. Correlating ASH & AWR: AWR & ASH Reports
7. Using ASH Analytics: ASH Analytics


Preventive Performance Management

8. Prevent Regression After Upgrade
9. Ensure Optimal Resource Allocation
10. Prevent Performance Issues Due To Application Changes

SQL Tuning Challenges: Change Causing Problems

• Situation
  – New SQL statements added as part of application patch deployment
  – Database upgrades
  – Database patching
• Response
  – Users: “How will the application perform after the changes?”
  – DBA: “How do I ensure that our SLA remains intact after the changes are rolled out?”
• Challenge
  – How to reduce business risk while absorbing new technologies?


SQL Tuning Advisor

[Diagram: Automatic Tuning Optimizer analyses and the SQL Tuning Advisor recommendations presented to the administrator]
• Statistics Analysis: Gather Missing or Stale Statistics
• SQL Profiling: Create a SQL Profile
• Access Path Analysis: Add Missing Access Structures
• SQL Restructure Analysis: Modify SQL Constructs
• Alternative Plan Analysis: Adopt Alternative Execution Plan (11.2)
• Parallel Query Analysis: Create Parallel SQL Profile (11.2)

Comprehensive SQL Tuning Recommendations

• Analyzes statistics for accuracy
• Recommends SQL Profiles for transparent application tuning
• Suggests access structures and alternate SQL to speed up query execution
• Identifies alternative execution plans using real-time and historical performance data to recover from plan regression
• Recommends appropriate degree of parallelism for best performance


Preventive Performance Management

8. Prevent Regression After Upgrade: SQL Tuning Advisor
9. Ensure Optimal Resource Allocation
10. Prevent Performance Issues Due To Application Changes

Ensure Optimal Resource Allocation

• “In a consolidated environment, how can I ensure one database is not running away with all my system resources?”
• Database Resource Manager directives prevent a single session from running away with all resources
• In DB 12c, CDB-level resource plans ensure optimal resource allocation across PDBs
• Create a resource allocation strategy
• Allocate appropriate CPU and I/O (Exadata) across PDBs


Allocating Resources in DB 12c (NEW)

No resource allocation
• Gives maximum flexibility for each PDB
• Allows any PDB to consume all available resources
• Risky, as one PDB can run away with all resources

Specify a minimum allocation
• Ensures all PDBs get a specific share of the resources
• Allows any PDB to consume any unused resources
• Kicks in at 100% resource utilization
• Assumes that not all PDBs will use their allocated resources

Specify a minimum and maximum
• Ensures all PDBs get a specific share of the resources
• Prevents a PDB from taking more than the maximum value assigned
• May result in unused capacity
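The difference between a minimum-only plan and a minimum-and-maximum plan can be seen in a toy share-based allocator. This is a simplified model under stated assumptions, not Resource Manager's actual algorithm: `allocate`, the share values, the demands, and the caps are all hypothetical.

```python
def allocate(capacity, pdbs):
    """Split capacity across PDBs in proportion to their shares.

    pdbs maps name -> (shares, demand, limit); limit is a hard cap or None.
    Capacity that a PDB cannot use is re-offered to the others.
    Returns (allocation per PDB, unused capacity).
    """
    # A PDB never receives more than it asks for, nor more than its cap.
    need = {n: (min(d, lim) if lim is not None else d)
            for n, (_, d, lim) in pdbs.items()}
    alloc = {n: 0.0 for n in pdbs}
    active = {n for n in pdbs if need[n] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_shares = sum(pdbs[n][0] for n in active)
        # Offer each active PDB its proportional slice of what is left.
        offers = {n: remaining * pdbs[n][0] / total_shares for n in active}
        satisfied = {n for n in active if need[n] - alloc[n] <= offers[n] + 1e-9}
        if not satisfied:
            # Everyone wants more than their slice: hand out slices and stop.
            for n in active:
                alloc[n] += offers[n]
            remaining = 0.0
            break
        # Fully satisfy those PDBs, then redistribute what is left over.
        for n in satisfied:
            remaining -= need[n] - alloc[n]
            alloc[n] = need[n]
        active -= satisfied
    return alloc, remaining


# Minimum only: shares 1:2:1, and "hr" uses less than its share.
shares_only = {"hr": (1, 10, None), "sales": (2, 100, None), "dw": (1, 100, None)}
print(allocate(100, shares_only))

# Minimum and maximum: "sales" is capped at 40 and "dw" only needs 20.
with_caps = {"hr": (1, 10, None), "sales": (2, 100, 40), "dw": (1, 20, None)}
print(allocate(100, with_caps))
```

In the first plan the capacity that "hr" leaves idle flows to the busy PDBs, so nothing is wasted. In the second, the cap holds "sales" at 40 even though 30 units remain unused, which is the trade-off the slide describes as "may result in unused capacity".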

Setting up Resource Manager in Oracle Enterprise Manager

• Extremely simple to manage the CDB resource plans using the Enterprise Manager UI


Preventive Performance Management

8. Prevent Regression After Upgrade: SQL Tuning Advisor
9. Ensure Optimal Resource Allocation: DB Resource Manager
10. Prevent Performance Issues Due To Application Changes

Prevent Performance Issues Due to Application Changes

• “The new BI system has very aggressive SLAs defined. How can we ensure consistent performance across the system?”
• Code migration, new indexes, and new objects can often impact the performance of the application
• How do we validate the performance of critical queries before rolling out these changes?


Validate Impact of Custom Code Migration

[Diagram labels: Trial 1, State 1, Custom Code Changes, Trial 2, State 2]

• Use SPA Guided Workflow (recommended) or PL/SQL APIs
• Create a SQL tuning set of the top X (20 or 30) queries
• Establish the first trial remotely using the current state (baseline)
• Make the change: create the indexes or migrate custom code
• Establish the second trial remotely using the same SQL tuning set
• Review the SPA report and roll out or roll back the changes
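The review step at the end of this workflow amounts to comparing the same set of statements across the two trials. The sketch below illustrates that comparison only; it is not SPA's implementation. `compare_trials`, the SQL names, the elapsed times, and the 10% threshold are all hypothetical.

```python
def compare_trials(trial1, trial2, threshold_pct=10.0):
    """Classify each SQL by its change in elapsed time between two trials.

    trial1 and trial2 map a SQL identifier to elapsed seconds for the same
    SQL tuning set. The 10% threshold is an arbitrary illustrative value.
    """
    report = {}
    for sql_id, before in trial1.items():
        after = trial2[sql_id]
        change_pct = (after - before) / before * 100.0
        if change_pct > threshold_pct:
            verdict = "regressed"
        elif change_pct < -threshold_pct:
            verdict = "improved"
        else:
            verdict = "unchanged"
        report[sql_id] = (round(change_pct, 1), verdict)
    return report


# Invented timings: trial 1 is the baseline, trial 2 is after the change.
trial1 = {"q_orders": 4.0, "q_invoice": 10.0, "q_report": 2.0}
trial2 = {"q_orders": 1.0, "q_invoice": 10.5, "q_report": 3.0}

report = compare_trials(trial1, trial2)
regressed = [sql for sql, (_, verdict) in report.items() if verdict == "regressed"]
decision = "roll back" if regressed else "roll out"
print(report)
print(decision, regressed)
```

Because both trials run the same tuning set, any difference can be attributed to the change made between them. Here one query improves, one is flat, and one regresses by half, so the change would need rework before rollout.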


Take the Guess Work Out!

• Run your trial before and after migrating the change
• Make sure your most important queries have not regressed
• Take the guesswork out


Preventive Performance Management

8. Prevent Regression After Upgrade: SQL Tuning Advisor
9. Ensure Optimal Resource Allocation: DB Resource Manager
10. Prevent Performance Issues Due To Application Changes: SQL Performance Analyzer
