Oracle - MySQL Migration - Percona

Oracle - MySQL Migration
Marco Tusa, MySQL CTL
MySQL Conference, 12 April 2012

Why Pythian
• Recognized Leader:
  • Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
  • Work with over 165 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments
• Expertise:
  • One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 7 Oracle ACEs/ACE Directors
  • Hold 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC
• Global Reach & Scalability:
  • 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response
© 2012 Pythian

Who am I?
• Cluster Technical Leader at Pythian for MySQL technology
• Previously manager of Professional Services, South EMEA, at MySQL/Sun/Oracle
• With MySQL since before the Sun acquisition
• Led the team responsible for Oracle & MySQL database services in support of technical systems at the Food and Agriculture Organization of the United Nations (FAO of UN)
• Led developer & system administrator teams at FAO managing the Intranet/Internet infrastructure
• Worked (a lot) in developing countries (Ethiopia, Senegal, Ghana, Egypt …)
• My profile: http://it.linkedin.com/in/marcotusa
• Email: [email protected] [email protected]

Why MySQL
I like to start from:*
• Scalability and Flexibility
• High Performance
• High Availability
• Robust Transactional Support
• Web and Data Warehouse Strengths
• Strong Data Protection
• Comprehensive Application Development
• Management Ease
• Open Source Freedom and 24 x 7 Support
• Lowest Total Cost of Ownership
*http://www.mysql.com/why-mysql/topreasons.html

Why MySQL? MySQL TCO Savings Calculator (now)*
*From the www.mysql.com TCO calculator

Why MySQL? MySQL TCO Savings Calculator (before)*
*From the www.mysql.com TCO calculator, in earlier times

Why MySQL? All good then?
When should I migrate my environment to MySQL? Cost is not the only aspect to consider:
• Need to use MySQL correctly
• Be aware of existing issues
  • A good list of them from Baron*
• Identify the real effort required for the migration
*http://www.xaprb.com/blog/2009/03/13/50-things-to-know-before-migrating-oracle-to-mysql/

12 things to know about MySQL (1)
1. Subqueries are poorly optimized (optimization expected in 5.6: http://dev.mysql.com/doc/refman/5.6/en/from-clause-subquery-optimization.html).
2. There is limited ability to audit (no user reference unless the general log is active).
3. Authentication is built in. There is no LDAP, Active Directory, or other external authentication capability (a new PAM module is available for 5.5, but only in the Enterprise edition).
4. Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced (replication).
5. Most queries can use only a single index per table; some multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan.
6. Foreign keys are not supported in most storage engines.

12 things to know about MySQL (2)
7. Execution plans are not cached globally, only per connection.
8. There are no integrated or add-on business intelligence, OLAP cube, or similar packages.
9. There are no materialized views (although the Event Scheduler can be used to emulate them).
10. Replication is asynchronous and has many limitations and edge cases.
11. DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered.
12. Each storage engine can have widely varying behavior, features, and properties (positive and negative).

Getting Started?
Prepare a plan, and do not improvise
• Analyze the source (from application to data design)
• Identify show stoppers
• Identify how to map what to what
• Identify how to organize the target
Most important: be ready not to force the migration. If it does not make sense to proceed, STOP!

The Motto
Use the right tool for the job

Most common source cases
• Database is used only to store data; all the logic resides in the application
• Database contains logic such as stored procedures and complex packages
• Database contains data for a data warehouse
• Real-time data and historical records (telephone company)

Define the process
[Process diagram: Analyze, Understand, Match Src/dest, Re/Design, Extract src, Convert, Import, Validate, Test/POC, with a “Something fails” branch. Objects covered: Schemas, Index, Logic, Data, Partition.]

Mitigating risk of failure (analyze)
When analyzing the source database(s), what should be the outcome?
• An easy-to-understand exclusion list
• Identification of the source type (simple data move; data + intelligence; data mart)
• A detailed per-schema review of complexity
• A detailed assessment of the modifications and effort required for database objects
• A detailed assessment of the functions/functionalities used (also in the application)
• An application assessment and review

Mitigating risk of failure (analyze)
Easy to understand excluding list
• Create a rank on the “impedance”
  o Apply it to the analyzed schema, i.e.:

Issue | Workaround | Grade* | Notes
Reference to external schemas in a different instance (db link) | … | 10 | Not portable
Packages | See Writing stored procedures | 9 | Require full recode
Procedures | See Writing stored procedures | 9 | Require full recoding
Unique key longer than 255 characters | See Key length limitations | 4 |
Views alias | Manually added | 4 | Columns alias must be added manually
Sequences | See Migration of Sequences | 3 | Whenever possible convert to autoincrement
Empty schemas | See empty schema definitions | 2 | Convert to User definition

*The lower the grade, the better
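
One way to apply such a ranking to the analyzed schemas is sketched below in Python. The grades are copied from the slide's table; the short issue labels, the schema names and the scoring rule (worst issue first, then the sum) are assumptions made for illustration, not something the talk prescribes.

```python
# Grades copied from the slide's table (lower is better, 10 = not portable).
# The short issue labels are hypothetical shorthand for the table rows.
IMPEDANCE_GRADES = {
    "db link": 10,
    "package": 9,
    "procedure": 9,
    "unique key > 255 chars": 4,
    "view alias": 4,
    "sequence": 3,
    "empty schema": 2,
}


def rank_schemas(schemas):
    """Order schema names from easiest to hardest to migrate.

    A schema is scored by its worst issue first and by the sum of all
    its grades second (an assumed rule, chosen only for this sketch).
    """
    def score(item):
        _, issues = item
        grades = [IMPEDANCE_GRADES[issue] for issue in issues]
        return (max(grades, default=0), sum(grades))

    return [name for name, _ in sorted(schemas.items(), key=score)]


# Hypothetical analysis result: schema name -> issues found in it.
analyzed = {
    "billing": ["package", "sequence"],
    "geo": ["sequence", "view alias"],
    "legacy": ["db link"],
}
migration_order = rank_schemas(analyzed)
```

Sorting by the worst issue first keeps a single non-portable object from being hidden by many easy ones, which fits the idea of an exclusion list.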

Mitigating risk of failure (analyze)
1. Identify and understand differences
   - Oracle vs MySQL behavior
   - DDL differences MySQL vs Oracle
   - DML differences
   - Data formatting and encoding
   - Data set dimensions
2. Identify and understand business logic differences
   - Map Oracle functions to MySQL
   - Convert Oracle logic to MySQL (if possible)
3. Realize a Proof of Concept
   - Involve an experienced Oracle DBA
   - Involve an experienced MySQL DBA
   - Involve the developers
   - Use real data
   - Use real traffic

Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL, some basic differences (cont.)
• Oracle is case insensitive in schema object definitions, while MySQL is case sensitive, depending on the platform (remember to set lower_case_table_names)
• Oracle does not provide a DEFAULT value for NOT NULL columns; MySQL does
• Oracle supports milliseconds; MySQL only from 5.6
• Oracle does not apply silent conversion to data types; MySQL does (set sql_mode)
• Oracle's maximum VARCHAR2 dimension is 4,000 bytes; MySQL's VARCHAR maximum is 65,535

Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL, some basic differences
1. What is what: understanding the naming conventions
   AUTOCOMMIT
   - Enabled by default in MySQL: you can't ROLLBACK
   - Non-transactional storage engines
   - SET AUTOCOMMIT = {0 | 1};
2. Securing the database
   Database authentication/privileges
   - MySQL privileges (local; no roles)
   - Oracle system privileges (local/external; roles)
3. DUAL is not required in MySQL
   - e.g. SELECT 1+1
   but it is provided for Oracle compatibility
   - SELECT 1+1 FROM DUAL
   - SELECT CURRENT_USER() FROM DUAL

Mitigating risk of failure (understand)
Understanding DDL differences
Key length limitations
Oracle handles indexes with a key length of up to about 40% (plus some overhead) of the database block size (db_block_size); this can be a problem when moving to MySQL. MySQL can use at most 767 bytes (InnoDB) or 1,000 bytes (MyISAM) for a primary key or an index. Because one UTF-8 character takes up to 3 bytes in MySQL, a primary key or any other key can be at most 255 characters in InnoDB. The only workaround is for InnoDB: innodb_large_prefix with the Dynamic/Compressed row format.
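
The 255-character figure is plain integer division of the key limit by the bytes per character. The Python sketch below makes that arithmetic explicit so it can be rerun for other engines or character sets. The 767 and 1,000 byte limits come from the slide; the 3,072 byte limit with innodb_large_prefix and the utf8mb4 entry are added here from the MySQL documentation, and the function names are hypothetical.

```python
# Bytes available for one index key. 767 and 1000 are the figures quoted
# in the slide; 3072 is the InnoDB limit with innodb_large_prefix enabled.
INDEX_BYTE_LIMITS = {
    "innodb": 767,
    "myisam": 1000,
    "innodb_large_prefix": 3072,
}

# Maximum bytes per character for a few MySQL character sets.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def max_indexable_chars(engine, charset):
    """Return how many characters of a string column fit in one index key."""
    return INDEX_BYTE_LIMITS[engine] // BYTES_PER_CHAR[charset]


def needs_redesign(column_chars, engine, charset):
    """True when a fully indexed column of that length exceeds the key limit."""
    return column_chars > max_indexable_chars(engine, charset)


innodb_utf8_limit = max_indexable_chars("innodb", "utf8")
```

Running `needs_redesign` over every unique or indexed character column of the source schema gives a quick list of keys that need a prefix index, a different row format, or a redesign.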

Mitigating risk of failure (understand)
Understanding DDL differences
autoincrement/sequence
• Oracle uses sequences, while MySQL is bound to AUTO_INCREMENT
• An AUTO_INCREMENT column must be NOT NULL and indexed (normally as part of the primary key)
• Oracle can retrieve sequence values; MySQL needs to use the function LAST_INSERT_ID()
• LAST_INSERT_ID() is maintained per connection and is thus safe for concurrent use
• Do not use “SELECT MAX(id)+1 FROM tab”
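
Why MAX(id)+1 is unsafe can be shown in a few lines. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL, so that it runs without a server: `cursor.lastrowid` plays the role of LAST_INSERT_ID(), and the table and session names are hypothetical. Two sessions that compute the next key before either of them inserts end up with the same value.

```python
import sqlite3

# SQLite is only a stand-in for MySQL here; cursor.lastrowid plays the
# role that LAST_INSERT_ID() plays on a MySQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO tab (id, name) VALUES (1, 'first')")


def next_id_by_max(connection):
    """Anti-pattern: derive the next key from the current maximum."""
    return connection.execute("SELECT MAX(id) + 1 FROM tab").fetchone()[0]


# Two sessions read the "next" id before either of them inserts.
id_a = next_id_by_max(conn)
id_b = next_id_by_max(conn)

conn.execute("INSERT INTO tab (id, name) VALUES (?, 'session A')", (id_a,))
try:
    conn.execute("INSERT INTO tab (id, name) VALUES (?, 'session B')", (id_b,))
    collided = False
except sqlite3.IntegrityError:
    # Session B tried to reuse the key that session A already took.
    collided = True

# Letting the engine assign the key and reading it back avoids the race.
cur_a = conn.execute("INSERT INTO tab (name) VALUES ('session A2')")
cur_b = conn.execute("INSERT INTO tab (name) VALUES ('session B2')")
generated = (cur_a.lastrowid, cur_b.lastrowid)
```

In MySQL the same pattern is an INSERT that omits the AUTO_INCREMENT column (or passes NULL), followed by SELECT LAST_INSERT_ID() on the same connection.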

Mitigating risk of failure (understand)
Understanding function and trigger differences
Given the relevance of stored procedures and triggers in a migration, they are worth discussing in a little more detail.
Procedure and trigger differences
• One trigger per event in MySQL: all the different actions need to be grouped
• No packages; the workaround is to use a fake schema
• Different behavior by storage engine, and depending on whether it is transactional or not
• Security assignments and security definer/invoker
• Before 5.5, very basic error handling and no “signal”, so version 5.5 is almost mandatory if decent error handling is needed
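
The "one trigger per event" rule means that several Oracle triggers firing on the same timing and event of a table have to be merged into a single MySQL trigger. A minimal Python sketch of that grouping step follows. The trigger names, bodies, table name and the naming scheme for the merged triggers are all hypothetical, and the generated text is only a starting point that still needs manual review.

```python
from collections import defaultdict

# Hypothetical source triggers: (name, timing, event, already converted body).
source_triggers = [
    ("ins_change_id", "BEFORE", "INSERT", "SET NEW.code = UPPER(NEW.code);"),
    ("ins_change_iso", "BEFORE", "INSERT", "SET NEW.iso = LOWER(NEW.iso);"),
    ("upd_population", "BEFORE", "UPDATE", "SET NEW.changed = NOW();"),
]


def merge_triggers(table, triggers):
    """Group trigger bodies by (timing, event) and emit one CREATE TRIGGER each."""
    grouped = defaultdict(list)
    for name, timing, event, body in triggers:
        grouped[(timing, event)].append((name, body))

    statements = []
    for (timing, event), parts in grouped.items():
        # Naming scheme table_timing_event is an assumption of this sketch.
        dest_name = f"{table}_{timing}_{event}".lower()
        lines = [
            f"CREATE TRIGGER {dest_name} {timing} {event} ON {table}",
            "FOR EACH ROW",
            "BEGIN",
        ]
        for name, body in parts:
            lines.append(f"  -- from {name}")
            lines.append(f"  {body}")
        lines.append("END")
        statements.append("\n".join(lines))
    return statements


merged = merge_triggers("city", source_triggers)
```

Keeping a comment with the original trigger name inside the merged body makes it easier to trace each action back to the source schema during validation.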

Mitigating risk of failure (understand)
Understanding function and trigger differences
MySQL stored programs can often add to application functionality and developer efficiency, and there are certainly many cases where the use of a procedural language such as the MySQL stored program language can do things that a non-procedural language like SQL cannot.
There are also a number of reasons why a MySQL stored program approach may offer performance improvements over a traditional SQL approach:
• It provides a procedural approach (SQL is a declarative, non-procedural language)
• It reduces client-server traffic
• It allows us to divide and conquer complex statements
But…

Mitigating risk of failure (understand)
Understanding function and trigger differences
One graph tells more than 1,000 words:
[graph]

Mitigating risk of failure (understand)
Understanding function and trigger differences
IF and CASE statements
When constructing IF and CASE statements, try to minimize the number of comparisons that these statements are likely to make by testing for the most likely scenarios first.
For instance, in the code in the next slide, the first statement maintains counts of various percentages. Assuming that the input data is evenly distributed, the first IF condition (percentage>95) will match about once in every 20 executions. On the other hand, the final condition will match in three out of four executions. This means that for 75% of the cases, every comparison in the chain has to be evaluated.

Mitigating risk of failure (understand)
Understanding function and trigger differences
Non-optimized
  IF (percentage>95) THEN
    SET Above95=Above95+1;
  ELSEIF (percentage>=90) THEN
    SET Range90to95=Range90to95+1;
  ELSEIF (percentage>=75) THEN
    SET Range75to89=Range75to89+1;
  ELSE
    SET LessThan75=LessThan75+1;
  END IF;

Optimized
  IF (percentage<75) THEN
    SET LessThan75=LessThan75+1;
  ELSEIF (percentage>=75 AND percentage<90) THEN
    SET Range75to89=Range75to89+1;
  ELSEIF (percentage>=90 AND percentage<=95) THEN
    SET Range90to95=Range90to95+1;
  ELSE
    SET Above95=Above95+1;
  END IF;

[…]

• […] --> LIMIT
• SEQ.CURRVAL --> LAST_INSERT_ID()
• SEQ.NEXTVAL --> NULL
• No DUAL necessary (SELECT NOW())
• No DECODE() --> IF() CASE()
• JOIN (+) syntax --> INNER|OUTER LEFT|RIGHT
• No hierarchical queries (CONNECT BY PRIOR)
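
A list of mappings like this one can double as a checklist when assessing application SQL. The Python sketch below scans a statement for the Oracle constructs named in the slide and reports the suggested MySQL direction. The regular expressions and the function name are assumptions of this sketch; it is a rough text scan, not a parser, so it can miss constructs or flag text inside string literals.

```python
import re

# (pattern, note) pairs; the notes paraphrase the mappings listed in the slide.
ORACLE_CONSTRUCTS = [
    (re.compile(r"\.CURRVAL\b", re.I), "SEQ.CURRVAL: use LAST_INSERT_ID()"),
    (re.compile(r"\.NEXTVAL\b", re.I), "SEQ.NEXTVAL: insert NULL into an AUTO_INCREMENT column"),
    (re.compile(r"\bFROM\s+DUAL\b", re.I), "FROM DUAL: not needed in MySQL"),
    (re.compile(r"\bDECODE\s*\(", re.I), "DECODE(): rewrite with IF() or CASE"),
    (re.compile(r"\(\+\)"), "(+) join: rewrite with LEFT/RIGHT JOIN syntax"),
    (re.compile(r"\bCONNECT\s+BY\b", re.I), "CONNECT BY: no hierarchical queries"),
]


def find_oracle_constructs(sql):
    """Return the notes for every listed Oracle construct found in sql."""
    return [note for pattern, note in ORACLE_CONSTRUCTS if pattern.search(sql)]


# Hypothetical statement taken from an application under assessment.
sample = (
    "SELECT DECODE(status, 1, 'on', 'off'), emp_seq.NEXTVAL "
    "FROM emp e, dept d WHERE e.dept_id = d.id(+)"
)
findings = find_oracle_constructs(sample)
```

Counting the findings per schema or per application module gives a first, coarse input for the effort estimates in the migration document.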

Mitigating risk of failure (convert)
Data export & index redesign
Re-organize the schema/table, do not just convert data types
• Storage engines
• Index full redesign
• Data organization
  • Sharding
  • Partition
• Logic rewrite
  • Inside MySQL
  • Move to application

Mitigating risk of failure (POC)
Realize a Proof of Concept
• Don’t work alone
• Involve an experienced Oracle DBA
• Involve an experienced MySQL DBA
• Involve the developers
• Use real data
• Use real traffic
• Take one source for each type; start with the easy one
Go back to the analysis phase if you have to

What should my migration doc contain?
General document
• Description of the main differences between platforms
• Description of the workarounds found
• Explanation of what to do to avoid the most common issues
• Code-writing instructions
• Common function mapping
• List of the blocking issue(s)
• List and explanation of what cannot be migrated and why

What should my migration doc contain?
Per schema document
• Overview of the effort for the migration
Schema Name: Test

Objects | Number | Time (min) | hrs | Cost (0,50 cent/min)
Table | 200 | 320 | 5,3 | 2,65
Views | 50 | 5 | 0,08 | 0,04
Procedure | 500 | 5000 | 83,33 | 41,67
Function | 12 | 200 | 3,33 | 1,67
Trigger | 200 | 2500 | 41,67 | 20,83
Package | 3 | 5 | 0,08 | 0,04
Total Time | | 8030 | 133,80 | 66,90
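
The per-schema effort overview is simple arithmetic on the estimated minutes per object type, so it is easy to regenerate whenever an estimate changes. The Python sketch below uses the minutes from the slide's sample. The slide's Cost column works out to half of the hours figure, so the rate is passed per hour here; that reading is an assumption, since the unit in the column header says otherwise. Totals computed from the raw minutes differ slightly from the slide's rounded totals.

```python
# Conversion minutes per object type, as listed in the slide's sample table.
effort_minutes = {
    "Table": 320,
    "Views": 5,
    "Procedure": 5000,
    "Function": 200,
    "Trigger": 2500,
    "Package": 5,
}


def summarize(minutes_by_object, rate_per_hour):
    """Return {object: (minutes, hours, cost)} plus a 'Total' row."""
    rows = {}
    for obj, minutes in minutes_by_object.items():
        hours = minutes / 60
        rows[obj] = (minutes, round(hours, 2), round(hours * rate_per_hour, 2))
    total_minutes = sum(minutes_by_object.values())
    total_hours = total_minutes / 60
    rows["Total"] = (
        total_minutes,
        round(total_hours, 2),
        round(total_hours * rate_per_hour, 2),
    )
    return rows


# rate_per_hour=0.5 reproduces the per-object Cost values of the sample.
overview = summarize(effort_minutes, rate_per_hour=0.5)
```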

What should my migration doc contain?
Per schema document
• Effort per table, like:
Schema Name: Test
Table: City
Rows: 2000
Estimated min: 10

Attribute | Data type source | dim source | Data type dest | dim dest
Name | VARCHAR2 | 50 | varchar | 50
lat | FLOAT | | FLOAT |
long | FLOAT | | FLOAT |
population | Number | 10,0 | INT | 10
SqKm | Number | 7,0 | MEDIUMINT |
Country | CHAR | 3 | CHAR | 3
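
When mapping Oracle NUMBER(p,0) columns to MySQL integer types, the declared precision decides which type is always safe. The Python sketch below picks the smallest MySQL integer type that can hold every value with the given number of digits; the function name is hypothetical. It is deliberately conservative: the slide's sample maps NUMBER(7,0) to MEDIUMINT and NUMBER(10,0) to INT, which holds when the actual data stay inside those ranges, so checking the real maximum values in the source is still needed.

```python
# Largest value each MySQL integer type can hold: (name, signed, unsigned).
INTEGER_TYPES = [
    ("TINYINT", 127, 255),
    ("SMALLINT", 32767, 65535),
    ("MEDIUMINT", 8388607, 16777215),
    ("INT", 2147483647, 4294967295),
    ("BIGINT", 9223372036854775807, 18446744073709551615),
]


def mysql_integer_type(precision, unsigned=False):
    """Smallest MySQL type that holds every NUMBER(precision,0) value."""
    largest = 10 ** precision - 1  # e.g. precision 7 -> 9,999,999
    for name, signed_max, unsigned_max in INTEGER_TYPES:
        limit = unsigned_max if unsigned else signed_max
        if largest <= limit:
            return name + (" UNSIGNED" if unsigned else "")
    # Too many digits for any integer type: fall back to an exact decimal.
    return f"DECIMAL({precision},0)"


sqkm_type = mysql_integer_type(7)
population_type = mysql_integer_type(10)
```

Here `sqkm_type` and `population_type` come out one size larger than in the slide's sample, because a signed MEDIUMINT stops at 8,388,607 and a signed INT at 2,147,483,647.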

What should my migration doc contain?
Trigger section
• Effort per schema
Schema Name: Test

Events | Before: Number | Before: Time (min) | After: Number | After: Time (min)
Insert | 50 | 600 | 20 | 300
Update | 50 | 600 | 20 | 300
Delete | 50 | 600 | 10 | 100
Total | 150 | 1800 | 50 | 700
Total Packages: 3

• Effort per table
Schema Name: Test
Columns: Trigger name source | Insert* | Update* | Delete* | Trigger name dest
Before: Ins_change_ID and Ins_change_ISO (dest: Ins_actions), upd_population (dest: upd_population)
After: del_died_male (dest: del_died_male), avr_pop_calculation (dest: avr_pop_calculation)
*time in minutes for conversion

What should my migration doc contain?
Procedure - function section
• Effort per schema
Schema Name: Test

Package | Number | Time (min) | Cost (0,50 cent/min)
Pack1 | 112 | 1200 |
Pack2 | 200 | 2000 |
Pack3 | 200 | 2000 |
Total | 521 | 5200 |
Total Packages: 3

• Effort per table
Schema Name: Test

Procedure name | Code rows | Impedance** | Time* | Package | Comments
Proc_1 | 200 | 4 | 480 | Pckg1 | Complex error handling
Func_1 | 50 | 0 | 120 | Pckg2 | No problem
Proc_2 | 300 | 10 | 600 | Pckg1 | Use of CONNECT BY PRIOR
Total: 3

*time in minutes for conversion
**The lower the better; 10 means not portable

What should my migration doc contain?
Document from the Proof of Concept, per source type
• Expected results
• Real values from the test
• Issues found
• Workarounds identified
• Time/effort per schema
  • Breakdown per object (Table, View, Trigger, SP)
• Redefine expectations
• Review efforts and costs
• Be ready to drop something from the migration list

Should I do all this manually?
No! There are tools on the market, but:
• Choose your product carefully!
• Better a simple one than something too complex
• Always double check before applying
• Nothing will replace human/professional knowledge/experience

Thank you and Q&A To contact us… [email protected] 1-877-PYTHIAN

To follow us… http://www.pythian.com/news/ http://www.facebook.com/pages/The-Pythian-Group/163902527671 @pythian @pythianjobs http://www.linkedin.com/company/pythian
