Oracle - MySQL Migration
Marco Tusa, MySQL CTL
MySQL Conference, 12 April 2012
Why Pythian
• Recognized Leader:
  o Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
  o Work with over 165 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments
• Expertise:
  o One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 7 Oracle ACEs/ACE Directors
  o Hold 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC
• Global Reach & Scalability:
  o 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response
© 2012 Pythian
Who am I?
• Cluster Technical Leader at Pythian for MySQL technology
• Previously manager of Professional Services, South EMEA, at MySQL/SUN/Oracle
• In MySQL before the SUN gets on us
• Led the team responsible for the Oracle & MySQL database service supporting technical systems at the Food and Agriculture Organization of the United Nations (FAO of UN)
• Led developer & system administrator teams in FAO managing the Intranet/Internet infrastructure
• Worked (a lot) in developing countries (Ethiopia, Senegal, Ghana, Egypt …)
• My profile: http://it.linkedin.com/in/marcotusa
• Email: [email protected] [email protected]
Why MySQL
I like to start from:*
• Scalability and Flexibility
• High Performance
• High Availability
• Robust Transactional Support
• Web and Data Warehouse Strengths
• Strong Data Protection
• Comprehensive Application Development
• Management Ease
• Open Source Freedom and 24 x 7 Support
• Lowest Total Cost of Ownership
*http://www.mysql.com/why-mysql/topreasons.html
Why MySQL? MySQL TCO Savings Calculator (now)*
*From www.mysql.com TCO calculator
Why MySQL? MySQL TCO Savings Calculator (before)
*From www.mysql.com TCO calculator ancient time
Why MySQL? All good then?
When should I migrate my environment to MySQL? Cost is not the only aspect to consider:
• Need to use MySQL correctly
• Be aware of existing issues
  o a good list of them from Baron*
• Identify the real effort required for the migration
*http://www.xaprb.com/blog/2009/03/13/50-things-to-know-before-migrating-oracle-to-mysql/
12 things to know about MySQL (1)
1 Subqueries are poorly optimized (optimization expected in 5.6, http://dev.mysql.com/doc/refman/5.6/en/from-clause-subquery-optimization.html).
2 There is limited ability to audit (no user reference unless the general log is active).
3 Authentication is built-in. There is no LDAP, Active Directory, or other external authentication capability. (A new PAM module is available for 5.5, but only in the Enterprise edition.)
4 Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced (replication).
5 Most queries can use only a single index per table; some multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan.
6 Foreign keys are not supported in most storage engines.
12 things to know about MySQL (2)
7 Execution plans are not cached globally, only per connection.
8 There are no integrated or add-on business intelligence, OLAP cube, etc. packages.
9 There are no materialized views (although the Event Scheduler can be used as a workaround).
10 Replication is asynchronous and has many limitations and edge cases.
11 DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered.
12 Each storage engine can have widely varying behavior, features, and properties (positive and negative).
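Point 9 mentions the Event Scheduler as a stand-in for materialized views. A minimal sketch of that idea follows; the tables `orders` and `mv_daily_sales` and the event name are hypothetical and exist only for illustration.

```sql
-- Hypothetical source table, used only for illustration.
CREATE TABLE orders (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  order_date DATE NOT NULL,
  amount     DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Summary table that plays the role of the materialized view.
CREATE TABLE mv_daily_sales (
  order_date   DATE NOT NULL,
  total_amount DECIMAL(14,2) NOT NULL,
  order_count  INT UNSIGNED NOT NULL,
  PRIMARY KEY (order_date)
) ENGINE=InnoDB;

-- The scheduler must be running (needs the SUPER privilege).
SET GLOBAL event_scheduler = ON;

-- Refresh the summary every 10 minutes.
-- REPLACE overwrites the existing row for each order_date.
CREATE EVENT ev_refresh_mv_daily_sales
  ON SCHEDULE EVERY 10 MINUTE
  DO
    REPLACE INTO mv_daily_sales (order_date, total_amount, order_count)
    SELECT order_date, SUM(amount), COUNT(*)
    FROM orders
    GROUP BY order_date;
```

This is a full refresh, so it is only reasonable for small aggregates. Rows for dates that no longer exist in `orders` are not removed by this sketch, and the summary can be up to one interval out of date.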
Getting Started?
Prepare a plan, and do not improvise
• Analyze the source (from application to data design)
• Identify show stoppers
• Identify how to map what to what
• Identify how to organize the target
Most important: be ready not to force the migration. If it does not make sense to proceed, STOP!
The Motto Use the right tool for the job
Most common source cases
• Database is used only to store data; all the logic resides in the application
• Database contains logic such as stored procedures and complex packages
• Database contains data for a data warehouse
• Real-time data and historical records (telephone company)
Define the process
[Process diagram. Steps: Analyze, Understand, Match Src/dest, Re/Design, Extract src, Convert, Import, Validate, Test/POC, with a loop back when something fails. Objects covered: schemas, indexes, logic, data, partitions.]
Mitigating risk of failure (analyze)
When analyzing the source database(s), what should be the outcome?
• An easy-to-understand exclusion list
• Identification of the source type (simple data move; data + intelligence; data mart)
• A detailed review of complexity per schema
• A detailed assessment of the modifications and effort required for database objects
• A detailed assessment of the functions/functionalities used (also in the application)
• Application assessment and review
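As a starting point for the per-schema review, the Oracle data dictionary can provide a rough inventory. The queries below are a sketch to run on the Oracle source; `'TEST'` is a placeholder schema name, and the result only seeds the assessment, it does not replace reading the code.

```sql
-- Run on the Oracle source. 'TEST' is a placeholder schema name.

-- How many objects of each type does the schema contain?
SELECT object_type, COUNT(*) AS object_count
FROM all_objects
WHERE owner = 'TEST'
GROUP BY object_type
ORDER BY object_type;

-- Lines of PL/SQL per stored object (procedures, functions, packages, triggers).
SELECT type, name, COUNT(*) AS code_rows
FROM all_source
WHERE owner = 'TEST'
GROUP BY type, name
ORDER BY code_rows DESC;

-- Database links visible to the current user (candidates for the exclusion list).
SELECT owner, db_link, host
FROM all_db_links;
```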
Mitigating risk of failure (analyze)
Easy-to-understand exclusion list
• Create a rank on the “impedance”
  o Apply it to the analyzed schema, i.e.:
Issue | Workaround | Grade* | Notes
Reference to external schemas in a different instance (db link) | … | 10 | Not portable
Packages | See Writing stored procedures | 9 | Require full recoding
Procedures | See Writing stored procedures | 9 | Require full recoding
Unique key longer than 255 characters | See Key length limitations | 4 |
Views alias | Manually added | 4 | Column aliases must be added manually
Sequences | See Migration of Sequences | 3 | Whenever possible convert to autoincrement
Empty schemas | See empty schema definitions | 2 | Convert to User definition
*The lower the grade, the better
Mitigating risk of failure (analyze)
1. Identify and understand differences
   - Oracle vs MySQL behavior
   - DDL differences MySQL – Oracle
   - DML differences
   - Data formatting and encoding
   - Data set dimensions
2. Identify and understand business logic differences
   - map Oracle functions to MySQL
   - convert Oracle logic to MySQL (if possible)
3. Realize a Proof of Concept
   - involve an experienced Oracle DBA
   - involve an experienced MySQL DBA
   - involve the developers
   - use real data
   - use real traffic
Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL, some basic differences (cont.)
• Oracle is case insensitive in schema object names, while MySQL can be case sensitive depending on the platform (remember to set lower_case_table_names)
• Oracle does not provide an implicit DEFAULT value for NOT NULL columns; MySQL does
• Oracle supports fractional seconds (milliseconds); MySQL only from 5.6
• Oracle does not silently convert data types; MySQL does (set sql_mode)
• Oracle's maximum VARCHAR2 size is 4,000 bytes; MySQL's VARCHAR allows up to 65,535 bytes
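The silent conversion point is easy to see on a scratch table. The sketch below uses a hypothetical table `t_demo`; it shows the same INSERT being adjusted silently in non-strict mode and rejected in strict mode.

```sql
-- Hypothetical scratch table.
CREATE TABLE t_demo (
  code CHAR(3) NOT NULL,
  qty  TINYINT NOT NULL
) ENGINE=InnoDB;

-- Non-strict mode: out-of-range values are adjusted and only warnings are raised.
SET SESSION sql_mode = '';
INSERT INTO t_demo (code, qty) VALUES ('ABCDE', 300);
SHOW WARNINGS;          -- reports the truncation and the out-of-range value
SELECT * FROM t_demo;   -- code was cut to 3 characters, qty clipped to the TINYINT maximum

-- Strict mode: the same statement now fails with an error, as Oracle users expect.
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO t_demo (code, qty) VALUES ('ABCDE', 300);

-- lower_case_table_names cannot be changed at runtime; it belongs in the
-- [mysqld] section of the server configuration file.
```

Setting a strict sql_mode server-wide before the migration tests start tends to surface data problems early, rather than after the import.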
Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL, some basic differences
1. What is what, understanding the naming conventions
   AUTO COMMIT
   - Enabled by default in MySQL
   - you can't ROLLBACK
   - Non Transactional Storage Engines
   - SET AUTOCOMMIT = {0 | 1};
2. Securing the database
   Database Authentication/Privileges
   - MySQL Privileges (local; no roles)
   - Oracle System Privileges (local/external; roles)
3. DUAL is not required in MySQL
   - e.g. SELECT 1+1
   but it is provided for Oracle compatibility
   - SELECT 1+1 FROM DUAL
   - SELECT CURRENT_USER() FROM DUAL
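A short sketch of the autocommit and DUAL points, using a hypothetical `accounts` table on InnoDB. With a non-transactional engine such as MyISAM, the ROLLBACK below would not undo the change.

```sql
-- Hypothetical table on a transactional engine.
CREATE TABLE accounts (
  id      INT UNSIGNED NOT NULL PRIMARY KEY,
  balance DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- With the default autocommit=1 this row is committed immediately.
INSERT INTO accounts (id, balance) VALUES (1, 100.00);

-- Oracle-like behaviour for this session: statements stay open until COMMIT/ROLLBACK.
SET autocommit = 0;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
ROLLBACK;                                    -- the UPDATE is undone
SELECT balance FROM accounts WHERE id = 1;   -- 100.00

SET autocommit = 1;

-- DUAL is optional in MySQL; both forms are accepted.
SELECT 1 + 1;
SELECT 1 + 1 FROM DUAL;
```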
Mitigating risk of failure (understand)
Understanding DDL differences
Key length limitations
Oracle handles index keys up to about 40% of the database block size (db_block_size), plus some overhead; this can be a problem with MySQL. MySQL limits an index key to 767 bytes per column (InnoDB) or 1000 bytes per key (MyISAM). Because a character in MySQL's utf8 takes up to 3 bytes, an indexed column in a primary key or any other InnoDB key can be at most 255 characters. Workaround for InnoDB only: innodb_large_prefix with the Dynamic/Compressed row format.
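The workaround can be sketched as follows for a MySQL 5.5/5.6 server. The table and index names are hypothetical, and the three server variables apply to those versions (later releases changed the defaults).

```sql
-- Server-wide prerequisites on MySQL 5.5/5.6 (need the SUPER privilege).
SET GLOBAL innodb_file_format    = 'Barracuda';
SET GLOBAL innodb_file_per_table = ON;
SET GLOBAL innodb_large_prefix   = ON;

-- Hypothetical table: 500 utf8 characters * 3 bytes = 1500 bytes,
-- above the 767-byte limit but allowed with large prefix + DYNAMIC.
CREATE TABLE long_keys (
  id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
  url VARCHAR(500) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uk_url (url)
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;

-- Alternative without large prefix: index only the first 255 characters (765 bytes).
-- A prefix index speeds up lookups but cannot enforce uniqueness of the full value.
CREATE TABLE long_keys_prefix (
  id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
  url VARCHAR(500) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (id),
  KEY idx_url (url(255))
) ENGINE=InnoDB;
```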
Mitigating risk of failure (understand)
Understanding DDL differences
autoincrement/sequence
• Oracle uses sequences, while MySQL is bound to AUTO_INCREMENT
• An AUTO_INCREMENT column must be NOT NULL and indexed, typically as (part of) the primary key
• Oracle can retrieve sequence values directly; MySQL needs the function LAST_INSERT_ID()
• LAST_INSERT_ID() is maintained per connection and is thus safe for concurrent use
• Do not use “SELECT MAX(id)+1 FROM tab”
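A minimal sketch of replacing an Oracle sequence with AUTO_INCREMENT and LAST_INSERT_ID(). The tables `customers` and `customer_orders` and the sequence name in the comments are hypothetical.

```sql
-- Hypothetical parent and child tables.
CREATE TABLE customers (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(50) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE customer_orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  KEY idx_customer (customer_id)
) ENGINE=InnoDB;

-- Oracle: INSERT INTO customers VALUES (customers_seq.NEXTVAL, 'Acme');
-- MySQL:  pass NULL (or omit the column) and the next value is generated.
INSERT INTO customers (id, name) VALUES (NULL, 'Acme');

-- Oracle: customers_seq.CURRVAL
-- MySQL:  LAST_INSERT_ID(), scoped to this connection.
-- Save it first, because the next AUTO_INCREMENT insert changes its value.
SET @customer_id = LAST_INSERT_ID();

INSERT INTO customer_orders (customer_id) VALUES (@customer_id);
```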
Mitigating risk of failure (understand)
Understanding function and trigger differences
Given the relevance of stored procedures and triggers in a migration, it is worth discussing them in a little more detail.
Procedure and trigger differences
• One trigger per event (and timing) on a table in MySQL; all the different actions need to be grouped
• No packages; workaround using a fake schema
• Different behavior depending on the storage engine and on whether it is transactional
• Security assignments and security definer/invoker
• Before 5.5, very basic error handling and no “SIGNAL”, so version 5.5 is almost mandatory if decent error handling is needed
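The first and last points can be illustrated together: several Oracle insert triggers are collapsed into one MySQL trigger, and SIGNAL (5.5 or later) is used to reject bad rows. The table, the trigger name and the two actions are hypothetical; DELIMITER is a command of the mysql command-line client, not a server statement.

```sql
-- Hypothetical table.
CREATE TABLE city (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name       VARCHAR(50) NOT NULL,
  iso        CHAR(3) NOT NULL,
  population INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

DELIMITER $$

-- One BEFORE INSERT trigger holds every action that was a separate trigger in Oracle.
CREATE TRIGGER ins_actions
BEFORE INSERT ON city
FOR EACH ROW
BEGIN
  -- Action 1: normalize the ISO code.
  SET NEW.iso = UPPER(NEW.iso);

  -- Action 2: validation; SIGNAL aborts the INSERT with a custom error (MySQL 5.5+).
  IF NEW.population < 0 THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'population cannot be negative';
  END IF;
END$$

DELIMITER ;
```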
Mitigating risk of failure (understand)
Understanding function and trigger differences
MySQL stored programs can often add to application functionality and developer efficiency, and there are certainly many cases where the use of a procedural language such as the MySQL stored program language can do things that a non-procedural language like SQL cannot.
There are also a number of reasons why a MySQL stored program approach may offer performance improvements over a traditional SQL approach:
• It provides a procedural approach (SQL is a declarative, non-procedural language)
• It reduces client-server traffic
• It allows us to divide and conquer complex statements
But…
Mitigating risk of failure (understand)
Understanding function and trigger differences
One graph tells more than 1,000 words:
[Graph not available in this transcript]
Mitigating risk of failure (understand)
Understanding function and trigger differences
IF and CASE Statements
When constructing IF and CASE statements, try to minimize the number of comparisons that these statements are likely to make by testing for the most likely scenarios first.
For instance, in the code in the next slide, the first statement maintains counts of various percentages. Assuming that the input data is evenly distributed, the first IF condition (percentage>95) will match about once in every 20 executions. On the other hand, the final condition will match in three out of four executions. This means that in 75% of the cases every comparison has to be evaluated before the matching branch is reached.
Mitigating risk of failure (understand)
Understanding function and trigger differences
Non Optimized
    IF (percentage>95) THEN
        SET Above95=Above95+1;
    ELSEIF (percentage >=90) THEN
        SET Range90to95=Range90to95+1;
    ELSEIF (percentage >=75) THEN
        SET Range75to89=Range75to89+1;
    ELSE
        SET LessThan75=LessThan75+1;
    END IF;
Optimized
    IF (percentage<75) THEN
        SET LessThan75=LessThan75+1;
    ELSEIF (percentage >=75 AND percentage<90) THEN
        SET Range75to89=Range75to89+1;
    ELSEIF (percentage >=90 and percentage <=95) THEN
        SET Range90to95=Range90to95+1;
    ELSE
        SET Above95=Above95+1;
    END IF;
…
• ROWNUM --> LIMIT
• SEQ.CURRVAL --> LAST_INSERT_ID()
• SEQ.NEXTVAL --> NULL
• No DUAL necessary (SELECT NOW())
• No DECODE() --> IF(), CASE
• JOIN (+) syntax --> INNER|OUTER LEFT|RIGHT JOIN
• No hierarchical queries (CONNECT BY PRIOR)
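Some of the syntax mappings listed above can be sketched side by side. The `emp` and `dept` tables are hypothetical, and the Oracle form appears only as a comment above each MySQL statement.

```sql
-- Hypothetical tables used by the examples.
CREATE TABLE dept (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(30) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE emp (
  id      INT NOT NULL PRIMARY KEY,
  name    VARCHAR(30) NOT NULL,
  status  CHAR(1) NOT NULL,
  dept_id INT NULL
) ENGINE=InnoDB;

-- Oracle: SELECT * FROM (SELECT * FROM emp ORDER BY name) WHERE ROWNUM <= 10;
SELECT * FROM emp ORDER BY name LIMIT 10;

-- Oracle: SELECT DECODE(status, 'A', 'Active', 'I', 'Inactive', 'Unknown') FROM emp;
SELECT CASE status
         WHEN 'A' THEN 'Active'
         WHEN 'I' THEN 'Inactive'
         ELSE 'Unknown'
       END AS status_label
FROM emp;

-- Oracle: SELECT e.name, d.name FROM emp e, dept d WHERE e.dept_id = d.id(+);
SELECT e.name, d.name AS dept_name
FROM emp e
LEFT OUTER JOIN dept d ON e.dept_id = d.id;

-- Oracle: SELECT SYSDATE FROM DUAL;
SELECT NOW();
```

Hierarchical queries (CONNECT BY PRIOR) have no direct counterpart in the MySQL versions discussed here, so they usually end up rewritten in a stored procedure or in the application.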
Mitigating risk of failure (convert)
Data export & Index redesign
Re-organize the schema/table, do not just convert data types
• Storage engines
• Index full redesign
• Data organization
  o Sharding
  o Partition
• Logic rewrite
  o Inside MySQL
  o Move to application
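As one illustration of reorganizing rather than merely converting, a table of historical records could be range-partitioned by month. This is a sketch; the table `call_records`, its columns and the partition names are hypothetical.

```sql
-- Hypothetical table of historical call records, partitioned by month.
-- In MySQL the partitioning column must be part of every unique key,
-- which is why call_time is included in the primary key.
CREATE TABLE call_records (
  id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  call_time    DATETIME NOT NULL,
  caller       VARCHAR(32) NOT NULL,
  duration_sec INT UNSIGNED NOT NULL,
  PRIMARY KEY (id, call_time),
  KEY idx_caller_time (caller, call_time)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(call_time)) (
  PARTITION p2012_03 VALUES LESS THAN (TO_DAYS('2012-04-01')),
  PARTITION p2012_04 VALUES LESS THAN (TO_DAYS('2012-05-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Purging a month of history becomes a metadata operation instead of a large DELETE.
ALTER TABLE call_records DROP PARTITION p2012_03;
```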
Mitigating risk of failure (POC)
Realize a Proof of Concept
• Don’t work alone
  o Involve an experienced Oracle DBA
  o Involve an experienced MySQL DBA
  o Involve the developers
• Use real data
• Use real traffic
• Take one source for each type; start with the easy one
Go back to the analysis phase if you have to
What should my migration doc contain?
General document
• Description of the main differences between platforms
• Description of the workarounds found
• Explanation of what to do to avoid the most common issues
• Code writing instructions
• Common function mapping
• List of the blocking issue(s)
• List and explanation of what cannot be migrated and why
What should my migration doc contain?
Per schema document
• Overview of the effort for the migration
Schema Name: Test
Objects | Number | Time(min) | hrs | Cost(0,50 cent/min)
Table | 200 | 320 | 5,3 | 2,65
Views | 50 | 5 | 0,08 | 0,04
Procedure | 500 | 5000 | 83,33 | 41,67
Function | 12 | 200 | 3,33 | 1,67
Trigger | 200 | 2500 | 41,67 | 20,83
Package | 3 | 5 | 0,08 | 0,04
Total Time | | 8030 | 133,80 | 66,90
What should my migration doc contain?
Per schema document
• Effort per table, like:
Schema Name: Test
Table: City
Rows: 2000
Estimated min: 10
Attribute | Data type source | dim source | Data type dest | dim dest
Name | VARCHAR2 | 50 | varchar | 50
lat | FLOAT | | FLOAT |
long | FLOAT | | FLOAT |
population | Number | 10,0 | INT | 10
SqKm | Number | 7,0 | MEDIUMINT |
Country | CHAR | 3 | CHAR | 3
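The City mapping above could translate into DDL along these lines. This is a sketch that follows the destination types in the table; nullability, keys, engine and character set are assumptions, since the slide does not state them.

```sql
-- Sketch of the destination table, following the type mapping in the slide.
CREATE TABLE City (
  Name       VARCHAR(50),
  lat        FLOAT,
  `long`     FLOAT,       -- LONG is a reserved word in MySQL, so the name must be quoted
  population INT,         -- signed INT tops out at 2,147,483,647; NUMBER(10,0) can hold more
  SqKm       MEDIUMINT,   -- signed MEDIUMINT tops out at 8,388,607; NUMBER(7,0) can hold more
  Country    CHAR(3)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```

The two range comments are the kind of detail the proof of concept should confirm against real data: if the source actually holds values near the top of the Oracle precision, BIGINT or INT would be the safer destination types.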
What should my migration doc contain?
Trigger section
• Effort per schema
Schema Name: Test
Events | Before: Number | Before: Time(min) | After: Number | After: Time(min)
Insert | 50 | 600 | 20 | 300
Update | 50 | 600 | 20 | 300
Delete | 50 | 600 | 10 | 100
Total | 150 | 1800 | 50 | 700
Total Packages: 3
• Effort per table
Schema Name: Test
Columns: Trigger name source | Insert* | Update* | Delete* | Trigger name dest, plus total action time
Source triggers (Before): Ins_change_ID, Ins_change_ISO, upd_population
Source triggers (After): del_died_male, avr_pop_calculation
Destination triggers: Ins_actions, upd_population, del_died_male, avr_pop_calculation
Values as extracted (cell positions lost): 24, 20, 10, 30, 15, 15, 30, 10, 10
*time in minutes for conversion
What should my migration doc contain?
Procedure - function section
• Effort per schema
Schema Name: Test
Package | Number | Time(min) | Cost(0,50 cent/min)
Pack1 | 112 | 1200 |
Pack2 | 200 | 2000 |
Pack3 | 200 | 2000 |
Total | 521 | 5200 |
Total Packages: 3
• Effort per table
Schema Name: Test
Total: 3
Procedure name | Code rows | Impedance** | Time* | Package | Comments
Proc_1 | 200 | 4 | 480 | Pckg1 | Complex error handling
Func_1 | 50 | 0 | 120 | Pckg2 | No problem
Proc_2 | 300 | 10 | 600 | Pckg1 | Use of connect by prior
*time in minutes for conversion
**The lower the better; 10 means not portable
What should my migration doc contain?
Document from the Proof of Concept per source type
• Expected results
• Real values from test
• Issues found
• Workarounds identified
• Time/effort per schema
  o Breakdown per object (Table, View, Trigger, SP)
• Redefine expectations
• Review efforts and costs
• Be ready to drop something from the migration list
Should I do all this manually?
No! There are tools on the market, but:
• Choose your product carefully!
• Better a simple one than something too complex
• Always double check before applying
• Nothing will replace human/professional knowledge/experience
Thank you and Q&A
To contact us…
[email protected]
1-877-PYTHIAN
To follow us…
http://www.pythian.com/news/
http://www.facebook.com/pages/The-Pythian-Group/163902527671
@pythian
@pythianjobs
http://www.linkedin.com/company/pythian