Microsoft MCSE Business Intelligence Certification Courseware Version 1.1
www.firebrandtraining.com
31/07/2012
Part 1: Business Intelligence MCSE SQL Server 2012 BI Designing Business Intelligence with Microsoft SQL Server 2012
©2007 – Body Temple
EQUIPPING THE ORGANIZATION FOR EFFECTIVE DECISION MAKING
• Effective Decision Making • Keys to Effective Decision Making • Business intelligence
Who Is a Decision Maker? Decision makers exist throughout the organization • Effective plans and policies created at the top can be undone by poor decision making at lower levels • Good decisions made by those at the bottom can be quickly overwhelmed by poor decisions made up the line
Effective decision making at every level leads to success
What is an Effective Decision? Effective decisions are choices that move an organization closer to an agreed-on set of goals in a timely manner. Key ingredients necessary for making effective decisions: • There must be a set of goals to work toward. • There must be a way to measure whether a chosen course is moving toward or away from those goals. • Information based on those measures must be provided to the decision maker in a timely manner.
Goals Goals must be specific and should avoid being vague, as in: • “superior customer satisfaction” how to measure? • “increased profit margin” what costs are impacted? • “better meeting our community’s needs” what needs?
To function as part of effective decision making, a goal must: • Contain a specific target. • Provide a means to measure whether we are progressing toward that target.
Is Your Map Upside-Down? Need a method of navigation to: • Determine whether we are heading towards or away from the goal • Measure the steps by which we are moving towards the goal
Business Intelligence and Microsoft SQL Server 2012 Business intelligence is not simply a set of facts and figures on a printed report. For a report to be an effective business intelligence tool it needs to be: • Formatted in a way that is easily understood • Built around concise summaries of relevant data • Delivered in a timely fashion
SQL Server 2012 Editions for BI (• = capability included)

Capability                 | Standard | Business Intelligence | Enterprise
Basic Corporate BI         |    •     |           •           |     •
Basic Data Integration     |    •     |           •           |     •
Self-Service BI            |          |           •           |     •
Advanced Corporate BI      |          |           •           |     •
Enterprise Data Management |          |           •           |     •
Advanced Data Integration  |          |                       |     •
Data Warehousing           |          |                       |     •
Advanced High Availability |          |                       |     •
Microsoft BI Technologies
• EIM & Data Warehousing: SQL Server Integration Services, SQL Server Database Engine, SQL Server Data Quality Services, SQL Server Master Data Services, and Microsoft SQL Azure with the Windows Azure Marketplace
• Business Intelligence: SQL Server Analysis Services, SQL Server Reporting Services, and Power View
• Microsoft PowerPivot Technologies: Microsoft Excel with the Data Mining Add-In, PowerPivot Add-In, and MDS Add-In
• Reports, KPIs, and Dashboards delivered through Microsoft SharePoint Server
MAKING THE MOST OF WHAT YOU’VE GOT – USING BUSINESS INTELLIGENCE
• What Business intelligence Can Do For You • Business Intelligence at Many Levels • Building the Foundation
When We Know What We Are Looking For Two approaches: • Layout-led Discovery • When we know the questions we want answered and where to find the information needed • The most common form of business intelligence and one we are all familiar with
• Data-led Discovery • We know the question, but we don’t know where to look for the answer • The information we find determines where we want to go next
Discovering New Questions and Their Answers Data may hold answers to questions we have not thought to ask. Data may contain trends, correlations, and dependencies at a level of detail that would be impossible for a human being to notice using either layout-led or data-led discovery • Discovery requires a computer to use data mining techniques • Works at the lowest level of detail • Uses advanced mathematical algorithms
The Top of the Pyramid Decision makers at the upper levels of our organizations must look at the big picture They deal with the long-term policies and direction They need: • Highly summarized measures • Higher Latency
Mid-Level Mid-level decision makers are managing the operation of departments and other working units within the organization They set short-term goals
Do the planning for the functioning of these areas They need: • Summarized measures with drilldown • Some latency acceptable
The Broad Base The forepersons, managers, and group leaders Deal with daily operations Set daily goals Make decisions on resource allocation for the next week, the next day, or the next shift Plan the next sales campaign or the next sales call
They need: • Measures at the detail level • Low latency
SEEKING THE SOURCE – THE SOURCE OF BUSINESS INTELLIGENCE
• Seeking the Source • The Data Mart • Snowflakes, Stars, and Analysis Services
Transactional Data Transactional data is the information stored to track the interactions, or business transactions, carried out by an organization Online transaction processing (OLTP) systems record business interactions as they happen. They support the day-to-day operation of an organization
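To make the idea concrete, here is a minimal sketch of what an OLTP system records as a business transaction happens. The table and column names are illustrative only, not from the courseware:

```sql
-- Hypothetical OLTP transaction: record a sale and its line item atomically.
BEGIN TRANSACTION;

INSERT INTO SalesOrder (OrderID, CustomerID, OrderDate)
VALUES (1001, 42, GETDATE());

INSERT INTO SalesOrderDetail (OrderID, ProductID, Quantity, UnitPrice)
VALUES (1001, 7, 3, 19.99);

COMMIT TRANSACTION;
```

Each such transaction touches only a handful of rows, which is exactly the workload OLTP systems are tuned for.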
Difficulties Using Transactional Data for Business Intelligence OLTP systems are the treasure chests holding the raw data we need to calculate measures and create business intelligence Well-designed OLTP systems are optimized for efficiently processing and storing transactions • Normalized data
BI is concerned with aggregates OLTP systems are usually not good at delivering large aggregates
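A typical BI question posed directly against normalized OLTP tables illustrates the problem. This is a hedged sketch over a hypothetical schema (table names are assumptions, not from the source):

```sql
-- Total sales by product category for a year. On a busy OLTP system this
-- scans and joins large numbers of detail rows across several normalized
-- tables, competing for resources with day-to-day transaction processing.
SELECT c.CategoryName, SUM(d.Quantity * d.UnitPrice) AS TotalSales
FROM SalesOrderDetail AS d
JOIN SalesOrder AS o ON o.OrderID = d.OrderID
JOIN Product AS p    ON p.ProductID = d.ProductID
JOIN Category AS c   ON c.CategoryID = p.CategoryID
WHERE o.OrderDate >= '2011-01-01' AND o.OrderDate < '2012-01-01'
GROUP BY c.CategoryName;
```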
The Data Mart A data mart is a body of historical data in an electronic repository that does not participate in the daily operations of the organization. Instead, this data is used to create business intelligence. The data in the data mart usually applies to a specific area of the organization. • Data is available for our business intelligence needs somewhere outside of our OLTP systems
Features of a Data Mart Built for speed of access Data is de-normalized (repeated), requiring fewer table joins for data retrieval The design pattern is organized around "facts", known as star and snowflake schemas Data Mart Characteristics: • No Real-Time Data • Data is copied from the OLTP systems periodically and written to the data mart.
• Consolidated • Data from different OLTP systems are consolidated into a single mart.
• Cleansed • Inconsistencies and errors are removed from transactional data so it has the consistency necessary for use in a data mart.
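The cleansing step can be sketched with a pair of T-SQL statements run against a staging area during the data mart load. The staging table and values here are hypothetical, used only to illustrate the idea:

```sql
-- Standardize inconsistent values copied from different OLTP systems.
UPDATE StagingCustomer
SET StateCode = 'CA'
WHERE StateCode IN ('Calif.', 'California', 'CA.');

-- Remove rows that cannot be repaired before loading the mart.
DELETE FROM StagingCustomer
WHERE CustomerKey IS NULL;
```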
Data Mart Structure The data we use for business intelligence can be divided into four categories: • Measures • A measure is a numeric quantity expressing some aspect of the organization’s performance. The information represented by this quantity is used to support or evaluate the decision making and performance of the organization. A measure can also be called a fact. • Dimensions • A dimension is a categorization used to spread out an aggregate measure to reveal its constituent parts. • Attributes • An attribute is an additional piece of information pertaining to a dimension member that is not the unique identifier or the description of the member. • Hierarchies • A hierarchy is a structure made up of two or more levels of related dimensions. A dimension at an upper level of the hierarchy completely contains one or more dimensions from the next lower level of the hierarchy.
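The four categories can be illustrated with a minimal star-style table pair. All names here are hypothetical: SalesAmount is a measure, DimProduct is a dimension, Color is an attribute, and ProductType supports a ProductType-to-Product hierarchy:

```sql
CREATE TABLE DimProduct (
    ProductKey  int PRIMARY KEY,  -- unique identifier of the member
    ProductName nvarchar(50),     -- description of the member
    ProductType nvarchar(50),     -- upper level of the hierarchy
    Color       nvarchar(20)      -- attribute: extra info, not key or name
);

CREATE TABLE FactSales (
    ProductKey  int REFERENCES DimProduct (ProductKey),
    DateKey     int,
    SalesAmount money             -- the measure (fact)
);
```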
Data Mart Structure Example (1 of 2) A measure of total sales as a single point of information:
By applying categorization or a dimension to that single point of data, we can spread it out, for example for each year
Data Mart Structure Example (2 of 2) Next we can spread the total sales for each product type
If we were to spread it further out by sales region, the measure becomes a cube
The Star Schema All attributes are directly related to the key attribute, which enables users to browse the facts in the cube based on any attribute hierarchy in the dimension
The Snowflake Schema An attribute is either directly linked to the key attribute, if its underlying table is directly linked to the fact table, or indirectly linked by means of the attribute that is bound to the key in the underlying table that links the snowflake table to the directly linked table
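A snowflaked product hierarchy can be sketched as follows, with hypothetical table names; each level has its own table linked by foreign keys, and only the lowest level joins directly to the fact table:

```sql
CREATE TABLE DimCategory (
    CategoryKey  int PRIMARY KEY,
    CategoryName nvarchar(50)
);

CREATE TABLE DimSubcategory (
    SubcategoryKey  int PRIMARY KEY,
    SubcategoryName nvarchar(50),
    CategoryKey     int REFERENCES DimCategory (CategoryKey)  -- link up the hierarchy
);

CREATE TABLE DimProduct (
    ProductKey     int PRIMARY KEY,
    ProductName    nvarchar(50),
    SubcategoryKey int REFERENCES DimSubcategory (SubcategoryKey)
);
```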
ONE-STOP SHOPPING – THE UNIFIED DIMENSIONAL MODEL
• Online Analytical Processing • The Unified Dimensional Model • Tools of the Trade
Online Analytical Processing Online analytical processing (OLAP) systems enable users to quickly and easily retrieve information from data, usually in a data mart, for analysis. OLAP systems present data using measures, dimensions, hierarchies, and cubes.
Building OLAP - Out of Cubes A cube is a structure that contains a value for one or more measures for each unique combination of the members of all its dimensions. • These are detail, or leaf-level, values.
The cube also contains aggregated values formed by the dimension hierarchies or when one or more of the dimensions is left out of the hierarchy. An aggregate is a value formed by combining values from a given dimension or set of dimensions to create a single value.
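The aggregates a cube stores can be approximated in T-SQL with GROUP BY CUBE, which returns every combination of the listed dimensions, including the grand total. The FactSales table is hypothetical:

```sql
-- Rows appear for (Year, Region), (Year) alone, (Region) alone, and the
-- overall total; an OLAP cube preprocesses and stores such aggregates so
-- they can be retrieved without re-scanning the leaf-level data.
SELECT SalesYear, Region, SUM(SalesAmount) AS TotalSales
FROM FactSales
GROUP BY CUBE (SalesYear, Region);
```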
Features of an OLAP System Multidimensional Database • Structures around measures, dimensions, hierarchies, and cubes rather than tables, rows, columns and relations.
Preprocessed Aggregates • OLAP systems preprocess a portion of the aggregates that are found throughout the cube. • The preprocessing is done as part of the background task that loads or updates the data in the OLAP database.
Easily Understood • If designed properly, dimensions and hierarchies should match the structure of the organization.
Architecture ROLAP – Relational OLAP • Stores the cube structure in a multidimensional database. • The leaf-level measures are left in the relational data mart. • The preprocessed aggregates are also stored in a relational database table.
MOLAP – Multidimensional OLAP • Stores the cube structure in a multidimensional database.
• Both the preprocessed aggregate values and a copy of the leaf-level values are placed in the multidimensional database as well
HOLAP – Hybrid OLAP • Combines ROLAP and MOLAP • Stores the cube structure and the preprocessed aggregates in a multidimensional database.
• Leaves the leaf-level data in the relational data mart that serves as the source of the cube.
Disadvantages Complexity to Develop and Administer • To maintain an easy-to-use environment for the end user, a certain amount of complexity is shifted to the development and administrative tasks of the system.
Data Mart Required • Using either a star or a snowflake layout.
Latency • Data needs to be migrated from the OLTP systems to the data mart.
Read-Only • Not a disadvantage per se, but can be problematic when changes to the data are necessary to project certain results.
The Unified Dimensional Model The UDM was introduced with SQL Server 2005. It is designed to provide all the benefits of an OLAP system, with multidimensional storage and preprocessed aggregates, while avoiding a number of the drawbacks of more traditional OLAP systems
UDM Structure A UDM is a structure that sits on top of a data mart and looks exactly like an OLAP system to an end user. • Does not require a data mart. • Can be built over one or more OLTP systems. • Can be built over both a data mart and OLTP system data. • Can include data from other vendors' databases and from XML.
A UDM can have one or more data sources. The UDM uses data source views to determine which tables and fields to use from each data source.
UDM Proactive Caching UDM uses proactive caching technology to obtain the performance advantages of traditional OLAP systems. The cache is created when needed and changed when the underlying data or the underlying structure changes. • Items are created in the cache before they have been requested by the user. • The UDM monitors the data in the data source. As the data is modified, the UDM updates its structures.
Proactive cache can be built using MOLAP, ROLAP, or HOLAP.
UDM Advantages
• OLAP built on transactional data
• Extremely low latency
• Ease of creation and maintenance
• Design versioning with source control
Creating Analysis Solutions with SQL Server 2012
• SQL Server Data Tools
  • Multidimensional models
  • Tabular models
  • Data mining models
• Microsoft Excel
  • PowerPivot tabular models
  • Data mining models
Creating Reporting Solutions with SQL Server 2012

Authoring tools by role:
• BI Developer: Report Designer
  • Project-based development with source control
  • Sophisticated design environment
• IT Pro / Power User: Report Builder
  • ClickOnce installation
  • Reusable report elements
  • Flexible layout
• Information Worker: Power View
  • Interactive data visualization in the browser
  • Drag and drop from an existing data model
  • Rich design capabilities

Delivery: subscriptions, data alerts, and interactive access
Part 2: Business Intelligence MCSE SQL Server 2012 BI Designing Business Intelligence with Microsoft SQL Server 2012
BUILDING FOUNDATIONS – CREATING DATA MARTS
• Data Mart • Designing a Data Mart • Table Compression
Who Needs a Data Mart Anyway? Even with the UDM, situations still exist where a data mart may be the best choice as a source for business intelligence data • Legacy Databases • Some databases may not have an appropriate OLE DB provider
• Data from Non-database Source • Data would need to be imported into a data mart before it can be utilized by a UDM
• No Physical Connection • No full-time connection available to the data
• Dirty Data • Data requires cleansing before it can be used as a source
Designing a Data Mart A data mart is made up of measures and dimensions organized into hierarchies and attributes. The design must: • Identify the information that our decision makers need
• Reconcile that information with the data available in the OLTP systems • Organize the data into the data mart components
Decision Makers' Needs Decision makers need to be involved in the design process: • Decision makers are the ones in the trenches • Decision makers ultimately determine the success or failure of a project
Questions that need to be answered by decision makers: • What facts, figures, statistics, and so forth do you need for effective decision making? (foundation and feedback measures)
• How should this information be sliced and diced for analysis? (dimensions) • What additional information can aid in finding exactly what is needed? (attributes)
Available Data Reality check: is the data available in the OLTP systems? If the data is not available, can we get that information from another data source?
Data Mart Structures The structures (measures, dimensions, hierarchies, and attributes) will lead us to the star or snowflake schema that will define our data mart The next set of slides discusses the process by which to address the design of each structure
Measures The measures are the foundation and feedback information our decision makers require. Reconcile the requirements with what is available in the OLTP data to come up with a list of measures. Examples of numeric data that can be used as measures: Monetary Amounts, Counts, Time Periods The following will be needed for each measure: • Name of the measure • What OLTP field or fields should be used to supply the data • Data type (money, integer, decimal) • Formula used to calculate the measure (if there is one)
Dimensions and Hierarchies While measures define what the decision makers want to see, the dimensions and hierarchies define how they want to see it. Reconcile the requested dimensions and hierarchies with what is available from the OLTP data The following is needed for each dimension: • Name of the dimension • What OLTP field or fields are to be used to supply the data • Data type of the dimension’s key (the code that uniquely identifies each member of the dimension) • Name of the parent dimension (if there is one)
Attributes Attributes provide additional information about a dimension and capture information decision makers want to: • Have readily available during analysis • Filter on during the analysis process
We need to reconcile the requested attributes with the data available from the OLTP database to come up with the list of attributes in our design. The following is needed for each attribute: • Name of the attribute • What OLTP field or fields are to be used to supply the data • Data type • Name of the dimension to which it applies
Stars and Snowflakes Measures are placed in a single table called the fact table. The dimensions at the lowest level of the hierarchies are each placed in their own dimension table In a Star Schema, all the information for a hierarchy is stored in the same table. In the snowflake schema, each level in the dimensional hierarchy has its own table. Dimensions are linked together with foreign key relationships to form the hierarchy.
Table Compression Table compression modifies the way data is physically stored on the disk drive in order to save space • It is transparent to applications making use of the data
SQL Server provides two types of table compression: • Row Compression • Page Compression
To enable table compression:
ALTER TABLE ManufacturingFact REBUILD
WITH (DATA_COMPRESSION = PAGE)  -- or ROW
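Before enabling compression it is worth checking the expected savings. SQL Server ships a procedure for this; the call below targets the same table as the slide, and the sketch assumes it lives in the dbo schema:

```sql
-- Estimates current vs. compressed size for the table's indexes/partitions.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'ManufacturingFact',
    @index_id         = NULL,   -- all indexes
    @partition_number = NULL,   -- all partitions
    @data_compression = N'PAGE';  -- or N'ROW'
```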
TRANSFORMERS – INTEGRATION SERVICES STRUCTURE AND COMPONENTS
• Integration Services • Package Items
Overview of Data Warehouse Load Cycles
• Users modify data in business applications (the data sources)
• An ETL process extracts new and modified data from the data sources into a staging database
• The ETL process then inserts or modifies data in the data warehouse based on those changes
• In short: extract changes from data sources, then refresh the data warehouse based on the changes
• Special considerations apply for slowly changing dimensions
Review: Options for ETL
• Microsoft SQL Server Integration Services
• The Import and Export Data Wizard
• Transact-SQL
• The bcp utility
• Replication
Package Structure SSIS creates structures called packages • Used to move data between systems • Contain data sources and data destinations
SSIS is an ETL (Extract, Transform, and Load) tool that is: • Easy to use • Extremely flexible • Exceedingly capable
• Highly scalable
Package Items
Control Flow
• Control Flow Containers
• Control Flow Tasks
• Maintenance Plan Tasks
Data Flow
• Data Flow Sources
• Data Flow Transformations
• Data Flow Destinations
Event Handlers Integration Services packages are event-driven An event can be the completion of a task or an error that occurs during task execution An event handler is a routine that is defined as a control flow Event handler tasks can be created in the Event Handlers Designer tab
Precedence Arrows Control the order in which tasks are executed Three options are available: • Success – Green • Failure – Red • Completion - Blue
Deploying SSIS Packages Deployment from Development/Testing/Staging to a Production environment involves 4 primary tasks: • Package Configuration • Creating a Package Deployment Utility
• Installing with a Package Deployment Utility • Executing Integration Services Packages
SSIS Deployment Models
• Package Deployment Model: SSIS packages are deployed and managed individually
  • Each package carries its own package-level parameters and package connection managers
• Project Deployment Model: multiple packages are deployed together as a single project
  • A project contains project-level parameters and project-level connection managers, plus its packages with their package-level parameters and connection managers
  • Projects are deployed to the SSIS catalog
Package Deployment Model
• Storage: MSDB or the file system
• Package Configurations: property values to be set dynamically at run time
• Package Deployment Utility: generates all required files for easier deployment
Project Deployment Model
• The SSIS catalog: storage and management for SSIS projects on a SQL Server 2012 instance
• Folders: a hierarchical structure for organizing and securing SSIS projects
Deployment Model Comparison

Feature               | Package Deployment                 | Project Deployment
Unit of deployment    | Package                            | Project
Storage               | File system or MSDB                | SSIS catalog
Dynamic configuration | Package configurations             | Environment variables mapped to project-level parameters and connection managers
Compiled format       | Multiple .dtsx files               | Single .ispac file
Troubleshooting       | Configure logging for each package | SSIS catalog includes built-in reports and views
SSIS Catalog
• Prerequisites: SQL Server 2012 with SQL CLR enabled
• Creating a catalog: use SQL Server Management Studio; one SSIS catalog per SQL Server instance
• Catalog security: folder security, object security, catalog encryption, and sensitive parameters
Environments and Variables
• Environments: execution contexts for projects
• Variables: environment-specific values that can be mapped to project parameters and connection manager properties at run time
Deploying an SSIS Project
• Integration Services Deployment Wizard, launched from:
  • SQL Server Data Tools
  • SQL Server Management Studio
Viewing Project Execution Information
• The Integration Services Dashboard provides built-in reports
• Additional sources of information: event handlers, error outputs, logging, and debug dump files
Control Flow Containers For Loop Container • Repeats a segment of a control flow; the number of iterations is controlled by 3 properties: • InitExpression, sets the initial value • EvalExpression, evaluated on every loop; if true, the loop content is executed • AssignExpression, evaluated along with the EvalExpression after each execution of the loop
Foreach Loop Container • Iterates one time for each item in a collection
Sequence Container • No iteration; it only helps organize the tasks in a package
Control Flow Tasks
• ActiveX Script Task
• Analysis Services Execute DDL Task
• Analysis Services Processing Task
• Bulk Insert Task
• Data Flow Task
• Data Mining Query Task
• Data Profiling Task
• Execute DTS 2000 Package Task
• Execute Package Task
• Execute Process Task
• Execute SQL Task
• File System Task
• FTP Task
• Message Queue Task
• Script Task
• Send Mail Task
• Transfer Database Task
• Transfer Error Messages Task
• Transfer Jobs Task
• Transfer Logins Task
• Transfer Master Stored Procedures Task
• Transfer SQL Server Objects Task
• Web Service Task
• WMI Data Reader Task
• WMI Event Watcher Task
• XML Task
Maintenance Plan Tasks
• Back Up Database Task
• Check Database Integrity Task
• Execute T-SQL Statement Task
• History Cleanup Task
• Maintenance Cleanup Task
• Notify Operator Task
• Rebuild Index Task
• Reorganize Index Task
• Shrink Database Task
• Update Statistics Task
• Custom Tasks
Data Flow Sources
• ADO.NET Source
• Excel Source
• Flat File Source
• OLE DB Source
• Raw File Source
• XML Source
Data Flow Transformations
• Aggregate
• Audit
• Cache Transform (new)
• Character Map
• Conditional Split
• Copy Column
• Data Conversion
• Data Mining Query
• Derived Column
• Export Column
• Fuzzy Grouping
• Fuzzy Lookup
• Import Column
• Lookup
• Merge
• Merge Join
• Multicast
• OLE DB Command
• Percentage Sampling
• Pivot
• Row Count
• Row Sampling
• Script Component
• Slowly Changing Dimension
• Term Extraction
• Term Lookup
• Union All
• Unpivot
Data Flow Destinations
• ADO.NET Destination
• Data Mining Model Training
• DataReader Destination
• Dimension Processing
• Excel Destination
• Flat File Destination
• OLE DB Destination
• Partition Processing
• Raw File Destination
• Recordset Destination
• SQL Server Compact Destination
• SQL Server Destination
Package Debugging Setting Breakpoints • We can set a breakpoint on any of the control flow tasks in a package
Viewing Package State • While package execution is paused at a breakpoint, there are several places to see the current execution state of the package
Viewing Data Flow • We can attach data viewers inside the data flow at various points to observe the data as it passes between package tasks
Change Data Capture
1. Enable Change Data Capture:

EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'Customers',
    @role_name = NULL,
    @supports_net_changes = 1;

2. Map start and end times to log sequence numbers:

DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @StartDate);
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @EndDate);

3. Handle null log sequence numbers:

IF (@from_lsn IS NULL) OR (@to_lsn IS NULL)
    -- There may have been no transactions in the timeframe

4. Extract changes between the log sequence numbers:

SELECT *
FROM cdc.fn_cdc_get_net_changes_dbo_Customers(@from_lsn, @to_lsn, 'all');
The CDC Control Task and Data Flow Components

Initial Extraction:
1. A CDC Control Task records the starting LSN (Mark Initial Load Start)
2. A data flow extracts all records from the source to the staged inserts
3. A CDC Control Task records the ending LSN (Mark Initial Load End)

Incremental Extraction:
1. A CDC Control Task establishes the range of LSNs to be extracted (Get Processing Range)
2. A CDC Source extracts the records and CDC metadata
3. Optionally, a CDC Splitter splits the data flow into staged inserts, updates, and deletes
4. A CDC Control Task records the ending LSN (Mark Processed Range)

In both patterns, the CDC state is held in a CDC state variable and persisted in a CDC state table.
Loading a Data Warehouse from CDC Output Tables

Staging and data warehouse co-located:
• Inserts: an Execute SQL Task runs INSERT… FROM the staged inserts
• Updates: an Execute SQL Task runs UPDATE… FROM with a JOIN ON the business key
• Deletes: an Execute SQL Task runs DELETE WHERE the business key is IN the staged deletes, or UPDATE… FROM with a JOIN ON the business key

Remote data warehouse:
• Data flows read the staged inserts, updates, and deletes from the staging database
• Inserts are loaded directly into the dimension table through a destination
• Updates are applied with an OLE DB Command running UPDATE…
• Deletes are applied with an OLE DB Command running UPDATE… or DELETE…
Part 3: Delivering MCSE SQL Server 2012 BI Designing Business Intelligence with Microsoft SQL Server 2012
DELIVERING BUSINESS INTELLIGENCE WITH REPORTING SERVICES
• Reporting Services • Report Server Architecture • Designing and Creating Reports
Reporting Scenarios
1. Scheduled Delivery of Standard Reports
2. On-Demand Access to Standard Reports
3. Embedded Reports and Dashboards
4. Request to IT for Custom Reports
5. Self-Service Reporting
Report Structure A report project can contain a number of reports. Each report contains two distinct sets of instructions: • Data definition: controls where the data for the report comes from and what information is selected from that data
• Contains two parts: the data source and the dataset
• Report layout: controls how the information is presented on the screen or on paper
The information in the data definition and the report layout is stored in XML format using the Report Definition Language (RDL)
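The two sets of instructions can be seen side by side in a simplified RDL skeleton. This is a sketch only: the element names follow the general RDL pattern, and the names and values (DataMart, SalesMart, FactSales) are hypothetical:

```xml
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2010/01/reportdefinition">
  <!-- Data definition: data source and dataset -->
  <DataSources>
    <DataSource Name="DataMart">
      <ConnectionProperties>
        <DataProvider>SQL</DataProvider>
        <ConnectString>Data Source=.;Initial Catalog=SalesMart</ConnectString>
      </ConnectionProperties>
    </DataSource>
  </DataSources>
  <DataSets>
    <DataSet Name="SalesByYear">
      <Query>
        <DataSourceName>DataMart</DataSourceName>
        <CommandText>SELECT SalesYear, SUM(SalesAmount) AS TotalSales FROM FactSales GROUP BY SalesYear</CommandText>
      </Query>
    </DataSet>
  </DataSets>
  <!-- Report layout: how the information is presented -->
  <Body>
    <!-- report items (tables, charts, textboxes) go here -->
  </Body>
</Report>
```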
Report Server Report Catalog • Hosts copies of the RDLs
Report Processor • Retrieves the RDL for a requested report from the Report Catalog
Data Providers • Know how to retrieve information from a data source
Renderers • Work with the report processor to read through the report layout
Request Handler • Receives requests for reports and passes them on to the report processor
The Distributed Installation Reporting Services items are split between two computers that work together to create a complete Reporting Services system: • Database Server, hosts SQL Server 2012, which in turn hosts the databases that make up the Report Catalog • Report Server, runs the Reporting Services Windows service
The Scale-Out Installation A specialized form of the distributed installation • A single database server interacts with several report servers • Each of the report servers uses the same set of Report Catalog databases for its information • Allows us to handle more simultaneous users
Review New & Enhanced Features SQL2012 • Power View – Interactive self-service reporting • Greater integration with SharePoint Server • Improved Rendering to Microsoft Word and Excel formats • Data Alerts – E-mail based notifications of changes to report data
SQL Server 2012 Reporting Services
• New Data Visualizations
• Self-Service Reporting
• Authoring Enhancements
SQL Server 2008 R2 Reporting Services
• Enhanced Report Builder
• Sparklines and Data Bars
• Textbox Rotation
• Shared Datasets
• Indicators
• Lookup Functions
• Report Parts
• Maps
• Aggregations of Aggregates
• Pagination Enhancements
Self-Service Reporting
• Empower information workers • Supplement standard reports • Reduce IT workload
• Supported by:
Report Builder 3.0
Shared Data Sources and Datasets
Report Parts
SharePoint Integration
• Implemented as a SharePoint 2010 Shared Service
• A SharePoint site provides the UI for report server content and operations
• Managed through SharePoint Central Administration
• Creates an integrated, consistent reporting environment for organizations that use SharePoint
Scaling Out Reporting Services in a SharePoint Farm
• Add application servers to the SharePoint farm to scale out report processing services:
  Install Reporting Services in SharePoint mode
• Add Web front-end (WFE) servers to the SharePoint farm to balance user requests:
  Install the Reporting Services Add-in for SharePoint Products
[Diagram: a Network Load Balancer (NLB) directs user requests to WFE servers running the Reporting Services Add-in; application servers run the Reporting Services SharePoint mode application service, backed by the database server]
FALLING INTO PLACE – MANAGING REPORTING SERVICES
• Report Manager • Managing Reports on the Report Server • Ad Hoc Reporting
Folders and The Report Manager Folders can be created in the Report Manager to group reports and other items and can contain: • Reports
• Supporting files (external images, shared data sources, etc.) • Other folders
The Report Manager application provides a straightforward method for creating and navigating folders in the Report Catalog. By default, the Report Manager site is installed in the default website on the server. It is located in a virtual directory called Reports. http://ComputerName/reports
Deploying Reports Using the Report Designer The most common method of moving reports to the report server
Uploading Reports Using Report Manager Another common method of moving a report to the report server is to upload it using the Report Manager.
Security In Reporting Services, security is designed for • Flexibility • Individual access rights can be assigned to each folder and to each item within the folder
• Ease of management • Security inheritance • Security Roles • Integration with Windows security
Integration with Windows Security Reporting Services does not maintain its own list of users and passwords. It depends entirely on integration with Windows security. Note: custom security is possible, but it is not advisable due to its complexity.
Tasks and Rights Each task in Reporting Services has a corresponding right. Tasks come in two flavors: • Security tasks
• System-wide security tasks
Roles The rights to perform tasks are grouped together to create roles. Reporting Services includes several predefined roles: • The Browser Role
• The Publisher Role • The My Reports Role • The Content Manager Role • The System User Role • The System Administrator Role
Role assignments can also be created for folders, reports, or resources. Folders (except the Home folder) also inherit the role assignments of their parent folders.
Linked Reports A linked report is deployed to one folder. It is then pointed to (linked to) from links placed elsewhere within the Report Catalog. To the user, the links look just like reports.
Report Caching Report caching is an option that can be turned on individually for each report on the report server to speed the rendering of reports to users • The report server saves a copy, or instance, of the report in a temporary location the first time the report is executed • On subsequent executions, with the same parameter values chosen, the report server pulls the information necessary to render the report from the report cache
Cached reports are assigned an expiration date and time. Cached reports must use stored credentials for their shared data sources.
Execution Snapshots An execution snapshot is another way to create a cached report instance • Created automatically • Can be created on a scheduled basis
• Can be created as soon as the feature is turned on for a particular report
Advantage over caching: • The first user to retrieve the report after the cache has expired does not need to wait for the report to be generated
Report History The report history feature of the Report Manager enables us to keep copies of a report's past executions • Lets us save the state of our data without having to save copies of the data itself • We have to provide a default value for each report parameter • Snapshots can start to pile up if we are not careful • They are not lost if the definition of the underlying report is changed • Just like the cached report instance, the report history snapshot contains both the report definition and the dataset
Standard Subscriptions A request to push a particular report to a user or set of users. A self-service operation. Two delivery options: e-mail and file share. Multiple subscriptions can be created for one report (for example, with different parameters for end of the week and end of the month). To subscribe to a report or create a subscription for delivery to others, you must have rights to the Manage Individual Subscriptions task • The Browser, Content Manager, and My Reports roles have rights to manage individual subscriptions
Data-Driven Subscriptions Also known as "mass mailing". Enables us to take a report and e-mail it to a number of people on a mailing list. To create a data-driven subscription for a report, you must have rights to the Manage All Subscriptions task • Only the Content Manager role has the rights to this task
While a data-driven subscription is a scheduled process rather than one triggered by a particular event, we can make it behave almost as if it were event-driven by using stored procedures.
Ad Hoc Reporting Users may need to create reports that are one-time in nature or cannot wait till a report developer is available The Report Builder, along with the Report Models, provides a means for end users to explore their data without having to learn the ins and outs of SELECT statements and query builders The Report Model • Provides a nontechnical user with a view of database content without requiring an intimate knowledge of relational theory and practice • Hides all of the complexity of primary keys and foreign key constraints
Cleaning Up the Report Model
• Remove any numeric aggregates that don't make sense
• Remove attributes that should not be present
• Rename entities that have cryptic names
• Put the proper items in the Lookup folder
• Use folders to organize entities, attributes, and roles
• Rearrange the entity, attribute, and role order
• Manually create calculated attributes
• Add descriptions
• Create perspectives coinciding with business areas
Entities, Roles, and Fields Reports are created in the Report Builder using entities, roles, and fields Entities are the objects or processes that our data knows something about • Can be grouped together in entity folders within the Report Model or in perspectives to help keep things organized
Roles show us how one entity relates to another entity • Enable us to show information from multiple entities together on a single report in a meaningful manner
Fields are bits of information: a product name, a machine number, or a date of manufacture • Fields are what we place on our reports to display these bits of information
Using Reporting Services without the Report Manager When using a custom application to access reports, it is not feasible to use the Report Manager. Other approaches are: • URL Access • Web Services Access • The Report Viewer Control
Tabular Data Model
• An in-memory database that uses xVelocity in-memory technologies
• Based on the widely understood relational model
• Quick and easy to create
• Faster time to deployment
• Easier to learn than multidimensional models, so it has a lower barrier to entry
• Scalability from desktop BI to organizational BI
Options for Creating Tabular Data Models
• Tabular data models in PowerPivot for Excel:
  Create a tabular data model in a Microsoft Excel workbook
  Importing data automatically creates a tabular data model
  The data is stored in the Excel workbook
• Tabular data models in Microsoft SQL Server 2012 Analysis Services:
  Create a tabular data model by using SQL Server Data Tools
  The data is stored in SQL Server 2012 Analysis Services
  There are additional features to support larger, more complex solutions: row-level security, DirectQuery mode, partitioning, and deployment options
PowerPivot Technologies • PowerPivot for Excel
Sophisticated desktop data analysis solution
Increased autonomy for information workers
Fast query response times
DAX for custom measures and calculated columns
Diagram view for management of tables and relationships
Hierarchies and perspectives
PowerPivot Technologies • PowerPivot for SharePoint
Portal for sharing and collaboration
Gallery to browse and access workbooks and reports
Server-side processing enables users to open workbooks in a browser
Central management and security for workbooks
Features in PowerPivot
• Diagram view
• Hierarchies
• Perspectives
• Support for multiple relationships between tables
• The ability to sort one column by the values in another column
• New DAX functions
• Reporting properties
Importing Tables from a Data Source
• Create data source connections in the Excel PowerPivot window
• Use a wide range of connection options, including common third-party databases
• Automatically add related tables
• Filter out columns that are not required for analysis:
  Improves PowerPivot performance
  Simplifies user experience
• Provide table aliases for ease of use
Sharing PowerPivot for Excel Workbooks
• Upload PowerPivot for Excel workbooks to PowerPivot Gallery on SharePoint:
  Browse workbooks and reports in the gallery
  View them in Windows Internet Explorer
  Open them in Excel for further analysis
• Use uploaded workbooks as data sources for Excel
Using PowerPivot Gallery
• Shows thumbnail previews of PowerPivot workbooks
• Offers different viewing options:
  Gallery
  All Documents
  Theater
  Carousel
• Click a workbook to open it in Internet Explorer
Review of DAX
• DAX is a formula-based language for building business logic and queries in tabular data models:
  Calculated columns
  Measures
  Queries
• Its syntax is similar to Microsoft Excel formulas:
  It uses functions, operators, and values
  It is easy to use and already familiar to information workers
• It differs from Excel formulas in key ways:
  It is designed to work with relational data, not data ranges
  It includes more advanced functionality
• There are new functions to expand DAX
DAX Functions • Text functions • Information functions • Filter and value functions • Logical functions • Mathematical and trigonometric functions • Statistical and aggregation functions • Date and time functions • Time intelligence functions
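The function categories above can be illustrated with a few one-line sketches; the table and column names below (such as 'Sales'[Amount]) are hypothetical and only for illustration:

```
// Text: concatenate first and last names
='Employee'[First Name] & " " & 'Employee'[Last Name]
// Logical and information: substitute zero for blank values
=IF(ISBLANK('Sales'[Amount]), 0, 'Sales'[Amount])
// Date and time: extract the calendar year
=YEAR('Date'[FullDate])
// Statistical and aggregation: average of a column
=AVERAGE('Sales'[Amount])
// Time intelligence: year-to-date total (requires a marked date table)
=TOTALYTD(SUM('Sales'[Amount]), 'Date'[FullDate])
```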
DAX Syntax and Data Types
• DAX formulas start with the equal sign (=) followed by an expression
• Expressions can contain functions, operators, constants, and references to columns
• Column references:
  Fully qualified name: 'table name'[column name]
  Unqualified name: [column name]
• Measure names must be enclosed in brackets
• DAX uses eight data types
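As a sketch of the reference styles above (the 'Sales' table, its [Sales Amount] column, and the [Total Sales] measure are hypothetical names used only for illustration):

```
// Fully qualified column reference
=SUM('Sales'[Sales Amount])
// Unqualified column reference (valid within the Sales table itself)
=[Sales Amount] * 1.1
// Measure reference: measure names are always enclosed in brackets
=[Total Sales]
```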
Aggregations
• Summarize underlying detailed data for analysis
• Use automatic aggregation for simple calculations:
  SUM
  COUNT
  MIN
  MAX
  AVERAGE
• Create a measure for more complex aggregations
SUM('Reseller Sales'[Sales Amount])
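The SUM above is a simple aggregation; a sketch of a slightly more complex aggregation written as an explicit measure might look like this (it reuses the 'Reseller Sales' table that appears elsewhere in this deck; the measure name is hypothetical):

```
// Average sale amount: total sales divided by the number of sales rows
Average Sale Amount := SUM('Reseller Sales'[Sales Amount]) / COUNTROWS('Reseller Sales')
```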
Context
• A DAX measure or calculated column defines a field that you can use in a PivotTable or a PivotChart
• The exact values that appear in PivotTables and PivotCharts vary with context:
  Row context
  Query context
  Filter context
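A hedged sketch of how filter context can be modified from within a measure using CALCULATE ('Reseller Sales' appears elsewhere in this deck; the 'Product'[Color] column and the measure name are hypothetical):

```
// Overrides the current filter context so that only red products
// contribute to the total, regardless of row or query context
Red Sales := CALCULATE(SUM('Reseller Sales'[Sales Amount]), 'Product'[Color] = "Red")
```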
DAX Queries
• Client applications, such as the Power View reporting tool, issue DAX queries
• You can write queries manually by using the DAX query language
• You can filter, order, and summarize results
EVALUATE( FILTER('Reseller Sales', 'Reseller Sales'[OrderDateKey]>20040101))
Calculated Columns
• Named columns that are populated by using a DAX formula
• Create calculated columns in the Data View window of a PowerPivot for Excel workbook:
  Add a new column
  Provide a name
  Enter a DAX expression in the formula bar
• A value is calculated for each row in the table when the calculated column is created
• Use calculated columns in PivotTables, PivotCharts, slicers, and measure definitions
=CONCATENATE('Employee'[First Name], CONCATENATE(" ", 'Employee'[Last Name]))
Measures
• Named formulas that can contain sophisticated business logic:
  Implicit measures
  Explicit measures
• Create explicit measures in two places:
  The PowerPivot Field List in an Excel worksheet
  The Measure Grid in the table view in the PowerPivot window
• Use measures in PivotTables and PivotCharts
IF([Previous Year], ([Sum of Sales Amount] - [Previous Year]) / [Previous Year], BLANK())
Multiple Relationships
• Tabular data models support multiple relationships between tables
• Only one relationship is active at a time
• The active relationship is used by default in DAX formulas
• The USERELATIONSHIP function enables you to select the relationship that you want to use for a specific formula
=CALCULATE(SUM('Reseller Sales'[Sales Amount]), USERELATIONSHIP('Reseller Sales'[ShipDateKey], 'Date'[DateKey]))
Time Intelligence
• Compare data from one time period against equivalent data from a different time period
• The tabular model should contain a separate table that contains only date information
• The date table should have a continuous range of dates without any gaps
• The column in the date table that uses the date data type should use day as the lowest level of granularity
• Mark the table as Date Table to use time intelligence functions against that table
• Use time intelligence functions to build measures
CALCULATE([Sum of Sales Amount], DATEADD('Date'[FullDateAlternateKey], -1, YEAR))
Dynamic Measures
• Calculate different values for each row in a PivotTable:
  Create a linked table that contains the input values for the dynamic measure
  Use the HASONEVALUE function to check that there is a single input value for each row
  Use conditional logic to apply different calculations to each row based on the input value
IF([Check Single Values], SWITCH(VALUES('Time Period'[Period]), "Current Year", [Sum of Sales Amount], "Previous Year", [Previous Year], "YOY Growth", IF(NOT(ISBLANK([Previous Year])), [Sum of Sales Amount] - [Previous Year], BLANK())), BLANK())
Part 4: Cloud Technologies in a BI Solution MCSE SQL Server 2012 BI Designing Business Intelligence with Microsoft SQL Server 2012
Cloud Technologies in a BI Solution
• Cloud Data Sources
• SQL Azure
• SQL Azure Reporting Services
• The Windows Azure Marketplace DataMarket
Cloud Data Scenarios Application Databases
Third-Party Data
Microsoft Cloud Platform for Data
• Windows Azure Marketplace DataMarket
• SQL Azure: Databases, Data Sync, and Reporting
Cloud Data and Services in the BI Ecosystem
[Diagram: data from the Windows Azure Marketplace DataMarket and SQL Azure (via Data Sync) feeds the ETL staging process, with DQS knowledge-base data cleansing, into a staging database, then through the ETL load process into the data warehouse]
Comparing SQL Azure with SQL Server
[Comparison table not reproduced in this extract]
Topology of SQL Azure
[Diagram: client connections over TDS pass through a load balancer to the SQL Azure servers]
Using SQL Azure as a Data Source for a Data Warehouse
[Diagram: data moves from SQL Azure into the data warehouse via SSIS or Data Sync]
SQL Azure Reporting
• Cloud-based reporting
• Create reports with the same tools as on-premises Reporting Services
• Two core scenarios:
  Operational reports for Windows Azure SQL Database
  Embedded reports in Windows or Azure applications
Windows Azure Marketplace DataMarket Data Scenarios
[Diagram: DataMarket data feeds provide reference data for DQS knowledge bases, data cleansing, and SSIS data flows]