Microsoft MCSE Business Intelligence Certification Courseware Version 1.1

www.firebrandtraining.com

31/07/2012

Part 1: Business Intelligence
MCSE SQL Server 2012 BI: Designing Business Intelligence with Microsoft SQL Server 2012


©2007 – Body Temple

EQUIPPING THE ORGANIZATION FOR EFFECTIVE DECISION MAKING

• Effective Decision Making • Keys to Effective Decision Making • Business intelligence


Who is a Decision Maker?
Decision makers are found throughout the organization:
• Effective plans and policies created at the top can be undone by poor decision making at lower levels
• Good decisions made by those at the bottom can be quickly overwhelmed by poor decisions made up the line

Effective decision making at every level leads to success


What is an Effective Decision?
Effective decisions are choices that move an organization closer to an agreed-on set of goals in a timely manner.
Key ingredients necessary for making effective decisions:
• There must be a set of goals to work toward.
• There must be a way to measure whether a chosen course is moving toward or away from those goals.
• Information based on those measures must be provided to the decision maker in a timely manner.


Goals
Goals must be specific. Vague goals leave key questions unanswered, as in:
• “superior customer satisfaction”: how is it measured?
• “increased profit margin”: which costs are impacted?
• “better meeting our community’s needs”: which needs?

To function as part of effective decision making, a goal must:
• Contain a specific target.
• Provide a means to measure whether we are progressing toward that target.


Is Your Map Upside-Down?
We need a method of navigation to:
• Determine whether we are heading towards or away from the goal
• Measure the steps by which we are moving towards the goal


Business Intelligence and Microsoft SQL Server 2012
Business intelligence is not simply a set of facts and figures on a printed report.
For a report to be an effective business intelligence tool, it needs to:
• Be formatted in a way that is easily understood
• Contain concise summaries of relevant data
• Be delivered in a timely fashion


SQL Server 2012 Editions for BI

Capabilities                  Enterprise   Business Intelligence   Standard
Basic Corporate BI                •                 •                  •
Basic Data Integration            •                 •                  •
Self-Service BI                   •                 •
Advanced Corporate BI             •                 •
Enterprise Data Management        •                 •
Advanced Data Integration         •
Data Warehousing                  •
Advanced High Availability        •


Microsoft BI Technologies

EIM & Data Warehousing
• Microsoft SQL Server Integration Services
• Microsoft SQL Azure and the Windows Azure Marketplace
• SQL Server Database Engine
• SQL Server Data Quality Services
• SQL Server Master Data Services

Business Intelligence
• SQL Server Analysis Services
• SQL Server Reporting Services
• Microsoft PowerPivot Technologies
• Microsoft Excel
  • Data Mining Add-In
  • PowerPivot Add-In
  • MDS Add-In
• Power View
• Microsoft SharePoint Server
• Reports, KPIs, and Dashboards


MAKING THE MOST OF WHAT YOU’VE GOT – USING BUSINESS INTELLIGENCE

• What Business intelligence Can Do For You • Business Intelligence at Many Levels • Building the Foundation


When We Know What We Are Looking For
Two approaches:
• Layout-led Discovery
  • When we know the questions we want answered and where to find the information needed
  • The most common form of business intelligence and one we are all familiar with
• Data-led Discovery
  • We know the question, but we don’t know where to look for the answer
  • The information we find determines where we want to go next


Discovering New Questions and Their Answers
Data may hold answers to questions we have not thought to ask.
Data may contain trends, correlations, and dependencies at a level of detail that would be impossible for a human being to notice using either layout-led or data-led discovery.
• Discovery requires a computer to use data mining techniques
• Works at the lowest level of detail
• Uses advanced mathematical algorithms


The Top of the Pyramid
Decision makers at the upper levels of our organizations must look at the big picture. They deal with long-term policies and direction.
They need:
• Highly summarized measures
• Higher latency


Mid-Level
Mid-level decision makers manage the operation of departments and other working units within the organization. They set short-term goals and do the planning for the functioning of these areas.
They need:
• Summarized measures with drilldown
• Some latency acceptable


The Broad Base
The forepersons, managers, and group leaders:
• Deal with daily operations
• Set daily goals
• Make decisions on resource allocation for the next week, the next day, or the next shift
• Plan the next sales campaign or the next sales call

They need:
• Measures at the detail level
• Low latency


SEEKING THE SOURCE – THE SOURCE OF BUSINESS INTELLIGENCE

• Seeking the Source • The Data Mart • Snowflakes, Stars, and Analysis Services


Transactional Data Transactional data is the information stored to track the interactions, or business transactions, carried out by an organization Online transaction processing (OLTP) systems record business interactions as they happen. They support the day-to-day operation of an organization


Difficulties Using Transactional Data for Business Intelligence
• OLTP systems are the treasure chests holding the raw data we need to calculate measures and create business intelligence
• Well-designed OLTP systems are optimized for efficiently processing and storing transactions, which means normalized data
• BI is concerned with aggregates
• OLTP systems are usually not good at delivering large aggregates
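To make the contrast concrete, the following is a minimal sketch of a typical BI aggregate computed directly against normalized, OLTP-style tables. The table and column names are hypothetical and are not taken from the courseware; table variables are used only so the script runs on its own.

```sql
-- Hypothetical normalized OLTP-style tables (illustration only).
DECLARE @ProductTypes TABLE (ProductTypeID int PRIMARY KEY, ProductTypeName varchar(20));
DECLARE @Products     TABLE (ProductID int PRIMARY KEY, ProductTypeID int);
DECLARE @Orders       TABLE (OrderID int PRIMARY KEY, OrderDate date);
DECLARE @OrderLines   TABLE (OrderID int, ProductID int, Quantity int, UnitPrice money);

INSERT INTO @ProductTypes VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO @Products     VALUES (10, 1), (20, 2);
INSERT INTO @Orders       VALUES (100, '2011-03-15'), (101, '2012-06-01');
INSERT INTO @OrderLines   VALUES (100, 10, 2, 500), (100, 20, 1, 40), (101, 10, 1, 520);

-- Total sales by year and product type: four tables must be joined and
-- every order line read, which is the work a data mart is meant to take
-- away from the transactional system.
SELECT
    YEAR(o.OrderDate)               AS SalesYear,
    pt.ProductTypeName              AS ProductType,
    SUM(ol.Quantity * ol.UnitPrice) AS TotalSales
FROM @Orders       AS o
JOIN @OrderLines   AS ol ON ol.OrderID       = o.OrderID
JOIN @Products     AS p  ON p.ProductID      = ol.ProductID
JOIN @ProductTypes AS pt ON pt.ProductTypeID = p.ProductTypeID
GROUP BY YEAR(o.OrderDate), pt.ProductTypeName
ORDER BY SalesYear, ProductType;
```

On a few rows this is instant; the point is that the same query shape over millions of live transaction rows competes with day-to-day processing.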


The Data Mart A data mart is a body of historical data in an electronic repository that does not participate in the daily operations of the organization. Instead, this data is used to create business intelligence. The data in the data mart usually applies to a specific area of the organization. • Data is available for our business intelligence needs somewhere outside of our OLTP systems


Features of a Data Mart
• Built for speed of access
• Data is denormalized (repeated), requiring fewer table joins for data retrieval
• Design pattern is organized around “facts”, known as star and snowflake schemas

Data mart characteristics:
• No real-time data
  • Data is copied from the OLTP systems periodically and written to the data mart.
• Consolidated
  • Data from different OLTP systems is consolidated into a single mart.
• Cleansed
  • Inconsistencies and errors are removed from transactional data so it has the consistency necessary for use in a data mart.


Data Mart Structure
The data we use for business intelligence can be divided into four categories:
• Measures
  • A measure is a numeric quantity expressing some aspect of the organization’s performance. The information represented by this quantity is used to support or evaluate the decision making and performance of the organization. A measure can also be called a fact.
• Dimensions
  • A dimension is a categorization used to spread out an aggregate measure to reveal its constituent parts.
• Attributes
  • An attribute is an additional piece of information pertaining to a dimension member that is not the unique identifier or the description of the member.
• Hierarchies
  • A hierarchy is a structure made up of two or more levels of related dimensions. A dimension at an upper level of the hierarchy completely contains one or more dimensions from the next lower level of the hierarchy.


Data Mart Structure Example (1 of 2) A measure of total sales as a single point of information:

By applying categorization or a dimension to that single point of data, we can spread it out, for example for each year


Data Mart Structure Example (2 of 2) Next we can spread the total sales for each product type

If we were to spread it further out by sales region, the measure becomes a cube
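The spreading of one measure across three dimensions can be sketched in T-SQL with GROUP BY CUBE. The sample rows and column names below are assumptions made for illustration; they do not come from the courseware figures.

```sql
-- Hypothetical sales rows: one measure (SalesAmount), three dimensions.
DECLARE @Sales TABLE (
    SalesYear   int         NOT NULL,
    ProductType varchar(20) NOT NULL,
    SalesRegion varchar(20) NOT NULL,
    SalesAmount money       NOT NULL
);

INSERT INTO @Sales (SalesYear, ProductType, SalesRegion, SalesAmount) VALUES
    (2011, 'Bikes',   'North', 100),
    (2011, 'Bikes',   'South', 150),
    (2011, 'Helmets', 'North',  20),
    (2012, 'Bikes',   'North', 120),
    (2012, 'Helmets', 'South',  30);

-- CUBE returns a total for every combination of the three dimensions.
-- A NULL in a dimension column means "all members of that dimension",
-- so the row with three NULLs is the single-point total sales figure.
SELECT SalesYear, ProductType, SalesRegion, SUM(SalesAmount) AS TotalSales
FROM @Sales
GROUP BY CUBE (SalesYear, ProductType, SalesRegion);
```

The rows with no NULLs correspond to the leaf-level cells of the cube; every other row is an aggregate over one or more dimensions.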


The Star Schema All attributes are directly related to the key attribute, which enables users to browse the facts in the cube based on any attribute hierarchy in the dimension


The Snowflake Schema
An attribute is directly linked to the key attribute if its underlying table is directly linked to the fact table. Otherwise, it is indirectly linked by means of the attribute that is bound to the key in the underlying table that links the snowflake table to the directly linked table.


ONE-STOP SHOPPING – THE UNIFIED DIMENSIONAL MODEL

• Online Analytical Processing • The Unified Dimensional Model • Tools of the Trade


Online Analytical Processing Online analytical processing (OLAP) systems enable users to quickly and easily retrieve information from data, usually in a data mart, for analysis. OLAP systems present data using measures, dimensions, hierarchies, and cubes.


Building OLAP - Out of Cubes A cube is a structure that contains a value for one or more measures for each unique combination of the members of all its dimensions. • These are detail, or leaf-level, values.

The cube also contains aggregated values formed by the dimension hierarchies or when one or more of the dimensions is left out of the hierarchy. An aggregate is a value formed by combining values from a given dimension or set of dimensions to create a single value.


Features of an OLAP System
Multidimensional Database
• Structured around measures, dimensions, hierarchies, and cubes rather than tables, rows, columns, and relations.

Preprocessed Aggregates • OLAP systems preprocess a portion of the aggregates that are found throughout the cube. • The preprocessing is done as part of the background task that loads or updates the data in the OLAP database.

Easily Understood • If designed properly, dimensions and hierarchies should match the structure of the organization.


Architecture
ROLAP – Relational OLAP
• Stores the cube structure in a multidimensional database.
• The leaf-level measures are left in the relational data mart.
• The preprocessed aggregates are also stored in a relational database table.

MOLAP – Multidimensional OLAP
• Stores the cube structure in a multidimensional database.
• Both the preprocessed aggregate values and a copy of the leaf-level values are placed in the multidimensional database as well.

HOLAP – Hybrid OLAP
• Combines ROLAP and MOLAP.
• Stores the cube structure and the preprocessed aggregates in a multidimensional database.
• Leaves the leaf-level data in the relational data mart that serves as the source of the cube.


Disadvantages
Complexity to Develop and Administer
• To maintain an easy-to-use environment for the end user, a certain amount of complexity is shifted to the development and administrative tasks of the system.

Data Mart Required
• Using either a star or a snowflake layout.

Latency
• Data needs to be migrated from the OLTP systems to the data mart.

Read-Only
• Not a disadvantage in itself, but it can be problematic when changes to the data are necessary to project certain results.


The Unified Dimensional Model UDM introduced with SQL Server 2005 UDM is designed to provide all the benefits of an OLAP system with multidimensional storage and preprocessed aggregates, while avoiding a number of the drawbacks of more traditional OLAP systems


UDM Structure
A UDM is a structure that sits on top of a data mart and looks exactly like an OLAP system to an end user. However, a UDM:
• Does not require a data mart.
• Can be built over one or more OLTP systems.
• Can be built over both data mart and OLTP system data.
• Can include data from other vendors’ databases and XML.

A UDM can have one or more data sources. The UDM uses data source views to determine which tables and fields to use from each data source.


UDM Proactive Caching UDM uses proactive caching technology to obtain the performance advantages of traditional OLAP systems. The cache is created when needed and changed when the underlying data or the underlying structure changes. • Items are created in the cache before they have been requested by the user. • The UDM monitors the data in the data source. As the data is modified, the UDM updates its structures.

Proactive cache can be built using MOLAP, ROLAP, or HOLAP.


UDM Advantages
• OLAP Built on Transactional Data
• Extremely Low Latency
• Ease of Creation and Maintenance
• Design Versioning with Source Control


Creating Analysis Solutions with SQL Server 2012
• SQL Server Data Tools
  • Multidimensional models
  • Tabular models
  • Data mining models
• Microsoft Excel
  • PowerPivot tabular models
  • Data mining models


Creating Reporting Solutions with SQL Server 2012

Authoring
• Report Designer
  • Project-based development with source control
  • Sophisticated design environment
• Report Builder
  • ClickOnce installation
  • Reusable report elements
  • Flexible layout
• Power View
  • Interactive data visualization in the browser
  • Drag and drop from existing data model
  • Rich design capabilities

Audiences: BI Developer, IT Pro, Power User, Information Worker

Delivery
• Subscriptions
• Data Alerts
• Interactive


Part 2: Business Intelligence
MCSE SQL Server 2012 BI: Designing Business Intelligence with Microsoft SQL Server 2012


BUILDING FOUNDATIONS – CREATING DATA MARTS

• Data Mart • Designing a Data Mart • Table Compression


Who Needs a Data Mart Anyway?
Even with the UDM, situations still exist where a data mart may be the best choice as a source for business intelligence data:
• Legacy databases
  • Some databases may not have an appropriate OLE DB provider
• Data from non-database sources
  • Data would need to be imported into a data mart before it can be utilized by a UDM
• No physical connection
  • No full-time connection available to the data
• Dirty data
  • Data requires cleaning before it can be used as a source


Designing a Data Mart
A data mart is made up of measures and dimensions, with the dimensions organized in hierarchies and described by attributes.
The design process must:
• Identify the information that our decision makers need
• Reconcile that information with the data available in the OLTP systems
• Organize the data into the data mart components


Decision Makers’ Needs
Decision makers need to be involved in the design process:
• Decision makers are the ones in the trenches
• Decision makers ultimately determine the success or failure of a project

Questions that need to be answered by decision makers:
• What facts, figures, statistics, and so forth do you need for effective decision making? (foundation and feedback measures)
• How should this information be sliced and diced for analysis? (dimensions)
• What additional information can aid in finding exactly what is needed? (attributes)


Available Data Reality check: is the data available in the OLTP systems? If the data is not available, can we get that information from another data source?


Data Mart Structures
The structures (measures, dimensions, hierarchies, and attributes) lead us to the star or snowflake schema that defines our data mart.
The next set of slides discusses how to design each structure.


Measures
The measures are the foundation and feedback information our decision makers require. Reconcile the requirements with what is available in the OLTP data to come up with a list of measures.
Examples of numeric data that can be used as measures: monetary amounts, counts, time periods.
The following will be needed for each measure:
• Name of the measure
• What OLTP field or fields should be used to supply the data
• Data type (money, integer, decimal)
• Formula used to calculate the measure (if there is one)


Dimensions and Hierarchies
While measures define what the decision makers want to see, the dimensions and hierarchies define how they want to see it. Reconcile the requested dimensions and hierarchies with what is available from the OLTP data.
The following is needed for each dimension:
• Name of the dimension
• What OLTP field or fields are to be used to supply the data
• Data type of the dimension’s key (the code that uniquely identifies each member of the dimension)
• Name of the parent dimension (if there is one)


Attributes
Attributes provide additional information about a dimension. They typically come from information that decision makers want to:
• Have readily available during analysis
• Filter on during the analysis process

We need to reconcile the requested attributes with the data available from the OLTP database to come up with the list of attributes in our design.
The following is needed for each attribute:
• Name of the attribute
• What OLTP field or fields are to be used to supply the data
• Data type
• Name of the dimension to which it applies


Stars and Snowflakes
• Measures are placed in a single table called the fact table.
• The dimensions at the lowest level of the hierarchies are each placed in their own dimension table.
• In a star schema, all the information for a hierarchy is stored in the same table.
• In a snowflake schema, each level in the dimensional hierarchy has its own table. Dimensions are linked together with foreign key relationships to form the hierarchy.
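The difference between the two layouts can be sketched as table definitions. This is an illustrative example only: the table names, columns, and the three-level product hierarchy are assumptions, not part of the courseware.

```sql
-- Star schema: one dimension table holds every level of the product hierarchy.
CREATE TABLE dbo.DimProductStar (
    ProductKey     int         NOT NULL PRIMARY KEY,
    ProductName    varchar(50) NOT NULL,
    ProductSubtype varchar(50) NOT NULL,  -- upper level repeated on each row
    ProductType    varchar(50) NOT NULL   -- top level repeated on each row
);

CREATE TABLE dbo.FactSalesStar (
    ProductKey  int   NOT NULL REFERENCES dbo.DimProductStar (ProductKey),
    SalesYear   int   NOT NULL,
    SalesAmount money NOT NULL            -- the measure
);

-- Snowflake schema: each hierarchy level has its own table,
-- linked to the next level up by a foreign key.
CREATE TABLE dbo.DimProductType (
    ProductTypeKey int         NOT NULL PRIMARY KEY,
    ProductType    varchar(50) NOT NULL
);

CREATE TABLE dbo.DimProductSubtype (
    ProductSubtypeKey int         NOT NULL PRIMARY KEY,
    ProductSubtype    varchar(50) NOT NULL,
    ProductTypeKey    int         NOT NULL
        REFERENCES dbo.DimProductType (ProductTypeKey)
);

CREATE TABLE dbo.DimProductSnowflake (
    ProductKey        int         NOT NULL PRIMARY KEY,
    ProductName       varchar(50) NOT NULL,
    ProductSubtypeKey int         NOT NULL
        REFERENCES dbo.DimProductSubtype (ProductSubtypeKey)
);

CREATE TABLE dbo.FactSalesSnowflake (
    ProductKey  int   NOT NULL REFERENCES dbo.DimProductSnowflake (ProductKey),
    SalesYear   int   NOT NULL,
    SalesAmount money NOT NULL
);
```

In both layouts the fact table joins only to the lowest-level dimension table. The star version needs one join to reach the product type; the snowflake version needs three, in exchange for storing each type and subtype name once.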


Table Compression
Table compression modifies the way data is physically stored on the disk drive in order to save space.
• It is transparent to applications making use of the data

SQL Server provides two types of table compression:
• Row compression
• Page compression

To enable table compression (specify PAGE or ROW):
    ALTER TABLE ManufacturingFact REBUILD WITH (DATA_COMPRESSION = PAGE);
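Before rebuilding a large table, it can help to estimate the savings and, afterwards, to confirm which setting is in effect. The sketch below assumes the ManufacturingFact table from the slide lives in the dbo schema; that schema name is an assumption.

```sql
-- Estimate how much space PAGE compression would save (no data is changed).
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'ManufacturingFact',
    @index_id         = NULL,   -- NULL = all indexes
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = N'PAGE';

-- The ROW variant of the statement shown on the slide.
ALTER TABLE dbo.ManufacturingFact REBUILD WITH (DATA_COMPRESSION = ROW);

-- Check the compression setting currently applied to each partition.
SELECT p.index_id, p.partition_number, p.data_compression_desc
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.ManufacturingFact');
```

Running the estimate once with N'ROW' and once with N'PAGE' gives a basis for choosing between the two types.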


TRANSFORMERS – INTEGRATION SERVICES STRUCTURE AND COMPONENTS

• Integration Services • Package Items


Overview of Data Warehouse Load Cycles

Load cycle (from the slide diagram):
• Users modify data in business applications
• ETL process extracts new and modified data (staging database)
• ETL process inserts or modifies data in the data warehouse based on changes (data warehouse)

Key points:
• Extract changes from data sources
• Refresh the data warehouse based on changes
• Special considerations for slowly changing dimensions

Review Options for ETL
• Microsoft SQL Server Integration Services
• The Import and Export Data Wizard
• Transact-SQL
• The bcp utility
• Replication

Package Structure
SSIS creates structures called packages, which:
• Are used to move data between systems
• Contain data sources and data destinations

SSIS is an ETL (Extract, Transform, and Load) tool that is:
• Easy to use
• Extremely flexible
• Exceedingly capable
• Highly scalable


Package Items
Control Flow
• Control Flow containers
• Control Flow tasks
• Maintenance Plan tasks

Data Flow
• Data Flow sources
• Data Flow transformations
• Data Flow destinations


Event Handlers
• Integration Services packages are event-driven
• An event can be the completion of a task or an error that occurs during task execution
• An event handler is a routine that is defined as a control flow
• Event handler tasks can be created in the Event Handlers designer tab


Precedence Arrows
Precedence arrows control the order in which tasks are executed. Three options are available:
• Success – Green
• Failure – Red
• Completion – Blue


Deploying SSIS Packages
Deployment from a development, testing, or staging environment to a production environment involves four primary tasks:
• Package configuration
• Creating a package deployment utility
• Installing with a package deployment utility
• Executing Integration Services packages


SSIS Deployment Models

Package Deployment Model
• SSIS packages are deployed and managed individually
• Diagram: a package (package-level parameter, package connection manager) is deployed on its own

Project Deployment Model
• Multiple packages are deployed in a single project
• Diagram: a project (project-level parameter, project-level connection manager) containing packages (package-level parameter, package connection manager) is deployed to the SSIS catalog


Package Deployment Model
• Storage
  • MSDB
  • File system
• Package configurations
  • Property values to be set dynamically at run time
• Package deployment utility
  • Generates all required files for easier deployment


Project Deployment Model
• The SSIS catalog
  • Storage and management for SSIS projects on a SQL Server 2012 instance
• Folders
  • A hierarchical structure for organizing and securing SSIS projects


Deployment Model Comparison

Feature                 Package Deployment                   Project Deployment
Unit of deployment      Package                              Project
Storage                 File system or MSDB                  SSIS catalog
Dynamic configuration   Package configurations               Environment variables mapped to project-level parameters and connection managers
Compiled format         Multiple .dtsx files                 Single .ispac file
Troubleshooting         Configure logging for each package   SSIS catalog includes built-in reports and views


SSIS Catalog
• Prerequisites
  • SQL Server 2012
  • SQL CLR enabled
• Creating a catalog
  • Use SQL Server Management Studio
  • One SSIS catalog per SQL Server instance
• Catalog security
  • Folder security
  • Object security
  • Catalog encryption
  • Sensitive parameters
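The SQL CLR prerequisite can be checked and switched on from a query window. This is a minimal sketch; it assumes you have permission to change server configuration options.

```sql
-- Enable CLR integration, which the SSIS catalog requires.
EXEC sys.sp_configure N'clr enabled', 1;
RECONFIGURE;

-- Confirm the setting currently in use (1 = enabled).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'clr enabled';
```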

Environments and Variables
• Environments
  • Execution contexts for projects
• Variables
  • Environment-specific values that can be mapped to project parameters and connection manager properties at run time
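Environments and variables can also be created with the stored procedures in the SSISDB catalog. The sketch below is illustrative: the folder, environment, variable, project, and parameter names (BI_Demo, Test, StagingServer, LoadDW) are hypothetical, and the last two calls assume a project named LoadDW with a project parameter named StagingServer has already been deployed to that folder.

```sql
DECLARE @folder_id bigint, @reference_id bigint;

-- Create a folder and an environment inside it (names are hypothetical).
EXEC SSISDB.catalog.create_folder
    @folder_name = N'BI_Demo',
    @folder_id   = @folder_id OUTPUT;

EXEC SSISDB.catalog.create_environment
    @folder_name             = N'BI_Demo',
    @environment_name        = N'Test',
    @environment_description = N'Execution context for the test servers';

-- Add an environment-specific value.
EXEC SSISDB.catalog.create_environment_variable
    @folder_name      = N'BI_Demo',
    @environment_name = N'Test',
    @variable_name    = N'StagingServer',
    @data_type        = N'String',
    @sensitive        = 0,
    @value            = N'TESTSQL01',
    @description      = N'Server hosting the staging database';

-- Let the (assumed) LoadDW project reference the environment ...
EXEC SSISDB.catalog.create_environment_reference
    @folder_name      = N'BI_Demo',
    @project_name     = N'LoadDW',
    @environment_name = N'Test',
    @reference_type   = 'R',          -- R = environment in the same folder
    @reference_id     = @reference_id OUTPUT;

-- ... and map the project parameter to the environment variable.
EXEC SSISDB.catalog.set_object_parameter_value
    @object_type     = 20,            -- 20 = project-level parameter
    @folder_name     = N'BI_Demo',
    @project_name    = N'LoadDW',
    @parameter_name  = N'StagingServer',
    @parameter_value = N'StagingServer',
    @value_type      = 'R';           -- R = value comes from an environment variable
```

A second environment (for example, Production) with the same variable names lets the same project run against different servers without being redeployed.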


Deploying an SSIS Project
• Integration Services Deployment Wizard
  • SQL Server Data Tools
  • SQL Server Management Studio


Viewing Project Execution Information
• Integration Services Dashboard provides built-in reports
• Additional sources of information:
  • Event handlers
  • Error outputs
  • Logging
  • Debug dump files
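The same execution information that feeds the built-in reports can be read from the SSIS catalog views. A small sketch, assuming the SSISDB catalog exists and at least one package has been executed:

```sql
-- Most recent executions recorded in the SSIS catalog.
-- status: 1 = created, 2 = running, 3 = canceled, 4 = failed,
--         7 = succeeded, 9 = completed
SELECT TOP (20)
    e.execution_id, e.folder_name, e.project_name, e.package_name,
    e.status, e.start_time, e.end_time
FROM SSISDB.catalog.executions AS e
ORDER BY e.start_time DESC;

-- Error messages logged by catalog operations (message_type 120 = error).
SELECT m.operation_id, m.message_time, m.message
FROM SSISDB.catalog.operation_messages AS m
WHERE m.message_type = 120
ORDER BY m.message_time DESC;
```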


Control Flow Containers
For Loop Container
• Repeats a segment of a control flow; the number of iterations is controlled by three properties:
  • InitExpression: sets the initial value
  • EvalExpression: evaluated on every loop; if true, the loop content is executed
  • AssignExpression: evaluated along with the EvalExpression after each execution of the loop

Foreach Loop Container
• Iterates one time for each item in a collection

Sequence Container
• No iteration; it only helps in organizing the tasks in a package


Control Flow Tasks
• ActiveX Script Task
• Analysis Services Execute DDL Task
• Analysis Services Processing Task
• Bulk Insert Task
• Data Flow Task
• Data Mining Query Task
• Data Profiling Task
• Execute DTS 2000 Package Task
• Execute Package Task
• Execute Process Task
• Execute SQL Task
• File System Task
• FTP Task
• Message Queue Task
• Script Task
• Send Mail Task
• Transfer Database Task
• Transfer Error Messages Task
• Transfer Jobs Task
• Transfer Logins Task
• Transfer Master Stored Procedures Task
• Transfer SQL Server Objects Task
• Web Service Task
• WMI Data Reader Task
• WMI Event Watcher Task
• XML Task

Maintenance Plan Tasks
• Back Up Database Task
• Check Database Integrity Task
• Execute T-SQL Statement Task
• History Cleanup Task
• Maintenance Cleanup Task
• Notify Operator Task
• Rebuild Index Task
• Reorganize Index Task
• Shrink Database Task
• Update Statistics Task
• Custom Tasks


Data Flow Sources
• ADO.NET Source
• Excel Source
• Flat File Source
• OLE DB Source
• Raw File Source
• XML Source


Data Flow Transformations
• Aggregate
• Audit
• Cache Transform (new)
• Character Map
• Conditional Split
• Copy Column
• Data Conversion
• Data Mining Query
• Derived Column
• Export Column
• Fuzzy Grouping
• Fuzzy Lookup
• Import Column
• Lookup
• Merge
• Merge Join
• Multicast
• OLE DB Command
• Percentage Sampling
• Pivot
• Row Count
• Row Sampling
• Script Component
• Slowly Changing Dimension
• Term Extraction
• Term Lookup
• Union All
• Unpivot


Data Flow Destinations
• ADO.NET
• Data Mining Model Training
• DataReader
• Dimension Processing
• Excel
• Flat File
• OLE DB
• Partition Processing
• Raw File
• Recordset
• SQL Server Compact
• SQL Server


Package Debugging
Setting Breakpoints
• We can set a breakpoint on any of the control flow tasks in a package

Viewing Package State
• While package execution is paused at a breakpoint, there are several places to see the current execution state of the package

Viewing Data Flow
• We can attach data viewers at various steps of a data flow to inspect the rows passing through it


Change Data Capture

1. Enable Change Data Capture

    EXEC sys.sp_cdc_enable_db;
    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name = N'Customers',
        @role_name = NULL,
        @supports_net_changes = 1;

2. Map start and end times to log sequence numbers

    -- @StartDate and @EndDate are datetime values supplied by the caller
    DECLARE @from_lsn binary(10), @to_lsn binary(10);
    SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @StartDate);
    SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @EndDate);

3. Handle null log sequence numbers

    IF (@from_lsn IS NULL) OR (@to_lsn IS NULL)
        RETURN; -- There may have been no transactions in the timeframe

4. Extract changes between log sequence numbers

    SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Customers(@from_lsn, @to_lsn, 'all');
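The extracted rows carry a __$operation column that says what happened to each source row, which is what allows changes to be staged separately as inserts, updates, and deletes. The sketch below assumes CDC is already enabled on dbo.Customers with net changes supported, as in the steps above; the one-day window is an arbitrary choice for illustration.

```sql
-- Extraction window: the last 24 hours (an assumption for this example).
DECLARE @StartDate datetime = DATEADD(day, -1, GETDATE()),
        @EndDate   datetime = GETDATE();

DECLARE @from_lsn binary(10) =
            sys.fn_cdc_map_time_to_lsn('smallest greater than', @StartDate),
        @to_lsn   binary(10) =
            sys.fn_cdc_map_time_to_lsn('largest less than or equal', @EndDate);

IF (@from_lsn IS NULL) OR (@to_lsn IS NULL)
    PRINT 'No changes were captured in the requested window.';
ELSE
    -- With the 'all' row filter, net changes report one row per changed key:
    -- 1 = delete, 2 = insert, 4 = update (values after the update).
    SELECT
        CASE c.__$operation
            WHEN 1 THEN 'delete'
            WHEN 2 THEN 'insert'
            WHEN 4 THEN 'update'
        END AS ChangeType,
        c.*
    FROM cdc.fn_cdc_get_net_changes_dbo_Customers(@from_lsn, @to_lsn, 'all') AS c;
```

Note that the function raises an error if the LSN range falls outside what the capture instance has retained, so a production script would also validate the range before querying.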


The CDC Control Task and Data Flow Components

Initial Extraction
1. A CDC Control task records the starting LSN (Mark Initial Load Start)
2. A data flow extracts all records from the source into the staged inserts
3. A CDC Control task records the ending LSN (Mark Initial Load End)

Incremental Extraction
1. A CDC Control task establishes the range of LSNs to be extracted (Get Processing Range)
2. A CDC Source extracts records and CDC metadata
3. Optionally, a CDC Splitter splits the data flow into inserts, updates, and deletes (staged inserts, staged updates, staged deletes)
4. A CDC Control task records the ending LSN (Mark Processed Range)

Diagram labels: the CDC Control tasks use a CDC state variable and a CDC state table.

Loading a Data Warehouse from CDC Output Tables

Staging and Data Warehouse Co-located (Execute SQL Tasks)
• Staged inserts: INSERT… FROM
• Staged updates: UPDATE… FROM JOIN ON BizKey
• Staged deletes: DELETE WHERE BizKey IN, or UPDATE… FROM JOIN ON BizKey

Remote Data Warehouse (data flows from the staging DB to the data warehouse)
• Staged inserts: source to destination (dimension table)
• Staged updates: source to OLE DB Command (UPDATE…)
• Staged deletes: source to OLE DB Command (UPDATE… or DELETE…)
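For the co-located case, the three statement patterns named on the slide can be sketched as follows. The dimension and staging tables, their columns, and the IsDeleted flag are assumptions made for illustration; table variables stand in for real tables so the script runs on its own.

```sql
-- Hypothetical dimension table and staged CDC output (illustration only).
DECLARE @DimCustomer TABLE (
    CustomerBizKey int PRIMARY KEY,
    CustomerName   nvarchar(50),
    IsDeleted      bit NOT NULL DEFAULT 0
);
DECLARE @StagedInserts TABLE (CustomerBizKey int, CustomerName nvarchar(50));
DECLARE @StagedUpdates TABLE (CustomerBizKey int, CustomerName nvarchar(50));
DECLARE @StagedDeletes TABLE (CustomerBizKey int);

INSERT INTO @DimCustomer (CustomerBizKey, CustomerName) VALUES (1, N'Ann'), (2, N'Bob');
INSERT INTO @StagedInserts VALUES (3, N'Cy');
INSERT INTO @StagedUpdates VALUES (1, N'Anne');
INSERT INTO @StagedDeletes VALUES (2);

-- Staged inserts: INSERT ... FROM
INSERT INTO @DimCustomer (CustomerBizKey, CustomerName)
SELECT s.CustomerBizKey, s.CustomerName
FROM @StagedInserts AS s;

-- Staged updates: UPDATE ... FROM JOIN ON the business key
UPDATE d
SET d.CustomerName = s.CustomerName
FROM @DimCustomer AS d
JOIN @StagedUpdates AS s ON s.CustomerBizKey = d.CustomerBizKey;

-- Staged deletes, logical variant: UPDATE ... FROM JOIN ON the business key
UPDATE d
SET d.IsDeleted = 1
FROM @DimCustomer AS d
JOIN @StagedDeletes AS s ON s.CustomerBizKey = d.CustomerBizKey;

-- Staged deletes, physical variant (alternative to the logical delete above):
-- DELETE FROM @DimCustomer
-- WHERE CustomerBizKey IN (SELECT CustomerBizKey FROM @StagedDeletes);

SELECT CustomerBizKey, CustomerName, IsDeleted
FROM @DimCustomer
ORDER BY CustomerBizKey;
```

A logical delete keeps the dimension row available to historical facts that still reference it, which is one reason a data warehouse load might prefer the UPDATE variant over a physical DELETE.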


Part 3: Delivering
MCSE SQL Server 2012 BI: Designing Business Intelligence with Microsoft SQL Server 2012


DELIVERING BUSINESS INTELLIGENCE WITH REPORTING SERVICES

• Reporting Services • Report Server Architecture • Designing and Creating Reports


Reporting Scenarios
1. Scheduled Delivery of Standard Reports
2. On-Demand Access to Standard Reports
3. Embedded Reports and Dashboards
4. Request to IT for Custom Reports
5. Self-Service Reporting

Report Structure
A report project can contain a number of reports. Each report contains two distinct sets of instructions:
• Data definition: controls where the data for the report comes from and what information is to be selected from that data
  • Contains two parts: the data source and the dataset
• Report layout: controls how the information is presented on the screen or on paper

The information in the data definition and the report layout is stored in XML format using the Report Definition Language (RDL).

Report Server
Report Catalog
• Hosts copies of the RDLs

Report Processor
• Retrieves the RDL for the report from the Report Catalog

Data Providers
• Know how to retrieve the information from a data source

Renderers
• Work with the report processor to read through the report layout

Request Handler
• Responsible for receiving requests for reports and passing those requests on to the report processor


The Distributed Installation
Reporting Services items are split between two computers that work together to create a complete Reporting Services system:
• Database server: hosts SQL Server 2012, which in turn hosts the databases that make up the Report Catalog
• Report server: runs the Reporting Services Windows service

6 7/31/2012 7/31/2012

6

©2007 – Body Temple

3

31/07/2012

The Scale-Out Installation
A specialized form of the distributed installation:
• A single database server interacts with several report servers
• Each of the report servers uses the same set of Report Catalog databases for its information
• Allows us to handle more simultaneous users

Review New & Enhanced Features SQL2012

SQL Server 2012 Reporting Services
• Power View – Interactive self-service reporting
• Greater integration with SharePoint Server
• Improved Rendering to Microsoft Word and Excel formats
• Data Alerts – E-mail based notifications of changes to report data

SQL Server 2008 R2 Reporting Services
• New Data Visualizations
  ◦ Sparklines and Data Bars
  ◦ Indicators
  ◦ Maps
• Self-Service Reporting
  ◦ Enhanced Report Builder
  ◦ Shared Datasets
  ◦ Report Parts
• Authoring Enhancements
  ◦ Textbox Rotation
  ◦ Lookup Functions
  ◦ Aggregations of Aggregates
  ◦ Pagination Enhancements

Self-Service Reporting
• Empower information workers
• Supplement standard reports
• Reduce IT workload
• Supported by:
  ◦ Report Builder 3.0
  ◦ Shared Data Sources and Datasets
  ◦ Report Parts

SharePoint Integration
• Implemented as a SharePoint 2010 Shared Service
• A SharePoint site provides the UI for report server content and operations
• Managed through SharePoint Central Administration
• Creates an integrated, consistent reporting environment for organizations that use SharePoint

Scaling Out Reporting Services in a SharePoint Farm
• Add application servers to the SharePoint farm to scale out report processing services
  ◦ Install Reporting Services in SharePoint Mode
• Add Web front-end servers to the SharePoint farm to balance user requests
  ◦ Install Reporting Services Add-in for SharePoint Products

[Diagram labels: Network Load Balancer (NLB); Web Front-End (WFE); Reporting Services Add-in; Application Service; Reporting Services SharePoint Mode; Database Server]

FALLING INTO PLACE – MANAGING REPORTING SERVICES

• Report Manager
• Managing Reports on the Report Server
• Ad Hoc Reporting

Folders and The Report Manager
Folders can be created in the Report Manager to group reports and other items. A folder can contain:
• Reports
• Supporting files (external images, shared data sources, etc.)
• Other folders

The Report Manager application provides a straightforward method for creating and navigating folders in the Report Catalog. By default, the Report Manager site is installed in the default website on the server, in a virtual directory called Reports: http://ComputerName/reports

Deploying Reports Using the Report Designer
The most common method of moving reports to the report server.

Uploading Reports Using Report Manager
Another common method of moving a report to the report server is to upload it through the Report Manager.

Security
In Reporting Services, security is designed for:
• Flexibility
  ◦ Individual access rights can be assigned to each folder and to each item within the folder
• Ease of management
  ◦ Security inheritance
  ◦ Security roles
  ◦ Integration with Windows security

Integration with Windows Security
Reporting Services does not maintain its own list of users and passwords. It depends entirely on integration with Windows security.
Note: Custom security is possible, but it is not advisable because of its complexity.

Tasks and Rights
Each task in Reporting Services has a corresponding right. Tasks come in two flavors:
• Security tasks
• System-wide security tasks

Roles
The rights to perform tasks are grouped together to create roles. Reporting Services includes several predefined roles:
• The Browser Role
• The Publisher Role
• The My Reports Role
• The Content Manager Role
• The System User Role
• The System Administrator Role

Role assignments can also be created for folders, reports, or resources. Folders (except the Home folder) also inherit the role assignments of their parent folders.
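The inheritance rule can be pictured with a short Python sketch. It is only a model under stated assumptions: the folder tree, the account names, and the rule that a folder with its own role assignments stops inheriting from its parent are all illustrative, and nothing here calls a Reporting Services API.

```python
# Hypothetical folder tree: each folder path maps to its parent path.
PARENTS = {
    "/": None,
    "/Sales": "/",
    "/Sales/Regional": "/Sales",
    "/Finance": "/",
}

# Hypothetical explicit role assignments: folder -> {user or group: role}.
ASSIGNMENTS = {
    "/": {"DOMAIN\\Admins": "Content Manager"},
    "/Sales": {"DOMAIN\\SalesTeam": "Browser"},
}


def effective_assignments(folder):
    """Walk up the tree to the nearest folder that has its own assignments."""
    current = folder
    while current is not None:
        if current in ASSIGNMENTS:
            return ASSIGNMENTS[current]
        current = PARENTS[current]
    return {}


# /Sales/Regional has no assignments of its own, so it inherits from /Sales.
print(effective_assignments("/Sales/Regional"))
# /Finance inherits from the Home folder.
print(effective_assignments("/Finance"))
```

Keeping explicit assignments near the top of the tree and letting lower folders inherit is what makes this scheme easy to manage.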


Linked Reports
The linked report is deployed to one folder. It is then pointed to (linked to) from links placed elsewhere within the Report Catalog. To the user, the links look just like a report.

Report Caching
Report caching is an option that can be turned on individually for each report on the report server to speed the rendering of reports to users:
• The report server saves a copy, or instance, of the report in a temporary location the first time the report is executed
• On subsequent executions, with the same parameter values chosen, the report server pulls the information necessary to render the report from the report cache

Cached reports are assigned an expiration date and time. Cached reports must use stored credentials for the shared data sources.
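The caching behavior described above can be sketched in a few lines of Python. This is a conceptual model only, not how the report server is implemented: the class name ReportCache, the fake_render function, and the 30-minute lifetime are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


class ReportCache:
    """Toy cache keyed by report path plus the chosen parameter values."""

    def __init__(self, lifetime_minutes):
        self.lifetime = timedelta(minutes=lifetime_minutes)
        self._entries = {}

    def get_or_render(self, report_path, parameters, render, now):
        """Return (instance, was_cache_hit) for the requested report."""
        key = (report_path, tuple(sorted(parameters.items())))
        entry = self._entries.get(key)
        if entry is not None and now < entry["expires"]:
            return entry["instance"], True
        # First execution, or the cached instance has expired: render again.
        instance = render(report_path, parameters)
        self._entries[key] = {"instance": instance, "expires": now + self.lifetime}
        return instance, False


render_calls = []


def fake_render(report_path, parameters):
    # Stand-in for the expensive work of querying data and building a report.
    render_calls.append((report_path, dict(parameters)))
    return f"{report_path} rendered with {sorted(parameters.items())}"


cache = ReportCache(lifetime_minutes=30)
start = datetime(2012, 7, 31, 9, 0)
later = start + timedelta(minutes=5)
expired = start + timedelta(minutes=45)

_, hit_first = cache.get_or_render("/Sales/Summary", {"Year": 2012}, fake_render, start)
_, hit_second = cache.get_or_render("/Sales/Summary", {"Year": 2012}, fake_render, later)
_, hit_other = cache.get_or_render("/Sales/Summary", {"Year": 2011}, fake_render, later)
_, hit_expired = cache.get_or_render("/Sales/Summary", {"Year": 2012}, fake_render, expired)
print(hit_first, hit_second, hit_other, hit_expired)
```

Only the second request is served from the cache: the first request populates it, the third uses different parameter values, and the fourth arrives after the expiration time.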


Execution Snapshots
An execution snapshot is another way to create a cached report instance:
• Created automatically
• Can be created on a scheduled basis
• Can be created as soon as the feature is turned on for a particular report

Advantage over caching:
• The first user to retrieve the report after the cache has expired does not need to wait for the report to be generated

Report History
The report history feature of the Report Manager enables us to keep copies of a report’s past executions:
• Lets us save the state of our data without having to save copies of the data itself
• We have to provide a default value for each report parameter
• Snapshots can start to pile up if we are not careful
• Snapshots are not lost if the definition of the underlying report is changed
• Just like the cached report instance, the report history snapshot contains both the report definition and the dataset

Standard Subscriptions
A request to push a particular report to a user or set of users:
• Self-serve operations
• Two delivery options: e-mail and file share
• Can use multiple subscriptions on one report (for example, different parameters, or end of the week and end of the month)
• To subscribe to a report or create a subscription for delivery to others, you must have rights to the Manage Individual Subscriptions task
  ◦ The Browser, Content Manager, and My Reports roles have rights to manage individual subscriptions

Data Driven Subscriptions
• Aka “mass mailing”
• Enables us to take a report and e-mail it to a number of people on a mailing list
• To create a data-driven subscription for a report, you must have rights to the Manage All Subscriptions task
  ◦ Only the Content Manager role has the rights to this task

While a data-driven subscription is a scheduled process rather than one triggered by a particular event, we can make it behave almost as if it were event-driven by using stored procedures.

Ad Hoc Reporting
Users may need to create reports that are one-time in nature or that cannot wait until a report developer is available.
The Report Builder, along with the Report Models, provides a means for end users to explore their data without having to learn the ins and outs of SELECT statements and query builders.
The Report Model:
• Provides a nontechnical user with a view of database content without requiring an intimate knowledge of relational theory and practice
• Hides all of the complexity of primary keys and foreign key constraints

Cleaning Up the Report Model
• Remove any numeric aggregates that don’t make sense
• Remove attributes that should not be present
• Rename entities that have cryptic names
• Put the proper items in the Lookup folder
• Use folders to organize entities, attributes, and roles
• Rearrange the entity, attribute, and role order
• Manually create calculated attributes
• Add descriptions
• Create perspectives coinciding with business areas

Entities, Roles, and Fields
Reports are created in the Report Builder using entities, roles, and fields.

Entities are the objects or processes that our data knows something about.
• Can be grouped together in entity folders within the Report Model or in perspectives to help keep things organized

Roles show us how one entity relates to another entity.
• Enable us to show information from multiple entities together on a single report in a meaningful manner

Fields are bits of information: a product name, a machine number, or a date of manufacture.
• Fields are what we place on our reports to display these bits of information

Using Reporting Services without the Report Manager
When a custom application is used to access reports, it is not feasible to use the Report Manager. Other approaches are:
• URL Access
• Web Services Access
• The Report Viewer Control
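As an illustration of URL access, the Python sketch below assembles a report URL that a custom application could request. The helper name build_report_url, the report path, and the Year parameter are hypothetical; the sketch also assumes the report server web service uses a virtual directory named ReportServer, and it relies on the rs:Command and rs:Format URL access parameters.

```python
from urllib.parse import quote, urlencode


def build_report_url(server, report_path, parameters=None, render_format=None):
    """Build a URL-access request for a report (illustrative helper)."""
    # The report path follows the "?" and keeps its slashes unescaped.
    base = f"http://{server}/ReportServer?{quote(report_path)}"
    # Report parameters come first, then report server (rs:) options.
    query = dict(parameters or {})
    query["rs:Command"] = "Render"
    if render_format:
        query["rs:Format"] = render_format
    # Keep ":" unescaped so the rs: prefix stays readable.
    return base + "&" + urlencode(query, quote_via=quote, safe=":")


url = build_report_url("ComputerName", "/Sales/Yearly Sales", {"Year": 2012}, "PDF")
print(url)
```

The application can then open this URL in a browser frame or download the rendered file, with no need to go through the Report Manager.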


Tabular Data Model
• An in-memory database that uses xVelocity in-memory technologies
• Based on the widely understood relational model
• Quick and easy to create
• Faster time to deployment
• Easier to learn than multidimensional models, so has a lower barrier to entry
• Scalability from desktop BI to organizational BI

Options for Creating Tabular Data Models
• Tabular Data Models in PowerPivot for Excel
  ◦ Create a tabular data model in a Microsoft Excel workbook
  ◦ Importing data automatically creates a tabular data model
  ◦ The data is stored in the Excel workbook
• Tabular Data Models in Microsoft SQL Server 2012 Analysis Services
  ◦ Create a tabular data model by using SQL Server Data Tools
  ◦ The data is stored in SQL Server 2012 Analysis Services
  ◦ There are additional features to support larger, more complex solutions: row-level security, DirectQuery mode, partitioning, and deployment options

PowerPivot Technologies
• PowerPivot for Excel
  ◦ Sophisticated desktop data analysis solution
  ◦ Increased autonomy for information workers
  ◦ Fast query response times
  ◦ DAX for custom measures and calculated columns
  ◦ Diagram view for management of tables and relationships
  ◦ Hierarchies and perspectives

PowerPivot Technologies
• PowerPivot for SharePoint
  ◦ Portal for sharing and collaboration
  ◦ Gallery to browse and access workbooks and reports
  ◦ Server-side processing enables users to open workbooks in a browser
  ◦ Central management and security for workbooks

Features in PowerPivot
• Diagram view
• Hierarchies
• Perspectives
• Support for multiple relationships between tables
• The ability to sort one column by the values in another column
• New DAX functions
• Reporting properties

Importing Tables from a Data Source
• Create data source connections in the Excel PowerPivot window
• Use a wide range of connection options, including common third-party databases
• Automatically add related tables
• Filter out columns that are not required for analysis:
  ◦ Improves PowerPivot performance
  ◦ Simplifies user experience
• Provide table aliases for ease of use

Sharing PowerPivot for Excel Workbooks
• Upload PowerPivot for Excel workbooks to PowerPivot Gallery on SharePoint:
  ◦ Browse workbooks and reports in the gallery
  ◦ View them in Windows Internet Explorer
  ◦ Open them in Excel for further analysis
• Use uploaded workbooks as data sources for Excel

Using PowerPivot Gallery
• Shows thumbnail previews of PowerPivot workbooks
• Offers different viewing options:
  ◦ Gallery
  ◦ All Documents
  ◦ Theater
  ◦ Carousel
• Click a workbook to open it in Internet Explorer

Review of DAX
• DAX is a formula-based language for building business logic and queries in tabular data models:
  ◦ Calculated columns
  ◦ Measures
  ◦ Queries
• Its syntax is similar to Microsoft Excel formulas:
  ◦ It uses functions, operators, and values
  ◦ It is easy to use and already familiar to information workers
• It differs from Excel formulas in key ways:
  ◦ It is designed to work with relational data, not data ranges
  ◦ It includes more advanced functionality
• There are new functions to expand DAX

DAX Functions
• Text functions
• Information functions
• Filter and value functions
• Logical functions
• Mathematical and trigonometric functions
• Statistical and aggregation functions
• Date and time functions
• Time intelligence functions

DAX Syntax and Data Types
• DAX formulas start with the equal sign (=) followed by an expression
• Expressions can contain functions, operators, constants, and references to columns
• Column references:
  ◦ Fully qualified name: 'table name'[column name]
  ◦ Unqualified name: [column name]
• Measure names must be enclosed in brackets
• DAX uses eight data types

Aggregations
• Summarize underlying detailed data for analysis
• Use automatic aggregation for simple calculations:
  ◦ SUM
  ◦ COUNT
  ◦ MIN
  ◦ MAX
  ◦ AVERAGE
• Create a measure for more complex aggregations

SUM('Reseller Sales'[Sales Amount])

Context
• A DAX measure or calculated column defines a field that you can use in a PivotTable table or a PivotChart chart
• The exact values that appear in PivotTable tables and PivotChart charts vary with context:
  ◦ Row context
  ◦ Query context
  ◦ Filter context

DAX Queries
• Client applications, such as the Power View reporting tool, issue DAX queries
• You can write queries manually by using the DAX query language
• You can filter, order, and summarize results

EVALUATE(FILTER('Reseller Sales', 'Reseller Sales'[OrderDateKey] > 20040101))

Calculated Columns
• Named columns that are populated by using a DAX formula
• Create calculated columns in the Data View window of a PowerPivot for Excel workbook:
  ◦ Add a new column
  ◦ Provide a name
  ◦ Enter a DAX expression in the formula bar
• A value is calculated for each row in the table when the calculated column is created
• Use calculated columns in PivotTable tables, PivotChart charts, slicers, and measure definitions

=CONCATENATE('Employee'[First Name], CONCATENATE(" ", 'Employee'[Last Name]))

Measures
• Named formulas that can contain sophisticated business logic:
  ◦ Implicit measures
  ◦ Explicit measures
• Create explicit measures in two places:
  ◦ The PowerPivot Field List in an Excel worksheet
  ◦ The Measure Grid in the table view in the PowerPivot window
• Use measures in PivotTable tables and PivotChart charts

IF([Previous Year], ([Sum of Sales Amount] - [Previous Year])/[Previous Year], BLANK())
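The logic of this measure can be restated in Python for readers more familiar with general-purpose code. This is a sketch of the arithmetic only, not DAX: the function name yoy_growth is hypothetical, and None is used here as a stand-in for BLANK().

```python
def yoy_growth(current_sales, previous_sales):
    """Year-over-year growth as a fraction, or None when it is undefined."""
    # Like IF([Previous Year], ...): a blank or zero prior year gives blank,
    # which also avoids dividing by zero.
    if not previous_sales:
        return None
    return (current_sales - previous_sales) / previous_sales


print(yoy_growth(150.0, 100.0))  # growth of one half
print(yoy_growth(150.0, 0.0))    # no prior-year sales, so no result
print(yoy_growth(150.0, None))   # prior year is blank, so no result
```

Returning a blank rather than an error keeps rows without prior-year data from cluttering a PivotTable.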


Multiple Relationships
• Tabular data models support multiple relationships between tables
• Only one relationship is active at a time
• The active relationship is used by default in DAX formulas
• The USERELATIONSHIP function enables you to select the relationship that you want to use for a specific formula

=CALCULATE(SUM('Reseller Sales'[Sales Amount]), USERELATIONSHIP('Reseller Sales'[ShipDateKey], 'Date'[DateKey]))

Time Intelligence
• Compare data from one time period against equivalent data from a different time period
• The tabular model should contain a separate table that contains only date information
• The date table should have a continuous range of dates without any gaps
• The column in the date table that uses the date data type should use day as the lowest level of granularity
• Mark the table as Date Table to use time intelligence functions against that table
• Use time intelligence functions to build measures

CALCULATE([Sum of Sales Amount], DATEADD('Date'[FullDateAlternateKey], -1, YEAR))
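The requirement for a continuous date range can be checked before the table is loaded. The Python sketch below is one possible pre-load check, not part of PowerPivot or Analysis Services; the function name find_date_gaps and the sample dates are hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def find_date_gaps(dates):
    """Return (first_missing, last_missing) pairs for each gap in the dates."""
    ordered = sorted(set(dates))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        # Consecutive calendar days differ by exactly one day.
        if later - earlier > ONE_DAY:
            gaps.append((earlier + ONE_DAY, later - ONE_DAY))
    return gaps


sample_dates = [
    date(2012, 7, 29),
    date(2012, 7, 30),
    date(2012, 8, 2),  # 31 July and 1 August are missing
    date(2012, 8, 3),
]
gaps = find_date_gaps(sample_dates)
print(gaps)
```

An empty result means the date column has day-level granularity with no gaps, which is what the time intelligence functions expect.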


Dynamic Measures
• Calculate different values for each row in a PivotTable table:
  ◦ Create a linked table that contains the input values for the dynamic measure
  ◦ Use the HASONEVALUE function to check that there is a single input value for each row
  ◦ Use conditional logic to apply different calculations to each row based on the input value

IF([Check Single Values], SWITCH(VALUES('Time Period'[Period]), "Current Year", [Sum of Sales Amount], "Previous Year", [Previous Year], "YOY Growth", IF(NOT(ISBLANK([Previous Year])), [Sum of Sales Amount] - [Previous Year], BLANK())), BLANK())

Part 4: Cloud Technologies in a BI Solution
MCSE SQL Server 2012 BI
Designing Business Intelligence with Microsoft SQL Server 2012


Cloud Technologies in a BI Solution
• Cloud Data Sources
• SQL Azure
• SQL Azure Reporting Services
• The Windows Azure Marketplace DataMarket

Cloud Data Scenarios
[Diagram labels: Application Databases; Third-Party Data]

Microsoft Cloud Platform for Data
[Diagram labels: Windows Azure Marketplace DataMarket; SQL Azure; Data Sync; Reporting Databases]

Cloud Data and Services in the BI Ecosystem
[Diagram labels: Windows Azure Marketplace DataMarket; DQS KB; Data Cleansing; SQL Azure; ETL Staging Process; Staging Database; ETL Load Process; Data Warehouse; Data Sync]

Comparing SQL Azure with SQL Server

Topology of SQL Azure
[Diagram labels: Load Balancer; TDS]

Using SQL Azure as a Data Source for a Data Warehouse
[Diagram labels: SSIS; Data Sync]

SQL Azure Reporting
• Cloud-based reporting
• Create reports with the same tools as on-premises Reporting Services
• Two core scenarios:
  ◦ Operational reports for Windows Azure SQL Database
  ◦ Embedded reports in Windows or Azure applications

Windows Azure Marketplace DataMarket Data Scenarios
[Diagram labels: Windows Azure Marketplace DataMarket; DQS KB; Data Cleansing; SSIS]