Introduction of SQL Server Analysis Services

SSAS Stand for SQL Server Analysis Services. It is the On-Line Analytical Processing (OLAP) Component of Microsoft SQL Server. SSAS allows you to build multidimensional structures called Cubes to pre-calculate and store complex aggregations, and also to build mining models to perform data analysis to identify valuable information like trends, patterns, relationships etc. within the data using Data Mining capabilities of SSAS, which otherwise could be really difficult to determine without Data Mining capabilities. The SSAS main idea is to trade off increased storage space now for speed of querying later. OLAP does this by pre-calculating and storing aggregates. This takes a lot of disk space, but it means that when you want to explore the data you can do it so quickly.

Online Analytical Processing Server (OLAP):
Online Analytical Processing Server (OLAP) is based on multidimensional data model. It allows the managers, analysts to get insight the information through fast, consistent, interactive access to information. In this chapter we will discuss about types of OLAP, operations on OLAP, Difference between OLAP and Statistical Databases and OLTP.

Types of OLAP Servers

We have four types of OLAP servers that are listed below.
I.                   Relational OLAP (ROLAP)
II.                Multidimensional OLAP (MOLAP)
III.             Hybrid OLAP (HOLAP)
IV.             Specialized SQL Servers

I. Relational OLAP (ROLAP):
The Relational OLAP servers are placed between relational back-end server and client front-end tools. To store and manage warehouse data the Relational OLAP use relational or extended-relational DBMS.
ROLAP includes the following.
Implementation of aggregation navigation logic.
Optimization for each DBMS back end.
Additional tools and services.

 

II. Multidimensional OLAP (MOLAP):

Multidimensional OLAP (MOLAP) uses the array-based multidimensional storage engines for multidimensional views of data. With multidimensional data stores, the storage utilization may be low if the data set is sparse. Therefore many MOLAP Server uses the two level of data storage representation to handle dense and sparse data sets.

III. Hybrid OLAP (HOLAP):

The hybrid OLAP technique combination of ROLAP and MOLAP both. It has both the higher scalability of ROLAP and faster computation of MOLAP. HOLAP server allows storing the large data volumes of detail data. The aggregations are stored separated in MOLAP store.
Specialized SQL servers provides advanced query language and query processing support for SQL queries over star and snowflake schemas in a read-only environment.

 

IV. OLAP Operations:

As we know that the OLAP server is based on the multidimensional view of data hence we will discuss the OLAP operations in multidimensional data.
Here is the list of OLAP operations.
A.    Roll-up
B.     Drill-down
C.     Slice and dice
D.    Pivot (rotate)

A. ROLL-UP
This operation performs aggregation on a data cube in any of the following way:
Ø  By climbing up a concept hierarchy for a dimension
Ø  By dimension reduction.

Consider the following diagram showing the roll-up operation.

Ø  The roll-up operation is performed by climbing up a concept hierarchy for the dimension location.
Ø  Initially the concept hierarchy was "street < city < province < country".
Ø  On rolling up the data is aggregated by ascending the location hierarchy from the level of city to level of country.
Ø  The data is grouped into cities rather than countries.
Ø  When roll-up operation is performed then one or more dimensions from the data cube are removed.

 

B. DRILL-DOWN

Drill-down operation is reverse of the roll-up. This operation is performed by either of the following way:
Ø  By stepping down a concept hierarchy for a dimension.
Ø  By introducing new dimension.
Consider the following diagram showing the drill-down operation:


Ø  The drill-down operation is performed by stepping down a concept hierarchy for the dimension time.
Ø  Initially the concept hierarchy was "day < month < quarter < year."
Ø  On drill-up the time dimension is descended from the level quarter to the level of month.
Ø  When drill-down operation is performed then one or more dimensions from the data cube are added.
Ø  It navigates the data from less detailed data to highly detailed data.

C. SLICE

The slice operation performs selection of one dimension on a given cube and gives us a new sub cube. Consider the following diagram showing the slice operation.


Ø  The Slice operation is performed for the dimension time using the criterion time ="Q1".
Ø  It will form a new sub cube by selecting one or more dimensions.

 

DICE

The Dice operation performs selection of two or more dimension on a given cube and gives us a new sub cube. Consider the following diagram showing the dice operation:


The dice operation on the cube based on the following selection criteria that involve three dimensions.
Ø  (location = "Toronto" or "Vancouver")
Ø  (time = "Q1" or "Q2")
Ø  (item =" Mobile" or "Modem").

 

D.PIVOT

The pivot operation is also known as rotation. It rotates the data axes in view in order to provide an alternative presentation of data.  Consider the following diagram showing the pivot operation.


In this the item and location axes in 2-D slice are rotated.

OLAP vs OLTP

 

SN
Data Warehouse (OLAP)
Operational Database(OLTP)
1
This involves historical processing of information.
This involves day to day processing.
2
OLAP systems are used by knowledge workers such as executive, manager and analyst.
OLTP systems are used by clerk, DBA, or database professionals.
3
This is used to analysis the business.
This is used to run the business.
4
It focuses on Information out.
It focuses on Data in.
5
This is based on Star Schema, Snowflake Schema and Fact Constellation Schema.
This is based on Entity Relationship Model.
6
It focuses on Information out.
This is application oriented.
7
This contains historical data.
This contains current data.
8
This provides summarized and consolidated data.
This provides primitive and highly detailed data.
9
This provides summarized and multidimensional view of data.
This provides detailed and flat relational view of data.
10
The number or users are in Hundreds.
The number of users is in thousands.
11
The numbers of records accessed are in millions.
The numbers of records accessed are in tens.
12
The database size is from 100GB to TB
The database size is from 100 MB to GB.
13
This is highly flexible.
This provides high performance.

The basic concepts of SSAS:
Cube
Dimension table
Dimension
Level
Fact table
Measure
Schema

SSAS Architecture:

Server Architecture for SSAS:


Client Architecture:


Cube

The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly. For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title. 
Cubes are ordered into dimensions and measures. Dimensions come from dimension tables, while measures come from fact tables.

Dimension table

A dimension table contains hierarchical data by which you’d like to summarize. Examples would be an Orders table that you might group by year, month, week, and day of receipt or a Books table that you might want to group by genre and title.

Dimension

Each cube has one or more dimensions, each based on one or more dimension tables. A dimension represents a category for analyzing business data: time or category in the examples above. Typically, a dimension has a natural hierarchy so that lower results can be “rolled up” into higher results. For example, in a geographical level you might have city totals aggregated into state totals, or state totals into country totals.

Level

Each type of summary that can be retrieved from a single dimension is called a level. For example, you can speak of a week level or a month level in a time dimension.

Fact table

A fact table contains the basic information that you wish to summarize. This might be order detail information, payroll records, drug effectiveness information, or anything else that’s amenable to summing and averaging. Any table that you’ve used with a Sum or Avg function in a totals query is a good bet to be a fact table.

Measure

Every cube will contain one or more measures, each based on a column in a fact table that you’d like to analyze. In the cube of book order information, for example, the measures would be things such as unit sales and profit.

Schema

Fact tables and dimension tables are related, which is hardly surprising, given that you use the dimension tables to group information from the fact table. The relations within a cube form a schema. There are two basic OLAP schemas: star and snowflake. In a star schema, every dimension table is related directly to the fact table. In a snowflake schema, some dimension tables are related indirectly to the fact table. For example, if your cube includes OrderDetails as a fact table, with Customers and Orders as dimension tables, and Customers is related to Orders, which in turn is related to OrderDetails, then you’re dealing with a snowflake schema.


Workflow Diagram:

Starting with an existing relational database or your data layers, this is the database you want to provide the data for your reports or data mining activities, this is your transnational database or your OLTP database. You may have an environment with a data warehouse, this database could also be used to provide the data for your OLAP reports and data mining activities the great thing about analysis services you can use either an OLTP database or data warehouse database as a source for your data.


Type of schemas:
1.      STAR Schema
2.      SNOWFLAKE Schema
3.      STARFLAKE Schema

1. STAR Schema:
This is the simplest Schema design available and in the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) like a star. Each dimension is represented as a single table. The primary key in each dimension table is related to a foreign key in the fact table. In simple when ALL the DIMENSION TABLES are DIRECTLY related to FACT TABLE then that Schema is called as Star Schema. PFB the Screenshot showing STAR Schema.


2. SNOWFLAKE Schema:
The snowflake schema is an extension of the star schema, where each point of the star explodes into more points. In a star schema, each dimension is represented by a single dimensional table, whereas in a snowflake schema, that dimensional table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy. When ALL the dimension tables are NOT directly related to FACT table then that type of schema is called as Snow Flake Schema. PFB Screenshot of Snow Flake Schema.


3. STARFLAKE Schema:
A hybrid structure that contains a mixture of star (denormalized) and snowflake (normalized) schema. A starflake schema is a combination of a star schema and a snowflake schema. Starflake schemas are snowflake schemas where only some of the dimension tables have been denormalized. Starflake schemas aim to leverage the benefits of both star schemas and snowflake schemas. The hierarchies of star schemas are denormalized, while the hierarchies of snowflake schemas are normalized. Starflake schemas are normalized to remove any redundancies in the dimensions. To normalize the schema, the shared dimensional hierarchies are placed in outriggers.








Comments

  1. Great artcile, yet it would be better if in future you can share more about this subject.
    msbi training in Hyderabad

    ReplyDelete
  2. Thanks for your valuable feedback

    ReplyDelete
  3. Nice and good article. It is very useful for me to learn and understand easily. Thanks for sharing your valuable information and time. Please keep updating MicroStrategy Online Training Bangalore

    ReplyDelete
  4. Through this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training

    ReplyDelete
  5. Excellent content ,Thanks for sharing this .,
    CSPO certification

    ReplyDelete
    Replies
    1. Thanks for your valuable feedback and please subscribe my blog for more updates.

      Delete
  6. Replies
    1. Thanks for your valuable feedback and please subscribe my blog for more updates.

      Delete

Post a Comment

Hi User,
Thanks for visiting My Blog and please provide your valuable feedback and subscribe for more updates. Please don't post any spam content or comments.
Thank You

Popular Posts

Failed to execute the package or element. Build errors were encountered

Exception deserializing the package "The process cannot access the file because it is being used by another process."

Temporary enable and disable SSRS subscriptions