Introduction of SQL Server Analysis Services
SSAS Stand for SQL Server Analysis
Services. It is the On-Line Analytical Processing (OLAP)
Component of Microsoft SQL Server. SSAS allows you to build
multidimensional structures called Cubes to pre-calculate and store complex
aggregations, and also to build mining models to perform data analysis to
identify valuable information like trends, patterns, relationships etc. within
the data using Data Mining capabilities of SSAS, which otherwise could be
really difficult to determine without Data Mining capabilities.
The SSAS main idea is to trade off increased storage space now for speed of
querying later. OLAP does this by pre-calculating and storing aggregates. This
takes a lot of disk space, but it means that when you want to explore the data
you can do it so quickly.
Online
Analytical Processing Server (OLAP):
Online Analytical
Processing Server (OLAP) is based on multidimensional data model. It allows the
managers, analysts to get insight the information through fast, consistent,
interactive access to information. In this chapter we will discuss about types
of OLAP, operations on OLAP, Difference between OLAP and Statistical Databases
and OLTP.
Types of OLAP Servers
We have four types
of OLAP servers that are listed below.
I.
Relational OLAP (ROLAP)
II.
Multidimensional OLAP
(MOLAP)
III.
Hybrid OLAP (HOLAP)
IV.
Specialized SQL
Servers
I. Relational OLAP (ROLAP):
The Relational OLAP
servers are placed between relational back-end server and client front-end
tools. To store and manage warehouse data the Relational OLAP use relational or
extended-relational DBMS.
ROLAP includes the
following.
Implementation
of aggregation navigation logic.
Optimization
for each DBMS back end.
Additional
tools and services.
II. Multidimensional OLAP (MOLAP):
Multidimensional
OLAP (MOLAP) uses the array-based multidimensional storage engines for
multidimensional views of data. With multidimensional data stores, the storage
utilization may be low if the data set is sparse. Therefore many MOLAP Server
uses the two level of data storage representation to handle dense and sparse
data sets.
III. Hybrid OLAP (HOLAP):
The hybrid OLAP
technique combination of ROLAP and MOLAP both. It has both the higher
scalability of ROLAP and faster computation of MOLAP. HOLAP server allows storing
the large data volumes of detail data. The aggregations are stored separated in
MOLAP store.
Specialized SQL
servers provides advanced query language and query processing support for SQL
queries over star and snowflake schemas in a read-only environment.
IV. OLAP Operations:
As we know that the
OLAP server is based on the multidimensional view of data hence we will discuss
the OLAP operations in multidimensional data.
Here is the list of
OLAP operations.
A.
Roll-up
B.
Drill-down
C.
Slice and dice
D.
Pivot (rotate)
A. ROLL-UP
This operation
performs aggregation on a data cube in any of the following way:
Ø
By climbing up a
concept hierarchy for a dimension
Ø
By dimension
reduction.
Consider the
following diagram showing the roll-up operation.
Ø
The roll-up operation
is performed by climbing up a concept hierarchy for the dimension location.
Ø
Initially the concept
hierarchy was "street < city < province < country".
Ø
On rolling up the data
is aggregated by ascending the location hierarchy from the level of city to
level of country.
Ø
The data is grouped
into cities rather than countries.
Ø
When roll-up operation
is performed then one or more dimensions from the data cube are removed.
B. DRILL-DOWN
Drill-down
operation is reverse of the roll-up. This operation is performed by either of
the following way:
Ø
By stepping down a
concept hierarchy for a dimension.
Ø
By introducing new
dimension.
Consider the
following diagram showing the drill-down operation:
Ø
The drill-down
operation is performed by stepping down a concept hierarchy for the dimension
time.
Ø
Initially the concept
hierarchy was "day < month < quarter < year."
Ø
On drill-up the time
dimension is descended from the level quarter to the level of month.
Ø
When drill-down
operation is performed then one or more dimensions from the data cube are
added.
Ø
It navigates the data
from less detailed data to highly detailed data.
C. SLICE
The slice operation
performs selection of one dimension on a given cube and gives us a new sub
cube. Consider the following diagram showing the slice operation.
Ø
The Slice operation is
performed for the dimension time using the criterion time ="Q1".
Ø
It will form a new sub
cube by selecting one or more dimensions.
DICE
The Dice operation
performs selection of two or more dimension on a given cube and gives us a new sub
cube. Consider the following diagram showing the dice operation:
The dice operation
on the cube based on the following selection criteria that involve three
dimensions.
Ø
(location =
"Toronto" or "Vancouver")
Ø
(time = "Q1"
or "Q2")
Ø
(item ="
Mobile" or "Modem").
D.PIVOT
The pivot operation
is also known as rotation. It rotates the data axes in view in order to provide
an alternative presentation of data. Consider
the following diagram showing the pivot operation.
In this the item
and location axes in 2-D slice are rotated.
OLAP vs OLTP
SN
|
Data Warehouse (OLAP)
|
Operational Database(OLTP)
|
1
|
This involves historical processing of information.
|
This involves day to day processing.
|
2
|
OLAP systems are used by knowledge workers such as executive,
manager and analyst.
|
OLTP systems are used by clerk, DBA, or database professionals.
|
3
|
This is used to analysis the business.
|
This is used to run the business.
|
4
|
It focuses on Information out.
|
It focuses on Data in.
|
5
|
This is based on Star Schema, Snowflake Schema and Fact
Constellation Schema.
|
This is based on Entity Relationship Model.
|
6
|
It focuses on Information out.
|
This is application oriented.
|
7
|
This contains historical data.
|
This contains current data.
|
8
|
This provides summarized and consolidated data.
|
This provides primitive and highly detailed data.
|
9
|
This provides summarized and multidimensional view of data.
|
This provides detailed and flat relational view of data.
|
10
|
The number or users are in Hundreds.
|
The number of users is in thousands.
|
11
|
The numbers of records accessed are in millions.
|
The numbers of records accessed are in tens.
|
12
|
The database size is from 100GB to TB
|
The database size is from 100 MB to GB.
|
13
|
This is highly flexible.
|
This provides high performance.
|
The basic
concepts of SSAS:
Cube
Dimension table
Dimension
Level
Fact table
Measure
Schema
SSAS Architecture:
Server Architecture for
SSAS:
Client Architecture:
Cube
The basic unit of storage and analysis in
Analysis Services is the cube. A cube is a collection of data that’s been
aggregated to allow queries to return data quickly. For example, a cube of
order data might be aggregated by time period and by title, making the cube
fast when you ask questions concerning orders by week or orders by title.
Cubes are ordered into dimensions and measures. Dimensions come from dimension tables, while measures come from fact tables.
Cubes are ordered into dimensions and measures. Dimensions come from dimension tables, while measures come from fact tables.
Dimension table
A dimension table contains hierarchical data by
which you’d like to summarize. Examples would be an Orders table that you might
group by year, month, week, and day of receipt or a Books table that you might
want to group by genre and title.
Dimension
Each cube has one or more dimensions, each based
on one or more dimension tables. A dimension represents a category for
analyzing business data: time or category in the examples above. Typically, a
dimension has a natural hierarchy so that lower results can be “rolled up” into
higher results. For example, in a geographical level you might have city totals
aggregated into state totals, or state totals into country totals.
Level
Each type of summary that can be retrieved from a
single dimension is called a level. For example, you can speak of a week level
or a month level in a time dimension.
Fact table
A fact table contains the basic information that
you wish to summarize. This might be order detail information, payroll records,
drug effectiveness information, or anything else that’s amenable to summing and
averaging. Any table that you’ve used with a Sum or Avg function in a totals
query is a good bet to be a fact table.
Measure
Every cube will contain one or more measures,
each based on a column in a fact table that you’d like to analyze. In the cube
of book order information, for example, the measures would be things such as
unit sales and profit.
Schema
Fact tables and dimension tables are related,
which is hardly surprising, given that you use the dimension tables to group
information from the fact table. The relations within a cube form a schema.
There are two basic OLAP schemas: star and snowflake. In a star schema, every
dimension table is related directly to the fact table. In a snowflake schema,
some dimension tables are related indirectly to the fact table. For example, if
your cube includes OrderDetails as a fact table, with Customers and Orders as
dimension tables, and Customers is related to Orders, which in turn is related
to OrderDetails, then you’re dealing with a snowflake schema.
Workflow Diagram:
Starting
with an existing relational database or your data layers, this is the database
you want to provide the data for your reports or data mining activities, this
is your transnational database or your OLTP database. You may have an
environment with a data warehouse, this database could also be used to provide
the data for your OLAP reports and data mining activities the great thing about
analysis services you can use either an OLTP database or data warehouse
database as a source for your data.
Type
of schemas:
1.
STAR Schema
2.
SNOWFLAKE Schema
3.
STARFLAKE
Schema
1. STAR Schema:
This
is the simplest Schema design available and in the star schema design, a single
object (the fact table) sits in the middle and is radially connected
to other surrounding objects (dimension lookup tables) like a star.
Each dimension is represented as a single table. The primary key in
each dimension table is related to a foreign key in the fact table. In
simple when ALL the DIMENSION TABLES are DIRECTLY related to FACT TABLE then that
Schema is called as Star Schema. PFB the Screenshot showing STAR Schema.
2. SNOWFLAKE Schema:
The
snowflake schema is an extension of the star schema, where each point of the
star explodes into more points. In a star schema, each dimension is represented
by a single dimensional table, whereas in a snowflake schema, that dimensional
table is normalized into multiple lookup tables, each representing a level in
the dimensional hierarchy. When ALL the dimension tables are NOT directly
related to FACT table then that type of schema is called as Snow Flake Schema.
PFB Screenshot of Snow Flake Schema.
3. STARFLAKE Schema:
A
hybrid structure that contains a mixture of star (denormalized) and snowflake
(normalized) schema. A starflake schema is a
combination of a star schema and a snowflake schema. Starflake schemas are
snowflake schemas where only some of the dimension tables have been
denormalized. Starflake schemas aim to
leverage the benefits of both star schemas and snowflake schemas. The
hierarchies of star schemas are denormalized, while the hierarchies of
snowflake schemas are normalized. Starflake schemas are normalized to remove any
redundancies in the dimensions. To normalize the schema, the shared dimensional
hierarchies are placed in outriggers.
Great artcile, yet it would be better if in future you can share more about this subject.
ReplyDeletemsbi training in Hyderabad
Thanks for your valuable feedback
ReplyDeleteNice and good article. It is very useful for me to learn and understand easily. Thanks for sharing your valuable information and time. Please keep updating MicroStrategy Online Training Bangalore
ReplyDeleteThanks for your feedback.
DeleteThanks for feedback
ReplyDeleteThrough this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training
ReplyDeleteThanks for your feedback.
DeleteExcellent Blog Article. Thanks for sharing this information with us.
ReplyDeleteSnowflake Training
Snowflake Online Training
Snowflake Online Training in Hyderabad
Snowflake Course Online
Snowflake Training in Ameerpet
Snowflake Online Training institute
Thanks for your feedback.
DeleteThanks for your feedback.
ReplyDeleteExcellent content ,Thanks for sharing this .,
ReplyDeleteCSPO certification
Thanks for your valuable feedback and please subscribe my blog for more updates.
DeleteThanks for sharing such an informative post. We are providing the best services click on below links to visit our website.
ReplyDeleteSnowflake Training
Snowflake Training in Hyderabad
Snowflake Online Training
Snowflake Online Training Hyderabad
Snowflake Training Online
Snowflake Training in Ameerpet
Snowflake Training Institute in Hyderabad
Thanks for your valuable feedback and please subscribe my blog for more updates.
Delete
ReplyDeleteImpressive Article . Thanks for sharing the information.
Snowflake Training
Snowflake Training in Hyderabad
Snowflake Online Training
Snowflake Online Training Hyderabad
Snowflake Training Online
Snowflake Training in Ameerpet
Snowflake Training Institute in Hyderabad
Thanks for your valuable feedback and please subscribe my blog for more updates.
Delete