Slowly Changing Dimension Transformation
The Slowly Changing Dimension transformation coordinates the
updating and inserting of records in data warehouse dimension tables.
The Slowly
Changing Dimension transformation provides the following functionality for
managing slowly changing dimensions:
·
Matching incoming
rows with rows in the lookup table to identify new and existing rows.
·
Identifying incoming
rows that contain changes when changes are not permitted.
·
Identifying inferred
member records that require updating.
·
Identifying incoming
rows that contain historical changes that require insertion of new records and
the updating of expired records.
·
Detecting incoming
rows that contain changes that require the updating of existing records,
including expired ones.
The Slowly
Changing Dimension transformation supports four types of changes: changing
attribute, historical attribute, fixed attribute, and inferred member.
·
Changing attribute
changes overwrite existing records. This kind of change is equivalent to a Type
1 change. The Slowly Changing Dimension transformation directs these rows to an
output named Changing
Attributes Updates Output.
·
Historical attribute
changes create new records instead of updating existing ones. The only change
that is permitted in an existing record is an update to a column that indicates
whether the record is current or expired. This kind of change is equivalent to
a Type 2 change. The Slowly Changing Dimension transformation directs these
rows to two outputs: Historical Attribute Inserts Output and New Output.
·
Fixed attribute
changes indicate the column value must not change. The Slowly Changing
Dimension transformation detects changes and can direct the rows with changes
to an output named Fixed Attribute Output.
·
Inferred member indicates
that the row is an inferred member record in the dimension table. An inferred
member exists when a fact table references a dimension member that is not yet
loaded. A minimal inferred-member record is created in anticipation of relevant
dimension data, which is provided in a subsequent loading of the dimension
data. The Slowly Changing Dimension transformation directs these rows to an
output named Inferred
Member Updates. When data for the inferred member is loaded, you can
update the existing record rather than create a new one.
I
have created two tables one Test_Stage(Which I am using as Staging table) and
another one is Test(Which I am using as Main table) with the following syntax.
You can clearly see that Main table is having TWO Extra columns to track
history.
Create table Test_Stage
(
Emp_Id int,
Emp_Name Varchar(100),
Country Varchar(100)
);
insert into
Test_Stage
values('1','Balu','India');
Insert into
Test_Stage
values('2','Ranga','UK');
insert into
Test_Stage
values('3','Rayudu','USA');
insert into
Test_Stage
values('4','Mani','UAE');
insert into
Test_Stage
values('5','Sarath','Pak');
Create table
Test
(
Emp_Id int,
Emp_Name
Varchar(100),
Country
Varchar(100),
Start_Date
Datetime,
End_Date
DateTime
);
Step 1:
Take a Data Flow Task and edit it.
Step 2:
Take OLE DB source and edit it and take Data Base as
BALU.
Step 3:
Take table as [dbo].[Test_Stage].
Step 4:
Take Slowly Changing Dimension transformation and
edit it.
Step 5:
Select the dimension table (in our case it is “Test”) and
Keys under “Select Dimension Table and Keys” page as shown below.
Step
6:
In the above screenshot we can see that for
Country attribute I selected “Historical Attribute” as I wish to track the
historical changes for that attribute and hence whenever there is a change in
that particular attribute column value then only it tracks history by creating
a new record and for other attribute Emp_Name, it will directly apply UPDATE on
the table.
Step 7:
Under “Historical Attributes
Option” page select “Use Start and End date to identify current and expired
records” option and by selecting start date column and End date column
Step
8:
Once
you click on Finish, SSIS server creates all the functionality that is required
to implement SCD Type 1 based on the information provided by us. It will add
Data flow tasks for Inserting new records and updating existing records as
shown below.
Step
9:
I run the package then it will check for newly added records
and inserts those records and updates the already existing records if there is
any change in data. As there is NO RECORDS in TEST table in this run it will
load all the records under New Insert as shown below.
Step 10:
Now if you run the queries you can see the
data updated in TEST table.
Nice blog. Thank For sharing Valuable Information
ReplyDeleteMsbi Online Training in Hyderabad
Msbi Online Training in India
Thank you for valuable feedback
Delete