The Slowly Changing Dimension transformation in SSIS coordinates the inserting and updating records in data warehouse dimension tables. This transformation supports four types of changes, and in this article, we will explain about SSIS Slowly Changing Dimension Type 2 (also called as SCD Historical attribute or SCD 2).
SSIS Slowly Changing Dimension Type 2 (SCD 2): If you want to maintain the historical data of a column, then mark them as Historical attributes. If your Dimension table members (or Columns) marked as Historical attributes, then it will maintain the current record, and on top of that, it will create a new record with changing details. In real-time, the type of change provided by this SCD 2 or SSIS Slowly Changing Dimension Type 2 is beneficial. For example, if we want to maintain the Employee’s previous and current city or Employee’s prior department and current department, then we can mark these columns as Historical attributes. Before we step into the example, let us see the data inside our Employees Dimension table.
NOTE: Since SSIS Slowly Changing Dimension Type 2 is maintaining historical data, as the data grows, it will be challenging to maintain the data.
The following screenshot will show you the data inside our [Employees] table. For this SSIS SCD 2 or Slowly Changing Dimension Type 2 demo, we are going to use this data. You can get the same data by selecting TOP 10 records from [DimEmployee] table in [AdventureWorksDW2014] database.
The below image shows the data present in the [EmployeeSource] table, and we are going to update or insert these two records into the [Employees] table.
If you observe the above screenshot, the Department Name of the EmployeId (253022876) is different from original value (it has changed from ‘Marketing’ to ‘Sales,’ and We have a new record with EmployeId 253022870. Our task is to insert the new record and update the Department Name of the second record.
SSIS Slowly Changing Dimension Type 2 example
STEP 1: Open BIDS and Drag and drop the data flow task from the toolbox to control flow and name it as SSIS Slowly Changing Dimension Type 2.
Double click on it will open the data flow tab.
STEP 2: Drag and drop OLE DB Source, Slowly Changing Dimension from SSIS toolbox to data flow region
STEP 3: Double click on OLE DB source in the data flow region will open the connection manager settings and provides space to write our SQL statement. Here we are using already created OLE DB Connection Manager and EmployeSource table present in the [SSIS Source] as the Source table. If you find any difficulty in configures the OLE DB Source, then please refer to OLE DB Source in SSIS article.
STEP 4: Click on the columns tab to verify the columns. In this tab, we can uncheck the unwanted columns also.
To use the OLE DB Source data, Please Drag & drop the blue arrow from OLE DB Source to Slowly Changing Dimension Transformation to connect the data.
SCD 2 STEP 5: Double-click the SSIS Slowly Changing Dimension transformation. Once you click on it, It will open Slowly Changing Dimension Wizard. The first page is a welcome page. If you don’t want to see this page again, then Please tick the checkbox “Do not show this page again”. For now, click Next
STEP 6: Select a Dimension Table and Keys: This page is used to configure the Dimensional table information.
Connection Manager: This option is used to provide a connection to the dimension table. Here we are using the already created OLE DB Connection Manager, which connects to [SSIS Source] database present in our SQL Server. You can change the connection manager settings by clicking the New button.
Table or View: This option is useful to select the Data Warehouse Dimension table or View on which you want to perform slowly changing operation. Here we are choosing the Employees table.
- Input Columns: Columns that are coming from the OLE DB Source.
- Dimension Columns: Columns that are available in the Employees Dimension table.
- Key Type: To perform Slowly Changing Dimension 2 or SSIS SCD 2, we need at least one Business key. Here, we all know that the Employee Alternative Key is the key column. Please change the Key type from Not a Key Column to Business key
STEP 7 – Slowly Changing Dimension Columns: This is the main page in this whole wizard to create SSIS SCD 2 or Slowly Changing Dimension type 2. Here we have
- Dimension Columns: All the columns that are available in the Dimension table will be available in this section. Here you have to select the columns that you want to choose for Change Type. From the below screenshot you can observe that we are selecting the Birthdate, Login Id and Department Name columns
- Change Type: Select the change type. In this example, we are keeping the default Fixed attribute as change type for Birthdate, Login Id columns, and we want to maintain the historic data for Department Name, so we selected this as a Historical attribute.
TIP: By default, the wizard assigns the Fixed attribute as the Change Type. But you can change them.
STEP 8 – Fixed and Changing Attribute Options: This page has two options, and we already explained the first option in SSIS Slowly Changing Dimension Type 0 article. We will disclose the second option in the next section.
STEP 9 – Historical Attribute Options: To maintain the historical data, we have to configure the following option. First, we will show you the configuration setting of the second option, and later we show the other option.
SSIS Slowly Changing Dimension Type 2 using Start Date and End Date: Please use Use start and end dates to identify current and expired records option to use the Start and End dates.
- Start Date Column: Here, you have to select the starting date of the records — for example, Employee Joining date or Production start date, etc.
- End Date Column: Here, you have to select the end date of the records (generally, the values will be NULL) — for example, Employee End Date.
- Variable to set date values: Please specify the Starting date of the newly inserted record & End date of the current record.
From the below screenshot, you can observe that we selected the StartDate as the Start Date Column, EndDate as the End Date Column, and container starting time as the Starting date of the newly inserted record.
STEP 10 – Inferred Dimension Members: We will discuss this option in the Slowly Changing Dimension Inferred Dimension article. For now, we are de-selecting this option and click Next button
STEP 10 – Finish the Slowly Changing Dimension Wizard: Click finish button to finish configuring the SSIS Slowly Changing Dimension Type 2
Once you click on the finish button, our Data Flow will automatically change. If you observe the below screenshot, it added the Derived Column Transformation to set the End Date as the container starting time. OLE DB Command Transformation to update the End date (Null value) to container starting time. Union All Transformation to combine the historical Data and New inputs. Derived Column Transformation to set the New record Start date as container starting time, OLE DB Destination to insert new records into the Dimension table.
(Optional Step): Let us double-click on the Derived Column Transformation to check the automatically generated expression to set the End date.
(Optional Step): Let us double-click on the OLE DB Command Transformation to check the Update statement that was created automatically.
(This is an Optional Step): Let us double-click on the Derived Column Transformation to check the automatically generated expression to set the start date of new records.
(Optional Step): Let us double-click on the OLE DB Destination. And check the automatically created destination (table name and connection manager setting and mappings).
Click OK to finish our SSIS Slowly Changing Dimension Type 2 package design. Let us run the SSIS SCD 2 package
From the above screenshot, you can observe that, though two rows are coming from the OLE DB Source. One row inserted directly, and the other row is maintained the historical information and then inserted.
Let’s open the SQL Server Management Studio and check the results. If you observe the below records, our package
- Added New record of Employee Alternative key 253022870.
- Set the End date Employee Alternative key 253022876 and
- Added New record for Employee Alternative key 253022876 with updated department name.
Let us see what will happen when we check the Use a single column to show current and expired records option for SSIS Slowly Changing Dimension Type 2:
- Column to indicate current record: Please specify the column name that specifies the current and expired records. Here we are selecting the Status column
- Value when Current: Here, you have to specify the value that will appear for the current records.
- End Date Column: Here, you have to specify the value that will appear for the expired records.
Click OK to finish the SCD 2, or SSIS Slowly Changing Dimension Type 2 package design and let us run the package.
Let’s open the SQL Server Management Studio and check the results. If you observe the below records, our SSIS Slowly Changing Dimension Type 2 package
- Added New record of Employee Alternative key 253022870.
- Set the Status as Expired for the Employee Alternative key 253022876 and
- Added New record for Employee Alternative key 253022876 with updated department name and set the status as Current.