SSIS Best Practices

This article explains the list of SSIS Best Practices one must apply for optimal performance of the packages in the production environment.

SSIS Best Practices

The following are some of the best practices while working with SSIS projects but are not limited to.

  • If SSIS and SQL Server run on the same server, use the SQL Server destination instead of the OLE DB destination to improve Integration Services performance.
  • If the destination table has partitions, perform the parallel processing to load data faster.
  • Use Merge Statement or query instead of separate Insert, Update, and Delete Statements. 
  • While loading a large amount of data, try to remove all the indexes on the target table. Once the data is loaded, recreate those indexes. This approach will improve performance.
  • Use Transactions when needed.
  • Use NOLOCK to remove lockings.
  • Avoid OLE DB Command transformation altogether.
  • Reduce Statging operations.

Remaining SSIS Best Practices

Apart from the above, there are a few more important SSIS best practices that you must apply for better performance.

Choosing Table Columns

By default, OLE DB and ADO. NET sources allow selecting tables within a Database. It uses the Select * From Table statement. It gives real-time performance issues because the buffer space will fill with unwanted columns. Always write your own query by selecting the required columns. 

Join Multiple Tables

Instead of using SSIS Merge Join Transformation on multiple tables, use query to write left, right, inner, self, and cross joins. It avoids the extra step of sorting the inputs (sort transformation).

Data Access Mode

Always use the SQL Command and write the query. Otherwise, choose the stored procedure. Strictly avoid selecting the table.

Stored Procedures

If you have any change, it is the best practice to use stored procedures.

Restrict the Number of Rows

Use the Where clause to restrict the number of rows the select statement returns. For instance, bring a particular region’s sales, last week’s orders, etc.

Combine Multiple Tables

Instead of loading data from multiple tables and applying Union All Transformation, use the Union or Union All operator within the query.

Aggregations

Avoid performing aggregations in large data sets with millions of records. Instead, write Group by statements within the query.

Sort Data

Most tables will already be sorted in the database because of the indexes. However, you can use the Order By clause to sort the data instead of using the sort Transformation. This transformation is entirely blocking one and does the sorting in memory.

Data Type Conversion

Try to perform minimal datatype conversion. Avoid them and do it at the query level to improve performance.

DateTime Conversions

Try converting the date and time in the source or destination table.

Check and Uncheck fields in the Columns Tab

Allowing all the columns into the Source component and unchecking the unwanted columns is a bad practice. Because you have already brought the columns into the memory, deselecting them will not increase performance. Always get the required columns. 

Clean Data while loading

It is better to perform basic cleaning operations within the query if you load data from the Relational database. For instance, replacing nulls, removing unwanted spaces, replacing special characters, etc. It is faster than doing the integration services. Here, you have to write expressions and block the data.

Fast Load

If the destination table supports the fast load, use it. An option for OLE DB Destination is table or view – fast load, use it. 

Avoid the Blocking Transformation

From all the above steps, you can understand that we always mentioned alternatives to the blocking transformations. Always avoid blocking and asynchronous transformations because they rely on memory.

Checkpoints

Use checkpoints to understand the successful and failed tasks. Always set the SaveCheckPoints property to True for every task. So, we can start the task from where it fails instead of executing it from the beginning. 

Default Buffer Size

Change the DefaultBufferSize and DefaultBufferMaxRows as per your requirement. Try experimenting with different values, but remember to set the AutoAdjustBufferSize property.

Process Bar

Always check the SSIS progress bar while the package runs for best practices. It gives all kinds of information to improve the performance. For instance, it gives warnings about unwanted columns.