The Term Lookup Transformation in SSIS is used to find terms within the string or text. It uses the reference table (contains terms) to find the matched sentences from the source data and also counts the number of times a term repeated in the source data row.
The SSIS Term Lookup transformation adds two extra columns to the transformation output called Term and Frequency by default. SSIS Term contains a term from the lookup table, and Frequency includes the number of times the phrase is repeated or occurred in the input data set.
TIP: The SSIS Term Lookup Transformation only allows the columns with DT_WSTR and DT_NTEXT data type to perform term lookup. If your input column is different from these two, Please use the Data Conversion to convert other Data Types to DT_WSTR and DT_NTEXT data type.
For example, if your company has 100 products and it is allowing users to write reviews on your products. Imagine you are getting 2000 reviews per day then, how will you find which review belongs to which product? To resolve these types of issues, SSIS introduced the Term Lookup Transformation. Using this, you can find which row has product 1 review and which row has product 90 review.
TIP: Please refer Term Extraction Transformation article to understand the term extraction technique
Term Lookup Transformation Matching Techniques
The SSIS Term Lookup Transformation uses the following rules to perform Term lookup on the source Data:
- If the reference table contains a singular form of the word, SSIS Term Lookup Transformation considers both the singular and plural forms of the word in the source data as a singular form. For example, if the term lookup table contains Employee and we have Employee and Employees terms in source data, SSIS Term Lookup Transformation would count both of them as a match for the lookup term
- If the reference table contains a plural form of the word, Term Lookup Transformation considers both the singular and plural forms of the word in the source data as separate Terms. For example, if the term lookup table contains Employees and we have Employee and Employees terms in source data, Term Lookup Transformation would count Employee separately and Employees separately.
- If we configured the SSIS Term Lookup Transformation to perform case-sensitive matches, the Employee is treated separately from the EMPLOYEE. If we have Employee as a first word in the sentence, then Term Lookup Transformation will count Employee and Employee as the same term.
Term Lookup Transformation in SSIS Example
In this example, we are going to show you how to configure Term Lookup Transformation in SSIS with sample data.
Here is our Lookup Source table
Here is our lookup table from the Database.
STEP 1: Create a new project in BIDS with the Integration Service Project Template. Drag the Data flow task to the Control flow region and rename it as per your requirements.
Double click on it, and it will open the data flow tab.
STEP 2: Drag and drop OLE DB Source, Term Lookup Transformation and OLE DB destination to data flow region
STEP 3: Double click on OLE DB source in the data flow region will open the connection manager settings. It provides space to write our SQL statement. Or we can select the table from the Dropdown list.
STEP 4: Click on the columns tab to verify the columns. In this tab, we can uncheck the unwanted columns.
STEP 5: Click ok and edit the SSIS Term Lookup Transformation
In the reference table tab, Configure the OLE DB connection manager to reference database and then select the table you wish to add as reference table (Here it is Term Lookup Transformation table).
TIP: This SSIS Term Lookup Transformation only supports OLE DB Connection Manager to connect with the reference table.
NOTE: This Transformation will accept only Unicode String [DT_WSTR] or Unicode text stream [DT_NTEXT] as its data type.
Step 6: Use the Term Lookup tab to map an input column to a lookup column in a reference table and to provide an alias for each output column.
Use the Advanced tab of the Transformation Editor Dialog box to specify whether the lookup should be case-sensitive or not.
From the below screenshot, you can see there is a warning symbol on the Transformation. And it is telling that error output is not connected. You can remove the warning symbol by configuring the error output. So double click on the Configure Error Output button will open a new window to set the error output.
The default configuration of this Transformation is to redirect error rows. You can get rid of this warning by connecting the error output, or by changing the default behavior to Ignore Failure or Fail Component. Let me change to Ignore Failure
Click ok to finish configuring the Transformation.
STEP 7: Now, we have to provide the Server, database, and table details of the destination. So double click on the OLE DB Destination and provide the required information
From the above screenshot, see that we selected the database as our destination database and [Term Lookup Transformation Output] table as destination table.
STEP 8: Click on the Mappings tab to check whether the source columns exactly mapped to the destination columns.
Click ok to finish designing our SSIS Term Lookup Transformation package. Let us run the package
Let’s open the SQL Server Management Studio and check the results
From the above screenshot, you may see, this transformation is displaying the Term and Frequency (number of times term is repeating) of the Term in a sentence.