In this article, we show how to extract nouns using the Term Extraction Transformation in SSIS. To extract noun phrases from the source data instead, please refer to the Extract Noun Phrases Using Transformation article.
TIP: Before reading this SSIS extract nouns article, please refer to the Term Extraction article for the definition, properties, and functionality of this transformation.
The screenshot below shows our source data.
Configure Term Extraction Transformation in SSIS – Extract Nouns
STEP 1: Open BIDS, drag and drop the Data Flow Task from the toolbox to the control flow, and rename it as Extract Nouns Using Term Extraction Transformation in SSIS.
Double-clicking it opens the data flow tab.
STEP 2: Drag and drop an OLE DB Source, a Term Extraction Transformation, and an OLE DB Destination from the toolbox to the data flow region.
STEP 3: Double-click the OLE DB Source in the data flow region to open the connection manager settings, which also provide space to write our SQL statement.
Here we select the database shown below as our source database, and the SQL command we are going to use is:
STEP 4: Click on the columns tab to verify the columns. Here, we can uncheck the unwanted columns.
Drag the OLE DB Source output arrow onto the Term Extraction Transformation to perform the transformation on the source data.
STEP 5: Double-click the Term Extraction Transformation to open its editor. Within the Term Extraction tab, choose the column to use for the extraction from the available input columns. We left the output column names at their defaults, Term and Score.
Exclusion tab: If you want to exclude specific terms during term extraction, configure this tab by specifying a column that contains the exclusion terms.
In this example, we leave this tab unconfigured because we want to extract all the nouns from the source data.
STEP 6: The Advanced tab of the Term Extraction Transformation Editor dialog box is where you select the Term Type, Score Type, and Frequency Threshold. In this example, we are extracting nouns only, so we chose Noun as the term type and set the Frequency Threshold to 1, which means a noun is extracted even if it occurs only once.
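To make the effect of the Frequency Threshold concrete, here is a minimal Python sketch. It is not how SSIS implements the transformation; it only assumes that the nouns have already been identified, counts them, and keeps those that occur at least as often as the threshold. The function name `score_terms`, the sample nouns, and the lower-casing are all assumptions made for illustration. The output pairs mirror the default Term and Score columns.

```python
from collections import Counter


def score_terms(terms, frequency_threshold=1):
    """Count each term and keep those seen at least `frequency_threshold` times.

    Illustrative only: the terms are assumed to be nouns that were already
    identified; lower-casing is a simplification, not documented SSIS behavior.
    """
    counts = Counter(term.lower() for term in terms)
    kept = [(term, score) for term, score in counts.items()
            if score >= frequency_threshold]
    # Highest score first, then alphabetical, to make the result stable.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical nouns, not taken from the article's source data.
nouns = ["bike", "frame", "bike", "wheel", "frame", "bike"]

print(score_terms(nouns, frequency_threshold=1))
print(score_terms(nouns, frequency_threshold=2))
```

With a threshold of 1, every noun is kept; raising it to 2 drops the nouns that occur only once.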
In the screenshot below, you can see a warning symbol on the Term Extraction Transformation. It indicates that the error output is not connected. You can remove the warning symbol by configuring the error output of the Term Extraction Transformation. Clicking the Configure Error Output button opens a new window to set the error output.
The default configuration of a Term Extraction Transformation is to redirect error rows. You can get rid of this warning either by connecting the error output or by changing the default behavior to Ignore Failure or Fail Component. Let’s change it to Ignore Failure.
Click OK to finish configuring the SSIS Term Extraction Transformation to extract nouns.
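The difference between the three error behaviors can be sketched in Python. This is a simplified illustration, not the SSIS engine: the helper names `split_terms` and `run_with_disposition`, the disposition strings, and the sample rows are hypothetical. Here "ignore" simply skips the failing row, which is an assumption made to keep the sketch short.

```python
def split_terms(text):
    """Hypothetical stand-in for term extraction; fails on non-text input."""
    if not isinstance(text, str):
        raise ValueError("expected text, got %r" % (text,))
    return text.split()


def run_with_disposition(rows, disposition):
    """Process rows under one of three error behaviors.

    "redirect": collect failing rows in a separate error list.
    "ignore":   skip failing rows and carry on (a simplification).
    "fail":     stop at the first failing row by re-raising the error.
    """
    terms, error_rows = [], []
    for row in rows:
        try:
            terms.extend(split_terms(row))
        except ValueError:
            if disposition == "fail":
                raise
            if disposition == "redirect":
                error_rows.append(row)
    return terms, error_rows


# Hypothetical input with one bad row in the middle.
sample_rows = ["red bike", None, "blue wheel"]
print(run_with_disposition(sample_rows, "redirect"))
print(run_with_disposition(sample_rows, "ignore"))
```

With "redirect", the bad row ends up in the error list, which is why SSIS expects somewhere to send it. With "ignore", nothing needs to be connected.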
STEP 7: Now we have to provide the server, database, and table details of the target. Double-click the OLE DB Destination and provide the required information.
Here we selected the database as the destination data source (with localhost as the server instance) and the [Extracting Nouns using Term Extraction] table as our destination table.
STEP 8: Click the Mappings tab to check whether the source columns are mapped exactly to the destination columns. If not, assign them to the appropriate destination columns.
Click OK to finish designing our Extract Nouns using Term Extraction Transformation in SSIS package. Let us run the package.
Let’s open SQL Server Management Studio and check the results.
TIP: If we are extracting terms from product descriptions, the product name will repeat many times, but we usually do not need the product name in the output. In such situations, we use the table of product names as the exclusion list.
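The idea behind the exclusion list can be sketched in a few lines of Python. In SSIS the exclusion terms come from a column of a reference table configured on the Exclusion tab. In this sketch they are a plain list, and the function name `apply_exclusions`, the sample terms, and the case-insensitive comparison are assumptions made for illustration.

```python
def apply_exclusions(scored_terms, exclusion_terms):
    """Drop every (term, score) pair whose term is in the exclusion list.

    The comparison ignores case, which is an assumption of this sketch.
    """
    excluded = {term.lower() for term in exclusion_terms}
    return [(term, score) for term, score in scored_terms
            if term.lower() not in excluded]


# Hypothetical extraction result and product names.
scored = [("bike", 3), ("roadster", 3), ("frame", 2)]
product_names = ["Roadster"]

print(apply_exclusions(scored, product_names))
```

The product name no longer appears in the output, while the remaining terms keep their scores.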