Talend Aggregate Row

The Talend Aggregate row or tAggregateRow field is used to perform Aggregations on numerical column data based on the given grouping column and returns the output. We can use this Talend Aggregate Row field to find the Sum of sales by Region, Total orders by State provinces, etc.

In this Talend Aggregate row example, we used the SQL Server database Customer table. The below screenshot shows the data inside that table.

Customer Sales Table with HireDate

Talend Aggregate Row Example

First, drag and drop the DBConnection, DBInput, and DBCommit to establish a SQL Server Connection. Next, we selected the Customers table from the metadata.

Configure Server Connection and write table Select query

Next, drag the Talend tAggregateRow from the palette to the job design. As you can see from the below tooltip, tAggregateRow matches SQL GroupBy features. It receives a data flow and aggregates it based on the Group by column(s).

Talend Aggregate Row 3

Please connect the DBInput main row to tAggregateRow. Generally, the input columns won’t sync with the tAggregateRow. So, click the Edit Schema button to add the required columns to the tAggregateRow. 

Remember, you have to add only the Group by columns and the aggregated columns. It is because all the columns that are not part of the aggregation have to be in the Group by section. Here, we are adding Occupation, Yearly Income, and Sales columns.  

Talend Aggregate Row 5

Within the Talend tAggregateRow Components tab, there is a Group By section and an Operations section. Within the Group By section, you have to select the Group by columns and map them.

  • Output Column: Please select the output column for grouping. It is the same column from the tAggregateRow output schema.
  • Input Column Position: Here, you have to map or select the input column from the source, i.e., tDBInput.

Please use the Add (plus) button to add the Group by Columns. Here, we added Occupation columns and mapped with the input column as well. You can use the drop-down list to see the other columns as well.

Talend Aggregate Row 6

Within the Talend Aggregate Row Operations section, we have

  • Output Column: Please select the output column for aggregation. It is a column from the tAggregateRow output schema.
  • Function: Please select the aggregation you want to perform. For instance, Sum, Avg, etc. 
  • Input Column Position: Here, you have to map or select the input column from the source, i.e., tDBInput.
  • Ignore Null Values: Do you want to ignore the null values?

By default, when you click on the plus button, Talend will add the first numeric column from the tAggregateRow output schema, and select the count function.

Talend Aggregate Row 8

The following screenshot shows the list of available aggregation functions in Talend. I already explained them in the SQL tutorial, so please refer to the same.

List of available functions in Talend Aggregate Row

As you can see, we are grouping Customers by occupation and finding the Sum of Yearly Income and Average Sales.

Talend Aggregate Row 10

After finishing the tAggregateRow configuration, we are using the DBOutput to save the Talend Aggregate Row output in the SQL Database. Here, we are creating the Talend_tAggregateRow table as the destination table.

Talend Aggregate Row 11

Let us run the Talend aggregate row Job.

Run the Talend Aggregate Row Job

Let me open the Management studio to check the tAggregateRow result. To clarify, we used the T-SQL query on the Customers table to compare the result with tAggregateRow.

SELECT [Occupation]
,SUM([YearlyIncome]) AS [YearlyIncome]
,AVG([Sales]) AS [Sales]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY [Occupation]
ORDER BY YearlyIncome DESC
Check the Result

This time, we are the Education fields, so we added the same in the Talend aggregate row schema.

Talend Aggregate Row 15

Next, we used the Education column in Group By. It means, first, customers will grouped by Education and then by occupation and perform the Sum and Average. 

Talend Aggregate Row 16

Please check the Talend Aggregate row result.

Check the Result of Multiple Group by columns

and the code we used for the above is

SELECT Education, [Occupation] ,SUM([YearlyIncome]) AS [YearlyIncome]
,AVG([Sales]) AS [Sales]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY Education, [Occupation] ORDER BY YearlyIncome DESC

Let me select the Occupation column only and then create a few more new columns to store the output.

Talend Aggregate Row with multiple grouping columns

Next, we used most of the Talend aggregate functions as the output.

Talend Aggregate Row 19

The result of the Talend Aggregate Row job

Result table

and the code is

SELECT [Occupation]
,SUM([YearlyIncome]) AS [IncomeSum]
,AVG([YearlyIncome]) AS [IncomeAvg]
,COUNT([YearlyIncome]) AS [IncomeCount]
,MIN([YearlyIncome]) AS [IncomeMin]
,MAX([YearlyIncome]) AS [IncomeMax]
,STDEV([YearlyIncome]) AS [IncomeStd]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY [Occupation] ORDER BY IncomeSum DESC;