Talend Aggregate Row

The Talend Aggregate row or tAggregateRow field used to perform Aggregations on numerical column data based on the given grouping column and returns the output. We can use this Talend Aggregate Row field to find the Sum of sales by Region, Total orders by State provinces, etc.

In this Talend Aggregate row example, we used the SQL Server database Customer table. The below screenshot shows the data inside that table.

Talend Aggregate Row 1

Talend Aggregate Row Example

First, drag and drop the DBConnection, DBInput, and DBCommit to establish a SQL Server Connection. Next, we selected the Customers table from the metadata.

Talend Aggregate Row 2

Next, drag the Talend tAggregateRow from palette to job design. As you can see from the below tooltip, tAggregateRow matches SQL GroupBy features. It receives a data flow and aggregates it based on the Group by column(s).

Talend Aggregate Row 3

Please connect the DBInput main row to tAggregateRow. Generally, the input columns won’t sync with the tAggregateRow. So, click the Edit Schema button to add the required columns to the tAggregateRow. 

Talend Aggregate Row 4

Remember, you have to add only the Group by columns and the aggregated columns. It is because all the columns that are not part of the aggregation have to be in the Group by section. Here, we are adding Occupation, Yearly Income, and Sales columns.  

Talend Aggregate Row 5

Within the Talend tAggregateRow Components tab, there is a Group By section and Operations section. Within the Group By section, you have to select the Group by columns and map them.

  • Output Column: Please select the output column for grouping. It is the same column from the tAggregateRow output schema.
  • Input Column Position: Here, you have to map or select the input column from the source, i.e., tDBInput.

Please use the Add (plus) button to add the Group by Columns. Here, we added Occupation columns and mapped with the input column as well.

Talend Aggregate Row 6

From the below screenshot, you can see the other columns as well.

Talend Aggregate Row 7

Within the Talend Aggregate Row Operations section, we have

  • Output Column: Please select the output column for aggregation. It is a column from the tAggregateRow output schema.
  • Function: Please select the aggregation you want to perform. For instance, Sum, Avg, etc. 
  • Input Column Position: Here, you have to map or select the input column from the source, i.e., tDBInput.
  • Ignore Null Values: Do you want to ignore the null values?

By default, when you click on the plus button, Talend will add the first numeric column from the tAggregateRow output schema, and select the count function.

Talend Aggregate Row 8

The following screenshot shows the list of available aggregation functions in Talend. I already explained them in the SQL tutorial, so please refer to the same.

Talend Aggregate Row 9

As you can see, we are grouping Customers by occupation and finding the Sum of Yearly Income and Average Sales.

Talend Aggregate Row 10

After finishing the tAggregateRow configuration, we are using the DBOutput to save the Talend Aggregate Row output in SQL Database. Here, we are creating the Talend_tAggregateRow table as the destination table.

Talend Aggregate Row 11

Let us run the Talend aggregate row Job.

Talend Aggregate Row 12

Let me open the Management studio to check the tAggregateRow result.

Talend Aggregate Row 13

To clarify, we used the T-SQL query on the Customers table to compare the result with tAggregateRow.

SELECT [Occupation]
,SUM([YearlyIncome]) AS [YearlyIncome]
,AVG([Sales]) AS [Sales]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY [Occupation]
ORDER BY YearlyIncome DESC
Talend Aggregate Row 14

This time, we are the Education fields, so we added the same in the Talend aggregate row schema.

Talend Aggregate Row 15

Next, we used the Education column in Group By. It means, first, customers will group by Education and then by occupation and perform the Sum and Average. 

Talend Aggregate Row 16

Please check the Talend Aggregate row result.

Talend Aggregate Row 17

and the code we used for the above is

SELECT Education, [Occupation] ,SUM([YearlyIncome]) AS [YearlyIncome]
,AVG([Sales]) AS [Sales]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY Education, [Occupation] ORDER BY YearlyIncome DESC

Let me select the Occupation column only and then created a few more new columns to store the output.

Talend Aggregate Row 18

Next, we used most of the Talend aggregate functions as the output.

Talend Aggregate Row 19

And the result of the Talend Aggregate Row job is

Talend Aggregate Row 20

and the code is

SELECT [Occupation]
,SUM([YearlyIncome]) AS [IncomeSum]
,AVG([YearlyIncome]) AS [IncomeAvg]
,COUNT([YearlyIncome]) AS [IncomeCount]
,MIN([YearlyIncome]) AS [IncomeMin]
,MAX([YearlyIncome]) AS [IncomeMax]
,STDEV([YearlyIncome]) AS [IncomeStd]
FROM [SQL Tutorial].[dbo].[Customer]
GROUP BY [Occupation] ORDER BY IncomeSum DESC;