The Talend Aggregate Sorted row or tAggregateSortedRow field is the same as the tAggreagteRow. However, it performs Aggregations on sorted data and returns the output. We can use this Talend Aggregate Sorted Row field if we know that the source sorted. Because Talend tAggregateSortedRow does not maintain any buffer, it is faster than the tAggreateRow.
In this Talend Aggregate sorted row example, we used the same Customer table that we used in the Aggregate row example. The below screenshot shows the data inside that table.
Talend Aggregate Sorted Row Example
First, drag and drop the DBInput and choose the Customers table from the metadata. Please refer to the tAggregateRow article to understand the connection settings.
Next, drag the Talend tAggregateSortedRow from palette to job design. By selecting the field, you can see the tooltip defining the tAggregateSortedRow.
Please connect the DBInput main row to Talend tAggregateSortedRow. We already explained all the options in the previous example. Apart from them, it has one more field in the Component tab called Input rows count. Here, you have to specify the total number of rows coming from the input.
Next, click the Edit Schema button to add the required columns to the tAggregateSortedRow. Here, we are adding Occupation, Yearly Income, and Sales columns.
For the number of Input rows, you can manually type the number or select Number of lines (NB_Line) from the outline. Here, we accidentally added the NB_LINE of the tAggregateSortedRow, but you have to add the DBInput NB_LINE. We changed it later and forgot to take that screenshot.
Please use the plus button to add the group by columns. Here, we added Occupation columns for grouping. Next, we added Yearly Income and Sales in the Operations section and selected the Sum function.
It means we are grouping Customers by Occupation and finding the Sum of Yearly Income, and Sales.
Here, we are using the DBOutput to save the Talend Aggregate Sorted Row output in SQL Database. Here, we are creating the Talend_tAggSortedRow table as the destination table.
Let us run the Talend Aggregate Sorted row Job.
Let me open the Management studio to check the tAggregateSortedRow result.
Two compare the result; we also wrote a T-SQL query on the Customer table. And you can see from the below screenshot, both are returning different results. It is because Talend tAggregateSortedRow assumes that the data is sorted.
Let us sort the data and run this Talend Aggregate Sorted Row. To do so, let me drag and drop the tSortRow field in the middle.
Here, we used the tSortRow field to sort the data based on the Occupation in the ascending order.
Next, we connected this tSortRow with Talend tAggregateSortedRow, kept the remaining setting as it is, and run the job.
Please check the Talend aggregate sorted row result.
This time, we are using the sorted data as the DBInput source data. For this example, we changed the input table from Customers to Talend_Sorted_Customers. It is the output table that we got from the Talend Sort Data article.
Please keep the Grouping and aggregate functions as it is or, as shown below.
Let us run this Talend Aggregate Sorted Row job.
And the result of the Talend tAggregateSortedRow job is