Grouping

API documentation on grouping data, including SQL examples, GraphQL equivalents, and parameters for filtering, sorting, paginating, and applying aggregate functions.

Overview

An important scenario when working with data is grouping. Grouping allows you to aggregate data based on a specific column or columns. This enables fetching summary of data based on various aggregate functions.

For example, you may want to group orders by the customer column and then count the number of orders and their total value in each group. In SQL, this can be done using the GROUP BY clause as follows:

SELECT CustNo, Count([OrdNo]), SUM(DNOrdSum)
FROM [Ord]
GROUP BY CustNo

When grouping data, you can select:

  • The columns that you want to group by. The rows with the same value in the selected columns are grouped together.
  • The aggregate functions that you want to apply to the grouped data. The aggregate functions are applied to the columns that are not included in the GROUP BY clause.

Because the WHERE clause is evaluated before the GROUP BY clause, you can use the WHERE clause to filter the rows that are included in the grouping. For instance, if we want to take into account only the orders in a specific interval (let’s say the current month) we can apply a filter as follows:

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo

However, filtering the rows in the WHERE clause is not always possible. For instance, if we want to retrieve data only for customers that made purchases over some specific amount, we cannot use the WHERE clause because the aggregated sum is not a table column. In this case, we can use the HAVING clause to filter the groups. The HAVING clause is evaluated after the GROUP BY clause and it can be used to filter the groups based on aggregate functions. An example is shown here:

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo
HAVING SUM(DNOrdSum) > 1000

You can also specify a sorting order for the resulting data using the SORT BY clause. The SORT BY clause is evaluated after the HAVING clause. In the following example, data is sorted ascending by the customer number.

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo
HAVING SUM(DNOrdSum) > 1000
SORT BY CustNo ASC

The capabilities to group data and filter based on aggregate functions in available in BNXT GraphQL as well. The following sections describe how to use the groupBy and having clauses in BNXT GraphQL.

The connection type

In the previous section, we learned about the connection types and how to perform queries using them. Every connection types has a set of parameters that allows to filter, sort, paginate, but also group. An example is shown in the following image for the order table:

Order table

The following parameters can be used to group data:

Parameter Description
filter Specifies the filter expression. The filter expression is evaluated before the groupBy clause.
groupBy Specifies the columns that you want to group by. The rows with the same value in the selected columns are grouped together.
having Specifies the aggregate functions that you want to apply to the grouped data. The aggregate functions are applied to the columns that are not included in the groupBy clause. The having clause is applied to the data after grouping.
sortOrder Specifies the sort order of the grouped data.

Grouping data

The following example shows how to group data using the groupBy parameter. The example groups the orders by the customer number and then counts the number of orders and their total value in each group.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      groupBy : [{customerNo : DEFAULT}])
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 0,
            "aggregates": {
              "count": {
                "orderNo": 111
              },
              "sum": {
                "orderSumNetDomestic": 3195
              }
            }
          },
          {
            "customerNo": 10125,
            "aggregates": {
              "count": {
                "orderNo": 1
              },
              "sum": {
                "orderSumNetDomestic": 0
              }
            }
          },

          ...
        ]
      }
    }
  }
}

To filter the date, you can use the filter parameter. The following example groups the purchase orders by the customer number and then counts the number of orders and their total value in each group. The orders are filtered by amount, which must be positive, and by the order date.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}])
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          },
          {
            "customerNo": 10001,
            "aggregates": {
              "count": {
                "orderNo": 3
              },
              "sum": {
                "orderSumNetDomestic": 58047
              }
            }
          },
          ...
        ]
      }
    }
  }
}

You can also filter the aggregated data by using the having parameter. The following example groups the purchase orders by the customer number and then counts the number of orders and their total value in each group. The orders are filtered by amount, which must be positive, and by the order date. The aggregated data is filtered by the total value of the orders, which must be greater than 100000.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}],
      having : {
        _sum : {
          orderSumNetDomestic : {_gt : 100000}
        }
      })
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          },
          {
            "customerNo": 10285,
            "aggregates": {
              "count": {
                "orderNo": 186
              },
              "sum": {
                "orderSumNetDomestic": 312975
              }
            }
          }
        ]
      }
    }
  }
}

Finally, you can also sort the data using the sortOrder parameter. The following example, expands the previous one by retrieving the data sorted by descending customer number.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}],
      having : {
        _sum : {
          orderSumNetDomestic : {_gt : 100000}
        }
      },
      sortOrder : {customerNo : DESC})
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10285,
            "aggregates": {
              "count": {
                "orderNo": 186
              },
              "sum": {
                "orderSumNetDomestic": 312975
              }
            }
          },
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          }
        ]
      }
    }
  }
}

The groupBy argument

The type of the group by argument is an array of non-null objects of the type <tablename>_GroupByType. An example is Order_GroupByType for the Order table. Each element of the array specifies one column for grouping. Their order in the GROUP BY clause is the one in which they appear in the array. Let’s take the following example:

groupBy : [{customerNo : DEFAULT}, [orderDate : DEFAULT]]

This groups the data by the customerNo and the orderDate columns.

When you specify a column for grouping, you must also specify the kind of grouping that you want to perform. The possible values are:

Value Description
DEFAULT The default grouping. The rows with the same value in the column are grouped together.
ROLLUP The ROLLUP grouping. This creates a group for each combination of column expressions and also rolls up the results into subtotals and grand totals.

While the column order is not important for the DEFAULT grouping, it is important for the ROLLUP grouping. For instance, GROUP BY ROLLUP (A, B, C) creates groups for each combination of column expressions as show in the following list:

A, B, C
A, B, NULL
A, NULL, NULL
NULL, NULL, NULL

To learn more about the ROLLUP grouping, see the SELECT - GROUP BY- Transact-SQL documentation.

To show how this work, lets take the following example: select transactions grouped by account number and period.

First, we use DEFAULT for account number and ROLLUP for period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : DEFAULT},
        {period : ROLLUP},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 3000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          }
        ]
      }
    }
  }
}

Due to the applied filter, the result includes the account numbers 3000, 4000, and 4410 and the period 12.

The result is a series of groups for the following combinations:

Account number Period
3000 12
3000 0
4000 12
4000 0
4410 12
4410 0

Next, we change the grouping type use ROLLUP for account number and DEFAULT for period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : ROLLUP},
        {period : DEFAULT},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 0,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 8535
              }
            }
          }
        ]
      }
    }
  }
}

The result is a series of groups for the following combinations:

Account number Period
3000 12
4000 12
4410 12
0 12

Finally, let’s use ROLLUP for both account number and period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : ROLLUP},
        {period : ROLLUP},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}
Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 3000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 0,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 8535
              }
            }
          }
        ]
      }
    }
  }
}

This time, we get the following combinations:

Account number Period
3000 12
3000 0
4000 12
4000 0
4410 12
4410 0
0 0

The having argument

The type of the having argument is an object of the type <tablename>_HavingType. An example is Order_HavingType for the Order table. The having clause is applied to the data after grouping and defines filters on aggregated data. The <tablename>_HavingType type has the following fields:

  • all the columns in the table
  • all the available aggregate functions, such as SUM and COUNT, but prefixed with an underscore: _sum, _count, etc.
  • _and and _or operators that allow to combine multiple conditions

The available aggregate functions are listed in the following table:

Aggregate Column types Description
sum numerical The sum of all the values.
sumDistinct numerical The sum of all distinct values.
average numerical The average of all the values.
averageDistinct numerical The averate of all distinct values.
count numerical The number of the items.
countDistinct numerical The number of distinct items.
minimum numerical, date, time The minimum value.
maximum numerical, date, time The maximum value.
variance numerical The statistical variance of all the values.
varianceDistinct numerical The statistical variance of all the distinct values.
variancePopulation numerical The statistical variance for the population of all the values.
variancePopulationDistinct numerical The statistical variance for the population of all the distinct values.
standardDeviation numerical The statistical standard deviation of all the values.
standardDeviationDistinct numerical The statistical standard deviation of all the distinct values.
standardDeviationPopulation numerical The statistical standard deviation for the population of all the values.
standardDeviationPopulationDistinct numerical The statistical standard deviation for the population of all the distinct values.

In the previous example we have used this condition:

having : {_sum : {orderSumNetDomestic : {_gt : 100000}}}

This is equivalent to the following SQL clause:

HAVIG SUM(DNOrdSum) > 100000

Because of the nature of GraphQL, the order of fields is:

  1. aggregate function (AF)
  2. column name (COL)
  3. operator (OP)
  4. value

Therefore, the specification {AF : {COL : {OP : value}}} is translated to AF(COL) OP value and not AF(COL OP value).

You can build complex filters such as the following where we select all the groups either have the count of orders greater than 100 or the sum of the order values is between 100000 and 200000:

having : {
  _or: [
    {_count :{orderNo : {_gt : 100}}},
    {
      _and : [
        {_sum : {orderSumNetDomestic : {_gt : 100000}}},
        {_sum : {orderSumNetDomestic : {_lt : 200000}}},
      ]
    }
  ]
}

Tip

The operators used in having clauses are the same used for filter expressions (for the filter argument). To learn more about these, see Filtering.

Limitations

There are also some limitations when grouping data that you must be aware of:

  • You cannot use pagination. The first/after and before/last arguments do not work.
  • You cannot group by or filter data from a joined table.
  • Sorting does not include the aggregated data.
Last modified September 24, 2024