Grouping

Overview

An important scenario when working with data is grouping. Grouping allows you to aggregate data based on a specific column or columns. This enables fetching summary of data based on various aggregate functions.

For example, you may want to group orders by the customer column and then count the number of orders and their total value in each group. In SQL, this can be done using the GROUP BY clause as follows:

SELECT CustNo, Count([OrdNo]), SUM(DNOrdSum)
FROM [Ord]
GROUP BY CustNo

When grouping data, you can select:

The columns that you want to group by. The rows with the same value in the selected columns are grouped together.
The aggregate functions that you want to apply to the grouped data. The aggregate functions are applied to the columns that are not included in the GROUP BY clause.

Because the WHERE clause is evaluated before the GROUP BY clause, you can use the WHERE clause to filter the rows that are included in the grouping. For instance, if we want to take into account only the orders in a specific interval (let’s say the current month) we can apply a filter as follows:

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo

However, filtering the rows in the WHERE clause is not always possible. For instance, if we want to retrieve data only for customers that made purchases over some specific amount, we cannot use the WHERE clause because the aggregated sum is not a table column. In this case, we can use the HAVING clause to filter the groups. The HAVING clause is evaluated after the GROUP BY clause and it can be used to filter the groups based on aggregate functions. An example is shown here:

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo
HAVING SUM(DNOrdSum) > 1000

You can also specify a sorting order for the resulting data using the SORT BY clause. The SORT BY clause is evaluated after the HAVING clause. In the following example, data is sorted ascending by the customer number.

SELECT CustNo, Count(OrdNo), SUM(DNOrdSum)
FROM Ord
WHERE ChDt > 20231101
GROUP BY CustNo
HAVING SUM(DNOrdSum) > 1000
SORT BY CustNo ASC

The capabilities to group data and filter based on aggregate functions in available in BNXT GraphQL as well. The following sections describe how to use the groupBy and having clauses in BNXT GraphQL.

The connection type

In the previous section, we learned about the connection types and how to perform queries using them. Every connection types has a set of parameters that allows to filter, sort, paginate, but also group. An example is shown in the following image for the order table:

Order table

The following parameters can be used to group data:

Parameter	Description
`filter`	Specifies the filter expression. The filter expression is evaluated before the `groupBy` clause.
`groupBy`	Specifies the columns that you want to group by. The rows with the same value in the selected columns are grouped together.
`having`	Specifies the aggregate functions that you want to apply to the grouped data. The aggregate functions are applied to the columns that are not included in the `groupBy` clause. The `having` clause is applied to the data after grouping.
`sortOrder`	Specifies the sort order of the grouped data.

Grouping data

The following example shows how to group data using the groupBy parameter. The example groups the orders by the customer number and then counts the number of orders and their total value in each group.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      groupBy : [{customerNo : DEFAULT}])
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 0,
            "aggregates": {
              "count": {
                "orderNo": 111
              },
              "sum": {
                "orderSumNetDomestic": 3195
              }
            }
          },
          {
            "customerNo": 10125,
            "aggregates": {
              "count": {
                "orderNo": 1
              },
              "sum": {
                "orderSumNetDomestic": 0
              }
            }
          },

          ...
        ]
      }
    }
  }
}

To filter the date, you can use the filter parameter. The following example groups the purchase orders by the customer number and then counts the number of orders and their total value in each group. The orders are filtered by amount, which must be positive, and by the order date.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}])
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          },
          {
            "customerNo": 10001,
            "aggregates": {
              "count": {
                "orderNo": 3
              },
              "sum": {
                "orderSumNetDomestic": 58047
              }
            }
          },
          ...
        ]
      }
    }
  }
}

You can also filter the aggregated data by using the having parameter. The following example groups the purchase orders by the customer number and then counts the number of orders and their total value in each group. The orders are filtered by amount, which must be positive, and by the order date. The aggregated data is filtered by the total value of the orders, which must be greater than 100000.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}],
      having : {
        _sum : {
          orderSumNetDomestic : {_gt : 100000}
        }
      })
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          },
          {
            "customerNo": 10285,
            "aggregates": {
              "count": {
                "orderNo": 186
              },
              "sum": {
                "orderSumNetDomestic": 312975
              }
            }
          }
        ]
      }
    }
  }
}

Finally, you can also sort the data using the sortOrder parameter. The following example, expands the previous one by retrieving the data sorted by descending customer number.

Query

query read_grouped_orders($cid : Int)
{
  useCompany(no : $cid)
  {
    order(
      filter : {_and : [
        {orderSumNetDomestic : {_gt : 0}},
        {orderDate : {_gt : 20210101}},
        {customerNo : {_not_eq : 0}}
      ]},
      groupBy : [{customerNo : DEFAULT}],
      having : {
        _sum : {
          orderSumNetDomestic : {_gt : 100000}
        }
      },
      sortOrder : {customerNo : DESC})
    {
      items
      {
        customerNo
        aggregates
        {
          count
          {
            orderNo
          }
          sum
          {
            orderSumNetDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "order": {
        "items": [
          {
            "customerNo": 10285,
            "aggregates": {
              "count": {
                "orderNo": 186
              },
              "sum": {
                "orderSumNetDomestic": 312975
              }
            }
          },
          {
            "customerNo": 10000,
            "aggregates": {
              "count": {
                "orderNo": 289
              },
              "sum": {
                "orderSumNetDomestic": 305400
              }
            }
          }
        ]
      }
    }
  }
}

The `groupBy` argument

The type of the group by argument is an array of non-null objects of the type <tablename>_GroupByType. An example is Order_GroupByType for the Order table. Each element of the array specifies one column for grouping. Their order in the GROUP BY clause is the one in which they appear in the array. Let’s take the following example:

groupBy : [{customerNo : DEFAULT}, [orderDate : DEFAULT]]

This groups the data by the customerNo and the orderDate columns.

When you specify a column for grouping, you must also specify the kind of grouping that you want to perform. The possible values are:

Value	Description
`DEFAULT`	The default grouping. The rows with the same value in the column are grouped together.
`ROLLUP`	The `ROLLUP` grouping. This creates a group for each combination of column expressions and also rolls up the results into subtotals and grand totals.

While the column order is not important for the DEFAULT grouping, it is important for the ROLLUP grouping. For instance, GROUP BY ROLLUP (A, B, C) creates groups for each combination of column expressions as show in the following list:

A, B, C
A, B, NULL
A, NULL, NULL
NULL, NULL, NULL

To learn more about the ROLLUP grouping, see the SELECT - GROUP BY- Transact-SQL documentation.

To show how this work, lets take the following example: select transactions grouped by account number and period.

First, we use DEFAULT for account number and ROLLUP for period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : DEFAULT},
        {period : ROLLUP},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 3000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          }
        ]
      }
    }
  }
}

Due to the applied filter, the result includes the account numbers 3000, 4000, and 4410 and the period 12.

The result is a series of groups for the following combinations:

Account number	Period
3000	12
3000	0
4000	12
4000	0
4410	12
4410	0

Next, we change the grouping type use ROLLUP for account number and DEFAULT for period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : ROLLUP},
        {period : DEFAULT},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 0,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 8535
              }
            }
          }
        ]
      }
    }
  }
}

The result is a series of groups for the following combinations:

Account number	Period
3000	12
4000	12
4410	12
0	12

Finally, let’s use ROLLUP for both account number and period:

Query

query read_gla_transactions_grouped($cid : Int)
{
  useCompany(no : $cid)
  {
    generalLedgerTransaction(
      filter : {
        _and: [
          {accountNo : {_gte : 3000}},
          {year : {_gte : 2020}}
        ]},
      groupBy: [
        {accountNo : ROLLUP},
        {period : ROLLUP},
      ],
      having : {
        _sum : {postedAmountDomestic : {_not_eq : 0}}
      })
    {
      items
      {
        accountNo
        period
        aggregates
        {
          sum
          {
            postedAmountDomestic
          }
        }
      }
    }
  }
}

Result

{
  "data": {
    "useCompany": {
      "generalLedgerTransaction": {
        "items": [
          {
            "accountNo": 3000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 3000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": -4950
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4000,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 450
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 12,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 4410,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 13035
              }
            }
          },
          {
            "accountNo": 0,
            "period": 0,
            "aggregates": {
              "sum": {
                "postedAmountDomestic": 8535
              }
            }
          }
        ]
      }
    }
  }
}

This time, we get the following combinations:

Account number	Period
3000	12
3000	0
4000	12
4000	0
4410	12
4410	0
0	0

The `having` argument

The type of the having argument is an object of the type <tablename>_HavingType. An example is Order_HavingType for the Order table. The having clause is applied to the data after grouping and defines filters on aggregated data. The <tablename>_HavingType type has the following fields:

all the columns in the table
all the available aggregate functions, such as SUM and COUNT, but prefixed with an underscore: _sum, _count, etc.
_and and _or operators that allow to combine multiple conditions

The available aggregate functions are listed in the following table:

Aggregate	Column types	Description
`sum`	numerical	The sum of all the values.
`sumDistinct`	numerical	The sum of all distinct values.
`average`	numerical	The average of all the values.
`averageDistinct`	numerical	The averate of all distinct values.
`count`	numerical	The number of the items.
`countDistinct`	numerical	The number of distinct items.
`minimum`	numerical, date, time	The minimum value.
`maximum`	numerical, date, time	The maximum value.
`variance`	numerical	The statistical variance of all the values.
`varianceDistinct`	numerical	The statistical variance of all the distinct values.
`variancePopulation`	numerical	The statistical variance for the population of all the values.
`variancePopulationDistinct`	numerical	The statistical variance for the population of all the distinct values.
`standardDeviation`	numerical	The statistical standard deviation of all the values.
`standardDeviationDistinct`	numerical	The statistical standard deviation of all the distinct values.
`standardDeviationPopulation`	numerical	The statistical standard deviation for the population of all the values.
`standardDeviationPopulationDistinct`	numerical	The statistical standard deviation for the population of all the distinct values.

In the previous example we have used this condition:

having : {_sum : {orderSumNetDomestic : {_gt : 100000}}}

This is equivalent to the following SQL clause:

HAVIG SUM(DNOrdSum) > 100000

Because of the nature of GraphQL, the order of fields is:

aggregate function (AF)
column name (COL)
operator (OP)
value

Therefore, the specification {AF : {COL : {OP : value}}} is translated to AF(COL) OP value and not AF(COL OP value).

You can build complex filters such as the following where we select all the groups either have the count of orders greater than 100 or the sum of the order values is between 100000 and 200000:

having : {
  _or: [
    {_count :{orderNo : {_gt : 100}}},
    {
      _and : [
        {_sum : {orderSumNetDomestic : {_gt : 100000}}},
        {_sum : {orderSumNetDomestic : {_lt : 200000}}},
      ]
    }
  ]
}

Tip

The operators used in having clauses are the same used for filter expressions (for the filter argument). To learn more about these, see Filtering.

Limitations

There are also some limitations when grouping data that you must be aware of:

You cannot use pagination. The first/after and before/last arguments do not work.
You cannot group by or filter data from a joined table.
Sorting does not include the aggregated data.

Grouping

Overview

The connection type

Grouping data

The groupBy argument

The having argument

Limitations

The `groupBy` argument

The `having` argument