优化 GROUP BY 语句

ORDER BY NULL

默认情况下,MySQL 会根据分组字段进行排序。对于含有 GROUP BY col1, col2, … 的 SQL 语句,相当于隐式地包含了一个 ORDER BY col1, col2, … 子句。

如果查询包括 GROUP BY想要避免排序结果的消耗,则可以指定 ORDER BY NULL 来禁止排序。

EXPLAIN
SELECT payment_date, SUM(amount)
FROM payment
GROUP BY payment_date;

+-------------+-------------------------------+
|id           |1                              |
+-------------+-------------------------------+
|select_type  |SIMPLE                         |
+-------------+-------------------------------+
|table        |payment                        |
+-------------+-------------------------------+
|partitions   |NULL                           |
+-------------+-------------------------------+
|type         |ALL                            |
+-------------+-------------------------------+
|possible_keys|NULL                           |
+-------------+-------------------------------+
|key          |NULL                           |
+-------------+-------------------------------+
|key_len      |NULL                           |
+-------------+-------------------------------+
|ref          |NULL                           |
+-------------+-------------------------------+
|rows         |16086                          |
+-------------+-------------------------------+
|filtered     |100                            |
+-------------+-------------------------------+
|Extra        |Using temporary; Using filesort|
+-------------+-------------------------------+

在 Extra 列中的 “Using filesort” 表明了上述 SQL 语句含有排序语义,若想要去除排序语义,在 SQL 语句后面添加一个 ORDER BY NULL 子句:

EXPLAIN
SELECT payment_date, SUM(amount)
FROM payment
GROUP BY payment_date
ORDER BY NULL; 

+-------------+---------------+
|id           |1              |
+-------------+---------------+
|select_type  |SIMPLE         |
+-------------+---------------+
|table        |payment        |
+-------------+---------------+
|partitions   |NULL           |
+-------------+---------------+
|type         |ALL            |
+-------------+---------------+
|possible_keys|NULL           |
+-------------+---------------+
|key          |NULL           |
+-------------+---------------+
|key_len      |NULL           |
+-------------+---------------+
|ref          |NULL           |
+-------------+---------------+
|rows         |16086          |
+-------------+---------------+
|filtered     |100            |
+-------------+---------------+
|Extra        |Using temporary|
+-------------+---------------+

WITH ROLLUP

分组查询的一个变种就是要求 MySQL 对返回的分组结果再做一次超级聚合。可以使用 WITH ROLLUP 子句来实现这种逻辑。

SELECT 
    YEAR(rental_date) y, 
    MONTH(rental_date) m, 
    COUNT(*) num
FROM rental
GROUP BY y, m
WITH ROLLUP;
+----+----+-----+
|y   |m   |num  |
+----+----+-----+
|2005|5   |1156 |
|2005|6   |2311 |
|2005|7   |6709 |
|2005|8   |5686 |
|2005|NULL|15862|
|2006|2   |182  |
|2006|NULL|182  |
|NULL|NULL|16044|
+----+----+-----+

注:根据上面的实例,可以推断出 GROUP BY y, m WITH ROLLUP,相当于 GROUP BY yGROUP BY y, m GROUP BY NULL(无 GROUP BY)三者的 UNION(并集),即等价于以下语句:

(
    SELECT YEAR(rental_date) y, MONTH(rental_date) m, COUNT(*) num
    FROM rental
    GROUP BY y, m
) UNION ALL (
    SELECT YEAR(rental_date) y, NULL, COUNT(*) num
    FROM rental
    GROUP BY y
) UNION ALL (
    SELECT NULL, NULL, COUNT(*) num
    FROM rental
);

应该尽可能地将 WITH ROLLUP 功能转移到应用程序中处理,在应用程序中做超级聚合是更好的,虽然这需要给客户端返回更多的结果。

最后更新于