Use window functions for cumulative sum and cumulative count (#40752)

* Use window functions for SQL drivers for cumulative count and sum * H2/Snowflake/Redshift implementation * Test fixes * Use SELECT expression positions as the default window function implementation * Some test fixes. * Fix bugs in add-alias-info * MongoDB fixes * Another pass * BigQuery should use positions? * Disable tests for the failing drivers, we can fix in follow-on PRs. * Extra info * Add GH issue numbers * Update more GH issue numbers * Fix SQL Server!!!! wooo * I THOUGHT WE WERE DISABLING SPARK SQL!!!!!!!!!!1 * Cleanup to get things ready for review. * Snowflake + Vertica shouldn't use output column numbers for ORDER BY in OVER * Code cleanup * Test fix?

Use window functions for cumulative sum and cumulative count (#40752)
9d112756 · Cam Saul · GitHub · 15bc3349 · 9d112756 · 9d112756
Unverified Commit 9d112756 authored 11 months ago by Cam Saul Committed by GitHub 11 months ago
--- a/docs/developers-guide/driver-changelog.md
+++ b/docs/developers-guide/driver-changelog.md
@@ -13,6 +13,61 @@ title: Driver interface changelog
  efficient way possible. This is currently only required for drivers that support the `:uploads` feature, and has
  a default implementation for JDBC-based drivers.

+- New feature `:window-functions` has been added. Drivers that implement this method are expected to implement the
+  cumulative sum (`:cum-sum`) and cumulative count (`:cum-count`) aggregation clauses in their native query language.
+  For non-SQL drivers (drivers not based on our `:sql` or `:sql-jdbc` drivers), this feature flag is set to `false` by
+  default; the old (broken) post-processing implementations of cumulative aggregations will continue to be used. (See
+  issues [#13634](https://github.com/metabase/metabase/issues/13634) and
+  [#15118](https://github.com/metabase/metabase/issues/15118) for more information on why the old implementation is
+  broken.)
+
+  Non-SQL drivers should be updated to implement cumulative aggregations natively if possible.
+
+  SQL drivers that support ordering by output column numbers in `OVER` `ORDER BY` expressions, e.g.
+
+  ```sql
+  SELECT
+    sum(sum(my_column) OVER (ORDER BY 1 ROWS UNBOUNDED PRECEDING) AS cumulative_sum
+    date_trunc(my_timestamp, 'month') AS my_timestamp_month
+  FROM
+    my_table
+  GROUP BY
+    date_trunc(my_timestamp, 'month')
+  ORDER BY
+    date_trunc(my_timestamp, 'month')
+  ```
+
+  will work without any changes; this is the new default behavior for drivers based on the `:sql` or `:sql-jdbc`
+  drivers.
+
+  For databases that do not support ordering by output column numbers (e.g. MySQL/MariaDB), you can mark your driver
+  as not supporting the `:sql/window-functions.order-by-output-column-numbers` feature flag, e.g.
+
+  ```clj
+  (defmethod driver/database-supports? [:my-driver :sql/window-functions.order-by-output-column-numbers]
+    [_driver _feature _database]
+    false)
+   ```
+
+  In this case, the `:sql` driver will instead generate something like
+
+  ```sql
+  SELECT
+    sum(sum(my_column) OVER (ORDER BY date_trunc(my_timestamp, 'month') ROWS UNBOUNDED PRECEDING) AS cumulative_sum
+    date_trunc(my_timestamp, 'month') AS my_timestamp_month
+  FROM
+    my_table
+  GROUP BY
+    date_trunc(my_timestamp, 'month')
+  ORDER BY
+    date_trunc(my_timestamp, 'month')
+  ```
+
+  If neither of these strategies work for you, you might need to do something more complicated -- see
+  [#40982](https://github.com/metabase/metabase/pull/40982) for an example of complex query transformations to get
+  fussy BigQuery working. (More on this soon.) If all else fails, you can always specify that your driver does not
+  support `:window-functions`, and it will fall back to using the old broken implementation.
+
 ## Metabase 0.49.1

 - Another driver feature has been added: `describe-fields`. If a driver opts-in to supporting this feature, The

--- a/e2e/test/scenarios/custom-column/custom-column.cy.spec.js
+++ b/e2e/test/scenarios/custom-column/custom-column.cy.spec.js
@@ -341,7 +341,7 @@ describe("scenarios > question > custom column", () => {
      .should("have.length.of.at.least", 8);
  });

-  it.skip("should create custom column after aggregation with 'cum-sum/count' (metabase#13634)", () => {
+  it("should create custom column after aggregation with 'cum-sum/count' (metabase#13634)", () => {
    cy.createQuestion(
      {
        name: "13634",
@@ -365,7 +365,7 @@ describe("scenarios > question > custom column", () => {
    cy.log("Reported failing in v0.34.3, v0.35.4, v0.36.8.2, v0.37.0.2");
    // eslint-disable-next-line no-unscoped-text-selectors -- deprecated usage
    cy.findByText("Foo Bar");
-    cy.findAllByText("57911");
+    cy.findAllByText("57,911");
  });

  it("should not be dropped if filter is changed after aggregation (metaabase#14193)", () => {

--- a/modules/drivers/athena/src/metabase/driver/athena.clj
+++ b/modules/drivers/athena/src/metabase/driver/athena.clj
@@ -32,10 +32,11 @@
 ;;; |                                          metabase.driver method impls                                          |
 ;;; +----------------------------------------------------------------------------------------------------------------+

-(doseq [[feature supported?] {:foreign-keys              true
-                              :datetime-diff             true
-                              :nested-fields             false #_true ; huh? Not sure why this was `true`. Disabled for now.
-                              :test/jvm-timezone-setting false}]
+(doseq [[feature supported?] {:datetime-diff                                       true
+                              :foreign-keys                                        true
+                              :nested-fields                                       false
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :test/jvm-timezone-setting                           false}]
  (defmethod driver/database-supports? [:athena feature] [_driver _feature _db] supported?))

 ;;; +----------------------------------------------------------------------------------------------------------------+

--- a/modules/drivers/bigquery-cloud-sdk/src/metabase/driver/bigquery_cloud_sdk.clj
+++ b/modules/drivers/bigquery-cloud-sdk/src/metabase/driver/bigquery_cloud_sdk.clj
@@ -462,10 +462,13 @@
                              :datetime-diff           true
                              :now                     true
                              :convert-timezone        true
-                              ;; BigQuery uses timezone operators and arguments on calls like extract() and timezone_trunc() rather than literally
-                              ;; using SET TIMEZONE, but we need to flag it as supporting set-timezone anyway so that reporting timezones are
-                              ;; returned and used, and tests expect the converted values.
-                              :set-timezone            true}]
+                              ;; BigQuery uses timezone operators and arguments on calls like extract() and
+                              ;; timezone_trunc() rather than literally using SET TIMEZONE, but we need to flag it as
+                              ;; supporting set-timezone anyway so that reporting timezones are returned and used, and
+                              ;; tests expect the converted values.
+                              :set-timezone            true
+                              ;; temporarily disabled -- will be fixed by #40982
+                              :window-functions        false}]
  (defmethod driver/database-supports? [:bigquery-cloud-sdk feature] [_driver _feature _db] supported?))

 ;; BigQuery is always in UTC

--- a/modules/drivers/mongo/test/metabase/driver/mongo/query_processor_test.clj
+++ b/modules/drivers/mongo/test/metabase/driver/mongo/query_processor_test.clj
@@ -106,8 +106,14 @@
                   (qp.compile/compile
                    (mt/mbql-query attempts
                      {:aggregation [[:count]]
-                       :filter      [:time-interval $datetime :last :month]}))))
+                       :filter      [:time-interval $datetime :last :month]}))))))))))

+(deftest ^:parallel no-initial-projection-test-2
+  (mt/test-driver :mongo
+    (testing "Don't need to create initial projections anymore (#4216)"
+      (testing "Don't create an initial projection for datetime-fields that use `:default` bucketing (#14838)"
+        (mt/with-clock #t "2021-02-15T17:33:00-08:00[US/Pacific]"
+          (mt/dataset attempted-murders
            (testing "should still work even with bucketing bucketing"
              (let [tz    (qp.timezone/results-timezone-id :mongo mt/db)
                    query (mt/with-metadata-provider (mt/id)
@@ -147,8 +153,7 @@
                                      {"$project" {"_id"        false
                                                   "datetime"   "$_id.datetime"
                                                   "datetime_2" "$_id.datetime_2"
-                                                   "count"      true}}
-                                      {"$sort" {"datetime" 1}}]
+                                                   "count"      true}}]
                        :collection  "attempts"
                        :mbql?       true}
                       query))

--- a/modules/drivers/presto-jdbc/src/metabase/driver/presto_jdbc.clj
+++ b/modules/drivers/presto-jdbc/src/metabase/driver/presto_jdbc.clj
@@ -43,15 +43,16 @@
 (driver/register! :presto-jdbc, :parent #{:sql-jdbc
                                          ::sql-jdbc.legacy/use-legacy-classes-for-read-and-set})

-(doseq [[feature supported?] {:set-timezone                    true
-                              :basic-aggregations              true
-                              :standard-deviation-aggregations true
-                              :expressions                     true
-                              :native-parameters               true
-                              :expression-aggregations         true
-                              :binning                         true
-                              :foreign-keys                    true
-                              :now                             true}]
+(doseq [[feature supported?] {:basic-aggregations                                  true
+                              :binning                                             true
+                              :expression-aggregations                             true
+                              :expressions                                         true
+                              :foreign-keys                                        true
+                              :native-parameters                                   true
+                              :now                                                 true
+                              :set-timezone                                        true
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :standard-deviation-aggregations                     true}]
  (defmethod driver/database-supports? [:presto-jdbc feature] [_driver _feature _db] supported?))

 ;;; Presto API helpers

--- a/modules/drivers/redshift/src/metabase/driver/redshift.clj
+++ b/modules/drivers/redshift/src/metabase/driver/redshift.clj
@@ -33,11 +33,12 @@

 (driver/register! :redshift, :parent #{:postgres ::sql-jdbc.legacy/use-legacy-classes-for-read-and-set})

-(doseq [[feature supported?] {:test/jvm-timezone-setting false
-                              :nested-field-columns      false
-                              :describe-fields           true
-                              :describe-fks              true
-                              :connection-impersonation  true}]
+(doseq [[feature supported?] {:connection-impersonation                            true
+                              :describe-fields                                     true
+                              :describe-fks                                        true
+                              :nested-field-columns                                false
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :test/jvm-timezone-setting                           false}]
  (defmethod driver/database-supports? [:redshift feature] [_driver _feat _db] supported?))

 ;;; +----------------------------------------------------------------------------------------------------------------+

--- a/modules/drivers/snowflake/src/metabase/driver/snowflake.clj
+++ b/modules/drivers/snowflake/src/metabase/driver/snowflake.clj
@@ -49,11 +49,12 @@

 (driver/register! :snowflake, :parent #{:sql-jdbc ::sql-jdbc.legacy/use-legacy-classes-for-read-and-set})

-(doseq [[feature supported?] {:datetime-diff                          true
-                              :now                                    true
-                              :convert-timezone                       true
-                              :connection-impersonation               true
-                              :connection-impersonation-requires-role true}]
+(doseq [[feature supported?] {:connection-impersonation                            true
+                              :connection-impersonation-requires-role              true
+                              :convert-timezone                                    true
+                              :datetime-diff                                       true
+                              :now                                                 true
+                              :sql/window-functions.order-by-output-column-numbers false}]
  (defmethod driver/database-supports? [:snowflake feature] [_driver _feature _db] supported?))

 (defmethod driver/humanize-connection-error-message :snowflake

--- a/modules/drivers/sparksql/src/metabase/driver/sparksql.clj
+++ b/modules/drivers/sparksql/src/metabase/driver/sparksql.clj
@@ -209,8 +209,10 @@
                              :native-parameters               true
                              :nested-queries                  true
                              :standard-deviation-aggregations true
-                              :test/jvm-timezone-setting       false}]
-  (defmethod driver/database-supports? [:sparkql feature] [_driver _feature _db] supported?))
+                              :test/jvm-timezone-setting       false
+                              ;; disabled for now, see issue #40991 to fix this.
+                              :window-functions                false}]
+  (defmethod driver/database-supports? [:sparksql feature] [_driver _feature _db] supported?))

 ;; only define an implementation for `:foreign-keys` if none exists already. In test extensions we define an alternate
 ;; implementation, and we don't want to stomp over that if it was loaded already

--- a/modules/drivers/sqlserver/src/metabase/driver/sqlserver.clj
+++ b/modules/drivers/sqlserver/src/metabase/driver/sqlserver.clj
@@ -32,13 +32,14 @@

 (driver/register! :sqlserver, :parent :sql-jdbc)

-(doseq [[feature supported?] {:regex                                  false
-                              :case-sensitivity-string-filter-options false
-                              :now                                    true
-                              :datetime-diff                          true
-                              :convert-timezone                       true
-                              :test/jvm-timezone-setting              false
-                              :index-info                             true}]
+(doseq [[feature supported?] {:case-sensitivity-string-filter-options              false
+                              :convert-timezone                                    true
+                              :datetime-diff                                       true
+                              :index-info                                          true
+                              :now                                                 true
+                              :regex                                               false
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :test/jvm-timezone-setting                           false}]
  (defmethod driver/database-supports? [:sqlserver feature] [_driver _feature _db] supported?))

 (defmethod driver/database-supports? [:sqlserver :percentile-aggregations]
@@ -448,7 +449,9 @@
      ;; For the GROUP BY, we replace the unoptimized fields with the optimized ones, e.g.
      ;;
      ;;    GROUP BY year(field), month(field)
-      (apply sql.helpers/group-by new-hsql (mapv (partial sql.qp/->honeysql driver) optimized)))))
+      (apply sql.helpers/group-by new-hsql (mapv (partial sql.qp/->honeysql driver) optimized))
+      ;; remove duplicate group by clauses (from the optimize breakout clauses stuff)
+      (update new-hsql :group-by distinct))))

 (defn- optimize-order-by-subclauses
  "Optimize `:order-by` `subclauses` using [[optimized-temporal-buckets]], if possible."
@@ -468,7 +471,9 @@
  ;; year(), month(), and day() can make use of indexes while DateFromParts() cannot.
  (let [query         (update query :order-by optimize-order-by-subclauses)
        parent-method (get-method sql.qp/apply-top-level-clause [:sql-jdbc :order-by])]
-    (parent-method driver :order-by honeysql-form query)))
+    (-> (parent-method driver :order-by honeysql-form query)
+        ;; order bys have to be distinct in SQL Server!!!!!!!1
+        (update :order-by distinct))))

 ;; SQLServer doesn't support `TRUE`/`FALSE`; it uses `1`/`0`, respectively; convert these booleans to numbers.
 (defmethod sql.qp/->honeysql [:sqlserver Boolean]

--- a/modules/drivers/vertica/src/metabase/driver/vertica.clj
+++ b/modules/drivers/vertica/src/metabase/driver/vertica.clj
@@ -27,11 +27,12 @@
                                      ::sql-jdbc.legacy/use-legacy-classes-for-read-and-set
                                      ::sql.qp.empty-string-is-null/empty-string-is-null})

-(doseq [[feature supported?] {:percentile-aggregations   false
-                              :datetime-diff             true
-                              :now                       true
-                              :convert-timezone          true
-                              :test/jvm-timezone-setting false}]
+(doseq [[feature supported?] {:convert-timezone                                    true
+                              :datetime-diff                                       true
+                              :now                                                 true
+                              :percentile-aggregations                             false
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :test/jvm-timezone-setting                           false}]
  (defmethod driver/database-supports? [:vertica feature] [_driver _feature _db] supported?))

 (defmethod driver/db-start-of-week :vertica

--- a/src/metabase/db.clj
+++ b/src/metabase/db.clj
@@ -80,7 +80,6 @@
    (fn [& args]
      (apply f* (unique-identifier) args))))

-
 (defn increment-app-db-unique-indentifier!
  "Increment the [[unique-identifier]] for the Metabase application DB. This effectively flushes all caches using it as
  a key (including things using [[mdb/memoize-for-application-db]]) such as the Settings cache. Should only be used

--- a/src/metabase/driver.clj
+++ b/src/metabase/driver.clj
@@ -589,7 +589,10 @@
    :describe-fields

    ;; Does the driver support fingerprint the fields. Default is true
-    :fingerprint})
+    :fingerprint
+
+    ;; Does this driver support window functions like cumulative count and cumulative sum? (default: false)
+    :window-functions})

 (defmulti database-supports?
  "Does this driver and specific instance of a database support a certain `feature`?
@@ -609,8 +612,11 @@
    (database-supports? :mongo :set-timezone mongo-db) ; -> true"
  {:arglists '([driver feature database]), :added "0.41.0"}
  (fn [driver feature _database]
-    (when-not (features feature)
-      (throw (Exception. (tru "Invalid driver feature: {0}" feature))))
+    ;; only make sure unqualified keywords are explicitly defined in [[features]].
+    (when (simple-keyword? feature)
+      (when-not (features feature)
+        (throw (ex-info (tru "Invalid driver feature: {0}" feature)
+                        {:feature feature}))))
    [(dispatch-on-initialized-driver driver) feature])
  :hierarchy #'hierarchy)


--- a/src/metabase/driver/h2.clj
+++ b/src/metabase/driver/h2.clj
@@ -66,16 +66,16 @@
 ;;; |                                             metabase.driver impls                                              |
 ;;; +----------------------------------------------------------------------------------------------------------------+

-(doseq [[feature supported?] {:full-join                 false
-                              :regex                     true
-                              :percentile-aggregations   false
-                              :actions                   true
+(doseq [[feature supported?] {:actions                   true
                              :actions/custom            true
                              :datetime-diff             true
+                              :full-join                 false
+                              :index-info                true
                              :now                       true
+                              :percentile-aggregations   false
+                              :regex                     true
                              :test/jvm-timezone-setting false
-                              :uploads                   true
-                              :index-info                true}]
+                              :uploads                   true}]
  (defmethod driver/database-supports? [:h2 feature]
    [_driver _feature _database]
    supported?))

--- a/src/metabase/driver/mysql.clj
+++ b/src/metabase/driver/mysql.clj
@@ -53,20 +53,21 @@

 (defmethod driver/display-name :mysql [_] "MySQL")

-(doseq [[feature supported?] {:persist-models                         true
-                              :convert-timezone                       true
-                              :datetime-diff                          true
-                              :now                                    true
-                              :regex                                  false
-                              :percentile-aggregations                false
-                              :full-join                              false
-                              :uploads                                true
-                              :schemas                                false
-                              ;; MySQL LIKE clauses are case-sensitive or not based on whether the collation of the server and the columns
-                              ;; themselves. Since this isn't something we can really change in the query itself don't present the option to the
-                              ;; users in the UI
-                              :case-sensitivity-string-filter-options false
-                              :index-info                             true}]
+(doseq [[feature supported?] {;; MySQL LIKE clauses are case-sensitive or not based on whether the collation of the
+                              ;; server and the columns themselves. Since this isn't something we can really change in
+                              ;; the query itself don't present the option to the users in the UI
+                              :case-sensitivity-string-filter-options              false
+                              :convert-timezone                                    true
+                              :datetime-diff                                       true
+                              :full-join                                           false
+                              :index-info                                          true
+                              :now                                                 true
+                              :percentile-aggregations                             false
+                              :persist-models                                      true
+                              :regex                                               false
+                              :schemas                                             false
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :uploads                                             true}]
  (defmethod driver/database-supports? [:mysql feature] [_driver _feature _db] supported?))

 ;; This is a bit of a lie since the JSON type was introduced for MySQL since 5.7.8.

--- a/src/metabase/driver/postgres.clj
+++ b/src/metabase/driver/postgres.clj
@@ -61,13 +61,14 @@
 (defmethod driver/display-name :postgres [_] "PostgreSQL")

 ;; Features that are supported by Postgres and all of its child drivers like Redshift
-(doseq [[feature supported?] {:convert-timezone         true
-                              :datetime-diff            true
-                              :now                      true
-                              :persist-models           true
-                              :table-privileges         true
-                              :schemas                  true
-                              :connection-impersonation true}]
+(doseq [[feature supported?] {:connection-impersonation                            true
+                              :convert-timezone                                    true
+                              :datetime-diff                                       true
+                              :now                                                 true
+                              :persist-models                                      true
+                              :schemas                                             true
+                              :sql/window-functions.order-by-output-column-numbers false
+                              :table-privileges                                    true}]
  (defmethod driver/database-supports? [:postgres feature] [_driver _feature _db] supported?))

 (defmethod driver/database-supports? [:postgres :nested-field-columns]

--- a/src/metabase/driver/sql.clj
+++ b/src/metabase/driver/sql.clj
@@ -18,16 +18,17 @@

 (driver/register! :sql, :abstract? true)

-(doseq [feature [:standard-deviation-aggregations
-                 :foreign-keys
-                 :expressions
+(doseq [feature [:advanced-math-expressions
+                 :binning
                 :expression-aggregations
+                 :expressions
+                 :foreign-keys
                 :native-parameters
                 :nested-queries
-                 :binning
-                 :advanced-math-expressions
                 :percentile-aggregations
-                 :regex]]
+                 :regex
+                 :standard-deviation-aggregations
+                 :window-functions]]
  (defmethod driver/database-supports? [:sql feature] [_driver _feature _db] true))

 (doseq [join-feature [:left-join

--- a/src/metabase/driver/sql/query_processor.clj
+++ b/src/metabase/driver/sql/query_processor.clj
@@ -420,7 +420,7 @@
  of `honeysql-form`. Most drivers can use the default implementations for all of these methods, but some may need to
  override one or more (e.g. SQL Server needs to override this method for the `:limit` clause, since T-SQL uses `TOP`
  instead of `LIMIT`)."
-  {:added "0.32.0", :arglists '([driver top-level-clause honeysql-form query]), :style/indent 2}
+  {:added "0.32.0", :arglists '([driver top-level-clause honeysql-form query]), :style/indent [:form]}
  (fn [driver top-level-clause _ _]
    [(driver/dispatch-on-initialized-driver driver) top-level-clause])
  :hierarchy #'driver/hierarchy)
@@ -687,6 +687,101 @@
   (->honeysql driver mbql-expr)
   (->honeysql driver power)])

+;;; NOCOMMIT -- we need to document the new `:window-functions` feature.
+(defmethod driver/database-supports? [:sql :sql/window-functions.order-by-output-column-numbers]
+  [_driver _feature _database]
+  true)
+
+(defn- format-rows-unbounded-preceding [_clause _args]
+  ["ROWS UNBOUNDED PRECEDING"])
+
+(sql/register-clause!
+ ::rows-unbounded-preceding
+ #'format-rows-unbounded-preceding
+ nil)
+
+(defn- over-rows-with-order-by-copying-expressions
+  "e.g.
+
+    OVER (ORDER BY date_trunc('month', my_column))"
+  [driver expr]
+  (let [{:keys [group-by]} (apply-top-level-clause
+                            driver
+                            :breakout
+                            {}
+                            *inner-query*)]
+    [:over
+     [expr
+      {:order-by                  (mapv (fn [expr]
+                                          [expr :asc])
+                                        group-by)
+       ::rows-unbounded-preceding []}]]))
+
+;;; For databases that do something intelligent with ORDER BY 1, 2: we can take advantage of that functionality
+;;; implement the window function versions of cumulative sum and cumulative sum more simply.
+
+(defn- over-rows-with-order-by-output-column-numbers
+  "e.g.
+
+    OVER (ORDER BY 1 ROWS UNBOUNDED PRECEDING)"
+  [_driver expr]
+  [:over
+   [expr
+    {:order-by                  (mapv (fn [i]
+                                        [[:inline (inc i)] :asc])
+                                      (range (count (:breakout *inner-query*))))
+     ::rows-unbounded-preceding []}]])
+
+(defn- cumulative-aggregation-window-function [driver expr]
+  (let [over-rows-with-order-by (if (driver/database-supports? driver
+                                                               :sql/window-functions.order-by-output-column-numbers
+                                                               (lib.metadata/database (qp.store/metadata-provider)))
+                                  over-rows-with-order-by-output-column-numbers
+                                  over-rows-with-order-by-copying-expressions)]
+    (over-rows-with-order-by driver expr)))
+
+;;;    cum-count()
+;;;
+;;; should compile to SQL like
+;;;
+;;;    sum(count()) OVER (ORDER BY ...)
+;;;
+;;; where the ORDER BY matches what's in the query (i.e., the breakouts), or
+;;;
+;;;    sum(count()) OVER (ORDER BY 1 ROWS UNBOUNDED PRECEDING)
+;;;
+;;; if the database supports ordering by SELECT expression position
+(defmethod ->honeysql [:sql :cum-count]
+  [driver [_cum-count expr-or-nil]]
+  ;; a cumulative count with no breakouts doesn't really mean anything, just compile it as a normal count.
+  (if (empty? (:breakout *inner-query*))
+    (->honeysql driver [:count expr-or-nil])
+    (cumulative-aggregation-window-function
+     driver
+     [:sum (if expr-or-nil
+             [:count (->honeysql driver expr-or-nil)]
+             [:count :*])])))
+
+;;;    cum-sum(total)
+;;;
+;;; should compile to SQL like
+;;;
+;;;    sum(sum(total)) OVER (ORDER BY ...)
+;;;
+;;; where the ORDER BY matches what's in the query (i.e., the breakouts), or
+;;;
+;;;    sum(sum(total)) OVER (ORDER BY 1 ROWS UNBOUNDED PRECEDING)
+;;;
+;;; if the database supports ordering by SELECT expression position
+(defmethod ->honeysql [:sql :cum-sum]
+  [driver [_cum-sum expr]]
+  ;; a cumulative sum with no breakouts doesn't really mean anything, just compile it as a normal sum.
+  (if (empty? (:breakout *inner-query*))
+    (->honeysql driver [:sum expr])
+    (cumulative-aggregation-window-function
+     driver
+     [:sum [:sum (->honeysql driver expr)]])))
+
 (defn- interval? [expr]
  (mbql.u/is-clause? :interval expr))


--- a/src/metabase/driver/sql_jdbc/execute.clj
+++ b/src/metabase/driver/sql_jdbc/execute.clj
@@ -135,6 +135,7 @@
  driver/dispatch-on-initialized-driver
  :hierarchy #'driver/hierarchy)

+;;; TODO -- we should just make this a FEATURE!!!!!1
 (defmulti ^Statement statement-supported?
  "Indicates whether the given driver supports creating a java.sql.Statement, via the Connection. By default, this is
  true for all :sql-jdbc drivers.  If the underlying driver does not support Statement creation, override this as

--- a/src/metabase/legacy_mbql/util.cljc
+++ b/src/metabase/legacy_mbql/util.cljc
@@ -399,11 +399,13 @@
 (mu/defn add-order-by-clause :- mbql.s/MBQLQuery
  "Add a new `:order-by` clause to an MBQL `inner-query`. If the new order-by clause references a Field that is
  already being used in another order-by clause, this function does nothing."
-  [inner-query                                        :- mbql.s/MBQLQuery
-   [_ [_ id-or-name :as _field], :as order-by-clause] :- mbql.s/OrderBy]
-  (let [existing-fields (set (for [[_ [_ id-or-name]] (:order-by inner-query)]
-                               id-or-name))]
-    (if (existing-fields id-or-name)
+  [inner-query                           :- mbql.s/MBQLQuery
+   [_dir orderable, :as order-by-clause] :- mbql.s/OrderBy]
+  (let [existing-orderables (into #{}
+                                  (map (fn [[_dir orderable]]
+                                         orderable))
+                                  (:order-by inner-query))]
+    (if (existing-orderables orderable)
      ;; Field already referenced, nothing to do
      inner-query
      ;; otherwise add new clause at the end
@@ -577,6 +579,7 @@
                (assert (not= candidate original)
                        (str "unique-alias-fn must return a different string than its input. Input: "
                             (pr-str candidate)))
+                (swap! id+original->unique assoc [id name-key] candidate)
                (recur id candidate))))))))))

 (mu/defn uniquify-names :- [:and
@@ -632,7 +635,6 @@

  Most often, `aggregation->name-fn` will be something like `annotate/aggregation-name`, but for purposes of keeping
  the `metabase.legacy-mbql` module seperate from the `metabase.query-processor` code we'll let you pass that in yourself."
-  {:style/indent 1}
  [aggregation->name-fn :- fn?
   aggregations         :- [:sequential mbql.s/Aggregation]]
  (lib.util.match/replace aggregations
@@ -648,7 +650,6 @@
 (mu/defn pre-alias-and-uniquify-aggregations :- UniquelyNamedAggregations
  "Wrap every aggregation clause in a `:named` clause with a unique name. Combines `pre-alias-aggregations` with
  `uniquify-named-aggregations`."
-  {:style/indent 1}
  [aggregation->name-fn :- fn?
   aggregations         :- [:sequential mbql.s/Aggregation]]
  (-> (pre-alias-aggregations aggregation->name-fn aggregations)