Skip to content
Snippets Groups Projects
This project is mirrored from https://github.com/metabase/metabase. Pull mirroring updated .
  1. Dec 28, 2021
    • dpsutton's avatar
      Handle nested queries which have agg at top level and nested (#19437) · 50818a92
      dpsutton authored
      * Handle nested queries which have agg at top level and nested
      
      Previously when matching columns in the outer query with columns in the
      inner query we had use id, and then recently, field_ref. This is
      problematic for aggregations.
      
      Consider https://github.com/metabase/metabase/issues/19403
      
      The mbql for this query is
      
      ```clojure
      {:type :query
       :query {:aggregation [[:aggregation-options
                              [:count]
                              {:name "count"}]
                             [:aggregation-options
                              [:avg
                               [:field
                                "sum"
                                {:base-type :type/Float}]]
                              {:name "avg"}]]
               :limit 10
               :source-card-id 1960
               :source-query {:source-table 1
                              :aggregation [[:aggregation-options
                                             [:sum
                                              [:field 23 nil]]
                                             {:name "sum"}]
                                            [:aggregation-options
                                             [:max
                                              [:field 28 nil]]
                                             {:name "max"}]]
                              :breakout [[:field 26 nil]]
                              :order-by [[:asc
                                          [:field 26 nil]]]}}
       :database 1}
      ```
      
      The aggregations in the top level select will be type checked as :name
      "count" :field_ref [:aggregation 0]. The aggregations in the nested
      query will be turned into :name "sum" :field_ref [:aggregation 0]! This
      is because aggregations are numbered "on their level" and not
      globally. So when the fields on the top level look at the metadata for
      the nested query and merge it, it unifies the two [:aggregation 0]
      fields but this is INCORRECT. These aggregations are not the same, they
      just happen to be the first aggregations at each level.
      
      Its illustrative to see what a (select * from (query with aggregations))
      looks like:
      
      ```clojure
      {:database 1
       :query {:source-card-id 1960
               :source-metadata [{:description "The type of product, valid values include: Doohicky, Gadget, Gizmo and Widget"
                                  :semantic_type :type/Category
                                  :coercion_strategy nil
                                  :name "CATEGORY"
                                  :field_ref [:field 26 nil]
                                  :effective_type :type/Text
                                  :id 26
                                  :display_name "Category"
                                  :fingerprint {:global {:distinct-count 4
                                                         :nil% 0}
                                                :type {:type/Text {:percent-json 0
                                                                   :percent-url 0
                                                                   :percent-email 0
                                                                   :percent-state 0
                                                                   :average-length 6.375}}}
                                  :base_type :type/Text}
                                 {:name "sum"
                                  :display_name "Sum of Price"
                                  :base_type :type/Float
                                  :effective_type :type/Float
                                  :semantic_type nil
                                  :field_ref [:aggregation 0]}
                                 {:name "max"
                                  :display_name "Max of Rating"
                                  :base_type :type/Float
                                  :effective_type :type/Float
                                  :semantic_type :type/Score
                                  :field_ref [:aggregation 1]}]
               :fields ([:field 26 nil]
                        [:field
                         "sum"
                         {:base-type :type/Float}]
                        [:field
                         "max"
                         {:base-type :type/Float}])
               :source-query {:source-table 1
                              :aggregation [[:aggregation-options
                                             [:sum
                                              [:field 23 nil]]
                                             {:name "sum"}]
                                            [:aggregation-options
                                             [:max
                                              [:field 28 nil]]
                                             {:name "max"}]]
                              :breakout [[:field 26 nil]]
                              :order-by [[:asc
                                          [:field 26 nil]]]}}
       :type :query
       :middleware {:js-int-to-string? true
                    :add-default-userland-constraints? true}
       :info {:executed-by 1
              :context :ad-hoc
              :card-id 1960
              :nested? true
              :query-hash #object["[B" 0x10227bf4 "[B@10227bf4"]}
       :constraints {:max-results 10000
                     :max-results-bare-rows 2000}}
      ```
      
      The important bits are that it understands the nested query's metadata
      to be
      
      ```clojure
      {:name "sum"
       :display_name "Sum of Price"
       :field_ref [:aggregation 0]}
      {:name "max"
       :display_name "Max of Rating"
       :field_ref [:aggregation 1]}
      ```
      
      And the fields on the outer query to be:
      ```clojure
      ([:field
        "sum"
        {:base-type :type/Float}]
       [:field
        "max"
        {:base-type :type/Float}])
      ```
      
      So there's the mismatch: the field_ref on the outside is [:field "sum"]
      but on the inside the field_ref is [:aggregation 0]. So the best way to
      match up when "looking down" into sub queries is by id and then by name.
      
      * Some drivers return 4.0 instead of 4 so make them all ints
      
      * Handle dataset metadata in a special way
      
      rather than trying to set confusing merge rules, just special case
      metadata from datasets.
      
      Previously, was trying to merge the "preserved keys" on top of the
      normal merge order. This caused lots of issues. They were
      trivial. Native display names are very close to the column name, whereas
      mbql names go through some humanization. So when you select price
      from (select PRICE ...) its an mbql with a nested native query. The
      merge order meant that the display name went from "Price"
      previously (the mbql nice name for the outer select) to "PRICE", the
      underlying native column name. Now we don't special case the display
      name (and other fields) of regular source-metadata.
      
      Also, there were issues with mbql on top of an mbql dataset. Since it is
      all mbql, everything is pulled from the database. So if there were
      overrides in the nested mbql dataset, like description, display name,
      etc, the outer field select already had the display name, etc. from the
      database rather than allowing the edits to override from the nested
      query.
      
      Also, using a long `:source-query/dataset?` keyword so it is far easier
      to find where this is set. With things called just `:dataset` it can be
      quite hard to find where these keys are used. When using the namespaced
      keyword, greping and finding usages is trivial. And the namespace gives
      more context
      50818a92
    • dpsutton's avatar
      Bump tools deps (#19492) · 97e8390e
      dpsutton authored
      97e8390e
    • Alexander Polyankin's avatar
      Fix export type from cljs (#19493) · ef6aa089
      Alexander Polyankin authored
      ef6aa089
  2. Dec 27, 2021
  3. Dec 26, 2021
  4. Dec 24, 2021
  5. Dec 23, 2021
  6. Dec 22, 2021
  7. Dec 21, 2021
Loading