Skip to content
Snippets Groups Projects
  • dpsutton's avatar
    c4050fee
    Delay substituting persisted native query until after analysis (#28739) · c4050fee
    dpsutton authored
    * Delay substituting persisted native query until after analysis
    
    We were substituting persisted queries into the query at too early a
    stage.
    
    Card 729 is a persisted model of `select * from categories`
    
    ```clojure
    {:type :query,
     :query {:source-table "card__729",
             :expressions {:adjective ["case"
                                       [[[">" ["field" 100 nil] 30]
                                         "expensive"]
                                        [[">" ["field" 100 nil] 20]
                                         "not too bad"]]
                                       {:default "its ok"}]},
             :aggregation [["count"]],
             :breakout [["expression" "adjective" nil]
                        ["field" 101 nil]]},
     :database 3,
     :parameters []}
    ```
    
    This would immediately get turned into an equivalent
    
    ```clojure
    {:type :query,
     :query {:source-table 12
             :persisted-info/native "select \"id\", \"ean\", \"title\", \"category\", \"vendor\", \"price\", \"rating\", \"created_at\" from \"metabase_cache_424a9_3\".\"model_729_products_r\""
             :expressions {:adjective ["case"
                                       [[[">" ["field" 100 nil] 30]
                                         "expensive"]
                                        [[">" ["field" 100 nil] 20]
                                         "not too bad"]]
                                       {:default "its ok"}]},
             :aggregation [["count"]],
             :breakout [["expression" "adjective" nil]
                        ["field" 101 nil]]},
     :database 3,
     :parameters []}
    ```
    
    And the obvious failure here is the use of `[:field 100 nil]` (price)
    against an inner query of `select ... price ... from cache_table`. This
    could break many queries and caused a flood of
    
    ```
    WARN middleware.fix-bad-references :: Bad :field clause [:field 100 #:metabase.query-processor.util.add-alias-info{:source-table :metabase.query-processor.util.add-alias-info/source, :source-alias "price"}] for field "products.price" at [:expressions "adjective" :case :>]: clause should have a :join-alias. Unable to infer an appropriate join. Query may not work as expected.
    WARN middleware.fix-bad-references :: Bad :field clause [:field 100 #:metabase.query-processor.util.add-alias-info{:source-table :metabase.query-processor.util.add-alias-info/source, :source-alias "price"}] for field "products.price" at [:expressions "adjective" :case :>]: clause should have a :join-alias. Unable to infer an appropriate join. Query may not work as expected.
    ```
    
    * forgot to commit new test file
    
    * Handle persisted before native
    
    * Add test for native models
    
    - have to populate the metadata. You can do this by running the query,
    but there are some requirements around the `:info` object to actually
    save it. So just take the more mechanical route
    - `wait-for-result` returns a time out value, but it's namespaced
    autoresolved to a namespace inside of the metabase.test forest and not
    mt itself. So easy to mess up and watch for the wrong keyword. Want an
    arity so i can check my own timeout value and throw a helpful error
    message. I checked around and noone seems to care. So not great, but at
    least people aren't checking `::mt/timed-out` when it should be
    `::tu.async/timed-out` or
    `:metabase.test.util.async/timed-out`. Possibly best scenario is we
    remove the autoresolved bit and have it return `:async/timed-out` or
    something stable.
    
    * properly compile the native query
    
    * bump timeout to set redshift metadata
    Delay substituting persisted native query until after analysis (#28739)
    dpsutton authored
    * Delay substituting persisted native query until after analysis
    
    We were substituting persisted queries into the query at too early a
    stage.
    
    Card 729 is a persisted model of `select * from categories`
    
    ```clojure
    {:type :query,
     :query {:source-table "card__729",
             :expressions {:adjective ["case"
                                       [[[">" ["field" 100 nil] 30]
                                         "expensive"]
                                        [[">" ["field" 100 nil] 20]
                                         "not too bad"]]
                                       {:default "its ok"}]},
             :aggregation [["count"]],
             :breakout [["expression" "adjective" nil]
                        ["field" 101 nil]]},
     :database 3,
     :parameters []}
    ```
    
    This would immediately get turned into an equivalent
    
    ```clojure
    {:type :query,
     :query {:source-table 12
             :persisted-info/native "select \"id\", \"ean\", \"title\", \"category\", \"vendor\", \"price\", \"rating\", \"created_at\" from \"metabase_cache_424a9_3\".\"model_729_products_r\""
             :expressions {:adjective ["case"
                                       [[[">" ["field" 100 nil] 30]
                                         "expensive"]
                                        [[">" ["field" 100 nil] 20]
                                         "not too bad"]]
                                       {:default "its ok"}]},
             :aggregation [["count"]],
             :breakout [["expression" "adjective" nil]
                        ["field" 101 nil]]},
     :database 3,
     :parameters []}
    ```
    
    And the obvious failure here is the use of `[:field 100 nil]` (price)
    against an inner query of `select ... price ... from cache_table`. This
    could break many queries and caused a flood of
    
    ```
    WARN middleware.fix-bad-references :: Bad :field clause [:field 100 #:metabase.query-processor.util.add-alias-info{:source-table :metabase.query-processor.util.add-alias-info/source, :source-alias "price"}] for field "products.price" at [:expressions "adjective" :case :>]: clause should have a :join-alias. Unable to infer an appropriate join. Query may not work as expected.
    WARN middleware.fix-bad-references :: Bad :field clause [:field 100 #:metabase.query-processor.util.add-alias-info{:source-table :metabase.query-processor.util.add-alias-info/source, :source-alias "price"}] for field "products.price" at [:expressions "adjective" :case :>]: clause should have a :join-alias. Unable to infer an appropriate join. Query may not work as expected.
    ```
    
    * forgot to commit new test file
    
    * Handle persisted before native
    
    * Add test for native models
    
    - have to populate the metadata. You can do this by running the query,
    but there are some requirements around the `:info` object to actually
    save it. So just take the more mechanical route
    - `wait-for-result` returns a time out value, but it's namespaced
    autoresolved to a namespace inside of the metabase.test forest and not
    mt itself. So easy to mess up and watch for the wrong keyword. Want an
    arity so i can check my own timeout value and throw a helpful error
    message. I checked around and noone seems to care. So not great, but at
    least people aren't checking `::mt/timed-out` when it should be
    `::tu.async/timed-out` or
    `:metabase.test.util.async/timed-out`. Possibly best scenario is we
    remove the autoresolved bit and have it return `:async/timed-out` or
    something stable.
    
    * properly compile the native query
    
    * bump timeout to set redshift metadata
Code owners
Assign users and groups as approvers for specific file changes. Learn more.