Skip to content
Snippets Groups Projects
Unverified Commit 48b6f349 authored by Chris Truter's avatar Chris Truter Committed by GitHub
Browse files

Opt out of non-reserved SQL keywords (#44609)

parent d145a247
Branches
Tags
No related merge requests found
......@@ -76,7 +76,7 @@
io.github.eerohele/pp {:git/tag "2024-01-04.60" ; super fast pretty-printing library
:git/sha "a428751"
:git/url "https://github.com/eerohele/pp"}
io.github.metabase/macaw {:mvn/version "0.1.49"} ; Parse native SQL queries
io.github.metabase/macaw {:mvn/version "0.1.60"} ; Parse native SQL queries
;; The 2.X line of Resilience4j requires Java 17, so we cannot upgrade this dependency until that is our minimum JVM version
io.github.resilience4j/resilience4j-retry {:mvn/version "1.7.1" #_ "must be 1.7.1"} ; Support for retrying operations
io.prometheus/simpleclient_hotspot {:mvn/version "0.16.0"} ; prometheus jvm collector
......
......@@ -3,11 +3,11 @@
namespace is to:
1. Translate Metabase-isms into generic SQL that Macaw can understand.
2. Contain Metabase-specific business logic.
2. Encapsulate Metabase-specific business logic.
The primary way of interacting with parsed queries is through their associated QueryFields (see model
file). QueryFields are maintained through the `update-query-fields-for-card!` function. This is invoked as part of
the lifecycle of a card (see Card model).
the lifecycle of a card (see the Card model).
Query rewriting happens with the `replace-names` function."
(:require
......@@ -72,10 +72,10 @@
[field value]
(if-let [f (quote->stripper (first value))]
[:= field (f value)]
;; Technically speaking this is not correct for all databases.
;; Technically speaking, this is not correct for all databases.
;;
;; For example Oracle treats non-quoted identifiers as uppercase, but still expects a case-sensitive match.
;; Similarly, Postgres treat all non-quoted identifiers as lowercase, and again expects an exact match.
;; For example, Oracle treats non-quoted identifiers as uppercase, but still expects a case-sensitive match.
;; Similarly, Postgres treats all non-quoted identifiers as lowercase, and again expects an exact match.
;; H2 on the other hand will choose whether to cast it to uppercase or lowercase based on a system variable... T_T
;;
;; MySQL, by contrast, is truly case-insensitive, and as the lowest common denominator it's what we cater for.
......@@ -150,12 +150,12 @@
(let [db-id (:database query)
macaw-opts (nqa.impl/macaw-options driver)
sql-string (:query (nqa.sub/replace-tags query))
parsed-query (macaw/query->components (macaw/parsed-query sql-string) macaw-opts)
parsed-query (macaw/query->components (macaw/parsed-query sql-string macaw-opts) macaw-opts)
explicit-ids (explicit-field-ids-for-query parsed-query db-id)
implicit-ids (set/difference
(implicit-field-ids-for-query parsed-query db-id)
explicit-ids)]
{:explicit explicit-ids
{:explicit explicit-ids
:implicit implicit-ids}))
(defn field-ids-for-native
......
......@@ -29,7 +29,7 @@
;; For MySQL, columns and aliases can never be case-sensitive, and for SQL Server the default collation is case-
;; insensitive too, so it makes sense to just treat all databases as case-insensitive as a whole.
;;
;; In future Macaw may support discriminating on the identifier type, in which case we could be more precise for
;; In future, Macaw may support discriminating on the identifier type, in which case we could be more precise for
;; these databases. Being 100% correct would require querying system variables and schema configuration however,
;; which is likely a step too far in complexity.
;;
......@@ -40,4 +40,11 @@
{:case-insensitive :agnostic
;; For both MySQL and SQL Server, whether identifiers are case-sensitive depends on database configuration only,
;; and quoting has no effect on this, so we disable this option for consistency with `:case-insensitive`.
:quotes-preserve-case? (not (#{:mysql :sqlserver} driver))}))
:quotes-preserve-case? (not (contains? #{:mysql :sqlserver} driver))
;; There is no plan to be exhaustive yet.
;; Note that while an allowed list would be more conservative, at the time of writing only 2 of the bundled
;; drivers use FINAL as a reserved word, and mentioning them all would be prohibitive.
;; In the future, we will use multimethods to define this explicitly per driver, or even discover it automatically
;; through the JDBC connection, where possible.
:non-reserved-words (vec (remove nil? [(when-not (contains? #{:snowflake :oracle} driver)
:final)]))}))
......@@ -102,7 +102,7 @@
;; Plain variable
;; Note that the order of the clauses matters: `card-ref?` or `snippet?` could be true when is a `Param?`,
;; so we need to handle those cases specially first and leave this as the the token fall-through
;; so we need to handle those cases specially first and leave this as the token fall-through
(params/Param? token)
(let [sub (gen-variable-sentinel raw-query)]
(recur rest
......@@ -116,7 +116,7 @@
(add-tag substitutions sub token)))
:else
;; "this should never happen" but if it does we certainly want to know about it.
;; "this should never happen" but if it does, we certainly want to know about it.
(throw (ex-info "Unsupported token in native query" {:token token})))))))
(defn- replace-all
......@@ -146,8 +146,8 @@
([query renames]
;; Postgres is both popular and adheres closely to the standard SQL specifications.
(replace-names :postgres query renames))
;; Currently we take just the driver, but in future it may more sense to take the entire database entity, in order to
;; match the actual configuration, reserved words for the given version, etc.
;; Currently we take just the driver, but in future it may more sense to take the entire database entity, to match
;; the actual configuration, reserved words for the given version, etc.
([driver query renames]
(let [raw-query (get-in query [:native :query])
parsed-query (params.parse/parse raw-query)
......
......@@ -2,6 +2,7 @@
(:require
[clojure.test :refer :all]
[metabase.db.connection :as mdb.connection]
[metabase.driver :as driver]
[metabase.native-query-analyzer :as query-analyzer]
[metabase.public-settings :as public-settings]
[metabase.test :as mt]))
......@@ -49,9 +50,9 @@
(q "select \"ID\" from venues"))))
(testing "you can mix quoted and unquoted names"
(is (= {:explicit #{(mt/id :venues :id) (mt/id :venues :name)} :implicit nil}
(q "select v.\"ID\", v.name from venues")))
(q "select v.\"ID\", v.name from venues v")))
(is (= {:explicit #{(mt/id :venues :id) (mt/id :venues :name)} :implicit nil}
(q "select v.`ID`, v.name from venues"))))
(q "select v.`ID`, v.name from venues v"))))
(testing "It will find all relevant columns if query is not specific"
(is (= {:explicit #{(mt/id :venues :id) (mt/id :checkins :id)} :implicit nil}
(q "select id from venues join checkins"))))
......@@ -64,4 +65,9 @@
:implicit count)))
(is (= 6
(-> (q "select v.* from venues v join checkins")
:implicit count)))))))
:implicit count))))
(when (not (contains? #{:snowflake :oracle} driver/*driver*))
(testing "Analysis does not fail due to keywords that are only reserved in other databases"
(is (= {:explicit #{(mt/id :venues :id)} :implicit nil}
(q "select id as final from venues"))))))))
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment