Skip to content
Snippets Groups Projects
Unverified Commit 2ada1fc4 authored by dpsutton's avatar dpsutton Committed by GitHub
Browse files

Fix errors when downgrading then upgrading to bigquery driver (#22121)

This issue has a simple fix but a convoluted story. The new bigquery
driver handles multiple schemas and puts that schema (dataset-id) in the
normal spot on a table in our database. The old driver handled only a
single schema by having that dataset-id hardcoded in the database
details and leaving the schema slot nil on the table row.

```clojure
;; new driver describe database:
[{:name "table-1" :schema "a"}
 {:name "table-2" :schema "b"}]

;; old driver describe database (with dataset-id "a" on the db):
[{:name "table-1" :schema nil}]
```

So if you started on the new driver and then downgraded for some reason,
the table sync would see you had tables with schemas, but when it
enumerated the tables in the database on the next sync, would see tables
without schemas. It did not unify these two together, nor did it archive
the tables with a schema. You ended up with both copies in the
database, all active.

```clojure
[{:name "table-1" :schema "a"}
 {:name "table-2" :schema "b"}
 {:name "table-1" :schema nil}]
```

If you then tried to migrate back to the newer driver, we migrated them
as normal: since the old driver only dealt with one schema but left it
nil, put that dataset-id on all of the tables connected to this
connection.

But since the new driver and then the old driver created copies of the
same tables, you would end up with a constraint violation: tables with
the same name and, now after the migration, the same schema. Ignore this
error and the sync in more recent versions will correctly inactivate the
old tables with no schema.

```clojure
[{:name "table-1" :schema "a"}  <-|
 {:name "table-2" :schema "b"}    | constraint violation
 {:name "table-1" :schema "a"}] <-|

;; preferrable:
[{:name "table-1" :schema "a"}
 {:name "table-2" :schema "b"}
 {:name "table-1" :schema nil :active false}]
```
parent 826fd47a
No related branches found
No related tags found
No related merge requests found
......@@ -48,6 +48,7 @@
(eval . (put-clojure-indent 'mt/format-rows-by 1))
(eval . (put-clojure-indent 'mt/query 1))
(eval . (put-clojure-indent 'mt/test-drivers 1))
(eval . (put-clojure-indent 'mt/test-driver 1))
(eval . (put-clojure-indent 'prop/for-all 1))
(eval . (put-clojure-indent 'qp.streaming/streaming-response 1))
(eval . (put-clojure-indent 'u/select-keys-when 1))
......
......@@ -332,13 +332,18 @@
(let [db-id (u/the-id database)]
(log/infof (trs "DB {0} had hardcoded dataset-id; changing to an inclusion pattern and updating table schemas"
db-id))
(db/execute! {:update MetabaseTable
:set {:schema dataset-id}
:where [:and
[:= :db_id db-id]
[:or
[:= :schema nil]
[:not= :schema dataset-id]]]})
(try
(db/execute! {:update MetabaseTable
:set {:schema dataset-id}
:where [:and
[:= :db_id db-id]
[:or
[:= :schema nil]
[:not= :schema dataset-id]]]})
;; if we are upgrading to the sdk driver after having downgraded back to the old driver we end up with
;; duplicated tables with nil schema. Happily only in the "dataset-id" schema and not all schemas. But just
;; leave them with nil schemas and they will get deactivated in sync.
(catch Exception _e))
(let [updated-db (-> (assoc-in database [:details :dataset-filters-type] "inclusion")
(assoc-in [:details :dataset-filters-patterns] dataset-id)
(m/dissoc-in [:details :dataset-id]))]
......
......@@ -335,8 +335,24 @@
:name tbl-nm
:fields #{{:name "int_col", :database-type "INTEGER", :base-type :type/Integer, :database-position 0}
{:name "array_col", :database-type "INTEGER", :base-type :type/Array, :database-position 1}}}
(driver/describe-table :bigquery-cloud-sdk (mt/db) {:name tbl-nm, :schema "v3_test_data"}))
"`describe-table` should detect the correct base-type for array type columns")))))
(driver/describe-table :bigquery-cloud-sdk (mt/db) {:name tbl-nm, :schema "v3_test_data"}))
"`describe-table` should detect the correct base-type for array type columns")))))
(deftest sync-inactivates-old-duplicate-tables
(testing "If on the new driver, then downgrade, then upgrade again (#21981)"
(mt/test-driver :bigquery-cloud-sdk
(mt/dataset avian-singles
(try
(let [synced-tables (db/select Table :db_id (mt/id))]
(is (= 2 (count synced-tables)))
(db/insert-many! Table (map #(dissoc % :id :schema) synced-tables))
(sync/sync-database! (mt/db) {:scan :schema})
(let [synced-tables (db/select Table :db_id (mt/id))]
(is (partial= {true [{:name "messages"} {:name "users"}]
false [{:name "messages"} {:name "users"}]}
(-> (group-by :active (doto synced-tables tap>))
(update-vals #(sort-by :name %)))))))
(finally (db/delete! Table :db_id (mt/id) :active false)))))))
(deftest retry-certain-exceptions-test
(mt/test-driver :bigquery-cloud-sdk
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment