Commit be52e4fc authored by dpsutton, committed by GitHub
Delete old persistence schemas in redshift (#30013)

* Delete old persistence schemas in redshift

Reminder that recently a change occurred that populates a cache_info
table in each persisted schema:

```sql
-- postgres/redshift
test-data=# select * from metabase_cache_424a9_379.cache_info ;
       key        |                value
------------------+--------------------------------------
 settings-version | 1
 created-at       | 2023-03-29T14:16:24.849866Z
 instance-uuid    | 407e4ba8-2bab-470f-aeb5-9fc63fd18c4e
 instance-name    | Metabase Test
(4 rows)
```

So for each cache schema, we can classify it as:
- old style (created by a commit predating this change; increasingly infrequent)
- new style, recent
- new style, expired

And we can delete them accordingly in startup:
```
2023-04-11 20:09:03,402 INFO data.redshift :: Dropping expired cache schema: metabase_cache_0149c_359
2023-04-11 20:09:04,733 INFO data.redshift :: Dropping expired cache schema: metabase_cache_0149c_70
2023-04-11 20:09:05,557 INFO data.redshift :: Dropping expired cache schema: metabase_cache_0149c_71
2023-04-11 20:09:06,284 INFO data.redshift :: Dropping expired cache schema: metabase_cache_0149c_90
...
2023-04-11 20:20:33,271 INFO data.redshift :: Dropping expired cache schema: metabase_cache_fe4a7_90
2023-04-11 20:20:34,284 INFO data.redshift :: Dropping old cache schema without `cache_info` table: metabase_cache_8f4b8_358
2023-04-11 20:20:35,076 INFO data.redshift :: Dropping old cache schema without `cache_info` table: metabase_cache_8f4b8_69
...
```
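The classification rule above can be sketched as follows. This is a hypothetical helper, not the commit's actual code: it assumes `created-at` is a `java.time.Instant` parsed from the schema's `cache_info` table, or `nil` when the table (old style) or the row is missing, and it collapses the old-style and missing-row cases that the real implementation distinguishes.

```clojure
;; Hypothetical sketch of the classification decision.
;; `created-at` is nil when cache_info (or its created-at row) is absent.
(defn- classify-schema [created-at threshold]
  (cond
    (nil? created-at)                :old-style-cache
    (t/before? created-at threshold) :expired
    :else                            :recent))
```

Everything except `:recent` (and schemas that error out during classification) is safe to drop on startup.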

It's possible this will at first cause a few flakes if we are unlucky enough to drop a cache schema without `cache_info` for an instance that is running tests at that point. But the `cache_info` table has been backported, so the chances of that shrink each day.

I've let a week elapse since that change before committing this, so hopefully it is not an issue in practice.

Number of queries: this makes a single query to get all schemas, then one query per schema to classify it. This could be unified into a single query with some shenanigans like:

```clojure
(sql/format {:select [:schema :created-at]
             :from   {:union-all
                      (for [schema schemas]
                        {:select [[[:inline schema] :schema]
                                  [{:select [:value]
                                    :from   [(keyword schema "cache_info")]
                                    :where  [:= :key [:inline "created-at"]]}
                                   :created-at]]})}}
            {:dialect :ansi})
```

But I found that this query is extremely slow and offers no benefit over the simpler, repeated queries. And since we run this cleanup on each commit now, the number of leftover schemas will be far lower, on the order of 5-10 schemas (and therefore queries), so it is not an issue.
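The simpler per-schema lookup that was kept instead can be sketched roughly like this. This is a hedged sketch, not the commit's exact code: the helper name `created-at-of` is hypothetical, and interpolating the schema name into the SQL is acceptable only because these names come from our own test harness.

```clojure
;; Sketch of the repeated approach: one small query per cache schema.
;; Returns the schema's created-at Instant, or nil if the row is missing.
(defn- created-at-of [^java.sql.Connection conn schema]
  (with-open [stmt (.createStatement conn)]
    (let [rset (.executeQuery stmt (format "select value from %s.cache_info where key = 'created-at'"
                                           schema))]
      (when (.next rset)
        (java.time.Instant/parse (.getString rset "value"))))))
```

Each such query hits a tiny single-row-per-key table, so even repeated per schema it stays fast.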

* Ngoc's suggestions

- docstring for `delete-old-schemas!`
- combine nested `doseq`
- use java-time over interop with java.time
parent 232472de
```clojure
(ns metabase.test.data.redshift
  (:require
   [clojure.java.jdbc :as jdbc]
   [clojure.string :as str]
   [java-time :as t]
   [metabase.driver.sql-jdbc.connection :as sql-jdbc.conn]
   [metabase.driver.sql-jdbc.sync :as sql-jdbc.sync]
   [metabase.driver.sql.test-util.unique-prefix :as sql.tu.unique-prefix]
   ...
```

@@ -71,15 +73,89 @@

```clojure
;;; Create + destroy the schema used for this test session

;; removed by this commit:
(defn- delete-old-schemas! [^java.sql.Connection conn]
  (with-open [rset (.. conn getMetaData getSchemas)
              stmt (.createStatement conn)]
    (while (.next rset)
      (let [schema (.getString rset "TABLE_SCHEM")
            sql    (format "DROP SCHEMA IF EXISTS \"%s\" CASCADE;" schema)]
        (when (sql.tu.unique-prefix/old-dataset-name? schema)
          (log/info (u/format-color 'blue "[redshift] %s" sql))
          (.execute stmt sql))))))

;; added by this commit:
(defn- reducible-result-set [^java.sql.ResultSet rset]
  (reify clojure.lang.IReduceInit
    (reduce [_ rf init]
      (with-open [rset rset]
        (loop [res init]
          (if (.next rset)
            (recur (rf res rset))
            res))))))

(defn- fetch-schemas [^java.sql.Connection conn]
  (reify clojure.lang.IReduceInit
    (reduce [_ rf init]
      (reduce ((map (fn [^java.sql.ResultSet rset]
                      (.getString rset "TABLE_SCHEM"))) rf)
              init
              (reducible-result-set (.. conn getMetaData getSchemas))))))

(def ^Long HOURS-BEFORE-EXPIRED-THRESHOLD
  "Number of hours that elapse before a persisted schema is considered expired."
  6)

(defn- classify-cache-schemas
  "Classifies the persistence cache schemas. Returns a map where each value is a (possibly empty) sequence of
  schemas:

  {:old-style-cache    schemas without a `cache_info` table
   :recent             schemas with a `cache_info` table that were created recently
   :expired            schemas with a `cache_info` table created more than [[HOURS-BEFORE-EXPIRED-THRESHOLD]] hours ago
   :lacking-created-at should never happen: schemas whose `cache_info` lacks an entry for `created-at`
   :unknown-error      if an error was thrown while classifying the schema}"
  [^java.sql.Connection conn schemas]
  (let [threshold (t/minus (t/instant) (t/hours HOURS-BEFORE-EXPIRED-THRESHOLD))]
    (with-open [stmt (.createStatement conn)]
      (let [classify! (fn [schema-name]
                        (try (let [sql  (format "select value from %s.cache_info where key = 'created-at'"
                                                schema-name)
                                   rset (.executeQuery stmt sql)]
                               (if (.next rset)
                                 (let [date-string (.getString rset "value")
                                       created-at  (java.time.Instant/parse date-string)]
                                   (if (t/before? created-at threshold)
                                     :expired
                                     :recent))
                                 :lacking-created-at))
                             (catch com.amazon.redshift.util.RedshiftException e
                               (if (re-find #"relation .* does not exist" (or (ex-message e) ""))
                                 :old-style-cache
                                 (do (log/error "Error classifying cache schema" e)
                                     :unknown-error)))
                             (catch Exception e
                               (log/error "Error classifying cache schema" e)
                               :unknown-error)))]
        (group-by classify! schemas)))))

(defn- delete-old-schemas!
  "Remove unneeded schemas from redshift. Local databases are thrown away after a test run. Shared cloud instances do
  not have this luxury. Test runs can create schemas where models are persisted and nothing cleans these up, leading
  to redshift clusters hitting the max number of tables allowed."
  [^java.sql.Connection conn]
  (let [{old-convention   :old
         caches-with-info :cache} (reduce (fn [acc s]
                                            (cond (sql.tu.unique-prefix/old-dataset-name? s)
                                                  (update acc :old conj s)

                                                  (str/starts-with? s "metabase_cache_")
                                                  (update acc :cache conj s)

                                                  :else acc))
                                          {:old [] :cache []}
                                          (fetch-schemas conn))
        {:keys [expired
                old-style-cache
                lacking-created-at]} (classify-cache-schemas conn caches-with-info)
        drop-sql (fn [schema-name] (format "DROP SCHEMA IF EXISTS \"%s\" CASCADE;"
                                           schema-name))]
    ;; don't delete unknown-error and recent.
    (with-open [stmt (.createStatement conn)]
      (doseq [[collection fmt-str] [[old-convention "Dropping old data schema: %s"]
                                    [expired "Dropping expired cache schema: %s"]
                                    [lacking-created-at "Dropping cache without created-at info: %s"]
                                    [old-style-cache "Dropping old cache schema without `cache_info` table: %s"]]
              schema collection]
        (log/infof fmt-str schema)
        (.execute stmt (drop-sql schema))))))

(defn- create-session-schema! [^java.sql.Connection conn]
  (with-open [stmt (.createStatement conn)]
    ...
```