Skip to content
Snippets Groups Projects
user avatar
dpsutton authored
* Handle `nil` cache during initial cache population

this bug only happens soon after startup. The general idea of the cache
is that we repopulate if needed, and other threads can use the stale
version while repopulating. This is valid except at startup before the
cache has finished populating. The `(setting.cache/cache)` value will
still be nil to other threads while the first thread populates it.

this is a race but sounds like an unlikely one to hit in practice. But
we have a special case: `site-url`. We have middleware that checks the
site-url on every api request and sets it if it is nil. So if an
instance gets two requests before the cache has been populated, it will
set it to whatever value is in the headers, overriding whatever has been
set in the UI.

The following snippet was how i diagnosed this. But to simulate startup
you can just `(require 'metabase.models.setting.cache :reload)` to get
back to an "empty" cache and let the threads race each other. I also put
high contention on reloading by dropping the millisecond cache update
interval to 7 milliseconds. But this refresh interval does not matter:
we just fall back to the old version of the cache. It is only the
initial cache population that is using `nil` as a current cache.

```clojure
public-settings=> (let [mismatch1 (atom [])
                        mismatch2 (atom [])
                        iterations 100000
                        latch (java.util.concurrent.CountDownLatch. 2)
                        nil-value (atom [])]
                    (future
                      (dotimes [_ iterations]
                        (let [value (site-url)
                              cache (metabase.models.setting.cache/cache)]
                          (when (not= value "http://localhost:3000")
                            (swap! mismatch1 conj value))
                          (when (not= (get cache "site-url") "http://localhost:3000")
                            (swap! nil-value conj cache))))
                      (.countDown latch))
                    (future
                      (dotimes [_ iterations]
                        (let [value (site-url)
                              cache (metabase.models.setting.cache/cache)]
                          (when (not= value "http://localhost:3000")
                            (swap! mismatch2 conj value))
                          (when (not= (get cache "site-url") "http://localhost:3000")
                            (swap! nil-value conj cache))))
                      (.countDown latch))
                    (.await latch)
                    (def nil-value nil-value)
                    [(count @mismatch1) (take 10 @mismatch1) (count @mismatch2) (take 10 @mismatch2)])
[0 () 1616 (nil nil nil nil nil nil nil nil nil nil)]
```

* Don't attempt to get setting values from db/cache before db ready

* We check `db-is-setup?` above this level

db check has to happen higher up at `db-or-cache-value` as neither the
db will work, nor the cache based on selecting from the db, if the db is
not set up.

* Remove some of the nested whens (thanks howon)

Lots of nesting due to requiring and checking for nil of the required
var. This can never really be nil but just to overly cautiously guard,
put in an or with the `(constantly false)`. Combine the two predicates
into a single `and` and then swap our homegrown `(when (seq x) x)` for
the equivalent `(not-empty x)`.
6ca66e6e
History
Code owners
Assign users and groups as approvers for specific file changes. Learn more.
Name Last commit Last update
..
metabase