Change database sync and field values job scheduling [ci drivers]
Previously, each database was configured with its own trigger and job instances for sync+analyze and for field values scanning. We used the default thread count and a "do nothing" policy on misfires, which resulted in the race condition described below.

By default, sync and field values scans are both scheduled at 50 minutes past the hour, as two separate jobs. As an example, assume 10 databases are configured. This results in 20 jobs firing at 50 minutes past the hour, and the order in which they execute is not defined. By default there are 4 threads to execute these jobs. If all 4 threads are busy when a trigger fires, the trigger waits up to one minute (the default) for a thread. If it can't get a thread within that minute, it misfires, and the policy we had set up discards the job on a misfire.

The end result is a race condition in which sync might not run on many databases, while 4 scans or syncs could be running at the same time, potentially causing memory issues.

This commit switches to a single job for sync and a single job for field values scans. Every database still has a trigger for sync and a trigger for field values, but only 1 instance of sync and field values scanning will run at a time. In the event of a misfire, the job executes as soon as it is able, and only once.
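To make the arithmetic in the example concrete, here is a rough toy model of the old and new behaviour. It is not the scheduler's actual algorithm and not code from this commit: the `simulate` function, its parameters, and the 300-second job duration are all hypothetical, and it assumes every trigger fires at the same instant and is handled in a fixed order (the real order is undefined).

```python
import heapq


def simulate(num_jobs, threads, job_seconds,
             misfire_threshold=60, discard_on_misfire=True):
    """Toy model: all triggers fire at t=0 and compete for worker threads.

    Returns (ran, dropped) as lists of job indices. This is a simplified
    sketch of the behaviour described in the commit message, not a
    reimplementation of the real scheduler.
    """
    # Min-heap holding the time at which each worker thread becomes free.
    free_at = [0.0] * threads
    heapq.heapify(free_at)
    ran, dropped = [], []
    for job in range(num_jobs):
        start = heapq.heappop(free_at)  # earliest moment a thread is free
        if discard_on_misfire and start > misfire_threshold:
            # Waited longer than the threshold: the trigger misfires and
            # the old "do nothing" policy throws the run away.
            dropped.append(job)
            heapq.heappush(free_at, start)  # the thread was never used
        else:
            # Either a thread was free in time, or the misfired trigger
            # runs as soon as a thread becomes available.
            ran.append(job)
            heapq.heappush(free_at, start + job_seconds)
    return ran, dropped


# Old setup: 10 databases x 2 jobs, 4 threads, each job assumed to take 5 minutes.
old_ran, old_dropped = simulate(20, 4, 300)
print(len(old_ran), len(old_dropped))  # 4 16

# New setup, modelling only the single sync job: one run at a time, nothing discarded.
new_ran, new_dropped = simulate(10, 1, 300, discard_on_misfire=False)
print(len(new_ran), len(new_dropped))  # 10 0
```

Under these assumptions only the first 4 jobs to get a thread run at all, and which 4 those are depends on the undefined execution order. With one run at a time and no discarding, every database is eventually processed, at the cost of the runs being serialized.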
Showing 5 changed files:
- resources/quartz.properties: 5 additions, 0 deletions
- src/metabase/task.clj: 18 additions, 0 deletions
- src/metabase/task/sync_databases.clj: 64 additions, 49 deletions
- test/metabase/task/sync_databases_test.clj: 46 additions, 72 deletions
- test/metabase/test/util.clj: 2 additions, 1 deletion