;; Unfortunately the process of loading test data onto an external server for CI purposes is a little involved. Before testing against Druid, you'll need to perform the following steps:
;; For EC2 instances, make sure to expose ports `8082` & `8090` for Druid while loading data. Once data has finished loading, you only need to expose port `8082`.
;;
;; 1. Setup Zookeeper
;; 1A. Download & extract Zookeeper from `http://zookeeper.apache.org/releases.html#download`
;; 1B. Create `zookeeper/conf/zoo.cfg` -- see the Getting Started Guide: `http://zookeeper.apache.org/doc/r3.4.6/zookeeperStarted.html`
;; 1C. `zookeeper/bin/zkServer.sh start`
;; 1D. `zookeeper/bin/zkServer.sh status` (to make sure it started correctly)
;; 2. Setup Druid
;; 2A. Download & extract Druid from `http://druid.io/downloads.html`
;; 2B. `cp druid/run_druid_server.sh druid/run_historical.sh` and bump the `-Xmx` setting to `6g` or so
;; The task will keep you apprised of its progress until it completes (takes 1-2 minutes)
;; 4B. Keep an eye on `<host>:8082/druid/v2/datasources`. (e.g. "http://ec2-52-90-109-199.compute-1.amazonaws.com:8082/druid/v2/datasources")
;; This endpoint will return an empty array until the broker knows about the newly ingested segments. When it returns an array with the string `"checkins"` you're ready to
;; run the tests.
;; 4C. Kill the `overlord` process once the data has finished loading.
;; 5. Running Tests
;; 5A. You can run tests like `ENGINES=druid MB_DRUID_PORT=8082 MB_DRUID_HOST=http://ec2-52-90-109-199.compute-1.amazonaws.com lein test`
;; Unfortunately the process of loading test data onto an external server for CI purposes is a little involved.
;; A complete step-by-step guide is available on the wiki at `https://github.com/metabase/metabase/wiki/Setting-up-Druid-for-CI-on-EC2`
;; Refer to that page for more information.
(def^:private^:constdefault-filename"Default filename for batched ingestion data file."
"checkins.json")
...
...
@@ -126,7 +98,7 @@
(def^:private^:constindexer-timeout-seconds
"Maximum number of seconds we should wait for the indexing task to finish before deciding it's failed."