This project is mirrored from https://github.com/metabase/metabase.
  1. Oct 21, 2024
  2. Oct 18, 2024
  3. Oct 17, 2024
  4. Oct 14, 2024
  5. Oct 09, 2024
    • Lean on DB queries for describe-table for Mongo (#46598) · 6324f948
      Cal Herries authored
      
      This PR reimplements driver/describe-table for MongoDB. Previously, we queried a sample of documents from a collection and analysed them in Clojure. Now we execute a query that does a similar aggregation, with most of the calculation done in the Mongo database.
      
      Based on a few tests, performance is slightly slower when the collection contains small or deeply nested documents, but much faster for large ones. The main difference is memory usage: the Metabase instance uses very little memory because all of the aggregation is done in the database.
      
      
      Nested fields are a naturally recursive problem but here we unroll potential recursions to a `max-depth` number of queries that look for nesting at each depth level.
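
The unrolling described above can be sketched in plain Python over in-memory documents. This is only an illustration of the idea, not the driver's aggregation pipeline: `describe_fields`, `MAX_DEPTH`, and the sample documents are hypothetical, and each loop pass stands in for one per-depth query.

```python
from collections import defaultdict

MAX_DEPTH = 7  # illustrative cap, mirroring the max-depth mentioned in this commit


def describe_fields(docs, max_depth=MAX_DEPTH):
    """Collect {dotted.path: set of type names} one nesting level at a time."""
    found = defaultdict(set)
    # Each frontier entry is (path prefix, nested object found at that prefix).
    frontier = [("", doc) for doc in docs]
    for _ in range(max_depth):  # one pass stands in for one per-depth query
        next_frontier = []
        for prefix, obj in frontier:
            for key, value in obj.items():
                path = f"{prefix}.{key}" if prefix else key
                found[path].add(type(value).__name__)
                if isinstance(value, dict):
                    next_frontier.append((path, value))
        if not next_frontier:  # nothing nested any deeper, so stop early
            break
        frontier = next_frontier
    return dict(found)


docs = [
    {"name": "a", "address": {"city": "x", "geo": {"lat": 1.0}}},
    {"name": "b", "address": {"city": "y"}, "age": 3},
]
shallow = describe_fields(docs, max_depth=2)
full = describe_fields(docs)
```

With `max_depth=2` the sketch reports `address.geo` but never looks inside it, which is the trade-off a fixed depth cap makes.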
      
      * ~ use DB to describe the table
      
      * ~ optimize root query
      
      * ~ nested-level-query works and gets objects too
      
      * + root query gets objects too
      
      * + driver/describe-table :mongo works
      
      * ~ remove old implementation
      
      * Various fixes for faster sync
      
      Upgraded driver to 5.2.0
      Updated data load to insert many rather than 1 row at a time.
      Dropped max-depth to 7, see comment.
      
      ---------
      
      Co-authored-by: Case Nelson <case@metabase.com>
  6. Oct 08, 2024
    • [Databricks] Address initial remarks (#48377) · 72495873
      lbrdnk authored
      * Address initial remarks
      
      * Extract hive-like to separate module and set it as dependency
      
      * Remove hive-like also from spark
    • :race_car::rocket::race_car::rocket: :race_car::rocket: SHAVE 7 MINUTES OFF OF NON-CORE DRIVER TEST RUNS IN CI :race_car::rocket::race_car::rocket: :race_car::rocket: (#47681) · cd4d7646
      Cam Saul authored
      * Parallel driver tests PoC
      
      * Set fail-fast to false for now
      
      * Try splitting up non-driver tests to see how broken tests are
      
      * Whoops fix plain BE tests
      
      * Ok nvm I'll test this in another branch
      
      * Fix fail-fast
      
      * Experiment with the improved Hawk split logic
      
      * Fix some broken/flaky tests
      
      * Experiment: try splitting MySQL 8 tests into FOUR jobs
      
      * Divide other Postgres and MySQL tests up and use num-partitions = 2
      
      * Another test fix :wrench:
      
      * Flaky test fix :wrench:
      
      * Try making more stuff fast
      
      * Make athena fast??
      
      * Fix a few more things
      
      * Test fixes? :wrench:
      
      * Fix configs
      
      * Fix Mongo job syntax
      
      * Fix busted test from #46942
      
      * Fix Mongo config again
      
      * wait-for-port needs to specify shell I guess
      
      * More cleanup
      
      * await-port can't have a timeout-minutes I guess
      
      * Let's only parallelize MySQL for now.
      
      * Cleanup action
      
      * Cleanup wait-for-port action
      
      * Fix another flaky test
      
      * NOW driver tests will be FAST
      
      * Need to mark driver tests too
      
      * Fix wrong tag
      
      * Use Hawk 1.0.5
      
      * Fix busted metabase.public-settings-test/landing-page-setting-test
      
      * Fix busted `metabase.api.database-test/get-database-test` etc. (hopefully)
      
      * Fix busted `metabase.sync.sync-metadata.fields-test/sync-fks-and-fields-test` for Oracle
      
      * Maybe this fixed `metabase.query-processor.middleware.permissions-test/e2e-ignore-user-supplied-perms-test` maybe not
      
      * Fix busted metabase.api.dashboard-test/dependent-metadata-test because endpoint had different sort order than test
      
      * Ok my test fix did not work
      
      * Fix metabase.sync.sync-metadata.fields-test/sync-fks-and-fields-test for Redshift
      
      * Better test name
      
      * More test fixes :wrench:
      
      * Schema fix
      
      * PR feedback
      
      * Split off test partitioning into separate PR
      
      * Fix failing Oracle tests
      
      * Another round of test fixes, hopefully
      
      * Fix failing Redshift tests
      
      * Maybe the last round of test fixes
      
      * Fix Oracle
      
      * Fix stray line
  7. Oct 02, 2024
    • fix: bigquery more resilient querying (#48175) · 3fd17081
      Case Nelson authored
      * fix: bigquery more resilient querying
      
      Inline some function calls to make it easier to track what's happening.
      
      Make sure that cancellation during the initial query and subsequent page
      fetches are handled properly. Explicitly throw when cancelled.
      
      Only retry queries if bigquery says they are retryable.
      
      Try to cancel the BigQuery job if an exception or cancellation occurs.
      
      * Add comment explaining execution flow
      
      * Bump bigquery deps
      
      * Bump bigquery dependencies
      
      * Fix tests
      
      * Fix formatting
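
The retry and cancellation behaviour described in this commit can be sketched generically. This is a minimal Python illustration, not the driver's code or the BigQuery client API: `RetryableError`, `QueryCancelled`, `run_with_retries`, and the `cancel_job` callback are all hypothetical names.

```python
import threading


class RetryableError(Exception):
    """Stands in for an error the backend marks as retryable."""


class QueryCancelled(Exception):
    """Raised explicitly when the caller cancels."""


def run_with_retries(execute, cancel_job, cancelled, max_attempts=3):
    """Run `execute`, retrying only RetryableError and honouring cancellation."""
    for attempt in range(1, max_attempts + 1):
        if cancelled.is_set():
            cancel_job()  # best-effort cleanup of the remote job
            raise QueryCancelled("cancelled before attempt %d" % attempt)
        try:
            return execute()
        except RetryableError:
            if attempt == max_attempts:  # out of attempts: clean up and give up
                cancel_job()
                raise
        except Exception:
            cancel_job()  # non-retryable failure: clean up and rethrow at once
            raise


calls = {"n": 0, "cancelled_jobs": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryableError("try again")
    return "rows"


def cancel_job():
    calls["cancelled_jobs"] += 1


result = run_with_retries(flaky, cancel_job, threading.Event())
```

Here `flaky` fails twice with a retryable error and succeeds on the third attempt, so no job cancellation is issued. Any other exception type would be rethrown after a single attempt.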
  8. Sep 27, 2024
  9. Sep 26, 2024
    • Databricks JDBC driver (#42263) · c04928d5
      lbrdnk authored
      * Databricks JDBC driver base
      
      * Add databricks CI job
      
      * WIP data loading -- it works, further cleanup needed
      
      * Cleanup
      
      * Implement ->honeysql to enable data loading
      
      * Hardcode catalog job var
      
      * Implement driver methods and update tests
      
      * Derive hive instead of sql-jdbc
      
      * Cleanup leftovers after deriving hive
      
      * Run databricks tests on push
      
      * Cleanup and enable set-timezone
      
      * Disable database creation by tests
      
      * Add Databricks to broken drivers for timezone tests
      
      * Exclude Databricks from test
      
      * Enable have-select-privilege?-test
      
      * Restore sql-jdbc-drivers-using-default-describe-table-or-fields-impl post rebase
      
      * Restore joined-date-filter-test
      
      * Adjust to work with dataset definition tests
      
      * Adjust alternative date tests
      
      * Remove leftover reflection warning set
      
      * Update test exts
      
      * cljfmt vscode
      
      * Add databricks to kondo drivers
      
      * Update metabase-plugin.yaml
      
      * Update databricks_jdbc.clj
      
      * Rework test extensions
      
      * Update general data loading code to work with Databricks
      
      * Reset tests to orig
      
      * Use DateTimeWithLocalTZ for TIMESTAMP database type
      
      * Convert to LocalDateTime for set-parameter
      
      * Update test extensions field-base-type->sql-type
      
      * Update database-type->base-type
      
      * Enable creation of time columns in test data even though not supported
      
      * Fix typo
      
      * Update tests
      
      * Update tests
      
      * Update drivers.yml
      
      * Disable dynamic dataset loading tests
      
      * Adjust the iso-8601-text-fields-should-be-queryable-date-test
      
      * Update load-data/row-xform
      
      * Add time type exception to test
      
      * Update test data loading and enable test
      
      * Whitespace
      
      * Enable all driver jobs
      
      * Update comment
      
      * Make catalog mandatory
      
      * Remove comment
      
      * Remove log level from spec generation
      
      * Update sql.qp/datetime-diff
      
      * Update read-column-thunk
      
      * Remove comment
      
      * Simplify date-time->results-local-date-time
      
      * Update comment
      
      * Move definitions
      
      * Update test extension types mapping
      
      * Remove now obsolete ddl/insert-rows-honeysql-form implementation
      
      * Update sql-jdbc.conn/connection-details->spec for perturb-db-details
      
      * Update load-data/do-insert!
      
      * Remove ssh tunnel from driver as tests do not work with it
      
      * Update test
      
      * Promote ::dynamic-dataset-loading to :test/dynamic-dataset-loading and modify corresponding tests
      
      * Adjust to broken TIMESTAMP_NTZ sync
      
      * Update read-column-thunk to return timestamps always in Z
      
      * Comment
      
      * Disable tests for dynamic datasets
      
      * Return spark jobs into drivers.yml
      
      * Update Databricks CI catalog
      
      * Remove vscode cljfmt tweak
      
      * Update iso-8601-text-fields-expected-rows
      
      * Update datetime-diff
      
      * Formatting
      
      * cljfmt
      
      * Add placeholder test
      
      * Remove comment
      
      * cljfmt
      
      * Use EnableArrow=0 connection param
      
      * Remove comment
      
      * Comment
      
      * Update tests
      
      * cljfmt
      
      * Update driver's deps.edn
      
      * Update tests
      
      * Implement alternative `describe-table`
      
      * WIP Workaround for timestamp_ntz sync, will be thrown away probably
      
      * Update metabase-plugin.yaml with schema filters
      
      * Update driver to use schema filters and remove now redundant sync implementations
      
      * Update tests
      
      * Update tests extensions
      
      * Update test
      
      * Add feature flags for fast sync
      
      * Implement describe-fields
      
      * Implement describe-fks-sql
      
      * Enable fast sync features
      
      * Use full_data_type
      
      * Comment
      
      * Add exception for timestamp_ntz columns to new sync code
      
      * Implement db-default-timezone
      
      * Add timestamp_ntz ignored test
      
      * Add db-default-timezone-test
      
      * Fix typo
      
      * Update setReadOnly
      
      * Add comment on setAutoCommit
      
      * Update chunk-size
      
      * Add timezone-in-set-and-read-functions-test
      
      * Drop Athena from driver exceptions
      
      * Use set/intersection instead of a filter
      
      * Add explicit fast-sync tests
      
      * Update describe-fields-sql and add comment
      
      * Add preprocess-additional-options
      
      * Add leading semicolon test
      
      * Disable dataset creation and update comment
      
      * Rename driver to `databricks`
      
      * Use old secret names
      
      * Fix wrongly copied hsql list
      
      * Temporarily allow database creation
      
      * Add *allow-database-deletion*
      
      * Temporarily allow database creation
      
      * Disable database creation
      
      * cljfmt
      
      * cljfmt
  10. Sep 24, 2024
  11. Sep 16, 2024
    • fix: snowflake compile day trunc to timestamp_ltz (#47874) · 85857aeb
      Case Nelson authored
      Fixes #47426
      
      Date operations return values based on the timestamp offset rather than
      the session/db timezone. Converting to `timestamp_ltz` first ensures
      that we get predictable results.
      
      In the test, it is important that database timezone matches
      report_timezone so that post-processing gives proper results.
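
To see why truncating on the value's own offset differs from truncating in the session timezone, here is a small Python sketch. It does not use Snowflake. `trunc_day` and the two fixed offsets are assumptions chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone


def trunc_day(ts):
    """Truncate to midnight in whatever offset `ts` currently carries."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


# Hypothetical stored value: 2024-09-16 02:30 at UTC+02:00.
stored = datetime(2024, 9, 16, 2, 30, tzinfo=timezone(timedelta(hours=2)))
# Hypothetical session/report timezone, modelled as a fixed UTC-04:00 offset.
session_tz = timezone(timedelta(hours=-4))

by_offset = trunc_day(stored)                           # uses the value's own offset
by_session = trunc_day(stored.astimezone(session_tz))   # convert first, then truncate
```

Both inputs denote the same instant, but `by_offset` lands on 2024-09-16 while `by_session` lands on 2024-09-15. Converting to the session timezone before truncating removes the dependence on each value's stored offset.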
  12. Sep 09, 2024
    • fix: bigquery add null checks to result processing (#47590) · 07dd373f
      Case Nelson authored
      * fix: bigquery add null checks to result processing
      
      Fixes #47339
      
      On the related issue there are different stacktraces indicating
      likely sources of null pointer exceptions.
      
      1. `.getNextPage` is likely returning a nil value. I was unable to reproduce this but one thing I did notice is that `hasNextPage` is recommended over checking `.getNextPageToken`. Added nil handling around `page` possibly being nil.
      
      2. `cancel-chan` may be triggered before processing begins, in which case `execute-bigquery` would pass nil as a TableResult to the initial reducer. Testing whether cancel-chan fires at just the right moment would be too flaky for CI, but I was able to reproduce this locally, and it is fixed by the nil handling added.
      
      3. `cancel-chan` may be triggered during query processing. This is covered by a test now.
      
      * Check hasNextPage in test:
      
      * Add test for null getNextPage
      
      * Fix cljfmt
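
The nil handling described in this commit can be illustrated with a defensive paging loop. The sketch below is Python with hypothetical names (`Page`, `iter_rows`, `is_cancelled`). It mimics the shape of a paged result but is not the BigQuery client API.

```python
class Page:
    """Hypothetical stand-in for a paged result object."""

    def __init__(self, rows, next_page=None):
        self.rows = rows
        self._next = next_page

    def has_next_page(self):
        return self._next is not None

    def get_next_page(self):
        return self._next


def iter_rows(first_page, is_cancelled=lambda: False):
    """Yield rows from every page, tolerating a missing page and cancellation."""
    page = first_page
    while page is not None:  # the first page may be None if cancelled early
        for row in page.rows:
            if is_cancelled():  # stop quietly once the caller has cancelled
                return
            yield row
        # Ask whether a next page exists before fetching it; if the fetch
        # still returns None, the loop condition ends the iteration.
        page = page.get_next_page() if page.has_next_page() else None


rows = list(iter_rows(Page([1, 2], Page([3]))))
no_rows = list(iter_rows(None))
```

Passing `None` as the first page yields nothing rather than raising, which corresponds to the case where cancellation happens before any result exists.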
  13. Sep 04, 2024
  14. Aug 30, 2024
  15. Aug 29, 2024
  16. Aug 28, 2024
  17. Aug 27, 2024
  18. Aug 26, 2024
  19. Aug 23, 2024
  20. Aug 22, 2024
  21. Aug 21, 2024
  22. Aug 20, 2024
  23. Aug 14, 2024
  24. Aug 07, 2024
    • Exclude javax.annotation/javax.annotation-api (#46546) · e5a3827e
      dpsutton authored
      dep is brought in by google cloud
      
      From `clj -X:deps tree :aliases '[:ee :drivers]'`
      
      ```
        . metabase/bigquery-cloud-sdk metabase/modules/drivers/bigquery-cloud-sdk
          . com.google.cloud/google-cloud-bigquery 2.38.1
            . com.google.cloud/google-cloud-core 2.35.0
            [...]
            . javax.annotation/javax.annotation-api 1.3.2
      ```
      
      Before this change:
      
      ```
      ❯ clj -X:deps tree :aliases '[:ee :drivers]' | grep 'javax.annotation/javax.annotation-api'
            . javax.annotation/javax.annotation-api 1.3.2
      ```
      
      After:
      
      ```
      ❯ clj -X:deps tree :aliases '[:ee :drivers]' > deps-master
      
      ❯ echo $?
      0
      ```
  25. Aug 05, 2024
  26. Aug 01, 2024
    • Avoid adding temporal-unit to lhs cols (#46262) · 07648603
      lbrdnk authored
      * Avoid adding temporal-unit to lhs cols
      
      * Use :default temporal-unit
      
      * Simplify logic, update comments
      
      * Update substitute-field-filter-test
      
      * Update align-temporal-unit-with-param-type-test
      
      * Update field-filter-date-test
      
      * Use end-excluding gte lt filter for DateTime fields
      
      * Update substitute-field-filter-test
      
      * Update align-temporal-unit-with-param-type-test
      
      * Update field-filter-date-test
      
      * Update bigquery test + comment
      
      * Update date-str->qp-aware-offset-dt
      
      * Add guard for unexpected date string format
      
      This is just for completeness; I haven't encountered it.
      
      * Address review remarks
      
      * Update comment
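
The "end-excluding gte lt filter" mentioned in this commit is the usual half-open range: a date matches when `start <= ts < end`, so no timestamp late in the day is missed. A minimal Python sketch, with hypothetical helper names and no relation to the actual Metabase code:

```python
from datetime import date, datetime, time, timedelta


def day_to_half_open_range(day):
    """Return (start, end) such that start <= ts < end covers exactly `day`."""
    start = datetime.combine(day, time.min)
    end = start + timedelta(days=1)
    return start, end


def matches(ts, day):
    """True when `ts` falls on `day`, using >= start and < end."""
    start, end = day_to_half_open_range(day)
    return start <= ts < end


last_microsecond = matches(datetime(2024, 8, 1, 23, 59, 59, 999999), date(2024, 8, 1))
next_midnight = matches(datetime(2024, 8, 2, 0, 0), date(2024, 8, 1))
```

An inclusive upper bound such as `<= 23:59:59` would drop sub-second values at the end of the day. The exclusive bound avoids having to pick a precision.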
  27. Jul 31, 2024
  28. Jul 25, 2024
  29. Jul 23, 2024
  30. Jul 19, 2024
    • Add basic auth params to `dbms-version` call on Druid (#45729) · a5eb647e
      lbrdnk authored
      * Add basic auth to dbms-version
      
      * Add test
    • Remove `:foreign-keys` feature or convert to `:metadata/key-constraints` where appropriate (#44894) · 9c708c21
      lbrdnk authored
      
      * Post sync hook stub for implicit joins testing stub
      
      * Add join alias to field lvalues to enable sorting on joined fields
      
      * Disable :foreign-keys on Mongo
      
      * Require :left-join support for implicit joins instead of :foreign-keys
      
      * Update implicit joins tests
      
      * Adjust sync-fields-test
      
      * Update implicit joins feature check test
      
      * Transform post-sync-hook to normal function
      
      * Add foreign key relationships only for dbmses without :foreign-keys feature
      
      * Update test to handle Oracle correctly
      
      * Split convoluted fn
      
      * Avoid unnecessary computations for datasets with no fks
      
      * Update docstring
      
      * Fix driver usage
      
      * Transform :foreign-keys to :metadata/key-constraints in test data loading code
      
      * Update sync_test.clj
      
      * Update driver_test.clj
      
      * Update moviedb.clj
      
      * Update dataset_definition_test.clj
      
      * Update fetch_metadata.clj
      
      * Update fields_test.clj
      
      * Update driver.clj
      
      * Update driver/sql.clj
      
      * Set sql driver join support to true for all joins
      
      Deriving drivers are expected to set to false where applicable.
      
      * Update sqlite.clj
      
      * Remove foreign-keys from spark
      
      * Remove :foreign-keys from presto
      
      * Remove :foreign-keys from Athena
      
      * Remove foreign-keys from big query
      
      Reading docs it seems fk inference should be ok. Let's see the test results. Act based on that.
      
      * Update test_metadata.cljc
      
      * Set key-constraints to false for bigquery
      
      * Add foreign keys to sqlite manually until sync is fixed
      
      * Return driver require to Athena
      
      * Correct typo
      
      * Add naive primary key heuristic
      
      * Update pk fk logic to handle name components correctly
      
      * Add alias escaping to presto
      
      * Add ordering to test
      
      * Add order by to test
      
      * Update test
      
      * Remove use of rewrite-fields-to-force-using-column-aliases in order by fields
      
      * Add exception to alias forcing
      
      * Different approach to exception from alias forcing
      
      * Alternative approach for prefixing idents in bigq
      
      * All selected fields by desired alias
      
      * Rewrite only fields not from this source table
      
      * Update test
      
      * Enable breakout-on-fk-field-test for :left-join drivers
      
      * Add feature comment
      
      * Explicit joins tests foreign-keys removal
      
      * Update nested_queries_test.clj
      
      * Update remapping tests
      
      * Update tests
      
      * Update tests to handle sqlite results format
      
      * Disable metadata/key-constraints on sqlite during tests until
      
      * Address remarks
      
      * Remove mt/with-mock-fks-for-drivers-without-fk-constraints
      
      * Update bigquery test
      
      * Update tests
      
      * Adjust row level restrictions
      
      * Add parameterized-sql feature
      
      * Update comment
      
      * Update leftovers
      
      * Order keys
      
      * Remove foreign keys from frontend
      
      * Fix FE unit
      
      * Update driver changelog
      
      * Address review remark
      
      * Update test/metabase/query_processor/test_util.clj
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
      
      * Update docs/developers-guide/driver-changelog.md
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
      
      * Update docs/developers-guide/driver-changelog.md
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
      
      * Update modules/drivers/bigquery-cloud-sdk/src/metabase/driver/bigquery_cloud_sdk/query_processor.clj
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
      
      * Update docs/developers-guide/driver-changelog.md
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>
      
      * Update test
      
      * Update comment and reduce expression
      
      * Update comment
      
      * Address remarks
      
      * Fix merge
      
      ---------
      
      Co-authored-by: metamben <103100869+metamben@users.noreply.github.com>