This project is mirrored from https://github.com/metabase/metabase.
- Nov 17, 2021
Jeff Evans authored
* Add SSL keystore and truststore secret properties to Oracle driver YAML files
* Update the Oracle driver to set keystore and truststore connection options from secret values (see the sketch below)
* Add a new `select-keys-sequentially` helper function for dealing with assertions on transformed conn props
* Add a new function to the secret namespace that returns a lighter-weight "secret map", and remove that functionality from `handle-db-details-secret-prop!` so it can be reused from connection testing
* Modify CircleCI to set the corresponding truststore secret conn prop vars instead of the JVM-level truststore settings the test had been using previously
* Expand the Oracle test connection details to incorporate all SSL-related properties, and make it a function instead of a delayed def (so changes can be picked up later)
* Misc. fixes in Oracle driver YAML files
* Update `exception-classes-not-to-retry` to include SSLHandshakeException
* Fix misc. issues in secret and database model code
* Update the CircleCI config to use the correct secret-based env var keys and values
* Add a `*database-name-override*` dynamic var in `metabase.test.data.interface` to allow DB name overriding
* Fix up the SSL connectivity test so all parts pass successfully
* Get rid of the with-redefs that swapped in the existing test-data schema; instead, just drop the DB if it was created under a different schema (similar to what H2 does)

Co-authored-by: Gustavo Saiani <gustavo@poe.ma>
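A minimal sketch of the secret-to-connection-option step, assuming the standard `javax.net.ssl.*` property names the Oracle thin driver understands; the helper name and the shape of the secret map are illustrative, not the actual Metabase code:

```clojure
;; Hypothetical helper: merge resolved keystore/truststore secret values
;; into a JDBC spec. clojure.java.jdbc forwards spec keys it doesn't
;; recognize to the driver as connection properties.
(defn add-ssl-store-options
  [spec {:keys [keystore-path keystore-password
                truststore-path truststore-password]}]
  (cond-> spec
    keystore-path       (assoc :javax.net.ssl.keyStore keystore-path)
    keystore-password   (assoc :javax.net.ssl.keyStorePassword keystore-password)
    truststore-path     (assoc :javax.net.ssl.trustStore truststore-path)
    truststore-password (assoc :javax.net.ssl.trustStorePassword truststore-password)))
```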
- Nov 08, 2021
Jeff Evans authored
Update new BigQuery driver to only use project-id in qualified names if it differs from credentials (#18843)
* Add a new key to db-details for :bigquery-cloud-sdk, :project-id-from-credentials, extracting the project-id from the service-account creds when the database is saved or updated
* Update logic in the bigquery-cloud-sdk QP to only include the project-id when qualifying an Identifier if the project-id (override) value is set and differs from the :project-id-from-credentials value mentioned above (see the sketch below)
* Add a test to confirm that a card query continues to work after changing the associated DB's driver from :bigquery to :bigquery-cloud-sdk
* Change all syncs in the test to {:scan :schema} to speed them up (a deeper scan isn't needed)
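The qualification rule reduces to one comparison; a sketch assuming both values live in db-details under the names above (the function itself is illustrative):

```clojure
;; Qualify identifiers with the project-id only when an override is set
;; and it differs from the project baked into the service-account creds.
(defn qualify-with-project-id?
  [{:keys [project-id project-id-from-credentials]}]
  (boolean (and (seq project-id)
                (not= project-id project-id-from-credentials))))

;; (qualify-with-project-id? {:project-id "other-project"
;;                            :project-id-from-credentials "my-project"})
;; => true
```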
- Nov 04, 2021
Jeff Evans authored
* Add a new dynamic var, `*source-query*`, to `sql.qp` to bind the current source query, bound in the same place that `*table-alias*` is currently bound (see the sketch below)
* Update the `bigquery-cloud-sdk` QP implementation to check `sql.qp/*source-query*` and only qualify the source-query alias if the source query is a table (which is apparently required; see #15074)
* Add test for #18742
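For context, the dynamic-var pattern lets deeply nested compilation code see the current source query without threading it through every call; a minimal sketch with illustrative names, not the real `sql.qp` internals:

```clojure
(def ^:dynamic *source-query*
  "The source query currently being compiled, if any."
  nil)

(defn- compile-inner [query]
  ;; stand-in for the real compilation logic; anything called from here
  ;; can inspect *source-query*, e.g. to check whether the source is a
  ;; table and therefore needs a qualified alias
  query)

(defn compile-query [{:keys [source-query] :as query}]
  (if source-query
    (binding [*source-query* source-query]
      (compile-inner query))
    (compile-inner query)))
```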
- Oct 27, 2021
Jeff Evans authored
* Make the checkins_interval transient datasets more uniquely named: append a timestamp suffix (initialized when the namespace is loaded) to transient dataset names (see the sketch below)
* Add verbose comments explaining what we're doing
* Add `transient-dataset?` functions for both BQ drivers to put the logic in one place (currently they just check whether the dataset is named `checkins_interval_*`)
* Ditch the "legacydriver_" prefix from the legacy BigQuery driver when creating new datasets; it's no longer needed now that transient datasets have a much more unique suffix
* Update the logic for the transient-dataset suffix so it can be more easily detected/parsed later (for cleanup)
* Remove old (more than two hours) transient datasets when the test-extensions namespaces load
* Add various docstrings
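A sketch of the naming scheme, assuming a millisecond timestamp captured at namespace load; the prefix convention and two-hour cutoff follow the description above, everything else is illustrative:

```clojure
;; Fixed once, when the namespace loads, so every dataset created by this
;; test run shares the same suffix.
(defonce ^:private transient-suffix (System/currentTimeMillis))

(defn transient-dataset-name [base-name]
  (format "%s_%d" base-name transient-suffix))

(defn transient-dataset? [dataset-name]
  ;; currently just the checkins_interval_* naming convention
  (boolean (re-matches #"checkins_interval_.*" dataset-name)))

(defn stale-transient-dataset?
  "True if the embedded timestamp is more than two hours old."
  [dataset-name now-ms]
  (when-let [[_ ts] (re-find #"_(\d{13})$" dataset-name)]
    (> (- now-ms (Long/parseLong ts))
       (* 2 60 60 1000))))
```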
- Oct 26, 2021
Pawit Pornkitprasan authored
This reverts commit c6688f5caf226c6ec5be424859fc00b93665620f. Redshift only supports one VACUUM command at a time on the whole cluster, so running VACUUM can fail with parallel test executions.
Jeff Evans authored
v2.0.0.3 -> v2.1.0.1
- Oct 19, 2021
Jeff Evans authored
* Remove duplicate property declaration from the presto-jdbc driver YAML
* Add a test that runs against all drivers to confirm that no duplicate names come out of connection-properties
* Change the way the test runs to avoid needing to initialize the test-data namespace (to make googleanalytics happy)
* Unskip repro Cypress test
- Oct 18, 2021
Cam Saul authored
Howon Lee authored
Previously, Mongo custom expressions were just columns: you couldn't group by them and you couldn't filter on them. This allows both; they weren't really tested before. Also enabled Nemanja's tests for them.
dpsutton authored
* Ensure we are paginating resultsets

Made big tables in both pg and mysql.

pg:

```sql
create table large_table (
  id serial primary key,
  large_text text
);

insert into large_table (large_text)
select repeat('Z', 4000)
from generate_series(1, 500000)
```

In mysql, use the repl:

```clojure
(jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
               ["CREATE TABLE large_table (id int NOT NULL PRIMARY KEY AUTO_INCREMENT, foo text);"])

(do (jdbc/insert-multi! (sql-jdbc.conn/db->pooled-connection-spec 5)
                        :large_table
                        (repeat 50000 {:foo (apply str (repeat 5000 "Z"))}))
    :done)

(jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
               ["ALTER TABLE large_table add column properties json default null"])

(jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
               ["update large_table set properties = '{\"data\":{\"cols\":null,\"native_form\":{\"query\":\"SELECT `large_table`.`id` AS `id`, `large_table`.`foo` AS `foo` FROM `large_table` LIMIT 1\",\"params\":null},\"results_timezone\":\"UTC\",\"results_metadata\":{\"checksum\":\"0MnSKb8145UERWn18F5Uiw==\",\"columns\":[{\"semantic_type\":\"type/PK\",\"coercion_strategy\":null,\"name\":\"id\",\"field_ref\":[\"field\",200,null],\"effective_type\":\"type/Integer\",\"id\":200,\"display_name\":\"ID\",\"fingerprint\":null,\"base_type\":\"type/Integer\"},{\"semantic_type\":null,\"coercion_strategy\":null,\"name\":\"foo\",\"field_ref\":[\"field\",201,null],\"effective_type\":\"type/Text\",\"id\":201,\"display_name\":\"Foo\",\"fingerprint\":{\"global\":{\"distinct-count\":1,\"nil%\":0.0},\"type\":{\"type/Text\":{\"percent-json\":0.0,\"percent-url\":0.0,\"percent-email\":0.0,\"percent-state\":0.0,\"average-length\":500.0}}},\"base_type\":\"type/Text\"}]},\"insights\":null,\"count\":1}}'"])
```

Then, from the terminal client, repeat this until we have 800,000 rows:

```sql
insert into large_table (foo, properties)
select foo, properties from large_table;
```

Then exercise it from code with the following:

```clojure
(-> (qp/process-query
     {:database 5 ; use appropriate db and tables here
      :type     :query
      :query    {:source-table 42
                 ;; :limit 1000000
                 }}
     ;; don't retain any rows, purely just counting,
     ;; so the resultset is what retains too many rows
     {:rff (fn [metadata]
             (let [c (volatile! 0)]
               (fn count-rff
                 ([] {:data metadata})
                 ([result] (assoc-in result [:data :count] @c))
                 ([result _row] (vswap! c inc) result))))})
    :data
    :count)
```

PG was far easier to blow up; MySQL took quite a bit of data. Then we just set a fetch size on the result set so that we (hopefully) only hold that many rows in memory in the resultset at once. The streaming will write to the download stream as it goes. PG has one other complication: the fetch size can only be honored if autoCommit is false. The reasoning seems to be that each statement runs in its own transaction and commits; to commit, it has to close resultsets, and therefore it has to realize the entire resultset, otherwise you would only get the initial page, if any.
* Set default fetch size to 500

Long queries on gcloud pg:

limit 10,000:

fetch size |   t1 |   t2 |   t3
-------------------------------
       100 | 6030 | 8804 | 5986
       500 | 1537 | 1535 | 1494
      1000 | 1714 | 1802 | 1611
      3000 | 1644 | 1595 | 2044

limit 30,000:

fetch size |    t1 |    t2 |    t3
----------------------------------
       100 | 17341 | 15991 | 16061
       500 |  4112 |  4182 |  4851
      1000 |  5075 |  4546 |  4284
      3000 |  5405 |  5055 |  4745

* Only set fetch size if not default (0)

`:additional-options "defaultRowFetchSize=3000"` can set a default fetch size, and we can easily honor that. This allows overriding it per-DB without much work on our part.

* Remove redshift custom fetch size code

This removes the automatic insertion of defaultRowFetchSize=5000 on Redshift DBs. Now we always set the fetch size to 500 in the sql-jdbc statement and prepared-statement code, and we allow custom values to persist over our default of 500. One additional benefit of removing this is that the old code always included the option even if a user added ?defaultRowFetchSize=300 themselves, so this should actually give more control to our users. Profiling quickly on selecting 79,000 rows from Redshift, there is essentially no difference between a fetch size of 500 (the new default) and 5000 (the old Redshift default); both took about 12442 ms. (The JDBC mechanics are sketched below.)

* Remove unused require of settings in redshift tests
* Appease the linter
* Remove unnecessary redshift connection details tests
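The underlying mechanics are plain JDBC: set a fetch size on the statement, and for PostgreSQL turn off autoCommit so the driver can actually use a cursor. A runnable sketch (db-spec and SQL are placeholders):

```clojure
(require '[clojure.java.jdbc :as jdbc])

(defn count-rows-streaming
  "Count rows without realizing the whole result set in memory."
  [db-spec sql]
  (with-open [^java.sql.Connection conn (jdbc/get-connection db-spec)]
    ;; PG only honors the fetch size when autoCommit is off; the implicit
    ;; per-statement commit would otherwise force the driver to realize
    ;; the entire result set up front.
    (.setAutoCommit conn false)
    (with-open [stmt (.prepareStatement conn sql)]
      (.setFetchSize stmt 500) ; the default discussed above
      (with-open [rs (.executeQuery stmt)]
        (loop [n 0]
          (if (.next rs)
            (recur (inc n))
            n))))))
```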
- Oct 05, 2021
Jeff Evans authored
Remove the `driver/format-custom-field-name` implementation for :snowflake, which was causing the problem and appears to no longer be needed. Add a test for this scenario to `metabase.query-processor-test.expression-aggregations-test`.
- Oct 03, 2021
Pawit Pornkitprasan authored
Co-authored-by: Justin McCarthy <justin@strongdm.com>
- Sep 30, 2021
Jeff Evans authored
* Add an `escape-alias` multimethod to `sql.qp` to handle translating an alias (whether for a field or an expression) into something supported by the driver in question (see the sketch below)
* Add a default impl that is essentially identity
* Mark `sql.qp/field->alias` as deprecated, and change its default implementation to invoke the new `sql.qp/escape-alias` on the field's `:name` (with the intention of removing it in a future release, since individual drivers won't need to override it so much as they'll need to override `escape-alias` itself)
* Override the `sql.qp/escape-alias` method for both BigQuery drivers to run through the same `->valid-field-identifier` function already defined for this purpose
* Add a test for a custom expression with weird characters
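The multimethod's shape, sketched; the dispatch and the BigQuery munging rule are illustrative stand-ins for the real `sql.qp` signature and `->valid-field-identifier`:

```clojure
(require '[clojure.string :as str])

(defmulti escape-alias
  "Translate a field/expression alias into something this driver supports."
  (fn [driver _alias] driver))

;; default impl: essentially identity
(defmethod escape-alias :default
  [_driver alias]
  alias)

;; drivers with restricted identifiers override it, e.g.
(defmethod escape-alias :bigquery-cloud-sdk
  [_driver alias]
  (str/replace alias #"[^\w]" "_"))
```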
- Sep 28, 2021
Pawit Pornkitprasan authored
- `TEXT` in Redshift is equivalent to `VARCHAR(256)`, so use `VARCHAR(1024)` instead
- Unskip all tests previously skipped on Redshift due to the sample-dataset issue
- Add an explicit `:order-by` to `join-source-queries-with-joins-test` to make it pass on Redshift, since Redshift does not preserve order after a join (like a lot of other distributed query engines)
- Add `VACUUM` and `ANALYZE` calls after INSERT to improve performance
Jeff Evans authored
* Fix BigQuery Percentile and Median expressions not using correct backtick quoting
* Ensure the identifier components are quoted by reifying `ToSql` for the `APPROX_QUANTILES` expression
* Add a test to ensure the query is created properly
* Mark `:percentile-aggregations` as supported in the new driver
- Sep 27, 2021
Jeff Evans authored
* Add a test for ordering when nils are present, to ensure that we can handle a column containing nils in an ORDER BY clause
* Add a `format-name` override for `bigquery-cloud-sdk` to convert dashes to underscores in test table names
* Add a comment showing the ID of each row in the daily-bird-counts.edn file
* Format the last column as int to make Oracle happy
* Add impls of `sorts-nil-first?` for `presto` and `presto-jdbc` returning false
* Make the `sorts-nil-first?` multimethod accept a base type in order to return an answer (see the sketch below); Vertica, at least, has a different answer for different types; see: https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AnalyzingData/Optimizations/NULLPlacementByAnalyticFunctions.htm#2
* Update the Vertica implementation to consider the `base-type`, per its documentation
* Update all invocations of `sorts-nil-first?` in tests to pass the relevant base type
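A sketch of the base-type-aware signature; the Presto impl matches the description above, while the Vertica rule shown is a placeholder for whatever its documentation actually specifies:

```clojure
(defmulti sorts-nil-first?
  "Does this driver place NULLs first when sorting a column of `base-type`?"
  (fn [driver _base-type] driver))

(defmethod sorts-nil-first? :default
  [_driver _base-type]
  true)

;; Presto sorts NULLs last regardless of type
(defmethod sorts-nil-first? :presto-jdbc
  [_driver _base-type]
  false)

;; Vertica's NULL placement varies by column type, so the impl must
;; consult base-type (illustrative rule only; see the linked docs)
(defmethod sorts-nil-first? :vertica
  [_driver base-type]
  (not (isa? base-type :type/Text)))
```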
Jeff Evans authored
* Support query cancellation in the new BigQuery driver
* Remove the now-unused 60-second timeout
* Remove a bunch of unneeded code (`with-finished-response`) from the legacy driver
* Add support for canceling a query, which stops pagination of results (see the sketch below)
* Add a test to confirm pagination stops upon cancellation
* Add a test to confirm that max-rows enforcement via middleware prevents unnecessary pages from being fetched
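Cancellation can simply stop the page walk. A sketch over the google-cloud-bigquery `TableResult` API (`getValues`, `hasNextPage`, and `getNextPage` are real methods on that class; the `canceled?` promise and its plumbing are assumptions):

```clojure
(defn lazy-pages
  "Lazily concatenate result pages, stopping early once `canceled?`
  (a promise delivered by the query processor) is realized."
  [^com.google.cloud.bigquery.TableResult page canceled?]
  (lazy-seq
   (when-not (realized? canceled?)
     (concat (.getValues page)          ; rows on the current page
             (when (.hasNextPage page)  ; fetch further pages on demand
               (lazy-pages (.getNextPage page) canceled?))))))
```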
- Sep 20, 2021
Jeff Evans authored
Add a separate catch clause for BigQueryException to `execute-bigquery`; the exception conveniently includes a property indicating whether it's retryable, so add that to the ex-data map (see the sketch below). Add a test confirming that a retryable exception is, in fact, retried.
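A sketch of surfacing the flag (`BigQueryException#isRetryable` is part of the google-cloud-bigquery API; the wrapper function is illustrative):

```clojure
(defn execute-with-retry-info!
  "Run `thunk`, rethrowing BigQuery errors with a :retryable? flag in
  ex-data so retry middleware can decide whether to try again."
  [thunk]
  (try
    (thunk)
    (catch com.google.cloud.bigquery.BigQueryException e
      (throw (ex-info (or (.getMessage e) "BigQuery error")
                      {:retryable? (.isRetryable e)}
                      e)))))
```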
- Sep 16, 2021
Jeff Evans authored
Recognize mode=REPEATED and return :type/Array for it (see the sketch below). Add a test with a temp table using an array-type column to ensure the base type is recognized properly.
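The mapping is a one-line special case in front of the normal type lookup; a sketch with an abbreviated type table (the real mapping covers many more types):

```clojure
(def ^:private bigquery-type->base-type
  ;; abbreviated for illustration
  {"STRING"    :type/Text
   "INTEGER"   :type/Integer
   "TIMESTAMP" :type/DateTime})

(defn field->base-type
  "mode=REPEATED means the column is an array of its element type."
  [{:keys [mode type]}]
  (if (= mode "REPEATED")
    :type/Array
    (get bigquery-type->base-type type :type/*)))
```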
- Sep 15, 2021
Jeff Evans authored
Make use of the new `visible-if` property in the plugin definition
- Sep 13, 2021
Gustavo Saiani authored
Jeff Evans authored
* Add parameter handling for BigDecimal that checks whether the value is in range for NUMERIC or needs BIGNUMERIC, and sets the parameter type accordingly (see the sketch below)
* Update the parameter test to hit both cases
* Modify test code/macros to support a with-table semantic for testing the decimal types via a temp table
* Write a new test to confirm that all numeric types are synced correctly and can be queried
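A sketch of the range check, assuming BigQuery's documented NUMERIC limits (38 digits of precision, scale up to 9) as the cutoff; the function name is illustrative:

```clojure
(defn bigdecimal-param-type
  "Pick the BigQuery parameter type for a BigDecimal value."
  [^java.math.BigDecimal d]
  (if (or (> (.precision d) 38)
          (> (.scale d) 9))
    :BIGNUMERIC
    :NUMERIC))

;; (bigdecimal-param-type 1.5M)   => :NUMERIC
;; (bigdecimal-param-type 1e-20M) => :BIGNUMERIC  (scale 20 exceeds 9)
```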
- Sep 10, 2021
Howon Lee authored
This was a miserable grind in the old BQ syntax but a happy fun time in the new BQ syntax: basically just turning it on and squashing a bunch of test bugs. One test bug to note especially: the underscore-transformed set of bird scarcity data for the bird scarcity tests is actually necessary, because BQ will hit you with a complaint about underscores even if you successfully and correctly quote the table names.
- Sep 01, 2021
pawit-metabase authored
* Update sqlite-jdbc to 3.36.0.1: the new version includes arm64 binaries and allows Cypress, which uses SQLite, to run on Apple Silicon
* Improve date/time handling in SQLite
- Aug 31, 2021
Cam Saul authored
* Test init code: read driver parents from driver deps.edn
* Rename :metabase.build-driver/parents -> :metabase.driver/parents
- Aug 30, 2021
Jeff Evans authored
* Randomize BigQuery test view names: use tu/random-name instead of a name based on the output of gensym
* Add an extra prefix for databases created by the legacy BigQuery driver
* Update the legacy BigQuery driver to use unique names for transient datasets: for datasets created and destroyed by the legacy driver's QP tests, add an additional prefix so they don't conflict with the new BigQuery driver tests
- Aug 25, 2021
Jeff Evans authored
0.254 -> 0.260
dpsutton authored
* Include the result's base type as the col's effective type

Previously we merged in the effective type from the col, which caused issues with dates when selecting a particular temporal unit, i.e. month. Queries like this return months as integers, so the base and effective types are :type/Integer, and something somewhere else is responsible for the unit conversion of `1` to January (or Feb; months get weird, I think).

```clojure
;; after
{:description "The date and time an order was submitted.",
 :unit :month-of-year,
 :name "CREATED_AT",
 :field_ref [:field 4 {:temporal-unit :month-of-year}],
 :effective_type :type/Integer, ;; we have the proper effective type
 :display_name "Created At",
 :base_type :type/Integer}

;; before
{:description "The date and time an order was submitted.",
 :unit :month-of-year,
 :name "CREATED_AT",
 :field_ref [:field 4 {:temporal-unit :month-of-year}],
 :effective_type :type/DateTime, ;; included effective type from db
 :display_name "Created At",
 :base_type :type/Integer}
```

* Tests
* Driver tests
* Ignore effective/base types in breakout tests (sparksql is weird, and it's totally unrelated to the file under test)
* Use correct options for datetimes that are projected to month, etc.

When aggregating by a datetime column, you can choose a unit: month, hour-of-day, week, etc. You often (always?) end up with an integer. Previously the base_type was (inexplicably) :type/Integer and the effective_type was :type/DateTime, which really just didn't make sense. The ideal solution is :type/DateTime as the base_type and :type/Integer as the effective_type, but this breaks the frontend, which expects to treat these columns in a special way and expects them to report as date columns. I've changed it only here. I thought about making isDate understand the `column.unit` attribute and recognize that these are in fact dates, but that seems wrong: these are integers that are special cases. Contexts that need to talk about dates should understand integers in a special way, rather than all date code needing to be aware that integers might flow through. This might mean extra work, but it ultimately feels better as the correct solution.

* unit is not default
- Aug 24, 2021
Jeff Evans authored
* Add Kerberos support to the Presto JDBC driver
* Add Kerberos properties as DB details properties
* Refactor utility methods for dealing with additional options
* Add a test for Kerberos properties -> connection string (see the sketch below)
* Add a GitHub action to run the integration test
* Start moving logic for capturing necessary files into a new script, and call it from mba run
* Adjust settings in `metabase.test.data.presto-jdbc` to capture test parameters
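The details->connection-string step amounts to rendering the prestodb JDBC driver's documented Kerberos properties as URL options; the detail-map keys and the helper below are illustrative:

```clojure
(require '[clojure.string :as str])

(defn kerberos-details->url-options
  "Render Kerberos connection details as Presto JDBC URL options."
  [{:keys [kerberos-principal kerberos-remote-service-name
           kerberos-keytab-path kerberos-config-path]}]
  (->> {"KerberosPrincipal"         kerberos-principal
        "KerberosRemoteServiceName" kerberos-remote-service-name
        "KerberosKeytabPath"        kerberos-keytab-path
        "KerberosConfigPath"        kerberos-config-path}
       (filter val)                       ; drop unset options
       (map (fn [[k v]] (str k "=" v)))
       (str/join "&")))

;; appended to something like "jdbc:presto://host:7778/catalog?SSL=true&"
```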
Jeff Evans authored
* Fix counting on grouped aggregated field
* Add QP test
* Rework the BigQuery version of the :breakout to more closely match the structure of the standard sql.qp version
- Aug 23, 2021
Howon Lee authored
Mongo custom expressions are now turned on for Mongo 5.0 and later only. One idiosyncrasy is that we don't really support the BSON ID format at all in our type system despite having it in there, but both of my attempts to get BSON IDs to work with the native Mongo pipeline commands bounced, because they really aren't strings, they're BSON IDs. We rely heavily on the pipeline commands, which is the reason for the version requirement.
- Aug 20, 2021
Jeff Evans authored
New BigQuery Driver
* Create a new :bigquery-cloud-sdk driver that uses the google-cloud-bigquery library (https://cloud.google.com/bigquery/docs/reference/libraries), with source adapted from the :bigquery driver
* Mark the existing :bigquery driver as deprecated and superseded-by the new one (bigquery-cloud-sdk)
* Update the new driver and query processor code to use the newer Google SDK
* Switch test data loading over to the new API
* Add a project-id connection property to override the value from the service account JSON, and use it as part of qualified names in the query processor if set
* Update the google driver so its libraries are compatible with the newer ones used in BigQuery
* Update date-bucketing tests to skip :bigquery-cloud-sdk (the new driver) wherever :bigquery is skipped
* Update `with-bigquery-fks` to take the driver, since there are now multiple
* Add a test to confirm that overriding project-id for a public BQ project works (sync and query)
* Fix a bunch of whitespace alignment errors in tests
- Aug 16, 2021
Jeff Evans authored
From 3.12.13 -> 3.13.6
- Aug 10, 2021
Cam Saul authored
- Aug 05, 2021
dpsutton authored
- Aug 04, 2021
Jeff Evans authored
Remove the `driver/can-connect?` implementation and move the `select-keys` logic to the existing place in `sql-jdbc.conn/connection-details->spec`. Update the test to also include the `:let-user-control-scheduling` details entry.
- Aug 02, 2021
Jeff Evans authored
* Update the schema connection property to have a proper display name
* Update the display name for the database to be "Schema"
* Fix the `driver/can-connect?` impl in `presto_jdbc.clj` to actually use the correct driver, and dissoc `:engine` before calling the `sql-jdbc.conn` method
* Add a test for the `test-database-connection` method, which is what led to the error from the UI
- Jul 30, 2021
Cam Saul authored
- Jul 29, 2021
Cam Saul authored
* Fix some Eastwood failures
* Fix a lot of Eastwood errors now that it runs against test namespaces [ci skip]
* Run Eastwood against test namespaces [WIP]
* Bump Eastwood version to fix some errors in Eastwood itself
* Fix another lint warning
* Fix test failure
* More test fixes
* Fix all the warnings!
* Test fix
* Bump Eastwood GH action timeout to 20 minutes (!)
- Jul 23, 2021
Jeff Evans authored
Implement JDBC-based Presto driver
* Add a new Presto JDBC driver using the PrestoDB JDBC driver from https://github.com/prestodb/presto
* Mark the old Presto driver as `superseded-by` the new one
* Pull common Presto code out into a new presto-common driver (modeled after the relationship between, e.g., `googleanalytics` and `google`)
* Put common QP/HoneySQL logic into the new (abstract) :presto-common driver
* Update the :presto driver to extend from the new common driver, adding only HTTP/REST-API-related methods
* Add the implementation of the Presto JDBC driver, named :presto-jdbc, extending :presto-common and :sql-jdbc
* Use com.facebook.presto/presto-jdbc as the underlying JDBC driver dependency (since this is explicitly for Presto clusters, as opposed to Trino)
* Adapt code from the existing Presto driver where appropriate
* Add a new dependency-satisfied? implementation for :env-var, to allow a plugin to require that an env var be set, and make the Presto JDBC driver depend on that (specifically: `mb-enable-presto-jdbc-driver`)
* Add CircleCI configuration to run against the newer Presto (0.254) Docker image
* Add explicit ordering in a few tests where it was missing
* Fix presto-type->base-type (timestamps were being synced as :type/Time because the regex pattern was wrong)
* Add a tx/format-name test implementation for :presto-jdbc to lowercase table names
* Make a modified test Oracle-friendly
* Fix a bug parsing the `[:zone :time]` case in `metabase.util.date-2.parse/parse-with-formatter`: the offset is nil and can't be passed directly in this case, so use the `standard-offset` fn (moved from `date-2` to `common`) to get a standard offset for that zone
* Fix more test failures by adding explicit ordering
* Change sync to check whether the driver supports foreign keys before attempting to sync them (since drivers might throw an exception when checking)
* Move common test-dataset functionality shared by :presto and :presto-jdbc into a new test.data ns for :presto-common
* Add a HoneySQL form for :count-where, since we have to explicitly give a higher-precision Decimal for Presto not to reduce the precision in results
* Put the limit within a subquery for `expression-using-aggregation-test` (since ordering from a subquery is not guaranteed in Presto)
* Add impls for ->prepared-substitution to handle substitutions for native query params
* Add HoneySQL impls for the `mod` function (rendered as "mod(x, y)") and `timestamp` (a function invoked without parens)
* Add various `sql.qp/date` impls that use the `AT TIME ZONE` operator to account for the report time zone, making the bucketing tests happy (see the sketch below)
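One of the `AT TIME ZONE` date impls might look roughly like this in the HoneySQL 1.x style Metabase used at the time; `col` as a pre-rendered SQL snippet and the helper names are assumptions, not the driver's actual code:

```clojure
(require '[honeysql.core :as hsql])

(defn at-time-zone
  "Wrap a column in Presto's AT TIME ZONE operator."
  [col tz]
  (hsql/raw (format "%s AT TIME ZONE '%s'" col tz)))

(defn truncate-to
  "date_trunc the time-zone-adjusted column to `unit` (:day, :month, ...)."
  [unit col tz]
  (hsql/call :date_trunc
             (hsql/raw (format "'%s'" (name unit)))
             (at-time-zone col tz)))

;; used inside a larger honeysql query map, this renders as:
;;   date_trunc('day', "created_at" AT TIME ZONE 'US/Pacific')
```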