Commits · a610b92bda2c8fcc4a54f7a60e8428150fb498a6 · Engineering Digital Service / Metabase

This project is mirrored from https://github.com/metabase/metabase. Pull mirroring updated 12 minutes ago.

Oct 18, 2021

Consolidate DB connection timeouts; raise default to 10 seconds (#18541) · a610b92b
Cam Saul authored 3 years ago

Unverified

a610b92b
Cap XLSX column widths at 255 characters (#18535) · 5252d8c4
Noah Moss authored 3 years ago

Unverified

5252d8c4

Ensure we are paginating resultsets (#18477) · a33fa568

dpsutton authored 3 years ago

* Ensure we are paginating resultsets

Made big tables in both pg and mysql

pg:
```sql
create table large_table
(
    id         serial primary key,
    large_text text
);

insert into large_table (large_text)
select repeat('Z', 4000)
from generate_series(1, 500000)
```

In mysql use the repl:
```clojure

  (jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
                 ["CREATE TABLE large_table (id int NOT NULL PRIMARY KEY AUTO_INCREMENT, foo text);"])

  (do
    (jdbc/insert-multi! (sql-jdbc.conn/db->pooled-connection-spec 5)
                        :large_table
                        (repeat 50000 {:foo (apply str (repeat 5000 "Z"))}))
    :done)

  (jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
                 ["ALTER TABLE large_table add column properties json default null"])

  (jdbc/execute! (sql-jdbc.conn/db->pooled-connection-spec 5)
                 ["update large_table set properties = '{\"data\":{\"cols\":null,\"native_form\":{\"query\":\"SELECT
                 `large_table`.`id` AS `id`, `large_table`.`foo` AS `foo` FROM `large_table` LIMIT
                 1\",\"params\":null},\"results_timezone\":\"UTC\",\"results_metadata\":{\"checksum\":\"0MnSKb8145UERWn18F5Uiw==\",\"columns\":[{\"semantic_type\":\"type/PK\",\"coercion_strategy\":null,\"name\":\"id\",\"field_ref\":[\"field\",200,null],\"effective_type\":\"type/Integer\",\"id\":200,\"display_name\":\"ID\",\"fingerprint\":null,\"base_type\":\"type/Integer\"},{\"semantic_type\":null,\"coercion_strategy\":null,\"name\":\"foo\",\"field_ref\":[\"field\",201,null],\"effective_type\":\"type/Text\",\"id\":201,\"display_name\":\"Foo\",\"fingerprint\":{\"global\":{\"distinct-count\":1,\"nil%\":0.0},\"type\":{\"type/Text\":{\"percent-json\":0.0,\"percent-url\":0.0,\"percent-email\":0.0,\"percent-state\":0.0,\"average-length\":500.0}}},\"base_type\":\"type/Text\"}]},\"insights\":null,\"count\":1}}'"])

```

and then from the terminal client repeat this until we have 800,000 rows:
```sql
insert into large_table (foo, properties) select foo, properties from large_table;
```

Then can exercise from code with the following:

```clojure
(-> (qp/process-query {:database 5 ; use appropriate db and tables here
                        :query {:source-table 42
                                ;; :limit 1000000
                                },
                        :type :query}
                        ;; don't retain any rows, purely just counting
                        ;; so resultset is what retains too many rows
                       {:rff (fn [metadata]
                               (let [c (volatile! 0)]
                                 (fn count-rff
                                   ([]
                                    {:data metadata})
                                   ([result]
                                    (assoc-in result [:data :count] @c))
                                   ([result _row]
                                    (vswap! c inc)
                                    result))))
                        })
     :data :count)
```

PG was far easier to blow up. Mysql took quite a bit of data.

Then we just set a fetch size on the result set so that we (hopefully)
only have than many rows in memory in the resultset at once. The
streaming will write to the download stream as it goes.

PG has one other complication in that the fetch size can only be honored
if autoCommit is false. The reasoning seems to be that each statement is
in a transaction and commits and to commit it has to close resultsets
and therefore it has to realize the entire resultset otherwise you would
only get the initial page if any.

* Set default fetch size to 500

;; Long queries on gcloud pg
;; limit 10,000
;; fetch size | t1   | t2   | t3
;; -------------------------------
;; 100        | 6030 | 8804 | 5986
;; 500        | 1537 | 1535 | 1494
;; 1000       | 1714 | 1802 | 1611
;; 3000       | 1644 | 1595 | 2044

;; limit 30,000
;; fetch size | t1    | t2    | t3
;; -------------------------------
;; 100        | 17341 | 15991 | 16061
;; 500        | 4112  | 4182  | 4851
;; 1000       | 5075  | 4546  | 4284
;; 3000       | 5405  | 5055  | 4745

* Only set fetch size if not default (0)

Details of `:additional-options "defaultRowFetchSize=3000"` can set a
default fetch size and we can easily honor that. This allows overriding
per db without much work on our part.

* Remove redshift custom fetch size code

This removes the automatic insertion of a defaultRowFetchSize=5000 on
redshift dbs. Now we always set this to 500 in the sql-jdbc statement
and prepared statement fields. And we also allow custom ones to persist
over our default of 500.

One additional benefit of removing this is that it always included the
option even if a user added ?defaultRowFetchSize=300 themselves so this
should actually give more control to our users.

Profiling quickly on selecting 79,000 rows from redshift, there
essentially no difference between a fetch size of 500 (the default) and
5000 (the old redshift default); both were 12442 ms or so.

* unused require of settings in redshift tests

* Appease the linter

* Unnecessary redshift connection details tests

Unverified

a33fa568

Oct 15, 2021

Fix flaky `rotate-encryption-key!-test` (#18371) · 82b3147c

Jeff Evans authored 3 years ago

Fix flaky `rotate-encryption-key!-test`

Bind a temporary site locale when running the test, and disable the settings cache

Now, a full explanation of what was making the test flaky, for posterity:

When this test runs, it first tries to create a blank app DB. This is either postgres, h2, or mysql, depending on the driver under test, which is slightly confusing, but anyway...

In order to populate all the app DB tables (since the "real" app DB has already been initialized at this point to even run the test), it first rebinds the appropriate dynamic vars for the db type, spec, connection, etc. and then loads a test fixture H2 file into the app DB. This h2 test fixture file, evidently, contains all the current app DB tables, etc.

When initializing the app DB from H2 (in `metabase.cmd.copy/copy!`), the code needs to internationalize certain loading message strings. And in order to internationalize messages, it has to look up the current site locale, which is itself an application setting, which of course comes from the setting entity table.

For the "regular" blank app DB scenario (i.e. when Metabase first starts with a blank app DB), there is some code in the `metabase.util.i18n.impl` namespace to handle this chicken and egg problem (basically, defaulting to "en" if it can't be loaded from the app DB because there is no app DB yet). But after this tricky process finishes, that temporary, hacked lookup is replaced by the real one (since now, the app DB exists and the locale can just be loaded normally).

For the purpose of our test, that is the problem. That state (which swaps out the `site-locale-from-setting` fn) has already run (remember, we have already initialized the app DB to even run this test in the first place, and now we are swapping in a temporary one). So the call to load the site locate from the settings table (which is needed to print the loading message to initialize this temp app DB) fails, and hence the test fails.

Now, the reason this test was flaky, instead of just always failing, is because of the settings cache. If the site locale had been loaded anywhere within some short time frame before this test needs to load it, to print init messages, then it succeeds! But that doesn't always end up happening, of course, since it's effectively a race condition (setting cache expiration versus test execution timing).

Unverified

82b3147c

Oct 14, 2021
- Refactor field ordering logic in exports to avoid dropped fields (#18441) · f2fffe3b
  Noah Moss authored 3 years ago
  
  Unverified
  
  f2fffe3b
Oct 13, 2021

Relativize links when emitting html from md in pulse/subscriptions (#18428) · 2a6d8133

dpsutton authored 3 years ago

* Relativize links when emitting html from md in pulse/subscriptions

Helpful links:
- https://awesomeopensource.com/project/vsch/flexmark-java
- https://github.com/vsch/flexmark-java/blob/master/flexmark-java-samples/src/com/vladsch/flexmark/java/samples/PegdownCustomLinkResolverOptions.java
- https://github.com/vsch/flexmark-java/blob/8a881b73109a287b5f202e2e1fc9f9c497d5eecf/flexmark/src/main/java/com/vladsch/flexmark/html/HtmlRenderer.java#L433
- https://github.com/vsch/flexmark-java/blob/8a881b73109a287b5f202e2e1fc9f9c497d5eecf/flexmark/src/main/java/com/vladsch/flexmark/html/renderer/ResolvedLink.java#L10

This was a pain in the ass. I had been looking for a way to just easily
traverse the ast, but this might cause problems since there are spans
and text positions everywhere. So i looked at how to change the
parser. Everything is so difficult to change. Luckily there was a
built-in way to do this with link resolvers, found in a github issue
https://github.com/vsch/flexmark-java/issues/308 . And ideally this
would be done in the parser but it seems to be done in the html
emitter.

https://github.com/vsch/flexmark-java/blob/master/flexmark-java-samples/src/com/vladsch/flexmark/java/samples/CustomLinkResolverSample.java

And then I got turned around on what is relative or not. What happens if
you do something seemingly sane like:

```markdown
[a link to google](www.google.com)
```

This is apparently a relative link since it lacks a protocol. And our
`resolve-uri` code treats it as such:

```clojure
markdown=> (resolve-uri "www.google.com")
"http://localhost:3000www.google.com"
markdown=>
```

This seems strange to me but is also the behavior on gist.github.com:
https://gist.github.com/dpsutton/412502ffa89186487e41885855dfa781 has a
link to https://gist.github.com/dpsutton/www.google.com

In all, trying to figure out this object oriented factory mess i had 24
tabs open to the source of NodeVisitor, NodeVisitorBase,
AstActionHandler, VisitHandler, ParserExtension,
NodePostProcessorFactory, etc. It was truly unpleasant.

* Remove errant require of mb.util.urls

* Ensure the site setting always has a slash when resolving relative

The URI class will do some wonky things. Sometimes it feels structural,
but it can also just feel like it is combining strings willy nilly.

```clojure
(.. (URI. "http://example.com") (resolve "www.google.com") toString)
"http://example.comwww.google.com"
```

So ensure that there is a final trailing slash

Unverified

2a6d8133

Oct 11, 2021

Two column table (#18392) · 57506cc0

dpsutton authored 3 years ago

* Don't render bar charts are sparklines

https://github.com/metabase/metabase/issues/18352

A simple select `SELECT 'a', 1 UNION ALL SELECT 'b', 2` would trigger as
a sparkline and blow up when comparing. There are two issues:
- don't render tables as a sparkline even if it only has two columns
- don't compare things that aren't comparable.

this solves the first issue, ensuring that tables aren't matched as
sparklines

* Sparkline should only compare comparable values

Check that the columns are temporal or numeric before comparing them. It
is only to optionally reverse the order of the results which seems
questionable in itself, but we can at least check that they are
comparable before attempting to

* Ensure tests render as sparklines

default was a :table and that adds csv attachments which throw off our
tests

* unskip repro

* remove unnecessary extra line

Unverified

57506cc0

Correctly format 0 decimal places (#18397) · 6dd76f4a

dpsutton authored 3 years ago

Input:    12345.65432
Expected: 12346
Actual:   12346.

We were unconditionally adding "." (repeat decimals "0") to the base
formatting string. Obviously when the decimals are 0, we are appending
"." to the base format string

Unverified

6dd76f4a

Use dashcard title instead of card name in dashboard subscriptions, if it is set (#18360) · 77a02e6e
Noah Moss authored 3 years ago

Unverified

77a02e6e

Oct 08, 2021
- fix xlsx exports for data with big integer values (#18323) · 965469d5
  Noah Moss authored 3 years ago
  
  Unverified
  
  965469d5
- expressions->subselect: exclude fields used only inside expressions (#18065) · aab206cb
  Pawit Pornkitprasan authored 3 years ago
  
  The expressions are executed inside the sub-query, so the fields do not need to be exposed outside.
  Unverified
  
  aab206cb
Oct 06, 2021
- Fix CSP issue for Google Analytics (#18295) · 193547c4
  Alexander Polyankin authored 3 years ago
  
  Unverified
  
  193547c4
- Static viz: allow rendering non-timeseries line charts (#18279) · a98eb1d8
  Cam Saul authored 3 years ago
  
  * Static viz: respect custom X and Y axis labels * Static viz: allow rendering non-timeseries line charts * Test fixes
  Unverified
  
  a98eb1d8
- Static viz: respect custom X and Y axis labels (#18276) · e85c1122
  Cam Saul authored 3 years ago
  
  Unverified
  
  e85c1122
- Static viz: respect whitelabel colors (#18277) · 7bf63107
  Cam Saul authored 3 years ago
  
  Unverified
  
  7bf63107
- Add `is-hosted?` setting based on metastore token (#17256) · 6f2baf6f
  Noah Moss authored 3 years ago
  
  Unverified
  
  6f2baf6f
- Fix datetime formatting for pulse/dashboard subscription attachments (#18225) · 3d7a1c53
  Noah Moss authored 3 years ago
  
  Unverified
  
  3d7a1c53
Oct 05, 2021

Fix Dashboard Subscription email validation (second pass) (#18237) · 8f4d867b
Cam Saul authored 3 years ago
```
* Fix Dashboard Subscription email validation (second pass)

* Always validate Pulse channels in test endpoint
```
Unverified

8f4d867b
Use UTC as default timezone in alert template when report timezone isn't set (#18257) · a7e1d254
Noah Moss authored 3 years ago

Unverified

a7e1d254

Fix cache native queries not working (underlying problem is unit conversion... · 194d1dc0

Howon Lee authored 3 years ago

Fix cache native queries not working (underlying problem is unit conversion actually) (18160) (#18230)

Cache TTL is defined in the nice new column things as per hour. The actual cache wanted it in seconds. There wasn't a conversion hours->secs before, which was the underlying illusion that gave rise to #18160 - with a properly set cache ttl the cache works fine. But the cache ttl's weren't getting set to proper values because of that lack of unit conversion. So here's the conversion, hours->secs.

Unverified

194d1dc0

Oct 04, 2021
- GET /api/database/:id/schemas should not return 'empty' schemas (#18123) · 38faad60
  Cam Saul authored 3 years ago
  
  Unverified
  
  38faad60
- Fix exports for queries with aggregations (#18205) · e6023765
  Noah Moss authored 3 years ago
  
  Unverified
  
  e6023765
Oct 01, 2021
- Convert deprecated k:mm time style to HH:mm on viz settings normalization (#18167) · 9b2fbc71
  Noah Moss authored 3 years ago
  
  Unverified
  
  9b2fbc71
Sep 30, 2021

Fix errors from disallowed characters in BigQuery custom expression names (#18055) · cb05523b

Jeff Evans authored 3 years ago

Add `escape-alias` multimethod to `sql.qp` to handle translating an alias (whether for a field or expression) into something supported by the driver in quesion

Add default impl that is essentially identity

Marking `sql.qp/field->alias` as deprecated, and changing its default implementation to invoke the new `sql.qp/escape-alias` on the field's `:name` (with the intention of removing it in a future release, since individual driver won't need to override this, so much as they'll need to override `escape-alias` itself).

Override `sql.qp/escape-alias` method for both BigQuery drivers to run through the same `->valid-field-identifier` function already defined for this purpose

Add test for a custom expression with weird characters

Unverified

cb05523b

fix filters alignment (#18139) · c56df9ce
Alexander Lesnenko authored 3 years ago

Unverified

c56df9ce

Do not emit view logs when there's no card id for query viewlogs (fixes #18136) (#18138) · 9eced495

Howon Lee authored 3 years ago

There is a problem when there are no card ids for a userland query (so not all userland queries are non-adhoc queries: they can be adhoc), where viewlog needs a card id so it chokes. Just adds a check so it doesn't even try to emit a viewlog, since the auditing is for non-adhoc queries only.

Unverified

9eced495

Sep 29, 2021
- Check feature flag on backend /graph PUT API for block permission (#18109) · 6ff4eda6
  Jeff Evans authored 3 years ago
  
  Check feature flag on backend /graph PUT API for block permission Check for the `:advanced-permissions` flag in `update-db-permissions!`, when a `:schemas` `:block` entry is present in the graph Set feature flag in tests when using API Add new test to ensure failure in the case that feature flag is missing
  View commit 6ff4eda6 2 tags Unverified
  
  6ff4eda6
- Alert schedule in email template (#18084) · d4bc553c
  Noah Moss authored 3 years ago
  
  Unverified
  
  d4bc553c
- Include dashboard filters in email & slack subscriptions (#17824) · ba1f8dcb
  Noah Moss authored 3 years ago
  
  Unverified
  
  ba1f8dcb
- Remove placeholder code for the new 41 premium features (#18126) · 27e32d7f
  Cam Saul authored 3 years ago
  
  Unverified
  
  27e32d7f
- Enforce allowed domains when sending test Pulses (#18122) · ee134cf9
  Cam Saul authored 3 years ago
  
  Unverified
  
  ee134cf9
- Minor async require cleanup (#18059) · 44ccf236
  Cam Saul authored 3 years ago
  
  * Minor async require cleanup * Sort namespaces
  Unverified
  
  44ccf236
Sep 28, 2021

Unverify cards when they are edited (#17997) · a750e4a5

dpsutton authored 3 years ago

from maz in slack:
Should not remove verification:
 - Moving to another collection
 - Pinning/unpinning
 - Adding/removing an alert on the question
 - Turning sharing on the question on/off
Should remove verification:
 - Editing the query
 - Editing the visualization or viz settings
 - Editing the title or description

This really just identifies when not to remove verification, otherwise
does remove verification. Alerts are not related to questions through
the api so no worries there. Everything else is just excluding
attributes from the diff calculation.

In tests, went perhaps a bit haywire with the macrolet, allowing to
define macros inline rather than farther away from the call site. Used
twice:
- once to optionally add in moderation review into the with-temp. This
macro is handy to ensure we always have a fresh state of card and don't
have leftover changes or verifications. Has to be a macro since i want
to dynamically change whether there are verifications in the with-temp*
vector binding
- a way to more easily reduce repetition when asserting something is
verified, do an action, and assert it is still verified

Unverified

a750e4a5

Audit cache controls 2 (#18046) · d7898af8

Howon Lee authored 3 years ago

Cache controls all landed but is lacking in the audit capacity wanted for in the notion doc. This PR adds that audit capability and by the by changes the ViewLog model in order to be able to deliver on the question of whether ViewLogs were cache hits or not.

Unverified

d7898af8

Sep 27, 2021

Fix {0} being shown when locale is "pt" (#17875) · 35651360

Pawit Pornkitprasan authored 3 years ago

This is a combination of 2 issues:
 1) pt got renamed to pt_BR in x.39.x but the old
    "pt" value may still be stored in the database
 2) when making a release, an unclean build
    directory may be used containing old "pt"
    resource which gets leaked into the release build

1) is fixed by adding a function to treat "pt" as "pt_BR"
to support users who were on "pt" since pre-x.39.
(This is done by finding the closest fallback locale)

2) is fixed by emptying the folder before generating locales
so any old locales are deleted.

Fixes #16690

Unverified

35651360

Fix saving chart with custom expressions only retains first 2 data series (#17855) · 594c4b07
Pawit Pornkitprasan authored 3 years ago
```
See https://github.com/metabase/metabase/issues/15882#issuecomment-917984049 for detailed explanation.

Fixes #15882
```
Unverified

594c4b07

Fix slow query when using nested native question with many rows (#18023) · 38269668

Pawit Pornkitprasan authored 3 years ago

`query->native` works by having `nativef` throws an exception
which causes `mbql->native` to set native query to `nil` which
then cause the rest of the execution to be skipped.

However, because the existing code skips overriding the native
query for queries already native, the execution is not skipped.

This causes bad performance for nested native question because
`query->native` is called for the sub-query which causes the
sub-query to be executed on its own, potentially without any
limit or filtering.

Unverified

38269668

Sep 24, 2021
- Escape HTML in markdown cards when rendered for emails (#18000) · 97906486
  Noah Moss authored 3 years ago
  
  Unverified
  
  97906486
Sep 23, 2021
- docs - update api doc gen code (#17994) · 70141151
  Jeff Bruemmer authored 3 years ago
  
  * update api doc gen code * update test
  Unverified
  
  70141151
Sep 22, 2021

Static Viz Formatting (#17957) · de39de8a

dpsutton authored 3 years ago

* Formatting

return a format string so it can be used over multiple rows. The rules
for formatting are quite annoying because there are defaults that are
assumed rather than present. IE, if you set a column as currency it
doesn't set the number style sometimes and defaults to USD without that
being in the settings. In the future would be far preferable to always
create a fully fleshed map of defaults rather than inferring them in
random places.

* appease our doc string supervisor

* Include column information for formatting numbers

really important for requiring two decimal places by default on doubles
and none on integers. Its close to correct ignoring it and using
"#,###.##" and this leaves non decimal places on integers and allows
only up to two on doubles, but things like 31.10 will format as 31.1 and
that is improper

* Remove unneeded metabase.test

* Update tests with nil column

* Get global viz settings in middleware, pass entire to formatting

Passing the entire viz settings to the formatting stuff makes the
calling code much simpler and the places where we merge global settings,
possibly get currency settings, etc, just in one place. We already had
the col information there so just use it.

* Remove public settings now that we get it from the metadata

Unverified

de39de8a