Skip to content
Snippets Groups Projects
This project is mirrored from https://github.com/metabase/metabase. Pull mirroring updated .
  1. Nov 18, 2024
  2. Nov 15, 2024
    • github-automation-metabase's avatar
      XLSX Pivot Export - Initialize Pivot Table with Small Area Ref (#50060) (#50111) · 5c7d398b
      github-automation-metabase authored
      
      * XLSX Pivot Export - Initialize Pivot Table with Small Area Ref
      
      This PR changes the native pivot export so that we first initialize the pivot table with an area ref that's only the
      first 2 rows. Prior, I used `AreaReference/GetWholecolumn` which had the side effect of using lots of memory.
      
      I think this is related to how the Pivot Table's Cache is built up as you add rows/cols to the pivot definition. So,
      if you keep the area ref only as wide as the number of columns in the raw data, and just 2 rows, then the cache can
      stay small.
      
      After adding row/column data to the pivot table, you can then modify the area ref to be all rows in the relevant
      columns (`A:E` for example).
      
      This keeps the pivot-table object much smaller in size.
      
      * need the extra xml schemas to get this to work
      
      I'd like to try find a way around including this extra dep, it's a 13mb jar (ish), and I think that's a little bigger
      than I'd like
      
      * oops, didnt mean to have this in
      
      * Add test for pivot table initialization being fast and lean on memory
      
      * remove time based assertion, add comments in impl
      
      * cljfmt
      
      * fix test
      
      * Pesky little whitespace
      
      ---------
      
      Co-authored-by: default avataradam-james <21064735+adam-james-v@users.noreply.github.com>
      Co-authored-by: default avatarOleksandr Yakushev <alex@bytopia.org>
    • github-automation-metabase's avatar
      fix: MYSQL filter database with dbname in sync (#50091) (#50097) · cb44d0af
      github-automation-metabase authored
      
      Fixes: #50072
      
      When we `describe-fields` we are passing db details, but sometimes we
      have the database name in `dbname` rather than in `db` depending on the
      connection specs and possibly environment. So we check for both now.
      
      Co-authored-by: default avatarCase Nelson <case@metabase.com>
      Unverified
      cb44d0af
    • github-automation-metabase's avatar
      Omit database details from H2 dump when is_attached_dwh (#50036) (#50089) · 8dedf86d
      github-automation-metabase authored
      == Goal ==
      
      Hide attached DWH database details from anyone incl. admins:
      * Do not show them in the UI
      * Do not permit to change them
      * Do not serialize them
        - This is what we're adding for H2 snapshots (dumps) with this PR.
      
      The aim is that customers cannot gain access to (parts of) credentials,
      and they cannot break a feature they are paying for by changing
      connection details.
      
      == Implementation ==
      
      In 592360c9 I wrongly understood that
      database details would be omitted when dumping to H2, but this is only
      true for dumping H2 databases.
      
      Fix this by omitting database details also when dumping databases with
      `is_attached_dwh` set.
      
      == How to test ==
      
      To prepare, download a H2 JAR matching the H2 version used by Metabase:
      ```
      wget https://repo1.maven.org/maven2/com/h2database/h2/2.1.214/h2-2.1.214.jar
      ```
      
      If the H2 version does not match, you will get an error when trying to
      open the H2 shell:
      ```
      Exception in thread "main" org.h2.jdbc.JdbcSQLNonTransientConnectionException: Unsupported database file version or invalid file header in file "[REDACTED]" [90048-232]
      ```
      
      === New behaviour ===
      
      Setting the `is_attached_dwh` field hides the database details from H2 dumps:
      
      1. Configure a database as described in https://www.metabase.com/docs/latest/configuring-metabase/config-file#databases.
         - In addition to the fields you would normally set, also set
           `is_attached_dwh: true`.
         - This also works when adding this flag to a database that previously
           did not have this flag set.
      2. Start your Metabase instance.
      3. Run `${metabase} dump-to-h2 ./dump-file-h2` to create the H2 snapshot file.
      4. Run `java -cp h2-2.1.214.jar org.h2.tools.Shell -url "jdbc:h2:file:./dump-file-h2;ifexists=true"` to open a H2 shell.
      5. Verify that `SELECT name,details FROM metabase_database;` shows `{}` for the database you added in step 1
      
      === Original behaviour ===
      
      Behaviour without setting the `is_attached_dwh` field is unchanged:
      
      1. Configure a database as described in https://www.metabase.com/docs/latest/configuring-metabase/config-file#databases.
         - Only set the fields you would normally set.  Do not set
           `is_attached_dwh` (or set it to `false`).
      2. Start your Metabase instance.
      3. Run `${metabase} dump-to-h2 ./dump-file-h2` to create the H2 snapshot file.
      4. Run `java -cp h2-2.1.214.jar org.h2.tools.Shell -url "jdbc:h2:file:./dump-file-h2;ifexists=true"` to open a H2 shell.
      5. Verify that `SELECT name,details FROM metabase_database;` shows a non-empty object (i.e. not `{}`) for the database you added in step 1
      
      Fixes: 592360c9
      Closes: https://github.com/metabase/harbormaster/issues/5526
      
      
      
      Co-authored-by: default avatarDennis Schridde <63082+devurandom@users.noreply.github.com>
      Unverified
      8dedf86d
  3. Nov 14, 2024
    • github-automation-metabase's avatar
      sync the attached datawarehouse (#49970) (#50035) · ce33d649
      github-automation-metabase authored
      * sync the attached datawarehouse
      
      requires either "sync_db" true or a table name
      
      it will error
      ```shell
      ❯ http post 'localhost:3000/api/notify/db/attached_datawarehouse' x-metabase-apikey:apikey -pb
      Must provide `sync_db` or a `table_name` to sync
      ```
      
      It will sync the db when sync_db is true [NOTE: this is outdated. now just omitting the table name will sync the db]
      ```shell
      ❯ echo '{"synchronous?": true, "sync_db": true}' | http post 'localhost:3000/api/notify/db/attached_datawarehouse' x-metabase-apikey:apikey -pb
      {
          "success": true
      }
      ```
      
      It will sync existing tables
      ```shell
      ❯ http post 'localhost:3000/api/notify/db/attached_datawarehouse' table_name=existing schema_name=public x-metabase-apikey:apikey -pb
      {
          "success": true
      }
      ```
      
      it will error if it cannot find a table
      ```shell
      ❯ http post 'localhost:3000/api/notify/db/attached_datawarehouse' table_name=new schema_name=public x-metabase-apikey:apikey -pb
      {
          "cause": "Unable to identify table 'public.new'",
          "data": {
              "schema_name": "public",
              "status-code": 404,
              "table_name": "new"
          },
          "message": "Unable to identify table 'public.new'",
          "schema_name": "public",
          "table_name": "new",
          "trace": [],
          "via": [
              {
                  "data": {
                      "schema_name": "public",
                      "status-code": 404,
                      "table_name": "new"
                  },
                  "message": "Unable to identify table 'public.new'",
                  "type": "clojure.lang.ExceptionInfo"
              }
          ]
      }
      ```
      
      if i create that table
      ```sql
      attached=# create table new (id int);
      CREATE TABLE
      ```
      
      it will then find and sync it
      
      ```shell
      ❯ http post 'localhost:3000/api/notify/db/attached_datawarehouse' table_name=new schema_name=public x-metabase-apikey:apikey -pb
      {
          "success": true
      }
      ```
      
      * formatting
      
      * Tests for attached datawarehouse
      
      * PR nits
      
      typo in `thorw` -> `throw`
      comment explaining why `find-and-sync-new-table` is always sync
      gensym the db name in test
      remove a binding for a fn `sync!` that's only called once
      test `{:sync_db true}` syncs the whole db
      
      * stupid formatting :mechanical_arm:
      
      
      
      * Use absence of table_name as indicator to sync database
      
      Co-authored-by: default avatardpsutton <dan@dpsutton.com>
      Unverified
      ce33d649
  4. Nov 13, 2024
  5. Nov 12, 2024
  6. Nov 11, 2024
  7. Nov 08, 2024
  8. Nov 07, 2024
  9. Nov 06, 2024
    • github-automation-metabase's avatar
      Filter out Empty Rows from Pivot Exports (#49512) (#49638) · 1d5dd2c3
      github-automation-metabase authored
      
      * Pivot Measures Order Used in Pivot Exports
      
      Fixes #48442
      
      A pivot table can have any number of measures, and the user can order these by dragging in the UI. Before this PR,
      that order was ignored and measures would alway be in index order, which is confusing for any user who needs the
      measures to be displayed in a particular order, especially if they've re-ordered them in the pivot viz settings UI.
      
      A test has been added to check that measure order is used.
      
      A few minor changes to the pivot qp and post-processor
       - measure indices are looked up in the pivot qp and added in viz-settings order
       - the pivot measures are only added if the qp has not already added them.
       - pivot-opts Malli spec has been made in the namespace, adjusted to allow `nil` as a valid pivot-opts output, and
       used in relevant functions
      
      * address review points.
      
      * add a rows order test
      
      * Filter out Empty Rows from Pivot Exports
      
      Fixes #49353
      
      The linked issue is not actually related to pivot export size but is instead related to the 'Min of Created At: Month'
      aggregation; the default aggregation function was `+`, so it broke when the date string was encountered.
      
      That was fixed.
      
      As I was trynig to keep the export small if possible, I noticed that in some cases empty rows are appended, so I added
      the filter so that if a pivot row's values are completely empty, it doesn't add it.
      
      Finally, this PR also adds the 'sub section totals' which I noticed were missing from the exports. This comes up when
      you have 3+ pivot-rows configured, so you can see the subtotals for the first pivot row and the subtotals nested
      within those sections for the second pivot row, and so on.
      
      * add test for non-numeric values
      
      * Make sure the new name refs match on aggregations not just breakouts
      
      * cljfmt
      
      Co-authored-by: default avataradam-james <21064735+adam-james-v@users.noreply.github.com>
    • github-automation-metabase's avatar
      Fix :pivot-measures with column names (#49575) (#49614) · d658d05f
      github-automation-metabase authored
      
      * Fix :pivot-measures with column names
      
      * Fix the test
      
      * Update src/metabase/query_processor/pivot.clj
      
      
      
      * Add a test
      
      * Add a test
      
      ---------
      
      Co-authored-by: default avatarAlexander Polyankin <alexander.polyankin@metabase.com>
      Co-authored-by: default avatarBraden Shepherdson <braden@metabase.com>
      Unverified
      d658d05f
  10. Nov 05, 2024
  11. Nov 04, 2024
  12. Nov 01, 2024
  13. Oct 31, 2024
  14. Oct 30, 2024
  15. Oct 29, 2024
  16. Oct 28, 2024
  17. Oct 25, 2024
  18. Oct 23, 2024
Loading