Unverified Commit 9ddc2f85 authored by metabase-bot[bot]'s avatar metabase-bot[bot] Committed by GitHub

docs - csv append (#38974) (#40083)

parent 0a1e15d2
......@@ -10,21 +10,21 @@ Once you've [enabled uploads](#enabling-uploads), you can [upload files](../expl
Uploading CSV data is best suited for ad hoc analysis of spreadsheet data. If you have a lot of data, or will need to update or add to that data regularly, we recommend setting up a way to load that data into a database directly, then connecting Metabase to that database.
## Databases that support uploads
- [PostgreSQL](../databases/connections/postgresql.md)
- [MySQL](../databases/connections/mysql.md)
## Enabling uploads
There are a few things admins need to do to support CSV uploads:
- [Connect to a database using a database user account with write access](#connect-to-a-database-using-a-database-user-account-with-write-access). This way Metabase will be able to store the uploaded data somewhere.
- [Select the database and schema you want to store the uploaded data in](#select-the-database-and-schema-that-you-want-to-store-the-data-in).
- [(Optional) Specify a prefix for Metabase to prepend to the uploaded tables](#specify-a-prefix-for-metabase-to-prepend-to-the-uploaded-tables).
- [Add people to a group with unrestricted data access to the upload schema](#add-people-to-a-group-with-unrestricted-data-access-to-the-upload-schema).
- (Optional) [specify a prefix for Metabase to prepend to the uploaded tables](#specify-a-prefix-for-metabase-to-prepend-to-the-uploaded-tables).
### Databases that support uploads
- [PostgreSQL](../databases/connections/postgresql.md)
- [MySQL](../databases/connections/mysql.md)
### Connect to a database using a database user account with write access
## Connect to a database using a database user account with write access
To upload data to Metabase, an admin will need to connect your Metabase to a database that supports uploads using a database user account that has write access to that database.
......@@ -35,7 +35,7 @@ For more, check out:
- [Adding and managing databases](./connecting.md)
- [Database users, roles, and privileges](./users-roles-privileges.md#privileges-to-enable-uploads)
### Select the database and schema that you want to store the data in
## Select the database and schema that you want to store the data in
If Metabase is connected to a database using a database user account with write access, Admins can enable uploads by:
......@@ -47,19 +47,23 @@ When people upload a CSV to a collection, Metabase will:
- Create a table to store that data in the database and schema that the Admin selected to store uploads.
- Create a [model](../data-modeling/models.md) that wraps the uploaded table, and save that model to the collection the person uploaded the CSV data to.
## Specify a prefix for Metabase to prepend to the uploaded tables
Admins can optionally specify a string of text to add in front of the table that Metabase creates to store the uploaded data.
## Add people to a group with unrestricted data access to the upload schema
In order to upload CSVs, a person must be in a group with Unrestricted access to the schema you've selected to store your uploaded data. Native query editing isn't required for uploading. See [groups](../people-and-groups/managing.md) and [data permissions](../permissions/data.md).
## Primary key auto-generation
When you upload a CSV, Metabase will create a unique primary key column, called `_mb_row_id`, as the first (left-most) column of the uploaded CSV table. This `_mb_row_id` column will contain automatically generated integers. Metabase will also ignore any column in the upload whose name, once stored in the database, would be the same as the auto-generated primary key column (e.g., `_MB row-ID` in the CSV will be `_mb_row_id` or `_MB_ROW_ID` in the database).
If you don't want this autogenerated ID column, you can always remove the column from the model Metabase created. Visit the model, click on the info **i** icon, then **Model details**. From the model details page, click the **Edit definition** button. In the Data section of the query builder, click on the down arrow next to the table, deselect the added ID column, and save your changes.
### Add people to a group with unrestricted data access to the upload schema
## Data type errors
In order to upload CSVs, a person must be in a group with Unrestricted access to the schema you've selected to store your uploaded data. Native query editing isn't required for uploading. See [groups](../people-and-groups/managing.md) and [data permissions](../permissions/data.md).
### Specify a prefix for Metabase to prepend to the uploaded tables
Admins can optionally specify a string of text to add in front of the table that Metabase creates to store the uploaded data.
Metabase will try to guess what the data type is for each column, but if some entries are not like the others, Metabase may not guess the type correctly. For example, if you have a column that starts with integers like 100, 130, 140, then later on a float 105.5, Metabase may reject the upload. To fix this, you'll need to use spreadsheet software to adjust the formatting so that all the integers are formatted as floats (e.g., 100.00, 130.00, 140.00 and so on) before uploading.
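If you'd rather script the fix than use spreadsheet software, here is a minimal Python sketch that rewrites one column so every value is formatted as a float. The function name `format_column_as_floats` and the column name `amount` are hypothetical, chosen only for illustration; the sketch assumes the column contains only numbers or empty cells.

```python
import csv
import io


def format_column_as_floats(csv_text: str, column: str, places: int = 2) -> str:
    """Rewrite every non-empty value in `column` as a fixed-point float.

    Hypothetical helper: assumes the column holds only numbers or empty cells.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        value = row[column].strip()
        if value:  # leave empty cells untouched
            row[column] = f"{float(value):.{places}f}"
        writer.writerow(row)
    return out.getvalue()


# Example: a column that mixes integers and a float
mixed = "id,amount\n1,100\n2,130\n3,105.5\n"
print(format_column_as_floats(mixed, "amount"))
```

Run this on the CSV before uploading, so that every entry in the column has the same format.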
## File size limit
......@@ -67,11 +71,7 @@ CSV files cannot exceed 50 MB in size.
> While Metabase limits uploads to 50 MB, the server you use to run your Metabase may impose a lower limit. For example, the default client upload limit for [NGINX is 1 MB](https://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size). So you may need to change your server settings to allow uploads up to 50 MB. People on Metabase Cloud don't have to worry about this.
If you have a file larger than 50 MB, the workaround here is to:
1. Split the data into multiple files.
2. Upload those files one by one. Metabase will create a new model for each sheet.
3. Consolidate that data by creating a new question or model that joins the data from those constituent models created by each upload.
If you have a file larger than 50 MB, the workaround here is to split the data into multiple files and [append those files to an existing model](../exploration-and-organization/collections.md#appending-to-a-model-created-by-an-upload).
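One way to do the split is sketched below in Python. The helper `split_csv` is hypothetical, not part of Metabase. It repeats the header row in every chunk and keeps each chunk at or below `max_bytes`, unless a single row is larger than the limit on its own. It also reads the whole file into memory, which is fine for a sketch but may not suit very large files.

```python
import csv
import io


def split_csv(csv_text: str, max_bytes: int) -> list[str]:
    """Split CSV text into chunks, each starting with the original header row.

    Hypothetical helper: a chunk only exceeds max_bytes if one row alone is too big.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    def render(row: list[str]) -> str:
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerow(row)
        return buf.getvalue()

    header_line = render(header)
    header_size = len(header_line.encode("utf-8"))
    chunks, lines, size = [], [], header_size
    for row in data:
        line = render(row)
        line_size = len(line.encode("utf-8"))
        # Close the current chunk if adding this row would exceed the limit
        if lines and size + line_size > max_bytes:
            chunks.append(header_line + "".join(lines))
            lines, size = [], header_size
        lines.append(line)
        size += line_size
    if lines:
        chunks.append(header_line + "".join(lines))
    return chunks


# Example with a tiny limit; for real uploads you might use something just under 50 MB
parts = split_csv("id,name\n1,a\n2,b\n3,c\n", max_bytes=16)
```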
## Date formats
......
......@@ -82,6 +82,18 @@ Metabase will create a [model](../data-modeling/models.md) that contains that CS
Uploads will only be available if your admin has enabled uploads for your Metabase, and you're in a group with Unrestricted access to the schema used to store those uploads. See [Uploading data](../databases/uploads.md).
## Appending to a model created by an upload
You can upload additional CSV data to an existing model created by a previous CSV upload.
![Append data to existing upload model](./images/append-data.png)
The uploaded CSV must have the same column names, order, and types as the columns in the model. Metabase will look for a header row to check that the column names are the same. So if you split a large CSV into multiple CSVs, make sure to include header rows for all of the files.
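Before appending, you could sanity-check the header rows yourself. The Python sketch below uses a hypothetical helper, `headers_match`, that compares column names and order exactly. It does not check column types, and it is an assumption that an exact comparison is at least as strict as whatever check Metabase performs.

```python
import csv
import io


def headers_match(original_csv: str, new_csv: str) -> bool:
    """Return True if both CSVs have the same column names in the same order.

    Hypothetical helper: compares header rows exactly and ignores column types.
    """
    first = next(csv.reader(io.StringIO(original_csv)), None)
    second = next(csv.reader(io.StringIO(new_csv)), None)
    return first is not None and first == second


print(headers_match("id,name\n1,a\n", "id,name\n2,b\n"))  # same names, same order
print(headers_match("id,name\n1,a\n", "name,id\nb,2\n"))  # same names, different order
```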
When appending, Metabase will simply insert the rows into the underlying table, which will update the model that sits on top of that table. If you have duplicate rows from one upload to the next, Metabase will preserve those duplicate rows.
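Because duplicates are preserved, you may want to filter out rows you've already uploaded before appending a new file. A minimal Python sketch, assuming you still have the previously uploaded CSV on hand; the helper `drop_already_uploaded` is hypothetical and treats two rows as duplicates only when every field matches exactly.

```python
import csv
import io


def drop_already_uploaded(previous_csv: str, new_csv: str) -> str:
    """Return new_csv without rows that appear in previous_csv or repeat within new_csv.

    Hypothetical helper: rows count as duplicates only if all fields match exactly.
    """
    previous = csv.reader(io.StringIO(previous_csv))
    next(previous, None)  # skip the header row
    seen = {tuple(row) for row in previous}

    reader = csv.reader(io.StringIO(new_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = next(reader, None)
    if header is not None:
        writer.writerow(header)  # keep the header so the append check still passes
    for row in reader:
        key = tuple(row)
        if key not in seen:
            writer.writerow(row)
            seen.add(key)
    return out.getvalue()


cleaned = drop_already_uploaded("id,name\n1,a\n2,b\n", "id,name\n2,b\n3,c\n3,c\n")
```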
The upload icon will only be visible on models created by uploads.
## Further reading
- [Keeping your analytics organized](https://www.metabase.com/learn/administration/same-page)
......
docs/exploration-and-organization/images/append-data.png

34.7 KiB