Commit 4aab3ecb authored by Sameer Al-Sakran

move to a docs directory

parent f06b79e3
......@@ -14,7 +14,7 @@ pom.xml.asc
.hgignore
.hg/
.idea/
/docs
/docs/uberdoc.html
profiles.clj
/*.h2.db
/*.mv.db
......
......@@ -16,7 +16,7 @@ The Report Server does not deal with getting data into a database or data wareho
The Report Server does not collect web page views or mobile events, though it can help you understand conversion funnels, cohort retention and user behavior in general once you have collected these events into a database.
See [Data Preparation Guide](www.metabase.com/etl) for more information and advice.
See [Data Preparation Guide](docs/DATAPREPARATION.md) for more information and advice.
# Security Disclosure
......@@ -50,64 +50,15 @@ On logging in, you will be asked a set of questions that will set up a user acco
Once you have added this connection, you will be taken into the app and you'll be ready to ask your first question.
For more information or troubleshooting, check out the [Installation Guide](docs/INSTALLATION.md)
# Getting Started
Follow our [Getting Started](www.metabase.com/start) guide to learn how to use the Report Server.
Follow our [Getting Started](docs/GETTINGSTARTED.md) guide to learn how to use the Report Server.
# Contributing
To get started with a development installation of the Query Server, please follow the instructions at our [Developers Guide](DEVELOPERS.md).
In general, we like to have an open issue for every pull request as a place to discuss the nature of any bug or proposed improvement. Each pull request should address a single issue and contain both the fix and a description of the pull request, along with tests that validate that the PR fixes the issue in question.
For significant feature additions, it is expected that discussion will have taken place in the attached issue. Any feature that requires a major decision to be reached will need to have an explicit design document written. The goals of this document are to make explicit the assumptions, constraints and tradeoffs any given feature implementation will contain. The point is not to generate documentation but to allow discussion to reference a specific proposed design and to allow others to consider the implications of a given design.
We don't like getting sued, so for every commit we require a Linux Kernel style developer certificate. If you agree to the below terms (from http://developercertificate.org/)
```
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
```
Then you just add a line to every git commit message:
Signed-off-by: Helpful Contributor <helpful.contributor@email.com>
All contributions need to be signed with your real name.
To get started with a development installation of the Query Server and learn more about contributing, please follow the instructions at our [Developers Guide](docs/DEVELOPERS.md).
# License
......
# Overview
Metabase allows you to optionally annotate the data in your database or data warehouse. These annotations give Metabase an understanding of what the data actually means and allow it to process and display the data more intelligently. When discussing metadata, we generally refer to several main types.
# Types of Metadata
## Tables
### Table type
A table can be marked as
## Fields
A field is a representation of either a column (when using a SQL-based database, like PostgreSQL) or a field in a document (when using a document or JSON-based database like MongoDB). Where possible,
### Database Representation
This refers to the basic representation of the field in the database.
### Basic Types
### Semantic Types
......@@ -132,6 +132,57 @@ Once's this repo is made public, this Clojars badge will work and show the statu
[![Dependencies Status](http://jarkeeper.com/metabase/metabase-init/status.png)](http://jarkeeper.com/metabase/metabase-init)
# Contributing
In general, we like to have an open issue for every pull request as a place to discuss the nature of any bug or proposed improvement. Each pull request should address a single issue and contain both the fix and a description of the pull request, along with tests that validate that the PR fixes the issue in question.
For significant feature additions, it is expected that discussion will have taken place in the attached issue. Any feature that requires a major decision to be reached will need to have an explicit design document written. The goals of this document are to make explicit the assumptions, constraints and tradeoffs any given feature implementation will contain. The point is not to generate documentation but to allow discussion to reference a specific proposed design and to allow others to consider the implications of a given design.
We don't like getting sued, so for every commit we require a Linux Kernel style developer certificate. If you agree to the below terms (from http://developercertificate.org/)
```
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
```
Then you just add a line to every git commit message:
Signed-off-by: Helpful Contributor <helpful.contributor@email.com>
All contributions need to be signed with your real name.
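
As a minimal sketch of the sign-off step, git can add the `Signed-off-by` line for you through the `-s` (`--signoff`) option of `git commit`, which builds the trailer from your configured `user.name` and `user.email`. The throwaway repository, the placeholder identity (taken from the example line above), and the commit message below are illustrative assumptions, not part of the project's tooling.

```shell
# Create a scratch repository so the example does not touch a real checkout.
repo=$(mktemp -d)
git init -q "$repo"

# The sign-off trailer is built from these two settings (placeholder values).
git -C "$repo" config user.name "Helpful Contributor"
git -C "$repo" config user.email "helpful.contributor@email.com"

# -s appends "Signed-off-by: Name <email>" to the commit message.
# --allow-empty is only here because the scratch repository has no files.
git -C "$repo" commit -q --allow-empty -s -m "Describe the fix here"

# Print the full commit message, including the trailer.
last_message=$(git -C "$repo" log -1 --format=%B)
printf '%s\n' "$last_message"
```

In a real checkout, only the `git commit -s` step should be needed once your name and email are configured.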
## License
......
# Understanding what data you have
* Click “explore”
* Note that all of your tables are there
* Click on one
* Note the pagination
* Try getting the next page
* Note that you can filter these pages
* Try to filter by a column
* if it’s a date
* if it’s a category
* Note that any IDs or Foreign keys are clickable
* Click on one
* Note that all fields are present
* We can click on any FKs
* Any urls are clickable
* Note the “Linked Entities” on the bottom
* Click on one of these and note that below are a bunch of that entity's linked objects
# Asking a Question
* Click “Cards”
* Click “Create New”
* Select a database
* if you only have a single database, this step happens automatically
* Select a table
* See the bare rows
* click run
* note that this allows you to see all of the rows in a table
* Select “total count”
# Saving a Question to a Dashboard
* Save + Add it to a dashboard
* Give it a name
* Go to your newly created dashboard
* click “Dashboards”
* click your new dashboard
* Note that your card is there
\ No newline at end of file
# Application database
By default, Metabase uses an embedded database ([H2](http://www.h2database.com/)). If you want to use another database (for ease of administration, backup, or any other reason), you can specify the alternative database via environment variables. For example,

    export MB_DB_TYPE=postgres
    export MB_DB_DBNAME=metabase
    export MB_DB_PORT=5432
    export MB_DB_USER=username
    export MB_DB_PASS=password
    export MB_DB_HOST=localhost
    java -jar metabase.jar

would run the application using a local Postgres server instead of the default embedded database.
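
Since a missing variable is easy to overlook, a small pre-flight check can help. The sketch below defines a hypothetical helper, `check_mb_db_env` (not part of Metabase), that reports which of the six `MB_DB_*` variables from the example are unset. It assumes all six are wanted for your setup, which may not hold for every database type.

```shell
# Hypothetical helper: report which MB_DB_* variables from the example are unset.
check_mb_db_env() {
  missing=""
  for name in MB_DB_TYPE MB_DB_DBNAME MB_DB_PORT MB_DB_USER MB_DB_PASS MB_DB_HOST; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing variables:$missing" >&2
    return 1
  fi
  return 0
}

# Usage: only start the application when every variable is present.
# check_mb_db_env && java -jar metabase.jar
```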
# Backing up
The application creates a file named "metabase.db.h2.db" in the directory it is run from. You can back it up by stopping the application server and copying this file. Alternatively, to back up the application data while it is running, follow the methods described in the relevant [H2 documentation](http://www.h2database.com/html/tutorial.html#upgrade_backup_restore).
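
For the stop-and-copy approach, a minimal sketch might look like the following. The helper name `backup_metabase_db`, the `backups` directory, and the timestamp suffix are assumptions made for illustration, and the function assumes the application server has already been stopped.

```shell
# Hypothetical helper: copy the H2 file to a timestamped backup.
# Assumes the application server has been stopped before it is called.
backup_metabase_db() {
  src="${1:-metabase.db.h2.db}"   # database file, relative to the run directory
  dest_dir="${2:-backups}"        # where backups are collected (assumed name)
  if [ ! -f "$src" ]; then
    echo "No database file found at $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
  # -p preserves the file's modification time and permissions.
  cp -p "$src" "$dest"
  # Print the path of the new backup so callers can log or verify it.
  echo "$dest"
}

# Usage, from the directory the application runs in:
# backup_metabase_db
```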
# Database connection strings
If you need to connect over SSL, set the environment variable MB_POSTGRES_SSL to true in the environment that you use to run the application, e.g.

    MB_POSTGRES_SSL=true java -jar ./metabase.jar
# Scaling
Typically, you'll want to evaluate the application on any database you have access to. If you want to expose the application to other users, you should carefully consider how you access your database. In addition, as data sizes grow, you will have a number of options for how to set up your overall analytics infrastructure.
## Starting out
It is typical to point the Query Server at the production database of a small application (or a large application with a small number of users). This typically works before launch, or when the database is either static or has a small number of users (like internal applications or low-volume but high-value paid applications). Eventually, as usage of the Query Server grows and the load on the production database increases, several things happen:
* Expensive queries can slow down the database for production users
* The occasional scans (like on first installation) that the Query Server runs to keep its internal representation of your database in sync might add significant load
* Any recurring queries you run might start to add significant load
* You might need to import third party data for analysis, which typically should not live on your main database
At some point, you should separate out your main application database and your analytics database. There are a number of ways to do this.
## Read Replica
Assuming you do not need to do a lot of transformation or ingest lots of third party data sources, a read replica can be a good stopgap before setting up a complete data/analytics infrastructure. For MySQL or Postgres, just set up a read replica and make sure production application servers do not hit it for normal queries.
## Dedicated analytics database
Typically, once enough data is in the system and/or the transformation needs are complex enough, a dedicated analytics database is used. There are many options, ranging from a general purpose database (MySQL, Postgres, SQL Server, etc.) to a dedicated analytics database (Vertica, Redshift, Greenplum, Teradata, etc.), the new generation of SQL-on-Hadoop databases (Spark, Presto), or NoSQL databases (Druid, Cassandra, etc.).
Typically, once there is a dedicated analytics database or a data warehouse, ETL processes become important. Learn more at [Data Preparation Guide](docs/DATAPREPARATION.md).
# Database Drivers
Metabase currently has drivers for:
* H2
* MySQL
* PostgreSQL
On our roadmap are:
* [Druid](www.github.com/metabase/metabase-init/issues/X)
* [MongoDB](www.github.com/metabase/metabase-init/issues/X)
* [Presto](www.github.com/metabase/metabase-init/issues/X)
If you are interested in the status of any of these drivers, click through to the issues to see what work is being done. If you are interested in a driver to another database, please open an issue!
# Annotating Data
[Data Annotations](docs/ANNOTATIONS.md)
\ No newline at end of file