Merge pull request #313 from metabase/retarget_readme

retarget the readme from being targetted to contributors to being target...

Commit 38658568 authored 9 years ago by Sameer Al-Sakran
--- a/.gitignore
+++ b/.gitignore
@@ -12,7 +12,7 @@ pom.xml.asc
 /.lein-repl-history
 /.nrepl-port
 .idea/
-/docs
+/docs/uberdoc.html
 profiles.clj
 /*.h2.db
 /*.mv.db

--- a/README.md
+++ b/README.md
 [![Circle CI](https://circleci.com/gh/metabase/metabase-init.svg?style=svg&circle-token=3ccf0aa841028af027f2ac9e8df17ce603e90ef9)](https://circleci.com/gh/metabase/metabase-init)

-## Install Prerequisites
+# Overview

-1. Oracle JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/index.html)
-2. Node.js for npm (http://nodejs.org/)
-3. Leiningen (http://leiningen.org/)
+Metabase Report Server is an easy way to generate charts and dashboards, ask simple ad hoc queries without using SQL, and see detailed information about rows in your database. You can set it up in under 5 minutes, and then give yourself and others a place to ask simple questions and understand the data your application is generating. It is not tied to any specific framework and can be used out of the box with minimal configuration.

+With a bit of tagging and annotation of what the tables and fields in your database mean, it can be used to provide a rich, humanized analytics server and administration interface.

-## Build
+# What it isn't

-Install clojure + npm/bower requirements with
+The Report Server does not deal with getting data into a database or data warehouse or with transforming your data into a representation that lets you answer specific questions. Most sophisticated installations will have separate Ingestion processes that get data from third parties, event collectors or database snapshots into a Data Warehouse, as well as Transformation Processes that join, denormalize, enrich or otherwise get your data into a shape that is more convenient for use in analytics.

-    lein deps
-    lein npm
+The Report Server does not collect web page views or mobile events, though it can help you understand conversion funnels, cohort retention and user behavior in general once you have collected these events into a database.

-Build the application JS and CSS with
+See the [Data Warehouse Guide](docs/DATAWAREHOUSING.md) for more information and advice.

-    lein gulp
+# Security Disclosure

-When developing the frontend client, you'll want to watch for changes,
-so run the default gulp task.
+Security is very important to us. If you discover any security issue, please disclose it responsibly by sending an email to security@metabase.com rather than by creating a GitHub issue.

-    ./node_modules/gulp/bin/gulp.js
+# Installation

+To run the Report Server you will need to have a Java Runtime installed. As a quick check to see if your system already has one, try

-## Usage
+    java -version

-Then run the HTTP server with
+If you see something like 

-    lein ring server
+    java version "1.8.0_31"
+    Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
+    Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)

+you are good to go. Otherwise, download the Java Runtime Environment at http://java.com/

-## Unit Tests / Linting
+To install the Report Server, go to the [Metabase Download Page](http://www.metabase.com/download) and download the current build. Place the downloaded jar into a newly created directory (as it will create some files when it is run), and run it on the command line:

-Check that the project can compile successfully with
+    java -jar metabase.jar    

-    lein uberjar
+On the first run of the Report Server, the command line invocation will output a line like

-Run the linters with
+    http://localhost:3000/setup/init/XXXXX

-    lein eastwood                        # Clojure linters
-    lein bikeshed --max-line-length 240
-    ./lint_js.sh                         # JavaScript linter
+where XXXXX is a randomly generated token that can only be used to set up your first account for that particular installation. Once you have created that account, the token (and that URL) will no longer work. 
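
As an illustration of picking that one-time setup URL out of the startup output, here is a minimal shell sketch. The function name `find_setup_url` is hypothetical, and the sketch assumes the URL appears in the output in the form shown above; the actual log format may differ.

```shell
# Hypothetical helper: print the first setup URL found on standard input.
# Assumes the URL looks like http://<host>:<port>/setup/init/<token>.
find_setup_url() {
  grep -Eo 'https?://[^[:space:]]+/setup/init/[^[:space:]]+' | head -n 1
}
```

If the server output was saved to a file (here called `metabase.log`, also an assumption), `find_setup_url < metabase.log` prints the URL, or nothing if no such line was written.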

-Run unit tests with
+On logging in, you will be asked a set of questions that will set up a user account, and then you can add a database connection. For this to work you will need to get some information about which database you want to connect to, such as the Host Name and Port that it is running on, the Database Name and the User and Password that you will be using. 

-    lein test
+Once you have added this connection, you will be taken into the app and you'll be ready to ask your first question. 

-By default, the tests only run against the `generic-sql` dataset (an H2 test database).
-You can run specify which datasets/drivers to run tests against with the env var `MB_TEST_DATASETS`:
+For more information or troubleshooting, check out the [Installation Guide](docs/INSTALLATION.md).

-    MB_TEST_DATASETS=generic-sql,mongo lein test
+# Getting Started

-At the time of this writing, the valid datasets are `generic-sql` and `mongo`.
+Follow our [Getting Started](docs/GETTINGSTARTED.md) guide to learn how to use the Report Server.

+# Contributing

-## Documentation
+To get started with a development installation of the Report Server and learn more about contributing, please follow the instructions in our [Developers Guide](docs/DEVELOPERS.md).

-#### Instant Cheatsheet
+# Extending and Deep Integrations

-Start up an instant cheatsheet for the project + dependencies by running
+Metabase also allows you to hit our Query API directly from JavaScript to integrate the simple analytics we provide with your own application or third-party services to do things like:

-    lein instant-cheatsheet
+* Build moderation interfaces
+* Export subsets of your users to third party marketing automation software
+* Provide a specialized customer lookup application for the people in your company

-#### Marginalia

-Available at http://metabase.github.io/metabase-init/.
+# License

-You can generate and view documentation with
+Unless otherwise noted, all Metabase Report Server source files are made available under the terms of the GNU Affero General Public License (AGPL). 

-    lein marg
-    open ./docs/uberdoc.html
+See individual files for details.

-You can update the GitHub pages documentation using
-
-    make dox
-
-You should be on the `master` branch without any uncommited local changes before doing so. Also, make sure you've fetched the branch `gh-pages` and can push it back to `origin`.
-
-## Migration Summary
-
-    lein migration-summary
-
-Will give you a list of all tables + fields in the Metabase DB.
-
-## Bootstrapping (for Development)
-
-To quickly get your dev environment set up, use the `bootstrap` function to create a new User and Organization.
-Open a REPL in Emacs or with `lein repl` and enter the following:
-
-```clojure
-(use 'metabase.db)
-(setup-db)
-(use 'metabase.bootstrap)
-(bootstrap)
-```
-
-You'll be walked through the steps to get started.
-
-## API Client (for Development)
-
-You can make API calls from the REPL using `metabase.http-client`:
-
-```clojure
-(use 'metabase.http-client)
-(defn cl [& args]
-  (-> (apply client {:email "crowberto@metabase.com", :password "blackjet"} args)
-      clojure.pprint/pprint))
-(cl :get "user/current")
-;; -> {:email "crowbetro@metabase.com",
-;;     :first_name "Crowbero",
-;;     :last_login #inst "2015-03-13T22:55:05.390000000-00:00",
-;;     ...}
-```
-
-## Developing with Emacs
-
-`.dir-locals.el` contains some Emacs Lisp that tells `clojure-mode` how to indent Metabase macros and which arguments are docstrings. Whenever this file is updated,
-Emacs will ask you if the code is safe to load. You can answer `!` to save it as safe.
-
-By default, Emacs will insert this code as a customization at the bottom of your `init.el`.
-You'll probably want to tell Emacs to store customizations in a different file. Add the following to your `init.el`:
-
-```emacs-lisp
-(setq custom-file (concat user-emacs-directory ".custom.el")) ; tell Customize to save customizations to ~/.emacs.d/.custom.el
-(ignore-errors                                                ; load customizations from ~/.emacs.d/.custom.el
-  (load-file custom-file))
-```
-
-## Checking for Out-of-Date Dependencies
-
-    lein ancient                   # list all out-of-date dependencies
-    lein ancient latest lein-ring  # list latest version of artifact lein-ring
-
-Will give you a list of out-of-date dependencies.
-
-Once's this repo is made public, this Clojars badge will work and show the status as well:
-
-[![Dependencies Status](http://jarkeeper.com/metabase/metabase-init/status.png)](http://jarkeeper.com/metabase/metabase-init)
-
-
-## License
-
-Copyright © 2015 Metabase, Inc.
+Copyright © 2015 Metabase, Inc
--- a/docs/ANNOTATIONS.md
+++ b/docs/ANNOTATIONS.md
+# Overview
+
+Metabase allows you to optionally annotate the data in your database or data warehouse. These annotations provide Metabase with an understanding of what the data actually means and allow it to more intelligently process and display it for you. We currently allow you to annotate tables and columns.
+
+All of these settings are editable via the metadata editing page.
+
+# Types of Metadata
+
+## Tables
+
+### Table type
+
+A table can be marked as one of the below types. 
+
+* Business Entity Table
+* Rollup or Metrics Table
+* System Table - this is something that is only used 
+* Intermediate Table
+
+Typically, only Business Entity and Metrics tables are displayed in the list, and they will be colored differently to allow you to quickly find the table of interest.
+
+## Fields
+
+A field is a representation of either a Column (when using a SQL-based database, like PostgreSQL) or a field in a document (when using a document or JSON-based database like MongoDB).
+
+### Name
+
+Clicking on the name of the field allows you to change how the field name is displayed. For example, if your ORM produces table names like “auth.user”, you can replace this with “User” to make it more readable.
+
+### Description
+
+This is a human readable description of what the field is and how it is meant to be used. Any caveats about interpretation can go here as well.
+
+### Visibility
+
+Fields are always displayed in “long form” spots like the detail pages for a specific row. By default, any column with an average length greater than 50 characters is clipped. If you wish to toggle this, click on the checkbox next to a field name.
+
+### Position
+
+A field has a default position, which is used whenever a row is displayed. Some views allow you to rearrange the order of columns. Cases where you might want to use this are if you have a clear primary identifier for a table that for whatever reason is not the first column, or to move variable-length columns to the end to make it easier to scan a table.
+
+### Database Representation
+
+This refers to the basic representation of the field in the database. It is not editable, as it represents how things are stored. It is useful for seeing whether, say, “1” refers to a number or a string in the underlying database.
+
+### Basic Types
+
+* Metric - A metric is a number that you expect to plot, sum, take averages of, etc. Basically anything that would end up being plotted on the Y-Axis of a graph.
+* Dimension - This is any field that you expect to use as an X-Axis of a graph or as part of a pivot table. 
+* Information - This is any other field that is not expected to be used in any kind of aggregate metrics but contains other information. Examples include descriptions, names, and emails.
+
+### Semantic Types
+
+A field’s semantic type is used to determine how to display it as well as to provide information to users of the data about the underlying meaning. For example, by marking fields in a table as Latitude and Longitude, you allow the table to be used to power pin and heat maps. Similarly, marking a field as a URL allows users to click on it and go to that URL.
+
+Semantic types include
+
+* Avatar Image URL
+* Category
+* City
+* Country
+* Description
+* Foreign Key
+* Entity Key
+* Image URL
+* Field containing JSON
+* Latitude
+* Longitude
+* Entity Name
+* Number
+* State
+* URL
+* Zip Code
\ No newline at end of file
--- a/docs/DATAWAREHOUSING.md
+++ b/docs/DATAWAREHOUSING.md
+It is rare that your application's database will have all the data you need and be structured in a way that lets you ask all of the questions you are interested in. Typically an application database will have a schema optimized for small reads and updates, while most analytics queries touch a large fraction of a table.
+
+# Ingestion
+## From other databases
+
+If your database is small enough, then it is generally easy enough to dump the whole database and then ingest it into your data warehouse.
+
+### Postgres
+### MySQL
+### Heroku
+
+## Events
+## Third party data
+
+# Transformation
+## Uniques
+## Event Enrichment
+## Denormalization
+## Working backwards from Metrics Example
+
+
--- a/docs/DEVELOPERS.md
+++ b/docs/DEVELOPERS.md
+[![Circle CI](https://circleci.com/gh/metabase/metabase-init.svg?style=svg&circle-token=3ccf0aa841028af027f2ac9e8df17ce603e90ef9)](https://circleci.com/gh/metabase/metabase-init)
+
+## Install Prerequisites
+
+1. Oracle JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/index.html)
+2. Node.js for npm (http://nodejs.org/)
+3. Leiningen (http://leiningen.org/)
+
+
+## Build
+
+Install clojure + npm/bower requirements with
+
+    lein deps
+    lein npm
+
+Build the application JS and CSS with
+
+    lein gulp
+
+When developing the frontend client, you'll want to watch for changes,
+so run the default gulp task.
+
+    ./node_modules/gulp/bin/gulp.js
+
+
+## Usage
+
+Then run the HTTP server with
+
+    lein ring server
+
+
+## Unit Tests / Linting
+
+Check that the project can compile successfully with
+
+    lein uberjar
+
+Run the linters with
+
+    lein eastwood                        # Clojure linters
+    lein bikeshed --max-line-length 240
+    ./lint_js.sh                         # JavaScript linter
+
+Run unit tests with
+
+    lein test
+
+By default, the tests only run against the `generic-sql` dataset (an H2 test database).
+You can specify which datasets/drivers to run tests against with the env var `MB_TEST_DATASETS`:
+
+    MB_TEST_DATASETS=generic-sql,mongo lein test
+
+At the time of this writing, the valid datasets are `generic-sql` and `mongo`.
+
+## Documentation
+
+#### Instant Cheatsheet
+
+Start up an instant cheatsheet for the project + dependencies by running
+
+    lein instant-cheatsheet
+
+#### Marginalia
+
+Available at http://metabase.github.io/metabase-init/.
+
+You can generate and view documentation with
+
+    lein marg
+    open ./docs/uberdoc.html
+
+You can update the GitHub pages documentation using
+
+    make dox
+
+You should be on the `master` branch without any uncommitted local changes before doing so. Also, make sure you've fetched the branch `gh-pages` and can push it back to `origin`.
+
+## Migration Summary
+
+    lein migration-summary
+
+Will give you a list of all tables + fields in the Metabase DB.
+
+## Bootstrapping (for Development)
+
+To quickly get your dev environment set up, use the `bootstrap` function to create a new User and Organization.
+Open a REPL in Emacs or with `lein repl` and enter the following:
+
+```clojure
+(use 'metabase.db)
+(setup-db)
+(use 'metabase.bootstrap)
+(bootstrap)
+```
+
+You'll be walked through the steps to get started.
+
+## API Client (for Development)
+
+You can make API calls from the REPL using `metabase.http-client`:
+
+```clojure
+(use 'metabase.http-client)
+(defn cl [& args]
+  (-> (apply client {:email "crowberto@metabase.com", :password "squawk"} args)
+      clojure.pprint/pprint))
+(cl :get "user/current")
+;; -> {:email "crowberto@metabase.com",
+;;     :first_name "Crowberto",
+;;     :last_login #inst "2015-03-13T22:55:05.390000000-00:00",
+;;     ...}
+```
+
+## Developing with Emacs
+
+`.dir-locals.el` contains some Emacs Lisp that tells `clojure-mode` how to indent Metabase macros and which arguments are docstrings. Whenever this file is updated,
+Emacs will ask you if the code is safe to load. You can answer `!` to save it as safe.
+
+By default, Emacs will insert this code as a customization at the bottom of your `init.el`.
+You'll probably want to tell Emacs to store customizations in a different file. Add the following to your `init.el`:
+
+```emacs-lisp
+(setq custom-file (concat user-emacs-directory ".custom.el")) ; tell Customize to save customizations to ~/.emacs.d/.custom.el
+(ignore-errors                                                ; load customizations from ~/.emacs.d/.custom.el
+  (load-file custom-file))
+```
+
+## Checking for Out-of-Date Dependencies
+
+    lein ancient                   # list all out-of-date dependencies
+    lein ancient latest lein-ring  # list latest version of artifact lein-ring
+
+Will give you a list of out-of-date dependencies.
+
+Once this repo is made public, this Clojars badge will work and show the status as well:
+
+[![Dependencies Status](http://jarkeeper.com/metabase/metabase-init/status.png)](http://jarkeeper.com/metabase/metabase-init)
+
+# Contributing
+
+In general, we like to have an open issue for every pull request as a place to discuss the nature of any bug or proposed improvement. Each pull request should address a single issue, and contain both the fix and a description of the pull request, along with tests that validate that the PR fixes the issue in question.
+
+For significant feature additions, it is expected that discussion will have taken place in the attached issue. Any feature that requires a major decision to be reached will need to have an explicit design document written. The goals of this document are to make explicit the assumptions, constraints and tradeoffs any given feature implementation will contain. The point is not to generate documentation but to allow discussion to reference a specific proposed design and to allow others to consider the implications of a given design. 
+
+We don't like getting sued, so for every commit we require a Linux kernel-style developer certificate. If you agree to the terms below (from http://developercertificate.org/)
+
+```
+Developer Certificate of Origin
+Version 1.1
+
+Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
+660 York Street, Suite 102,
+San Francisco, CA 94110 USA
+
+Everyone is permitted to copy and distribute verbatim copies of this
+license document, but changing it is not allowed.
+
+Developer's Certificate of Origin 1.1
+
+By making a contribution to this project, I certify that:
+
+(a) The contribution was created in whole or in part by me and I
+    have the right to submit it under the open source license
+    indicated in the file; or
+
+(b) The contribution is based upon previous work that, to the best
+    of my knowledge, is covered under an appropriate open source
+    license and I have the right under that license to submit that
+    work with modifications, whether created in whole or in part
+    by me, under the same open source license (unless I am
+    permitted to submit under a different license), as indicated
+    in the file; or
+
+(c) The contribution was provided directly to me by some other
+    person who certified (a), (b) or (c) and I have not modified
+    it.
+
+(d) I understand and agree that this project and the contribution
+    are public and that a record of the contribution (including all
+    personal information I submit with it, including my sign-off) is
+    maintained indefinitely and may be redistributed consistent with
+    this project or the open source license(s) involved.
+```
+
+Then you just add a line to every git commit message:
+
+    Signed-off-by: Helpful Contributor <helpful.contributor@email.com>
+
+All contributions need to be signed with your real name.
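
Git can add this line for you: `git commit -s` (long form `--signoff`) appends a `Signed-off-by` trailer built from your configured `user.name` and `user.email`. As a quick check before pushing, the sketch below tests whether a commit message carries such a line. The function name `has_signoff` is hypothetical and not part of the project's tooling, and the pattern is only a loose approximation of a name followed by an email address.

```shell
# Hypothetical helper: succeed if the commit message passed as $1
# contains a line of the form "Signed-off-by: Name <user@host>".
has_signoff() {
  printf '%s\n' "$1" | grep -Eq '^Signed-off-by: .+ <[^<>@[:space:]]+@[^<>@[:space:]]+>$'
}
```

For example, `has_signoff "$(git log -1 --format=%B)" || echo "missing sign-off"` checks the most recent commit.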
+
+## License
+
+Copyright © 2015 Metabase, Inc
+
+Distributed under the terms of the GNU Affero General Public License (AGPL) except as otherwise noted.  See individual files for details.
--- a/docs/GETTINGSTARTED.md
+++ b/docs/GETTINGSTARTED.md
+# Before you start
+
+This guide assumes you have access to a database and that it is set up correctly. If not, please follow the instructions in the [Installation Guide](docs/INSTALLATION.md).
+
+# Understanding what data you have
+
+Initially, let's see what data you have available. The Explore section of the app allows you to see which tables you have available, look at all the rows in a given table, and drill down to individual rows. 
+
+* Click `explore`
+* Note that all of your tables are there
+* Click on one
+* Note the pagination
+* Try getting the next page
+* Note that you can filter these pages
+* Try to filter by a column
+* if it’s a date
+* if it’s a category
+* Note that any IDs or Foreign keys are clickable
+* Click on one
+* Note that all fields are present
+* We can click on any FKs
+* Any urls are clickable
+* Note the `Linked Entites` on the bottom
+* Click on one of these and note that below are a bunch of that entity's linked objects
+
+# Asking a Question
+
+When you have a specific question you are trying to answer, you can use the Card section of the application. Here you can ask a specific question of a given table of data you have. We'll start with the simplest possible question you can ask, "How many X are there?".
+
+* Click `Cards`
+* Click `Create New`
+* Select a database
+* if you only have a single database, this step happens automatically
+* Select a table
+* See the bare rows
+* click run
+* note that this allows you to see all of the rows in a table
+* Select `total count`
+
+# Saving a Question to a Dashboard
+
+Assuming this is something you'll want to keep tabs on regularly, or share regularly, you can add it to a dashboard. Dashboards are collections of questions you have saved that you expect to look at as a group or that everyone in your organization can look at. 
+
+* Save
+* Add it to a dashboard
+* Give it a name
+
+* Go to your newly created dashboard
+* click `Dashboards`
+* click your new dashboard
+* Note that your card is there
\ No newline at end of file
--- a/docs/INSTALLATION.md
+++ b/docs/INSTALLATION.md
+# Application database
+
+By default, Metabase uses an embedded database ([H2](http://www.h2database.com/)). If you want to use another database (for ease of administration, backup, or any other reason) you can inject the alternative database via environment variables. For example,
+
+    export MB_DB_TYPE=postgres 
+    export MB_DB_DBNAME=metabase 
+    export MB_DB_PORT=5432 
+    export MB_DB_USER=username 
+    export MB_DB_PASS=password
+    export MB_DB_HOST=localhost
+    java -jar metabase.jar
+
+would run the application using a local postgres server instead of the default embedded database.
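
Before launching with an external application database, it can help to confirm that the variables are actually set. The following is a minimal sketch; the function name `missing_mb_db_vars` is hypothetical, and it assumes all six variables from the example above are needed, which this guide does not state explicitly.

```shell
# Hypothetical pre-flight check: print the names of any MB_DB_* variables
# from the example above that are unset or empty, separated by spaces.
missing_mb_db_vars() {
  missing=""
  for name in MB_DB_TYPE MB_DB_DBNAME MB_DB_PORT MB_DB_USER MB_DB_PASS MB_DB_HOST; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  # Drop the leading space before printing.
  echo "${missing# }"
}
```

One way to use it is `[ -z "$(missing_mb_db_vars)" ] && java -jar metabase.jar`, which only starts the application when nothing is reported missing.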
+
+# Backing up 
+
+The application will create a file named "metabase.db.h2.db" in the directory it is being run in. This can be backed up by stopping the application server and copying this file. Alternatively, to back up the application data while it is running, you can follow the methods described in the relevant [H2 documentation](http://www.h2database.com/html/tutorial.html#upgrade_backup_restore).
+
+
+# Database connection strings
+
+If you need to access connections over SSL, you should set the environment variable MB_POSTGRES_SSL to true in the environment that you use to run the application, e.g.
+ 
+    MB_POSTGRES_SSL=true java -jar ./metabase.jar
+
+# Scaling
+
+Typically, you'll want to evaluate the application on any database you have access to. If you want to expose the application to other users, you should carefully consider how you access your database. In addition, as data sizes grow, there are a number of options for how to set up your overall analytics infrastructure.
+
+## Starting out
+
+It is typical to point this to a production database of a small application (or a large application with a small number of users). This typically works for periods before launch or when the database is either static or has a small number of users (like internal applications or low-volume but high-value paid applications). Eventually, as usage of the Query Server grows and the load on the production database increases, a couple of things happen:
+
+* Expensive queries can slow down the database for production users
+* The occasional scans (like on first installation) the Query Server runs to keep its internal representations of your database sync'd might add significant load
+* Any recurring queries you run might start to add significant load
+* You might need to import third party data for analysis, which typically should not live on your main database
+
+At some point, you should separate out your main application database and your analytics database. There are a number of ways to do this.
+
+## Read Replica
+
+Assuming you do not need to do a lot of transformation or ingest lots of third party data sources, this can be a good stopgap to setting up a complete data/analytics infrastructure. For MySQL or Postgres, just set up a read replica and make sure to not let production application servers hit it for normal queries.
+
+
+## Dedicated analytics database
+
+Typically, once enough data is in the system and/or the transformation needs are complex enough, a dedicated analytics database is used. There are many options, ranging from a normal general-purpose database (MySQL, Postgres, SQL Server, etc), to a dedicated Analytics database (Vertica, Redshift, Greenplum, Teradata, etc), the new generation of SQL on Hadoop databases (Spark, Presto) or NoSQL databases (Druid, Cassandra, etc).
+
+Typically, once there is a dedicated analytics database or a data warehouse, ETL processes become important. Learn more in the [Data Warehouse Guide](docs/DATAWAREHOUSING.md).
+
+# Database Drivers
+Metabase currently has drivers for
+
+* H2
+* MySQL
+* PostgreSQL
+
+On our roadmap are
+
+* [Druid](www.github.com/metabase/metabase-init/issues/X)
+* [MongoDB](www.github.com/metabase/metabase-init/issues/X) 
+* [Presto](www.github.com/metabase/metabase-init/issues/X)
+
+If you are interested in the status of any of these drivers, click through to the issues to see what work is being done. If you are interested in a driver to another database, please open an issue!
+
+# Annotating Data
+[Data Annotations](docs/ANNOTATIONS.md)
\ No newline at end of file