Metabase uses Google Analytics to collect anonymous usage information from installed servers that have this feature enabled. Below are the events we have instrumented, as well as the information we collect about the user performing the action and the instance being used.
While this list of anonymous information we collect might seem long, it’s useful to compare it to the alternatives. With a typical SaaS platform, not only will this information be collected, but it will also be accompanied by information about your data: how often it is accessed, the specific queries you run, and specific record counts, all tied to your company and current plan.
We collect this information to improve your experience and the quality of Metabase, and in the list below, we spell out exactly why we collect each bit of information.
If you prefer not to provide us with this anonymous usage data, please go to your instance’s admin section and toggle off the option for `Anonymous Tracking`.
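If you manage Metabase through environment variables (for example, when running the official Docker image), the same setting can typically be set at startup instead of through the admin UI. Below is a minimal sketch, assuming the anonymous-tracking setting is exposed as the environment variable `MB_ANON_TRACKING_ENABLED`; confirm the exact variable name in the environment-variable reference for your Metabase version.

```
# Sketch: start the official Metabase Docker image with anonymous tracking
# disabled via an environment variable. The variable name is an assumption —
# check it against your version's documentation.
docker run -d -p 3000:3000 \
  -e MB_ANON_TRACKING_ENABLED=false \
  --name metabase metabase/metabase
```

Either approach controls the same setting; the environment variable is simply convenient when instances are provisioned automatically.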
### Example questions we want to answer:
* Is our query interface working?
* Are users stopping halfway through a question?
* Are users using filters?
* Are users using groupings?
* How often are users using bare rows vs other aggregation options?
...
* Stay on top of browser incompatibilities
* Optimize our dashboards either for passive consumption or as a starting point for further exploration, depending on how they are being used
While we closely follow reported issues and feature requests, we aim to make as many of our users as possible happy and to deliver improvements in the features that matter to them. Allowing us to collect information about your instance gives your users a direct vote in future improvements.
# The data we collect:
...
### Events
| Category | Action | Why we collect this|
|---------|--------|--------------------|
| Links and Page Views | General website tracking of what pages are used most. | This provides a better understanding of what parts of the application are liked and used by customers, so we know what's popular and what potentially needs more improvement. |
| Dashboards | When the dashboard dropdown is used, when dashboards are created and updated, and what types of edits occur, such as adding/removing cards and repositioning. | We use this information to understand how dashboards are being used and what types of activities users most commonly perform on their dashboards. |
| Pulses | When pulses are created and updated, what types of pulses are created, and how many cards typically go in a pulse. | This gives us a sense of how teams are structuring their push-based communication. Knowing when and where information is most often sent, and how much of it, allows Metabase to continue improving features around push-based data interactions. |
| Query Builder | When questions are saved and viewed, along with what types of choices are made, such as chart types and query clauses used. | Helps the Metabase team understand the basic patterns around how users access their data. NOTE: we never capture any specific details here such as table names, field names, or specific input values; we only care about what action was taken. |
| SQL Query | When a SQL query is saved or run. | This mostly gives us a sense of when users are bypassing the GUI query interface. We never capture the actual SQL written. |
| Admin Settings | We capture some very basic stats about when settings are updated and whether there are ever errors. We also capture non-intrusive settings such as the chosen timezone. | We use this information to make sure users aren't having problems managing their Metabase instance, and it gives us a sense of the most common configuration choices so we can optimize for those cases. |
| Databases | We simply capture when databases are created or removed and what types of databases are being used. | This helps Metabase ensure that we spend the most time and attention on the types of databases that are most popular with users. |
| Data Model | The saving and updating of tables, fields, segments, and metrics are all counted, along with a few other details such as what types of special metadata choices are made. | We use this data to help ensure that Metabase provides an appropriate set of options for users to describe their data, and it gives us a sense of how much time users spend marking up their schemas. NOTE: as stated above, we never capture direct values such as table or field names; all of this data is completely focused on what action was taken. |