more updates to docs

2014-10-21 16:26:17 -07:00 · ee392b6064
parent 2d96bc5f1f
commit ee392b6064
4 changed files with 21 additions and 18 deletions
--- a/docs/content/Ingestion-FAQ.md
+++ b/docs/content/Ingestion-FAQ.md
@@ -1,6 +1,11 @@
 ---
 layout: doc_page
 ---
 ## What types of data does Druid support?
 Druid can ingest JSON, CSV, TSV and other delimited data out of the box. Druid supports single dimension values, or multiple dimension values (an array of strings). Druid supports long and float numeric columns.
 ## Where do my Druid segments end up after ingestion?
 Depending on what `druid.storage.type` is set to, Druid will upload segments to some [Deep Storage](Deep-Storage.html). Local disk is used as the default deep storage.
@@ -24,7 +29,9 @@ druid.storage.baseKey=sample
 Other common reasons that hand-off fails are as follows:
 1) Historical nodes are out of capacity and cannot download any more segments. You'll see exceptions in the coordinator logs if this occurs.
 2) Segments are corrupt and cannot download. You'll see exceptions in your historical nodes if this occurs.
 3) Deep storage is improperly configured. Make sure that your segment actually exists in deep storage and that the coordinator logs have no errors.
 ## How do I get HDFS to work?
@@ -41,7 +48,7 @@ You can check the coordinator console located at `<COORDINATOR_IP>:<PORT>/cluste
 ## My queries are returning empty results
-You can check `<BROKER_IP>:<PORT>/druid/v2/datasources/<YOUR_DATASOURCE>?interval=0/3000` for the dimensions and metrics that have been created for your datasource. Make sure that the name of the aggregators you use in your query match one of these metrics. Also make sure that the query interval you specify match a valid time range where data exists. Note: the broker endpoint will only return valid results on historical segments.
+You can check `<BROKER_IP>:<PORT>/druid/v2/datasources/<YOUR_DATASOURCE>?interval=0/3000` for the dimensions and metrics that have been created for your datasource. Make sure that the name of each aggregator you use in your query matches one of these metrics. Also make sure that the query interval you specify matches a valid time range where data exists. Note: the broker endpoint will only return valid results for historical segments, not for segments served by real-time nodes.
 ## How can I Reindex existing data in Druid with schema changes?
--- a/docs/content/Recommendations.md
+++ b/docs/content/Recommendations.md
@@ -2,8 +2,8 @@
 layout: doc_page
 ---
-Best Practices
+Recommendations
-==============
+===============
 # Use UTC Timezone
@@ -17,12 +17,19 @@ Druid is not perfect in how it handles mix-cased dimension and metric names. Thi
 SSDs are highly recommended for historical and real-time nodes if you are not running a cluster that is entirely in memory. SSDs can greatly mitigate the time required to page data in and out of memory.
-# Provide Columns Names in Lexicographic Order for Best Results
+# Provide Column Names in Lexicographic Order
-Although Druid supports schemaless ingestion of dimensions, because of https://github.com/metamx/druid/issues/658, you may sometimes get bigger segments than necessary. To ensure segments are as compact as possible, providing dimension names in lexicographic order is recommended. This may require some ETL processing on your data however. 
+Although Druid supports schema-less ingestion of dimensions, because of [https://github.com/metamx/druid/issues/658](https://github.com/metamx/druid/issues/658), you may sometimes get bigger segments than necessary. To ensure segments are as compact as possible, providing dimension names in lexicographic order is recommended. 
 # Use Timeseries and TopN Queries Instead of GroupBy Where Possible
 Timeseries and TopN queries are much more optimized and significantly faster than groupBy queries for their designed use cases. Issuing multiple topN or timeseries queries from your application can potentially be more efficient than a single groupBy query.  
 # Read FAQs
 You should read common problems people have here:
 1) [Ingestion-FAQ](Ingestion-FAQ.html)
 2) [Performance-FAQ](Performance-FAQ.html)
--- a/docs/content/index.md
+++ b/docs/content/index.md
@@ -37,17 +37,6 @@ When Druid?
 * You want to do your analysis on data as it’s happening (in real-time)
 * You need a data store that is always available, 24x7x365, and years into the future.
 Not Druid?
 ----------
 * The amount of data you have can easily be handled by MySQL
 * You're querying for individual entries or doing lookups (not analytics)
 * Batch ingestion is good enough
 * Canned queries are good enough
 * Downtime is no big deal
 Druid vs…
 ----------
@@ -60,7 +49,7 @@ Druid vs…
 About This Page
 ----------
-The data store world is vast, confusing and constantly in flux. This page is meant to help potential evaluators decide whether Druid is a good fit for the problem one needs to solve. If anything about it is incorrect please provide that feedback on the mailing list or via some other means so we can fix it.
+The data infrastructure world is vast, confusing and constantly in flux. This page is meant to help potential evaluators decide whether Druid is a good fit for the problem one needs to solve. If anything about it is incorrect please provide that feedback on the mailing list or via some other means so we can fix it.
--- a/docs/content/toc.textile
+++ b/docs/content/toc.textile
@@ -19,7 +19,7 @@ h2. Booting a Druid Cluster
 * "Production Cluster Configuration":Production-Cluster-Configuration.html
 * "Production Hadoop Configuration":Hadoop-Configuration.html
 * "Rolling Cluster Updates":Rolling-Updates.html
-* "Best Practices":Best-Practices.html
+* "Recommendations":Recommendations.html
 h2. Configuration
 * "Common Configuration":Configuration.html