From 745e496c59cfece2fcd6120ecc366dcab07b293a Mon Sep 17 00:00:00 2001
From: Andrew Or <andrewor14@gmail.com>
Date: Tue, 22 Apr 2014 14:27:49 -0700
Subject: [PATCH] [Fix #204] Eliminate delay between binding and log checking

**Bug**: In the existing history server, there is a delay of `spark.history.updateInterval` seconds before application logs show up on the UI.

**Cause**: The following events happen in this order: (1) the background thread that checks for logs starts, finds that the server has not yet bound, and waits N seconds; (2) the server binds; (3) N seconds later, the background thread finds that the server has bound to a port, and only then checks for application logs.

**Fix**: This PR starts the log checking thread immediately after binding, rather than during initialization. It also documents two relevant environment variables that are currently missing from the documentation.
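
The ordering issue can be illustrated with a minimal, self-contained sketch. This is not the actual `HistoryServer` code: `SketchServer`, `firstCheck`, and `updateIntervalMs` are hypothetical names used only for illustration, and the real log checking logic is replaced by a latch.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical stand-in for the history server; all names are illustrative.
class SketchServer(updateIntervalMs: Long) {
  @volatile private var bound = false

  // Released the first time the background thread actually checks for logs.
  val firstCheck = new CountDownLatch(1)

  private val logCheckingThread = new Thread("log-checker") {
    override def run(): Unit = {
      try {
        while (true) {
          // Check only if the server is bound, then wait a full interval.
          if (bound) firstCheck.countDown()
          Thread.sleep(updateIntervalMs)
        }
      } catch {
        case _: InterruptedException => // stop quietly
      }
    }
  }
  logCheckingThread.setDaemon(true)

  def initialize(): Unit = {
    // Before the fix, the thread was started here, ahead of bind(),
    // so its first iteration always saw an unbound server.
  }

  def bind(): Unit = {
    bound = true
    // After the fix, the thread starts only once the server is bound,
    // so its first iteration can check for logs right away.
    logCheckingThread.start()
  }
}

object SketchDemo {
  def main(args: Array[String]): Unit = {
    val server = new SketchServer(updateIntervalMs = 10000L)
    server.initialize()
    server.bind()
    // Wait far less than one update interval for the first check.
    val checkedPromptly = server.firstCheck.await(2, TimeUnit.SECONDS)
    println(s"checked promptly: $checkedPromptly")
  }
}
```

Moving the `start()` call from `initialize()` to `bind()` in this sketch mirrors the change in the diff below: the first loop iteration then runs with the server already bound, so no interval is spent waiting.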

Author: Andrew Or <andrewor14@gmail.com>

Closes #441 from andrewor14/history-server-fix and squashes the following commits:

b2eb46e [Andrew Or] Document SPARK_PUBLIC_DNS and SPARK_HISTORY_OPTS for the history server
e8d1fbc [Andrew Or] Eliminate delay between binding and checking for logs
---
 .../spark/deploy/history/HistoryServer.scala  |  5 +++++
 docs/monitoring.md                            | 19 +++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
index cf64700f90..b8f56234d3 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
@@ -98,6 +98,11 @@ class HistoryServer(
   def initialize() {
     attachPage(new HistoryPage(this))
     attachHandler(createStaticHandler(STATIC_RESOURCE_DIR, "/static"))
+  }
+
+  /** Bind to the HTTP server behind this web interface. */
+  override def bind() {
+    super.bind()
     logCheckingThread.start()
   }
 
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 4c91c3a592..144be3daf1 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -39,22 +39,33 @@ You can start a the history server by executing:
 
 The base logging directory must be supplied, and should contain sub-directories that each
 represents an application's event logs. This creates a web interface at
-`http://<server-url>:18080` by default. The history server depends on the following variables:
+`http://<server-url>:18080` by default. The history server can be configured as follows:
 
 <table class="table">
   <tr><th style="width:21%">Environment Variable</th><th>Meaning</th></tr>
   <tr>
     <td><code>SPARK_DAEMON_MEMORY</code></td>
-    <td>Memory to allocate to the history server. (default: 512m).</td>
+    <td>Memory to allocate to the history server (default: 512m).</td>
   </tr>
   <tr>
     <td><code>SPARK_DAEMON_JAVA_OPTS</code></td>
     <td>JVM options for the history server (default: none).</td>
   </tr>
+  <tr>
+    <td><code>SPARK_PUBLIC_DNS</code></td>
+    <td>
+      The public address for the history server. If this is not set, links to application history
+      may use the internal address of the server, resulting in broken links (default: none).
+    </td>
+  </tr>
+  <tr>
+    <td><code>SPARK_HISTORY_OPTS</code></td>
+    <td>
+      <code>spark.history.*</code> configuration options for the history server (default: none).
+    </td>
+  </tr>
 </table>
 
-Further, the history server can be configured as follows:
-
 <table class="table">
   <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
   <tr>
-- 
GitLab