diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index f47098554e145e06ff166d0503b7ff12f8a0d48a..2c1b2cc294931176a8cd5f4ba803316e2cb3e9a6 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -116,8 +116,6 @@ For example:
 
 # Building Spark for Hadoop/YARN 2.2.x
 
-Hadoop 2.2.x users must build Spark and publish it locally. The SBT build process handles Hadoop 2.2.x as a special case. This version of Hadoop has new YARN API changes and depends on a Protobuf version (2.5).
-
 See [Building Spark with Maven](building-with-maven.html) for instructions on how to build Spark using the Maven process.
 
 # Important Notes
@@ -126,4 +124,3 @@ See [Building Spark with Maven](building-with-maven.html) for instructions on ho
 - The local directories used for spark will be the local directories configured for YARN (Hadoop Yarn config yarn.nodemanager.local-dirs). If the user specifies spark.local.dir, it will be ignored.
 - The --files and --archives options support specifying file names with the # similar to Hadoop. For example you can specify: --files localtest.txt#appSees.txt and this will upload the file you have locally named localtest.txt into HDFS but this will be linked to by the name appSees.txt and your application should use the name as appSees.txt to reference it when running on YARN.
 - The --addJars option allows the SparkContext.addJar function to work if you are using it with local files. It does not need to be used if you are using it with HDFS, HTTP, HTTPS, or FTP files.
-- YARN 2.2.x users cannot simply depend on the Spark packages without building Spark, as the published Spark artifacts are compiled to work with the pre 2.2 API. Those users must build Spark and publish it locally.
diff --git a/yarn/README.md b/yarn/README.md
index 9a7a1dd838dea01bb2b4b278bb81eda9bcf8e6d3..65ee85447e04a7eca0843e18a3a4896476ddcb8c 100644
--- a/yarn/README.md
+++ b/yarn/README.md
@@ -1,12 +1,12 @@
 # YARN DIRECTORY LAYOUT
 
-Hadoop Yarn related codes are organized in separate directories for easy management.
+Hadoop Yarn related codes are organized in separate directories to minimize duplicated code.
 
  * common : Common codes that do not depending on specific version of Hadoop.
 
  * alpha / stable : Codes that involve specific version of Hadoop YARN API.
 
   alpha represents 0.23 and 2.0.x
-  stable represents 2.2 and later, until the API is break again.
+  stable represents 2.2 and later, until the API changes again.
 
 alpha / stable will build together with common dir into a single jar