- Jul 26, 2014
-
-
Reynold Xin authored
handleMessageExecutor, handleReadWriteExecutor, and handleConnectExecutor are not marked as daemon and not named. I think there exists some condition in which Spark programs won't terminate because of this. Stack dump attached in https://issues.apache.org/jira/browse/SPARK-2704 Author: Reynold Xin <rxin@apache.org> Closes #1604 from rxin/daemon and squashes the following commits: 98d6a6c [Reynold Xin] [SPARK-2704] Name threads in ConnectionManager and mark them as daemon.
-
bpaulin authored
Added emptyRDD method to Java API with tests. Author: bpaulin <bob@bobpaulin.com> Closes #1597 from bobpaulin/SPARK-2279 and squashes the following commits: 5ad57c2 [bpaulin] [SPARK-2279] Added emptyRDD method to Java API
-
Davies Liu authored
Add several default configs for PySpark, related to serialization in JVM. spark.serializer = org.apache.spark.serializer.KryoSerializer spark.serializer.objectStreamReset = 100 spark.rdd.compress = True This will help to reduce the memory usage during RDD.partitionBy() Author: Davies Liu <davies.liu@gmail.com> Closes #1568 from davies/conf and squashes the following commits: cd316f1 [Davies Liu] remove duplicated line f71a355 [Davies Liu] rebase to master, add spark.rdd.compress = True 8f63f45 [Davies Liu] Merge branch 'master' into conf 8bc9f08 [Davies Liu] fix unittest c04a83d [Davies Liu] some default configs for PySpark
-
Hossein authored
The current default value of spark.serializer.objectStreamReset is 10,000. When trying to re-partition (e.g., to 64 partitions) a large file (e.g., 500MB), containing 1MB records, the serializer will cache 10000 x 1MB x 64 ~= 640 GB which will cause out of memory errors. This patch sets the default value to a more reasonable default value (100). Author: Hossein <hossein@databricks.com> Closes #1595 from falaki/objectStreamReset and squashes the following commits: 650a935 [Hossein] Updated documentation 1aa0df8 [Hossein] Reduce default value of spark.serializer.objectStreamReset
-
Josh Rosen authored
Author: Josh Rosen <joshrosen@apache.org> Closes #1596 from JoshRosen/spark-1458 and squashes the following commits: fdbb0bf [Josh Rosen] Add SparkContext.version to Python & Java [SPARK-1458]
-
- Jul 25, 2014
-
-
Michael Armbrust authored
Author: Michael Armbrust <michael@databricks.com> Closes #1557 from marmbrus/fixDivision and squashes the following commits: b85077f [Michael Armbrust] Fix unit tests. af98f29 [Michael Armbrust] Change DIV to long type 0c29ae8 [Michael Armbrust] Fix division semantics for hive
-
Reynold Xin authored
This is part of the scheduler cleanup/refactoring effort to make the scheduler code easier to maintain. @kayousterhout @markhamstra please take a look ... Author: Reynold Xin <rxin@apache.org> Closes #1561 from rxin/dagSchedulerHashMaps and squashes the following commits: 1c44e15 [Reynold Xin] Clear pending tasks in submitMissingTasks. 620a0d1 [Reynold Xin] Use filterKeys. 5b54404 [Reynold Xin] Code review feedback. c1e9a1c [Reynold Xin] Removed some HashMaps from DAGScheduler by storing information in Stage.
-
Michael Armbrust authored
This reverts commit 06dc0d2c. #1399 is making Jenkins fail. We should investigate and put this back after its passing tests. Author: Michael Armbrust <michael@databricks.com> Closes #1594 from marmbrus/revertJDBC and squashes the following commits: 59748da [Michael Armbrust] Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
-
Kay Ousterhout authored
Due to problems with when we update runningStages (in DAGScheduler.scala) and how we decide to send a SparkListenerStageCompleted message to SparkListeners, sometimes stages can be shown as "running" in the UI forever (even after they have failed). This issue can manifest when stages are resubmitted with 0 tasks, or when the DAGScheduler catches non-serializable tasks. The problem also resulted in a (small) memory leak in the DAGScheduler, where stages can stay in runningStages forever. This commit fixes that problem and adds a unit test. Thanks tsudukim for helping to look into this issue! cc markhamstra rxin Author: Kay Ousterhout <kayousterhout@gmail.com> Closes #1566 from kayousterhout/dag_fix and squashes the following commits: 217d74b [Kay Ousterhout] [SPARK-1726] [SPARK-2567] Eliminate zombie stages in UI.
-
jerryshao authored
This patch adds a sort flag into ShuffleDependecy and moves sort into hash shuffle implementation. Moving sort into shuffle implementation can give space for other shuffle implementations (like sort-based shuffle) to better optimize sort through shuffle. Author: jerryshao <saisai.shao@intel.com> Closes #1210 from jerryshao/SPARK-2125 and squashes the following commits: 2feaf7b [jerryshao] revert MimaExcludes ceddf75 [jerryshao] add MimaExeclude f674ff4 [jerryshao] Add missing Scope restriction b9fe0dd [jerryshao] Fix some style issues according to comments ef6b729 [jerryshao] Change sort flag into Option 3f6eeed [jerryshao] Fix issues related to unit test 2f552a5 [jerryshao] Minor changes about naming and order c92a281 [jerryshao] Move sort into shuffle implementations
-
baishuo(白硕) authored
I think it's better to defined hiveQlTable as a val Author: baishuo(白硕) <vc_java@hotmail.com> Closes #1569 from baishuo/patch-1 and squashes the following commits: dc2f895 [baishuo(白硕)] Update HiveMetastoreCatalog.scala a7b32a2 [baishuo(白硕)] Update HiveMetastoreCatalog.scala
-
Yin Huai authored
Add genjavadocSettings back to SparkBuild. It requires #1585 . https://issues.apache.org/jira/browse/SPARK-2682 Author: Yin Huai <huai@cse.ohio-state.edu> Closes #1584 from yhuai/SPARK-2682 and squashes the following commits: 2e89461 [Yin Huai] Merge remote-tracking branch 'upstream/master' into SPARK-2682 54e3b66 [Yin Huai] Add genjavadocSettings back.
-
Cheng Lian authored
JIRA issue: - Main: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410) - Related: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678) Cherry picked the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc). (Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.) TODO - [x] Use `spark-submit` to launch the server, the CLI and beeline - [x] Migration guideline draft for Shark users ---- Hit by a bug in `SparkSubmitArguments` while working on this PR: all application options that are recognized by `SparkSubmitArguments` are stolen as `SparkSubmit` options. For example: ```bash $ spark-submit --class org.apache.hive.beeline.BeeLine spark-internal --help ``` This actually shows usage information of `SparkSubmit` rather than `BeeLine`. ~~Fixed this bug here since the `spark-internal` related stuff also touches `SparkSubmitArguments` and I'd like to avoid conflict.~~ **UPDATE** The bug mentioned above is now tracked by [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678). Decided to revert changes to this bug since it involves more subtle considerations and worth a separate PR. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #1399 from liancheng/thriftserver and squashes the following commits: 090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR 21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd] 199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver 1083e9d [Cheng Lian] Fixed failed test suites 7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic 9cc0f06 [Cheng Lian] Starts beeline with spark-submit cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile 061880f [Cheng Lian] Addressed all comments by @pwendell 7755062 [Cheng Lian] Adapts test suites to spark-submit settings 40bafef [Cheng Lian] Fixed more license header issues e214aab [Cheng Lian] Added missing license headers b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft 3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit 61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit 2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
-
Yin Huai authored
Renaming `short` to `shortForm` and `long` to `longForm`. JIRA: https://issues.apache.org/jira/browse/SPARK-2683 Author: Yin Huai <huai@cse.ohio-state.edu> Closes #1585 from yhuai/SPARK-2683 and squashes the following commits: 5ddb843 [Yin Huai] "short" and "long" are Java keyworks. In order to generate javadoc, renaming "short" to "shortForm" and "long" to "longForm".
-
fireflyc authored
Our program needs to receive a large amount of data and run for a long time. We set the log level to WARN but "Storing iterator" "received single" as such message written to the log file. (over yarn) Author: fireflyc <fireflyc@126.com> Closes #1372 from fireflyc/fix-replace-stdout-log and squashes the following commits: e684140 [fireflyc] 'info' modified into the 'debug' fa22a38 [fireflyc] replace println to log4j
-
Cheng Hao authored
Hive Supports the operator "<=>", which returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of the them is NULL. Author: Cheng Hao <hao.cheng@intel.com> Closes #1570 from chenghao-intel/equalns and squashes the following commits: 8d6c789 [Cheng Hao] Remove the test case orc_predicate_pushdown 5b2ca88 [Cheng Hao] Add cases into whitelist 8e66cdd [Cheng Hao] Rename the EqualNSTo ==> EqualNullSafe 7af4b0b [Cheng Hao] Add EqualNS & Unit Tests
-
Reynold Xin authored
Author: Reynold Xin <rxin@apache.org> Closes #1583 from rxin/closureClean and squashes the following commits: 8982fe6 [Reynold Xin] [SPARK-2529] Clean closures in foreach and foreachPartition.
-
Matei Zaharia authored
JIRA: https://issues.apache.org/jira/browse/SPARK-2657 Our current code uses ArrayBuffers for each group of values in groupBy, as well as for the key's elements in CoGroupedRDD. ArrayBuffers have a lot of overhead if there are few values in them, which is likely to happen in cases such as join. In particular, they have a pointer to an Object[] of size 16 by default, which is 24 bytes for the array header + 128 for the pointers in there, plus at least 32 for the ArrayBuffer data structure. This patch replaces the per-group buffers with a CompactBuffer class that can store up to 2 elements more efficiently (in fields of itself) and acts like an ArrayBuffer beyond that. For a key's elements in CoGroupedRDD, we use an Array of CompactBuffers instead of an ArrayBuffer of ArrayBuffers. There are some changes throughout the code to deal with CoGroupedRDD returning Array instead. We can also decide not to do that but CoGroupedRDD is a `DeveloperAPI` so I think it's okay to change it here. Author: Matei Zaharia <matei@databricks.com> Closes #1555 from mateiz/compact-groupby and squashes the following commits: 845a356 [Matei Zaharia] Lower initial size of CompactBuffer's vector to 8 07621a7 [Matei Zaharia] Review comments 0c1cd12 [Matei Zaharia] Don't use varargs in CompactBuffer.apply bdc8a39 [Matei Zaharia] Small tweak to +=, and typos f61f040 [Matei Zaharia] Fix line lengths 59da88b0 [Matei Zaharia] Fix line lengths 197cde8 [Matei Zaharia] Make CompactBuffer extend Seq to make its toSeq more efficient 775110f [Matei Zaharia] Change CoGroupedRDD to give (K, Array[Iterable[_]]) to avoid wrappers 9b4c6e8 [Matei Zaharia] Use CompactBuffer in CoGroupedRDD ed577ab [Matei Zaharia] Use CompactBuffer in groupByKey 10f0de1 [Matei Zaharia] A CompactBuffer that's more memory-efficient than ArrayBuffer for small buffers
-
Doris Xin authored
exact sample size not supported for now. Author: Doris Xin <doris.s.xin@gmail.com> Closes #1554 from dorx/pystratified and squashes the following commits: 4ba927a [Doris Xin] use rel diff (+- 50%) instead of abs diff (+- 50) bdc3f8b [Doris Xin] updated unit to check sample holistically 7713c7b [Doris Xin] Python version of stratified sampling
-
Davies Liu authored
During aggregation in Python worker, if the memory usage is above spark.executor.memory, it will do disk spilling aggregation. It will split the aggregation into multiple stage, in each stage, it will partition the aggregated data by hash and dump them into disks. After all the data are aggregated, it will merge all the stages together (partition by partition). Author: Davies Liu <davies.liu@gmail.com> Closes #1460 from davies/spill and squashes the following commits: cad91bf [Davies Liu] call gc.collect() after data.clear() to release memory as much as possible. 37d71f7 [Davies Liu] balance the partitions 902f036 [Davies Liu] add shuffle.py into run-tests dcf03a9 [Davies Liu] fix memory_info() of psutil 67e6eba [Davies Liu] comment for MAX_TOTAL_PARTITIONS f6bd5d6 [Davies Liu] rollback next_limit() again, the performance difference is huge: e74b785 [Davies Liu] fix code style and change next_limit to memory_limit 400be01 [Davies Liu] address all the comments 6178844 [Davies Liu] refactor and improve docs fdd0a49 [Davies Liu] add long doc string for ExternalMerger 1a97ce4 [Davies Liu] limit used memory and size of objects in partitionBy() e6cc7f9 [Davies Liu] Merge branch 'master' into spill 3652583 [Davies Liu] address comments e78a0a0 [Davies Liu] fix style 24cec6a [Davies Liu] get local directory by SPARK_LOCAL_DIR 57ee7ef [Davies Liu] update docs 286aaff [Davies Liu] let spilled aggregation in Python configurable e9a40f6 [Davies Liu] recursive merger 6edbd1f [Davies Liu] Hash based disk spilling aggregation
-
- Jul 24, 2014
-
-
Prashant Sharma authored
Author: Prashant Sharma <prashant.s@imaginea.com> Closes #1051 from ScrapCodes/SPARK-2014/pyspark-cache and squashes the following commits: f192df7 [Prashant Sharma] Code Review 2a2f43f [Prashant Sharma] [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default
-
Tathagata Das authored
Stopping the Twitter Receiver would call twitter4j's TwitterStream.shutdown, which in turn causes an Exception to be thrown to the listener. This exception caused the Receiver to be restarted. This patch check whether the receiver was stopped or not, and accordingly restarts on exception. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #1577 from tdas/twitter-stop and squashes the following commits: 011b525 [Tathagata Das] Fixed Twitter stream stopping bug.
-
Neville Li authored
Author: Neville Li <neville@spotify.com> Closes #1188 from nevillelyh/neville/ui and squashes the following commits: d3ac425 [Neville Li] SPARK-2250: show persisted RDD in stage UI f075db9 [Neville Li] SPARK-2035: show call stack even when description is available
-
GuoQiang Li authored
Author: GuoQiang Li <witgo@qq.com> Closes #1180 from witgo/SPARK-2037 and squashes the following commits: 3d52411 [GuoQiang Li] review commit 7058f4d [GuoQiang Li] Correctly stop SparkContext 6d0561f [GuoQiang Li] Fix: yarn client mode doesn't support spark.yarn.max.executor.failures
-
Xiangrui Meng authored
Allow small errors in comparison. @dbtsai , this unit test blocks https://github.com/apache/spark/pull/1562 . I may need to merge this one first. We can change it to use the tools in https://github.com/apache/spark/pull/1425 after that PR gets merged. Author: Xiangrui Meng <meng@databricks.com> Closes #1576 from mengxr/fix-binary-metrics-unit-tests and squashes the following commits: 5076a7f [Xiangrui Meng] fix binary metrics unit tests
-
Yin Huai authored
[SPARK-2603][SQL] Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala In JsonRDD.scalafy, we are using toMap/toList to convert a Java Map/List to a Scala one. These two operations are pretty expensive because they read elements from a Java Map/List and then load to a Scala Map/List. We can use Scala wrappers to wrap those Java collections instead of using toMap/toList. I did a quick test to see the performance. I had a 2.9GB cached RDD[String] storing one JSON object per record (twitter dataset). My simple test program is attached below. ```scala val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext._ val jsonData = sc.textFile("...") jsonData.cache.count val jsonSchemaRDD = sqlContext.jsonRDD(jsonData) jsonSchemaRDD.registerAsTable("jt") sqlContext.sql("select count(*) from jt").collect ``` Stages for the schema inference and the table scan both had 48 tasks. These tasks were executed sequentially. For the current implementation, scanning the JSON dataset will materialize values of all fields of a record. The inferred schema of the dataset can be accessed at https://gist.github.com/yhuai/05fe8a57c638c6666f8d. From the result, there was no significant difference on running `jsonRDD`. For the simple aggregation query, results are attached below. ``` Original: Run 1: 26.1s Run 2: 27.03s Run 3: 27.035s With this change: Run 1: 21.086s Run 2: 21.035s Run 3: 21.029s ``` JIRA: https://issues.apache.org/jira/browse/SPARK-2603 Author: Yin Huai <huai@cse.ohio-state.edu> Closes #1504 from yhuai/removeToMapToList and squashes the following commits: 6831b77 [Yin Huai] Fix failed tests. 09b9bca [Yin Huai] Merge remote-tracking branch 'upstream/master' into removeToMapToList d1abdb8 [Yin Huai] Remove unnecessary toMap and toList.
-
tzolov authored
Add a `<deb.bin.filemode>744</deb.bin.filemode>` property to the `assembly/pom.xml` that defaults to `744`. Use this property for ../bin folder <filemode>. This patch doesn't change the current default modes but allows one override the modes at build time: `-Ddeb.bin.filemode=<new mode>` Author: tzolov <christian.tzolov@gmail.com> Closes #1531 from tzolov/SPARK-2619 and squashes the following commits: 6d95343 [tzolov] [Build] SPARK-2619: Configurable filemode for the spark/bin folder in the .deb package
-
Rahul Singhal authored
...rce manager UI Use the event logger directory to provide a direct link to finished application UI in yarn resourcemanager UI. Author: Rahul Singhal <rahul.singhal@guavus.com> Closes #1094 from rahulsinghaliitd/SPARK-2150 and squashes the following commits: 95f230c [Rahul Singhal] SPARK-2150: Provide direct link to finished application UI in yarn resource manager UI
-
Daoyuan authored
Unpersist useless rdd during bagel iteration to make full use of memory. Author: Daoyuan <daoyuan.wang@intel.com> Closes #1519 from adrian-wang/bagelunpersist and squashes the following commits: 182c9dd [Daoyuan] rename var nextUseless to lastRDD 87fd3a4 [Daoyuan] bagel unpersist old processed rdd
-
Sandy Ryza authored
...spark-submit The PR allows invocations like spark-submit --class org.MyClass --spark.shuffle.spill false myjar.jar Author: Sandy Ryza <sandy@cloudera.com> Closes #1253 from sryza/sandy-spark-2310 and squashes the following commits: 1dc9855 [Sandy Ryza] More doc and cleanup 00edfb9 [Sandy Ryza] Review comments 91b244a [Sandy Ryza] Change format to --conf PROP=VALUE 8fabe77 [Sandy Ryza] SPARK-2310. Support arbitrary Spark properties on the command line with spark-submit
-
Michael Armbrust authored
Author: Michael Armbrust <michael@databricks.com> Closes #1556 from marmbrus/fixBooleanEqualsOne and squashes the following commits: ad8edd4 [Michael Armbrust] Add rule for true = 1 and false = 0.
-
GuoQiang Li authored
Author: GuoQiang Li <witgo@qq.com> Closes #1511 from witgo/JsonProtocol and squashes the following commits: 2b6227f [GuoQiang Li] Fix NPE for JsonProtocol
-
- Jul 23, 2014
-
-
Ankur Dave authored
RoutingTableMessage was used to construct routing tables to enable joining VertexRDDs with partitioned edges. It stored three elements: the destination vertex ID, the source edge partition, and a byte specifying the position in which the edge partition referenced the vertex to enable join elimination. However, this was incompatible with sort-based shuffle (SPARK-2045). It was also slightly wasteful, because partition IDs are usually much smaller than 2^32, though this was mitigated by a custom serializer that used variable-length encoding. This commit replaces RoutingTableMessage with a pair of (VertexId, Int) where the Int encodes both the source partition ID (in the lower 30 bits) and the position (in the top 2 bits). Author: Ankur Dave <ankurdave@gmail.com> Closes #1553 from ankurdave/remove-RoutingTableMessage and squashes the following commits: 697e17b [Ankur Dave] Replace RoutingTableMessage with pair
-
witgo authored
Author: witgo <witgo@qq.com> Closes #1403 from witgo/hive_compatibility and squashes the following commits: 4e5ecdb [witgo] The default does not run hive compatibility tests
-
Prashant Sharma authored
Author: Prashant Sharma <prashant.s@imaginea.com> Closes #1510 from ScrapCodes/SPARK-2549/fun-in-fun and squashes the following commits: 9458bc5 [Prashant Sharma] Tested by removing an inner function from excludes. bc03b1c [Prashant Sharma] SPARK-2549 Functions defined inside of other functions trigger failures
-
Ian O Connell authored
[SPARK-2102][SQL][CORE] Add option for kryo registration required and use a resource pool in Spark SQL for Kryo instances. Author: Ian O Connell <ioconnell@twitter.com> Closes #1377 from ianoc/feature/SPARK-2102 and squashes the following commits: 5498566 [Ian O Connell] Docs update suggested by Patrick 20e8555 [Ian O Connell] Slight style change f92c294 [Ian O Connell] Add docs for new KryoSerializer option f3735c8 [Ian O Connell] Add using a kryo resource pool for the SqlSerializer 4e5c342 [Ian O Connell] Register the SparkConf for kryo, it gets swept into serialization 665805a [Ian O Connell] Add a spark.kryo.registrationRequired option for configuring the Kryo Serializer
-
Michael Armbrust authored
Instead of shipping just the name and then looking up the info on the workers, we now ship the whole classname. Also, I refactored the file as it was getting pretty large to move out the type conversion code to its own file. Author: Michael Armbrust <michael@databricks.com> Closes #1552 from marmbrus/fixTempUdfs and squashes the following commits: b695904 [Michael Armbrust] Make add jar execute with Hive. Ship the whole function class name since sometimes we cannot lookup temporary functions on the workers.
-
William Benton authored
SPARK-2226: [SQL] transform HAVING clauses with aggregate expressions that aren't in the aggregation list This change adds an analyzer rule to 1. find expressions in `HAVING` clause filters that depend on unresolved attributes, 2. push these expressions down to the underlying aggregates, and then 3. project them away above the filter. It also enables the `HAVING` queries in the Hive compatibility suite. Author: William Benton <willb@redhat.com> Closes #1497 from willb/spark-2226 and squashes the following commits: 92c9a93 [William Benton] Removed unnecessary import f1d4f34 [William Benton] Cleanups missed in prior commit 0e1624f [William Benton] Incorporated suggestions from @marmbrus; thanks! 541d4ee [William Benton] Cleanups from review 5a12647 [William Benton] Explanatory comments and stylistic cleanups. c7f2b2c [William Benton] Whitelist HAVING queries. 29a26e3 [William Benton] Added rule to handle unresolved attributes in HAVING clauses (SPARK-2226)
-
Rui Li authored
Hi mridulm, I just think of this issue of [#1212](https://github.com/apache/spark/pull/1212): I added FakeRackUtil to hold the host -> rack mapping. It should be cleaned up after use so that it won't mess up with test cases others may add later. Really sorry about this. Author: Rui Li <rui.li@intel.com> Closes #1454 from lirui-intel/SPARK-2277-fix-UT and squashes the following commits: f8ea25c [Rui Li] SPARK-2277: clear host->rack info properly
-
Takuya UESHIN authored
Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #1491 from ueshin/issues/SPARK-2588 and squashes the following commits: 43d0a46 [Takuya UESHIN] Merge branch 'master' into issues/SPARK-2588 1023ea0 [Takuya UESHIN] Modify tests to use DSLs. 2310bf1 [Takuya UESHIN] Add some more DSLs.
-