- Jan 31, 2013
-
Mikhail Bautin authored
See the discussion at https://github.com/mesos/spark/pull/355 for why default profile activation is a problem.
-
- Jan 30, 2013
-
Matei Zaharia authored
Minor improvements to PySpark docs
-
Patrick Wendell authored
Also, adds a line in doc explaining how to use.
-
Patrick Wendell authored
It's nicer if all the commands you need are made explicit.
-
Matei Zaharia authored
Remember ConnectionManagerId used to initiate SendingConnections
-
Matei Zaharia authored
Make ExecutorIDs include SlaveIDs when running Mesos
-
Matei Zaharia authored
Include message and exitStatus if available.
-
Stephen Haberman authored
-
Charles Reiss authored
-
Charles Reiss authored
the Mesos ExecutorID as a Spark ExecutorID.
-
- Jan 29, 2013
-
Charles Reiss authored
This prevents ConnectionManager from getting confused if a machine has multiple host names and the one getHostName() finds happens not to be the one that was passed from, e.g., the BlockManagerMaster.
-
Matei Zaharia authored
Conflicts: core/src/main/scala/spark/deploy/master/Master.scala
-
Matei Zaharia authored
Add RDD.toDebugString.
-
Matei Zaharia authored
Replace old 'master' term with 'driver'.
-
Matei Zaharia authored
  - RDD's getDependencies and getSplits methods are now guaranteed to be called only once, so subclasses can safely do computation in there without worrying about caching the results.
  - The management of a "splits_" variable that is cleared out when we checkpoint an RDD is now done in the RDD class.
  - A few of the RDD subclasses are simpler.
  - CheckpointRDD's compute() method no longer assumes that it is given a CheckpointRDDSplit -- it can work just as well on a split from the original RDD, because it only looks at its index. This is important because things like UnionRDD and ZippedRDD remember the parent's splits as part of their own and wouldn't work on checkpointed parents.
  - RDD.iterator can now reuse cached data if an RDD is computed before it is checkpointed. It seems like it wouldn't do this before (it always called iterator() on the CheckpointRDD, which read from HDFS).
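The "called only once" guarantee described in this commit message can be illustrated with a small sketch. This is not Spark's code: `SketchRDD`, `CountingRDD`, `get_splits`, and `mark_checkpointed` are hypothetical names, and handing the split lookup over to the checkpoint object after checkpointing is an assumption made for illustration. Only the idea of caching the result in the base class and clearing it on checkpoint comes from the message.

```python
class SketchRDD:
    """Hypothetical stand-in for an RDD base class; not Spark's actual code."""

    def __init__(self):
        self._splits = None      # cached result of get_splits()
        self._checkpoint = None  # set once this RDD has been checkpointed

    def get_splits(self):
        """Subclasses override this; the base class calls it at most once."""
        raise NotImplementedError

    @property
    def splits(self):
        # Assumption: after checkpointing, splits come from the checkpoint object.
        if self._checkpoint is not None:
            return self._checkpoint.splits
        # Otherwise compute once and reuse the cached value afterwards.
        if self._splits is None:
            self._splits = self.get_splits()
        return self._splits

    def mark_checkpointed(self, checkpoint_rdd):
        # The base class, not each subclass, clears the cached splits.
        self._checkpoint = checkpoint_rdd
        self._splits = None


class CountingRDD(SketchRDD):
    """Hypothetical subclass that counts how often get_splits() runs."""

    def __init__(self, n):
        super().__init__()
        self.n = n
        self.calls = 0

    def get_splits(self):
        self.calls += 1
        return list(range(self.n))
```

With this shape, a subclass can do real work inside `get_splits` without keeping its own cache, because the base class owns both the caching and the clearing step.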
-
Matei Zaharia authored
-
Stephen Haberman authored
-
Stephen Haberman authored
-
Matei Zaharia authored
-
Stephen Haberman authored
-
Matei Zaharia authored
SPARK-658: Adding logging of stage duration
-
- Jan 28, 2013
-
Stephen Haberman authored
Original idea by Nathan Kronenfeld.
-
Patrick Wendell authored
-
Stephen Haberman authored
Conflicts:
  core/src/main/scala/spark/SparkContext.scala
  core/src/main/scala/spark/SparkEnv.scala
  core/src/main/scala/spark/deploy/LocalSparkCluster.scala
  core/src/main/scala/spark/executor/StandaloneExecutorBackend.scala
  core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
  core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
  core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
  core/src/main/scala/spark/storage/BlockManagerMaster.scala
  core/src/main/scala/spark/storage/ThreadingTest.scala
  core/src/test/scala/spark/MapOutputTrackerSuite.scala
-
Matei Zaharia authored
Some DEBUG-level log cleanup.
-
Matei Zaharia authored
Add Long and Float AccumulatorParams
-
Patrick Wendell authored
A few changes to make the DEBUG-level logs less noisy and more readable.
  - Moved a few very frequent messages to TRACE.
  - Changed some BlockManager log messages to make them more understandable.
SPARK-666 #resolve
-
Imran Rashid authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Matei Zaharia authored
-
Matei Zaharia authored
Directives, and bind to a random port)
-
Matei Zaharia authored
-
- Jan 27, 2013
-
Matei Zaharia authored
executors per machine and remove the need for multiple IP addresses in unit tests.
-
Matei Zaharia authored
Detect whether we run on EC2 using ec2-metadata as well
-
Matei Zaharia authored
BlockManager UI
-
Shivaram Venkataraman authored
-
- Jan 26, 2013
-
Matei Zaharia authored
Fix BlockManager reregistration deadlock; do BlockManager reregistration more asynchronously
-
Charles Reiss authored
-