  1. Aug 30, 2016
    • Dmitriy Sokolov's avatar
      [MINOR][DOCS] Fix minor typos in python example code · d4eee993
      Dmitriy Sokolov authored
      ## What changes were proposed in this pull request?
      
      Fix minor typos in the Python example code in the streaming programming guide.
      
      ## How was this patch tested?
      
      N/A
      
      Author: Dmitriy Sokolov <silentsokolov@gmail.com>
      
      Closes #14805 from silentsokolov/fix-typos.
      d4eee993
  2. Jul 15, 2016
    • Joseph K. Bradley's avatar
      [SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide · 5ffd5d38
      Joseph K. Bradley authored
      ## What changes were proposed in this pull request?
      
      Made DataFrame-based API primary
      * Spark doc menu bar and other places now link to ml-guide.html, not mllib-guide.html
      * mllib-guide.html keeps RDD-specific list of features, with a link at the top redirecting people to ml-guide.html
      * ml-guide.html includes a "maintenance mode" announcement about the RDD-based API
        * **Reviewers: please check this carefully**
      * (minor) Titles for DF API no longer include "- spark.ml" suffix.  Titles for RDD API have "- RDD-based API" suffix
      * Moved migration guide to ml-guide from mllib-guide
        * Also moved past guides from mllib-migration-guides to ml-migration-guides, with a redirect link on mllib-migration-guides
        * **Reviewers**: I did not change any of the content of the migration guides.
      
      Reorganized DataFrame-based guide:
      * ml-guide.html mimics the old mllib-guide.html page in terms of content: overview, migration guide, etc.
      * Moved Pipeline description into ml-pipeline.html and moved tuning into ml-tuning.html
        * **Reviewers**: I did not change the content of these guides, except some intro text.
      * Sidebar remains the same, but with pipeline and tuning sections added
      
      Other:
      * ml-classification-regression.html: Moved text about linear methods to new section in page
      
      ## How was this patch tested?
      
      Generated docs locally
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #14213 from jkbradley/ml-guide-2.0.
      5ffd5d38
  3. Jun 11, 2016
    • Dongjoon Hyun's avatar
      [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib documents · ad102af1
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This issue fixes all broken links in the Spark 2.0 preview MLlib documents. It also contains some editorial changes.
      
      **Fix broken links**
        * mllib-data-types.md
        * mllib-decision-tree.md
        * mllib-ensembles.md
        * mllib-feature-extraction.md
        * mllib-pmml-model-export.md
        * mllib-statistics.md
      
      **Fix malformed section header and Scala coding style**
        * mllib-linear-methods.md
      
      **Replace indirect forward links with direct ones**
        * ml-classification-regression.md
      
      ## How was this patch tested?
      
      Manual tests (with `cd docs; jekyll build`.)
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #13608 from dongjoon-hyun/SPARK-15883.
      ad102af1
  4. May 03, 2016
    • Shuai Lin's avatar
      [MINOR][DOC] Fixed some python snippets in mllib data types documentation. · c4e0fde8
      Shuai Lin authored
      ## What changes were proposed in this pull request?
      
      Some Python snippets are using Scala imports and comments.
      
      ## How was this patch tested?
      
      Generated the docs locally with `SKIP_API=1 jekyll build` and viewed the changes in the browser.
      
      Author: Shuai Lin <linshuai2012@gmail.com>
      
      Closes #12869 from lins05/fix-mllib-python-snippets.
      c4e0fde8
  5. Mar 08, 2016
    • Sean Owen's avatar
      [SPARK-13715][MLLIB] Remove last usages of jblas in tests · 54040f8d
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Remove last usage of jblas, in tests
      
      ## How was this patch tested?
      
      Jenkins tests -- the same ones that are being modified.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #11560 from srowen/SPARK-13715.
      54040f8d
  6. Dec 10, 2015
    • Timothy Hunter's avatar
      [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib... · 2ecbe02d
      Timothy Hunter authored
      [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.
      
      Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark).
      
      It also removes some files that I forgot to delete with #10207
      
      Author: Timothy Hunter <timhunter@databricks.com>
      
      Closes #10234 from thunterdb/12212.
      2ecbe02d
  7. Oct 07, 2015
  8. Sep 10, 2015
  9. Aug 27, 2015
  10. Aug 05, 2015
    • Mike Dusenberry's avatar
      [SPARK-6486] [MLLIB] [PYTHON] Add BlockMatrix to PySpark. · 34dcf101
      Mike Dusenberry authored
      mengxr This adds the `BlockMatrix` to PySpark.  I have the conversions to `IndexedRowMatrix` and `CoordinateMatrix` ready as well, so once PR #7554 is completed (which relies on PR #7746), this PR can be finished.
      
      Author: Mike Dusenberry <mwdusenb@us.ibm.com>
      
      Closes #7761 from dusenberrymw/SPARK-6486_Add_BlockMatrix_to_PySpark and squashes the following commits:
      
      27195c2 [Mike Dusenberry] Adding one more check to _convert_to_matrix_block_tuple, and a few minor documentation changes.
      ae50883 [Mike Dusenberry] Minor update: BlockMatrix should inherit from DistributedMatrix.
      b8acc1c [Mike Dusenberry] Moving BlockMatrix to pyspark.mllib.linalg.distributed, updating the logic to match that of the other distributed matrices, adding conversions, and adding documentation.
      c014002 [Mike Dusenberry] Using properties for better documentation.
      3bda6ab [Mike Dusenberry] Adding documentation.
      8fb3095 [Mike Dusenberry] Small cleanup.
      e17af2e [Mike Dusenberry] Adding BlockMatrix to PySpark.
      34dcf101
  11. Aug 04, 2015
    • Mike Dusenberry's avatar
      [SPARK-6485] [MLLIB] [PYTHON] Add CoordinateMatrix/RowMatrix/IndexedRowMatrix to PySpark. · 571d5b53
      Mike Dusenberry authored
      This PR adds the RowMatrix, IndexedRowMatrix, and CoordinateMatrix distributed matrices to PySpark.  Each distributed matrix class acts as a wrapper around the Scala/Java counterpart by maintaining a reference to the Java object.  New distributed matrices can be created using factory methods added to DistributedMatrices, which creates the Java distributed matrix and then wraps it with the corresponding PySpark class.  This design allows for simple conversion between the various distributed matrices, and lets us re-use the Scala code.  Serialization between Python and Java is implemented using DataFrames as needed for IndexedRowMatrix and CoordinateMatrix for simplicity.  Associated documentation and unit-tests have also been added.  To facilitate code review, this PR implements access to the rows/entries as RDDs, the number of rows & columns, and conversions between the various distributed matrices (not including BlockMatrix), and does not implement the other linear algebra functions of the matrices, although this will be very simple to add now.
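
The wrapper design described above can be pictured with a minimal, Spark-free sketch. Everything in it is hypothetical: `_BackendRowMatrix` and `_BackendIndexedRowMatrix` stand in for the Scala/Java objects, and the `*Wrapper` classes and their method names are illustrative, not PySpark's actual API. It only shows the idea of keeping a reference to a backend object and re-wrapping whatever a backend-side conversion returns.

```python
class _BackendRowMatrix:
    """Hypothetical stand-in for the Scala/Java RowMatrix (not a Spark class)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def num_rows(self):
        return len(self.rows)

    def num_cols(self):
        return len(self.rows[0]) if self.rows else 0


class _BackendIndexedRowMatrix:
    """Hypothetical stand-in for the Scala/Java IndexedRowMatrix."""

    def __init__(self, indexed_rows):
        # Each element is an (index, vector) pair.
        self.indexed_rows = [(int(i), list(v)) for i, v in indexed_rows]

    def to_row_matrix(self):
        # The conversion runs on the "backend" and returns a backend object.
        return _BackendRowMatrix(v for _, v in self.indexed_rows)


class RowMatrixWrapper:
    """Python-side wrapper that only holds a reference to the backend object."""

    def __init__(self, rows):
        # Accept either raw rows or an already-built backend object.
        if isinstance(rows, _BackendRowMatrix):
            self._backend = rows
        else:
            self._backend = _BackendRowMatrix(rows)

    def num_rows(self):
        return self._backend.num_rows()

    def num_cols(self):
        return self._backend.num_cols()


class IndexedRowMatrixWrapper:
    """Wrapper whose conversion delegates to the backend, then re-wraps."""

    def __init__(self, indexed_rows):
        self._backend = _BackendIndexedRowMatrix(indexed_rows)

    def to_row_matrix(self):
        return RowMatrixWrapper(self._backend.to_row_matrix())


indexed = IndexedRowMatrixWrapper([(0, [1.0, 2.0]), (3, [4.0, 5.0])])
converted = indexed.to_row_matrix()
print(converted.num_rows(), converted.num_cols())
```

Because the conversion result is wrapped directly, the Python side never has to rebuild the data it already handed to the backend.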
      
      Author: Mike Dusenberry <mwdusenb@us.ibm.com>
      
      Closes #7554 from dusenberrymw/SPARK-6485_Add_CoordinateMatrix_RowMatrix_IndexedMatrix_to_PySpark and squashes the following commits:
      
      bb039cb [Mike Dusenberry] Minor documentation update.
      b887c18 [Mike Dusenberry] Updating the matrix conversion logic again to make it even cleaner.  Now, we allow the 'rows' parameter in the constructors to be either an RDD or the Java matrix object. If 'rows' is an RDD, we create a Java matrix object, wrap it, and then store that.  If 'rows' is a Java matrix object of the correct type, we just wrap and store that directly.  This is only for internal usage, and publicly, we still require 'rows' to be an RDD.  We no longer store the 'rows' RDD, and instead just compute it from the Java object when needed.  The point of this is that when we do matrix conversions, we do the conversion on the Scala/Java side, which returns a Java object, so we should use that directly, but exposing 'java_matrix' parameter in the public API is not ideal. This non-public feature of allowing 'rows' to be a Java matrix object is documented in the '__init__' constructor docstrings, which are not part of the generated public API, and doctests are also included.
      7f0dcb6 [Mike Dusenberry] Updating module docstring.
      cfc1be5 [Mike Dusenberry] Use 'new SQLContext(matrix.rows.sparkContext)' rather than 'SQLContext.getOrCreate', as the latter doesn't guarantee that the SparkContext will be the same as for the matrix.rows data.
      687e345 [Mike Dusenberry] Improving conversion performance.  This adds an optional 'java_matrix' parameter to the constructors, and pulls the conversion logic out into a '_create_from_java' function. Now, if the constructors are given a valid Java distributed matrix object as 'java_matrix', they will store those internally, rather than create a new one on the Scala/Java side.
      3e50b6e [Mike Dusenberry] Moving the distributed matrices to pyspark.mllib.linalg.distributed.
      308f197 [Mike Dusenberry] Using properties for better documentation.
      1633f86 [Mike Dusenberry] Minor documentation cleanup.
      f0c13a7 [Mike Dusenberry] CoordinateMatrix should inherit from DistributedMatrix.
      ffdd724 [Mike Dusenberry] Updating doctests to make documentation cleaner.
      3fd4016 [Mike Dusenberry] Updating docstrings.
      27cd5f6 [Mike Dusenberry] Simplifying input conversions in the constructors for each distributed matrix.
      a409cf5 [Mike Dusenberry] Updating doctests to be less verbose by using lists instead of DenseVectors explicitly.
      d19b0ba [Mike Dusenberry] Updating code and documentation to note that a vector-like object (numpy array, list, etc.) can be used in place of an explicit Vector object, and adding conversions when necessary to RowMatrix construction.
      4bd756d [Mike Dusenberry] Adding param documentation to IndexedRow and MatrixEntry.
      c6bded5 [Mike Dusenberry] Move conversion logic from tuples to IndexedRow or MatrixEntry types from within the IndexedRowMatrix and CoordinateMatrix constructors to separate _convert_to_indexed_row and _convert_to_matrix_entry functions.
      329638b [Mike Dusenberry] Moving the Experimental tag to the top of each docstring.
      0be6826 [Mike Dusenberry] Simplifying doctests by removing duplicated rows/entries RDDs within the various tests.
      c0900df [Mike Dusenberry] Adding the colons that were accidentally not inserted.
      4ad6819 [Mike Dusenberry] Documenting the  and  parameters.
      3b854b9 [Mike Dusenberry] Minor updates to documentation.
      10046e8 [Mike Dusenberry] Updating documentation to use class constructors instead of the removed DistributedMatrices factory methods.
      119018d [Mike Dusenberry] Adding static  methods to each of the distributed matrix classes to consolidate conversion logic.
      4d7af86 [Mike Dusenberry] Adding type checks to the constructors.  Although it is slightly verbose, it is better for the user to have a good error message than a cryptic stacktrace.
      93b6a3d [Mike Dusenberry] Pulling the DistributedMatrices Python class out of this pull request.
      f6f3c68 [Mike Dusenberry] Pulling the DistributedMatrices Scala class out of this pull request.
      6a3ecb7 [Mike Dusenberry] Updating pattern matching.
      08f287b [Mike Dusenberry] Slight reformatting of the documentation.
      a245dc0 [Mike Dusenberry] Updating Python doctests for compatibility between Python 2 & 3. Since Python 3 removed the idea of a separate 'long' type, all values that would have been output as a 'long' (ex: '4L') will now be treated as an 'int' and output as one (ex: '4').  The doctests now explicitly convert to ints so that both Python 2 and 3 will have the same output.  This is fine since the values are all small, and thus can be easily represented as ints.
      4d3a37e [Mike Dusenberry] Reformatting a few long Python doctest lines.
      7e3ca16 [Mike Dusenberry] Fixing long lines.
      f721ead [Mike Dusenberry] Updating documentation for each of the distributed matrices.
      ab0e8b6 [Mike Dusenberry] Updating unit test to be more useful.
      dda2f89 [Mike Dusenberry] Added wrappers for the conversions between the various distributed matrices.  Added logic to be able to access the rows/entries of the distributed matrices, which requires serialization through DataFrames for IndexedRowMatrix and CoordinateMatrix types. Added unit tests.
      0cd7166 [Mike Dusenberry] Implemented the CoordinateMatrix API in PySpark, following the idea of the IndexedRowMatrix API, including using DataFrames for serialization.
      3c369cb [Mike Dusenberry] Updating the architecture a bit to make conversions between the various distributed matrix types easier.  The different distributed matrix classes are now only wrappers around the Java objects, and take the Java object as an argument during construction.  This way, we can call  for example on an , which returns a reference to a Java RowMatrix object, and then construct a PySpark RowMatrix object wrapped around the Java object.  This is analogous to the behavior of PySpark RDDs and DataFrames.  We now delegate creation of the various distributed matrices from scratch in PySpark to the factory methods on .
      4bdd09b [Mike Dusenberry] Implemented the IndexedRowMatrix API in PySpark, following the idea of the RowMatrix API.  Note that for the IndexedRowMatrix, we use DataFrames to serialize the data between Python and Scala/Java, so we accept PySpark RDDs, then convert to a DataFrame, then convert back to RDDs on the Scala/Java side before constructing the IndexedRowMatrix.
      23bf1ec [Mike Dusenberry] Updating documentation to add PySpark RowMatrix. Inserting newline above doctest so that it renders properly in API docs.
      b194623 [Mike Dusenberry] Updating design to have a PySpark RowMatrix simply create and keep a reference to a wrapper over a Java RowMatrix.  Updating DistributedMatrices factory methods to accept numRows and numCols with default values.  Updating PySpark DistributedMatrices factory method to simply create a PySpark RowMatrix. Adding additional doctests for numRows and numCols parameters.
      bc2d220 [Mike Dusenberry] Adding unit tests for RowMatrix methods.
      d7e316f [Mike Dusenberry] Implemented the RowMatrix API in PySpark by doing the following: Added a DistributedMatrices class to contain factory methods for creating the various distributed matrices.  Added a factory method for creating a RowMatrix from an RDD of Vectors.  Added a createRowMatrix function to the PythonMLlibAPI to interface with the factory method.  Added DistributedMatrix, DistributedMatrices, and RowMatrix classes to the pyspark.mllib.linalg api.
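
Commit a245dc0 in the list above describes wrapping doctest values in `int()` so that the expected output is the same on Python 2 and Python 3. A minimal sketch of that pattern follows. The `num_rows` function is a hypothetical stand-in rather than the PySpark method, and the snippet assumes Python 3, where it cannot reproduce the old `4L` echo and only demonstrates the portable form.

```python
import doctest


def num_rows(rows):
    """Count rows (hypothetical stand-in for a JVM-backed call).

    Under Python 2 such a value could come back as a ``long`` and echo
    as ``4L``. Wrapping it in ``int()`` keeps the doctest output stable.

    >>> int(num_rows([[1, 2], [3, 4], [5, 6], [7, 8]]))
    4
    """
    return len(rows)


# Parse and run the docstring example explicitly, without relying on __main__.
parser = doctest.DocTestParser()
test = parser.get_doctest(num_rows.__doc__, {"num_rows": num_rows}, "num_rows", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```
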
      571d5b53
  12. Jul 07, 2015
    • Mike Dusenberry's avatar
      [SPARK-8570] [MLLIB] [DOCS] Improve MLlib Local Matrix Documentation. · 0a63d7ab
      Mike Dusenberry authored
      Updated MLlib Data Types Local Matrix section to include information on sparse matrices, added sparse matrix examples to the Scala and Java examples, and added Python examples for both dense and sparse matrices.
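
For readers unfamiliar with sparse local matrices, MLlib stores them in compressed sparse column (CSC) format: column pointers, row indices, and non-zero values. The sketch below is plain Python, not Spark code, and the helper name `csc_to_dense` is an illustrative assumption. It expands a CSC triple into a dense list of rows to show how the three arrays relate.

```python
def csc_to_dense(num_rows, num_cols, col_ptrs, row_indices, values):
    """Expand a CSC-encoded matrix into a dense list of rows.

    col_ptrs[j] .. col_ptrs[j + 1] is the slice of row_indices/values
    that belongs to column j.
    """
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for col in range(num_cols):
        for k in range(col_ptrs[col], col_ptrs[col + 1]):
            dense[row_indices[k]][col] = float(values[k])
    return dense


# A 3x2 matrix with three non-zero entries: (0,0)=9, (2,1)=6, (1,1)=8.
matrix = csc_to_dense(3, 2, [0, 1, 3], [0, 2, 1], [9, 6, 8])
for row in matrix:
    print(row)
# [9.0, 0.0]
# [0.0, 8.0]
# [0.0, 6.0]
```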
      
      Author: Mike Dusenberry <mwdusenb@us.ibm.com>
      
      Closes #6958 from dusenberrymw/Improve_MLlib_Local_Matrix_Documentation and squashes the following commits:
      
      ceae407 [Mike Dusenberry] Updated MLlib Data Types Local Matrix section to include information on sparse matrices, added sparse matrix examples to the Scala and Java examples, and added Python examples for both dense and sparse matrices.
      0a63d7ab
  13. May 19, 2015
    • Mike Dusenberry's avatar
      [SPARK-7744] [DOCS] [MLLIB] "Distributed matrix" section in MLlib "Data Types"... · 38605206
      Mike Dusenberry authored
      [SPARK-7744] [DOCS] [MLLIB] "Distributed matrix" section in MLlib "Data Types" documentation should be reordered.
      
      The documentation for BlockMatrix should come after RowMatrix, IndexedRowMatrix, and CoordinateMatrix, as BlockMatrix references those three types, and RowMatrix is considered the "basic" distributed matrix.  This will improve comprehensibility of the "Distributed matrix" section, especially for the new reader.
      
      Author: Mike Dusenberry <dusenberrymw@gmail.com>
      
      Closes #6270 from dusenberrymw/Reorder_MLlib_Data_Types_Distributed_matrix_docs and squashes the following commits:
      
      6313bab [Mike Dusenberry] The documentation for BlockMatrix should come after RowMatrix, IndexedRowMatrix, and CoordinateMatrix, as BlockMatrix references those three types, and RowMatrix is considered the "basic" distributed matrix.  This will improve comprehensibility of the "Distributed matrix" section, especially for the new reader.
      38605206
  14. May 16, 2015
  15. Mar 22, 2015
    • Kamil Smuga's avatar
      SPARK-6454 [DOCS] Fix links to pyspark api · 6ef48632
      Kamil Smuga authored
      Author: Kamil Smuga <smugakamil@gmail.com>
      Author: stderr <smugakamil@gmail.com>
      
      Closes #5120 from kamilsmuga/master and squashes the following commits:
      
      fee3281 [Kamil Smuga] more python api links fixed for docs
      13240cb [Kamil Smuga] resolved merge conflicts with upstream/master
      6649b3b [Kamil Smuga] fix broken docs links to Python API
      92f03d7 [stderr] Fix links to pyspark api
      6ef48632
  16. Feb 24, 2015
    • Xiangrui Meng's avatar
      [SPARK-5958][MLLIB][DOC] update block matrix user guide · cf2e4165
      Xiangrui Meng authored
      * Removed SVD code from examples.
      * Corrected Java API doc link.
      * Updated variable names: `AtransposeA` -> `ata`.
      * Minor changes.
      
      brkyvz
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4737 from mengxr/update-block-matrix-user-guide and squashes the following commits:
      
      70f53ac [Xiangrui Meng] update block matrix user guide
      cf2e4165
  17. Feb 18, 2015
    • Burak Yavuz's avatar
      [SPARK-5507] Added documentation for BlockMatrix · a8eb92dc
      Burak Yavuz authored
      Docs for BlockMatrix. mengxr
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #4664 from brkyvz/SPARK-5507PR and squashes the following commits:
      
      4db30b0 [Burak Yavuz] [SPARK-5507] Added documentation for BlockMatrix
      a8eb92dc
  18. Aug 27, 2014
    • Xiangrui Meng's avatar
      [SPARK-2830][MLLIB] doc update for 1.1 · 43dfc84f
      Xiangrui Meng authored
      1. renamed mllib-basics to mllib-data-types
      1. renamed mllib-stats to mllib-statistics
      1. moved random data generation to the bottom of mllib-stats
      1. updated toc accordingly
      
      atalwalkar
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #2151 from mengxr/mllib-doc-1.1 and squashes the following commits:
      
      0bd79f3 [Xiangrui Meng] add mllib-data-types
      b64a5d7 [Xiangrui Meng] update the content list of basic statistics in mllib-guide
      f625cc2 [Xiangrui Meng] move mllib-basics to mllib-data-types
      4d69250 [Xiangrui Meng] move random data generation to the bottom of statistics
      e64f3ce [Xiangrui Meng] move mllib-stats.md to mllib-statistics.md
      43dfc84f
  19. Aug 12, 2014
    • Ameet Talwalkar's avatar
      SPARK-2830 [MLlib]: re-organize mllib documentation · c235b83e
      Ameet Talwalkar authored
      As per discussions with Xiangrui, I've reorganized and edited the mllib documentation.
      
      Author: Ameet Talwalkar <atalwalkar@gmail.com>
      
      Closes #1908 from atalwalkar/master and squashes the following commits:
      
      fe6938a [Ameet Talwalkar] made xiangruis suggested changes
      840028b [Ameet Talwalkar] made xiangruis suggested changes
      7ec366a [Ameet Talwalkar] reorganize and edit mllib documentation
      c235b83e
  20. Jul 13, 2014
    • Sean Owen's avatar
      SPARK-2363. Clean MLlib's sample data files · 635888cb
      Sean Owen authored
      (Just made a PR for this, mengxr was the reporter of:)
      
      MLlib has sample data under several folders:
      1) data/mllib
      2) data/
      3) mllib/data/*
      Per previous discussion with Matei Zaharia, we want to put them under `data/mllib` and clean outdated files.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #1394 from srowen/SPARK-2363 and squashes the following commits:
      
      54313dd [Sean Owen] Move ML example data from /mllib/data/ and /data/ into /data/mllib/
      635888cb
  21. May 18, 2014
    • Xiangrui Meng's avatar
      [WIP][SPARK-1871][MLLIB] Improve MLlib guide for v1.0 · df0aa835
      Xiangrui Meng authored
      Some improvements to MLlib guide:
      
      1. [SPARK-1872] Update API links for unidoc.
      2. [SPARK-1783] Added `page.displayTitle` to the global layout. If it is defined, use it instead of `page.title` for title display.
      3. Add more Java/Python examples.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #816 from mengxr/mllib-doc and squashes the following commits:
      
      ec2e407 [Xiangrui Meng] format scala example for ALS
      cd9f40b [Xiangrui Meng] add a paragraph to summarize distributed matrix types
      4617f04 [Xiangrui Meng] add python example to loadLibSVMFile and fix Java example
      d6509c2 [Xiangrui Meng] [SPARK-1783] update mllib titles
      561fdc0 [Xiangrui Meng] add a displayTitle option to global layout
      195d06f [Xiangrui Meng] add Java example for summary stats and minor fix
      9f1ff89 [Xiangrui Meng] update java api links in mllib-basics
      7dad18e [Xiangrui Meng] update java api links in NB
      3a0f4a6 [Xiangrui Meng] api/pyspark -> api/python
      35bdeb9 [Xiangrui Meng] api/mllib -> api/scala
      e4afaa8 [Xiangrui Meng] explicitly state what might change
      df0aa835
  22. May 08, 2014
    • DB Tsai's avatar
      MLlib documentation fix · d38febee
      DB Tsai authored
      Fixed the documentation to reflect that `loadLibSVMData` was changed to `loadLibSVMFile`.
      
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #703 from dbtsai/dbtsai-docfix and squashes the following commits:
      
      71dd508 [DB Tsai] loadLibSVMData is changed to loadLibSVMFile
      d38febee
  23. May 06, 2014
    • Sean Owen's avatar
      SPARK-1727. Correct small compile errors, typos, and markdown issues in (primarily) MLlib docs · 25ad8f93
      Sean Owen authored
      While play-testing the Scala and Java code examples in the MLlib docs, I noticed a number of small compile errors, and some typos. This led to finding and fixing a few similar items in other docs.
      
      Then in the course of building the site docs to check the result, I found a few small suggestions for the build instructions. I also found a few more formatting and markdown issues uncovered when I accidentally used maruku instead of kramdown.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #653 from srowen/SPARK-1727 and squashes the following commits:
      
      6e7c38a [Sean Owen] Final doc updates - one more compile error, and use of mean instead of sum and count
      8f5e847 [Sean Owen] Fix markdown syntax issues that maruku flags, even though we use kramdown (but only those that do not affect kramdown's output)
      99966a9 [Sean Owen] Update issue tracker URL in docs
      23c9ac3 [Sean Owen] Add Scala Naive Bayes example, to use existing example data file (whose format needed a tweak)
      8c81982 [Sean Owen] Fix small compile errors and typos across MLlib docs
      25ad8f93
  24. Apr 22, 2014
    • Xiangrui Meng's avatar
      [SPARK-1506][MLLIB] Documentation improvements for MLlib 1.0 · 26d35f3f
      Xiangrui Meng authored
      Preview: http://54.82.240.23:4000/mllib-guide.html
      
      Table of contents:
      
      * Basics
        * Data types
        * Summary statistics
      * Classification and regression
        * linear support vector machine (SVM)
        * logistic regression
        * linear least squares, Lasso, and ridge regression
        * decision tree
        * naive Bayes
      * Collaborative Filtering
        * alternating least squares (ALS)
      * Clustering
        * k-means
      * Dimensionality reduction
        * singular value decomposition (SVD)
        * principal component analysis (PCA)
      * Optimization
        * stochastic gradient descent
        * limited-memory BFGS (L-BFGS)
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #422 from mengxr/mllib-doc and squashes the following commits:
      
      944e3a9 [Xiangrui Meng] merge master
      f9fda28 [Xiangrui Meng] minor
      9474065 [Xiangrui Meng] add alpha to ALS examples
      928e630 [Xiangrui Meng] initialization_mode -> initializationMode
      5bbff49 [Xiangrui Meng] add imports to labeled point examples
      c17440d [Xiangrui Meng] fix python nb example
      28f40dc [Xiangrui Meng] remove localhost:4000
      369a4d3 [Xiangrui Meng] Merge branch 'master' into mllib-doc
      7dc95cc [Xiangrui Meng] update linear methods
      053ad8a [Xiangrui Meng] add links to go back to the main page
      abbbf7e [Xiangrui Meng] update ALS argument names
      648283e [Xiangrui Meng] level down statistics
      14e2287 [Xiangrui Meng] add sample libsvm data and use it in guide
      8cd2441 [Xiangrui Meng] minor updates
      186ab07 [Xiangrui Meng] update section names
      6568d65 [Xiangrui Meng] update toc, level up lr and svm
      162ee12 [Xiangrui Meng] rename section names
      5c1e1b1 [Xiangrui Meng] minor
      8aeaba1 [Xiangrui Meng] wrap long lines
      6ce6a6f [Xiangrui Meng] add summary statistics to toc
      5760045 [Xiangrui Meng] claim beta
      cc604bf [Xiangrui Meng] remove classification and regression
      92747b3 [Xiangrui Meng] make section titles consistent
      e605dd6 [Xiangrui Meng] add LIBSVM loader
      f639674 [Xiangrui Meng] add python section to migration guide
      c82ffb4 [Xiangrui Meng] clean optimization
      31660eb [Xiangrui Meng] update linear algebra and stat
      0a40837 [Xiangrui Meng] first pass over linear methods
      1fc8271 [Xiangrui Meng] update toc
      906ed0a [Xiangrui Meng] add a python example to naive bayes
      5f0a700 [Xiangrui Meng] update collaborative filtering
      656d416 [Xiangrui Meng] update mllib-clustering
      86e143a [Xiangrui Meng] remove data types section from main page
      8d1a128 [Xiangrui Meng] move part of linear algebra to data types and add Java/Python examples
      d1b5cbf [Xiangrui Meng] merge master
      72e4804 [Xiangrui Meng] one pass over tree guide
      64f8995 [Xiangrui Meng] move decision tree guide to a separate file
      9fca001 [Xiangrui Meng] add first version of linear algebra guide
      53c9552 [Xiangrui Meng] update dependencies
      f316ec2 [Xiangrui Meng] add migration guide
      f399f6c [Xiangrui Meng] move linear-algebra to dimensionality-reduction
      182460f [Xiangrui Meng] add guide for naive Bayes
      137fd1d [Xiangrui Meng] re-organize toc
      a61e434 [Xiangrui Meng] update mllib's toc
      26d35f3f