  • 07d72fe6
    Decision Tree documentation for MLlib programming guide · 07d72fe6
    Manish Amde authored
    Added documentation on using the decision tree algorithms for classification and regression in the Spark 1.0 release.
    
    Apart from a general review, I need specific input on the following:
    * I had to move a lot of the existing documentation under the *linear methods* umbrella to accommodate decision trees. I wonder if there is a better way to organize the programming guide given we are so close to the release.
    * I have not looked closely at pyspark, but I am wondering whether new mllib algorithms are automatically plugged in or whether we need to do some extra work to call mllib functions from pyspark. I will add to the pyspark examples based on the advice I get.
    
    cc: @mengxr, @hirakendu, @etrain, @atalwalkar
    
    Author: Manish Amde <manish9ue@gmail.com>
    
    Closes #402 from manishamde/tree_doc and squashes the following commits:
    
    022485a [Manish Amde] more documentation
    865826e [Manish Amde] minor: grammar
    dbb0e5e [Manish Amde] minor improvements to text
    b9ef6c4 [Manish Amde] basic decision tree code examples
    6e297d7 [Manish Amde] added subsections
    f427e84 [Manish Amde] renaming sections
    9c0c4be [Manish Amde] split candidate
    6925275 [Manish Amde] impurity and information gain
    94fd2f9 [Manish Amde] more reorg
    b93125c [Manish Amde] more subsection reorg
    3ecb2ad [Manish Amde] minor text addition
    1537dd3 [Manish Amde] added placeholders and some doc
    d06511d [Manish Amde] basic skeleton