ASA Connect

  • 1.  Classification Tree

    Posted 06-10-2018 03:36
    Hi,
    I apologize in advance if the following question does not make sense. 

    How can one ensure that a regression tree (method = "anova" in the rpart package in R) isn't garbage? A classification tree (method = "class" in rpart), on the other hand, can be checked using a confusion matrix and by looking at sensitivity and specificity. How does one check the validity of a regression tree? Can we rely on the approximate R-squared by number of splits ("rsq.rpart" from the rpart package)?

    Thanks in advance,
    Mamun



    ------------------------------
    Md Abdullah Mamun
    PhD Student
    UNTHSC
    ------------------------------


  • 2.  RE: Classification Tree

    Posted 06-11-2018 15:25
    Your question doesn't make sense without any details -- what you tried,
    what the output was, and what you don't understand.

    In any case, you are posting this to the wrong list. Questions such as
    the one you posted, asking for interpretation or how-to advice on
    statistical topics, are best posted to https://stats.stackexchange.com/.

    There are other forums / discussion lists for R programming questions,
    but that's not what you are asking.

    -Michael

    --
    Michael Friendly Email: friendly AT yorku DOT ca
    Professor, Psychology Dept. & Chair, Quantitative Methods
    York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
    4700 Keele Street Web:http://www.datavis.ca
    Toronto, ONT M3J 1P3 CANADA




  • 3.  RE: Classification Tree

    Posted 06-13-2018 02:58
    Edited by Christian Graf 06-13-2018 03:02
    Quote from the home page of ASA community:
    ------------------------------------------------------------------
    "Welcome to ASA Connect: ASA's All member forum.
    Here you can:
    • Collaborate with other ASA members
    • Exchange resources and best practices
    • Discuss critical industry issues and receive input from members outside of your
    current communities
    • Network with other industry experts"
    ------------------------------------------------------------------
    Exchanging resources and best practices includes asking questions on any statistical topic - including R packages.
    I personally enjoy the discussions and answers around such questions in this community.

    Kind regards,
    Christian

    ------------------------------
    Christian Graf
    Dipl.-Math.
    Qualitaetssicherung & Statistik

    "To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of."

    Ronald Fisher in 'Presidential Address by Professor R. A. Fisher, Sc.D., F.R.S. Sankhyā: The Indian Journal of Statistics (1933-1960), Vol. 4, No. 1 (1938), pp. 14-17'
    ------------------------------



  • 4.  RE: Classification Tree

    Posted 06-11-2018 15:33
    Do you use a library to build pipelines, such as caret, or are you looking to do everything by hand? RMSE, R^2, MAE, etc. are the usual suspects when it comes to metrics for evaluating and comparing regression models.
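
    As a rough, language-agnostic illustration of those three metrics, here is a minimal Python sketch (not rpart or caret code). The function name regression_metrics and the numbers are made up for the example; in practice the predictions would come from the fitted tree, ideally on data it was not trained on.

```python
import math

def regression_metrics(actual, predicted):
    """Return RMSE, MAE and R^2 for paired observed/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)                # residual sum of squares
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    if sst == 0:
        raise ValueError("R^2 is undefined when the response is constant")
    return {
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - sse / sst,
    }

# Made-up observed and predicted values, purely for illustration.
observed = [3.0, 5.0, 7.0, 9.0]
fitted = [2.5, 5.5, 6.5, 9.5]
metrics = regression_metrics(observed, fitted)
print(metrics)
```

    RMSE and MAE are in the units of the response, so they are easiest to judge against the spread of the response itself; R^2 is unitless.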

    I've not used rpart directly in a while, but rpart has printcp; in its output, the rel error column is 1 - R^2. See https://cran.r-project.org/web/packages/rpart/rpart.pdf for more detail on it.
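
    To spell out that relationship, here is a sketch under the assumption that, for method = "anova", rel error is the tree's residual sum of squares on the training data divided by that of the root node (which predicts the overall mean); the rpart manual linked above is the authority on the exact definition.

```latex
\[
\text{rel error}
  = \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2},
\qquad
R^2 = 1 - \text{rel error}
\]
```

    Here y_i are the observed responses, \hat{y}_i the tree's fitted values, and \bar{y} the overall mean. Because this is computed on the training data, it can only improve as splits are added; as far as I recall, the xerror column in the same printcp output is the cross-validated counterpart and is the more honest guide.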

    Also, don't discount visual diagnostics of models. For rpart, there is a separate package I've used: rpart.plot

    It gives you rpart.plot(model). With each node you get the predicted value and the percentage of observations. If you are modeling something you have expertise in, the rules and breaks should make sense to you.

    ------------------------------
    Francois Dion
    Chief Data Scientist
    Dion Research LLC
    ------------------------------



  • 5.  RE: Classification Tree

    Posted 06-12-2018 01:42
    Typically the mean squared error (MSE) is used to evaluate a regression tree. Unfortunately, the MSE of a single model is not as easy to interpret as a confusion matrix. I have used it to compare several models with each other and choose the best one.
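
    A minimal Python sketch of that kind of comparison follows. Everything in it is hypothetical: the held-out responses, the predictions, and the model names are invented for illustration. Adding a baseline that always predicts the training mean is my own suggestion, not something from rpart; it gives the MSE of each model something to be measured against.

```python
def mse(actual, predicted):
    """Mean squared error of predictions against observed values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical held-out responses and the training-set mean (made up).
test_y = [8.0, 12.0, 9.0, 11.0]
train_mean = 10.0

# Hypothetical predictions on the held-out data from each candidate.
candidates = {
    "baseline (training mean)": [train_mean] * len(test_y),
    "small tree": [9.0, 11.0, 9.0, 11.0],
    "large tree": [8.5, 12.5, 8.0, 11.5],
}

scores = {name: mse(test_y, preds) for name, preds in candidates.items()}
best = min(scores, key=scores.get)  # candidate with the lowest held-out MSE

for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: MSE = {score:.4f}")
print("lowest MSE:", best)
```

    A tree whose held-out MSE is not clearly below the baseline's is adding little beyond predicting the mean.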

    Jill

    --
    Jill Lundell

    "A ship is safe in harbor, but that's not what ships are for." 
    William G.T. Shedd