At the April 24th Deep Learning for Sciences, Engineering, and Arts Meetup, the following question was discussed: “Why, for binary classification, don’t we just pick some values to represent the two possible outcomes (e.g. 0 and 1) and use regression with a linear output and an MSE loss?”. I had the impression that the answers given were not entirely clear to everybody. I am therefore writing this short note, hoping that the arguments presented below will lead to a better understanding.
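To give a first taste of the issue, here is one common argument sketched numerically in Scala. This is an illustration on my part, not necessarily one of the arguments referred to above: with a linear output and an MSE loss, a point that lies on the correct side of the decision boundary but far from it is still heavily penalized, whereas the usual sigmoid output with a cross-entropy loss barely penalizes it.

```scala
import scala.math.{exp, log, pow}

// MSE on a raw linear output.
def mse(linearOutput: Double, target: Double): Double =
  pow(linearOutput - target, 2)

// Sigmoid + binary cross-entropy, the usual classification setup.
def sigmoid(z: Double): Double = 1.0 / (1.0 + exp(-z))

def crossEntropy(linearOutput: Double, target: Double): Double = {
  val p = sigmoid(linearOutput)
  -(target * log(p) + (1.0 - target) * log(1.0 - p))
}

// Target 1, linear output 5: the point is classified correctly and
// confidently, yet MSE still assigns it a large loss.
val mseLoss = mse(5.0, 1.0)          // 16.0
val ceLoss  = crossEntropy(5.0, 1.0) // ≈ 0.0067
```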
We saw in Part 4 how to build a decision tree predictor. We are now going to build a predictor for a very classic machine learning data set, the Iris data set.
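For reference, here is a minimal sketch of how the Iris data could be loaded into samples and labels in Scala. The file path, the comma-separated layout (four numeric features followed by the species name, as in the UCI distribution) and the integer label encoding are assumptions made for this illustration, not the code from the post.

```scala
import scala.io.Source

// Hypothetical loader for the UCI-style Iris CSV file.
// Assumes rows like: 5.1,3.5,1.4,0.2,Iris-setosa
def loadIris(path: String): (Vector[Vector[Double]], Vector[Int]) = {
  val source = Source.fromFile(path)
  try {
    val rows    = source.getLines().filter(_.trim.nonEmpty).toVector
    val samples = rows.map(_.split(",").take(4).toVector.map(_.toDouble))
    val labels  = rows.map(_.split(",").last.trim).map {
      case "Iris-setosa"     => 0
      case "Iris-versicolor" => 1
      case _                 => 2
    }
    (samples, labels)
  } finally source.close()
}
```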
We saw in Part 1 the basic structure of a decision tree. In Part 2 we created a class to handle the samples and labels of a data set. And in Part 3 we saw how to compute the leaves’ values to fit a data set. In this part, we are going to combine the previous results to build a decision tree predictor.
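As a rough illustration of what such a predictor can look like, prediction is simply a walk from the root down to a leaf. The names Tree, Leaf and Split below are illustrative, not the classes built in the series.

```scala
// Illustrative tree structure; prediction walks from the root to a leaf.
sealed trait Tree
case class Leaf(value: Double) extends Tree
case class Split(featureIndex: Int, threshold: Double, left: Tree, right: Tree) extends Tree

def predict(tree: Tree, sample: Vector[Double]): Double = tree match {
  case Leaf(value) => value
  case Split(i, threshold, left, right) =>
    if (sample(i) <= threshold) predict(left, sample)
    else predict(right, sample)
}
```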
We saw in Part 1 the basic structure of a decision tree, and in Part 2 we created a class to handle the samples and labels of a data set. We will now see how to compute the prediction values of the leaves to fit a data set.
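To give an idea of what is meant by the prediction values of the leaves, here is a minimal sketch, under the assumption of a squared-error criterion: the best constant prediction for the samples reaching a leaf is the mean of their labels, while for classification one would take the most frequent label instead. This is a generic illustration, not the exact computation from the post.

```scala
// Mean of the labels: the constant that minimises the squared error
// over the samples reaching the leaf.
def regressionLeafValue(labels: Vector[Double]): Double =
  labels.sum / labels.length

// Most frequent label: a natural leaf value for classification.
def classificationLeafValue(labels: Vector[Int]): Int =
  labels.groupBy(identity).maxBy(_._2.size)._1
```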
We saw in Part 1 the basic structure of a decision tree. We are now going to create a class to handle the samples and labels of a data set. This class will be used in the remaining parts of this series.
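A minimal sketch of what such a class could look like follows; the name DataSet and the split method are illustrative assumptions, not the exact class developed in the series.

```scala
// Hypothetical container pairing each feature vector with its label.
case class DataSet(samples: Vector[Vector[Double]], labels: Vector[Double]) {
  require(samples.length == labels.length, "one label per sample")

  def size: Int = samples.length

  // Partition the data set on one feature at a given threshold.
  def split(featureIndex: Int, threshold: Double): (DataSet, DataSet) = {
    val (left, right) =
      samples.zip(labels).partition { case (s, _) => s(featureIndex) <= threshold }
    (DataSet(left.map(_._1), left.map(_._2)),
     DataSet(right.map(_._1), right.map(_._2)))
  }
}
```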
Decision trees are simple to understand, yet they are the basic building block of many powerful Machine Learning algorithms such as Random Forest. This series of blog posts introduces the concept of a decision tree and provides basic Scala code for readers who want to understand it better and run some experiments of their own.
Pearson correlation, the most common type of correlation, is widely used in Data Science. However, incorrect conclusions are often drawn from a low or a high correlation. We will look at some counterexamples below, hoping that they help the reader keep some limitations of the Pearson correlation in mind.
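As a taste of the kind of counterexample discussed, here is a small numerical illustration in Scala (my own toy example, not necessarily one from the post): y depends perfectly on x, yet the Pearson correlation is essentially zero, because the relation is neither linear nor monotonic.

```scala
// Pearson correlation of two equal-length samples.
def pearson(xs: Vector[Double], ys: Vector[Double]): Double = {
  val mx  = xs.sum / xs.length
  val my  = ys.sum / ys.length
  val cov = xs.zip(ys).map { case (x, y) => (x - mx) * (y - my) }.sum
  val sx  = math.sqrt(xs.map(x => (x - mx) * (x - mx)).sum)
  val sy  = math.sqrt(ys.map(y => (y - my) * (y - my)).sum)
  cov / (sx * sy)
}

val xs = (-50 to 50).map(_ / 10.0).toVector // evenly spaced around 0
val ys = xs.map(x => x * x)                 // y is a deterministic function of x

val r = pearson(xs, ys) // ≈ 0: "no correlation" despite perfect dependence
```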