Comparing NN, Decision Tree and Random Forest
In addition to the Neural Network solution I explained in the previous article, I also tried other algorithms: Decision Tree and Random Forest.
| Algorithm | Accuracy |
| --- | --- |
| Neural Network | 96.63% |
| Random Forest | 96.96% |
| Decision Tree | 96.76% |
| Refined Decision Tree | 97.64% |
Refine decision tree?
One of the conclusions of all my ML tests from the previous article is how complex it is to choose the parameters needed by each model. Fortunately, a friend suggested a solution, “GridSearchCV”, which tests various parameter combinations for an algorithm and finds the best ones.
The algorithm I called “refined decision tree” is a decision tree built with the best parameters “GridSearchCV” found.
# Now let's try to refine the Decision Tree by trying several parameters
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

aGridSearchParams = {
    'max_features': [None, 'sqrt'],
    'max_depth': [3, 5, 10, None],
    'min_samples_leaf': [1, 2, 3, 5, 10],
    'min_samples_split': [2, 4, 8, 16],
    'max_leaf_nodes': [10, 20, 50, 100, 500, 1000],
}

# Instantiate the grid and fit it with the training data
aGridSearchResult = GridSearchCV(DecisionTreeClassifier(), aGridSearchParams, cv=5)
aGridSearchResult.fit(atrainDataX, atrainDataY)

# Let's see how good it is
aDecisionTreeRefinedPrediction = aGridSearchResult.best_estimator_.predict(atestDataX)
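As a sketch of what GridSearchCV gives back once the fit is done, here is a minimal, self-contained example. It uses a small synthetic dataset and a trimmed-down grid instead of the article's `atrainDataX`/`atrainDataY`, so the numbers are illustrative only:

```python
# Minimal sketch: inspect the winning parameters of a GridSearchCV run.
# Synthetic data stands in for the article's training set.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

aGridSearchParams = {'max_depth': [3, 5, None],
                     'min_samples_leaf': [1, 2, 5]}
aGridSearchResult = GridSearchCV(DecisionTreeClassifier(random_state=0),
                                 aGridSearchParams, cv=5)
aGridSearchResult.fit(X, y)

# The best parameter combination and its mean cross-validated accuracy
print(aGridSearchResult.best_params_)
print(aGridSearchResult.best_score_)
```

`best_estimator_` is the tree already refit on the full training data with those parameters, which is what the “refined decision tree” uses for predictions.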
I used both the Neural Network and the “refined decision tree” in the application to compare them, and noticed that the neural network was slightly better. For example:
root - INFO - Checking if group should be refresh by calling ML with: [1, 47, 10, 0]
root - INFO - We found 1 new message and the ML probability were NN: [[0.11245716]], DT[[1. 0.]]
When trying to predict whether a group with the following characteristics has new messages:
- Latest refresh done 1 day ago
- 47 users in the chat room
- Latest message in the group was posted 10 days ago
- 0 messages posted in the last week in the chatroom
The Neural Network predicted an 11% probability of new messages while the “refined decision tree” predicted a 0% chance. We did find new messages in the room, so the “refined decision tree” missed them entirely, which is the kind of error we want to avoid at all costs. I'll just stick to the neural network for now.
Optimizer change on NN
When training the Neural Network, I sometimes end up with very poor results. The model seems stuck and always predicts the same output:
NN atestDataYPredictedKeras: [0.05339569 0.05339569 0.05339569 ... 0.05339573 0.05339573 0.05339573]
Even though the test set is composed of around 1,500 rows.
NN len(atestDataYPredictedKeras): 1484
This happens from time to time and usually goes away if I retrain the model; nevertheless, it makes the final results very bad if I don't check the training results every time.
Luckily, I’m not the only one facing this issue 😉 according to this GitHub ticket.
I tried some of the suggestions proposed on the page, and the one that seemed to work best was to change the optimizer from SGD to Adam. After some reading, I decided to go with it, since Adam seems to be a good choice according to the ML community. This YouTube video explains some of the possible optimizer algorithms and also suggests Adam as the default choice. Nevertheless, like every topic/parameter in ML, you can always find arguments for the opposite, like this article:
“We construct an illustrative binary classification problem where the data is linearly separable, GD and SGD achieve zero test error, and AdaGrad, Adam, and RMSProp attain test errors arbitrarily close to half.”
I will still stick to Adam for now, since it fixes my original issue with the same accuracy and a smaller loss:
aKerasNnModel.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
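For some intuition on why the switch helped: Adam keeps running estimates of each parameter's gradient mean and variance, so every parameter gets its own adaptively scaled step, whereas plain SGD applies one global learning rate to gradients of wildly different magnitudes. A minimal NumPy sketch of a single Adam update, assuming the usual default hyperparameters (lr=0.001, beta1=0.9, beta2=0.999); this is an illustration of the update rule, not Keras's internal code:

```python
# Sketch of one Adam update step with default hyperparameters.
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad       # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2  # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, 1.0])
m = np.zeros_like(w)
v = np.zeros_like(w)

# One step against gradients of very different scales: both parameters
# still move by roughly the same lr-sized step, because each step is
# normalized by that parameter's own gradient magnitude.
w, m, v = adam_step(w, np.array([100.0, 0.001]), m, v, t=1)
print(w)
```

That per-parameter normalization is what makes a stalled, constant-output model less likely than with a single fixed SGD learning rate.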
These are the 2 topics I wanted to follow up on 😉