Predictive Analytics in Commerce


More Personal to the Customer

Step 1. Build the best possible churn model

It is time to create your own churn model. Start by creating a random forest on a subset of the data, and then (just like Nienke recommends) build a random forest on the whole dataset (be aware that this might be more time-consuming).

Do you get similar results when you use your whole dataset? And do the results seem logical to you?

Step 2. Elevator pitch

Set up a short presentation (an elevator pitch) to convince your boss to set up a retention campaign based on the churn model you just built. Mention the work you have done, what problem you are trying to solve and how you think you can solve it.

Need a little guidance on how to set up a good elevator pitch? Watch this video and you are ready to go:

Best possible churn model / Elevator pitch


Building the best possible churn model on the whole dataset was indeed more time-consuming than building the model on 10% of the customer base, but more confidence in the results is in most cases worth it! :)

The biggest difference I noticed in variable importance is the rise of "Age" as an important variable in the model. Otherwise, the most important variables were mostly the same, in a slightly different order.
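A comparison like this can be made a bit more systematic by ranking the variables in both models and looking at how far each one moved. The snippet below is only a minimal sketch: the function name `rank_shift`, the variable names other than age, and all importance values are made up for illustration and are not the actual output of my models.

```python
def rank_shift(importance_sample, importance_full):
    """Return how many rank positions each variable gained in the full model.

    A positive number means the variable is ranked higher (more important)
    in the full-data model than in the sample model.
    """
    def ranks(importance):
        # Rank 1 = most important variable.
        ordered = sorted(importance, key=importance.get, reverse=True)
        return {name: position for position, name in enumerate(ordered, start=1)}

    rank_sample = ranks(importance_sample)
    rank_full = ranks(importance_full)
    return {
        name: rank_sample[name] - rank_full[name]
        for name in rank_sample
        if name in rank_full
    }


# Hypothetical importance scores (NOT real results from the assignment).
sample_model = {"tenure": 0.30, "monthly_spend": 0.28, "complaints": 0.22,
                "region": 0.12, "age": 0.08}
full_model = {"monthly_spend": 0.27, "tenure": 0.26, "age": 0.21,
              "complaints": 0.17, "region": 0.09}

shifts = rank_shift(sample_model, full_model)
for name, shift in sorted(shifts.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {shift:+d}")
```

With these invented numbers, age climbs two places while the other variables move by at most one, which is the kind of pattern described above.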

The R² value was higher when testing on the whole dataset, which is good. But I reckon I need to study more to learn where the thresholds lie for what one can or would deem a significant increase or decrease, not only for R² but for most results from the model. At this early stage of my learning curve in Predictive Analytics, I find myself simply going by what was said in the videos and generalizing it. It will be quite interesting to learn more about how model results can be interpreted based on different factors.

The predictive power of the model was still good when tested on the full dataset. For this final model I ended up using the 10% of customers with the highest predicted probability of churning as test subjects.
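Selecting that top 10% comes down to sorting customers by their predicted churn probability and keeping the highest-scoring tenth. The following is a minimal sketch in plain Python, assuming the model output is available as a mapping from customer ID to probability. The function name `top_fraction`, the customer IDs, and the probabilities are all hypothetical and only serve as an example.

```python
def top_fraction(scores, fraction=0.10):
    """Return the IDs of the highest-scoring fraction of customers.

    scores   -- dict mapping customer ID to predicted churn probability
    fraction -- share of customers to select (0.10 = top 10%)
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    # Always select at least one customer, even for very small datasets.
    n_selected = max(1, round(len(scores) * fraction))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [customer_id for customer_id, _ in ranked[:n_selected]]


# Hypothetical churn probabilities for 20 customers (illustrative only).
churn_probabilities = {
    "cust_01": 0.12, "cust_02": 0.81, "cust_03": 0.35, "cust_04": 0.07,
    "cust_05": 0.64, "cust_06": 0.22, "cust_07": 0.93, "cust_08": 0.41,
    "cust_09": 0.18, "cust_10": 0.55, "cust_11": 0.09, "cust_12": 0.73,
    "cust_13": 0.28, "cust_14": 0.47, "cust_15": 0.03, "cust_16": 0.60,
    "cust_17": 0.15, "cust_18": 0.38, "cust_19": 0.51, "cust_20": 0.26,
}

campaign_targets = top_fraction(churn_probabilities, fraction=0.10)
print(campaign_targets)
```

With 20 customers, the top 10% is the two customers with the highest scores. Ranking by probability rather than using a fixed cut-off keeps the size of the retention campaign predictable.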

