One of the business problems we encountered a while back involved identifying customers who would subscribe to a company’s next best product.
There was a tremendous amount of customer data and a plethora of information hidden in it – both useful and not useful.
As always, we followed the BADIR steps. We were already past the first stage, forming the Business Question, and the next steps were the Analysis Plan and Data Collection.
Once we finished steps 2 and 3, we had a good set of data, both extracted and engineered, that, according to our hypothesis, would be relevant in identifying customers.
Since we had over a million rows of data, and neural networks thrive on large amounts of data, we decided to get our hands dirty and train a deep neural network.
Neural Networks? What are they?
So what are Neural Networks in the first place?
The biological inspiration to build neural networks comes from the brain, specifically the neural connections.
This structure inspires the building blocks of neural networks. A simple neural network unit, called a perceptron, makes this clearer.
Three things are happening in a perceptron:
1. Each input x_i is multiplied by its corresponding weight w_i (weights are also referred to as parameters).
2. The products of the input and weight pairs are summed (the transfer function): z = Σ x_i · w_i.
3. The sum is transformed by a function called the activation function f, and the result is passed on: output = f(z − θ).
Here θ is called the threshold. More commonly it is written as a bias term b = −θ that is added to the sum, so that output = f(z + b).
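The three steps can be sketched in a few lines of Python. This is only an illustration, not the code we used: the function names `perceptron`, `sigmoid` and `step` are our own, and we assume the bias convention, where the bias is added to the sum (equivalent to subtracting a threshold θ = −bias).

```python
import math

def sigmoid(z):
    # A common smooth activation function; squashes any number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def step(z):
    # The classic threshold activation: fire (1) if the input is non-negative.
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias, activation):
    # Step 1: multiply each input by its corresponding weight.
    products = [x * w for x, w in zip(inputs, weights)]
    # Step 2: sum the products (the transfer function).
    total = sum(products)
    # Step 3: add the bias term and apply the activation function.
    return activation(total + bias)

# Illustrative values only: 1.0 * 0.5 + 2.0 * (-0.25) = 0.0
output = perceptron([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)

# With a step activation and a bias of -1.5 (a threshold of 1.5),
# the unit behaves like a logical AND of two 0/1 inputs.
and_on = perceptron([1, 1], [1, 1], -1.5, step)
and_off = perceptron([1, 0], [1, 1], -1.5, step)
```

Swapping `sigmoid` for `step` changes only the last step; the weighted sum stays the same, which is why the activation function can be chosen separately for each layer.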
The neural networks that we use involve many such perceptron units. Each perceptron unit corresponds to a node, and a group of nodes makes up a layer in the network.
If we keep increasing layers of nodes, the network grows in depth. Hence the term deep neural network.
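To make the idea of nodes and layers concrete, here is a minimal sketch of a forward pass through a small stack of layers. Everything in it is an assumption made for illustration: the helper names (`make_layer`, `layer_forward`, `network_forward`), the sigmoid activation, the random initial weights, and the toy layer sizes. Training the weights is not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_inputs, n_nodes, rng):
    # Each node has one weight per input plus one bias term.
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_nodes)]
    biases = [0.0] * n_nodes
    return weights, biases

def layer_forward(inputs, layer):
    # Every node in the layer sees the same inputs and produces one output.
    weights, biases = layer
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, node_weights)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    # The outputs of one layer become the inputs of the next.
    activations = inputs
    for layer in layers:
        activations = layer_forward(activations, layer)
    return activations

rng = random.Random(0)
sizes = [4, 3, 2, 1]  # 4 inputs, two hidden layers (3 and 2 nodes), 1 output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
prediction = network_forward([0.2, -0.1, 0.4, 0.9], layers)
```

Adding another entry to `sizes` adds another layer, which is all that "going deeper" means structurally.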
What is the right architecture?
Well, there is no one correct answer. It all depends on the patterns in the data and on finding the architecture that gives the best results. Still, you have to start somewhere, so we did some brainstorming and came up with a few suggestions that can provide guidance.
A few pointers to keep in mind before you start building your neural network –
– Increasing the depth (adding more layers) often works better than adding more nodes to a layer.
– Start with a simple architecture, then increase the complexity if you think the data patterns are not being captured adequately.
– Try to keep the ratio of the number of rows to the number of parameters above 50. Therefore, if you don’t have a large dataset, don’t build a complex architecture.
– In most architectures, the number of nodes decreases from one layer to the next.
For example, if you have a dataset with about 100,000 rows and about 40 columns, you could build a neural network with two hidden layers: the first with 30 nodes and the second with 20 nodes. Counting weights and biases, and assuming a single output node, the network has about 1,870 parameters (40 × 30 + 30 = 1,230 for the first layer, 30 × 20 + 20 = 620 for the second, and 20 + 1 = 21 for the output), so the ratio of the number of rows to the number of parameters is about 53, which is above 50.
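The arithmetic behind this example is easy to check in code. The sketch below assumes a fully connected network in which every node has one weight per input plus one bias, all 40 columns are used as inputs, and there is a single output node; `count_parameters` is a name we made up for illustration.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs=1):
    # Layer sizes from input to output, e.g. [40, 30, 20, 1].
    sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        # n_in weights per node, plus one bias per node.
        total += n_in * n_out + n_out
    return total

n_rows = 100_000
n_params = count_parameters(40, [30, 20])  # 1230 + 620 + 21
ratio = n_rows / n_params

print(f"parameters: {n_params}, rows per parameter: {ratio:.1f}")
```

Trying a wider first layer, for example `count_parameters(40, [60, 20])`, shows how quickly the parameter count grows and the ratio drops below the rule of thumb.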
Apart from this, there are many other things to consider while designing a neural network: the activation function to use for each layer, the algorithm used to train the network, the loss function to optimize during training, and so on. We will cover these topics in the next few blog posts.
A much more detailed explanation and discussion are available in our Predictive Analytics course, where we discuss the BADIR framework, neural networks, machine learning algorithms, and how to solve business problems using Data Science.