In gradient descent-based logistic regression, every training sample is used to update the weights in each single iteration. Hence, when the training set is large, the whole training process becomes very time-consuming and computationally expensive, as we just witnessed in our last example.
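To make this concrete, here is a minimal sketch of one batch gradient descent step, assuming a NumPy feature matrix `X`, a label vector `y`, and a hypothetical helper name `update_weights_gd` (not taken from the earlier example); note that the gradient is computed over the entire training set before the weights move at all:

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping raw scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def update_weights_gd(X, y, weights, learning_rate):
    """One batch gradient descent step: all samples contribute before a single update."""
    predictions = sigmoid(np.dot(X, weights))
    # Gradient of the log-likelihood accumulated over the whole training set
    gradient = np.dot(X.T, y - predictions)
    weights += learning_rate / float(len(y)) * gradient
    return weights
```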
Fortunately, a small tweak makes logistic regression suitable for large datasets: for each weight update, only one training sample is consumed instead of the complete training set. The model takes a step based on the error calculated from a single training sample, and once all samples have been used, one iteration finishes. This advanced version of gradient descent is called stochastic gradient descent (SGD).
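A minimal sketch of one such iteration (epoch) follows, reusing the `sigmoid` helper from the sketch above; the function name `update_weights_sgd` is hypothetical, and the key difference from the batch version is that the weights move immediately after each individual sample:

```python
import numpy as np

def update_weights_sgd(X, y, weights, learning_rate):
    """One SGD epoch: the weights are updated after every single training sample."""
    for x_i, y_i in zip(X, y):
        prediction = sigmoid(np.dot(x_i, weights))
        # Gradient contributed by this one sample only
        gradient = x_i * (y_i - prediction)
        weights += learning_rate * gradient   # take a step right away
    return weights
```

Because each step uses just one sample, an epoch over n samples performs n cheap updates instead of one expensive one, which is what makes this variant attractive for large training sets.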