Thursday, May 27, 2021

In SGD one sample is one batch

I was confused about SGD, and many resources on the net added to that confusion. In statistics, the term "stochastic" refers to a random sample drawn from a larger set, so it is easy to assume that SGD is faster simply because it randomly picks one sample out of a batch. That is true as far as it goes, but the key point is that in practice SGD applies the gradient immediately after each sample is processed: SGD treats each individual sample as a batch of size one. Jason Brownlee cleared this up for me when I asked him a question on his blog. Many thanks to Jason!

https://machinelearningmastery.com/gentle-introduction-mini-batch-gradient-descent-configure-batch-size/#comment-609705
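To make the distinction concrete, here is a minimal sketch of "pure" SGD on a toy one-parameter regression problem. The toy data, learning rate, and variable names are my own for illustration; the point is only that the weight is updated right after every single sample, i.e. each sample acts as its own batch.

```python
import numpy as np

# Toy data: y = 3x + noise (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)

w = 0.0    # single weight to learn
lr = 0.1   # learning rate

# Pure SGD: every sample is a batch of size 1, so the weight is
# updated immediately after each sample is processed.
for epoch in range(5):
    indices = rng.permutation(len(X))        # shuffling is the "stochastic" part
    for i in indices:
        pred = w * X[i]
        grad = 2 * (pred - y[i]) * X[i]      # gradient of squared error for this one sample
        w -= lr * grad                       # apply the gradient right away

print(f"learned w = {w:.3f} (true value is 3.0)")
```

Mini-batch gradient descent would instead average the gradients over, say, 32 samples before applying a single update; batch gradient descent would average over the whole dataset. SGD is just the batch-size-1 end of that spectrum.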

