Momentum | Uses an exponential moving average of the current and previous gradients in place of the raw gradient |
Adagrad | Accumulates the sum of the squared current and previous gradients and uses its square root in the divisor of lr |
RMSProp | Uses an exponential moving average of the squared current and previous gradients and uses its square root in the divisor of lr |
Adam | 1. Uses an exponential moving average of squared gradients, as in RMSProp, with its square root in the divisor of lr 2. Uses an exponential moving average of the current and previous gradients as the multiplier of lr |
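
The four rules in the table can be sketched for a single scalar weight as below. This is only an illustration, not any library's implementation: the function names, the `state` dictionary, the default hyperparameters, and the small `eps` added to the divisor are all assumptions made for the example. The Adam sketch also applies the bias correction from the standard formulation, which the table does not mention.

```python
import math

def momentum_step(w, g, state, lr=0.1, beta=0.9):
    # Exponential moving average of current and previous gradients.
    state["m"] = beta * state.get("m", 0.0) + (1 - beta) * g
    return w - lr * state["m"]

def adagrad_step(w, g, state, lr=0.1, eps=1e-8):
    # Running sum of squared gradients; its sqrt divides lr.
    state["s"] = state.get("s", 0.0) + g * g
    return w - lr * g / (math.sqrt(state["s"]) + eps)

def rmsprop_step(w, g, state, lr=0.1, beta=0.9, eps=1e-8):
    # Exponential moving average of squared gradients; its sqrt divides lr.
    state["v"] = beta * state.get("v", 0.0) + (1 - beta) * g * g
    return w - lr * g / (math.sqrt(state["v"]) + eps)

def adam_step(w, g, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    # Step counter, needed for the bias correction (assumed, see lead-in).
    t = state["t"] = state.get("t", 0) + 1
    # Average of gradients (multiplier) and of squared gradients (divisor).
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * g
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = state["v"] / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def minimize(step, w=5.0, steps=200):
    # Hypothetical driver: minimize f(w) = w**2, whose gradient is 2 * w.
    state = {}
    for _ in range(steps):
        w = step(w, 2.0 * w, state)
    return w

if __name__ == "__main__":
    for step in (momentum_step, adagrad_step, rmsprop_step, adam_step):
        print(step.__name__, minimize(step))
```

Running the script prints the final weight each rule reaches on the toy problem, which makes it easy to see how the divisor and multiplier of lr change the step size from one optimizer to the next.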