Thursday, May 7, 2026

Normalization Algorithms in Machine Learning

1. Feature Scaling (Traditional ML)

Technique | Description | Formula / Key Point | Best Used When
Min-Max Normalization | Scales data to [0, 1] (or [a, b]) | X' = (X - min) / (max - min) | Bounded data, neural networks
Standardization (Z-score) | Rescales to mean 0, standard deviation 1 | X' = (X - μ) / σ | Gaussian-like data, linear models
Robust Scaling | Uses the median and IQR, so it is robust to outliers | X' = (X - median) / IQR | Data with outliers
MaxAbs Scaling | Scales by the maximum absolute value | X' = X / max(|X|) | Sparse data
Mean Normalization | Centers data around zero | X' = (X - mean) / (max - min) | Less common in practice
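
The first three scalers map directly onto scikit-learn classes. A minimal sketch, assuming scikit-learn and NumPy are installed (the toy matrix, with one outlier in the second column, is made up for illustration):

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler

# Toy feature matrix; the second column contains one large outlier (200).
X = np.array([[1.0,   3.0],
              [2.0,   4.0],
              [3.0,   5.0],
              [4.0,   6.0],
              [5.0, 200.0]])

print(MinMaxScaler().fit_transform(X))    # each column rescaled to [0, 1]
print(StandardScaler().fit_transform(X))  # each column to mean 0, std 1
print(RobustScaler().fit_transform(X))    # (X - median) / IQR, per column

The outlier compresses the Min-Max output of the second column toward 0 and clusters its standardized values near -0.5, while Robust Scaling still spreads the typical values over roughly [-1, 0.5], which is exactly the trade-off the table describes.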

2. Normalization for Vectors / Features

Technique | Description | Formula | Use Case
L2 Normalization (Euclidean) | Most common vector normalization; unit Euclidean length | X' = X / ||X||₂ | Distance-based algorithms, neural networks
L1 Normalization (Manhattan) | Scales so the absolute values sum to 1 | X' = X / ||X||₁ | Sparse data, feature importance
Max Normalization | Divides by the largest absolute value in the vector | X' = X / max(|X|) | Simple scaling of feature vectors
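
Each of these is a one-liner in NumPy; a minimal sketch:

import numpy as np

x = np.array([3.0, -4.0, 0.0, 12.0])

l2 = x / np.linalg.norm(x, ord=2)  # unit Euclidean length (||x||₂ = 13 here)
l1 = x / np.linalg.norm(x, ord=1)  # absolute values now sum to 1
mx = x / np.max(np.abs(x))         # largest absolute entry becomes ±1

For a whole matrix, scikit-learn's sklearn.preprocessing.normalize(X, norm='l2') (or 'l1' / 'max') applies the same operation row by row.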

3. Deep Learning Normalization Layers

Layer | Year | Key Idea | Main Advantage | Common Use Cases
Batch Normalization (BatchNorm) | 2015 | Normalizes across the batch dimension | Accelerates training | CNNs (ResNet, etc.)
Layer Normalization (LayerNorm) | 2016 | Normalizes across features, per sample | Works with variable batch sizes | Transformers
Instance Normalization | 2016 | Normalizes per sample, per channel | Removes instance-specific contrast | Style transfer, StyleGAN, artistic tasks
Group Normalization | 2018 | Normalizes within groups of channels | Good for small batch sizes | Object detection
RMS Normalization (RMSNorm) | 2019 | Rescales by the root mean square, without mean centering | Simpler and faster than LayerNorm | Modern LLMs (Llama, etc.)
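
All of these ship as PyTorch layers except RMSNorm, which only appears in recent PyTorch releases, so the sketch below hand-rolls it from the formula. A minimal sketch; the tensor shapes are made up for illustration:

import torch
import torch.nn as nn

x_img = torch.randn(8, 64, 32, 32)  # (batch, channels, height, width)
x_seq = torch.randn(8, 128, 512)    # (batch, tokens, features)

print(nn.BatchNorm2d(64)(x_img).shape)     # stats over batch + spatial dims, per channel
print(nn.InstanceNorm2d(64)(x_img).shape)  # stats per sample, per channel
print(nn.GroupNorm(8, 64)(x_img).shape)    # stats within groups of 8 channels
print(nn.LayerNorm(512)(x_seq).shape)      # stats over the feature dim, per token

class RMSNorm(nn.Module):
    """Rescale by the root mean square of the features; no mean subtraction."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x / rms

print(RMSNorm(512)(x_seq).shape)  # same shape, rescaled per token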

4. Other Specialized Normalization Techniques

  • Quantile Normalization — Makes distributions identical across samples (popular in bioinformatics; see the sketch after this list)
  • Local Response Normalization (LRN) — Used in early CNNs like AlexNet
  • Weight Normalization — Reparameterizes weights instead of activations
  • Spectral Normalization — Constrains weight matrices for stable GAN training
  • Batch Renormalization — Improved and more stable version of BatchNorm
  • Filter Response Normalization (FRN) — Batch-independent normalization
  • Power Transform (Yeo-Johnson / Box-Cox) — Makes data more Gaussian-like
  • Contrast Normalization — Used in computer vision preprocessing
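
Most of these live inside specific toolchains, but two are easy to demonstrate: quantile normalization (hand-rolled in NumPy below, ignoring tie-breaking subtleties) and the Yeo-Johnson power transform via scikit-learn's PowerTransformer. A minimal sketch:

import numpy as np
from sklearn.preprocessing import PowerTransformer

def quantile_normalize(X):
    """Force every column of X onto the same distribution:
    replace each value with the mean of the values at its rank."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # rank of each entry within its column
    means = np.sort(X, axis=0).mean(axis=1)            # mean value at each rank, across columns
    return means[ranks]

X = np.random.lognormal(size=(100, 3))                        # skewed toy data
Xq = quantile_normalize(X)                                    # columns now share one distribution
Xg = PowerTransformer(method="yeo-johnson").fit_transform(X)  # more Gaussian-like

After quantile normalization every column contains exactly the same sorted values, which is the "identical distributions" property described above.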

5. Quick Recommendation Guide

Scenario | Recommended Technique
Classical ML (SVM, KNN, etc.) | Standardization or Robust Scaling
Neural networks with small batches | LayerNorm or GroupNorm
Large-batch CNNs | BatchNorm
Transformers / large language models | RMSNorm or LayerNorm
Data with outliers | Robust Scaling
Images (style-related tasks) | Instance Normalization
