Explore how layer normalization and residual connections stabilize LLM training. Compare where the normalization sits (Pre-LN, Post-LN, Peri-LN) and which function is used (LayerNorm or RMSNorm), and how each choice affects convergence and efficiency.
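To make the comparison concrete, here is a minimal pure-Python sketch of the three placements around a single residual sublayer. It is an illustration under assumptions, not a reference implementation: learned gain and bias parameters are omitted, `toy_sublayer` is a hypothetical stand-in for attention or an MLP, and the Peri-LN form (normalizing both the input and the output of the sublayer) follows the way that scheme is commonly described.

```python
import math

def layer_norm(x, eps=1e-5):
    """LayerNorm without learned gain/bias: center, then divide by the std."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rms_norm(x, eps=1e-5):
    """RMSNorm without learned gain: divide by the root mean square, no centering."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def add(a, b):
    """Element-wise sum, used for the residual connection."""
    return [p + q for p, q in zip(a, b)]

def post_ln(x, sublayer, norm=layer_norm):
    # Post-LN: normalize after the residual addition.
    return norm(add(x, sublayer(x)))

def pre_ln(x, sublayer, norm=layer_norm):
    # Pre-LN: normalize only the sublayer input; the residual path is untouched.
    return add(x, sublayer(norm(x)))

def peri_ln(x, sublayer, norm=layer_norm):
    # Peri-LN (assumed form): normalize the sublayer input and its output.
    return add(x, norm(sublayer(norm(x))))

def toy_sublayer(x):
    # Hypothetical stand-in for attention or an MLP: scale every element by 3.
    return [3.0 * v for v in x]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    for name, block in [("post", post_ln), ("pre", pre_ln), ("peri", peri_ln)]:
        print(name, block(x, toy_sublayer), block(x, toy_sublayer, norm=rms_norm))
```

Passing `norm=rms_norm` swaps the normalization function without changing the placement, which is why RMSNorm can be combined with any of the three layouts. In this sketch only Post-LN rescales the residual stream itself; Pre-LN and Peri-LN leave `x` on the skip path unchanged.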