mirror of
https://github.com/pese-git/llm-arch-research.git
synced 2026-01-23 21:10:54 +00:00
- docs: rewrite and expand docstrings for GELU class and method (motivation, math formula, smoother ReLU for Transformers, usage, references) - test: add dedicated tests for GELU (output shape, dtype, comparison with torch GELU, monotonicity, gradients, large/small value behavior) - fix: align numerical test to allow for minor approximation difference vs PyTorch gelu This update makes the GELU module more transparent and robust for deep learning practitioners and researchers.