- docs: rewrite and expand docstrings for GELU class and method (motivation, math formula, smoother ReLU for Transformers, usage, references)
- test: add dedicated tests for GELU (output shape, dtype, comparison with torch GELU, monotonicity, gradients, large/small value behavior)
- fix: align numerical test to allow for minor approximation difference vs PyTorch gelu
This update makes the GELU module more transparent and robust for deep learning practitioners and researchers.