mirror of
https://github.com/pese-git/llm-arch-research.git
synced 2026-01-23 21:10:54 +00:00
- Covers forward pass with and without KV-cache
- Verifies correct sequence generation for greedy, top-k, and top-p sampling
- Adds ValueError test for exceeding max sequence length
- Uses small random toy config and minimal setup for fast test feedback

Motivation: Prevent regressions in decoding, sampling, and KV-cache logic in GPT2 implementation.
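
For readers unfamiliar with the decoding modes named above, the following is a minimal standard-library sketch of greedy, top-k, and top-p selection plus a sequence-length guard. It is not the repository's GPT2 code: `toy_logits`, `generate`, `top_k_filter`, `top_p_filter`, and all parameter names are hypothetical, the exact condition that raises ValueError in the real model is assumed, and KV-caching is not modelled here.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


def toy_logits(tokens, vocab_size=5):
    """Hypothetical stand-in for a model: favours (last + 1) % vocab_size."""
    logits = [0.0] * vocab_size
    logits[(tokens[-1] + 1) % vocab_size] = 3.0
    logits[(tokens[-1] + 2) % vocab_size] = 2.0
    return logits


def generate(next_logits, prompt, max_new_tokens, max_seq_len,
             do_sample=False, top_k=None, top_p=None, rng=None):
    """Append max_new_tokens tokens to prompt, greedily or by sampling."""
    # Assumed guard: refuse requests that would exceed the context window.
    if len(prompt) + max_new_tokens > max_seq_len:
        raise ValueError("requested sequence exceeds max_seq_len")
    rng = rng or random.Random(0)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(next_logits(tokens))
        if not do_sample:
            # Greedy decoding: always take the most probable token.
            tokens.append(max(range(len(probs)), key=probs.__getitem__))
            continue
        if top_k is not None:
            probs = top_k_filter(probs, top_k)
        if top_p is not None:
            probs = top_p_filter(probs, top_p)
        tokens.append(rng.choices(range(len(probs)), weights=probs)[0])
    return tokens
```

With this toy scorer, sampling with `top_k=1` or a very small `top_p` collapses to the greedy result, which is a cheap determinism check of the kind a regression test can rely on.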