llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-01-23 21:10:54 +00:00

Sergey Penkovsky 3bc2848cf0 refactor: unify CachedDecoder implementation across models

- Completely removed duplicate CachedDecoder from llama.py
- Modified core CachedDecoder to support dependency injection:
  - Added feed_forward_layer parameter (required)
  - Added norm_layer parameter with LayerNorm default
  - Added rope parameter for RoPE support
  - Removed unused activation parameter
- Updated GPT2 to use new CachedDecoder with FeedForward
- Updated LLaMA to use new CachedDecoder with SwiGLU and RMSNorm
- Fixed constructor parameter order so that required parameters precede those with defaults, as Python syntax requires

This removes the duplicated decoder code while keeping each model's architecture-specific layers (feed-forward, normalization, RoPE) configurable through dependency injection.
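
To illustrate the constructor shape described above, here is a minimal, dependency-free sketch. It is not the repository's code: the class name CachedDecoderSketch, the toy feed-forward functions, and placeholder_rope are hypothetical, and attention and the KV cache are left out entirely. Only the parameter names feed_forward_layer, norm_layer, and rope come from the commit message. The real implementation presumably builds on PyTorch modules; plain Python lists are used here so the example runs on its own.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]
Layer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """LayerNorm without learned scale/shift: zero mean, unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def rms_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """RMSNorm without learned scale: divide by the root mean square."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def toy_gelu_ffn(x: Vector) -> Vector:
    """Toy stand-in for a GELU feed-forward layer (no weights)."""
    return [0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x]


def toy_swiglu_ffn(x: Vector) -> Vector:
    """Toy stand-in for SwiGLU (no weights): silu(v) gates v itself."""
    return [(v / (1.0 + math.exp(-v))) * v for v in x]


def placeholder_rope(x: Vector) -> Vector:
    """Hypothetical placeholder; a real RoPE rotates query/key vectors."""
    return x


class CachedDecoderSketch:
    """Hypothetical stand-in for a shared decoder block with injected layers."""

    def __init__(
        self,
        feed_forward_layer: Layer,       # required, so it comes first
        norm_layer: Layer = layer_norm,  # defaulted parameters follow
        rope: Optional[Layer] = None,    # None means no rotary embeddings
    ) -> None:
        self.feed_forward = feed_forward_layer
        self.norm = norm_layer
        # In a full block, rope would be handed to the attention layer.
        # Attention is omitted in this sketch, so rope is only stored.
        self.rope = rope

    def __call__(self, x: Vector) -> Vector:
        # Pre-norm residual feed-forward step; attention and cache omitted.
        h = self.feed_forward(self.norm(x))
        return [a + b for a, b in zip(x, h)]


# GPT-2 style: default LayerNorm, GELU-like feed-forward, no RoPE.
gpt2_block = CachedDecoderSketch(feed_forward_layer=toy_gelu_ffn)

# LLaMA style: RMSNorm, SwiGLU-like feed-forward, RoPE supplied.
llama_block = CachedDecoderSketch(
    feed_forward_layer=toy_swiglu_ffn,
    norm_layer=rms_norm,
    rope=placeholder_rope,
)
```

Both blocks share one class and differ only in what is passed to the constructor, which is the point of the refactor. Omitting feed_forward_layer raises a TypeError, since it has no default.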

2025-10-06 14:57:29 +03:00