llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-05-16 10:09:42 +00:00

Sergey Penkovsky f30cd530a9 feat: add LLaMA model implementation with RoPE positional encoding

- Added LLaMA model architecture with RMSNorm and SwiGLU activation
- Implemented Rotary Positional Embeddings (RoPE), which encode token position by rotating query and key vectors inside attention
- Created training script for LLaMA with BPE tokenizer
- Fixed matplotlib dependency version in uv.lock
- Added LLaMA module initialization
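
For readers unfamiliar with RoPE, here is a minimal sketch of the rotation it applies. It is an illustration only, not the code from this commit: it uses NumPy rather than the repository's own framework, the function names `rope_angles` and `apply_rope` are hypothetical, and it assumes the interleaved-pair layout with the commonly used base of 10000 (some implementations split the vector into halves instead).

```python
import numpy as np

def rope_angles(seq_len, head_dim, base=10000.0):
    """Return cos/sin tables of shape (seq_len, head_dim // 2)."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # One frequency per feature pair: base^(-2i / head_dim)
    inv_freq = base ** (-np.arange(0, head_dim, 2) / head_dim)
    # angle[m, i] = position m times frequency i
    angles = np.outer(np.arange(seq_len), inv_freq)
    return np.cos(angles), np.sin(angles)

def apply_rope(x, cos, sin):
    """Rotate each adjacent feature pair of x, shape (seq_len, head_dim)."""
    x = np.asarray(x, dtype=float)
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    # Standard 2D rotation applied pair by pair
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

# Usage: rotate queries and keys before computing attention scores.
cos, sin = rope_angles(seq_len=8, head_dim=4)
q = apply_rope(np.ones((8, 4)), cos, sin)
k = apply_rope(np.ones((8, 4)), cos, sin)
scores = q @ k.T  # scores[m, n] depends only on the offset m - n
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends on their relative offset, which is the property attention relies on.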

The implementation includes:
- TokenEmbeddings, HeadAttention, MultiHeadAttention with RoPE support
- RMSNorm normalization layer
- SwiGLU feed-forward activation
- Cached decoder implementation for efficient generation
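
The two LLaMA-specific layers listed above can be sketched roughly as follows. This is a hedged NumPy illustration, not the commit's implementation: the names `rms_norm`, `silu`, and `swiglu_ffn`, the weight names, and the `eps` default are all assumptions chosen for the example.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square over the last axis."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """Gated feed-forward: (silu(x W_gate) * (x W_up)) W_down."""
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Usage with illustrative sizes (model width 8, hidden width 16).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
normed = rms_norm(x, weight=np.ones(8))
out = swiglu_ffn(
    normed,
    w_gate=rng.standard_normal((8, 16)),
    w_up=rng.standard_normal((8, 16)),
    w_down=rng.standard_normal((16, 8)),
)
```

Unlike LayerNorm, RMSNorm does not subtract the mean and has no bias term, so it only rescales each vector. SwiGLU uses three weight matrices instead of the usual two, with one branch gating the other.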

2025-10-06 13:26:20 +03:00