llm-arch-research/experiments/llm_only
Sergey Penkovsky f30cd530a9 feat: add LLaMA model implementation with RoPE positional encoding
- Added LLaMA model architecture with RMSNorm normalization and a SwiGLU feed-forward block (a minimal sketch follows this list)
- Implemented Rotary Positional Embeddings (RoPE), which encode position by rotating query/key dimension pairs rather than adding position vectors (see the RoPE sketch after this list)
- Created training script for LLaMA with BPE tokenizer
- Fixed matplotlib dependency version in uv.lock
- Added LLaMA module initialization
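A minimal sketch of the two LLaMA-style building blocks named above, assuming the standard formulations; module and parameter names (`w_gate`, `w_up`, `w_down`) are illustrative, not necessarily the ones used in this commit:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescale by the RMS of the activations, no mean subtraction."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated feed-forward block: SiLU(x @ W_gate) * (x @ W_up), then project back to dim."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```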
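A minimal sketch of RoPE as it is commonly applied to query/key tensors, assuming the usual complex-rotation formulation; the helper names `rope_frequencies` and `apply_rope` are illustrative and may not match the repository's code:

```python
import torch

def rope_frequencies(head_dim: int, max_seq_len: int, base: float = 10000.0) -> torch.Tensor:
    """Precompute unit-modulus rotation factors e^{i * m * theta_k} per position m and dim pair k."""
    theta = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_seq_len).float()
    angles = torch.outer(positions, theta)                # (max_seq_len, head_dim // 2)
    return torch.polar(torch.ones_like(angles), angles)   # complex64

def apply_rope(x: torch.Tensor, freqs: torch.Tensor) -> torch.Tensor:
    """Rotate consecutive dimension pairs of q/k by position-dependent angles.
    x: (batch, seq_len, n_heads, head_dim), freqs: (seq_len, head_dim // 2)."""
    x_pairs = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    rotated = x_pairs * freqs[None, :, None, :]           # broadcast over batch and heads
    return torch.view_as_real(rotated).flatten(-2).type_as(x)
```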

The implementation includes:
- TokenEmbeddings, HeadAttention, MultiHeadAttention with RoPE support
- RMSNorm normalization layer
- SwiGLU feed-forward activation
- Cached decoder implementation for efficient generation (see the sketch after this list)
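The cached decoder presumably keeps previously computed keys and values so each generation step only processes the newest token; a minimal sketch under that assumption, with hypothetical names `KVCache` and `cached_attention` that are not taken from the repository:

```python
import torch

class KVCache:
    """Illustrative key/value cache: append new K/V so later steps attend over all past positions."""
    def __init__(self):
        self.k = None  # (batch, n_heads, cached_len, head_dim)
        self.v = None

    def update(self, k_new: torch.Tensor, v_new: torch.Tensor):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

def cached_attention(q: torch.Tensor, k_new: torch.Tensor, v_new: torch.Tensor, cache: KVCache) -> torch.Tensor:
    """Attention for the newest token(s) against every cached position.
    No causal mask is needed when q covers only the latest token, since all cached keys are in the past."""
    k, v = cache.update(k_new, v_new)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v
```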
2025-10-06 13:26:20 +03:00