Commit Graph

5 Commits

Sergey Penkovsky
f30cd530a9 feat: add LLaMA model implementation with RoPE positional encoding
- Added LLaMA model architecture with RMSNorm and SwiGLU activation
- Implemented Rotary Positional Embeddings (RoPE), encoding token positions directly in the attention computation
- Created training script for LLaMA with BPE tokenizer
- Fixed matplotlib dependency version in uv.lock
- Added LLaMA module initialization

The implementation includes:
- TokenEmbeddings, HeadAttention, MultiHeadAttention with RoPE support
- RMSNorm normalization layer
- SwiGLU feed-forward activation
- Cached decoder implementation for efficient generation
2025-10-06 13:26:20 +03:00
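The RMSNorm and SwiGLU layers this commit describes are standard LLaMA-style components; a minimal sketch of what such modules typically look like (class and parameter names here are illustrative, not necessarily the ones used in the repository):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square norm: rescales by 1/RMS(x), no mean subtraction or bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: SiLU-gated linear unit, as in LLaMA-style FFNs."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)    # value projection
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # silu(gate) acts as a learned, input-dependent gate on the up projection
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```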
Sergey Penkovsky
9898e8ee83 feat: add RoPE positional embeddings implementation in llama.ipynb
- Implement Rotary Positional Embeddings (RoPE) with separate cosine/sine components
- Add vectorized computation of inverse frequencies for RoPE
- Include tensor slicing utilities for even/odd column separation
- Update dependencies in pyproject.toml and uv.lock
2025-10-06 12:52:59 +03:00
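The notebook steps this commit lists — vectorized inverse frequencies, separate cosine/sine components, and even/odd column slicing — amount to the usual RoPE rotation. A hedged sketch (function names are assumptions, not taken from the notebook):

```python
import torch

def rope_cos_sin(seq_len: int, head_dim: int, base: float = 10000.0):
    # Vectorized inverse frequencies: one frequency per pair of channels.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    pos = torch.arange(seq_len).float()
    angles = torch.outer(pos, inv_freq)        # (seq_len, head_dim // 2)
    return angles.cos(), angles.sin()          # separate cos/sin components

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    # Even/odd column slicing: each (even, odd) pair is rotated by the
    # position-dependent angle, like a 2-D rotation in that plane.
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out
```

Because each pair is only rotated, the per-token vector norm is unchanged, and position 0 (angle 0) is the identity.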
Sergey Penkovsky
1f9a4d2fa9 chore: add ipykernel dependency and update notebooks
- Add ipykernel to project dependencies for Jupyter notebook support
- Update BPE and GPT analysis notebooks with latest experiments
2025-10-05 11:59:24 +03:00
Sergey Penkovsky
f4bdc81829 fix: update PyTorch mask types and BPE tokenizer serialization
- Replace deprecated torch.uint8 and .byte() with torch.bool in GPT.generate
- Add save/load methods to BPETokenizer for proper merges and vocab_list serialization
- Update dependencies in pyproject.toml
2025-10-05 08:09:30 +03:00
Sergey Penkovsky
ec07546ea8 feat: initial project setup with LLM architecture and HF integration
- Add LLM library with GPT model implementation
- Add hf-proxy for HuggingFace integration
- Add experiments for training and generation
- Add comprehensive documentation and examples
- Configure uv workspace with proper dependencies
2025-10-04 22:40:21 +03:00