llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-01-24 13:32:08 +00:00

Author	SHA1	Message	Date
Sergey Penkovsky	ec0d2bd8d0	feat(mistral): add Mistral model implementation and configs - implement Mistral model in llm/models/mistral/mistral.py with GroupedQueryAttention, SwiGLU, RoPE, sliding window attention - add __init__.py for module export - add config files for mistral training and generation - update universal experiment runner to support Mistral model - add notebook for Mistral experiments	2025-10-14 14:53:45 +03:00
Sergey Penkovsky	e5706a690d	fix(rope, attention): correct RoPE positioning during generation with a cache - Fixed a position calculation error for RoPE (Rotary Positional Embeddings) during autocompletion with a cache. - HeadAttention now passes start_pos, computed from the cache length, to RoPE. - Updated the signature and logic of the RoPE.forward method. - Updated the llama.ipynb notebook for the new interfaces and outputs. BREAKING CHANGE: the forward method of RoPE has been redefined; code that used RoPE directly must be updated.	2025-10-14 12:03:20 +03:00
Sergey Penkovsky	9898e8ee83	feat: add RoPE positional embeddings implementation in llama.ipynb - Implement Rotary Positional Embeddings (RoPE) with separate cosine/sine components - Add vectorized computation of inverse frequencies for RoPE - Include tensor slicing utilities for even/odd column separation - Update dependencies in pyproject.toml and uv.lock	2025-10-06 12:52:59 +03:00
Sergey Penkovsky	b6f56a2640	fix: typo in activation attribute for SwiGLU (rename _actvation to _activation) and minor index update	2025-10-05 23:01:58 +03:00
Sergey Penkovsky	aa408e941a	docs: add GPT-2 analysis notebook - Add gpt2.ipynb with GPT-2 model experiments and comparisons	2025-10-05 12:48:32 +03:00
Sergey Penkovsky	da1cf3fb55	fix: rename notebook	2025-10-05 12:46:17 +03:00
Sergey Penkovsky	1f9a4d2fa9	chore: add ipykernel dependency and update notebooks - Add ipykernel to project dependencies for Jupyter notebook support - Update BPE and GPT analysis notebooks with latest experiments	2025-10-05 11:59:24 +03:00
Sergey Penkovsky	f060497eb1	docs: add analysis notebooks for BPE and GPT - Add bpe.ipynb with Byte Pair Encoding implementation analysis - Update gpt_analysis.ipynb with GPT model experiments and visualizations	2025-10-05 08:23:09 +03:00
Sergey Penkovsky	ec07546ea8	feat: initial project setup with LLM architecture and HF integration - Add LLM library with GPT model implementation - Add hf-proxy for HuggingFace integration - Add experiments for training and generation - Add comprehensive documentation and examples - Configure uv workspace with proper dependencies	2025-10-04 22:40:21 +03:00

9 Commits
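
Note on commit e5706a690d: the message says that start_pos is computed from the cache length and passed to RoPE. The sketch below illustrates why that matters; it is not the repository's code. It assumes an interleaved even/odd pairing of dimensions and a base of 10000, and the names `inverse_frequencies` and `apply_rope` are hypothetical (only `start_pos` comes from the commit message). It uses plain Python lists so that it runs without PyTorch.

```python
import math


def inverse_frequencies(head_dim, base=10000.0):
    # One rotation frequency per (even, odd) pair of dimensions.
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(vectors, start_pos=0, base=10000.0):
    """Rotate each vector by its absolute position, start_pos + offset."""
    if not vectors:
        return []
    freqs = inverse_frequencies(len(vectors[0]), base)
    rotated = []
    for offset, vec in enumerate(vectors):
        pos = start_pos + offset  # absolute position in the sequence
        out = []
        for i, freq in enumerate(freqs):
            angle = pos * freq
            c, s = math.cos(angle), math.sin(angle)
            x_even, x_odd = vec[2 * i], vec[2 * i + 1]
            out.extend([x_even * c - x_odd * s, x_even * s + x_odd * c])
        rotated.append(out)
    return rotated


# Full pass over a 4-token sequence (no cache involved).
tokens = [
    [1.0, 0.0, 0.5, -0.5],
    [0.0, 1.0, 1.0, 1.0],
    [0.3, 0.7, -0.2, 0.9],
    [1.0, 1.0, 0.0, 2.0],
]
full = apply_rope(tokens)

# Cached generation: three tokens are already cached, so only the newest
# token is processed, and its position must continue from the cache length.
cache_len = 3
step = apply_rope(tokens[cache_len:], start_pos=cache_len)

# Without start_pos the new token is rotated as if it were at position 0.
wrong = apply_rope(tokens[cache_len:])

matches_full = all(math.isclose(a, b) for a, b in zip(step[0], full[3]))
matches_wrong = all(math.isclose(a, b) for a, b in zip(wrong[0], full[3]))
```

With `start_pos=cache_len` the incremental step reproduces the rotation from the full pass (`matches_full` is true). Restarting at position 0 does not (`matches_wrong` is false), which is the kind of error the commit message describes.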