llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-05-16 10:09:42 +00:00

Author	SHA1	Message	Date
Sergey Penkovsky	e6ca8dee6f	docs(core): add comprehensive docstrings and unit tests for GroupedQueryAttention (GQA) - docs: Rewrite and expand docstrings for the GroupedQueryAttention class and all main methods (__init__, forward, _repeat_kv_heads, _create_sliding_window_mask): - explained GQA architecture and motivation - included mathematical formulas, step-by-step algorithms, usage examples - added references to relevant scientific papers (Mistral, Llama 2, etc.) - test: Add dedicated unit tests for GQA (output shape correctness, mask/window logic, KV head replication, RoPE processing, error and edge-cases) - docs/test: Documentation and tests now fully reflect modern GQA usage and best practices for LLM architectures This commit makes the implementation, usage, and theoretical underpinnings of GQA transparent and reproducible for researchers and engineers.	2025-10-15 17:27:55 +03:00
Sergey Penkovsky	2e72dbaf07	test(llama): add unit tests for generation, cache, and edge cases - Covers inference with and without cache and with sampling (top-k, top-p) - Includes test for max sequence length (should raise ValueError) - Verifies output shape and absence of dtype errors for the mask logic - Minimal config and random data ensure tests are fast and robust Motivation: Regression and integration protection for Llama decoding and sampling logic.	2025-10-15 14:37:35 +03:00
Sergey Penkovsky	dc440a3938	test(gpt2): add unit tests for generation, cache behavior, and error conditions - Covers forward pass with and without KV-cache - Verifies correct sequence generation for greedy, top-k, and top-p sampling - Adds ValueError test for exceeding max sequence length - Uses small random toy config and minimal setup for fast test feedback Motivation: Prevent regressions in decoding, sampling, and KV-cache logic in GPT2 implementation.	2025-10-15 14:36:32 +03:00
Sergey Penkovsky	38682e8c9d	test(mistral): add unit tests for model generation and cache	2025-10-15 13:20:50 +03:00
Sergey Penkovsky	d10044e4a7	refactor(core): refactor RoPE and MultiHeadAttention, add math-rich docs, expand tests, remove unused head_attention - refactor: улучшена и унифицирована реализация RoPE, теперь поддерживаются строгие проверки размерности входа; внесены улучшения и структурные изменения в MultiHeadAttention (более понятная логика, строгая спецификация входов/выходов) - docs: полностью переписаны docstrings для RoPE и MultiHeadAttention — включены математические формулы, ссылки на научные статьи, подробные пояснения по алгоритму, формату входных данных, ограничениям, примеры использования - test: добавлены отдельные unit-тесты для RoPE (корректность формы, ошибки на неверную размерность, сохранение нормы, backward/градиенты, работу с параметрами start_pos и батчами) - chore: удалён неиспользуемый модуль core/head_attention.py - fix: теперь выбрасывается AssertionError при неправильной размерности входа RoPE; это позволило полностью покрыть тест-кейсы на ошибки Этот коммит синхронизирует логику реализации базового внимания с современной практикой LLM, укрепляет документацию для инженеров и исследователей, а также расширяет надежность автотестирования библиотеки.	2025-10-15 11:04:07 +03:00
Sergey Penkovsky	712278e33c	Рефакторинг: единообразие оформления кода (пробелы, кавычки, пустые строки), без изменения логики по всему проекту.	2025-10-06 22:57:19 +03:00
Sergey Penkovsky	3843e64098	test(core): fix FeedForward and MultiHeadAttention tests for unified interface and tuple outputs	2025-10-05 19:26:18 +03:00
Sergey Penkovsky	fb74dc7c17	test: add comprehensive test suite for LLM components - Add pytest configuration and fixtures - Add tests for core modules: decoder, feed_forward, multi_head_attention - Add tests for positional and token embeddings - Add tests for GPT model - Add tests for tokenizers (base and BPE) - Add basic integration tests	2025-10-05 08:11:18 +03:00

8 Commits