llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-05-16 10:09:42 +00:00

Author	SHA1	Message	Date
Sergey Penkovsky	25caf69ced	refactor(gpt1): migrate Decoder to GptDecoder, unify API, and update tests - Renamed Decoder (and decoder.py) to GptDecoder (gpt_decoder.py) for clarity in GPT1 - Implemented support for cache and use_cache parameters in GptDecoder.forward (API unification) - Adapted all usages in GPT model to use new decoder structure and handle tuple output - Refactored core tests (test_gpt.py, test_gpt_decoder.py, test_basic.py) to correctly expect tuple or logits and ensure shape/device checks work as before - Improved clarity and future extensibility for autoregressive generation and benchmarking - No changes to architectural details or training loop; pure API and test modernization	2025-10-22 16:27:08 +03:00
Sergey Penkovsky	cfb4b6dfb1	feat(gemma): initial implementation of Gemma model and configs - Add core Gemma model (architecture, attention, GeGLU, RoPE, RMSNorm, etc) - Add configs for training and generation: gemma_train.json, gemma_generate.json - Add Gemma notebook for exploratory analysis and demonstration - Add __init__.py for Gemma submodule - Update run_llm_experiment.py to support Gemma experiment configs test(gemma): add comprehensive unit tests for Gemma - Test forward pass (with/without cache) - Test autoregressive generation (greedy, top-k, top-p) - Test shape correctness and max sequence length errors - Test multi-layer stack and token embeddings docs: add documentation notebook for Gemma usage and analysis Closes: #issue (if applicable)	2025-10-21 01:02:15 +03:00
Sergey Penkovsky	b1737bbce2	feat(mixtral): initial implementation of Mixtral MoE model, configs, and tests - Add Mixtral architecture implementation with MoE support (llm/src/llm/models/mixtral/mixtral.py) - Introduce generic Mixture-of-Experts (MoE) block (llm/src/llm/core/moe.py) - Create dedicated configuration files for Mixtral training and generation experiments - Register and test Mixtral support in experiment runner (run_llm_experiment.py) - Add unit tests for Mixtral API including forward, caching, and generation modes - Include Jupyter notebook mixstral.ipynb for architectural exploration and research - Ensure correct handling of torch bool masks in sampling (top-k, top-p) during generation BREAKING CHANGE: Adds new model code and test coverage, modifying experiment runner logic to register Mixtral.	2025-10-20 08:12:11 +03:00
Sergey Penkovsky	2e72dbaf07	test(llama): add unit tests for generation, cache, and edge cases - Covers inference with and without cache and with sampling (top-k, top-p) - Includes test for max sequence length (should raise ValueError) - Verifies output shape and absence of dtype errors for the mask logic - Minimal config and random data ensure tests are fast and robust Motivation: Regression and integration protection for Llama decoding and sampling logic.	2025-10-15 14:37:35 +03:00
Sergey Penkovsky	dc440a3938	test(gpt2): add unit tests for generation, cache behavior, and error conditions - Covers forward pass with and without KV-cache - Verifies correct sequence generation for greedy, top-k, and top-p sampling - Adds ValueError test for exceeding max sequence length - Uses small random toy config and minimal setup for fast test feedback Motivation: Prevent regressions in decoding, sampling, and KV-cache logic in GPT2 implementation.	2025-10-15 14:36:32 +03:00
Sergey Penkovsky	38682e8c9d	test(mistral): add unit tests for model generation and cache	2025-10-15 13:20:50 +03:00
Sergey Penkovsky	d10044e4a7	refactor(core): refactor RoPE and MultiHeadAttention, add math-rich docs, expand tests, remove unused head_attention - refactor: improved and unified the RoPE implementation, which now strictly checks input dimensionality; improved and restructured MultiHeadAttention (clearer logic, strict input/output specification) - docs: fully rewrote the docstrings for RoPE and MultiHeadAttention, adding mathematical formulas, references to research papers, detailed explanations of the algorithm, input format, and limitations, plus usage examples - test: added dedicated unit tests for RoPE (shape correctness, errors on invalid dimensionality, norm preservation, backward/gradients, handling of the start_pos parameter and batches) - chore: removed the unused module core/head_attention.py - fix: RoPE now raises AssertionError on invalid input dimensionality, which made it possible to fully cover the error test cases This commit aligns the core attention implementation with modern LLM practice, strengthens the documentation for engineers and researchers, and improves the reliability of the library's automated tests.	2025-10-15 11:04:07 +03:00
Sergey Penkovsky	712278e33c	Refactoring: consistent code formatting (whitespace, quotes, blank lines) across the whole project, with no logic changes.	2025-10-06 22:57:19 +03:00
Sergey Penkovsky	fb74dc7c17	test: add comprehensive test suite for LLM components - Add pytest configuration and fixtures - Add tests for core modules: decoder, feed_forward, multi_head_attention - Add tests for positional and token embeddings - Add tests for GPT model - Add tests for tokenizers (base and BPE) - Add basic integration tests	2025-10-05 08:11:18 +03:00

9 Commits