llm-arch-research

mirror of https://github.com/pese-git/llm-arch-research.git synced 2026-01-23 21:10:54 +00:00

Files

Sergey Penkovsky e6ca8dee6f docs(core): add comprehensive docstrings and unit tests for GroupedQueryAttention (GQA)

- docs: Rewrite and expand docstrings for the GroupedQueryAttention class and all main methods (__init__, forward, _repeat_kv_heads, _create_sliding_window_mask):
    - explained GQA architecture and motivation
    - included mathematical formulas, step-by-step algorithms, usage examples
    - added references to relevant scientific papers (Mistral, Llama 2, etc.)
- test: Add dedicated unit tests for GQA (output shape correctness, mask/window logic, KV head replication, RoPE processing, error and edge-cases)
- docs/test: Documentation and tests now fully reflect modern GQA usage and best practices for LLM architectures

This commit makes the implementation, usage, and theoretical underpinnings of GQA transparent and reproducible for researchers and engineers.

2025-10-15 17:27:55 +03:00

core

docs(core): add comprehensive docstrings and unit tests for GroupedQueryAttention (GQA)

2025-10-15 17:27:55 +03:00

models

test(llama): add unit tests for generation, cache, and edge cases

2025-10-15 14:37:35 +03:00

tokenizers

Refactoring: consistent code formatting (whitespace, quotes, blank lines) across the whole project, with no logic changes.

2025-10-06 22:57:19 +03:00

__init__.py

test: add comprehensive test suite for LLM components

2025-10-05 08:11:18 +03:00

conftest.py

2025-10-06 22:57:19 +03:00

test_basic.py

2025-10-06 22:57:19 +03:00