Mirror of https://github.com/pese-git/llm-arch-research.git, synced 2026-01-23 21:10:54 +00:00
- Implement new core module: MixtralDecoder (llm/core/mixtral_decoder.py) with full Russian scientific docstrings, formal math, and usage examples
- Improve MoE: add Russian docstrings for the class, __init__, and forward; validate top_k_experts; explain the underlying theory and components (see the routing sketch after this list)
- Refactor the Mixtral model: switch the decoder stack to MixtralDecoder, add comprehensive documentation for the class, constructor, and forward, and clarify config usage and architecture
- Add thorough unit tests (see the test sketch after this list):
  * tests/core/test_mixtral_decoder.py: checks output shapes, error handling, attention masking, dropout behavior, gradient flow, etc.
  * tests/core/test_moe.py: covers normal and edge-case routing logic, gradients, output shapes, and parameter checks
- All code and tests follow current scientific and engineering standards.
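For context, here is a minimal sketch of Mixtral-style top-k expert routing of the kind the MoE item describes, including the top_k_experts validation mentioned above. It is illustrative only: the class name and its constructor parameters (hidden_dim, ffn_dim, num_experts, top_k_experts) are assumptions for this example, not the project's actual llm/core API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoESketch(nn.Module):
    """Illustrative Mixtral-style sparse MoE layer (not the project's actual class)."""

    def __init__(self, hidden_dim: int, ffn_dim: int, num_experts: int, top_k_experts: int):
        super().__init__()
        # Validate top_k_experts as the commit describes: it must lie between 1
        # and the number of available experts.
        if not (1 <= top_k_experts <= num_experts):
            raise ValueError(
                f"top_k_experts must be in [1, {num_experts}], got {top_k_experts}"
            )
        self.top_k_experts = top_k_experts
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_dim, ffn_dim),
                nn.SiLU(),
                nn.Linear(ffn_dim, hidden_dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim) -> flatten tokens for per-token routing
        batch, seq_len, hidden_dim = x.shape
        flat = x.reshape(-1, hidden_dim)

        # Router scores per expert; keep only the top-k experts for each token
        # and renormalize their weights with a softmax.
        logits = self.router(flat)                                    # (tokens, num_experts)
        weights, indices = torch.topk(logits, self.top_k_experts, dim=-1)
        weights = F.softmax(weights, dim=-1)

        # Each token's output is the weighted sum of its selected experts.
        out = torch.zeros_like(flat)
        for slot in range(self.top_k_experts):
            for expert_id, expert in enumerate(self.experts):
                mask = indices[:, slot] == expert_id
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(flat[mask])
        return out.reshape(batch, seq_len, hidden_dim)


if __name__ == "__main__":
    moe = MoESketch(hidden_dim=64, ffn_dim=128, num_experts=8, top_k_experts=2)
    y = moe(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])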
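```

Likewise, a pytest-style sketch of the kinds of checks the test items describe (output shapes, rejection of an invalid top_k_experts, gradient flow). The top_k_route helper is a toy stand-in defined here for the example; it is not the repository's router or its real test suite.

```python
import pytest
import torch
import torch.nn.functional as F


def top_k_route(logits: torch.Tensor, top_k: int):
    """Toy stand-in for the router: normalized top-k weights plus expert indices."""
    if not (1 <= top_k <= logits.shape[-1]):
        raise ValueError("top_k must be between 1 and the number of experts")
    weights, indices = torch.topk(logits, top_k, dim=-1)
    return F.softmax(weights, dim=-1), indices


def test_output_shapes_and_normalization():
    logits = torch.randn(4, 8)                   # 4 tokens, 8 experts
    weights, indices = top_k_route(logits, top_k=2)
    assert weights.shape == (4, 2) and indices.shape == (4, 2)
    # The selected weights are renormalized, so they sum to 1 per token.
    assert torch.allclose(weights.sum(dim=-1), torch.ones(4))


def test_invalid_top_k_raises():
    with pytest.raises(ValueError):
        top_k_route(torch.randn(4, 8), top_k=0)


def test_gradients_flow_through_router():
    logits = torch.randn(4, 8, requires_grad=True)
    weights, _ = top_k_route(logits, top_k=2)
    weights.sum().backward()
    assert logits.grad is not None
```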