mirror of
https://github.com/pese-git/llm-arch-research.git
synced 2026-01-23 21:10:54 +00:00
feat(mistral): add Mistral model implementation and configs
- implement Mistral model in llm/models/mistral/mistral.py with GroupedQueryAttention, SwiGLU, RoPE, sliding window attention
- add __init__.py for module export
- add config files for mistral training and generation
- update universal experiment runner to support Mistral model
- add notebook for Mistral experiments
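
For readers unfamiliar with the sliding window attention named in the commit message, the following is a minimal, dependency-free sketch of the masking rule it usually refers to: each position may attend to itself and to at most `window - 1` earlier positions. The function name `sliding_window_causal_mask` and its parameters are hypothetical, chosen for illustration. This is not the code in llm/models/mistral/mistral.py, which is not shown on this page.

```python
def sliding_window_causal_mask(seq_len, window):
    """Build a seq_len x seq_len boolean mask (True = query i may attend to key j).

    Hypothetical helper for illustration only; not taken from the repository.
    A key j is visible to query i when it is not in the future (j <= i)
    and lies within the last `window` positions (i - j < window).
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask as a grid: 'x' = attend, '.' = masked out.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

In an attention layer, a mask like this would typically be applied by setting the masked-out scores to negative infinity before the softmax; whether the committed model does it this way is an assumption.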
3267
notebooks/mistral.ipynb
Normal file
File diff suppressed because it is too large