Sergey Penkovsky
b9d9bdcc71
docs(readme): add explicit support notice for GPT-2 architecture and usage examples
2025-10-05 21:29:38 +03:00
Sergey Penkovsky
c31eed8551
fix(hf-integration): handle logits as tuple in hf_adapter, convert torch.Tensor to list in hf_tokenizer.decode for decoding compatibility
2025-10-05 20:47:36 +03:00
Sergey Penkovsky
3843e64098
test(core): fix FeedForward and MultiHeadAttention tests for unified interface and tuple outputs
2025-10-05 19:26:18 +03:00
Sergey Penkovsky
c39e68d71a
feat(gpt2): add GPT2 architecture with universal FeedForward, CachedDecoder, and refactored components. Core modules now shared; add train and generate scripts for GPT2-BPE.
2025-10-05 19:11:20 +03:00
Sergey Penkovsky
f866ed7ac7
fix: universal logits extraction for tuple/model output in Trainer (GPT/GPT2 compatibility)
2025-10-05 15:52:21 +03:00
Sergey Penkovsky
aa408e941a
docs: add GPT-2 analysis notebook
- Add gpt2.ipynb with GPT-2 model experiments and comparisons
2025-10-05 12:48:32 +03:00
Sergey Penkovsky
da1cf3fb55
fix: rename notebook
2025-10-05 12:46:17 +03:00
Sergey Penkovsky
1f9a4d2fa9
chore: add ipykernel dependency and update notebooks
- Add ipykernel to project dependencies for Jupyter notebook support
- Update BPE and GPT analysis notebooks with latest experiments
2025-10-05 11:59:24 +03:00
Sergey Penkovsky
f060497eb1
docs: add analysis notebooks for BPE and GPT
- Add bpe.ipynb with Byte Pair Encoding implementation analysis
- Update gpt_analysis.ipynb with GPT model experiments and visualizations
2025-10-05 08:23:09 +03:00
Sergey Penkovsky
fb74dc7c17
test: add comprehensive test suite for LLM components
- Add pytest configuration and fixtures
- Add tests for core modules: decoder, feed_forward, multi_head_attention
- Add tests for positional and token embeddings
- Add tests for GPT model
- Add tests for tokenizers (base and BPE)
- Add basic integration tests
2025-10-05 08:11:18 +03:00
Sergey Penkovsky
f4bdc81829
fix: update PyTorch mask types and BPE tokenizer serialization
- Replace deprecated torch.uint8 and .byte() with torch.bool in GPT.generate
- Add save/load methods to BPETokenizer for proper merges and vocab_list serialization
- Update dependencies in pyproject.toml
2025-10-05 08:09:30 +03:00
Sergey Penkovsky
ec07546ea8
feat: initial project setup with LLM architecture and HF integration
- Add LLM library with GPT model implementation
- Add hf-proxy for HuggingFace integration
- Add experiments for training and generation
- Add comprehensive documentation and examples
- Configure uv workspace with proper dependencies
2025-10-04 22:40:21 +03:00