It has been quite a while since the last SHAInet release.
What is SHAInet?
SHAInet is a full neural network and deep learning framework written in pure Crystal. It can train actual LLMs and can use CUDA for fast operations.
The latest changes bring:
- GPU support via CUDA and cuBLAS
- LLM network support, with an example that trains on the BabyLM Challenge
- Support for GPT-2/GPT-3 features
- Transformer layers with sinusoidal positional encoding
- BPE tokenizer and vocabulary training
- Cross-entropy loss for language modeling
- CUDA acceleration (with cuDNN)
- Streaming datasets in JSONL
- Autograd, AdamW optimizer
- GPU-friendly mini-batch training
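For anyone wondering what "sinusoidal positional encoding" in the list above means in practice, here is a minimal sketch of the textbook formula, written with the standard library only. It does not call SHAInet, and the method name `positional_encoding` is made up for this illustration; SHAInet's own implementation may be organized differently.

```crystal
# Sinusoidal positional encoding, standard library only (no SHAInet calls).
# `positional_encoding` is a hypothetical helper name used for illustration.
# Returns seq_len rows, each holding d_model values.
def positional_encoding(seq_len, d_model)
  (0...seq_len).map do |pos|
    (0...d_model).map do |i|
      # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model)
      even_index = i - (i % 2)
      angle = pos / (10000.0 ** (even_index / d_model.to_f))
      # Even dimensions use sine, odd dimensions use cosine
      i.even? ? Math.sin(angle) : Math.cos(angle)
    end
  end
end

pe = positional_encoding(4, 8)
# Position 0 gives sin(0) = 0 in even slots and cos(0) = 1 in odd slots
puts pe[0].inspect
```

Each row is added to the token embedding at that position, so the model can tell positions apart without any learned position parameters.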
Why use SHAInet?
If you want to understand neural networks more deeply but don’t want to use Python or get into huge ecosystems like PyTorch or TensorFlow, SHAInet is a good fit.
Maybe you want to train a small model at home? Or maybe you want to take a model from Hugging Face and run it inside Crystal code?
Why have you been wasting your time on this project for the last 8 years?
I hold a dream of Crystal becoming a mainstream language, or at least a more popular one. SHAInet was aimed at bringing easy-to-use, real-world deep learning into the world of Crystal. Now that the field has moved from computer vision and simple networks into the realm of LLMs, I thought it wasn’t fair to leave us without a library that can work in that area.
Also, I just love Crystal, what can I say ;)