v1.0.0 has been released, completing a massive overhaul of the library's entire interface to enable device-agnostic numerical computing in Crystal, leveraging both CPU and GPU devices.

Some major highlights:

  • ClTensor(T) and Tensor(T) have been merged into Tensor(T, S), making OpenCL-backed storage a first-class citizen. All creation methods support both storage backends, and the implementation paves the way for zero-copy interop with numerous other libraries (Apache Arrow is the next prime target).
  • Num::NN and Num::Grad feature full GPU support, with almost all layers and gates supporting OpenCL-backed Tensors.
  • Num::Einsum allows for optimized contractions of Tensors, providing functionality identical to NumPy's einsum.
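Since the announcement states Num::Einsum matches NumPy's einsum, here is a short NumPy sketch of the semantics the Crystal API mirrors: a subscript string names each operand's axes, and any index shared between operands (or dropped from the output) is summed over. This illustrates the contraction model only; the exact Num.cr method names are not shown here.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

# "ij,jk->ik" contracts the shared j index: a matrix product.
mm = np.einsum("ij,jk->ik", a, b)
assert np.array_equal(mm, a @ b)

# "ii->" repeats an index on one operand and drops it from
# the output, which sums the diagonal: a trace.
sq = np.arange(9).reshape(3, 3)
assert np.einsum("ii->", sq) == np.trace(sq)
```

The same subscript notation expresses transposes, reductions, and batched products, which is why a single optimized einsum entry point covers so many Tensor contractions.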

Some less flashy highlights:

  • Vastly improved test coverage and stability, as well as revamped API documentation.
  • OpenCL memory management has been implemented, with JIT-compiled kernels backed by memory-safe caching.
  • NumPy interop is supported via reading and writing .npy files.
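Because the interop format is NumPy's own .npy serialization, a file produced on the Python side can be consumed by the Crystal library and vice versa. A minimal round-trip sketch from the NumPy side (the file name and directory here are illustrative; the corresponding Num.cr read/write calls are not shown):

```python
import os
import tempfile

import numpy as np

# Save a small array in the .npy binary format, which the
# release notes say Num.cr can both read and write.
arr = np.linspace(0.0, 1.0, 5, dtype=np.float64)
path = os.path.join(tempfile.mkdtemp(), "weights.npy")
np.save(path, arr)

# Round-trip check: the file loads back losslessly, dtype included.
loaded = np.load(path)
assert np.array_equal(arr, loaded)
assert loaded.dtype == np.float64
```

Since .npy records the dtype and shape in its header, no side channel is needed to reconstruct the Tensor on the other end.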

As always, I am looking for additional contributors to improve documentation, examples, and performance, and to continue expanding the API. I am especially interested in anyone with CUDA experience and a CUDA-enabled graphics card; not having one is currently blocking my ability to write the CUDA storage backend.

If you have a chance to experiment with the library, bug tickets and feedback in the Gitter channel are always appreciated.