Num.cr v1.0.0 released

Num.cr v1.0.0 has been released, completing a massive overhaul of the library's entire interface. The new design enables device-agnostic numerical computing in Crystal, leveraging both CPU and GPU devices.

Some major highlights:

  • ClTensor(T) and Tensor(T) have been merged into Tensor(T, S), with OpenCL-backed storage becoming a first-class citizen. All creation methods support both storage backends, and the implementation paves the way for zero-copy interop with numerous other libraries (Apache Arrow is the next prime target).
  • Num::NN and Num::Grad feature full GPU support, with almost all layers and gates supporting OpenCL-backed Tensors.
  • Num::Einsum allows for optimized contractions of Tensors, providing functionality identical to Numpy's einsum.
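
To make the einsum notation concrete, here is what the contraction string "ij,jk->ik" denotes, written out in plain Crystal. This only illustrates the notation (summing over the repeated index j, i.e. a matrix product); it is not the Num::Einsum API itself, which I have not shown here.

```crystal
# Spell out the contraction "ij,jk->ik": sum over the repeated index j.
def einsum_ij_jk_ik(a : Array(Array(Float64)), b : Array(Array(Float64)))
  Array.new(a.size) do |i|
    Array.new(b[0].size) do |k|
      (0...b.size).sum { |j| a[i][j] * b[j][k] }
    end
  end
end

einsum_ij_jk_ik([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
# => [[19.0, 22.0], [43.0, 50.0]]
```

The optimized version in Num::Einsum avoids materializing intermediates and picks an efficient contraction order, which is where the real benefit lies.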

Some less flashy highlights:

  • Vastly improved test coverage and stability, as well as a revamped API documentation, which can be found here.
  • OpenCL memory management has been implemented, with JIT-compiled kernels backed by memory-safe caching.
  • Numpy inter-op is supported via reading and writing .npy files.
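
For anyone curious what that inter-op involves, here is a minimal sketch of the .npy v1.0 on-disk layout, written in plain Crystal and independent of num.cr's own reader/writer (whose method names may differ, so check the API docs). The hypothetical write_npy helper below emits a 1-D Float32 array that Numpy's np.load can read.

```crystal
# Minimal .npy v1.0 writer sketch for a 1-D Float32 array.
# Layout: magic (6 bytes) + version (2) + header length (2, little-endian)
# + ASCII header padded to a 64-byte boundary, then raw little-endian data.
def write_npy(path : String, data : Array(Float32))
  header = "{'descr': '<f4', 'fortran_order': False, 'shape': (#{data.size},), }"
  # Pad with spaces so magic + version + length + header is a multiple of 64,
  # terminated by a newline, as the format requires.
  pad = 64 - ((10 + header.bytesize + 1) % 64)
  header += " " * pad + "\n"
  File.open(path, "wb") do |f|
    f.write "\x93NUMPY".to_slice
    f.write_byte 1_u8 # format major version
    f.write_byte 0_u8 # format minor version
    f.write_bytes header.bytesize.to_u16, IO::ByteFormat::LittleEndian
    f.print header
    data.each { |v| f.write_bytes v, IO::ByteFormat::LittleEndian }
  end
end
```

num.cr's own support presumably also handles multi-dimensional shapes and other dtypes on both read and write.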

As always, I am constantly looking for additional contributors to improve documentation, examples, and performance, and to continue expanding the API. I am especially interested in anyone with CUDA experience (and a CUDA-enabled graphics card; not having one is currently blocking my ability to write that storage backend).

If you have a chance to experiment with the library, bug tickets + feedback in the Gitter channel are always appreciated.

@christopherzimmerman I'm working on porting https://github.com/patlevin/face-detection-tflite/blob/main/fdlite/face_detection.py to Crystal.

Just wondering if I could leverage your library?
There are operations like _get_sigmoid_scores(raw_scores) in Python.
In Crystal I have raw_scores as a Slice(Float32), and this is my sigmoid_scores function:

def sigmoid_scores(data : Slice(Float32)) : Slice(Float32)
    # Note: map! mutates the slice in place (and returns it)
    data.map! { |x| 1.0_f32 / (1.0_f32 + Math.exp(-x)) }
end

How would I implement something like that using num.cr?
Would a case like that benefit from the library?
The slices already represent tensors, so I assume it's a good fit.
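
For what it's worth, a tensor version of that sigmoid might look like the sketch below. Array#to_tensor and Tensor#map are assumptions on my part based on the docs, so treat the names as unverified.

```crystal
require "num"

# Hedged sketch -- assumes Array#to_tensor and Tensor#map exist as documented.
def sigmoid_scores_t(data : Slice(Float32))
  t = data.to_a.to_tensor # Slice -> Array -> CPU-backed Tensor
  t.map { |x| 1.0_f32 / (1.0_f32 + Math.exp(-x)) }
end
```

For small slices the in-place map! version is probably already close to optimal; the tensor route would pay off with larger data or OpenCL-backed storage.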

Also, I do a lot of Tensor normalisation so that I can use different NN models with the same code. Is there a way to use num.cr to accelerate something like

# Tensor(UInt8) => Tensor(Float32)
output_layer.as_u8.map { |result| (result.to_f32 / 255.0_f32) }
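
A self-contained version of that normalisation might look like this sketch. It again assumes Array#to_tensor and Tensor#map from the docs, and that map returns a new Tensor typed by the block's return value, so the UInt8 -> Float32 cast and scale happen in one pass.

```crystal
require "num"

# Hedged sketch: one-pass UInt8 -> Float32 cast-and-scale on a Tensor.
raw = [0_u8, 128_u8, 255_u8].to_tensor
normalized = raw.map { |v| v.to_f32 / 255.0_f32 }
```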

Thanks in advance!
