Looking for data scientists (or anyone) interested in using Crystal

I’ve been having a blast learning Crystal, and I would love to have actively maintained data science tools written in Crystal. I am looking for people to help me with that task.

There are a couple of libraries that do basic calculations, but none seem to be actively maintained.

Primary short-term goals:

  • Maintain a data visualization library written in Crystal (aquaplot)
  • Create a library to handle data analysis on one-dimensional data (similar to a pandas Series, or an R vector), backed by LAPACK / BLAS
  • Extend the library to handle analysis on two-dimensional data

My primary limitation right now is that I am still learning the internals of the language. So even if you aren’t interested in the denser scientific computing side but would be willing to help with code review / refactoring, I would love all the help I can get!


Hi, very nice goals! I’m personally not really interested in the scientific side, but I always love seeing nice graphics :stuck_out_tongue:

I (and many other people here in the forum and in the gitter/IRC chat) can also help with reviewing!


I’d suggest making a separate post asking for review, to avoid many unrelated messages about your code “hiding” relevant replies to your “call-to-data-scientists”…


If you make a separate post, I’ll move the following :point_down: to it to clean up this thread!

A quick look at a file led me to https://github.com/crystal-data/aquaplot/blob/1fd38eb23cf513dd7554ab3f6207a8c7ce13cd41/src/plot/base.cr where you do:

class Foo
  property some_field : TheType
  # ...

  # Then a manual getter and setter
  def get_some_field
    # get the public value for @some_field
    @some_field.some_internal_value
  end

  def set_some_field(@some_field)
  end
end

This can be simplified to:

class Foo
  @some_field : TheType
  # ...

  # Then a getter and setter using the conventional names
  def some_field
    # get the value for @some_field with some operation...
    @some_field.some_internal_value
  end

  def some_field=(@some_field)
  end
  # or simply
  setter some_field
  # (the `setter` macro will roughly generate the method `some_field=` like above)
end

# Then to use in a subclass:
class Bar < Foo
  def some_method
    value = self.some_field # call the getter
    self.some_field = value + 1 # call the setter (the method `some_field=(arg)`)
  end
end

You don’t need the `property` line, because it is a macro that generates a getter and a setter that you don’t use. See the documentation for more info: https://crystal-lang.org/api/latest/Object.html#property(*names,&block)-macro

Edit:
You could simplify even more (thanks @Blacksmoke16 for pointing it out in the gitter chat), using another form of the `setter` macro:

class Foo
  setter some_field : TheType

  def some_field
    @some_field.some_internal_value
  end
end

The line `setter some_field : TheType` will basically generate the following code:

@some_field : TheType

def some_field=(@some_field : TheType)
end

More info in the docs: https://crystal-lang.org/api/latest/Object.html#setter(*names)-macro
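
For what it’s worth, here is a minimal runnable sketch of the same pattern: the `setter` macro with a type declaration, plus a hand-written getter that exposes an inner value. The names `Label`, `Plot`, `title` and `text` are made up for illustration and are not taken from aquaplot.

```crystal
# Hypothetical wrapper type, standing in for `TheType` in the snippets above.
struct Label
  getter text : String

  def initialize(@text : String)
  end
end

# Hypothetical class, standing in for `Foo`.
class Plot
  # Declares `@title : Label` and generates `title=(@title : Label)`.
  setter title : Label

  def initialize(@title : Label)
  end

  # Hand-written getter: returns the inner String, not the Label wrapper.
  def title : String
    @title.text
  end
end

plot = Plot.new(Label.new("first"))
puts plot.title # calls the hand-written getter

plot.title = Label.new("second") # calls the generated setter
puts plot.title
```

Callers use the normal `plot.title` / `plot.title = ...` syntax, so nothing named `get_title` or `set_title` is needed.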


A couple of years ago (time flies!) I toyed with a Crystal lib that bound to LAPACK and BLAS and provided some more idiomatic Crystal APIs on top of them. I didn’t have the time to maintain it, but the preliminary results were promising (https://manas.tech/blog/2015/10/30/linear-algebra-in-crystal-from-lapack.html).

At least it might give you a head start or some ideas :). Feel free to fork or whatever!


Phenomenal library. Definitely gives a great head start binding to BLAS and LAPACK. Thanks for linking!


I would love to help you! :)

I don’t know where to start; maybe creating some issues on the shards could help.


@alex-lairan thanks for the offer! I will start creating some issues that need attention. Just for my benefit, are you more interested in working on the charting library or the scientific computing library?


Mostly on the scientific one.

Please separate the code review post from the call for data scientists.

Amazing shard. You should submit it to the Awesome Crystal repo.

I’m pretty sure many people would find it useful too.

@christopherzimmerman I noticed the repository has now been archived. I’m really interested in a plotting library right now and I would prefer not to start my own from scratch.
Just wondering why you archived it and if you still plan on working on it.
I’m contemplating forking the repo and extending it. Not sure yet.

@appcypher, I archived it because it was mostly poorly written code. I’ve actually been working on adding bindings for PLplot, which is more targeted at scientific plotting, to my numerical computing library num.cr. I would love it if you were interested in contributing to that!


I’m more interested in using Julia for data science, but I am interested in Crystal for front-end things, including visualisation.

Julia’s Plots.jl uses a variety of visualisation backends, like GR (written in C), Plotly (written in JavaScript), and PyPlot (written in Python). I would love to see aquaplot as a backend for Plots.jl, especially since speed is a continuing issue for Plots.jl, and GR gives speed at the expense of usefulness (sliders, for example).