Hello. I have been fascinated with ChatGPT for the past month and have been working on a command-line tool for using ChatGPT from Crystal. I think it is very useful, but it is not finished yet, so I would like to present it another time.
While building this tool, I realized that there was no way to count the number of tokens in advance. So I spent Sunday creating a binding for Blingfire. My goal was to finish the implementation in 2 hours, but it took me 6. I essentially just rewrote ankane’s blingfire-ruby code in Crystal.
It now passes a minimal set of tests.
With the worldwide ChatGPT boom, there must be a need in the Crystal world for a library that can convert between strings and tokens. I have taken the first step, but I believe this library still has many bugs.
I usually would not announce such a trivial library, but tokenizers are important and in demand everywhere. I hope someone will develop a more useful library. For example, bindings for huggingface/tokenizers would be nice, but they were too difficult for me to implement because that library is written in Rust.
Thank you.