Base64-encoding a large value

Is there a good way to base64-encode a large file without loading the whole thing into memory? I'm thinking of something along the lines of def Base64.encode(source : IO, destination : IO). It looks like the current stdlib implementation only supports objects that respond to to_slice as the source, which seems to imply that all of the data must be loaded into RAM.

You can base64-encode any stream as long as you can read three bytes at a time. Every three input bytes become four base64 characters.
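To make the three-byte rule concrete, here is a small sketch using `Base64.strict_encode` from the stdlib. It only illustrates why chunk boundaries matter; the sample strings are arbitrary.

```crystal
require "base64"

# 3 bytes map to exactly 4 characters, so no padding is needed.
Base64.strict_encode("abc") # => "YWJj"

# 4 bytes leave one byte over, so the output is padded with "==".
Base64.strict_encode("abcd") # => "YWJjZA=="

# Chunks split on a multiple of 3 can be encoded separately and concatenated.
whole = Base64.strict_encode("abcdef")
aligned = Base64.strict_encode("abc") + Base64.strict_encode("def")
aligned == whole # => true

# Chunks split anywhere else get padding in the middle and no longer match.
misaligned = Base64.strict_encode("ab") + Base64.strict_encode("cdef")
misaligned == whole # => false
```

Padding only ever appears when the input length is not a multiple of 3, which is why only the final chunk of a stream may be shorter.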

In other words, read the input in chunks whose length is divisible by 3 and run each chunk through the stdlib base64 encoder.
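A minimal sketch of that chunking approach might look like the following. The method name `base64_encode_stream` and the `chunk_size` parameter are hypothetical and not part of the stdlib; the sketch assumes strict encoding (no line breaks) is what you want and relies on `Base64.strict_encode(data, io)` writing directly to an IO.

```crystal
require "base64"

# Hypothetical helper, not part of the stdlib: streams `source` to
# `destination` as base64 while holding only one chunk in memory.
def base64_encode_stream(source : IO, destination : IO, chunk_size : Int32 = 3 * 1024) : Nil
  unless chunk_size > 0 && chunk_size % 3 == 0
    raise ArgumentError.new("chunk_size must be a positive multiple of 3")
  end

  buffer = Bytes.new(chunk_size)
  loop do
    # IO#read may return fewer bytes than requested, so keep reading until
    # the buffer is full or the source is exhausted (read returns 0).
    filled = 0
    while filled < chunk_size
      count = source.read(buffer[filled, chunk_size - filled])
      break if count == 0
      filled += count
    end
    break if filled == 0

    # Full chunks are multiples of 3 bytes, so only the last one can be padded.
    Base64.strict_encode(buffer[0, filled], destination)
    break if filled < chunk_size
  end
end

source = IO::Memory.new("hello world")
destination = IO::Memory.new
base64_encode_stream(source, destination, chunk_size: 3)
encoded = destination.to_s
```

With a real file you would pass a `File` as `source` and a socket or another file as `destination`. The inner loop matters because a short read in the middle of the stream would otherwise produce a chunk that is not a multiple of 3 bytes.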

This is what I’m currently doing as a workaround. :-) I don’t want to keep it as a workaround, though. I’m looking for a first-class solution and I was hoping someone knew of something.

I implemented it by monkeypatching Base64.encode (and a couple of methods downstream from it) in my app, and it actually fits pretty well into the Base64 module, so I'm considering pushing it up as a PR to Crystal. This isn't the first time I've needed this, and I assume others who also need it have simply been accepting that they have to load an entire file into memory to base64-encode it over the wire.

Sounds like a good idea to upstream this :+1:

Hi, is there a link to the code? Thanks.

I posted my implementation as a PR to the Crystal stdlib.

I closed it because someone had some strong opinions on it and opened their own PR. Their PR hasn't been touched since three days after it was opened, though, so I'll either reopen mine or release it as a shard. The important thing is that the functionality is supported.
