Read chunks from file

I’m just a hobbyist who would like to rewrite some Ruby scripts as compiled programs so I can share them more easily in the office. Several of the scripts read a certain number of bytes from non-text files, do something with that chunk, and write the result to a separate file. Real simple stuff. I’m struggling to find documentation on how to read a string of bytes of a certain length.

I’d check out the IO API docs (IO - Crystal 1.11.2). There is also a #read_fully method that raises an error if it can’t read the exact number of bytes it was asked for.
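For anyone landing here later, a minimal sketch of the #read_fully approach, assuming you want fixed-size chunks plus a shorter final one. The each_chunk helper and the temp file are made up for illustration and are not part of the standard library. Because #read_fully raises IO::EOFError when it cannot fill the slice, the sketch sizes each slice from the bytes remaining in the file.

```crystal
# Hypothetical helper (not in the standard library): yields the file's
# contents as Bytes chunks of at most `chunk_size` bytes each.
def each_chunk(path : String, chunk_size : Int32, &block : Bytes ->)
  File.open(path, "rb") do |file|
    remaining = file.size
    while remaining > 0
      # Never ask for more than is left, or read_fully raises IO::EOFError.
      chunk = Bytes.new(Math.min(chunk_size.to_i64, remaining))
      file.read_fully(chunk)
      yield chunk
      remaining -= chunk.size
    end
  end
end

# Usage: write 2500 bytes of sample data, then read it back in 1024-byte chunks.
path = File.tempname("chunks", ".bin")
File.write(path, Bytes.new(2500) { |i| (i % 256).to_u8 })

sizes = [] of Int32
each_chunk(path, 1024) { |chunk| sizes << chunk.size }
File.delete(path)

p sizes # => [1024, 1024, 452]
```

Each chunk is a fresh Bytes, so the block can keep it around. Reusing a single buffer would save allocations if the chunk is consumed immediately.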

Thanks for the pointer! (pun intended) I found that #read_string while keeping track of file position was a really easy way to do what I needed.

Even if the files you’re reading from are not textual? I’m pretty sure #read_string is only going to work when you’re reading UTF-8 encoded text.

Yes. The test file I used at first was a Tcl script that I had wrapped as a Windows executable (basically a zip file), so to be sure I ran the same test program on Windows’ calc.exe, and it did hash correctly. Here is the code I used to test:

def main
	# Copy ARGV[0] to "<ARGV[0]>.out" in chunks of at most 1024 bytes.
	# The block form of File.open closes each file automatically.
	File.open("#{ARGV[0]}.out", "wb") do |outfile|
		File.open(ARGV[0], "rb") do |infile|
			remaining = infile.size

			while remaining > 0
				# read_string raises IO::EOFError on a short read,
				# so never ask for more bytes than are left.
				read_length = Math.min(1024_i64, remaining)
				buf = infile.read_string(read_length)
				outfile << buf
				remaining -= read_length
			end
		end
	end
end

if ARGV.size < 1
	exit
end
main

Edit: I also ran this on Manjaro in a VM, and it seems to work there, too.

Binary data in strings works here, to an extent, because read_string takes a size in bytes rather than a number of characters. My Redis client uses it to read strings in the parser, and I store quite a bit of binary data in Redis: cached objects encoded via MessagePack.

I don’t think I’d do anything with binary data in a string beyond using the string as a sort of wrapper for a Slice, but it does work.
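A small sketch of what that looks like in practice, using an in-memory IO instead of a real file. The sample bytes are arbitrary and only there to include a byte that is not valid UTF-8. It suggests that the string keeps the raw bytes untouched and that #to_slice gets them back out.

```crystal
# Arbitrary sample data; 0xff can never appear in valid UTF-8.
raw = Bytes[0xff, 0x00, 0x4d, 0x5a, 0x90]

# read_string takes a byte count, so it reads exactly raw.size bytes.
str = IO::Memory.new(raw).read_string(raw.size)

puts str.bytesize        # number of bytes held by the string
puts str.valid_encoding? # false: the contents are not valid UTF-8
puts str.to_slice == raw # the underlying bytes are unchanged

# Writing the string to another IO emits the same bytes.
sink = IO::Memory.new
sink << str
puts sink.to_slice == raw
```

Character-oriented methods such as #size or #each_char are the ones to avoid on a string like this, since they interpret the bytes as UTF-8.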
