Read chunks from file

BloodFeastMan · March 28, 2024, 2:33pm

Just a hobbyist who would like to rewrite some Ruby stuff as compiled programs so I can more easily share them in the office. Several scripts read a certain number of bytes from non-text files, do something with that chunk, and write the result to a separate file, real simple stuff. I’m struggling to find documentation on how to read a string of bytes of a certain length.

Blacksmoke16 · March 28, 2024, 2:40pm

I’d check out IO - Crystal 1.11.2. There is also a #read_fully method that would error if it didn’t read the exact number of bytes it should have.
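
For reference, a minimal sketch of what that could look like with a reusable byte buffer. The helper name `copy_in_chunks`, the 1024-byte chunk size, and the in-memory test data are just assumptions for illustration; with real files you would pass the handles from File.open instead of IO::Memory.

```crystal
# Hypothetical helper: copies src to dst in fixed-size chunks and
# returns the number of bytes copied.
def copy_in_chunks(src : IO, dst : IO, chunk_size : Int32 = 1024) : Int64
  buffer = Bytes.new(chunk_size)
  total = 0_i64
  # IO#read returns how many bytes it put into the buffer; 0 means end of stream.
  while (n = src.read(buffer)) > 0
    chunk = buffer[0, n] # view of only the bytes that were actually read
    # ... do something with chunk here ...
    dst.write(chunk)
    total += n
  end
  total
end

# Made-up binary data standing in for a real file.
src = IO::Memory.new(Bytes.new(2500) { |i| (i % 256).to_u8 })
dst = IO::Memory.new
copied = copy_in_chunks(src, dst)

# read_fully fills the whole slice or raises IO::EOFError,
# which is handy for fixed-size headers.
src.rewind
header = Bytes.new(4)
src.read_fully(header)
```

Since IO#read may return fewer bytes than the buffer holds, the last chunk needs no special handling here.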

BloodFeastMan · March 28, 2024, 5:25pm

Thanks for the pointer! (pun intended) I found that #read_string while keeping track of file position was a really easy way to do what I needed.

Blacksmoke16 · March 28, 2024, 5:40pm

Even if the files you’re reading from are not textual? I’m pretty sure #read_string is only going to work when you’re reading UTF-8 encoded text.

BloodFeastMan · March 28, 2024, 5:54pm

Yes, the test file I used at first was a Tcl script that I had wrapped as a Windows executable, basically a zip file, so to be sure I ran the same test program on Windows’ calc.exe, and it did hash correctly. Here is the code I used to test:

require "io"

def main()
	outfile = File.open("#{ARGV[0]}.out", "wb")
	infile = File.open("#{ARGV[0]}", "rb")
	infile_size = infile.size

	read_length = 1024
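	while infile_size > 0
		# read_string raises IO::EOFError on a short read, so shrink the
		# final chunk (or the only chunk, for small files) before reading
		if infile_size < read_length
			read_length = infile_size
		end
		buf = infile.read_string(read_length)
		outfile << buf
		infile_size -= read_length
	end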
	infile.close
	outfile.close

end

if ARGV.size < 1
	exit
end
main

edit: I also ran this on Manjaro in a VM, and it seems to work there, too

jgaskins · March 29, 2024, 12:15am

Binary data in strings works here, to an extent, because read_string receives a byte size rather than characters. My Redis client uses it to read strings in the parser and I store quite a bit of binary data in Redis — cached objects encoded via MessagePack.

I don’t think I’d do anything with binary data in a string beyond using the string as a sort of wrapper for a Slice, but it does work.
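
A small sketch of that "string as a wrapper for a Slice" idea, assuming four made-up bytes that are not valid UTF-8:

```crystal
# Four arbitrary bytes; 0xff and a lone 0x80 are not valid UTF-8.
bytes = Bytes[0xff, 0x00, 0x80, 0x41]
io = IO::Memory.new(bytes)

# read_string takes a byte count, not a character count.
str = io.read_string(4)

byte_count = str.bytesize       # number of bytes held by the string
is_text = str.valid_encoding?   # whether the bytes form valid UTF-8
raw = str.to_slice              # the underlying bytes, unchanged
```

The bytes survive the round trip even though the string is not valid text, so anything character-based (size, indexing, case changes) is best avoided on it.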
