Unhandled exception in spawn: Could not create pipe: Too many open files


It appears I am using spawn incorrectly (please see the code below).
Why am I getting the "Error opening file…Too many open files" error?
Can someone please help? Thank you. The folder that is giving me trouble has 83 files in it. That does not seem like enough to run into a file descriptor limit problem, or am I wrong? ulimit -n shows 256 on my Mac.

Relevant detail
I am trying to run a Crystal script that uploads some local files to
DigitalOcean Spaces (which is like Amazon S3). The command is:

./msbcli uploadfiles --dest images --src /Users/username/images/testfolder 

msbcli is my Crystal script that calls the aws command line tool to do the job.
The command worked great for some folders. However, for one folder that contains 83 files, I get the following errors:

Unhandled exception in spawn: Could not create pipe: Too many open files (Errno)
Unhandled exception in spawn: Error opening file ‘/dev/null’ with mode ‘r’: Too many open files (Errno)
Unhandled exception in spawn: Error opening file ‘/dev/null’ with mode ‘r’: Too many open files (Errno)
Unhandled exception in spawn: Error opening file ‘/dev/null’ with mode ‘r’: Too many open files (Errno)
Unhandled exception in spawn: Error opening file ‘/dev/null’ with mode ‘r’: Too many open files (Errno)
Many lines like the above follow, along with some successful upload messages,
and then the script hangs (probably because channel.send or channel.receive is blocking after the above errors).

Relevant Code
The code that the above command calls is:

def put_multiple_local_files(path_to_dir : String,
                             key_prefix : String,
                             ext_incl : String = "",
                             ext_excl : String = "",
                             permission : String = "public-read")

    channel = Channel(String).new
    num_files = 0
    Dir.each_child(path_to_dir) do |file|
        num_files += 1
        spawn do
            res = put_local_file(File.join(path_to_dir, file), key_prefix)
            channel.send(res[:msg])
        end
    end
    num_files.times do |_|
        val = channel.receive
        puts val
    end
end

The put_local_file function is:

def put_local_file(local_file : String, key_prefix : String, filename : String = "", permission : String = "public-read")
    if filename == ""
        content_type = MIME.from_filename?(local_file) || "application/octet-stream"
        fname = File.basename(local_file)
    else
        content_type = MIME.from_filename?(filename) || "application/octet-stream"
        fname = filename
    end

    cmd = %(aws s3api put-object \
                --bucket #{BUCKET_NAME} \
                --profile #{PROFILE} \
                --endpoint-url #{ENDPOINT_URL} \
                --key #{key_prefix}/#{fname} \
                --body #{local_file} \
                --acl #{permission} \
                --content-type #{content_type})

    stdout = IO::Memory.new
    stderr = IO::Memory.new
    status = Process.run(cmd, shell: true, output: stdout, error: stderr)
    if status.success?
        ## on success, the command returns a json object with ETag (the MD5-hash)
        ## of the file. The ETag has extra double quotes in it, which need to
        ## be stripped to get the MD5-hash.
        output = JSON.parse(stdout.to_s)["ETag"].as_s
        result = output.gsub(/\"/, "")
        return {type: "success", msg: "#{fname} uploaded to #{key_prefix}.", result: result}
    else
        return {type: "error", msg: stderr.to_s, result: ""}
    end
end

Does the --src /Users/username/images/testfolder directory have too many files? Maybe you can try limiting the number of fibers, for example uploading 5 files at the same time instead of all of them at once, like this.
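A hypothetical sketch of that idea, reusing the put_local_file function from above. The WORKERS constant and the jobs/results channel names are my own, not part of the original script:

```crystal
# Sketch: bound concurrency with a fixed pool of worker fibers, so at most
# WORKERS external `aws` processes (and their pipes) exist at any time.
WORKERS = 5

def put_multiple_local_files_limited(path_to_dir : String, key_prefix : String)
  files = Dir.children(path_to_dir)
  jobs = Channel(String).new
  results = Channel(String).new

  WORKERS.times do
    spawn do
      loop do
        # receive? returns nil once the channel is closed and drained
        file = jobs.receive? || break
        res = put_local_file(File.join(path_to_dir, file), key_prefix)
        results.send(res[:msg])
      end
    end
  end

  # Feed the job queue from a separate fiber, then close it so workers exit.
  spawn do
    files.each { |f| jobs.send(f) }
    jobs.close
  end

  files.size.times { puts results.receive }
end
```

With this shape, the number of simultaneously open pipes stays proportional to WORKERS rather than to the number of files in the folder.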

Ulimit means user limit. So if you have multiple processes running under that same user, they all add up to the number of open files. ulimit -n = 256 seems like a very low setting. Especially if you intend to do mass file operations. Not sure what you’re working on (maybe it’s a shared system?), but if you can, you should consider increasing that.

So increasing the ulimit would probably help to remove the immediate problem, but it's actually a sign there may be issues with your code.

It seems you’re essentially starting a process for every file in the folder concurrently. That works for smaller amounts, but is not really efficient at scale. As @Blacksmoke16 already suggested, you should limit the amount of concurrently running jobs to keep the number of simultaneously opened files low. If the number of concurrent uploads exceeds the upload capacity, they’re all fighting for the available bandwidth anyway. So it’s better to limit the number of concurrent uploads in the first place.

Thank you for the replies @Blacksmoke16 and @straight-shoota.

Increasing the ulimit worked, so it was a ulimit problem. However, for (perhaps) a better solution, I now use the aws s3 tool instead of aws s3api, which provides concurrent uploads without me needing to spawn multiple processes. It perhaps does it the way you both suggested.
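For reference, the high-level `aws s3` command parallelizes uploads internally, and its concurrency is tunable via the CLI's `s3.max_concurrent_requests` setting. A sketch of the kind of invocation I mean; the bucket name, profile name, and endpoint URL below are placeholders:

```shell
# Sketch: upload a whole folder in one call; the AWS CLI handles the
# concurrent uploads itself (default: 10 parallel requests).
aws s3 cp /Users/username/images/testfolder "s3://my-bucket/images" \
    --recursive \
    --acl public-read \
    --profile my-profile \
    --endpoint-url https://nyc3.digitaloceanspaces.com

# Optionally tune the CLI's own upload concurrency:
aws configure set default.s3.max_concurrent_requests 5
```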

My follow-up question is: was this limitation a result of concurrently running multiple copies of an external CLI command, with those external processes counting against the ulimit? A spawned process in Crystal (if it is not running an external command on the system) won’t hit the ulimit so quickly, right? For example, if I was doing

spawn do
     ## make an http request that takes long to respond
end
would that also hit the ulimit so quickly, or can I spawn many of them?

Thank you.

spawn does not spawn processes, but fibers. I think the distinction is important when talking about ulimits. (See the Concurrency section of the Crystal documentation.)

If you call an external command, mostly the same things happen as if you called that command in a script or shell: each one is a separate process with its own file handles on stdin/out/err, plus whatever other files it touches.

If you use Crystal’s spawn, Crystal handles the concurrency itself. No process is created, not even a thread (at least currently, as of v0.34).

This is called a fiber and described in a bit more detail in the link above.
So, no, a spawn would not hit that limit so quickly, as long as you don’t open 256 files in total at the same time.

It doesn’t really matter whether you do the job in a Crystal fiber (that’s the equivalent of a process in Crystal’s runtime) or an external process. The limiting metric is the number of open files. Whether you open those files in the Crystal (main) program or in a different process is not relevant; they count towards the same user limit. Note, however, that spawning processes may need additional open files for inter-process communication.
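To make the distinction concrete, here is a small illustration of my own (not from the thread): idle fibers cost no file descriptors, while each external process started with captured output does consume some for the duration of the call:

```crystal
# A thousand fibers waiting on something slow: no file descriptors are
# consumed, because fibers are scheduled in-process by Crystal's runtime.
done = Channel(Nil).new
1000.times do
  spawn do
    sleep 10.milliseconds # stand-in for e.g. a slow HTTP response
    done.send(nil)
  end
end
1000.times { done.receive }

# By contrast, each Process.run that captures output needs pipes (plus the
# child's own stdin/stdout/stderr), so running many of these concurrently
# exhausts `ulimit -n` quickly.
Process.run("echo hi", shell: true, output: IO::Memory.new)
```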

Thank you for clarifying that @straight-shoota.

Crystal community is blessed to have members such as you, @asterite and @Blacksmoke16, who are so patient with the beginners.


FWIW I think you’ve been @-mentioning the wrong person, considering I haven’t replied to this thread :stuck_out_tongue:. But thanks, I try :wink:.

I think the description is still valid when also considering other threads ;)


I included your name because you have replied to me numerous times :) I mentioned @asterite for the same reason. While Crystal is awesome, I am not sure I could have stuck with it if some of you had not helped with my earlier questions.

@orangeSi and @mavu thanks for your replies too in this thread. Sorry for not acknowledging your help in this case.