Log output blocked until program completes

Hi everyone, I’m having some issues with Log.info calls on a script I’m running.

The problem is the logs won’t output to console until after the program finishes executing. Perhaps there’s something I’m missing in how IO is handled in my code. Here’s a quick overview of what I have in place:

  1. I’m using SHAInet shard to process a CSV file with data. SHAInet has Log.info calls to inform about the progress. They use Log = ::Log.for(self) in the shard for logging. None of these will print anything until after the program exits.
  2. Log.info calls in my program won’t log until after the program exits either, but puts calls do print to console immediately though.
  3. I’m using Crystal version 1.9.2
  4. The structure of the code is a simple script.cr file and it looks like this (below):
require "shainet"
require "log"
Log.setup_from_env

Log.info { "This should print first" }
puts "This should print second... Right?"
data = SHAInet::Data.new_with_csv_input_target("winners_data.csv", 0..120, 121)
training_set, test_set = data.split(0.67)
cc_nn = SHAInet::Network.new

...
# other SHAInet specific code to train & test the network
...

Log.info { "Done!" }

I’m running the code with LOG_LEVEL=INFO crystal run src/script.cr. Separating the call into build and then run the executable makes no difference.

The program seems like it hangs without any log output, but checking the process it’s indeed doing some work (high CPU usage). Once the program exists I see all logs being printed to console, which looks like this:

This should print second... Right?
2023-10-10T18:22:10.775944Z   INFO - This should print first
2023-10-10T18:22:10.823991Z   INFO - sha_inet.data: Labels are up, left, down, right
2023-10-10T18:22:10.838033Z   INFO - sha_inet.data: Selected 612 / 914 rows for training
2023-10-10T18:22:10.846411Z   INFO - sha_inet.data: Selected 302 / 914 rows for testing
2023-10-10T18:22:10.851544Z   INFO - sha_inet.network: Training started
2023-10-10T18:22:10.851551Z   INFO - sha_inet.network: Epoch: 0, Total error: 1.0, MSE: 1.0
2023-10-10T18:22:19.836287Z   INFO - sha_inet.network: Epoch: 100, Total error: 0.2321057247592544, MSE: 0.005866983734104936
2023-10-10T18:22:29.095057Z   INFO - sha_inet.network: Epoch: 200, Total error: 0.11222741193807531, MSE: 0.004784340070470271
2023-10-10T18:22:38.520245Z   INFO - sha_inet.network: Epoch: 300, Total error: 0.02977150206250756, MSE: 0.003966882967193343
2023-10-10T18:22:47.966916Z   INFO - sha_inet.network: Epoch: 400, Total error: 0.002911991526582036, MSE: 0.0033767864299002904
2023-10-10T18:22:57.490483Z   INFO - sha_inet.network: Epoch: 500, Total error: 0.00014734574689323254, MSE: 0.003306130354475695
2023-10-10T18:23:07.042107Z   INFO - sha_inet.network: Epoch: 600, Total error: 2.235918069390086e-5, MSE: 0.003247342412562554
2023-10-10T18:23:16.598932Z   INFO - sha_inet.network: Epoch: 700, Total error: 2.2635012710536867e-6, MSE: 0.0031576429285392026
2023-10-10T18:23:26.149360Z   INFO - sha_inet.network: Epoch: 800, Total error: 3.0468996719399124e-7, MSE: 0.00305942063498404
2023-10-10T18:23:35.748695Z   INFO - sha_inet.network: Epoch: 900, Total error: 2.8781460029084063e-8, MSE: 0.0029825842902794163
2023-10-10T18:23:45.406833Z   INFO - sha_inet.network: Epoch: 1000, Total error: 2.2839988214081857e-9, MSE: 0.0027512012120866954
2023-10-10T18:23:45.406837Z   INFO - sha_inet.network: Training finished. (Elapsed: 00:01:34.557602792)
2023-10-10T18:23:45.411890Z   INFO - sha_inet.network: Predicted 127 out of 302 (42.05298013245033% accuracy)
2023-10-10T18:23:45.411890Z   INFO - Done!

I’ve tried configuring the Log like this as well:

Log.setup do |c|
  backend = Log::IOBackend.new

  c.bind "*", :debug, backend
end

But still no luck. Logs only appear after the program exits.

I’m thinking there’s something blocking the IO to STDOUT in some way but I haven’t been able to find the solution on my own. Any ideas/suggestions on how to solve this?

If I had to guess, I think the catch is the default log dispatcher is async, meaning there’s a fiber that needs to run at some point for it to actually deliver its messages. If your logic is CPU heavy and not giving a chance to that fiber it makes sense it would have a chance when that logic finishes and before the program exits.

Try doing like Log::IOBackend.new dispatcher: :sync (or possibly :direct) and see if that helps. It may come at a hit to perf tho, so keep that in mind.

Alternately, your logic may wish to include some sleep 0 or Fiber.yield in strategic places to give a chance for fibers to run in the middle of your logic. Another option would be to run in MT mode such that the fiber can run on its own thread vs being blocked by the main fiber.

1 Like

Absolutely nailed it! Both Log::IOBackend.new dispatcher: :sync and using -Dpreview_mt flag worked like charm. Most of the CPU intensive work lives within the SHAInet code so I can’t place Fiber.yield calls on there AFAICT.

I’ll probably stick to -Dpreview_mt flag for the better performance yield. Possibly a very small difference either way, but it’s something. Overall I’m tweaking the parameters I’m using to train the network, so a few small edits in between each run (all manual yet) and visual feedback on them helps a lot.

Thanks for the very quick reply here :pray:t3:

1 Like