I’m having an issue in my application which causes Time.parse_rfc2822 to raise due to bad input data. I haven’t yet sorted out where the bad data is coming from because it’s intermittent, but in the mean time I’ve had the intent to make the failure more graceful.
Time.parse_rfc2822 raises an exception, but I was hoping to find a method like Time.parse_rfc2822? which would return Nil or some other Invalid.
Unhandled exception: Unexpected char: '>' at 4: "Fri >>>>13 Sep 2024 12:55:00 GMT" (Time::Format::Error)
from /usr/lib/crystal/time/format/parser.cr:598:9 in 'raise'
from /usr/lib/crystal/time/format/parser.cr:594:5 in 'raise'
from /usr/lib/crystal/time/format/parser.cr:514:9 in 'char'
If possible I’d like to avoid this, but I don’t yet see a way around it.
I would appreciate that as well. Formatted time strings are often from unknown origins and of unknown quality. Parsing errors should be expected.
Unfortunately, this likely requires a major rewrite of Time::Format::Parser because its structure is focused very much on error handling via exceptions.
Maybe we could test what effect it has if we just avoid the most costly operation of raising an exception, unrolling the call stack. For consumers of the parse API, the exact error location is pretty much irrelevant. So there’s no need to provide that information.
We could end up with assigning an empty call stack to every Exception raised by the parser. And then there could be two entry point methods, one which reraises the exception with local call stack. And one which returns nil.
@@dummy_callstack = Exception.CallStack.new
def self.parse_rfc2822_internal(string)
# on error:
raise Exception.new("message").tap(&.callstack=@@dummy_callstack)
end
def self.parse_rfc2822(string)
parse_rfc2822_internal(string)
rescue exc
exc.callstack ||= caller
end
def self.parse_rfc2822?(string)
parse_rfc2822_internal(string)
rescue exc
nil
end
I appreciate the idea to save the callstack build time, but that doesn’t really change the pain point I’m having. The error state notification is still an exception, it’s just a cheaper exception to raise.
It’s unfortunate that the API of Time::Format::Parser is based around this idea, but only because Crystal style has evolved in the other direction, and replacing the API with something safer is going to have to wait to ship until a breaking API changeset. I have no idea how long that will be, but it’s not worth waiting for my place.
For my code, I went with this:
def parse_time(date : String) : Time?
# Format: Mon, 15 Feb 2016 04:35:50 UTC
format_regex = /\A\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}\z/
unless format_regex.matches? date
return
end
Time.parse_rfc2822 date
end
I’m confident it’ll break at some point, but I’m also confident it’ll catch most of the errors caused by bad data I’m dealing with.
I have been playing with the idea of an exception-free parser that still conforms to Time::Format::Pattern’s interface, which looks roughly like this:
struct MyParser
include Time::Format::Pattern
private alias Error = String
@error : Error?
def year
unless @error
@year = assert consume_number(4)
end
end
def month
unless @error
@month = assert consume_number(2)
end
end
def day
unless @error
@day = assert consume_number(2)
end
end
def char(char, *alternatives) : Error?
unless @error
unless @reader.has_next?
if alternatives.empty?
return @error = "Expected #{char.inspect} but the end of the input was reached"
else
return @error = "Expected one of #{char.inspect}, #{alternatives.join(", ", &.inspect)} but reached the input end"
end
end
unless char?(char, *alternatives)
return @error = "Unexpected char: #{current_char.inspect}"
end
end
end
def consume_number_i64(max_digits)
consume_number_i64?(max_digits) || "Invalid number"
end
private macro assert(exp)
%ret = {{ exp }}
return @error = %ret if %ret.is_a?(Error)
%ret
end
end