Early on, I formed a rule of thumb: always store and send datetimes as epoch timestamps. Trivial serialisation, universal support, what’s not to love?
Readability. I dislike binary formats. I feel the success of the internet has a lot to do with the textual formats that it inherited from the Unix school of pipes. While having an epoch timestamp in the JSON is not a binary encoding in the technical sense, it has similar flaws.
But this post is not about readability. This post is about how my rule of thumb ossified into a blanket rule, and combined with the fact that I had incomplete knowledge, it became a rule I kept parroting without understanding all the tradeoffs involved.
You see, after latching on to epoch timestamps, I never gave the other formats a deeper look. I thought the only difference between epoch timestamps and RFC whatever datetime strings was readability. But no, that’s not all; there is one more difference.
The RFC strings simply encode more information! They encode a point in time (like epoch timestamps do), but they also encode the offset from UTC.
This second bit of information is lost when we serialise or store datetimes as epoch timestamps. Usually, we don't care about it, which is why my rule of thumb worked fine for so long. But this second bit of information, the local time offset, is critical for certain cases.
For example, say you spend a great New Year’s Eve in Melbourne, and years later you recall those moments and try to search for them by typing "night" into your photos app (IDK, say, Ente Photos 😉). This won’t work if your app uses UTC for searching - last year's sun was still shining in Greenwich whilst you were welcoming the new year in Australia.
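To make this concrete, here is a small sketch in Python. The photo timestamp and the `is_night` rule are made up for illustration (they are not how any particular photos app works); the point is only that the same instant gives a different hour of day depending on whether we keep the local offset.

```python
from datetime import datetime, timezone

# Hypothetical photo taken just after midnight on New Year's Day in Melbourne,
# where daylight saving time puts the local offset at +11:00.
taken_at = datetime.fromisoformat("2024-01-01T00:05:00+11:00")


def is_night(hour: int) -> bool:
    # Illustrative rule only: treat 21:00 to 04:59 as "night".
    return hour >= 21 or hour < 5


# Hour of day as the photographer experienced it.
local_hour = taken_at.hour
# Hour of day if we only had the instant and interpreted it in UTC.
utc_hour = taken_at.astimezone(timezone.utc).hour

print(local_hour, is_night(local_hour))  # 0 True
print(utc_hour, is_night(utc_hour))      # 13 False
```

With only the UTC view, the photo looks like it was taken in the early afternoon, and a search for "night" would skip it.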
Luckily for me, there are smarter people working at Ente who understood this nuance and explained it to me. So this post is just me writing my personal experience of how rules of thumb sometimes end up as wallpapers over ignorance, and how for certain cases, using an RFC 3339 string is a better representation of datetimes.
One might say: why make the big jump? Just keep using epoch timestamps, and additionally store a UTC offset. The thing is, that is effectively what RFC 3339 strings already are, and we’d just be reinventing the wheel. Maybe such a gradual approach is necessary for migrating existing codebases, but if starting from scratch it’s better to use an existing standard.
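As a rough sketch of that equivalence, the two helpers below convert between an (epoch seconds, offset in minutes) pair and an RFC 3339 style string using only Python's standard library. The function names and the sample values are my own, chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(epoch_seconds: int, offset_minutes: int) -> str:
    """Render an (epoch, offset) pair as a single datetime string."""
    tz = timezone(timedelta(minutes=offset_minutes))
    return datetime.fromtimestamp(epoch_seconds, tz).isoformat()


def from_rfc3339(text: str) -> tuple[int, int]:
    """Split a datetime string back into (epoch seconds, offset minutes)."""
    dt = datetime.fromisoformat(text)
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("input has no UTC offset")
    return int(dt.timestamp()), int(offset.total_seconds() // 60)


# Hypothetical values: an instant on New Year's Day at +11:00 (660 minutes).
text = to_rfc3339(1704027900, 660)
print(text)                # 2024-01-01T00:05:00+11:00
print(from_rfc3339(text))  # (1704027900, 660)
```

Nothing is gained or lost in either direction, which is why the string form is a drop-in replacement for the pair. One caveat: `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onwards, so older versions need the numeric `+00:00` form.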
Talking of standards, it is easy to get swamped by ISO this, RFC that when talking about datetime representations, and end up thinking that there are multiple of them flying around.
No, they’re all the same thing. RFC 3339 introduces itself this way:
This document defines a date and time format for use in Internet protocols that is a profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar.
So effectively, both of them are the same thing:
- ISO 8601 is an international standard for date times,
- RFC 3339 describes how to use ISO 8601 over the internet.
That's it. There are small differences, yes, but those don't matter unless you're writing your own parser.
Now coming back to our motivating example. The RFC itself mentions a very similar example of why it chooses to store the UTC offset in addition to the datetime itself:
The offset between local time and UTC is often useful information. For example, in electronic mail the local offset provides a useful heuristic to determine the probability of a prompt response. Attempts to label local offsets with alphabetic strings have resulted in poor interoperability in the past. As a result, RFC2822 has made numeric offsets mandatory.
Note that we're not storing the time zone, that's a complicated can of worms. We're just storing a numeric offset.
The RFC also mentions how to deal with situations where the time in UTC is known but the local offset is not. The whole thing is quite short, give it a scan if you’re interested in more details.
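The convention the RFC uses for that case is an offset of `-00:00`, which it distinguishes from `Z` or `+00:00` (those say that UTC is the preferred reference point). Python's parser collapses `-00:00` into plain UTC, so if the distinction matters it has to be checked on the string itself. A minimal sketch, with a helper name of my own invention:

```python
from datetime import datetime


def parse_with_offset_flag(text: str) -> tuple[datetime, bool]:
    """Return the parsed instant and whether its local offset is known."""
    # RFC 3339 uses "-00:00" to mean "UTC time known, local offset unknown".
    offset_known = not text.endswith("-00:00")
    return datetime.fromisoformat(text), offset_known


dt, known = parse_with_offset_flag("2023-08-23T12:33:00-00:00")
print(dt.isoformat(), known)  # 2023-08-23T12:33:00+00:00 False
```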
It would perhaps be fitting to give an example too. Here is the epoch timestamp of when Chandrayaan 3 landed near the Moon's south pole recently:
1692793980
And here it is, as an RFC 3339 string:
2023-08-23T18:03:00+05:30
Both represent the same moment, but the RFC 3339 string is not just more readable, it also tells us what time it was for me, when I was watching it live.
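If you want to check this yourself, here is a short Python sketch using only the standard library. It parses the string, compares it with the epoch value, and shows what can be rebuilt from the epoch alone (I picked UTC for the rebuild, but any zone would be an equally arbitrary choice).

```python
from datetime import datetime, timezone

epoch = 1692793980
landing = datetime.fromisoformat("2023-08-23T18:03:00+05:30")

# Both values name the same instant.
print(int(landing.timestamp()) == epoch)  # True

# Going back from the epoch, we have to choose a zone ourselves;
# the original +05:30 is not recoverable from the number.
rebuilt = datetime.fromtimestamp(epoch, timezone.utc)
print(rebuilt.isoformat())  # 2023-08-23T12:33:00+00:00
print(landing.isoformat())  # 2023-08-23T18:03:00+05:30
```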
UTC is not enough, 24 hours is a long time.