- cross-posted to:
- [email protected]
- ISO 8601 is paywalled
- RFC allows a space instead of a T (e.g. 2020-12-09 16:09:…) which is nicer to read.
Top post of the hour is about an RFC from >20 years ago.
This is worse than the Linux stuff.
Y’all a bunch of nerds
You’re not wrong
Room for one more
One of us, one of us, one of us
Being a nerd is fun.
Thanks /u/OsrsNeedsF2P!
I’m a Linux nerd and even I don’t get this 😭
A space is more problematic than a T tho
Skill issue
For a skilled pro like you I suggest using epoch time for everything
Cassandra uses epoch milliseconds for timestamping snapshots. This means that each node will have a different name for the same snapshot. Trivially solved with truncating the timestamp with * wildcard, but just… why?
Any other day I’d see this get laughs, but I guess people are bitchier this time of day.
I’d write down the ISO timecode I’m talking about, but I can’t afford it.
You’ve just become the nemesis of the entire unix-like userbase for praising the space.
What’s the issue with the space?
On the command line, space is what separates each argument. If a path contains a space, you either have to quote the entire path, or use an escape character (e.g. the `\` character in most shells, the backtick in Powershell because Microsoft is weird, or the character’s hexadecimal value), otherwise the path will be passed to the command as separate arguments. For example, `cat hello world.txt` would try to print the files `hello` and `world.txt`.
It is a good practice to minimize the character set used by filenames, and best to only use English alphanumeric characters and certain symbols like `-`, `_`, and `.`. Non-printable characters (like the ASCII control characters), weird diacritics (like ő or ű), ligatures, or any characters that could be misinterpreted by a program should be avoided.
This is why byte-safe encodings, like base64 or percent-encoding, are important. Transmitting data directly as text runs the risk of mangling the characters because some program misinterpreted them.
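To make the splitting concrete, here is a small Python sketch. It uses the standard library's `shlex` module, which follows POSIX-shell-like splitting rules; it is an approximation of what a real shell does, not a replacement for testing in your own shell.

```python
import shlex

# Unquoted: the splitter sees two separate file arguments.
unquoted = shlex.split("cat hello world.txt")

# Quoted or backslash-escaped: the space stays inside a single argument.
quoted = shlex.split('cat "hello world.txt"')
escaped = shlex.split(r"cat hello\ world.txt")

# shlex.quote wraps a value so it survives being pasted into a command line.
safe = shlex.quote("hello world.txt")

print(unquoted)
print(quoted)
print(escaped)
print(safe)
```

The unquoted form yields three items, the other two yield two, which is the whole argument in miniature.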
but what does the command line matter for dates? sure every once in a while you’ll have to pass a date as an argument on the command line but I think usually that kind of data is handled by APIs without human intervention, so once these are set up properly, I don’t see the problem
rsync -a "somedir" "somedir_backup_$(date)"
If the `date` command returns an RFC-3339-formatted string, the filename will contain a space. If, for example, you want to iterate over the files using `for d in $(find...)` and forget to set `$IFS` properly, it can cause issues.
But `$(date)` does return a string with spaces, at least on every system I’ve ever used. And what’s so bad about the possibility of spaces in filenames? They’re slightly inconvenient in a command line, but I haven’t used a computer this century that didn’t support spaces in filenames.
Bro, literally re-read the comment you replied to. It has an example of what might happen.
Ok, I just reread it. I don’t see what you think I’m missing. You mean an improperly written find command misbehaving? The fact that a different date format could prevent a bug from manifesting doesn’t seem like much of an argument.
Both arguments are surrounded by `"`, which should be space-safe.
At least in the shells I use, putting `"` makes spaces inside paths a non-issue.
For the `rsync` command, yes. But this:
for d in $(find . -type d); do echo "$d"; done
will process the space-separated parts of each path as separate items. I had to work around this issue just two days ago, it’s an obscure thing that not everyone will keep in mind.
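A rough Python simulation of that loop, for anyone who wants to see the effect without touching a shell. The `find_output` string is made up for illustration, and `str.split()` only approximates default `$IFS` splitting (it ignores globbing, for one).

```python
# Hypothetical output of `find . -type d`: one path per line, one with a space.
find_output = "./somedir\n./somedir_backup_2023-12-12 21:18\n"

# Roughly what an unquoted $(find ...) does with the default IFS:
# split on any run of whitespace, so the second path falls apart.
word_split = find_output.split()

# Splitting on newlines only keeps each path whole
# (it would still break on a newline inside a filename).
line_split = find_output.splitlines()

print(len(word_split), word_split)
print(len(line_split), line_split)
```

Two directories go in, three loop items come out of the whitespace split.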
Hm, I guess I just don’t agree that CLI usability comes before readability.
Again, it’s not just CLI, it’s an insurance against misinterpreted characters breaking programs.
honestly, if a space breaks your program, it’s kind of a shit program.
I’m not exactly fond of the space either, but man, the T is noisy. They could’ve gone with an underscore or something, so it actually looks like two different sections.
deleted by creator
allows, not requires. It basically means you can use space instead of T when showing it to end users and any technical person can just use T
deleted by creator
The amount of things allowed by ISO 8601 is even more than what’s allowed by RFC 3339, if you take the time to look at https://ijmacd.github.io/rfc3339-iso8601/
It’s really a skill issue if replacing `T` by `[T ]` in your regexp is hard
This is the most junior developer comment I’ve seen in a while.
Nobody that’s competent thinks that shit is hard. That’s not the point.
The point is, it makes it easy to make mistakes. Somebody might see all of one type of strings, assume that’s the format, and forget to enclose the thing in quotes, causing mysterious bugs years later when a differently created date filters into the system. You might have a regex error, you might split incorrectly, you might make a query that works the wrong way and gives an incorrect aggregate, and none of that is due to lack of skill. It’s due to not knowing it’s the rfc standard, not the iso. It could be due to not even realizing the rfc allows for that or is different.
Software engineering in practice is not about making sure there is at least some way for people to use your library/standard/pattern. It’s about making sure the way to do it that’s most intuitive/obvious is also foolproof, easy, and efficient. Adding the space makes debugging harder and adds footguns which is exactly what good software engineers want to stay away from. Otherwise we’d all be writing in assembly. But since you aren’t, maybe you are the one with a skill issue. Either that or you really misunderstand this field.
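For reference, the regex change being argued about might look like the sketch below. Both patterns are hypothetical and deliberately simplified (date, time with seconds, optional fraction, then `Z` or a numeric offset); neither is a full RFC 3339 or ISO 8601 validator.

```python
import re

# Hypothetical, simplified patterns for illustration only.
TAIL = r"\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})"
STRICT_T = re.compile(r"\d{4}-\d{2}-\d{2}T" + TAIL)
T_OR_SPACE = re.compile(r"\d{4}-\d{2}-\d{2}[T ]" + TAIL)

with_t = "2023-12-12T21:18:00Z"
with_space = "2023-12-12 21:18:00Z"

# Code written against only the T form silently rejects the space form.
strict_accepts_t = STRICT_T.fullmatch(with_t) is not None
strict_accepts_space = STRICT_T.fullmatch(with_space) is not None
lenient_accepts_both = all(
    T_OR_SPACE.fullmatch(s) is not None for s in (with_t, with_space)
)

print(strict_accepts_t, strict_accepts_space, lenient_accepts_both)
```

The fix is one character class. The failure mode is the parser that never got it because nobody knew the second form existed.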
The difference:
2023-12-12T21:18Z is ISO 8601 format
2023-12-12 21:18 is RFC 3339 Format
A small change
ISO 8601 also allows for some weird shit. Like `2020-W01-1`, which actually means 2019-12-30. There’s a lot of cruft in that standard.
Doesn’t the ISO also include time periods? Because if it does, those are amazing.
Without any explanation, you should be able to decipher these periods just by looking at them:
- P1Y
- P6M2D
- P1DT4H
- PT42M
Hmm I don’t get the T there tbh
It makes the difference between M meaning month or M meaning minute. Small differences.
So it’s redundant in P1DT4H? Or is it a mandatory separator between ymd and hms?
It’s mandatory, which also makes it nice and predictable.
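The period strings in this subthread can be taken apart with a small regex. This is only a sketch: `parse_duration` is a made-up helper, and it handles whole-number `PnYnMnDTnHnMnS` strings, not weeks (`PnW`), fractions, or the other forms the standard is said to allow.

```python
import re

# Date part before the T, time part after it. The T is what lets "M"
# mean months on the left and minutes on the right.
DURATION = re.compile(
    r"P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)

def parse_duration(text):
    """Return the components present in a simple ISO 8601 duration."""
    match = DURATION.fullmatch(text)
    # Reject a bare "P" and a dangling "T" with nothing after it.
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"not a supported duration: {text!r}")
    return {k: int(v) for k, v in match.groupdict().items() if v is not None}

for example in ("P1Y", "P6M2D", "P1DT4H", "PT42M", "P42M"):
    print(example, parse_duration(example))
```

`PT42M` comes back as 42 minutes and `P42M` as 42 months, which is exactly the ambiguity the separator removes.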
This is the killer for me. Most people promote ISO 8601 as a “definitive” date structure, when it actually supports a lot of different formats. What they actually want is usually RFC 3339.
Week numbers are convenient for projects in which key delivery dates are often expressed in how many weeks out they are.
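If you want to check what a week date maps to, Python's `date.fromisocalendar` (3.8+) does the conversion. The `parse_week_date` helper below is a made-up convenience for the `YYYY-Www-D` spelling and does no validation beyond what `fromisocalendar` itself performs.

```python
from datetime import date

def parse_week_date(text):
    """Convert a 'YYYY-Www-D' week date into a calendar date."""
    year, week, weekday = text.split("-")
    return date.fromisocalendar(int(year), int(week.lstrip("W")), int(weekday))

# Week 1 is the week containing the year's first Thursday, so its Monday
# can fall in the previous calendar year.
first_day_2020 = parse_week_date("2020-W01-1")
first_day_2009 = parse_week_date("2009-W01-1")

print(first_day_2020.isoformat())
print(first_day_2009.isoformat())

# Going the other way: which ISO week does a calendar date belong to?
print(tuple(date(2019, 12, 30).isocalendar()))
```

Handy for planning in weeks, but it also shows why a week-based year and a calendar year should never be mixed in the same field.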
wtf what is that gross
That Z is doing a lot of work.
Z indicates UTC. Alternatively, 2023-12-12T21:18-06 for US Central time in winter. The UTC offset at the end just tells you where the time is taken from. Usually Z is used since, well, it’s “universal,” but having a +13 or -06 or whatever else brings context, and allows computers to convert the string of text into a comparable time for event logs and such.
Yes. The RFC is missing something that explicitly indicates the time zone. The Z is a great unambiguous way of saying “yes, this is UTC.”
IMO, ISO 8601 is better for computers, people working with multiple time zones, or critical logging.
RFC 3339 is better used colloquially, while still remaining unambiguous for the use cases that most people use dates and times in.
I’d rather have an explicit time zone any time a datetime is being passed around code as a string. Communicating it to a human is relatively safe since even if there’s a mistake, it’s directly visible. Before that last step, incorrect time zone parsing or implicit time zone assumptions in code that was written by “who knows” in the year “who knows” can be really annoying.
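A minimal sketch of that policy in Python, assuming the strings are in a form `datetime.fromisoformat` accepts (a numeric offset like `+00:00` works on every version that has the method; a bare `Z` only on newer ones). `require_offset` is a made-up name.

```python
from datetime import datetime, timezone

def require_offset(text):
    """Parse a timestamp and refuse it if it carries no UTC offset."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {text!r}")
    # Normalise to UTC so values from different offsets compare directly.
    return parsed.astimezone(timezone.utc)

utc = require_offset("2023-12-12T21:18:00+00:00")
central = require_offset("2023-12-12T15:18:00-06:00")
print(utc == central)  # same instant, written with different offsets

try:
    require_offset("2023-12-12T21:18:00")
except ValueError as err:
    print(err)
```

Rejecting the naive string at the boundary is the design choice here; guessing a zone for it is where the "who knows" bugs come from.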
I couldn’t agree more!
There’s a new RFC in the pipeline that will address this.
It’s already been approved, just needs to slooowly crawl its way through the final publication queue.
Thanks for the link. Reading it gave me a headache. Not because of the proposal, but because of the very clear explanation it includes of just how annoying time zones are. I never even thought about the fact that a time relative to a UTC timestamp isn’t uniquely associated with another UTC timestamp because the local UTC offset can change. It’s obvious when you say it, but now I’m wondering if I have more time zone bugs somewhere.
It only just hit me a month or two ago just what a timezone, as described by IANA, actually is.
I’m from the eastern half of the US state of North Dakota. We run on what we’d colloquially call “central time”, often abbreviated CST. That’s UTC-6:00 in winter and UTC-5:00 in summer (technically CDT, but whatever).
Long ago I had it passed down to me from on high that the IANA timezone indicator I should use for my local time is `America/Chicago`. Ok. Easy enough. Why Chicago, though? I long guessed because it happens to be one of the largest localities in the CST block? That is in fact the answer if you read the rationale of the tz database, but I did not know this at the time.
What threw me off, though, is that there are other localities that seemingly map to the same time zone block. Like `America/Mexico_City`, or `America/Indianapolis`. What’s up with those? When I set my computer system clock to them, they behave just like `America/Chicago` does. Why are these here? And why these cities, specifically?
Then, imagine the loop I was thrown for when I discovered three timezone definitions exclusive to North Dakota. Those being `America/North_Dakota/Beulah`, `../../Center`, and `../../New_Salem`. What the fuck…?? These are literal nowhere towns. Midwest America is the middle of nowhere. North Dakota is the middle of nowhere within the Midwest. And these three towns are the middle of nowhere to the rest of us in North Dakota. What is going on? Why are there three tiny timezones in the middle of nowhere in the middle of nowhere in the middle of nowhere? And they’re all right next to each other!
Then, it clicked. What do these three places have in common? These towns all used to be in the next timezone over (“Mountain Time”, MST), but later decided to jump over to CST.
There’s a humorous story for why this happened. Supposedly, drinkers in the capital city, Bismarck, would stay to bar close. Then, they’d all hop in their cars and drunk drive to the sister city across the river, Mandan, for an extra hour of fun, causing untold chaos in the process. The jump was allegedly to curb this. Sadly, that story is apocryphal. In reality, it was just because it was economically favorable to be time-aligned with the state capital city. But I digress…
If you were, say, looking over historic records of events recorded in both Bismarck and Beulah, where records are always taken simultaneously, and your data happened to span back before this switchover, there would be an inexplicable point in time where after it the timestamps would match, but before it, they’d be offset. So, to encode that, Beulah gets its own unique timezone all to itself that indicates this historical switchover exists.
It also explains why there are three tiny timezones all right next to one another. Three counties participated in this switchover, and to make it happen, each one had to individually pass laws to enact it. These laws all took effect on slightly different dates. Thus, if we wish to capture the nuanced time shifts in all three counties, each county needs its own bespoke timezone.
IANA timezones aren’t just representations of all the time zones that currently exist. They are representations of every unique permutation of historic clock changes for every place on Earth. That’s fucking nuts! Knowing that, I went from being shocked that there are so many timezones to being shocked that the list of timezones is as short as it is!
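You can see the historical switchover directly with Python's `zoneinfo` (3.9+). This sketch assumes the system tz database (or the `tzdata` package) is installed, and `utc_offset` is a made-up helper that samples mid-January so daylight saving stays out of the picture. The years are picked to sit well on either side of the Beulah change as I understand the tz data.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def utc_offset(zone_name, year):
    """UTC offset in effect at local noon on 15 January of the given year."""
    local = datetime(year, 1, 15, 12, 0, tzinfo=ZoneInfo(zone_name))
    return local.utcoffset()

for zone in ("America/Chicago", "America/North_Dakota/Beulah"):
    for year in (2000, 2020):
        hours = utc_offset(zone, year) / timedelta(hours=1)
        print(f"{zone:30} {year}: UTC{hours:+.0f}")
```

Same wall clock today, different history, which is the only reason the second zone name exists.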
I definitely don’t agree that the RFC is easier to read, the two numbers can appear to be one at a quick glance without a separator.
But there is a separator between the numbers: the same one that also very reliably separates the words in this comment
- A single separator is better than a choice of separators to mean the same thing.
- A space is not as apparent in a large log of data as a capital T
- Human language is not as strict as a programming language. There is a reason you see people still using “alot” and “a lot”. That just proves it’s easy to overlook and commonly happens.
Both are valid (if you’d add seconds) in both RFC 3339 and ISO 8601, but timezone support is the same here and there…
It’s funny because everything about ISO 8601 is covered on its Wikipedia article. Very few people need to spend the francs to read the spec.
You HAVE to read the spec if you want to be compliant, you can’t just hope every detail is on wikipedia
Also, even if you fully respect the specs, I assume you can’t get certified as “compliant” by ISO if you didn’t pay for the specs ?
“HAVE” to like Germans HAVE to have their driving license to drive?
If you want to be compliant for a standard you need to have a copy of it. Luckily it’s only companies that really need to buy them
Which means the companies using the specs pay the company making the specs for everyone (companies and people) to use.
That sounds fair, but I wouldn’t be surprised if capitalism fucked it up anyhow.
Yeah I like a girl who is firm on her choice of date time format…😂😂😂😂
I personally have a list of 14 RFCs I won’t compromise on when it’s a first date
Please be serious and give me that list! Please be real!
Edit: guys, if they don’t answer, they might have just missed the question. They might be real. BELIEVE!!!
Do you care to share them?
Linux sex tips approved
I don’t even know what ISO 8601 is, but I agree with the sentiment
https://en.m.wikipedia.org/wiki/ISO_8601
Date format that is both human readable and for the most part sortable as strings (assuming you are using the same time zone).
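A quick Python check of the "sortable as strings" claim, including the same-offset caveat. The sample timestamps are made up, and the US-style list is only there for contrast.

```python
from datetime import datetime

stamps = [
    "2023-12-12T21:18:00+00:00",
    "2022-01-05T08:00:00+00:00",
    "2023-02-01T00:00:00+00:00",
]
# Same offset everywhere: plain string order equals chronological order.
same_offset_ok = sorted(stamps) == sorted(stamps, key=datetime.fromisoformat)

# Mixed offsets: 22:30-05:00 is 03:30 UTC the next day, yet it sorts first.
mixed = ["2023-12-12T23:00:00+00:00", "2023-12-12T22:30:00-05:00"]
mixed_ok = sorted(mixed) == sorted(mixed, key=datetime.fromisoformat)

# Month-first dates do not sort chronologically as strings at all.
us_style = ["12/12/2023", "01/05/2024", "02/01/2022"]
us_by_time = sorted(us_style, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))
us_ok = sorted(us_style) == us_by_time

print(same_offset_ok, mixed_ok, us_ok)
```

So the property holds as long as everything is normalised to one offset first, usually UTC.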
Same
Well, they cover very different formats: https://pbs.twimg.com/media/FdzPYu-UAAADHEq.jpg
TIL, didn’t know that
wtf does this even mean?
This is about the old argument around how date strings are formatted.
MMDDYYYY vs YYYYMMDD, spaces or hyphens may differ. It’s an old and passionate argument (mostly due to the American approach of starting with the month being insane)
Both ISO8601 and RFC3339 are YYYY-MM-DD. The difference is in how the date and time are separated.
Thank you! I was shooting from the hip half asleep (the classic ‘gosh I’m so clever’ moment for me…)
Also, ISO 8601 has some handy rules for expressing time lengths and periodicities.
I’ve worked with this one project for so long I can now read +%s timestamps.
That’s a certain kind of skill I wouldn’t want the need to have. I just copy paste those timestamps into a terminal with `date -d @` (and always forget the right syntax for that :D)
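If the shell syntax keeps slipping your mind, the same conversion is a couple of lines of Python. `epoch_to_iso` and `iso_to_epoch` are made-up helper names; the sketch assumes whole seconds and always renders in UTC.

```python
from datetime import datetime, timezone

def epoch_to_iso(seconds):
    """Render a Unix timestamp (seconds) as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

def iso_to_epoch(text):
    """Convert an offset-carrying ISO 8601 string back to Unix seconds."""
    return int(datetime.fromisoformat(text).timestamp())

stamp = epoch_to_iso(1702415880)
print(stamp)
print(iso_to_epoch(stamp))
```

Passing `tz=timezone.utc` explicitly matters: without it, `fromtimestamp` uses the machine's local zone and returns a naive value.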
ISO standards need to be purchased to be viewed, RFCs are freely available requests for comment. The RFC 3339 format is effectively the same as the ISO format, except RFC 3339 allows for a space between the date and time components whereas the ISO format uses a “T” character to separate date and time components.
If you want to get real weird, RFCs are not standards but rather a request for other participants to comment on the proposal. RFCs tend to be pointed towards as de facto standards though, even before they become a BCP or STD.
Yeah… I have no idea what any of that means either. I’m sorry I caused you to write all that out.
Relevant XKCD: https://xkcd.com/1179/
Counterargument: https://xkcd.com/927/
How could it be paywalled? I’ve never heard of anyone paying ISO to be able to write the date and time in a handy way.
What he means is, if you want to download the document from ISO that describes the standard, you have to pay a fee. Here’s their store page: click.
It’s about 190 USD for a 38 page document describing the rules of the standard. There’s another document with extensions for a similar price. Quite pricey for a PDF file obviously, and the RFC is free to download.
On the other hand, no one in the history of time has gone “hmm, I don’t know how ISO-8601 works, let me go buy this document from the ISO store to figure it out.” Most people just call `datetime.isoformat()` or whatever their library function is called.
Ah thanks for the clarification! Very informative
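For the Python version of that library call, the `T` versus space question is a single keyword argument. A small sketch with a made-up timestamp:

```python
from datetime import datetime, timezone

moment = datetime(2023, 12, 12, 21, 18, 0, tzinfo=timezone.utc)

with_t = moment.isoformat()              # default separator is "T"
with_space = moment.isoformat(sep=" ")   # the more readable spelling
minutes_only = moment.isoformat(timespec="minutes")

print(with_t)
print(with_space)
print(minutes_only)

# Both spellings parse back to the same instant.
print(datetime.fromisoformat(with_t) == datetime.fromisoformat(with_space))
```

Emitting the `T` form for machines and switching `sep` only at display time keeps stored data in one shape.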
We need a better one…
Ymd-ymd-yhms-yhms
Much clearer and easier for programmers.
Right now, it’s 210-024-200-379
Am programmer. Idk wtf that is. But if it converts easily to a datetime object, or if I can easily parse the parts out of it, I’m all for it. Idgaf if it’s easy to read as-is. Just make it efficient and make it sort predictably, and I’m all for it lol.
Too confusing. How about ymh-yMy_myM-h
Maybe we could use different letters. Something only ISO knows and keeps in their spec.
iSO
savage
RFC2795, because the IETF guys work hard, and then play hard on April fools.