Date Parsing. Nose wrinkle.
This article is part of a series.
- Part 1: What if I just copy-paste from the web?
- Part 2: How do you get messages to Swift directly?
- Part 3: Okay, but how about all the way up to the View?
- Part 4: How to do some basic file handling?
- Part 5: How do custom Encoder's work?
- Part 6: And what can I make a custom Encoder do?
- Part 7: Wait, how do I scan text again?
- Part 8: This Article
- Part 9: What would be a very simple working Decoder?
- remember how to inspect strings in general
- decide about Dates specifically <=
- write a really really simple decoder to walk through the process
- inspect other peoples decoders
- write the string -> tree step for SimpleCoder
- finish the full decoder (with round trip tests)
The test HousePlant
from the last post just totally punted on parsing dates, but I know I want to use them. So now what?
First step is to be clear on what a Date
type is and what it isn’t. As we saw when writing the encoder, the default Codable
representation for Date
is timeIntervalSinceReferenceDate
because that is what a Date
’s core storage is (core | FoundationEssentials ), a TimeInterval. And what’s a TimeInterval? A type alias for a Double representing seconds.
Any other representation of a Date
provided by a formatter is a convenience. Calendar and other libraries better serve working with days, hours, months (little d dates) as first-order concepts. One cannot have a Date
with nil for hours. That makes no sense. Date
has no Optional “hours” parameter. Date
is a unidimensional time coordinate; it must be specific. One can represent spotty calendar style information with DateComponents
, not Date
.
The various date formatters one can use in Swift do an excellent job of creating these time coordinates we call a Date
from Strings
and vice versa. However, it could be easy to end up very frustrated with them if in fact one is trying to do withDate
things that would be much better served by a different API. (For even more on this topic see conversations around SE-0329) (also this NSSpain X Dave DeLong talk)
As of 2021 there are new versions of theses formatters in Foundation
. The WWDC21 video “What’s new in Foundation” (14:29) shows both the old style and the new and explains the motivation.
The WWDC22 RegEx videos both show working with the new formatter in the context of a RegEx.
- https://developer.apple.com/videos/play/wwdc2022/110357/?time=693
- https://developer.apple.com/videos/play/wwdc2022/110358?time=314
swift-foundation
has public implementations.
- ParseStrategy protocol definition (FoundationEssentials)
- ParseStrategy Extension on Date (FoundationInternationalization)
- ICUDateFormatter Internal class doing the work. (FoundationInternationalization)
- Date.FormatString definition (FoundationInternationalization)
- extension Date.FormatString (FoundationInternationalization)
- DateFieldSymbol (FoundationInternationalization)
- DateFormatStyle
- VerbatimFormatStyle (FoundationInternationalization)
- Date.ISO8601FormatStyle (FoundationInternationalization)
- Date parsing as mentioned in RegEx Pitch (5.7) (swift-evolution)
- Regex String Processing Algorithms (swift-evolution)
FormatStyle
FormatStyle
seems to be designed for UI representation, making sure that the user’s settings stay in charge of how a Date
gets displayed. As a result a FormatStyle
works differently depending on the device’s settings and localization. Output will behave a bit unpredictably to the developer. For example, 24 hour time display gets decided by the user’s system settings, not the developer. On the other hand, it also provides some pretty generous parsers as demonstrated below.
My Playground and Package Manager for some reason had differing opinions about my locale settings? (TODO?) I could have gotten more control by adding an explicit Locale (e.g. .locale(Locale(identifier: "en_US"))
), but I decided to roll with it to see the different behaviors.
//Computer settings PDT.
//In Playground. Fails in package until date string changed to 03/28/2024
let basicFormatStyle = Date.FormatStyle()
let basicDate = basicFormatStyle.parse("28/03/2024 03:12 PM")
let basicStringFromDate = basicFormatStyle.format(basicDate)
print("date:", basicDate, "string:", basicStringFromDate)
//date: 2024-03-28 22:12:00 +0000 string: 28/03/2024, 3:12 PM
If time doesn’t matter, one can ignore it by telling the format style which fields you do care about. Order doesn’t matter. Order will be determined by the Locale. Notice in the print out there still is a time, but its midnight in the device’s time zone relative to UTC, not the time from the String.
//Computer settings PDT.
//In Playground. Fails in package until date string changed to month leading
let dateOnlyStyle:Date.FormatStyle = .dateTime.year().day().month()
let dateOnlyDate = dateOnlyStyle.parse("28/03/2024 03:12 PM")
let dateOnlyStringFromDate = dateOnlyStyle.format(dateOnlyDate)
print("date:", dateOnlyDate, "string:", dateOnlyStringFromDate)
//date: 2024-03-28 07:00:00 +0000 string: 28 Mar 2024
A FormatStyle ends up being super chill about the text it will take in. It took all the following text inputs (Playground examples. Switched to month leading in Package Manger. There does not seem to be a way to make a super chill year-leading formatter? Where does that?)
28/03/2024
28 Mar 2024
28-03-2024
28-Mar-2024
28 .-/ 03 .-/ 2024
But not
28 fhujsflh 03 jifoalgf 2024
So it does seem to be limited to “the usual suspects” in terms of date delimiters.
I’ve been showing the showing the shortcuts. The full initializer looks like
Date.FormatStyle(date: Date.FormatStyle.DateStyle?,
time: Date.FormatStyle.TimeStyle?,
locale: Locale,
calendar: Calendar,
timeZone: TimeZone,
capitalizationContext: FormatStyleCapitalizationContext)
For example (why this)
static let customFormat = Date.FormatStyle(date: .complete,
time: .complete,
locale: Locale(identifier: "zh_Hans_CN"),
calendar: Calendar(identifier: .chinese),
timeZone: TimeZone(identifier: "UTC")!,
capitalizationContext: .beginningOfSentence)
//Text(Date.now, format:Self.customFormat)
//=> Second Month 19, 2024(jia-chen) at 21:37:17
// when it was Thursday Mar 28 14:37:17 and system time was set to 24hr time.
ISO8601FormatStyle
If less chill is desired, one can go for the built in ISO8601 format.
let dateString = "20240312 03:12:10"
let myDateFormat:Date.ISO8601FormatStyle = .iso8601.year().month().day().dateSeparator(.omitted)
let myDate = myDateFormat.parse(dateString2)
let stringFromDate = myDateFormat.format(myDate)
print("date:", myDate, "string:", stringFromDate)
//date: 2024-03-12 00:00:00 +0000 string: 20240312
With time omitted from the format, again the parser will just ignore it. Overall the ISO8601FormatStyle
assumes the date data will be generated by machines and be read by machines. It has very limited configuration. It’s not interested in being a good sport. That’s not its job.
VerbatimFormatStyle and DateFormatString
If one wants less chill like with ISO8601FormatStyle
, but in a custom style, VerbatimFormatStyle
combined with a DateFormatString
provides a similar functionality.
//for some reason Playground balks when not dynamic var
var isoFormatString:Date.FormatString {
"\(year: .defaultDigits)-\(month: .twoDigits)-\(day: .twoDigits)"
}
let verbatimFormat:Date.VerbatimFormatStyle = .init(format: isoFormatString, timeZone: TimeZone.gmt, calendar: Calendar.current)
//need to drop down into the parseStrategy directly
let verbatimDate = verbatimFormat.parseStrategy.parse("2024-03-12")
let stringFromVDate = verbatimFormat.format(myDate)
//date: 2024-03-12 00:00:00 +0000 string: 2024-03-12
This is exactly how to get 24 hour time, independent of the system settings.
var withTime:Date.FormatString { "\(year: .defaultDigits)-\(month: .twoDigits)-\(day: .twoDigits) \(hour: .twoDigits(clock: .twentyFourHour, hourCycle: .zeroBased)):\(minute: .twoDigits):\(second: .defaultDigits) \(timeZone: .identifier(.short))" }
let verbatimFormatWithTime:Date.VerbatimFormatStyle = .init(format: withTime, timeZone: TimeZone.gmt, calendar: Calendar.current)
print(verbatimFormatWithTime.format(Date.now))
The VerbatimFormatStyle
does not appear to work with Date.FormatString
literals (FB13700896, semi-embarrassing forum post (plenty of those)), but pure ParseStrategy
approaches do.
If one wants flexibility and a custom style… well that’s where I’d abandon Date
for DateComponent
and pull out RegExBuilder
. Maybe even make some custom components. OOOooo or maybe a custom scanner… no this is 2024 its gotta be ML… But as I officially have too many projects going and Decoders do not require the ability to take human generated input, not today.
ParseStrategy
Date
initializers can now take a ParseStrategy
. You can roll your own with a Date.FormatString
, or extract one from an existing Date.FormatStyle
(myFormatStyle.parseStrategy.format
). Like VerbatimFormatStyle
they don’t provide flexibility.
//THIS CODE DOES NOT WORK IN PLAYGROUNDS
func testDateParsing() throws {
let dashString:Date.FormatString = "'y-M-d'"
let strategy = Date.ParseStrategy(format: dashString, timeZone: .gmt)
let date = try Date("2024-03-12", strategy: strategy)
print("newDate:", date)
//newDate 2024-03-12 00:00:00 +0000
let formatString = strategy.format //returns the format string.
print(formatString)
//''y-M-d''
let date2 = try strategy.parse("2024-03-12")
XCTAssertEqual(date, date2)
}
In this example I used a Date.FormatString
literal, but THEY ARE NOT IDENTICAL to the old DateFormatter
literals.
They’re going to take some getting used to.
-
Definition y (current year)
-
Definition Y (week of year?)
-
works:
"'y'-'MM'-'dd' 'HH':'mm':'ss'"
, fails:"y'-'MM'-'dd' 'HH':'mm':'ss"
-
When the date string is
let dateString = "20240312031210"
the string has to be"'yyyyMMddHHmmss'"
and the full initializer has to have the year as\(year: .extended(minimumLength: 4))
as.defaultdigits
does not work.
If one wanted to extract a format string to pass it around, it would need a little massaging to get back into proper Date.FormatString
condition.
var formatString:Date.FormatString {
Date.FormatString(stringLiteral: "'\(someFormat.parseStrategy.format)'")
}
let strategy = Date.ParseStrategy(format: formatString, timeZone: .gmt)
let date = try Date(dateString, strategy: strategy)
Example for the Decoder
My first crack at decoding a date actually went straight for initializing a ParseStrategy
with a Date.FormatString.
Which strategy depended on whether the time was included (based on presence of colon).
func _decodeDate(from value:String) throws -> Date {
//gmt to make sure time fields are 0
let strategy = if value.contains(":") {
//strings can work too, not identical to DateFormatter but similar
Date.ParseStrategy(format: "'yyyy-MM-dd' 'HH:mm:ss'",
timeZone: .gmt)
} else {
Date.ParseStrategy(format: "\(year: .defaultDigits)-\(month: .twoDigits)-\(day: .twoDigits)",
timeZone: .gmt)
}
guard let date = try? Date(value, strategy: strategy) else {
throw DecodingError.dataCorrupted(
.init(codingPath: [], debugDescription: "String not in expected \(strategy.format) format.")
)
}
return date
}
On second pass I changed the code to use the ISO8601FormatStyle
because that’s the industry standard. Additionally, I decided to require a time zone (distance from UTC as ±HH:mm or Z if UTC) if a time is also included. An Encoder should be able to provide that no problem.
func _decodeDate(from value:String) throws -> Date {
//gmt ("Z")is the default.
//
let format = if value.contains(":") {
Date.ISO8601FormatStyle.iso8601
//.dateTimeSeparator(.space)
} else {
Date.ISO8601FormatStyle.iso8601
.year().month().day()
}
guard let date = try? format.parse(value) else {
throw DecodingError.dataCorrupted(
.init(codingPath: [], debugDescription: "String not in expected \(format.parseStrategy) format.")
)
}
return date
}
I could try to be a lot more gracious with what strings I’ll accept, but I’m choosing not to be. When writing a Decoder, it’s important to know if the data will be coming from:
- user input,
- only your own Encoder,
- or someone else’s serialization process
The question becomes: is a badly formatted date something to be gracious about, or an alarm that the data can’t be trusted. I wanted to think about how to accept more than one format style, but not go full general purpose calendar event storage.
Summary
Good. Enough. Next post, a super basic Decoder.
This article is part of a series.
- Part 1: What if I just copy-paste from the web?
- Part 2: How do you get messages to Swift directly?
- Part 3: Okay, but how about all the way up to the View?
- Part 4: How to do some basic file handling?
- Part 5: How do custom Encoder's work?
- Part 6: And what can I make a custom Encoder do?
- Part 7: Wait, how do I scan text again?
- Part 8: This Article
- Part 9: What would be a very simple working Decoder?