How to do some basic file handling?
This article is part of a series.
- Part 1: What if I just copy-paste from the web?
- Part 2: How do you get messages to Swift directly?
- Part 3: Okay, but how about all the way up to the View?
- Part 4: This Article
- Part 5: How do custom Encoder's work?
- Part 6: And what can I make a custom Encoder do?
- Part 7: Wait, how do I scan text again?
- Part 8: Date Parsing. Nose wrinkle.
- Part 9: What would be a very simple working Decoder?
Related Repo: https://github.com/carlynorama/SimplePersist
Great performance comparison by Tera on the Swift Forum: https://forums.swift.org/t/what-is-the-best-way-to-work-with-the-file-system/71020/18
Continuing with the Lines App, what’s the best way to store the saved text so both Lines and LineGrabber can use it? At the end of the post I’ve dumped a checklist of some of what I consider when picking a data storage solution. Some of it’s technical. Some of it is about the working environment of people who will interact with the data. It’s a bit overkill for this stage of the Lines app but it’s in the back of my head. Right now the Lines data gets loaded fresh from a file, and we’re going to stick with saving it back to a file.
That brings up some interesting questions about where to save that file. Right now it’s in the Lines bundle, but what if it was saved to the App group? That way both processes (Lines and LineGrabber) could interact with it?
That kind of multi-process file access can be very difficult to get right. What happens if one process is ready to write while the other is reading? Actors help mitigate that issue within a single app’s boundaries, but at least historically file systems have used locks. One thing helps: this data storage file won’t be willy-nilly out in the file system, subject to the whims of any process that wanders past. It will live in a sandbox, which offers some protections.
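For contrast with actors, here is a minimal sketch of the lock-based approach using the BSD `flock` call through a `FileHandle`'s file descriptor. It is an illustration only, not what the Lines app does: `LockError`, `withExclusiveLock` and the scratch file name are hypothetical, the lock is advisory (it only protects against other processes that also ask for it), and `flock` is a Darwin/Linux call, so this would not carry over to Windows.

```swift
import Foundation

// Hypothetical error type for this sketch.
struct LockError: Error { let code: Int32 }

/// Runs `body` while holding an exclusive advisory lock on the file at `url`.
func withExclusiveLock<T>(on url: URL, _ body: (FileHandle) throws -> T) throws -> T {
    let handle = try FileHandle(forUpdating: url)
    defer { try? handle.close() }
    // Blocks until no other cooperating process holds the lock.
    guard flock(handle.fileDescriptor, LOCK_EX) == 0 else {
        throw LockError(code: errno)
    }
    // Defers run in reverse order: unlock first, then close.
    defer { _ = flock(handle.fileDescriptor, LOCK_UN) }
    return try body(handle)
}

// Usage: append one line while holding the lock.
let lockDemoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("lock-demo-\(UUID().uuidString).txt")
_ = FileManager.default.createFile(atPath: lockDemoURL.path, contents: nil)

try withExclusiveLock(on: lockDemoURL) { handle -> Void in
    try handle.seekToEnd()
    try handle.write(contentsOf: Data("locked write\n".utf8))
}
let lockDemoContents = try String(contentsOf: lockDemoURL, encoding: .utf8)
try FileManager.default.removeItem(at: lockDemoURL)
```

Both Lines and LineGrabber would have to go through the same helper for this to mean anything, which is exactly the kind of coordination that makes multi-process access hard.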
References
Handy links I don’t want to lose track of
- Article on how to have two processes share a file in C for contrast.
- DocC’s Document Monitor. Pretty clever.
- C & Swift tmp file article mentioned below:
- https://www.unix.com/man-page/mojave/2/open/
What’s in the Tin
Given I’d prefer not to roll my own file system interactions, I spent some time looking at what Swift can already do.
I highly recommend this article by Wade Tregaskis, who goes over creating tmp files with both Swift and C APIs. Very handy.
Some criteria: whatever I write will likely be in the form of a library, and I prefer library code to run everywhere Swift does. So my preference will be to use the highest-level library that exists, since I don’t know anything about developing for Windows.
That said, if I end up wanting to go lower it will be good to know how deep into the *nix weeds I can go without dipping into the C like I did for the serial port.
So without further ado, here’s a catalog of the main players:
FileManager
https://developer.apple.com/documentation/foundation/filemanager
“A convenient interface to the contents of the file system, and the primary means of interacting with it.
A file manager object lets you examine the contents of the file system and make changes to it. The FileManager class provides convenient access to a shared file manager object that is suitable for most types of file-related manipulations. A file manager object is typically your primary mode of interaction with the file system. You use it to locate, create, copy, and move files and directories. You also use it to get information about a file or directory or change some of its attributes.”
Largely used to get the URLs that will be used by other functions, although URL seems to be moving into that space, too.
let url = URL.documentsDirectory.appending(path: "fileName.txt")
FileManager.default.fileExists(atPath: url.path)
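As a rough sketch of the locate, create, inspect and remove cycle the documentation describes, the following sticks to FileManager calls that should exist on both Apple platforms and Linux. The scratch file name and the use of the temporary directory are assumptions for illustration; a real app would more likely use its Documents directory or an App Group container.

```swift
import Foundation

let fm = FileManager.default

// Hypothetical scratch location for this demo.
let demoURL = fm.temporaryDirectory
    .appendingPathComponent("lines-demo-\(UUID().uuidString).txt")

// createFile returns false rather than throwing when it fails.
let created = fm.createFile(atPath: demoURL.path, contents: Data("hello\n".utf8))
let existsAfterCreate = fm.fileExists(atPath: demoURL.path)

// Attributes come back as a dictionary keyed by FileAttributeKey.
let attributes = try fm.attributesOfItem(atPath: demoURL.path)
let byteCount = (attributes[.size] as? NSNumber)?.intValue

// Clean up so repeated runs start fresh.
try fm.removeItem(at: demoURL)
let existsAfterRemove = fm.fileExists(atPath: demoURL.path)
```

Note the mix of styles: some calls report failure with a Bool, others throw, so each one needs a look at its signature.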
Data, String and URL
These types have all been given read and sometimes write options. No separate FileManager or FileHandle required. For working with existing files.
- https://developer.apple.com/documentation/foundation/url
- https://developer.apple.com/documentation/foundation/url/3767315-lines
- https://developer.apple.com/documentation/foundation/url/3767316-resourcebytes
- https://developer.apple.com/documentation/foundation/url/asyncbytes
- https://developer.apple.com/documentation/foundation/data
- https://developer.apple.com/documentation/foundation/data/1779858-write
- https://developer.apple.com/documentation/foundation/data/readingoptions
- https://developer.apple.com/documentation/foundation/data/writingoptions
- https://developer.apple.com/documentation/swift/string/init(contentsof:)
- https://developer.apple.com/documentation/swift/string/write(to:)
try someStringICareAbout.write(to: someURL, atomically: true, encoding: .utf8)
let myString = try String(contentsOf: someURL, encoding: .utf8)
let myData = try Data(contentsOf: someURL)
try myData.write(to: someFileURL, options: [.atomic, .completeFileProtection])
//NOTE: This function stripped out empty lines the last I checked.
for try await line in url.lines {
    doSomething(with: line)
}
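To see the String and Data calls above working together, here is a small round trip through a scratch file. The file name and sample text are made up for the demo. The last two lines are a possible workaround for the empty-line note above: reading the whole string and splitting it manually leaves the caller in charge of whether empty lines survive.

```swift
import Foundation

// Hypothetical scratch file in the temporary directory.
let scratchURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("roundtrip-\(UUID().uuidString).txt")

let original = "first\n\nthird"   // note the deliberately empty middle line

// atomically: true writes to a temporary file first, then moves it into place.
try original.write(to: scratchURL, atomically: true, encoding: .utf8)

// Read the same file back as a String and as raw bytes.
let asString = try String(contentsOf: scratchURL, encoding: .utf8)
let asData = try Data(contentsOf: scratchURL)

// Splitting manually: keep or drop empty lines, caller's choice.
let newline: Character = "\n"
let keptEmpty = asString.split(separator: newline, omittingEmptySubsequences: false)
let droppedEmpty = asString.split(separator: newline)

try FileManager.default.removeItem(at: scratchURL)
```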
*Stream
For when you don’t actually know to what or where the bytes will be going. File, socket, whatever.
- Stream, InputStream and OutputStream, as bridges to CFStream, CFReadStream and CFWriteStream?
Honestly I haven’t used these much. I’m not clear on where they fit in the future of Swift and async file access. If I were going to, these links would be a place to start.
- Current GitHub: https://github.com/apple/swift-corelibs-foundation/blob/main/Sources/Foundation/Stream.swift
- 2020: https://forums.swift.org/t/is-it-possible-to-read-a-file-saved-in-sandbox-with-inputstream/37587
- https://forums.swift.org/t/make-inputstream-and-outputstream-methods-safe-and-swifty/23726/3
- https://forums.swift.org/t/inputstream-outputstream-differences-on-linux/34846/2
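For a sense of what the stream style looks like, here is a minimal sketch of reading a whole file through InputStream in fixed-size chunks. `readAll(from:chunkSize:)` is a hypothetical helper, not part of Foundation, and the forum threads above suggest behavior can differ between platforms, so treat it as a starting point rather than a vetted implementation.

```swift
import Foundation

/// Reads everything from the file at `url` through an InputStream.
/// Returns nil if the stream cannot be created or a read fails.
func readAll(from url: URL, chunkSize: Int = 4096) -> Data? {
    guard let stream = InputStream(url: url) else { return nil }
    stream.open()
    defer { stream.close() }

    var result = Data()
    var buffer = [UInt8](repeating: 0, count: chunkSize)
    while true {
        // read returns the byte count, 0 at end of stream, negative on error.
        let count = stream.read(&buffer, maxLength: chunkSize)
        if count < 0 { return nil }
        if count == 0 { break }
        result.append(contentsOf: buffer[0..<count])
    }
    return result
}

// Usage with a scratch file and a deliberately tiny chunk size.
let streamURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("stream-in-\(UUID().uuidString).txt")
let payload = Data("stream me, please".utf8)
try payload.write(to: streamURL)
let streamed = readAll(from: streamURL, chunkSize: 4)
try FileManager.default.removeItem(at: streamURL)
```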
TextOutputStream, a protocol that print (among others) can dump to, may play a role later down the line.
- GitHub: https://github.com/apple/swift/blob/main/stdlib/public/core/OutputStream.swift
- Great howto article: https://nshipster.com/textoutputstream/
I found other things called FileStreams floating around the Swift code, but they appear to be for specialized AV use?
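As a taste of how TextOutputStream could tie into file handling, here is a small sketch of an adapter that lets `print(_:to:)` write into a FileHandle. `FileHandleTextStream` and the scratch file are hypothetical names for this example, and the non-throwing `write(_:)` is used for brevity.

```swift
import Foundation

/// A hypothetical adapter so print(_:to:) can target a FileHandle.
struct FileHandleTextStream: TextOutputStream {
    let handle: FileHandle

    mutating func write(_ string: String) {
        handle.write(Data(string.utf8))
    }
}

let logURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("stream-out-\(UUID().uuidString).txt")
// FileHandle(forWritingTo:) needs the file to exist already.
_ = FileManager.default.createFile(atPath: logURL.path, contents: nil)

let logHandle = try FileHandle(forWritingTo: logURL)
var fileStream = FileHandleTextStream(handle: logHandle)

// print adds its usual "\n" terminator after each item.
print("first line", to: &fileStream)
print("second line", to: &fileStream)
try logHandle.close()

let logged = try String(contentsOf: logURL, encoding: .utf8)
try FileManager.default.removeItem(at: logURL)
```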
FileHandle
FileHandle doesn’t give raw C FileHandle access. Thanks to Wade’s examinations we know it doesn’t let one create files, for example ("_NSOpenFileDescriptor to do the actual file system calls. That calls open with only the flag O_WRONLY; it does not pass O_CREAT nor O_EXCL.").
https://developer.apple.com/documentation/foundation/filehandle
“Although the read, accept, and wait operations themselves are performed asynchronously on background threads, the file handle uses a run loop source to monitor the operations and notify your code appropriately. Therefore, you must call those methods from your application’s main thread or from any thread where you’ve configured a run loop and are using it to process events.”
private func appendData(data: Data) throws {
    let fileHandle = try FileHandle(forWritingTo: storageUrl)
    fileHandle.seekToEndOfFile()
    fileHandle.write(data)
    fileHandle.closeFile()
}
//(from swift-argument-parser examples)
mutating func lineCount() async throws {
    let fileHandle = try FileHandle(forReadingFrom: inputFile)
    let lineCount = try await fileHandle.bytes.lines.reduce(into: 0) { count, _ in
        count += 1
    }
    print(lineCount)
}
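Since FileHandle will not create a file on its own, an append helper has to make sure the file exists first. Here is a minimal sketch of that, assuming FileManager does the creating. `AppendOpenError` and `openForAppending` are hypothetical names, and the check-then-create step is not atomic, so two processes could still race each other here.

```swift
import Foundation

// Hypothetical error for this sketch.
struct AppendOpenError: Error {}

/// Opens a handle positioned at the end of the file, creating the file if needed.
func openForAppending(_ url: URL) throws -> FileHandle {
    if !FileManager.default.fileExists(atPath: url.path) {
        // FileHandle(forWritingTo:) will not create the file for us.
        guard FileManager.default.createFile(atPath: url.path, contents: nil) else {
            throw AppendOpenError()
        }
    }
    let handle = try FileHandle(forWritingTo: url)
    try handle.seekToEnd()
    return handle
}

let missingURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("append-demo-\(UUID().uuidString).txt")

// Opening a file that does not exist yet fails.
let directOpen = try? FileHandle(forWritingTo: missingURL)

// Two separate opens, each appending one line.
for line in ["one\n", "two\n"] {
    let handle = try openForAppending(missingURL)
    try handle.write(contentsOf: Data(line.utf8))
    try handle.close()
}
let appended = try String(contentsOf: missingURL, encoding: .utf8)
try FileManager.default.removeItem(at: missingURL)
```

This uses the throwing `seekToEnd()`, `write(contentsOf:)` and `close()` variants instead of the older non-throwing calls, so a failure surfaces as a Swift error.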
FileWrapper
This is for when an app’s document has multiple items in a single “file”, which is perhaps better called a document in this context. Not relevant… yet?
https://developer.apple.com/documentation/foundation/filewrapper/
“The FileWrapper class provides access to the attributes and contents of file system nodes. A file system node is a file, directory, or symbolic link. Instances of this class are known as file wrappers.”
NSFilePresenter and friends
More Document sharing between Apps and people. Could be relevant, but is not cross platform?
https://developer.apple.com/documentation/foundation/nsfilepresenter
NSFilePresenter: The interface a file coordinator uses to inform an object presenting a file about changes to that file made elsewhere in the system….Objects that allow the user to view or edit the content of files or directories should adopt the NSFilePresenter protocol.
https://developer.apple.com/documentation/foundation/nsfilecoordinator
NSFileCoordinator: An object that coordinates the reading and writing of files and directories among file presenters.
DispatchSource and Friends
Low-level system interaction handling on Apple platforms.
- https://developer.apple.com/documentation/dispatch/dispatchsource
- https://developer.apple.com/documentation/dispatch/dispatchsourcefilesystemobject
- https://developer.apple.com/documentation/dispatch/dispatchsource/filesystemevent
swift-system
Low-level system interactions for all the platforms Swift supports. Incredibly handy. Not nearly as easy as “Hey FileManager, fix it for me”, but a good group of tools to look through before rolling one’s own…
https://github.com/apple/swift-system/
Backing library for Swift-NIO
System is a multi-platform library, not a cross-platform one. It provides a separate set of APIs and behaviors on every supported platform, closely reflecting the underlying OS interfaces. A single import will pull in the native platform interfaces specific for the targeted OS.
2024-04-03: NIOFileSystem is the place to look, specifically. Good forum discussion
What to do?
I am not convinced that I can manage files across multiple processes without something more magic than the existing multi-platform options, but I won’t know without a little bit of trying.
Here’s the hello world. Available at https://github.com/carlynorama/SimplePersist
Tested on macOS and Linux so far.
import Foundation

public protocol StringPersistable: LosslessStringConvertible, Sendable {
    //the lossless strings can't contain \n (the default separator)
    init?(_ description: some StringProtocol)
}

enum PersistorError: Error {
    case unknownError(message: String)
    case fileAttributeUnavailable(_ attributeName: String)
    case stringNotDataEncodable
}

public actor BasicTextPersistor<Element: StringPersistable> {
    private let fm = FileManager.default
    private(set) var separator: String
    private(set) var encoding: String.Encoding = .utf8
    let storageUrl: URL

    public static func fileExists(_ path: String) -> Bool {
        FileManager.default.fileExists(atPath: path)
    }

    public static func fileExists(_ url: URL) -> Bool {
        FileManager.default.fileExists(atPath: url.path)
    }

    @discardableResult
    public static func touch(_ url: URL) -> Bool {
        FileManager.default.createFile(atPath: url.path, contents: nil)
    }

    public init(storageUrl: URL, separator: String = "\n") {
        self.storageUrl = storageUrl
        self.separator = separator
        //FileHandle won't create the file later, so make sure it exists now.
        if !Self.fileExists(storageUrl) {
            Self.touch(storageUrl)
        }
    }

    private func makeBlob(from array: [Element]) -> String {
        array.map { $0.description }.joined(separator: separator)
    }

    //replaces the whole file
    public func write(contentsOf: [Element]) async throws {
        try makeBlob(from: contentsOf).write(to: storageUrl, atomically: true, encoding: encoding)
    }

    //Do you need appends to be atomic? That is, as supported by the O_APPEND flag for open.
    public func append(_ item: Element) async throws {
        if let data = "\(separator)\(item.description)".data(using: encoding) {
            try appendData(data: data)
        } else {
            throw PersistorError.stringNotDataEncodable
        }
    }

    public func append(contentsOf: [Element]) async throws {
        if let data = "\(separator)\(makeBlob(from: contentsOf))".data(using: encoding) {
            try appendData(data: data)
        } else {
            throw PersistorError.stringNotDataEncodable
        }
    }

    private func appendData(data: Data) throws {
        let fileHandle = try FileHandle(forWritingTo: storageUrl)
        fileHandle.seekToEndOfFile()
        fileHandle.write(data)
        fileHandle.closeFile()
    }

    //this is async for the actor, not the file i/o
    @available(macOS 13.0, iOS 16.0, *)
    public func retrieve() async throws -> [Element] {
        //read with the same encoding used for writing
        let string = try String(contentsOf: storageUrl, encoding: encoding)
        //split drops empty pieces, so a leading separator is harmless
        return string.split(separator: separator).compactMap({
            Element.init($0)
        })
    }

    @available(macOS 13.0, iOS 16.0, *)
    public func retrieveAvailable() async -> [Element] {
        do {
            return try await retrieve()
        } catch {
            return []
        }
    }

    public func lastModified() throws -> Date {
        //works in linux?
        //if let date = try storageUrl.resourceValues(forKeys: [.contentModificationDateKey]).contentModificationDate
        let attribute = try fm.attributesOfItem(atPath: storageUrl.path)
        if let date = attribute[FileAttributeKey.modificationDate] as? Date {
            return date
        } else {
            throw PersistorError.fileAttributeUnavailable("modificationDate")
        }
    }

    //The corresponding value is an NSNumber object containing an unsigned long long.
    //Important
    //If the file has a resource fork, the returned value does not include the size of the resource fork.
    public func size() throws -> Int {
        let attribute = try fm.attributesOfItem(atPath: storageUrl.path)
        if let size = attribute[FileAttributeKey.size] as? Int {
            return size
        } else {
            throw PersistorError.fileAttributeUnavailable("size")
        }
    }
}
#if os(Linux)
import Foundation
//https://github.com/apple/swift-corelibs-foundation/blob/36a411b304063de2cbd3fe06adc662e7648d5a9d/Sources/Foundation/URL.swift#L749
//TODO: isDirectory one, too.
extension URL {
    public func appending(path: String) -> Self {
        self.appendingPathComponent(path)
    }
}
#endif
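The storage format in the persistor boils down to "join descriptions with a separator, split on the way back in". Here is a stripped-down sketch of just that round trip, using plain LosslessStringConvertible types from the standard library instead of the actor. `makeBlob` and `parseBlob` are free functions written for this example, and they show why a value that contains the separator cannot survive the trip.

```swift
let separator: Character = "\n"

// Join each item's description with the separator.
func makeBlob<T: LosslessStringConvertible>(_ items: [T]) -> String {
    items.map(\.description).joined(separator: String(separator))
}

// Split on the separator and keep only the pieces that parse.
func parseBlob<T: LosslessStringConvertible>(_ blob: String) -> [T] {
    blob.split(separator: separator).compactMap { T(String($0)) }
}

// Values without the separator round-trip cleanly.
let numbers = [3, 1, 4]
let numbersBack: [Int] = parseBlob(makeBlob(numbers))

// A value containing the separator comes back as two items.
let notes = ["one", "two\nlines"]
let notesBack: [String] = parseBlob(makeBlob(notes))

// Pieces that fail to parse are silently dropped by compactMap.
let partial: [Int] = parseBlob("1\nnot a number\n2")
```

The silent drop in the last line is convenient for a hello world, but it also means a corrupted line disappears without any error.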
Things to think about when saving Data
The journalism classic article checklist “Who, what, when, where, why?” provides just as much utility when trying to decide how and if to save certain data.
This is just a quick mental dump of what’s in the back of my mind when picking a mechanism to store data.
- Quantity and shape of the data?
  - how much?
  - how often does it change?
    - modified
    - appended
    - removed
  - does it arrive in the shape it will be stored?
  - will subsections need to be selectively accessed?
  - how many sources does it come from?
    - number of process types
    - number of processes at a time
  - what’s the time frame for its value (nanoseconds vs archive quality)?
- Data usage
  - does it leave in the shape it’s stored?
  - how many processes need the information to do their work?
    - number of process types
    - number of processes at a time
  - honestly, how “fresh” does the data need to be for the work to be correct?
- Semantic value
  - is it identifying?
  - is it private?
  - is it legally protected?
  - is it hard to reproduce?
- Machine Trust
  - are all the processes that need it inside a verifiable trust?
  - who needs to write? (can it be mediated?)
  - who needs to read? (can it be mediated?)
  - do all the machines read and write data the same way?
- Low-level Human Operator Trust
  - expertise with the data
  - expertise with computers
  - current methods
  - conditions of labor
    - will they be overworked and tired?
    - will they be bribeable?