How to do some basic file handling?

This article is part of a series.

Related Repo: https://github.com/carlynorama/SimplePersist

Great [Performance Comparison by Tera on the Swift Forum](https://forums.swift.org/t/what-is-the-best-way-to-work-with-the-file-system/71020/18)

Continuing with the Lines App, what’s the best way to store the saved text so both Lines and LineGrabber can use it? At the end of the post I’ve dumped a checklist of some of what I consider when picking a data storage solution. Some of it’s technical. Some of it is about the working environment of people who will interact with the data. It’s a bit overkill for this stage of the Lines app but it’s in the back of my head. Right now the Lines data gets loaded fresh from a file, and we’re going to stick with saving it back to a file.

That brings up some interesting questions about where to save that file. Right now it’s in the Lines bundle, but what if it were saved to the App group? That way both processes (Lines and LineGrabber) could interact with it.
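A minimal sketch of resolving such a shared location. The group identifier `group.example.lines`, the file name `lines.txt` and the function name are placeholders, not anything from the Lines project, and the temporary-directory fallback is only there so the snippet also runs where no app group is available (Linux, or a build without the entitlement).

```swift
import Foundation

/// Hypothetical helper: returns a URL for a storage file that two processes could share.
func sharedStorageURL(groupID: String = "group.example.lines",
                      fileName: String = "lines.txt") -> URL {
    #if canImport(Darwin)
    // App group containers are an Apple-platform feature; this returns nil
    // when the process is not entitled to the group.
    if let container = FileManager.default
        .containerURL(forSecurityApplicationGroupIdentifier: groupID) {
        return container.appendingPathComponent(fileName)
    }
    #endif
    // Fallback (an assumption for this sketch): a per-user temporary location.
    return FileManager.default.temporaryDirectory.appendingPathComponent(fileName)
}

print(sharedStorageURL().path)
```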

That kind of multi-process file access can be very difficult to get right. What happens if one process is ready to write while the other is reading? Actors help mitigate that issue within a single app’s boundaries, but at least historically file systems have used locks. One help: this data storage file won’t be willy-nilly out in the file system, subject to the whims of any process that wanders past. It will live in a sandbox, which offers some protections.
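For a sense of what the historical lock approach looks like, here is a sketch that does dip into the POSIX `flock` call. The helper name `withExclusiveLock`, the `LockFailure` error and the demo file name are made up for illustration. The lock is advisory, so it only helps if every process touching the file also takes it.

```swift
import Foundation
#if canImport(Glibc)
import Glibc
#endif

struct LockFailure: Error { let code: Int32 }

/// Hypothetical helper: runs `body` while holding an exclusive advisory lock on the file.
func withExclusiveLock<T>(on url: URL, _ body: (FileHandle) throws -> T) throws -> T {
    let handle = try FileHandle(forUpdating: url)
    defer { try? handle.close() }  // runs last

    // Blocks until no other process holds a conflicting lock.
    guard flock(handle.fileDescriptor, LOCK_EX) == 0 else {
        throw LockFailure(code: errno)
    }
    defer { _ = flock(handle.fileDescriptor, LOCK_UN) }  // runs before the close

    return try body(handle)
}

// Usage: append one line while holding the lock.
let lockDemoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("lock-demo.txt")
_ = FileManager.default.createFile(atPath: lockDemoURL.path, contents: nil)
try withExclusiveLock(on: lockDemoURL) { handle in
    try handle.seekToEnd()
    try handle.write(contentsOf: Data("hello\n".utf8))
}
```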

References

Handy links I don’t want to lose track of

What’s in the Tin

Given I’d prefer not to roll my own file system interactions, I spent some time looking at what Swift can already do.

I highly recommend this article by Wade Tregaskis, who goes over creating tmp files with both Swift and C APIs. Very handy.

Some criteria: Whatever I write will likely be in the form of a library and I prefer Library code to run everywhere Swift does. So my preference will be to use the highest level library that exists since I don’t know anything about developing for Windows.

That said, if I end up wanting to go lower it will be good to know how deep into the *nix weeds I can go without dipping into the C like I did for the serial port.

So without further ado, here’s a catalog of the main players:

FileManager

https://developer.apple.com/documentation/foundation/filemanager

“A convenient interface to the contents of the file system, and the primary means of interacting with it.

A file manager object lets you examine the contents of the file system and make changes to it. The FileManager class provides convenient access to a shared file manager object that is suitable for most types of file-related manipulations. A file manager object is typically your primary mode of interaction with the file system. You use it to locate, create, copy, and move files and directories. You also use it to get information about a file or directory or change some of its attributes.”

Largely used to get the URLs that will be used by other functions, although URL seems to be moving into that space, too.

    let url = URL.documentsDirectory.appending(path: "fileName.txt")
    FileManager.default.fileExists(atPath: url.path)
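As a slightly longer illustration of the locate, create and inspect jobs described in the quote, here is a sketch that works in a scratch folder under the temporary directory. The folder name `LinesDemo` is a placeholder, and it uses the older `appendingPathComponent` spelling so it does not depend on the newer URL conveniences.

```swift
import Foundation

let fm = FileManager.default

// A scratch directory under the system temp location.
let folder = fm.temporaryDirectory.appendingPathComponent("LinesDemo", isDirectory: true)
try fm.createDirectory(at: folder, withIntermediateDirectories: true)

let file = folder.appendingPathComponent("fileName.txt")
if !fm.fileExists(atPath: file.path) {
    // createFile reports failure by returning false rather than throwing.
    let created = fm.createFile(atPath: file.path, contents: Data("first line".utf8))
    print("created:", created)
}

// List what is in the folder now.
let listing = try fm.contentsOfDirectory(atPath: folder.path)
print(listing)
```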

Data, String and URL

These types have all been given read and sometimes write options. No separate FileManager or FileHandle required. For working with existing files.

    try someStringICareAbout.write(to: someURL, atomically: true, encoding: .utf8)
    let myString = try String(contentsOf: someURL, encoding: .utf8)
    let myData = try Data(contentsOf: someURL)
    try myData.write(to: someFileURL, options: [.atomic, .completeFileProtection])

    // NOTE: This function stripped out empty lines the last I checked.
    for try await line in url.lines {
        doSomething(with: line)
    }
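If the empty lines matter, one option is to read the whole string and split it yourself. This is a small round-trip sketch with a made-up file name; `omittingEmptySubsequences: false` is what keeps the blank entries.

```swift
import Foundation

let original = ["first", "", "third"]
let roundTripURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("roundtrip.txt")

// Write the lines as one newline-separated string.
try original.joined(separator: "\n")
    .write(to: roundTripURL, atomically: true, encoding: .utf8)

// Read it back and split two ways.
let text = try String(contentsOf: roundTripURL, encoding: .utf8)
let newline: Character = "\n"
let kept = text.split(separator: newline, omittingEmptySubsequences: false)
    .map { String($0) }
let dropped = text.split(separator: newline).map { String($0) }

print(kept)     // empty line preserved
print(dropped)  // empty line removed
```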

*Stream

For when you don’t actually know to what or where the bytes will be going. File, socket, whatever.

Honestly I haven’t used these much. I’m not clear on where they fit in the future of Swift and async file access. If I were going to, these links would be a place to start.

- Current GitHub: https://github.com/apple/swift-corelibs-foundation/blob/main/Sources/Foundation/Stream.swift
- 2020: https://forums.swift.org/t/is-it-possible-to-read-a-file-saved-in-sandbox-with-inputstream/37587
- https://forums.swift.org/t/make-inputstream-and-outputstream-methods-safe-and-swifty/23726/3
- https://forums.swift.org/t/inputstream-outputstream-differences-on-linux/34846/2

TextOutputStream, a protocol that other processes, including print, can dump to, may play a role later down the line.

- GitHub: https://github.com/apple/swift/blob/main/stdlib/public/core/OutputStream.swift
- Great howto article: https://nshipster.com/textoutputstream/
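To make the TextOutputStream idea concrete, here is a sketch of a conforming type that appends to a file so `print(_:to:)` can target it. `FileAppendStream` and the log file name are invented for this example. The protocol's `write` cannot throw, so this version silently drops write failures.

```swift
import Foundation

/// Hypothetical type: a TextOutputStream that appends UTF-8 text to a file.
struct FileAppendStream: TextOutputStream {
    let handle: FileHandle

    init(url: URL) throws {
        if !FileManager.default.fileExists(atPath: url.path) {
            _ = FileManager.default.createFile(atPath: url.path, contents: nil)
        }
        handle = try FileHandle(forWritingTo: url)
        try handle.seekToEnd()
    }

    // The protocol requirement is non-throwing, so errors are ignored here.
    mutating func write(_ string: String) {
        try? handle.write(contentsOf: Data(string.utf8))
    }
}

// Usage: start from a clean file, then print into it.
let logURL = FileManager.default.temporaryDirectory.appendingPathComponent("stream-demo.log")
try? FileManager.default.removeItem(at: logURL)
var log = try FileAppendStream(url: logURL)
print("saved a line", to: &log)
```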

I found other things called FileStreams floating around the Swift code, but they appear to be for specialized AV use?

FileHandle

FileHandle doesn’t give raw C FileHandle access. Thanks to Wade’s examinations we know, for example, that it doesn’t let one create files ("_NSOpenFileDescriptor to do the actual file system calls. That calls open with only the flag O_WRONLY; it does not pass O_CREAT nor O_EXCL.").

https://developer.apple.com/documentation/foundation/filehandle

“Although the read, accept, and wait operations themselves are performed asynchronously on background threads, the file handle uses a run loop source to monitor the operations and notify your code appropriately. Therefore, you must call those methods from your application’s main thread or from any thread where you’ve configured a run loop and are using it to process events.”

    private func appendData(data: Data) throws {
        let fileHandle = try FileHandle(forWritingTo: storageUrl)
        // Close the handle even if seeking or writing throws.
        defer { try? fileHandle.close() }
        // The throwing variants surface failures as Swift errors.
        try fileHandle.seekToEnd()
        try fileHandle.write(contentsOf: data)
    }

    // (from swift-argument-parser examples)
    mutating func lineCount() async throws {
        let fileHandle = try FileHandle(forReadingFrom: inputFile)
        let lineCount = try await fileHandle.bytes.lines.reduce(into: 0) { count, _ in count += 1 }
        print(lineCount)
    }
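Since FileHandle will not create the file for you, an append helper has to make sure the file exists first. A sketch of that two-step dance follows; the free function `append(_:to:)`, the `CreateFailed` error and the demo file name are hypothetical. Note that the exists-then-create check is itself a small race if another process is doing the same thing.

```swift
import Foundation

struct CreateFailed: Error {}

/// Hypothetical helper: appends `data` to the file at `url`, creating it first if needed.
func append(_ data: Data, to url: URL) throws {
    let fm = FileManager.default
    if !fm.fileExists(atPath: url.path) {
        // FileHandle(forWritingTo:) throws for a missing file, so create it here.
        guard fm.createFile(atPath: url.path, contents: nil) else {
            throw CreateFailed()
        }
    }
    let handle = try FileHandle(forWritingTo: url)
    defer { try? handle.close() }
    try handle.seekToEnd()
    try handle.write(contentsOf: data)
}

// Usage: two appends onto a file that does not exist yet.
let appendDemoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("append-demo.txt")
try? FileManager.default.removeItem(at: appendDemoURL)
try append(Data("a".utf8), to: appendDemoURL)
try append(Data("b".utf8), to: appendDemoURL)
```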

FileWrapper

This is for when an App’s Document has multiple items in a single “File”, which is perhaps better called a Document in this context. Not relevant… yet?

https://developer.apple.com/documentation/foundation/filewrapper/

“The FileWrapper class provides access to the attributes and contents of file system nodes. A file system node is a file, directory, or symbolic link. Instances of this class are known as file wrappers.”

NSFilePresenter and friends

More Document sharing between Apps and people. Could be relevant, but is not cross platform?

https://developer.apple.com/documentation/foundation/nsfilepresenter

NSFilePresenter: The interface a file coordinator uses to inform an object presenting a file about changes to that file made elsewhere in the system… Objects that allow the user to view or edit the content of files or directories should adopt the NSFilePresenter protocol.

https://developer.apple.com/documentation/foundation/nsfilecoordinator

NSFileCoordinator: An object that coordinates the reading and writing of files and directories among file presenters.
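A sketch of what a coordinated write could look like. The helper name `coordinatedWrite` and the demo file name are made up, and because NSFileCoordinator is treated here as Apple-only, the non-Darwin branch just falls back to a plain atomic write. Coordination only helps against other readers and writers that also go through a file coordinator.

```swift
import Foundation

/// Hypothetical helper: writes `text` to `url`, coordinated where NSFileCoordinator exists.
func coordinatedWrite(_ text: String, to url: URL) throws {
    #if canImport(Darwin)
    let coordinator = NSFileCoordinator(filePresenter: nil)
    var coordinationError: NSError?
    var writeError: Error?

    // The accessor runs synchronously once it is this writer's turn.
    coordinator.coordinate(writingItemAt: url, options: [],
                           error: &coordinationError) { safeURL in
        do {
            try text.write(to: safeURL, atomically: true, encoding: .utf8)
        } catch {
            writeError = error
        }
    }
    if let error = coordinationError { throw error }
    if let error = writeError { throw error }
    #else
    // Assumption for this sketch: no coordinator available, so write directly.
    try text.write(to: url, atomically: true, encoding: .utf8)
    #endif
}

// Usage
let coordinatedURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("coordinated-demo.txt")
try coordinatedWrite("one\ntwo", to: coordinatedURL)
```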

DispatchSource and Friends

Low-level system interaction handling for Apple platforms.

- https://developer.apple.com/documentation/dispatch/dispatchsource
- https://developer.apple.com/documentation/dispatch/dispatchsourcefilesystemobject
- https://developer.apple.com/documentation/dispatch/dispatchsource/filesystemevent

swift-system

Low-level system interactions for all the platforms Swift supports. Incredibly handy. Not nearly as easy as “Hey FileManager, fix it for me”, but a good group of tools to look through before rolling one’s own…

https://github.com/apple/swift-system/

Backing library for Swift-NIO

“System is a multi-platform library, not a cross-platform one. It provides a separate set of APIs and behaviors on every supported platform, closely reflecting the underlying OS interfaces. A single import will pull in the native platform interfaces specific for the targeted OS.”

2024-04-03: NIOFileSystem is the place to look, specifically. Good forum discussion

What to do?

I am not convinced that I can do multi-process managing of files without something more magic than the existing multi-platform options, but I won’t know without a little bit of trying.

Here’s the hello world. Available at https://github.com/carlynorama/SimplePersist

Tested on macOS and Linux so far.

import Foundation

public protocol StringPersistable: LosslessStringConvertible, Sendable {
  //the lossless strings can't have \n
  init?(_ description: some StringProtocol)
}

enum PersistorError: Error {
  case unknownError(message: String)
  case fileAttributeUnavailable(_ attributeName: String)
  case stringNotDataEncodable
}

public actor BasicTextPersistor<Element: StringPersistable> {
  private let fm = FileManager.default
  private(set) var separator: String
  private(set) var encoding: String.Encoding = .utf8
  let storageUrl: URL

  public static func fileExists(_ path: String) -> Bool {
    FileManager.default.fileExists(atPath: path)
  }

  public static func fileExists(_ url: URL) -> Bool {
    FileManager.default.fileExists(atPath: url.path)
  }

  @discardableResult
  public static func touch(_ url: URL) -> Bool {
    FileManager.default.createFile(atPath: url.path, contents: nil)
  }

  public init(storageUrl: URL, separator: String = "\n") {
    self.storageUrl = storageUrl
    self.separator = separator
    if !Self.fileExists(storageUrl) {
      Self.touch(storageUrl)
    }
  }

  //Takes [Element] so only values this persistor can read back get written.
  private func makeBlob(from array: [Element]) -> String {
    array.map { $0.description }.joined(separator: separator)
  }

  public func write(contentsOf: [Element]) async throws {
    try makeBlob(from: contentsOf).write(to: storageUrl, atomically: true, encoding: encoding)
  }

  //Do you need appends to be atomic? That is, as supported by the O_APPEND flag for open.
  public func append(_ item: Element) async throws {
    if let data = "\(separator)\(item.description)".data(using: encoding) {
      try appendData(data: data)
    } else {
      throw PersistorError.stringNotDataEncodable
    }
  }

  public func append(contentsOf: [Element]) async throws {
    if let data = "\(separator)\(makeBlob(from: contentsOf))".data(using: encoding) {
      try appendData(data: data)
    } else {
      throw PersistorError.stringNotDataEncodable
    }
  }

  private func appendData(data: Data) throws {
    let fileHandle = try FileHandle(forWritingTo: storageUrl)
    //close the handle even if seeking or writing throws
    defer { try? fileHandle.close() }
    try fileHandle.seekToEnd()
    try fileHandle.write(contentsOf: data)
  }

  //this is async for the actor, not the file i/o
  @available(macOS 13.0, iOS 16.0, *)
  public func retrieve() async throws -> [Element] {
    let string = try String(contentsOf: storageUrl, encoding: encoding)
    return string.split(separator: separator).compactMap({
      Element.init($0)
    })
  }

  @available(macOS 13.0, iOS 16.0, *)
  public func retrieveAvailable() async -> [Element] {
    do {
      return try await retrieve()
    } catch {
      return []
    }
  }

  public func lastModified() throws -> Date {
    //works in linux?
    //if let date = try storageUrl.resourceValues(forKeys: [.contentModificationDateKey]).contentModificationDate

    let attribute = try fm.attributesOfItem(atPath: storageUrl.path)
    if let date = attribute[FileAttributeKey.modificationDate] as? Date {
      return date
    } else {
      throw PersistorError.fileAttributeUnavailable("modificationDate")
    }
  }

  //The corresponding value is an NSNumber object containing an unsigned long long.
  //Important
  //If the file has a resource fork, the returned value does not include the size of the resource fork.
  public func size() throws -> Int {
    let attribute = try FileManager.default.attributesOfItem(atPath: storageUrl.path)
    if let size = attribute[FileAttributeKey.size] as? Int {
      return size
    } else {
      throw PersistorError.fileAttributeUnavailable("size")
    }
  }
}

#if os(Linux)
  import Foundation

  //https://github.com/apple/swift-corelibs-foundation/blob/36a411b304063de2cbd3fe06adc662e7648d5a9d/Sources/Foundation/URL.swift#L749
  //TODO: isDirectory one, too.
  extension URL {
    public func appending(path: String) -> Self {
      self.appendingPathComponent(path)
    }
  }

#endif
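On the question raised in the listing's comment about atomic appends: a sketch of what opening with O_APPEND could look like, using the POSIX `open` call and then handing the descriptor to FileHandle. This does dip into C. `atomicAppend`, `OpenFailure` and the demo file name are hypothetical, and the 0o644 permissions are an assumption. With O_APPEND the kernel moves to the end of the file on every `write` call, though a large payload could still be split across several calls.

```swift
import Foundation
#if canImport(Glibc)
import Glibc
#endif

struct OpenFailure: Error { let code: Int32 }

/// Hypothetical helper: appends `data` using a descriptor opened with O_APPEND,
/// creating the file if it does not exist.
func atomicAppend(_ data: Data, toPath path: String) throws {
    let fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0o644)
    guard fd >= 0 else { throw OpenFailure(code: errno) }

    // Wrap the raw descriptor so the write goes through the throwing FileHandle API.
    let handle = FileHandle(fileDescriptor: fd, closeOnDealloc: false)
    defer { try? handle.close() }
    try handle.write(contentsOf: data)
}

// Usage: no seek needed, and the file is created on first use.
let oAppendURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("o-append-demo.txt")
try? FileManager.default.removeItem(at: oAppendURL)
try atomicAppend(Data("one\n".utf8), toPath: oAppendURL.path)
try atomicAppend(Data("two\n".utf8), toPath: oAppendURL.path)
```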

Things to think about when saving Data

The journalism classic article checklist “Who, what, when, where, why?” provides just as much utility when trying to decide how and if to save certain data.

This is just a quick mental dump of what’s in the back of my mind when picking a mechanism to store data.
