The Operating System: Should There Be One?

Some friends and I have started a paper reading group on operating systems; the first paper is Stephen Kell’s The Operating System: Should There Be One?

The paper concerns itself with a statement made decades earlier:

An operating system is a collection of things that don’t fit inside the language; there shouldn’t be one.

The author of the paper evaluates this in the world of Unix, and tries to figure out if this statement still holds.

I’ll start with a quick overview of the paper as I understand it, without injecting my own opinions, then move on to my initial thoughts and some post-discussion thoughts.

The paper

The original statement was made concerning the Smalltalk system. This system, similar to the Lisp Machines, has a single language runtime that runs the entire machine. I guess it’s useful to make a few distinctions here: language vs. runtime, and kernel vs. operating system. I had to work to keep these straight in my head, so I’ll do my best in this writeup to make the distinction clear.

There are other examples of this kind of language/runtime/machine conflation:

  • Genera Lisp and the Lisp Machines
  • The JVM and Java Cards (and others, but I’ve only ever worked with Java Cards)
  • The BEAM and the Grisp, and undoubtedly plenty of telecom equipment

All of these examples show that successful, useful integrated systems exist.

Smalltalk prefers to use higher-level abstractions like message passing, compared to the Unix style of byte streams and raw memory interfaces. However, there are areas in which Unix facilities could be composed to more closely align with the Smalltalk style.

It’s useful to delineate what makes the Smalltalk environment desirable, the overall goal being to manage software complexity through composable abstractions. Specifically, three key benefits are identified:

  1. Programmatic availability of system-level tasks, all of which “cease to require mechanism-specific code.”
  2. Descriptive availability — this part reminded me of describe and the pervasive metasystem of Common Lisp.
  3. Interposable bindings, provided by Smalltalk’s late-bound message-passing interfaces; these let clients remain oblivious to implementation specifics, and there is only one general mechanism for binding objects.

If we compare the programmatic availability of Unix and Plan 9, we run into the application-device split. This can be understood if we look at the programming interfaces provided by Unix, of which there are four.

  1. The host instruction set
  2. System calls, which are an extension of the host instruction set with the operating system’s services
  3. The shell
  4. C

C is an abstraction over the host instruction set and system calls; programming against it is considered application programming. The shell can be thought of as an abstraction over the system call API that specialises in file- and process-level operations; this is device programming. These file mechanisms are the operating system’s raison d’être, while application mechanisms are opaque to the operating system (except, of course, for the system call traps). Let’s consider Unix in light of the three features from Smalltalk.
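
To make the layering concrete, here is a rough sketch of what the shell’s cat somefile comes down to at the system-call layer, written in Python (standing in for C) purely for brevity; the fallback path is only an example.

    import os
    import sys

    def cat(path):
        # Roughly what the shell's `cat path` amounts to at the system call
        # layer: open(2), a loop of read(2)/write(2), then close(2).
        fd = os.open(path, os.O_RDONLY)               # open(2)
        try:
            while True:
                chunk = os.read(fd, 65536)            # read(2)
                if not chunk:
                    break
                os.write(sys.stdout.fileno(), chunk)  # write(2) to fd 1
        finally:
            os.close(fd)                              # close(2)

    if __name__ == "__main__":
        cat(sys.argv[1] if len(sys.argv) > 1 else "/etc/hostname")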

Descriptiveness is provided via the “everything is a file” mechanism, wherein the filesystem acts as a sort of metasystem. Listing files in a directory can be thought of as being similar to listing slots in an object. However, the operating system defines what interfaces are available, and these interfaces tend to be optimised for storage devices (think the timestamp triad and file sizes). The facilities for exposing state at this level aren’t extended to application code, though they have been extended to cover other system abstractions: processes via the procfs filesystem and system devices via the sysfs filesystem.
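
As a small, Linux-specific illustration of the same idea (only a sketch), a process can “describe itself” by reading what the kernel already publishes about it under procfs and sysfs:

    import os

    # Each line of /proc/self/status is roughly a named slot and its value:
    # the filesystem acting as a (storage-flavoured) metasystem.
    with open("/proc/self/status") as f:
        for line in f:
            print(line.rstrip())

    # Devices get the same treatment under sysfs: every directory in
    # /sys/class/net is an "object" whose files are its attributes.
    for dev in os.listdir("/sys/class/net"):
        with open(f"/sys/class/net/{dev}/mtu") as f:
            print(dev, "mtu", f.read().strip())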

Interposability is provided for by Unix pipes: applications each have their own standard I/O streams (stdin, stdout, and stderr) that can be redirected and connected together. This has some limits: there can only be one standard input, and even if you multiplex multiple streams onto it, you can’t distinguish between them; they are simply intermixed. Beyond the standard streams, only files can be opened by name, which means application developers have to remember to provide the filename as an input option rather than hard-coding it. Two examples of this, followed by a small sketch of the difference:

  • Developers expect bash to be at /bin/bash (but it’s not on BSD systems)
  • Having to recompile programs to replace a string like /dev/dsp
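
Here is the sketch, in Python for convenience (the hard-coded path is hypothetical): a filter that reads standard input can have anything interposed in front of it with a pipe, while one that opens a fixed path can only be repointed by editing the program; the /dev/dsp problem in miniature.

    import sys

    # Reads from standard input, so any producer can be interposed in front
    # of it: `zcat log.gz | python filter.py error`, for example.
    def filter_stdin(needle):
        for line in sys.stdin:
            if needle in line:
                sys.stdout.write(line)

    # Opens a hard-coded path (hypothetical here), so the only way to point
    # it at different data is to edit the program.
    def filter_hardcoded(needle):
        with open("/var/log/syslog") as f:
            for line in f:
                if needle in line:
                    sys.stdout.write(line)

    if __name__ == "__main__":
        filter_stdin(sys.argv[1] if len(sys.argv) > 1 else "error")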

In contrast to Smalltalk, Unix doesn’t provide a facility for user-defined classes; files always belong to one of a few specific classes (e.g. regular files, directories, sockets, etc.). Furthermore, these classes are meant for larger objects, as opposed to “units of data the size of program variables.” In fact, diverse language-level abstractions are accommodated only by ignoring them. This leads to three types of fragmentation:

  1. Fragmentation of the various user-level mechanisms,
  2. Fragmentation among system-level mechanisms, and
  3. Fragmentation within opaque user code: each language implementation has its own mechanism for object binding and identity (how to represent and store object addresses).

Smalltalk punts on this problem - the answer is to just use Smalltalk for everything.

Plan 9 took the “everything is a file” approach even further:

Its design, pithily stated, is that “everything is a [file] server” - a system is a (distributed) collection of processes serving and consuming files, or things superficially like them, using a standard protocol (9P) that is transport-agnostic.

System calls like ioctl and setuid were replaced with a general binding mechanism: servers call bind and clients call open. Configuration was done using special control files. Here is where the filesystem semantics start to wear thin: what does a timestamp (e.g. mtime) mean for a process? What is the size of a control file?

There were other research kernels contemporary to Plan 9 that attempted to build message-based abstractions with finer-grained protection mechanisms.

Like Smalltalk, however, these systems offer only a grand narrative on how software could and should be structured. Unlike Smalltalk, their programming abstractions were something of a secondary concern, lacking a true aspiration to influence the fabric and construction of user-level software. Accordingly, they have been the subject of substantially less application programming experience.

Unix does have the Smalltalk facilities - generic object abstractions via files, a (primitive) metasystem, and interposable late bindings, just in a fragmented form. This fragmentation makes them composable only by experts in specific use cases, as opposed to “the naturalness and immediacy of a designed system.” For example, every language has to unnecessarily invent its own configuration “mini-language.” Meta structure tends to only be documented, not provided for via a programmatic interface. For example, the procfs man pages provide scanf format strings for parsing specific files rather than providing a general parsing facility. The MMU provides a powerful late-binding interface, translating between virtual and physical memory addresses. Finally, many of the bindings passed between programs are rewritten inline with sed or awk.
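
The procfs point is easy to demonstrate: proc(5) documents /proc/[pid]/stat as a scanf-style format string, so every consumer ends up hand-rolling a parser like this Linux-specific sketch, rather than asking the system for the structure.

    # proc(5) describes /proc/[pid]/stat only in documentation (a scanf-style
    # format), so the structure has to be re-encoded by hand in every parser.
    with open("/proc/self/stat") as f:
        raw = f.read()

    # The comm field is parenthesised and may itself contain spaces, so split
    # around the outermost parentheses rather than on whitespace alone.
    pid, rest = raw.split(" (", 1)
    comm, rest = rest.rsplit(") ", 1)
    fields = rest.split()

    print("pid:  ", pid)
    print("comm: ", comm)
    print("state:", fields[0])   # third field in proc(5)'s numbering
    print("ppid: ", fields[1])   # fourth field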

Now to the central question: how can Unix harness this lurking Smalltalk nature? Consider using grep to search some object for text. grep can’t natively handle a mix of gzipped and plaintext files, but it can handle a directory using the recursive flag. grep is an object (an executable on disk) that is sent a message, via some variant of execve, containing a reference to another object: the root of an object graph. It has two main operations: traversing the leaves of that object graph and reading lines of text.
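
Framed that way, a call like the following (the path and pattern are made up) really is just a message send: the argument vector is the message, and the directory name is a reference to the root of the object graph that grep traverses itself.

    import subprocess

    # Send grep a "message": subprocess.run ultimately forks and execve()s,
    # and -r asks grep to traverse the object graph rooted at src/ itself.
    result = subprocess.run(
        ["grep", "-r", "--", "TODO", "src/"],
        capture_output=True,
        text=True,
    )
    print(result.stdout, end="")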

A first step in recovering an object-oriented interpretation of files and raw memory is to understand them abstractly, as fields instead of uninterpreted bytes.

In an ideal system, nobody should need to tell Unix how to turn gzipped text into lines suitable for searching over; this information should be derivable from metadata about the programs available on the system, metadata attached to the gzipped file, and the program that wants to read it (e.g. grep). A Prolog-based system could be used for a first pass, building on a base set of rules, and the DWARF debugging metamodel could provide inspiration for the metasystem.
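
Nothing like the Prolog- or DWARF-based machinery the paper has in mind, but here is a toy sketch of the shape of the idea: consult what little metadata we have about an object (here, only its magic bytes) to pick an adapter that yields lines of text, and let the searcher stay oblivious.

    import gzip
    import io

    # A toy "rule base": map what we know about the object (its magic bytes)
    # to an adapter that produces lines of text. A real system would draw on
    # much richer metadata about both the data and the programs involved.
    def open_as_lines(path):
        with open(path, "rb") as f:
            magic = f.read(2)
        if magic == b"\x1f\x8b":                       # gzip magic number
            return io.TextIOWrapper(gzip.open(path, "rb"))
        return open(path, "r")

    def search(path, needle):
        with open_as_lines(path) as lines:
            return [line for line in lines if needle in line]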

The paper concludes with two key points, the first of which is that

A language is a collection of concepts that can be found and recognised within a larger system; there will be many.

The second is that Smalltalk has embraced Unix in order to support more hardware.

Initial thoughts

While reading about the Smalltalk benefits, I envisioned a Common Lisp analogue along the lines of

(describe (network-interface 'wlan0))

This could provide buffer and connection state information, obviating the need for userspace tools like ifconfig and tcpdump.
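
For contrast, here is the fragmented Unix version of that idea, as a Python sketch over Linux’s sysfs: the kernel already exposes per-interface state as files, we just have to know where each attribute lives and how it is formatted (wlan0 is only an example, so this walks whatever interfaces exist).

    import os

    # A poor man's describe for network interfaces: each attribute is a
    # separate file under /sys/class/net/<iface>/.
    def describe_interface(name):
        base = f"/sys/class/net/{name}"
        for attr in ("address", "mtu", "operstate",
                     "statistics/rx_bytes", "statistics/tx_bytes"):
            with open(os.path.join(base, attr)) as f:
                print(f"{name}.{attr} = {f.read().strip()}")

    for iface in os.listdir("/sys/class/net"):
        describe_interface(iface)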

On the whole, I’m not sure how you’d provide cross-language message passing without Unix memory or file byte streams. The “just use Smalltalk” answer seems like a copout for the general case. While it would be nice to have a single language to do everything in, I don’t see that happening anytime soon. Unless, of course, you eschew all the software out there and build your own universe from scratch. The idea of using a metasystem like DWARF is pretty interesting, though; Unix would have to evolve significantly, I think, to take advantage of this approach.

There’s also the question of reliability: complex new systems retrofitted onto Linux have tended to make those systems much harder to comprehend. Also, if we consider a typical production server running a containerised load, e.g. with

  • backend server written in Go
  • frontend written in React with a small HTTP server
  • a datastore written in C
  • networking services written in Rust

I wonder how we could provide a common interface to the necessary facilities for all of these - we’d have to adapt compilers and runtimes. What would that look like?

While I was reading this paper and working on writing it up, two articles came to my attention that seemed relevant.

The first, “The Programming Language Conundrum,” discusses why Smalltalk doesn’t really scale to larger teams. If you consider all the work that goes into an operating system, especially with regard to hardware support, it makes sense that you wouldn’t want a single powerful language runtime to carry the whole system. Other languages probably wouldn’t cut it either, though it’d be interesting to see someone try this with something like Lua (or maybe this does exist and I just haven’t seen it).

In why GNU grep is fast, the author of GNU grep discusses performance optimisations. It makes me wonder how well these would play with the gzip example - you’d have to decode chunks of the input files at a time, e.g. by having a transparent layer between the input file and the text searcher that supports seeking (or presents the entire file as a memory map). It would be interesting to see what that would look like, both from a user perspective and in terms of how to add processing layers transparently to grep (e.g. implicitly adding steps to the pipeline: grep something foo.gz becomes zcat foo.gz | grep something).
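
Here is a rough sketch of that implicit pipeline in Python: stream the gzipped file through a decompression layer into grep’s stdin in chunks, rather than teaching grep about gzip. One wrinkle relevant to the optimisation question: once the input is a pipe instead of a regular file, tricks that rely on having the whole file available (like memory-mapping it) no longer apply.

    import gzip
    import shutil
    import subprocess
    import sys

    # Equivalent in spirit to `zcat foo.gz | grep something`: decompress in
    # chunks and feed the plaintext to grep's standard input.
    def grep_gzipped(pattern, path):
        proc = subprocess.Popen(["grep", "--", pattern], stdin=subprocess.PIPE)
        with gzip.open(path, "rb") as src:
            shutil.copyfileobj(src, proc.stdin, length=64 * 1024)
        proc.stdin.close()
        return proc.wait()

    if __name__ == "__main__":
        sys.exit(grep_gzipped(sys.argv[1], sys.argv[2]))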

Some more thoughts

Wally and I were talking about the paper over Signal, and the question came up "how does the language itself not then become the OS after pulling in all of the features of a kernel?"

It seems to me that most of our interactions with the Linux kernel are through an intermediary, like a shell. But there are a lot of cases where you're just using a thin wrapper over the operating system (think Python's open, which I think, but haven't verified, ultimately bottoms out in the open(2) system call). You definitely don't get the same level of kernel introspection and manipulation that you'd get on, say, a Lisp machine [1].
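
The difference in thickness is easy to see from Python itself (just a sketch; the path is only an example): os.open is about as thin a wrapper over the open(2)/read(2) calls as you can get, while the built-in open layers buffering and text decoding on top before bottoming out in the same place.

    import os

    # Nearly raw system calls:
    fd = os.open("/etc/hostname", os.O_RDONLY)   # open(2)
    data = os.read(fd, 4096)                     # read(2)
    os.close(fd)                                 # close(2)
    print(data.decode().strip())

    # The built-in open adds buffering and decoding, but it still comes down
    # to the same calls underneath.
    with open("/etc/hostname") as f:
        print(f.read().strip())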

It's papers like this that make me think I need to just write a toy operating system to learn some more.

[1] Which is admittedly an insecure machine, but that's a different problem for a different day.