Ransomware attacks on government IT systems: One reason we need better software architecture

This story is from May 22, 2019, put out by Tim Pool of Subverse. There has been a spate of ransomware attacks on municipal governments, and in one case, on the State of Colorado’s Department of Transportation (CDOT).

I’ve done a little research, looking for details about how this happened. The best I’ve found is that the governments that got infected didn’t keep their software up to date with the latest versions and patches, and didn’t maintain good network security. That seems to be the way these ransomware attacks usually happen. In the case of the City of Baltimore, however, the attack method was different from those in the past. The infection appears to have been installed intentionally by someone, or some group, who got into the city’s systems by some means, perhaps through a remote desktop capability that was left unsecured. It didn’t come in through an infected website, or through a viral e-mail used as bait.

Baltimore’s out-of-date and underfunded IT system was ripe for ransomware attack – Baltimore Brew

Some time back, I got a tip from Alan Kay to look at Mark S. Miller‘s work (no relation) on creating secure software infrastructure using object capabilities, in an Actor sort of way. It was educational, largely because he explained that in the 1960s and ’70s, there were two schools of thought on how to design a secure system architecture. One used access control lists (ACLs), based around user roles. The other was based around what have been called “capabilities” (or “caps”). The access control list model is what almost every computer system now in existence uses, and it is the primary cause of the security vulnerabilities we’ve been dealing with for three decades.

He said the problem with ACLs is that whenever you run a piece of software while logged in under a certain user credential, that software has all the power and authority that you have. It doesn’t matter if you ran it by mistake, or if someone is impersonating you; the software has as much run of the system as you do. If it manages to elevate its privileges through a hack, making itself appear to the system as the “root” user (or “Admin”), then it has complete run of the system, no holds barred.

The capability model doesn’t allow this. To use an analogy, it sandboxes software based on the capabilities of whatever spawned it, not on the user’s credentials. Capabilities are defined for each object and component in the system, based on what resources it needs. If one component spawns another, the child may define some capabilities of its own, but otherwise it receives only the capabilities the spawning component chooses to grant, drawn only from that component’s own repertoire. A user can grant software more capabilities, if they desire.
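To make this concrete, here’s a minimal sketch in JavaScript of how capability granting at spawn time might work. All of the names here are my own illustrations, not code from E or Caja; the point is just the shape of the rule that a parent can only grant what it already holds:

```javascript
// Minimal sketch of capability granting (illustrative names only; not
// the API of E or Caja). A component holds a set of capabilities, and
// when it spawns a child it can grant only capabilities drawn from its
// own repertoire.

function makeComponent(name, capabilities) {
  const caps = new Map(Object.entries(capabilities));
  return {
    name,
    has: (capName) => caps.has(capName),
    use: (capName, ...args) => {
      if (!caps.has(capName)) {
        throw new Error(`${name} lacks capability: ${capName}`);
      }
      return caps.get(capName)(...args);
    },
    // Spawn a child holding a subset of this component's capabilities.
    spawn: (childName, grantNames) => {
      const granted = {};
      for (const g of grantNames) {
        if (!caps.has(g)) {
          throw new Error(`${name} cannot grant '${g}': not in its repertoire`);
        }
        granted[g] = caps.get(g);
      }
      return makeComponent(childName, granted);
    },
  };
}

// The root of the system holds the full repertoire.
const root = makeComponent("root", {
  readFile: (path) => `contents of ${path}`,
  deleteFile: (path) => `deleted ${path}`,
});

// An editor is granted read access only. No matter whose credentials
// launched it, it has no path to deleteFile.
const editor = root.spawn("editor", ["readFile"]);
console.log(editor.use("readFile", "notes.txt")); // "contents of notes.txt"

// Anything the editor spawns can hold at most what the editor holds.
const plugin = editor.spawn("plugin", ["readFile"]);
console.log(plugin.has("deleteFile")); // false
```

Notice that privilege escalation has no foothold here: there is no user identity for malware to impersonate, only references it was never given.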

The strength of this model is that there is no avenue for a piece of software to suddenly gain access to all of the system’s resources just because it fooled you into running it, or because someone got your credentials and installed it. Even if an impersonator gains access to the system, they don’t have the ability to rampage through its most sensitive resources, because the system doesn’t use a user-role access model.

Part of the security infrastructure of object capabilities is that objects are not globally accessible. Using the Actor model of access, object references are only obtained through “introduction,” through an interface.
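A toy illustration of this (my own construction, not Miller’s code): an object has no global namespace through which to reach other objects, so it can only talk to another object after a party holding both references performs the introduction.

```javascript
// Toy illustration of reference-by-introduction (my own construction).
// There is no global registry of objects; alice can only call objects
// she has explicitly been handed a reference to.

function makeAlice() {
  let friend = null;
  return {
    meet: (other) => { friend = other; }, // the "introduction"
    greet: () => (friend ? friend.hello() : "I know no one"),
  };
}

function makeBob() {
  return { hello: () => "Bob here" };
}

const alice = makeAlice();
const bob = makeBob();

console.log(alice.greet()); // "I know no one" -- no ambient access to bob

// Only code holding both references can introduce them.
alice.meet(bob);
console.log(alice.greet()); // "Bob here"
```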

I’m including a couple of videos from Miller to illustrate this principle, since he created a couple of programming languages (and runtimes), called E and Caja, that use this capability model in Windows and JavaScript/EcmaScript, respectively.

The first video demonstrates software written in E, from 2002.

This next one demonstrates a web app written in Caja, from 2011. A major topic of discussion here is using object capabilities in a distributed environment, so cryptographic keys are an important part of it, providing secret identifiers between objects and restricting access to sensitive system resources.

In the Caja demo, he showed that an object that’s unknown to the system can be brought into it and allowed to run, without breaking the object’s operation, and without threatening the system’s security. Again, this is because of the architecture’s sandboxing, which runs deep. It’s also because of the object-oriented nature of the system: everything is late-bound, and everything executes through interfaces. Objects using the same interfaces as sensitive system resources can emulate secure, sandboxed versions of those resources.
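As a rough sketch of that last point (again with illustrative names of my own, not Caja’s API): because untrusted code only ever sees an interface, it can be handed an attenuated stand-in for a sensitive resource and never know the difference until it tries to exceed its authority.

```javascript
// Sketch of interface-based attenuation (illustrative names; not
// Caja's API). The untrusted code accepts anything with read/write,
// so we hand it a read-only facet with the same interface.

function makeStore() {
  const data = new Map([["config", "secret=42"]]);
  return {
    read: (key) => data.get(key),
    write: (key, value) => { data.set(key, value); },
  };
}

// Same interface as the store, but writes are denied.
function readOnlyFacet(store) {
  return {
    read: (key) => store.read(key),
    write: () => { throw new Error("write denied by facet"); },
  };
}

// Untrusted code cannot tell a facet from the real store.
function untrusted(resource) {
  return resource.read("config");
}

const store = makeStore();
const facet = readOnlyFacet(store);
console.log(untrusted(facet)); // "secret=42"
// facet.write("config", "evil") throws; the real store is untouched.
```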

Miller has approached capabilities from a standpoint of pragmatism: “the operating system ship has sailed.” We’re stuck with the access control list model as our OS foundation for running software, because too much depends on it. We have a very loooooong legacy of software that uses it, and it’s not practical to just pitch it and start over. He’s talked about “tunneling” from ACLs to capabilities as a gradualist path to a better, more secure model for designing software. This still leaves some security vulnerabilities open to attack in his system, because he hasn’t rebuilt the entire operating system from scratch. Instead, he created a layer (runtimes) on top of which his own software runs, so that it can operate in this way without falling prey to attack as much as the other software on the system.

This pragmatic approach has its virtues, but it would not have prevented the attacks I’m talking about above, because they used system resources that are built into the operating system. What these attacks highlighted for me is that the system architecture we’re using does not provide adequate security for operating on a network. Sure, you can keep patching and upgrading your system software to stay ahead of cyber threats, but fall behind on that, without adequately compensating for the security model’s inherent weaknesses, and you compromise a large chunk of your IT investment.

As I am not terribly familiar with the capability model yet, this is as much as I’ll say about it. I think Miller’s demos of it were very impressive. What this is meant to get across is the need for better software architecture, not in the sense of using something like design patterns, but a different system of semantics that is surfaced through one or more programming interfaces designed to use it.

The architecture Miller demonstrated is object-oriented, and I mean that in the real sense of the word: not in the sense of Java, C++, C#, etc., but, as I said earlier, in the sense of the Actor model, which carries forward some fundamental features invented in Smalltalk and adds some more. The implementations he created seem to be synchronous programming models (unlike Actor), using the operating system’s multitasking facility to run things concurrently.

It would behoove government and industry to fund basic research into secure system software architecture for the very reason that our governments now depend so much on computers running securely. What Miller demonstrated is one idea for addressing that, but it’s as old as a lot of other ideas in computer science, and it no doubt has some weaknesses that could be overcome by a better model. Though, reading Miller’s commentary, he said he’s biased in favor of thinking that the capability model is the best. He hasn’t seen anything better.

Goals for software engineering

This is going to strike people as a rant, but it’s not. It’s a recommendation to the field of software engineering, and by extension computer science, on how to elevate the discussion in software engineering. This post was inspired by a question I was asked to answer on Quora about whether object-oriented programming or functional programming offers better opportunities for modularizing software design into simpler units.

So, I’ll start off by saying that what software engineering typically calls “OOP” is not what I consider OOP. The discussion this question came out of is likely comparing abstract data types, in concept, with functional programming (FP). What they’re also probably talking about is how well each programming model handles data structures, how each implements modular functionality, and which is preferable for dealing with those data structures.

Debating questions like this falls far short of what the discussion could be about. It is a fact that if an OOP model is what’s desired, one can create a better one in FP than what these so-called “OOP” proponents are likely talking about. It would take some work, but it could be done. It wouldn’t surprise me if there already are libraries for FP languages that do this.

The point is that what’s actually hampering the discussion about which programming model is better is the set of ideas being debated, and more broadly, the goals. A modern, developed engineering discipline would understand this. Yes, both programming models are capable of decomposing tasks, but the whole discussion about which does it “better” is off the mark. It has a weak goal in mind.

I remember recently answering a question on Quora regarding a dispute between two people over which of two singers, who had passed away in recent years, was “better.” I said that you can’t begin to discuss which one was better on any reasonable basis until you look at what genres they were in. Each singer used a different style; they weren’t trying to communicate the same things, they weren’t using the same techniques, and they were in different genres. So there’s little point in comparing them, unless you’re talking about which style of music you like better.

By analogy, each programming model has strengths and weaknesses relative to what you’re trying to accomplish, but to use them to best effect, you have to consider whether each is even a good architectural fit for the system you’re trying to build. There may be FP languages that fit better than others, or maybe none of them fit well. Likewise for the procedural languages these proponents are probably calling “OOP.” Calling one “better” than another misses the point. It depends on what you’re trying to do.

Comparisons of programming models in software engineering tend to wind down, at some point, to familiarity: which languages can serve as a platform that a large pool of developers knows how to use, because no one wants to be left holding the bag if developers decide to up and leave. I think this argument misses a larger problem: how do you replace the domain knowledge those developers had? The most valuable part of a developer’s knowledge base is their understanding of the intent of the software design; for example, how the target business for the system operates, and how that intersects with the technical decisions made in the software they worked on. Usually, that knowledge is all in their heads. It’s hardly documented.

It doesn’t matter that you can replace those developers with people who have the same programming skills. Sure, they can read the code, but they don’t understand why it was designed the way it was, and without that, it’s going to be difficult for them to do much that’s meaningful with it. That knowledge is either with other people still working at the same business, and/or with the ones who left. Either way, the new people will have to be brought up to speed to be productive, which likely means the people who are still there taking time away from their critical duties to train them. Productivity suffers even if the new hires are experts in the required programming skills.

The problem with this approach, from a modern engineering perspective, goes back to the saying that if all someone knows how to use is a hammer, every problem looks like a nail. The problem is with the discipline itself: this mentality dominates the thinking in it. And I must say, computer science has not been helping with this, either. Rather than exploring how to make the process of building a better-fitting programming model easier, it has been teaching a fixed set of programming models for the purpose of employing students in industry. I could go on about other nagging problems, quite evident in the industry, that it is ignoring.

I could be oversimplifying this, but my understanding is that modern engineering has much more of a form-follows-function orientation, and that it focuses on technological properties. It does not take a pre-engineered product as a given; it looks at its parts. It takes all of the requirements of a project into consideration, and then applies analytical techniques and this knowledge of technological properties to finding a solution. It focuses a lot of attention on architecture, trying to make efficient use of materials and labor to control costs. This focus tends to make the scheduling and budgeting for a project more predictable, since estimates are based on known constraints. Engineers also operate on the principle that simpler models (but not too simple) fail less often, and are easier to maintain, than overcomplicated ones. They use analysis tools that help them model the constraints of the design, again using technological properties as the basis, taking cognitive load off of the engineers.

Another thing is that they don’t think about “what’s easier” for the engineers. They think about what’s easier for the people who will be using what’s ultimately created, including the engineers who will maintain it, at least within cost constraints and reliability requirements. The engineers tasked with creating the end product are supposed to be doing the hard part: finding the architecture and materials that fit the requirements of the people who will be using it, and paying for it.

Clarifying this analogy: what I’m talking about when I say “architecture” is not, “What data structure/container should we use?” It’s more akin to the question, “Should we use OOP or FP?”, but it’s more than that. It involves thinking about what relationships between information and semantics best suit the domain for which the system is being developed, so that software engineers can better express the design (hopefully as close to a “spec” as possible), and what computing engine design to use to process that programming model, so it runs most efficiently. When I talk about “constraints of materials,” I’m analogizing that to hardware and software runtimes, and to their speed and load capacity, in terms of things like frequency of requests, and memory. In short, what I’m saying is that some language and VM/runtime design might be necessary for this analogy to hold.

What this could accomplish is ultimately documenting the process that’s needed—still in a formal language—and using that documentation to run the process. So, rather than requiring software engineers to understand the business process, they can instead focus on the description and semantics of the engine that allows the description of the process to be run.

What’s needed is thinking like, “What kind of system model do we need?”, and an industry that supports that kind of thinking. The industry needs to be much more technically competent to do this. I know this sounds like wishful thinking, since people in the field are always trying to address the problems right in front of them, and there’s no time to think about this. Secondly, I’m sure it sounds like I’m saying software engineers should be engaged in something impossibly hard: many are surely thinking that bugs are bad enough in existing software systems, and now I’m talking about getting software engineers involved in developing the very languages that will be used to describe the processes. That sounds like I’m asking for more trouble.

I’m talking about taking software engineers out of the task of describing customer processes, and putting them to work on the syntactic and semantic engines that enable people familiar with the processes to describe them. Perhaps I’m wrong, but I think this reorientation would make hiring programming staff based on technical skills easier, in principle, since not as much business knowledge would be necessary to make them productive.

Thirdly, “What about development communities?” Useful technology cannot just stand on its own; it needs an industry, a market around it. I agree, but I think, as I already said, it needs a more technically competent industry around it, one that can think in terms of the engineering processes I’ve described.

It seems to me one reason the industry doesn’t focus on this is that we’ve gotten so used to the idea that our languages and computing systems need to be complex, that they need to be like Swiss Army knives that can handle every conceivable need, because we seem to need those features in the systems that have already been implemented. The reality is that they’re complex because a) they have been built using semantic systems that are not well suited to the problems they’re trying to solve, and b) they’re really designed as “catch-all” systems that anticipate a wide variety of customer needs, so the problem you’re trying to solve is but a subset of that. We’ve been coping with the “Swiss Army knife” designs of others for decades. What’s actually needed is a different knowledge set, one that eschews the features we don’t need for the projects we want to complete, focuses on just the elements that are needed, and comes with a practice that focuses on design, and its improvement.

Very few software engineers and computer scientists have had the experience of using a language that was tailored to the task they were working on. We’ve come to think that we need feature-rich languages and/or feature-rich libraries to finish projects. I say no. That is a habit, born of thinking of programming languages as communication protocols not just with computers, but between software engineers. What would be better is a semantic design scheme for semantic engines, with languages on top of them, in which the project can be spec’d out and executed.

As things stand, what I’m talking about is impractical. There are likely not enough software engineers around with the necessary skills to meet the demand for computational services this way. However, what’s been going on for ages in software engineering has mostly been a record of failure, with relatively few successes (the vast majority of software projects fail). Discussions like the one described in the question that inspired this post are not helping the problem. What’s needed is a different kind of discussion, and I suggest using the topic scheme I’ve outlined here.

I’m saying that software engineering (SE) needs to take a look at what modern engineering disciplines do, and do its best to model that. CS needs to ask what scientific discipline it can most emulate, which is what’s going to be needed if SE is to improve. Both disciplines are stagnating, and are being surpassed by information technology management as a viable scheme for solving computational problems. That, however, leads into a different problem, which I talked about 9 years ago in IT: You’re doing it completely wrong.

Related posts:

Alan Kay: Rethinking CS education

The future is not now

— Mark Miller, https://tekkie.wordpress.com