06 - The Cloud Platform | Mission O/S

Name: Building Cloud Platforms That Accelerate Delivery | Mission O/S Ep 6
Uploaded: 2026-03-03
Channel: Bryon Kroger
Description: Many government cloud platforms create more friction than value. Learn what a good platform actually provides: paved roads, guardrails, and self-service that accelerates delivery.

Mission O/S

Learn why software delivery fails in government — and what's required to make shipping possible.

Episode Resources

Read

Achieving Continuous Delivery for Government Agencies

Mission O/S Live

The Missing Layer in Government Tech: A Real Operating System with John Cutler

Frequently asked questions

Why do most government cloud migrations fail to deliver speed or savings?

Because they treat the cloud as a destination instead of an operating model. They spend billions to lift and shift legacy systems from their own data centers into commercial ones—and get nothing for it. Costs go up, speed doesn't change, and security is often worse. As Bryon says in Episode 6: "If you're filling out a ticket to get a server, you're not in the cloud, you're just in yet another data center." Cloud-native means on-demand, self-service, automated infrastructure—and to harness its power, you need a platform built on a fundamentally different philosophy.

Should government programs build their own cloud platform or buy a commercial PaaS?

The Mission O/S philosophy is: "Unless it differentiates you, don't build what you can buy, don't buy what you can rent." Your organization's mission—whether it's defending the nation or providing veteran healthcare—is not to become the best in the world at managing Kubernetes clusters. The cost of FTEs to build a platform equivalent to a commercial PaaS typically dwarfs the license costs, once you account for the total cost to develop, operate, and maintain compliance across the entire portfolio.

Why does the platform need to be opinionated?

Because unlimited optionality slows the enterprise. A good platform makes choices—"This is the approved way to handle logging, this is the approved way to manage databases"—and intentionally limits optionality to create a simpler, more secure, more efficient experience for the vast majority of users. "You're not trying to build a platform that can do everything. You're trying to build a platform that makes it easy to do the right things and hard to do the wrong things." The value of providing a secure, reliable, fast engine for 95% of your teams far outweighs the value of allowing the other 5% to have unlimited choices.

How does a cloud platform lower the cost of compliance under NIST RMF?

With a well-designed, opinionated Platform as a Service, 80 to 90% of the security controls required under NIST RMF are handled centrally at the platform layer—implemented once, assessed once, and inherited by every application team that runs on top of it. This is the concept of common controls inheritance, and it's found directly in NIST Special Publication 800-37. Instead of 50 or 100 application teams independently answering hundreds of security controls, you solve the compliance problem once and the cost drops across the entire portfolio.

Transcript

Bryon Kroger (00:05):

In the last episode, we covered the path to production and we found out that a reliable, automated expressway for our code is the critical infrastructure for modern delivery. Today, we're talking about what powers the path, the cloud platform. Now, for the last decade, moving to the cloud has been the number one priority for nearly every CIO in the world. And most of them have failed spectacularly. They spend billions of dollars to lift and shift their legacy systems from their own data centers to Amazon or Microsoft data centers, and they get nothing for it. Their costs end up going up, their speed doesn't change, and their security posture is often worse. And why? Well, it's because they treat the cloud as a destination and not an operating model. So they change the physical location of their servers, but they don't actually change their culture, their architecture, and their processes.

(01:05):

They're still filling out PDF forms and waiting six months for a virtual machine. But if you're filling out a ticket to get a server, you're not in the cloud, you're just in yet another data center. So cloud native is a new way of working. It's defined by on-demand, self-service, automated infrastructure. And to harness its power, you need a foundation. You need a cloud native platform built on a fundamentally different philosophy. Now, that philosophy is simple. Unless it differentiates you, don't build what you can buy, don't buy what you can rent. So you should always default to the highest level of abstraction possible. Your organization's mission, whether it's defending the nation or providing Veteran healthcare, is not to become the best in the world at managing Kubernetes clusters. That's what we like to call undifferentiated heavy lifting. It's work that's absolutely necessary, but it provides no unique value to you or your end users.

(02:08):

And in large enterprises, it's the kind of work that gets duplicated hundreds, if not thousands of times across different teams, creating an unimaginable amount of waste. The solution is to abstract it away, to solve that problem once, centrally, and then provide it to all of your teams as a service. And this is the core idea behind a platform as a service or PaaS. "As a service" was a phrase originally meant to denote a service call over the network, not simply a service done by somebody else. A PaaS is an agreement. It's a contract between the platform team and the application development teams, usually in the form of an API. Now, the platform team should own everything from the runtime down. They manage the cloud infrastructure, the operating systems, networking, monitoring, all of it. And then the application team is only responsible for two things, their application code and their data.

(03:05):

The contract, the promise from the platform to the developer is a simple one. And it's captured in my favorite little haiku from Onsi Fakhouri. He says, "Here is my source code. Run it in the cloud for me. I do not care how." This is the most important abstraction that you can create in a modern technology organization, and it produces three massive streams of value. The first one is for your developers. It's a force multiplier. Instead of spending 50% or more of their time wrestling with infrastructure, low-level compliance, and YAML, they can spend nearly 100% of their time focused on what actually matters: building features that solve problems for your users and advance your mission. You unlock the creative and problem-solving potential of your most valuable talent. Second is for your operations team. The cost to operate a commercial platform as a service is far lower than a DIY platform, even after accounting for any licensing costs.

(04:09):

Typically, FTE costs dwarf the license costs of commercial platforms, which are able to take advantage of a massive economy of scale. And the more automated the cloud platform, the higher the savings. Third, for your security and compliance teams, a PaaS is a godsend. In the traditional model, every one of your 50, 100, or 1,000 application teams has to independently answer, and provide evidence for, hundreds of security controls. It's a nightmare of duplicated work and inconsistent implementation. With a good, opinionated platform as a service though, 80 to 90% of those security controls are handled centrally at the platform layer. So they're implemented once, assessed once, and then inherited by every single application team that runs on top of it. So this creates a consistent, auditable, and defensible security posture across your entire enterprise. And it's the technical foundation that makes a continuous authorization to operate in the federal government possible.

(05:17):

So for leaders and shareholders, a PaaS fundamentally changes the economics of your entire technology portfolio. It dramatically lowers your total cost of ownership. Your cost to develop goes down because your teams are so much more efficient. Your cost to operate goes down because you're managing one platform with a lot of automation, not 50 unique snowflakes. And your cost of compliance goes way down because you've eliminated all of that redundant security work. Now, to make this work, the platform has to be opinionated. A good platform makes choices. It says, "This is the approved way to handle logging. This is the approved way to manage databases." And it intentionally limits optionality to create a simpler, more secure, and more efficient experience for the vast majority of its users. And that's also where you're going to get a lot of pushback. You're going to have some teams, usually the most technically advanced ones, who will want full control to customize everything.

(06:20):

There's a lot of reasons why they might want to do this, but your job as a leader is to recognize that while the desire for optionality is totally understandable, it's a luxury that the enterprise can't afford. In a large, complex organization, the value of providing a secure, reliable, and fast engine for 95% of your teams far outweighs the value of allowing the other 5% to have unlimited choices. You're not trying to build a platform that can do everything. You're trying to build a platform that makes it easy to do the right things and hard to do the wrong things. It's an engine that delivers speed, security, and scalable value.

(07:14):

Now, if you're a change agent out there listening, your organization might not have a PaaS. And so the very first thing that you need to do is go have a conversation with your program leadership, or whoever is going to be the signatory on that program budget or acquisition strategy, and you need to build the case, a financial case first, around the total cost of operations. And so this can be modeled out pretty easily, but you have to know what you're getting yourself into. So a lot of times people will go and do some research and find out that you can build your own Kubernetes infrastructure and ship a Hello World app, and it costs one developer a few hours. It's like, "Oh, it's so easy." But doing that is very different from running a worldwide cloud platform on the secret network, shipping very complex applications with complex data, failover, all of these things, right?

(08:12):

Blue-green deployments. That's where things get expensive. That's why the commercial PaaS market exists: doing this is very expensive, and it requires a very large economy of scale to do efficiently. And you don't have that economy of scale. And so when you model that out, you want to model out the total cost to develop. So what's the difference between developing on a DIY platform versus deploying on top of a cloud platform and only having to worry about your data and your application code? And then you want to map out the total cost of operations. So how much does it cost to operate a commercial PaaS, where you're essentially paying for licenses and then day two support, versus building your own, where you've got to develop the thing, and then you've got day zero costs and day one and day two costs? So a lot more cost of operations.

(09:07):

And then the last one, of course, is the total cost of security and compliance. So if you have this opinionated commercial PaaS that has really stable releases and a team behind it, what does that look like versus building your own, where you're going to accomplish all of that yourself, and on an ongoing basis? And if you can inherit a lot of controls at the application layer, what does that mean for your total cost of compliance when you look across the entire system, including application developers?
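[Editor's sketch] The three buckets described above (cost to develop, cost to operate, cost of compliance) can be laid out as a small model. This is a minimal sketch and not the model used in the episode: the class names, field names, and formulas are assumptions made for illustration, and every number in the example is a placeholder rather than real program data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portfolio:
    num_apps: int            # application teams in the portfolio
    devs_per_app: int        # developers on each application team
    fte_cost: float          # fully loaded yearly cost of one person
    cost_per_control: float  # yearly cost for one team to implement and evidence one control
    years: int               # time horizon of the model


@dataclass(frozen=True)
class PlatformOption:
    name: str
    platform_build_cost: float  # one-time cost to build the platform (0 when rented)
    annual_license_cost: float  # yearly license or subscription fees (0 for DIY)
    platform_ftes: float        # people needed to run the platform
    infra_time_fraction: float  # share of app developer time spent on infrastructure
    controls_per_app: int       # controls each app team still answers itself


def cost_breakdown(option: PlatformOption, p: Portfolio) -> dict:
    """Split total cost of ownership into develop, operate, and compliance."""
    # Developer time spent on infrastructure instead of features.
    dev_time_lost = (
        p.years * p.num_apps * p.devs_per_app
        * option.infra_time_fraction * p.fte_cost
    )
    develop = option.platform_build_cost + dev_time_lost
    operate = p.years * (option.annual_license_cost + option.platform_ftes * p.fte_cost)
    compliance = p.years * p.num_apps * option.controls_per_app * p.cost_per_control
    return {
        "develop": develop,
        "operate": operate,
        "compliance": compliance,
        "total": develop + operate + compliance,
    }


# Placeholder inputs for illustration only; replace with your program's figures.
portfolio = Portfolio(num_apps=20, devs_per_app=6, fte_cost=200_000,
                      cost_per_control=2_000, years=3)
diy = PlatformOption("DIY Kubernetes", 5_000_000, 0, 25, 0.5, 300)
paas = PlatformOption("Commercial PaaS", 0, 3_000_000, 5, 0.125, 50)

for option in (diy, paas):
    print(option.name, cost_breakdown(option, portfolio))
```

The point of keeping the three buckets separate is that license fees only show up in one of them, while FTE time shows up in all three. A real model would need your own staffing, licensing, and control counts, and probably more line items than this sketch has.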

(09:45):

Now, when I say that a PaaS is a foundation for a continuous authorization to operate, you're probably going to run into some skeptical cybersecurity leaders, or maybe you are one, who have only ever known or done system-by-system authorizations with large authorization boundaries. And so a couple of things that I want to mention. First, this concept of common controls inheritance is found in the Risk Management Framework. It's in NIST Special Publication 800-37, the RMF. And when you're doing common controls authorizations, even though you might have a large authorization boundary for a given system, it's able to inherit controls from components. So in this case, we treat the PaaS as a component, and we could include all of the underlying cloud infrastructure, or we could even separate those out if you wanted to be able to mix and match. Maybe you're doing multi-cloud. Or maybe you're doing multi-PaaS.

(10:47):

And so you almost have a build-a-bear compliance, as it were: if this PaaS is deploying on top of this infrastructure, it's able to inherit a lot of the security controls from it. And likewise, when the applications deploy on top of the PaaS, they're able to inherit controls. Now, not only does that make compliance easier, but it also makes security better because you're centralizing all of that control. So instead of having an SOP where, if you're building something in an ecosystem, you're expected to know and understand the full stack of security controls and implement them yourself, you move to a model where most of that work is centralized and controlled, and it becomes really tight guardrails for the developers that are providing capability in your system boundary. And so this is highly preferable, not only for the ease of compliance, but also for the quality of security.
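[Editor's sketch] The layered inheritance described above (application on PaaS on cloud infrastructure) can be pictured as a chain of components, each providing a set of controls to whatever runs on top of it. This is a minimal sketch under stated assumptions: the `Component` class and both functions are hypothetical, and the assignment of control identifiers to layers is illustrative only. It is not an authoritative mapping from NIST SP 800-37 or SP 800-53.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    provides: frozenset                      # controls implemented and assessed at this layer
    runs_on: Optional["Component"] = None    # the layer directly beneath, if any


def inherited_controls(component: Component) -> dict:
    """Map each inheritable control to the name of the layer that provides it."""
    inherited = {}
    layer = component.runs_on
    while layer is not None:
        for control in layer.provides:
            # If two layers offer the same control, the nearest layer wins.
            inherited.setdefault(control, layer.name)
        layer = layer.runs_on
    return inherited


def responsibility_split(component: Component, required: set):
    """Return (controls inherited from below, controls the component must answer itself)."""
    available = inherited_controls(component)
    from_below = {c: available[c] for c in required if c in available}
    own = set(required) - set(from_below)
    return from_below, own


# Illustrative layering; which layer covers which control is an assumption.
infrastructure = Component("cloud infrastructure", frozenset({"PE-3", "SC-7"}))
platform = Component("PaaS", frozenset({"AU-2", "SI-2", "CM-6"}), runs_on=infrastructure)
application = Component("application", frozenset(), runs_on=platform)

required = {"PE-3", "SC-7", "AU-2", "SI-2", "CM-6", "AC-2"}
from_below, own = responsibility_split(application, required)
print("inherited:", from_below)
print("application team answers:", own)
```

Because each component only points at the layer beneath it, swapping in a different infrastructure or PaaS component changes what the application inherits without touching the application itself, which is the mix-and-match idea from the episode.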

(11:48):

Now, early on in Kessel Run, we did have some battles over the platform being opinionated and running this commercial PaaS. One of them was about the cost of the licenses, which I already covered. We ran a very detailed financial model and showed that the cost of FTEs to build something equivalent would dwarf the cost of the licenses. So that was a pretty easy one to knock out. The harder one though was actually our own development teams and third-party developers who wanted more flexibility and control, or they thought they did. The reality is, we said, "Hey, you don't have to use this Pivotal Cloud Foundry that we've installed at these locations worldwide, but if you don't, then you have to figure out how to get your own app to production, full stack, including cybersecurity and RMF, and getting your authorization to operate." And so some teams tried to embark on that and were not able to meet our threshold for speed, getting your initial valuable release into production within 120 days, and then weren't able to get subsequent weekly delivery cadences going, which was the standard that we had.

(13:04):

And so teams quickly realized like, "Oh yeah, maybe this flexibility and control is not worth it. I would rather be able to go fast and provide more responsiveness and better capability to the end users that I'm talking to every day." The other thing that we saw was that some application teams found other platforms that were maybe slightly less opinionated, had slightly longer lead times, but also offered them other advantages that we couldn't at the time because we were early on in our journey. And so we just allowed this. And I would say, don't mandate your platform. It immediately takes the incentive away from the platform team to build a good one, and it also builds resentment in the developer teams. Let them explore a little bit, at least to the degree that you can with the runway that you have. Some teams might go in a different direction.

(13:52):

That's okay. The goal isn't to build a platform that everybody must build on. The goal is to get capability to the Warfighters, the veterans, the clinicians faster and better.

