Something as a Service

Several years ago, NIST produced documents (SP 800-145 and SP 800-146) that made admirable and compelling efforts to categorize cloud service offerings with the tripartite taxonomy already in use: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). NIST sought to tighten loose definitions. This was about as good a categorization as was possible at the time. While comprehensive, the perspective set forth in the documents zeroed in on "consumer-provider interaction dynamics." It represented an important perspective, albeit not a preclusive one. The terms have proved useful for understanding the array of cloud services available in the market, at least in a general, commerce-oriented sense.

AWS is an IaaS, Heroku is a PaaS, and Salesforce is a SaaS.

Like any taxonomy, the NIST classifications are a form of distillation and the careful officials there (taking parameters of measurement quite seriously as NIST is wont to do) recognized that. They prefaced their works with short sections about intended audience and scope. They used the three IaaS-PaaS-SaaS buckets so that we can generalize in the face of messy and irresolvable realities. This is a necessary act for decision-making, but it carries with it the possibility of misreading scope. Continued adoption of the defined buckets rolls on in a manner that doesn't account for other perspectives, ones the taxonomy was never intended to preclude. A limited perception takes hold in the marketplace and in the media, filtering to the general public.

Faceted Taxonomies

I spent years of my career working closely with controlled vocabularies and taxonomies. I always stopped short of embracing ontologies, because I'm skeptical of them and just not smart enough to create the One Ontology To Rule Them All. (If literature is any guide, that approach tends to backfire anyway.)

After the dotcom bust, I spent a few years building the team, information design, search, and software for cancer.gov. NCI had been using library scientists (some of the coolest and smartest people I've worked with) to describe a hierarchical navigation taxonomy that would Contain All The Content. It turns out that building such a taxonomy is pretty easy, but making it useful is practically impossible. Through this experience, I came to the conclusion that there is no single correct way to organize any piece of information. "Is a" is never an absolute truth. Skipping many misadventures, I ultimately ended up working with Peter Morville to create a faceted taxonomy for cancer.gov, so that we could deliver relevant content to people in real need. (It even won a Webby.) Twelve years later, the visual design and technology behind the site have changed for better or worse, but the navigation and information design remain the same.

A faceted taxonomy is an approach to categorization of information that yields compellingly effective results. It rejects the fundamental notion that any particular thing actually lives in any particular hierarchy, but embraces that it lives in many simultaneously and equally valid hierarchies, all of which are reductions in resolution to allow humans to comprehend them in one context or another. Is the chair a piece of wood, or is it a thing to sit on? Both, equally, with no primary parent.
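The chair example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the catalog contents, the facet names (material, use), and the helper names build_facet_index and find are all hypothetical. The point it shows is that each item is indexed under every facet at once, with no facet acting as the primary parent.

```python
from collections import defaultdict

# Hypothetical catalog: each item carries a value for several independent facets.
CATALOG = {
    "chair": {"material": "wood", "use": "sitting"},
    "bench": {"material": "wood", "use": "sitting"},
    "table": {"material": "wood", "use": "dining"},
    "beanbag": {"material": "fabric", "use": "sitting"},
}


def build_facet_index(catalog):
    """Map facet -> value -> set of item names; no facet is the primary parent."""
    index = defaultdict(lambda: defaultdict(set))
    for item, facets in catalog.items():
        for facet, value in facets.items():
            index[facet][value].add(item)
    return index


def find(index, **criteria):
    """Return the items that match every facet=value pair given."""
    matches = [index[facet][value] for facet, value in criteria.items()]
    return set.intersection(*matches) if matches else set()


INDEX = build_facet_index(CATALOG)

if __name__ == "__main__":
    # The same chair is reachable through either facet, or through both.
    print(sorted(find(INDEX, material="wood")))
    print(sorted(find(INDEX, use="sitting")))
    print(sorted(find(INDEX, material="wood", use="sitting")))
```

Adding a new facet here means adding a key to the items, not rebuilding a tree, which is the practical difference from a single hierarchy.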

At NCI, "how do people use it?" was our most important facet, which is pretty much how Google came to dominate search and crush the dogmatic taxonomists.

How you decide to categorize services - how you think about them - ultimately will manifest as your service architecture. Once in place, choices are hard to change, so a more pragmatic approach upfront, which begins with specific decisions and validates them, means more flexibility in the long run.

We need to think about cloud in this way.

Units of Cloud are Services

The units we are trying to categorize into useful facets are individual services themselves, not broad products and certainly not companies. Think: EC2, S3, RDS, BigQuery, Translate API, Azure Cache, etc. Even within a particular named service, there are often component services, like interfaces for run-instance and stop-instance or asynchronous messaging, that may have different characteristics that suggest you use or avoid them. It's a big, messy, organic population. There are thousands of cloud services out there you can use (or not). They weren't created with regard to categorization. Any given facet that might be used to categorize them will fail at some point. We give our best estimations, then move on.

The IaaS-PaaS-SaaS Model

Categorizing service offerings as infrastructure, platform, or software is focused on describing traditional computing services, mapped onto cloud architectures. This is helpful as a mental bridge for people who are used to buying infrastructure, platforms, and applications, but it doesn't reflect the nature of cloud and utility computing. Nor will it get you very far as a builder of things on top of clouds. A great many actual services blur the lines of IaaS-PaaS-SaaS or even span all the categories.

Is DynamoDB infrastructure or software? It isn't very platformy in that it has a fairly discrete function, but it has a particular API, so in a way it is. It's certainly software, but then so is EC2 which is usually thought of as infrastructure. The three great buckets are inadequate, as is the tack of proliferating more *aaS buckets. Imagine colander pots submerged in water and you get the picture.

To be abundantly fair, in the present market, the *aaS model sometimes still may be convenient in capturing the "consumer-provider dynamic." But even through that lens, it has become anachronistic. For builders, a model's inaccuracies and anachronisms are rougher. Developing and inventing don't happen among locked paradigms, though those can provide a reference point from which to climb, zoom away, hack beyond.

The actionable wisdom is this: try to move away from making policy and technical decisions based on IaaS-PaaS-SaaS. The model has poor resolution to current reality and gets worse as things evolve.

Sharper Facets

Below is my personal list of facets that I use when architecting an application on the cloud. If you asked people I work with what this list includes, they likely wouldn't enumerate items on it, because they're implicit. Their use is just part of the decision-making process on the actual unit, which is the service in question. As with the *aaS taxonomy, these facets are finite. None are proposed as definitive alternatives to *aaS.

Service Gravity

An important way to categorize services is to measure or estimate how many other services they drag along with them or how hard they will be to remove from your application. For example, if you choose to use EC2 on AWS, you likely are going to use EBS, VPC, security groups, IAM, S3, and other AWS services as well. That makes EC2 heavy. Therefore, it has a larger impact on your project than a light service like SQS. The facet is service gravity.

Another way to measure service gravity is to account for the size and complexity of the individual service API. SWF turns out to be pretty light on service commandeering, but is itself complex and full of nuance. When we began using it, we thought it was lighter than it turned out to be, and that cost us some time in development.

One way you might use a service gravity faceted taxonomy is to create policy based on mass: the heavier the service, the more thoughtful you should be before adopting it, because the work to implement it will be significant. Gravity also implies that you are going to be pulled into doing things the service's way, i.e., using its modus operandi. Salesforce is perhaps the heaviest service we use at Luminal, not because we program against it, but because we run business processes on it.

Here's a service gravity taxonomy with some example services classified:

  • Light: SQS, S3, Twitter, Glacier
  • Medium: Docker, GitHub, DynamoDB, Google Apps for Business
  • Heavy: EC2, Salesforce, CloudFormation, Facebook
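As a rough sketch of "policy based on mass," the classification above can be held as plain data and consulted when a service is proposed. The service assignments come from the list; the policy wording, the Gravity enum, and the review_for helper are my own hypothetical illustration rather than a process we follow literally.

```python
from enum import Enum


class Gravity(Enum):
    LIGHT = "light"
    MEDIUM = "medium"
    HEAVY = "heavy"


# Classifications taken from the example list above.
SERVICE_GRAVITY = {
    "SQS": Gravity.LIGHT,
    "S3": Gravity.LIGHT,
    "Twitter": Gravity.LIGHT,
    "Glacier": Gravity.LIGHT,
    "Docker": Gravity.MEDIUM,
    "GitHub": Gravity.MEDIUM,
    "DynamoDB": Gravity.MEDIUM,
    "Google Apps for Business": Gravity.MEDIUM,
    "EC2": Gravity.HEAVY,
    "Salesforce": Gravity.HEAVY,
    "CloudFormation": Gravity.HEAVY,
    "Facebook": Gravity.HEAVY,
}

# Hypothetical policy wording: the heavier the service, the more thought up front.
REVIEW_POLICY = {
    Gravity.LIGHT: "adopt freely; easy to remove later",
    Gravity.MEDIUM: "discuss with the team before adopting",
    Gravity.HEAVY: "plan for the services it drags along and for doing things its way",
}


def review_for(service):
    """Return the policy note for a service, or a prompt to classify it first."""
    gravity = SERVICE_GRAVITY.get(service)
    if gravity is None:
        return "unclassified; estimate its gravity first"
    return REVIEW_POLICY[gravity]


if __name__ == "__main__":
    for name in ("SQS", "EC2", "SWF"):
        print(f"{name}: {review_for(name)}")
```

Note that an unclassified service gets a prompt rather than a default bucket; the estimate is a judgment call that someone has to make.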

Service Generality

How many ways can I use a service? Something like S3 is extremely generic and can be used nearly everywhere. Something like GitHub's API is pretty narrowly focused. If you are thinking about integrating a service into your application, having many uses for the service may provide greater efficiency. On the other hand, a very narrow but rich service might be worth the effort. Investing the time to learn and use it is a good choice, if its adoption reduces the amount of time you spend on internal engineering. Related to a service's generality is its applicability. Because you can do something with a service doesn't mean you necessarily should.

Architecture and Interfaces

When you are building an application or service, you always have an architecture. It may be an implicit architecture or an explicit one, but it's there. When you integrate others' services into yours, you are inheriting architecture, at least in the interface layer that you're writing against. Are you using OAuth? Are they? Do you prefer JSON or XML? Are websockets going to cause you to add a new infrastructure tier because you are stateless/ephemeral? The architecture and interfaces are too numerous to categorize easily. Often, alignment has a "feel" to it as you build your application or service, but consider underlying architecture nonetheless. Too diverse a set of architectural patterns causes confusion and extra work over time.

When you use any service, you leave data behind. When that data sits on someone else's computers, the provider can be subpoenaed and required to turn over the goods. Before you decide to use a service, it's a great idea to look at the provider's usage agreement and legal track record.

Here are a few others, in brief. Some are obvious:

  • Cost: Guess at your usage at low, medium, and high scale. Calculate costs.
  • Vendor History: Do they lock in? Do they have a highly profitable professional services business? (Both typically are bad signs.)
  • Breaking Changes: How often has the service deprecated interfaces or changed implementation in a way that breaks things?
  • Performance: Does the service have the performance you need for the operation? From everywhere you'll be? Can it scale to your wildest success scenario?
  • Business Focus: Does the service provider align with the use and future direction of your service?
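For the cost facet, the guess-then-calculate step can be as small as the sketch below. Every number in it is a placeholder I made up for illustration: the unit prices, the usage guesses, and the two-part pricing model (requests plus storage) are assumptions, not any provider's real rates.

```python
# Hypothetical unit prices; replace with the provider's published rates.
PRICE_PER_MILLION_REQUESTS = 0.40  # dollars, assumed
PRICE_PER_GB_MONTH = 0.02          # dollars, assumed

# Guessed monthly usage at three scales: (requests, gigabytes stored).
SCENARIOS = {
    "low": (1_000_000, 10),
    "medium": (50_000_000, 500),
    "high": (1_000_000_000, 10_000),
}


def monthly_cost(requests, gb_stored):
    """Estimate one month's bill in dollars from request count and storage."""
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    storage_cost = gb_stored * PRICE_PER_GB_MONTH
    return request_cost + storage_cost


def estimate_all(scenarios):
    """Return scenario name -> estimated monthly cost, rounded to cents."""
    return {name: round(monthly_cost(*usage), 2) for name, usage in scenarios.items()}


if __name__ == "__main__":
    for name, cost in estimate_all(SCENARIOS).items():
        print(f"{name}: ${cost:,.2f} per month")
```

The value is less in the totals than in seeing how the bill grows between the low and high guesses.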

Facets in Action

Let's say we are building a website that collects user sign-up information for a beta program. In a traditional development mindset, we'd stack some compute somewhere and run a web server and a database to collect the information. In the cloud world, we have more and better choices. We may choose to host our content in an object store and/or a content distribution network and directly interface from the browser to a service that allows us to collect user data and export it. For example, Salesforce has a service called WebToLead. But Salesforce is a SaaS (or is it a PaaS?) and we need some infrastructure. The *aaS consideration is irrelevant to the application because we are using an actual service, which is what clouds are made of. Is WebToLead part of Salesforce's PaaS or SaaS? Who cares?

Other facets become useful:

The first concern in this user sign-up example has to be service gravity. Are many other services being dragged along or is the API especially complex with WebToLead or another option? No, on both counts. Because we are only collecting data to be exported, the service gravity is pretty light.

What about service generality? I doubt we'd use the service under consideration for other kinds of data, so it's not very general. But its narrow focus is attractive and worth our investment because the time we'd spend on internal engineering is minimal.

The architecture and interfaces are all REST goodness in our example. We're not building the service into our application in a deep way, so the service provides a good answer to a not-very-important question.

I cannot see a scenario where I'd build a complex service unrelated to CRM atop the Salesforce APIs, but that's not because they are a particular type of *aaS; it's because I'm concerned about the service gravity and generality issues for non-CRM focused services.

Then again, it might be informative to do a thought experiment on building a non-CRM service on those APIs just to see if my intuition is correct or if I have an unfair bias towards the business focus facet.

The important point is that faceted taxonomies aren't limited in count or scope. They are whatever "view" of a large collection of things is helpful to organizing thoughts and decision-making. I caution against creating matrices of facets, overusing tables, or exploiting other meta-categorization. I don't use these as rigid selection criteria, but as ways to understand what is really offered in order to come to my own conclusions. Don't do math on taxonomies. Let each facet speak for itself. While you won't have a simple, sure answer, you will be dealing with reality.
