Why We Built Ludwig — a DSL for the Cloud of Today and the Future

The approach taken by Fugue is to allow cloud infrastructure to be treated as code. This concept is required if developers are to generate applications that can exploit the cloud's capabilities and deliver on the promise of immutable infrastructure.

Fugue simplifies your life on the cloud through abstractions. Abstractions can be expressed in one of two ways: as black boxes, or as language. We put as much of Fugue into language as we can, so that you can do things with it that we didn't predict.

Black boxes are easier for a platform builder to make, because they do things in one particular way. They are also less flexible for the user, because they do things in one particular way, which may not be the way the user needs or prefers.

As users, we prefer flexibility and access, so we prefer languages to black boxes. We made Fugue to be something we would enjoy using, so we decided to express a lot of the system as a language. Since we knew we wanted to go down the language path, we first looked to see if there was something out there that would be a good choice, based on our criteria. These are:

  1. Doing typical things on the cloud should be easy, and not feel like programming.
  2. Users should get great error messages, fast.
  3. If the program compiles, it should almost always work when operating against the cloud.
  4. Doing sophisticated things should be possible, in a safe and predictable way.
  5. Doing sophisticated things once should turn them into shareable, easy things.

We didn't find anything that would meet these criteria, so we created a minimalist domain-specific language called Ludwig. Ludwig is a declarative language that also has functions for abstraction.

Doing typical things on the cloud should be easy, and not feel like programming.

Most uses for Fugue can be expressed in Ludwig by declaring simple records. If you're familiar with YAML or JSON, Ludwig records should look really familiar:

type Config:
  region: AWS.Region
  ami: String

Config jsonish: {
  region: AWS.Us-east-1,
  ami: "ami-6869aa05",
}

For most users, this is all you really need to know about Ludwig, because we provide a rich standard library that covers a lot of ground for building and operating cloud services. Need a VPC network, a compute instance, or a Lambda function? They look like this:

VPC

example-network: Network.public {
  name: "Fugue Ink Network",
  region: fugue-ink-region,
  cidr: "10.0.0.0/16",
  subnets: [ (AWS.A, "10.0.1.0/24"),
             (AWS.B, "10.0.2.0/24"), ] 
}

EC2

example-instance: EC2.Instance.new {
  keyName: "example-key",
  instanceType: EC2.M4_large,
  subnet: public-10-0-1-0,
  tags: [example-tag],
  securityGroups: [example-sg],
  iamInstanceProfile: example-profile,
  image: "ami-d0f506b0"
}

Lambda Function

image-processor: Lambda.Function {
  functionName: "image-processor",
  runtime: Lambda.Nodejs4_3,
  role: lambda-role,
  handler: "exports.handler",
  description: "processes user-uploaded images in s3",
  timeout: 60,
  memorySize: 128,
  publish: True,
  vpcConfig: None,
  region: example-region,
  code: lambda-code
}

Users should get great error messages, fast.

So far, you might be thinking that Ludwig is just another YAML or JSON template format, but it's actually much more, even when declaring simple records based on the standard library. Behind these records are well-defined types and functions that contain a lot of knowledge about cloud services, and return useful errors as you work. Often, when someone learns Ludwig, they start by reading a few documents, but quickly realize that the error messages are so rich that they form a sort of interactive documentation, and instead of poring over specs, they experiment.

For example, let's take a look at the VPC example that we just provided and how this might work in practice. We will intentionally make some errors to show how intuitive and easy it is to get things right with Ludwig.

demo-app-network: Network.new {
  name: "demo-app-network",
  region: AWS.Us-west-3,
  cidr: "10.0.0.0/8",
  publicSubnets: [ (AWS.A, "10.0.1.0/24"),
                   (AWS.B, "10.0.2.0/24"), ],
  privateSubnets: [],
}

If you know AWS, you probably know that there is no region called Us-west-3, but someone new to the cloud might not, or might fat-finger the region name. If we try to compile this composition, the Ludwig compiler (/opt/fugue/bin/lwc) will return the following error:

  "FugueDemo.lw" (line 14, column 18):
  Not in scope:

    6|   region: AWS.Us-west-3,
                 ^^^^^^^^^^^^^

  Constructor not in scope: AWS.Us-west-3
  Hint: perhaps you mean one of:
    AWS.Us-west-1 (from Fugue.Core.AWS.Common)
    AWS.Us-west-2 (from Fugue.Core.AWS.Common)

First, you get the actual line and column where the error occurred. This becomes particularly important when the error originates in an imported module and arrives with a stack trace: no more searching through thousands of lines of code to find what generated an unhelpful AWS error. Second, you get some hints as to what the correct answer might be. This isn't black-box code; lwc is simply smart enough to look at the type definition for regions and match things that look close. Here's the Region type:

type Region:
  | Ap-northeast-1
  | Ap-northeast-2
  | Ap-southeast-1
  | Ap-southeast-2
  | Eu-central-1 
  | Eu-west-1
  | Sa-east-1
  | Us-east-1
  | Us-west-1
  | Us-west-2
  | Ap-south-1

Region is a sum type, meaning that a Region value can be any one of the constructors listed after a pipe (|). Regions are just one of many domains of information encoded in the types that ship with Fugue. Those type definitions are right on disk in /opt/fugue/lib as well, so you can take a look yourself. Let's keep working on getting this VPC correct, though. Now we've changed the region to Us-west-2, so that will be fine, but we've made an error in the CIDR block:

import Fugue.AWS as AWS
import Fugue.AWS.Pattern.Network as Network

demo-app-network: Network.new {
  name: "demo-app-network",
  region: AWS.Us-west-2,
  cidr: "10.0.0.0/8",
  publicSubnets: [ (AWS.A, "10.0.1.0/24"),
                   (AWS.B, "10.0.2.0/24"), ],
  privateSubnets: [],
}

ludwig (runtime error):
  "/opt/fugue/lib/Fugue/AWS/EC2/Vpc.lw" (line 132, column 29):
  error:

    132| Bool isValid(VpcSpec spec): isValidCidr(spec.cidrBlock)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^

  Vpc cidrBlock size must be between a /28 and /16 netmask.

  Stack trace:
    In call to isValid at "/opt/fugue/lib/Fugue/AWS/EC2/Vpc.lw" (line 40, column 6)
    In call to new at "/opt/fugue/lib/Fugue/AWS/Pattern/Network.lw" (line 67, column 12)
    In call to new at "FugueDemo.lw" (line 19, column 19)

Note that the error carries AWS-specific knowledge, which is written into the library itself. In this case, a VPC CIDR block must be between a /28 and a /16, and the /8 we asked for is too large. The error has a full stack trace at the bottom because the record we defined above is passed into the Network.new function in the standard library.
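
The range check behind this error is simple to sketch. The following is a Python approximation, not the Ludwig source of isValidCidr, which is not shown here. The /16 and /28 bounds come from the error message above; the function names and the dictionary returned by new_vpc are assumptions made for illustration.

```python
import ipaddress

# Bounds taken from the error message above: the block may be no larger
# than a /16 and no smaller than a /28.
LARGEST_PREFIX = 16
SMALLEST_PREFIX = 28


def is_valid_vpc_cidr(cidr):
    """Return True if the IPv4 network `cidr` has an allowed prefix length.

    Malformed input raises ValueError from the ipaddress module.
    """
    network = ipaddress.IPv4Network(cidr)
    return LARGEST_PREFIX <= network.prefixlen <= SMALLEST_PREFIX


def new_vpc(name, cidr):
    """Hypothetical constructor that refuses to build an out-of-range VPC."""
    if not is_valid_vpc_cidr(cidr):
        raise ValueError(
            "Vpc cidrBlock size must be between a /28 and /16 netmask."
        )
    return {"name": name, "cidrBlock": cidr}
```

In this sketch, new_vpc("demo-app-network", "10.0.0.0/8") fails before anything is sent to a cloud API, while "10.0.0.0/16" passes. The point is the same as in Ludwig: the constraint lives next to the constructor, so every caller gets the check.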

If the program compiles, it should almost always work when operating against the cloud.

This feedback loop ends when the user gets a successful compile of the Ludwig program, and because the standard library is pretty good at knowing what works and what doesn't on the cloud, this generally means the composition will run successfully. While it is still possible to have runtime errors when a composition is run against actual cloud APIs — for example, trying to create an object that is a duplicate on a service where this isn't allowed, such as S3 — the vast majority of actual user mistakes can be caught during compilation.

Before building, modifying, or terminating cloud infrastructure, you'll probably want to know what will actually happen in the environment. So, after compilation, the next recommended step is a dry run. The dry-run feature produces JSON output describing what would happen in AWS if you were to execute fugue run, fugue kill, fugue resume, or fugue update. The workflow of using Fugue with Ludwig is therefore fast, interactive, and low-risk.

We cannot fully solve for runtime environment issues and some kinds of API errors in Ludwig, so these are the domain of the Fugue Conductor, which has commands like status and history that can return detailed information on what is happening, what went right, and what went wrong.

Doing sophisticated things should be possible, in a safe and predictable way.

Declaring records is fine and good, but you may eventually want to create some of your own abstractions, and this is where you can use functions in Ludwig. As with any language, writing functions in Ludwig has a learning curve, but it's not too bad. You can read our docs on Ludwig to learn how to write them, so we won't attempt a tutorial in this blog post. We will point out a couple of interesting use cases for functions.

Combining a collection of cloud services into a higher-order abstraction is the most typical use case for functions. For example, your company might have a standard way to build a Kubernetes laboratory. With Fugue, you can write a Kubernetes cluster constructor that takes its parameters in the familiar record form shown above. The arguments are the key decisions for deploying the cluster:

fun newCluster {
    name: String,
    region: AWS.Region,
    subnetAZ: AWS.AvailabilityZone,
    keyName: String
  } -> KubeCluster:

  [ ... snip ... ]

This code makes a simple laboratory like this:

[Image: kubernetes_lab_rt.png]

Then, making a new cluster is as easy as a few lines of code.

# A Kubernetes laboratory cluster. Includes a VPC and requisite
# network bits, IAM and security bits, and compute for etcd and Kube.

composition

import KubernetesCluster as KubeCluster
import Fugue.AWS as AWS

# Run the cluster function. Everything happens from there.
my-kube: KubeCluster.new {
             name: "my-kube-net",
             region: AWS.Us-west-2,
             subnetAZ: AWS.A,
             keyName: "kubernetes_the_hard_way"
         }

Another use case is to enforce company policy on how infrastructure is used. For example, you might have an application that needs to conform to HIPAA regulations because it contains patient healthcare data. HIPAA on AWS is possible, but it can be a laborious and manual process to get it right, generally implemented with monitoring and reporting solutions. Fugue allows HIPAA rules to be abstracted, with all the other Ludwig features such as useful error messages. This allows you to move really fast when developing on cloud, while maintaining policy and compliance of the cloud services.

For instance, HIPAA rules require that instances be dedicated. With Ludwig, we can build validations that confirm this is the case. Consider this EC2 instance, which is meant to be HIPAA-compliant:

import Fugue.AWS.EC2  as EC2
import Fugue.HIPAA.AWS

instance: EC2.Instance.new {
  instanceType: EC2.T2_nano,
  monitoring: False,
  subnet: subnet,
  securityGroups: [sg],
  image: "ami-6869aa05",
}

By virtue of importing the Fugue.HIPAA.AWS library, Ludwig will check that the instance is HIPAA-compliant. In this case, it actually is not, because the t2.nano instance type doesn't support dedicated tenancy.

ludwig (runtime error):
  "hipaa-instance.lw" (line 5, column 17):
  error:

    25|   instanceType: EC2.T2_nano,
                        ^^^^^^^^^^^

  HIPAA: The instance type t2.nano does not support dedicated instances, which are required by AWS HIPAA guidelines
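
The idea of a policy library that adds checks to every instance can be sketched in Python as a registry of validators. This is only an analogy under stated assumptions, not how Fugue.HIPAA.AWS works internally. The validator decorator, the Instance class, and the NO_DEDICATED_TENANCY set are hypothetical; the set contains only t2.nano because that is the one type the example above rejects.

```python
from dataclasses import dataclass

# Assumption for illustration: only the type rejected in the example above.
NO_DEDICATED_TENANCY = {"t2.nano"}

# Checks registered by policy code; every new instance is run through them.
VALIDATORS = []


def validator(func):
    """Register `func` as a check applied to every new instance."""
    VALIDATORS.append(func)
    return func


@dataclass(frozen=True)
class Instance:
    instance_type: str
    image: str


@validator
def requires_dedicated_tenancy(instance):
    """Return an error message if the type cannot run as a dedicated instance."""
    if instance.instance_type in NO_DEDICATED_TENANCY:
        return (
            "HIPAA: The instance type " + instance.instance_type
            + " does not support dedicated instances"
        )
    return None


def new_instance(instance_type, image):
    """Build an Instance, failing fast if any registered check objects."""
    instance = Instance(instance_type, image)
    errors = [msg for msg in (check(instance) for check in VALIDATORS) if msg]
    if errors:
        raise ValueError("\n".join(errors))
    return instance
```

Here the person declaring the instance never calls the check directly. Registering the validator is enough for new_instance to enforce it, which mirrors how importing a policy library changes what counts as a valid record.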

Doing sophisticated things once should turn them into shareable, easy things.

Once you've written your abstractions, they can be shared across your organization just like any other code. Most organizations that use the cloud have a DevOps team with cloud expertise, and lots of development teams without it. Functions allow the DevOps team to vend simple code that dev teams can use to stand up environments spanning all the AWS services Fugue covers, automatically. Through data externalization and the Fugue process model, you get real infrastructure as code: statically typed, safe code with great error messages.

We continue to expand Fugue and Ludwig to make them easier to use and more powerful. In the near future we'll be making some announcements on even more powerful validation and safety features, as well as lots of additional service coverage.
