pulumi/docs/overview.md

28 KiB

Lumi Overview

Lumi is a toolset and runtime for creating reusable cloud services. The resulting packages can be shared and consumed similar to your favorite programming language's package manager. Lumi is inherently multi-language and multi-cloud, and lets you build abstractions that span different cloud environments and topologies, if you desire.

This document provides an overview of the Lumi system, its goals, primary concepts, and the system architecture.

Problem

Cloud services are difficult to build, deploy, and manage. The current trend to use increasingly fine-grained microservices increases complexity, transforming most modern cloud applications into complex distributed systems, without much of the supporting language, library, and tooling support you'd expect in a distributed programming model.

There are many aspects to building distributed systems that aren't evident to the newcomer: configuration, RPC, dependency management, logging, fault tolerance, and zero-downtime deployments, to name a few. Even the experienced practitioner will quickly find that today's developer and operations tools do not guide us down the golden path. It's common to need to master a dozen tools before even getting an application up and running in production. Worse, these tools are closer to ad-hoc scripts, markup files, and unfamiliar templating systems, than real programming languages.

On top of that complexity, it is difficult to share knowledge. Most modern programming languages have component models that allow you to encapsulate complex functionality underneath simple, easy-to-use abstractions, and package managers that allow you to share these components with others, and consume components shared by others, in the form of libraries. The current way that cloud architectures are built and deployed has no such componentization, sharing, or reuse. Instead, "sharing" happens through copy-and-paste, resulting in all the standard manageability woes you would expect.

The cloud platforms are also divergent in the configuration formats that they accept, infrastructure abstractions that they provide, and the specific knobs used to configure those abstractions. It is as though every web application developer needs to use a different programming language, while understanding the intricate details of the underlying thread scheduler, filesystem, and networking stack, just to target Linux, macOS, or Windows. This is not how things are done: instead, most programmers use a language like Node.js, Python, Go, Java, or Ruby, to hide these details.

Finally, once such an application is up and running, managing and evolving it requires similarly ad-hoc and individualized tools and practices. Applying changes to a running environment is often done manually, in an unauditable way, and patches are applied unevenly and inconsistently, incurring security and reliability hazards.

All of the above is a big productivity drain, negatively impacting the agility of organizations moving to and innovating in the cloud. Containers have delivered a great improvement to productivity and management of single nodes in a cluster, but have yet to simplify entire applications or entire clusters. By adopting concepts and ideas that have been proven in the overall landscape of languages and runtimes, however, Lumi significantly improves this situation.

Solution

Lumi lets developers author cloud components in their language of choice (JavaScript, Python, Ruby, Go, etc). These components include infrastructure, service and application-level components, serverless, and entire fabrics of topologies. Lumi embraces an immutable infrastructure philosophy, improving analysis, automation, auditing, and repeatability. In cattle versus pets terminology, Lumi prefers cattle, but can cater to pets.

Lumi's approach to using a real language and package manager is in contrast to most approaches to cloud configuration today which typically use obscure configuration markups, DSLs, and, occasionally, templating functionality for conditional logic. This unified view is particularly helpful when building serverless applications where you want to focus on code. In such applications, the infrastructure management pieces increasingly fade into the background.

At the same time, Lumi is polyglot, allowing composition of components authored in many different languages. Cloud components can be built by reusing existing components shared by others, and published to the Lumi package manager for others to use. A common intermediate format and runtime is used to stitch these components together. Cloud services are in fact simply instances of these components, with property values configured appropriately. Change management leverages existing source control workflows and is performed by Lumi's understanding of the overall graph of dependencies between those services. Think of each service as an "object" that is running in the cloud.

Lumi may target any public or private cloud, and may be used in an unopinionated way (programmatic IaaS), or fully opinionated (more PaaS, FaaS, or BaaS-like), depending on your preference. Although you are free to program directly to your cloud provider's specific abstractions, using the full power of your native cloud, Lumi also facilitates building higher-level cloud-neutral components that can run anywhere. This includes compute services, storage services, and even more logical domain-specific services like AI, ML, and recognition. This is done using the standard techniques all programmers are familiar with: interface-based abstraction, class-based encapsulation, sharing, and reuse.

Example

Let us look at a handful of brief examples in Lumi's flavor of JavaScript, LumiJS.

Basic Infrastructure

The first example is an unopinionated description of infrastructure that projects onto AWS in a straightforward way. It creates a virtual private cloud (VPC) and a subnet per availability zone, the foundation for most new private clouds:

import * as aws from "@lumi/aws";

export let vpcCidrBlock = "10.0.0.0/16";

let vpc = new aws.ec2.VPC("acmecorp-cloud", {
    cidrBlock: vpcCidrBlock,
});
let subnets = [];
for (let zone of aws.util.availabilityZones()) {
    subnets.push(new aws.ec2.Subnet("acmecorp-subnet-" + zone, {
        vpc: vpc,
        availabilityZone: zone,
        cidrBlock: aws.util.suggestSubnetCidr(vpcCidrBlock, zone),
    });
}

Although this is a simple example, it shows off some of the underlying power of the Lumi model:

  • Full configuration of AWS resources is available to us.
  • Resources are created and configured using standard language constructors and parameters.
  • The full power of a language is available to us: classes, loops, branches, function calls, etc.
  • The AWS resources themselves, in fact, are just classes exported from an imported library.
  • Passing resource arguments -- like vpc to Subnet -- is capability-based instead of weak string IDs.
  • The vpcCidrBlock export is configurable and can be changed if the default 10.0.0.0/16 isn't right.
  • This library can also export helpful functions, like aws.util.availabilityZones, avoiding hard-coding.

This alone is a significant step forward compared to what IT and developers are accustomed to.

A Little Abstraction

The above example demonstrates an executable package that creates a VPC and its subnets. It is a small step to package up this code into a reusable library that others can use.

Packaging up abstractions is useful for many reasons. Perhaps it just saves you from typing the same boilerplate resource configuration logic over and over again, as is common in today's infrastructure markup. Or maybe you just want to share common patterns within your organization, to enforce some degree of consistency. Finally, perhaps you wish to share your creation with the overall community, so that others can leverage your hard work.

No matter the reason, doing this is as simple as exporting your infrastructure logic as a class that others can use:

import * as aws from "@lumi/aws";

export class AWSInfraFoundation {
    public readonly vpc: aws.ec2.VPC;
    public readonly subnets: aws.ec2.Subnet[];

    constructor(name: string, args?: AWSInfraFoundationArgs) {
        this.vpc = new aws.ec2.VPC(name + "-cloud", {
            cidrBlock: args.vpcCidrBlock || "10.0.0.0/16",
        });
        this.subnets = [];
        for (let zone of aws.util.availabilityZones()) {
            this.subnets.push(new aws.ec2.Subnet("acmecorp-subnet-" + zone, {
                vpc: this.vpc,
                availabilityZone: zone,
                cidrBlock: aws.util.suggestSubnetCidr(vpcCidrBlock, zone),
            });
        }
    }
}

export interface AWSInfraFoundationArgs {
    vpcCidrBlock?: string; // an optional CIDR block for the VPC (default: 10.0.0.0/16).
}

Now, given this definition exported from a library acmecorp/infra, another executable package can instantiate it:

import {AWSInfraFoundation} from "...";
let infra = new AWSInfraFoundation("acmecorp");

This small program essentially has the same effect as the original example. And the VPC and Subnet resource objects are trivially accessible, and usable in the same ways, simply by accessing the properties infra.vpc and infra.subnets.

A Little Syntactic Sugar

At the bottom of the overall ecosystem of libraries are the basic resource libraries for cloud resources like AWS, Azure, Google Cloud, Kubernetes, VMWare, and so on. These resources project the core capabilities of these low-level resources in their full glory, typically with very little on top (except for some types and schema for convenience).

As we will soon see in the section immediately following this one, at the top of this ecosystem are cloud-neutral, high-level abstractions.

In the middle, however, there are some abstractions that still deliver the full power of the underlying abstractions, but add a modest amount of convenience. The most notable example is in the serverless domain, where cloud functions -- such as AWS Lambda -- can be expressed using ordinary lambdas.

Let's take an example.

To create an AWS Lambda using the lowest level, raw projection of the resources, we either need to express the code using inline code embedded in a string, or we need to upload that code to S3 and reference it by bucket name. Either way, the expression of this is unnatural to say the least. For example:

import * as aws from "@lumi/aws";

let dst = new aws.s3.Bucket("thumbnails");

let thumbnailer = new aws.lambda.Lambda("thumbnailer", {
    handler: "index.handler",
    runtime: "nodejs",
    code: {
        zipFile:
            "var config = require('config');\n" +
            "var request = require('request');\n" +
            "var AWS = require('aws-sdk');\n" +
            "var S3 = new AWS.S3();\n" +
            "exports.handler = function(event, context) {\n" +
            "   var src = event.Records[0].s3.bucket.name;\n" +
            "   var key = event.Records[0].s3.object.key;\n" +
            "   s3.getObject(\n" +
            "       {\n" +
            "           Bucket: srcBucket,\n" +
            "           Key: srcKey,\n" +
            "       },\n" +
            "       (data, err) => {\n" +
            "           if (err) {\n" +
            "               console.error(`Unable to resize: ${err}`);\n" +
            "               context.done();\n" +
            "           }\n" +
            "           else {\n" +
            "               let thumb = generateThumbnail(data);\n" +
            "               let dstBuck = config.buckets[" + dst.id + "];\n" +
            "               request.post(dstBuck.host + '/' + key,\n" +
            "                   {\n" +
            "                       form: {\n" +
            "                           bucket: dstBuck.bucket,\n" +
            "                           secret: dstBuck.secret,\n" +
            "                       }\n" +
            "                   },\n" +
            "                   (err, response) {\n" +
            "                       if (err) {\n" +
            "                           console.error(`Unable to post thumbnail: ${err}`);\n" +
            "                       }\n" +
            "                       context.done();\n" +
            "                   }\n" +
            "               );\n" +
            "           }\n" +
            "       }\n" +
            "   );\n" +
            "};\n",
    },
    memory: 1024,
});

let src = new aws.s3.Bucket("images", {
    notificationConfiguration: {
        lambdaConfigurations: [{
            event: "s3:ObjectCreated:*",
            function: thumbnailer,
        }],
    },
});

Although a direct translation of the underlying AWS APIs -- and remarkably similar to the AWS CloudFormation expression of such a topology -- this example suffers from many obvious problems. Embedding source as text leads to a poor tooling experience overall. The subscription to the S3 bucket event source is awkward. And wiring up the destination bucket requires that we dynamically inject code as text into the lambda, and fetch configuration through awkward API calls that make the inner code feel completely and utterly disconnected from the outer code, which of course, it actually is.

The Lumi high-level abstractions alleviate this problem, as we will see soon. But even without eschewing the AWS-specific nature of lambdas and S3 subscriptions, we can use the aws.x package for considerable convenience:

import * as aws from "@lumi/aws";

let dst = new aws.s3.Bucket("thumbnails");
let src = new aws.s3.Bucket("images", {
    onNewObject: new aws.x.lambda.Lambda("thumbnailer", {
        code: async (event, context) => {
            // Generate a thumbnail from the object payload.
            let thumb = generateThumbnail(event.getObject().data);
            // Now store the new thumbnail and raise our event.
            await dst.putObject(event.key, thumb);
        },
        memory: 1024,
    })
});

Notice how concise this code is in comparison. This is for numerous reasons. First, we can use a real lambda in the source language to represent the lambda's code. Next, the lambda can simply close over variables like src and dst, as capabilities, rather than awkwardly marshaling IDs as strings statically at compile-time. Next, we have first class APIs on the various context objects, so that we can simply call getObject and putObject on those capabilities, versus marshaling everything in a very dynamically typed way. And yet, notice we can still set the underlying properties that we care about, like the memory size. We have not lost any power with this gain in expressiveness.

Advanced High-Level Abstractions

This next example demonstrates how higher-level opinionated abstractions may be built out of these fundamental building blocks. Note that there is nothing "special" per-se about these higher level abstractions. These capabilities fall out as a natural consequence from our choice of using a real programming language and real component model.

Let us now look at a cloud-neutral serverless component that creates thumbnails from images uploaded to a bucket:

import * as lumi from "@lumi/lumi";

export class Thumbnailer {
    // onNewThumbnail is an event that is raised for each new thumbnail.
    public readonly onNewThumbnail: lumi.x.Event;

    private readonly src: lumi.x.Bucket; // the source to monitor for images.
    private readonly dst: lumi.x.Bucket; // the destination to store thumbnails in.

    constructor(src: lumi.x.Bucket, dst: lumi.x.Bucket) {
        this.src = src;
        this.dst = dst;
        this.onNewThumbnail = new lumi.x.Event();
        this.src.onNewObject(async (event: lumi.x.NewObjectEvent) => {
            // Generate a thumbnail from the object payload.
            let thumb = generateThumbnail(event.getObject().data);
            // Now store the new thumbnail and raise our event.
            await this.dst.putObject(event.key, thumb);
            this.onNewThumbnail.raise({ key: event.key, thumb: thumb });
        });
    }
}

Thumbnailer accepts two lumi.x.Buckets in its constructor, subscribes to the source's onNewObject event, creates new thumbnails in the resulting lambda, and stores them in the other bucket.

It is important to note that the body of this lambda is real JavaScript -- and can use libraries from NPM, perform IO and awaits, etc. -- while the configuration outside of it is the LumiJS subset. Notice how we can mix what would have been classically expressed using a combination of configuration and real programming languages in one consistent and idiomatic programming model. This illustrates why Lumi's multi-language capabilities are important.

Also notice that Thumbnailer exposes its own event, onNewThumbnail, that can be subscribed just as it subscribes to bucket events. This enables an extensible ecosystem of events and handlers. These events and handlers are enlightened when targeting clouds that support first class events and handlers (such as S3 buckets and AWS Lambdas).

The result is a reusable cloud component that can be instantiated any number of times in any number of environments.

In fact, let us now look at code that uses Thumbnailer:

import * as aws from "@lumi/aws";
import {Thumbnailer} from "...";

let images = new aws.s3.Bucket("images");
let thumbnails = new aws.s3.Bucket("thumbnails");
let thumbnailer = new Thumbnailer(images, thumbnails);

This package is an executable because it is meant to be run directly to create a new cloud topology. Many Lumi programs are libraries (like Thumbnailer itself), while blueprints are akin to executables in your favorite language.

The aws.s3.Bucket class is a subclass of lumi.x.Bucket, and so can be passed to Thumbnailer's constructor just fine. We could have passed an azure.blob.Container, google.storage.Bucket, or a custom subclass, instead. Notice how Thumbnailer is itself a cloud-neutral abstraction. Of course, if it had wanted to access specific AWS S3 features, it could have requested a concrete aws.s3.Bucket instead; or it can enlighten itself and use advanced features simply by using a runtime type check; etc. As with ordinary object-oriented languages, the class author decides. In fact, this example is remarkably similar to accepting a concrete "list" versus "enumerable" interface.

Architecture

The primary concepts in Lumi are:

  • Package: A static library or executable containing modules, classes, functions, and variables.
  • Resource: A special kind of class that represents a cloud resource (VM, VPC, subnet, bucket, etc).
  • Environment: A target environment with a name and optional configuration (e.g., production, staging, etc).
  • Stack: An instantiation of a package, paired with an environment, and fully specified with arguments.
  • Resource URN: An ID that is auto-assigned to each resource object, unique within the overall environment.

Analogous to programming languages, a stack is essentially a collection of instantiated resource objects. Many concepts that are "distinct" in other systems, like gateways, controllers, functions, triggers, and so on, are expressed as classes in Lumi. They are essentially "subclasses" -- or specializations -- of the more general concept of a resource object, unifying the creation, configuration, provisioning, discovery, and overall management of them.

There are some different kinds of stacks we will encounter:

  • Plan: A hypothetical stack created for purposes of inspection but that has not been deployed yet.
  • Deployment: A stack that has actually been deployed into an environment.
  • Snapshot: A stack generated by inspecting a live environment, not necessarily deployed using Lumi.

Notice that Lumi has the ability to generate programs and stacks from an existing environment. This can be useful during the initial adoption of Lumi. It can also be useful for ongoing drift analysis, such as ensuring resources in production don't differ from staging, that resources in different regions are equivalent, and/or verifying that changes to an environment haven't been made without corresponding changes being made and checked into the Lumi metadata.

In addition to those core abstractions, there are some supporting ones:

  • Identity: A unit of authentication and authorization, governing access to protected resources.
  • Configuration: A bag of key/value settings used either at build, runtime, or a combination.
  • Secret: A special kind of key/value configuration bag that is encrypted and protected by identity.

There are some other "internal" concepts that most users can safely ignore:

  • LumiLang: A language subset for the Lumi configuration system (e.g., LumiJS, LumiPy, LumiGo, etc).
  • LumiPack: The intermediate package format used as metadata for packages, common to all languages.
  • LumiIL: The intermediate language (IL) used to represent code and data within a LumiPack package.
  • LumiGL: The graph language (GL) describing a stack's resource topology with dependencies as edges.

Because Lumi is a tool for interacting with existing clouds -- including AWS, Azure, Google Cloud, Kubernetes, and Docker Swarm -- one of the toolchain's most important jobs is faithfully mapping abstractions onto "lower level" infrastructure. Much of Lumi's ability to deliver on its promise of better productivity, sharing, and reuse relies on its ability to robustly and intuitively perform these translations. There is an extensible provider model for creating new providers, which amounts to implementing create, read, update, and delete (CRUD) methods per resource type.

Toolchain

The Lumi toolchain analyzes programs and understands them fully. An important thing to realize is that a Lumi program isn't "run" directly in the usual way; instead, the process is as follows:

  • The source program is compiled (by a LumiLang compiler) into a package (a LumiPack containing LumiIL).
  • The package itself may depend on any number of other library packages (themselves just LumiPacks).
  • The executable package (and its LumiIL) is evaluated by the Lumi runtime to produce a graph (in LumiGL).
  • This graph is then used to generate a plan, deployment, or simply output that can be inspected by a tool.

In particular, deployments are performed by the tool diffing the current state of an environment with the proposed new state, to come up with a series of edits to make the new state reality. This results in a series of invocations to the necessary resource providers that perform side-effects that yield the necessary physical changes.

For instance, a standard workflow for creating a new project might look like this.

First, we initialize the project. This isn't special, other than a Lumi.json or .yaml file:

$ mkdir acmecorp-infra && cd acmecorp-infra   # create a project directory
$ ...                                         # create a Lumi.* file, edit code, etc.

Next, we might choose to compile the project without performing a deployment. This is used for inner loop development, and reports any compile-time errors, while producing a package that we can distribute or inspect.

$ lumi compile                                # compile the project

Now we are ready to do a deployment. First, we will initialize a target environment. Let's call it test:

$ lumi env init test                          # initialize a test environment
$ lumi env config test ...                    # configure the target environment

The configuration steps are optional, but may be used to configure the target region, credentials, and so on.

Next, we might choose to do a dry-run of a deployment first (a.k.a., create a "plan"):

$ lumi deploy test -n                         # do a dry-run (plan) of the changes
Planned step #1 [create]
+ aws:ec2/instance:Instance:
      [urn=lumi:test::ec2instance:index::aws:ec2/instance:Instance::instance]
      imageId       : "ami-f6035893"
      instanceType  : "t2.micro"
      name          : "instance"
      resource      : "AWS::EC2::Instance"
      securityGroups: [
          [0]: -> *urn:lumi:test::ec2instance:index::aws:ec2/securityGroup:SecurityGroup::group
      ]
1 planned changes:
	+ 1 resource created

This shows a colorized view of the series of changes that will be carried out. The output is meant to resemble a familiar Git-like diff view. This will include adds (green), deletes (red), and modifications (yellow).

Finally, we would presumably perform the deployment. This is done simply by omitting the -n flag:

$ lumi deploy test                            # actually perform the deployment

The output of an actual deployment will look a lot like the plan output above, except that it contains more incremental information about the status of steps as they are taken. For example:

Applying step #1 [create]
+ aws:ec2/instance:Instance:
		  [urn=test::ec2instance:index::aws:ec2/instance:Instance::instance]
		  imageId       : "ami-f6035893"
		  instanceType  : "t2.micro"
		  name          : "instance"
		  resource      : "AWS::EC2::Instance"
		  securityGroups: [
			  [0]: -> *urn:lumi:test::ec2instance:index::aws:ec2/securityGroup:SecurityGroup::group
		  ]
info: plugin[aws].stdout: Creating new EC2 instance resource
info: plugin[aws].stdout: EC2 instance 'i-0c5192a1d67810e1a' created; now waiting for it to become 'running'
1 total changes:
	+ 1 resource created

Subsequent changes may be made in the expected way:

$ ...                                         # more code edits, etc.
$ lumi deploy test                            # re-deploy (automatically recompiles)

Lumi calculates the minimal set of incremental edits, compared to the previous deployment, so that just the changed parts will be modified in the target environment.

Further Reading

More details are left to the respective design documents. Here are some key ones:

  • Formats: An overview of Lumi's three formats: LumiLangs, LumiPack/LumiIL, and LumiGL.
  • LumiPack/LumiIL: A detailed description of packages and the LumiPack/LumiIL formats.
  • Dependencies: An overview of how package management and dependency management works.
  • Resources: A description of how extensible resource providers are authored and registered.
  • LumiGL: An overview of the LumiGL file format and how Lumi uses graphs for deployments.
  • Stacks: An overview of how stacks are represented using the above fundamentals.
  • Clouds: A description of how Lumi abstractions map to different cloud providers.
  • Runtime: An overview of Lumi's runtime footprint and services common to all clouds.
  • Cross-Cloud: An overview of how Lumi can be used to create cloud-neutral abstractions.
  • Security: An overview of Lumi's security model, including identity and group management.
  • FAQ: Frequently asked questions, including how Lumi differs from its primary competition.