pulumi/docs/architecture/providers.md

18 KiB

(providers)=

Providers

The term "provider" can mean different things in different contexts. When we talk about Pulumi programs, we often talk about provider resources such as that provided by the aws.Provider class in the @pulumi/aws NodeJS/TypeScript package. Or, we might simply mean a cloud provider, such as AWS or GCP. In the context of the wider Pulumi architecture though, a provider (specifically, a resource provider) is a Pulumi plugin that implements a standardized gRPC interface for handling communication with a third-party service (usually a cloud service, such as AWS, GCP, or Azure):

  • Configuration methods are designed to allow a consumer to configure a provider instance in some way. The call is the most common example of this, allowing a caller to e.g. specify the AWS region that a provider should use operate in. is a similar method that operates at a higher level, allowing a caller to influence more deeply how a provider works (see the section on parameterized providers for more).
  • Schema endpoints allow a caller to interrogate the resources and functions that a provider exposes. The method returns a provider's schema, which includes the set of resources and functions that the provider supports, as well as the properties and inputs that each resource and function expects. This is the primary driver for the various code generation processes that Pulumi uses, such as that underpinning SDK generation.
  • Lifecycle methods expose the typical , , , and (CRUD) operations that allow clients to manage provider resources. The , , and methods also fall into this category, as discussed in resource registration.
  • Functions can be invoked on a provider through the call, or on specific resources by using the operation. Functions are typically used to perform operations that don't fit into the CRUD model, such as retrieving a list of availability zones, or available regions, etc.

While any program which implements the interface can be interfaced with by the Pulumi engine, in practice most Pulumi providers are built in a handful of ways:

A provider binary is typically named pulumi-resource-<provider-name>; pulumi-resource-aws is one example.

(default-providers)=

Default providers

A default provider for a package and version is the provider instance that Pulumi will use to manage resources that do not have a provider explicitly specified (either directly as a resource option or indirectly via a parent, for instance). Consider for example the following TypeScript program that creates an S3 bucket in AWS:

import * as aws from "@pulumi/aws"

new aws.s3.Bucket("my-bucket")

The Bucket constructor will yield a such as the following:

RegisterResourceRequest{
  type: "aws:s3/bucket:Bucket",
  name: "my-bucket",
  parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
  custom: true,
  object: {},
  version: "4.16.0",
}

The absence of a provider field in this request will cause the engine to use a default provider for the aws package at version 4.16.0. The engine's implementation ensures that only a single default provider instance exists for each package version, and only creates default provider instances on demand (that is, when a resource that requires one is registered). Default provider instances are created by synthesizing appropriate RegisterResourceEvents with inputs sourced from the stack's configuration values for the relevant provider package. In the example above, the default AWS provider would be configured using any stack configuration values whose keys begin with aws: (e.g. aws:region).

Changing the above example to use an explicit provider will prevent a default provider from being used:

import * as aws from "@pulumi/aws"

const usWest2 = new aws.Provider("us-west-2", { region: "us-west-2" })

new aws.s3.Bucket("my-bucket", {}, { provider: usWest2 })

This will yield a RegisterResourceRequest whose provider field references the explicitly constructed entity:

RegisterResourceRequest{
  type: "aws:s3/bucket:Bucket",
  name: "my-bucket",
  parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
  custom: true,
  object: {},
  provider: "urn:pulumi:dev::project::pulumi:providers:aws::us-west-2::308b79ee-8249-40fb-a203-de190cb8faa8",
  version: "4.16.0",
}

Note that the explicit provider itself is registered as a resource, and its constructor will emit its own RegisterResourceRequest with the appropriate name, type, parent, and so on.

(mlcs)= (component-providers)=

Component providers

Authors of Pulumi programs can use component resources to logically group related resources together. For instance, a TypeScript program might specify a component that combines AWS and PostgreSQL providers to abstract the management of an RDS database and logical databases within it:

import * as aws from "@pulumi/aws"
import * as postgresql from "@pulumi/postgresql"

class Database extends pulumi.ComponentResource {
  constructor(name: string, args: DatabaseArgs, opts?: pulumi.ComponentResourceOptions) {
    super("my:database:Database", name, args, opts)

    const rds = new aws.rds.Instance("my-rds", { ... }, { parent: this })
    const pg = new postgresql.Database("my-db", { ... }, { parent: this })

    ...
  }
}

This component can then be used just like any other Pulumi resource:

const db = new Database("my-db", { ... })

...if the program is written in the same language as the component (in this case, TypeScript). In some cases however it would be great if components could be reused in multiple languages, since components provide a natural means to abstract and reuse infrastructure.

Enter component providers (also known as multi-language components, or MLCs). Component providers allow components to be written in one language and used in another (or rather, any other). Typically we refer to such components as remote, in contrast with local components written directly in and alongside the user's program as above.

Under the hood, component providers expose remote components by implementing the method. The engine automatically calls Construct when it sees a request to create a remote component.1 Indeed, since providers and gRPC calls are the key to making custom resources consumable in any language, exposing components through the same interface is a natural extension of the Pulumi model.

Just as the body of a component resource is largely concerned with instantiating other resources, so is the implementation of Construct for a component provider. Whereas a custom resource's method can be expected to make a "raw" call to some underlying cloud provider API (for instance), is generally only concerned with registering child resources and their desired state. For this reason, includes a monitorEndpoint so that the component provider can itself make calls back to the deployment's resource monitor to register these child resources. Child resources registered by Construct consequently end up in the calling program's state just like any other resource, and proceed through step generation, etc. in exactly the same way. That is to say, once Construct has been called, the engine does not really care whether or not a resource registration came from the program or a remote component.

:::{note} "Ordinary" resource providers and component providers are not mutually exclusive -- it is perfectly sensible for a provider to implement both the and //... methods. :::

(dynamic-providers)=

Dynamic providers

Dynamic providers are a Pulumi feature that allows the core logic of a provider to be defined and managed within the context of a Pulumi program. This is in contrast to a normal ("real", sometimes "side-by-side") provider, whose logic is encapsulated as a separate plugin for use in any program. Dynamic providers are presently only supported in NodeJS/TypeScript and Python. They work as follows:

  • The SDK defines two types:

    • That of dynamic providers -- objects with methods for the lifecycle methods that a gRPC provider would normally offer (CRUD, diff, etc.).
    • That of dynamic resources -- those that are managed by a dynamic provider. This type specialises (e.g. by subclassing in NodeJS and Python) the SDK's core resource type so that all dynamic resources have the same Pulumi package -- pulumi-nodejs for NodeJS and pulumi-python for Python.

    These are located in gh-file:pulumi#sdk/nodejs/dynamic/index.ts in NodeJS/TypeScript and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/dynamic.py in Python.

  • The SDK also defines a "real" provider that implements the gRPC interface and manages the lifecycle of dynamic resources. This provider is named according to the single package name used for all dynamic resources. See gh-file:pulumi#sdk/nodejs/cmd/dynamic-provider/index.ts for NodeJS and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/__main__.py for Python.

  • A user extends the types defined by the SDK in order to implement one or more dynamic providers and resources that belong to those providers. They use these resources in their program like any other.

  • When a dynamic resource class is instantiated, it captures the provider instance that manages it and serializes this provider instance as part of the resource's properties.

  • The serialized provider state is then stored as a property on the dynamic resource. It is consequently sent to the engine as part of lifecycle calls (check, diff, create, etc.) like any other property.

  • When the engine receives requests pertaining to dynamic resources, the fixed package (pulumi-nodejs or pulumi-python) will cause it to make provider calls against the "real" provider defined in the SDK.

  • The provider proxies these calls to the code the user wrote by deserializing and hydrating the provider instance from the resource's properties and invoking the appropriate code.

These implementation choices impose a number of limitations:

  • Serialized/pickled code is brittle and simply doesn't work in all cases. Some features are supported and some aren't, depending on the language and surrounding context. Dependency management (both within the user's program and as it relates to third-party packages such as those from NPM or PyPi) is challenging.
  • Even when code works once, or in one context, it might not work later on. If e.g. absolute paths specific to one machine form part of the provider's code (or the code of its dependencies), the fact that these are serialized into the Pulumi state means that on later hydration, a program that worked before might not work again.
  • Related to the problem of state serialization is the fact that dynamic provider state is only updated when the program runs. It is therefore not possible in general to e.g. change the code of a dynamic provider and expect an operation like destroy (which does not run the program) to pick up the changes.

(parameterized-providers)=

Parameterized providers

Parameterized providers are a feature of Pulumi that allows a caller to change a provider's behaviour at runtime in response to a call. Where a call allows a caller to influence provider behaviour at a high level (e.g. by specifying the region in which an AWS provider should operate), a call may change the set of resources and functions that a provider offers (that is, its schema). A couple of examples of where this is useful are:

  • Dynamically bridging Terraform providers. The pulumi-terraform-bridge can be used to build a Pulumi provider that wraps a Terraform provider. This is an "offline" or "static" process -- provider authors write a Go program that imports the bridge library and uses it to wrap a specific Terraform provider. The resulting provider can then be published as a Pulumi plugin and its method used to generate language-specific SDKs which are also published. Generally, the Go program that authors write is the same (at least in structure) for many if not all providers. pulumi-terraform-provider is a parameterized provider that exploits this to implement a provider that can bridge an arbitrary Terraform provider at runtime. pulumi-terraform-provider accepts the name of the Terraform provider to bridge and uses the existing pulumi-terraform-bridge machinery to perform the bridging and schema loading in response to the Parameterize call. Subsequent calls to GetSchema and other lifecycle methods will then behave as if the provider had been statically bridged.

  • Managing Kubernetes clusters with custom resource definitions (CRDs). Kubernetes allows users to define their own resource types outside the standard set of APIs (Pod, Service, and so on). By default, the Pulumi Kubernetes provider does not know about these resources, and so cannot expose them in its schema and by extension offer SDK/code completion for interacting with them. Parameterization offers the possibility for the provider to accept a parameter describing a set of CRDs, enabling it to then extend its schema to expose them to programs and SDK generation.

As hinted at by the above examples, encodes a provider-specific parameter that is used to influence the provider's behaviour. The parameter passed in the can take two forms, corresponding to the two contexts in which parameterization typically occurs:

  • When generating an SDK (e.g. using a pulumi package add command), we need to boot up a provider and parameterize it using only information from the command-line invocation. In this case, the parameter is a string array representing the command-line arguments (args).
  • When interacting with a provider as part of program execution, the parameter is embedded in the SDK, so as to free the program author from having to know whether a provider is parameterized or not. In this case, the parameter is a provider-specific bytestring (value). This is intended to allow a provider to store arbitrary data that may be more efficient or practical at program execution time, after SDK generation has taken place. This value is base-64-encoded when embedded in the SDK.

:::{warning} In the absence of parameterized providers, it is generally safe to assume that a resource's package name matches exactly the name of the provider plugin that provides that package. For example, an aws:s3:Bucket resource could be expected to be managed by the aws provider plugin, which in turn would live in a binary named pulumi-resource-aws. In the presence of parameterized providers, this is not necessarily the case. Dynamic Terraform providers are a great example of this -- if a user were to dynamically bridge an AWS Terraform provider, the same aws:s3:Bucket resource might be provided by the terraform provider plugin (with a parameter of aws:<version> or similar, for example). :::

(replacement-extension-providers)=

Replacement and extension parameterization


  1. See resource registration for more information. ↩︎