(providers)=

Providers

The term "provider" can mean different things in different contexts. When we talk about Pulumi programs, we often mean provider resources, such as those created by the aws.Provider class in the @pulumi/aws NodeJS/TypeScript package. Or, we might simply mean a cloud provider, such as AWS or GCP. In the context of the wider Pulumi architecture though, a provider (specifically, a resource provider) is a Pulumi plugin that implements a standardized gRPC interface for handling communication with a third-party service (usually a cloud service, such as AWS, GCP, or Azure). The methods of this interface fall into the following groups:

  • Configuration methods are designed to allow a consumer to configure a provider instance in some way. The Configure call is the most common example of this, allowing a caller to e.g. specify the AWS region that a provider should operate in. Parameterize is a similar method that operates at a higher level, allowing a caller to influence more deeply how a provider works (see the section on parameterized providers for more).
  • Schema endpoints allow a caller to interrogate the resources and functions that a provider exposes. The GetSchema method returns a provider's schema, which includes the set of resources and functions that the provider supports, as well as the properties and inputs that each resource and function expects. This is the primary driver for the various code generation processes that Pulumi uses, such as that underpinning SDK generation.
  • Lifecycle methods expose the typical Create, Read, Update, and Delete (CRUD) operations that allow clients to manage provider resources. Related methods such as Check and Diff also fall into this category, as discussed in resource registration.
  • Functions can be invoked on a provider through the Invoke call, or on specific resources by using the Call operation. Functions are typically used to perform operations that don't fit into the CRUD model, such as retrieving a list of availability zones or available regions.
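
The four groups above can be pictured as a single interface. The following is a minimal in-memory sketch in TypeScript, not Pulumi's actual gRPC definitions: the method names echo calls named in this document, but every type, signature, and the `toy:` tokens are simplified assumptions made for illustration.

```typescript
// Toy model of the method groups a resource provider exposes.
// All types and signatures are simplified assumptions, not the real gRPC API.
type Properties = Record<string, unknown>;

interface ToySchema {
  resources: string[];
  functions: string[];
}

interface ToyProvider {
  // Configuration
  configure(config: Properties): void;
  // Schema
  getSchema(): ToySchema;
  // Lifecycle
  create(type: string, inputs: Properties): { id: string; outputs: Properties };
  read(id: string): Properties | undefined;
  update(id: string, inputs: Properties): Properties;
  delete(id: string): void;
  // Functions
  invoke(token: string, args: Properties): Properties;
}

class InMemoryProvider implements ToyProvider {
  private region = "unset";
  private store = new Map<string, Properties>();
  private nextId = 1;

  configure(config: Properties): void {
    if (typeof config.region === "string") {
      this.region = config.region;
    }
  }

  getSchema(): ToySchema {
    return { resources: ["toy:index:Bucket"], functions: ["toy:index:getRegion"] };
  }

  create(type: string, inputs: Properties): { id: string; outputs: Properties } {
    const id = `${type}-${this.nextId++}`;
    // Outputs are the inputs plus the configured region.
    const outputs: Properties = { ...inputs, region: this.region };
    this.store.set(id, outputs);
    return { id, outputs };
  }

  read(id: string): Properties | undefined {
    return this.store.get(id);
  }

  update(id: string, inputs: Properties): Properties {
    const outputs: Properties = { ...this.store.get(id), ...inputs };
    this.store.set(id, outputs);
    return outputs;
  }

  delete(id: string): void {
    this.store.delete(id);
  }

  invoke(token: string, _args: Properties): Properties {
    if (token === "toy:index:getRegion") {
      return { region: this.region };
    }
    throw new Error(`unknown function ${token}`);
  }
}
```

A caller would configure the provider once and then drive resources through the lifecycle methods; a real provider receives the same kinds of requests over gRPC rather than as direct method calls.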

While any program that implements this gRPC interface can be driven by the Pulumi engine, in practice most Pulumi providers are built in a handful of ways.

A provider binary is typically named pulumi-resource-<provider-name>; pulumi-resource-aws is one example.

(default-providers)=

Default providers

A default provider for a package and version is the provider instance that Pulumi will use to manage resources that do not have a provider explicitly specified (either directly as a resource option or indirectly via a parent, for instance). Consider for example the following TypeScript program that creates an S3 bucket in AWS:

import * as aws from "@pulumi/aws"

new aws.s3.Bucket("my-bucket")

The Bucket constructor will yield a RegisterResourceRequest such as the following:

RegisterResourceRequest{
  type: "aws:s3/bucket:Bucket",
  name: "my-bucket",
  parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
  custom: true,
  object: {},
  version: "4.16.0",
}

The absence of a provider field in this request will cause the engine to use a default provider for the aws package at version 4.16.0. The engine's implementation ensures that only a single default provider instance exists for each package version, and only creates default provider instances on demand (that is, when a resource that requires one is registered). Default provider instances are created by synthesizing appropriate RegisterResourceEvents with inputs sourced from the stack's configuration values for the relevant provider package. In the example above, the default AWS provider would be configured using any stack configuration values whose keys begin with aws: (e.g. aws:region).
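
The on-demand, one-per-version behaviour described above can be sketched as a small registry. This is a toy model only: `DefaultProviderRegistry` and its shapes are hypothetical names introduced for illustration and do not reflect the engine's actual implementation.

```typescript
// Toy model of on-demand default providers, keyed by package and version.
// Names and shapes here are illustrative assumptions, not engine internals.
type StackConfig = Record<string, string>;

interface DefaultProvider {
  pkg: string;
  version: string;
  inputs: Record<string, string>;
}

class DefaultProviderRegistry {
  private providers = new Map<string, DefaultProvider>();
  private config: StackConfig;

  constructor(config: StackConfig) {
    this.config = config;
  }

  // Returns the single default provider for pkg@version, creating it on first use.
  get(pkg: string, version: string): DefaultProvider {
    const key = `${pkg}@${version}`;
    let provider = this.providers.get(key);
    if (provider === undefined) {
      provider = { pkg, version, inputs: this.inputsFor(pkg) };
      this.providers.set(key, provider);
    }
    return provider;
  }

  // Number of default providers created so far.
  get size(): number {
    return this.providers.size;
  }

  // Collects stack configuration values whose keys begin with "<pkg>:".
  private inputsFor(pkg: string): Record<string, string> {
    const prefix = `${pkg}:`;
    const inputs: Record<string, string> = {};
    for (const key of Object.keys(this.config)) {
      if (key.startsWith(prefix)) {
        inputs[key.slice(prefix.length)] = this.config[key];
      }
    }
    return inputs;
  }
}

const registry = new DefaultProviderRegistry({
  "aws:region": "us-west-2",
  "gcp:project": "example",
});
const first = registry.get("aws", "4.16.0");
const second = registry.get("aws", "4.16.0");
console.log(first === second, first.inputs);
```

Requesting the same package and version twice returns the same instance, while a different version of the same package would get its own default provider.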

Changing the above example to use an explicit provider will prevent a default provider from being used:

import * as aws from "@pulumi/aws"

const usWest2 = new aws.Provider("us-west-2", { region: "us-west-2" })

new aws.s3.Bucket("my-bucket", {}, { provider: usWest2 })

This will yield a RegisterResourceRequest whose provider field references the explicitly constructed entity:

RegisterResourceRequest{
  type: "aws:s3/bucket:Bucket",
  name: "my-bucket",
  parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
  custom: true,
  object: {},
  provider: "urn:pulumi:dev::project::pulumi:providers:aws::us-west-2::308b79ee-8249-40fb-a203-de190cb8faa8",
  version: "4.16.0",
}

Note that the explicit provider itself is registered as a resource, and its constructor will emit its own RegisterResourceRequest with the appropriate name, type, parent, and so on.

(dynamic-providers)=

Dynamic providers

Dynamic providers are a Pulumi feature that allows the core logic of a provider to be defined and managed within the context of a Pulumi program. This is in contrast to a normal ("real", sometimes "side-by-side") provider, whose logic is encapsulated as a separate plugin for use in any program. Dynamic providers are presently only supported in NodeJS/TypeScript and Python. They work as follows:

  • The SDK defines two types:

    • That of dynamic providers -- objects with methods for the lifecycle methods that a gRPC provider would normally offer (CRUD, diff, etc.).
    • That of dynamic resources -- those that are managed by a dynamic provider. This type specialises (e.g. by subclassing in NodeJS and Python) the SDK's core resource type so that all dynamic resources have the same Pulumi package -- pulumi-nodejs for NodeJS and pulumi-python for Python.

    These are located in gh-file:pulumi#sdk/nodejs/dynamic/index.ts in NodeJS/TypeScript and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/dynamic.py in Python.

  • The SDK also defines a "real" provider that implements the gRPC interface and manages the lifecycle of dynamic resources. This provider is named according to the single package name used for all dynamic resources. See gh-file:pulumi#sdk/nodejs/cmd/dynamic-provider/index.ts for NodeJS and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/__main__.py for Python.

  • A user extends the types defined by the SDK in order to implement one or more dynamic providers and resources that belong to those providers. They use these resources in their program like any other.

  • When a dynamic resource class is instantiated, it captures the provider instance that manages it and serializes this provider instance as part of the resource's properties.

  • The serialized provider state is then stored as a property on the dynamic resource. It is consequently sent to the engine as part of lifecycle calls (check, diff, create, etc.) like any other property.

  • When the engine receives requests pertaining to dynamic resources, the fixed package (pulumi-nodejs or pulumi-python) will cause it to make provider calls against the "real" provider defined in the SDK.

  • The provider proxies these calls to the code the user wrote by deserializing and hydrating the provider instance from the resource's properties and invoking the appropriate code.
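
The serialize-then-hydrate flow in the steps above can be imitated in a few lines. The sketch below is a deliberately naive stand-in: it uses `Function.prototype.toString` and `new Function` in place of the SDKs' real closure serialization, and the `__provider` key, `registerDynamicResource`, and `proxyCreate` are illustrative names rather than the SDK's API.

```typescript
// Toy model of how a dynamic resource carries its provider in its properties.
// Real SDKs serialize closures far more carefully; toString and `new Function`
// are used here only as stand-ins (an assumption made for illustration).
type Props = Record<string, unknown>;
type CreateResult = { id: string; outs: Props };
type CreateFn = (inputs: Props) => CreateResult;

// Program side: serialize the user's provider logic into a string property.
function registerDynamicResource(create: CreateFn, inputs: Props): Props {
  return { ...inputs, __provider: create.toString() };
}

// "Real" provider side: hydrate the serialized provider and proxy the call.
function proxyCreate(properties: Props): CreateResult {
  const { __provider, ...inputs } = properties;
  if (typeof __provider !== "string") {
    throw new Error("missing serialized provider");
  }
  const create = new Function(`return (${__provider});`)() as CreateFn;
  return create(inputs);
}

const request = registerDynamicResource(
  (inputs) => ({
    id: `greeting-${String(inputs.name)}`,
    outs: { message: `hello ${String(inputs.name)}` },
  }),
  { name: "world" },
);
const result = proxyCreate(request);
console.log(result.id, result.outs.message);
```

In this toy, the hydrated function is rebuilt purely from text, so anything it referenced from the surrounding scope would be missing at hydration time. That is a rough analogue of why serialized provider code is fragile.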

These implementation choices impose a number of limitations:

  • Serialized/pickled code is brittle and simply doesn't work in all cases. Some features are supported and some aren't, depending on the language and surrounding context. Dependency management (both within the user's program and as it relates to third-party packages such as those from npm or PyPI) is challenging.
  • Even when code works once, or in one context, it might not work later on. If e.g. absolute paths specific to one machine form part of the provider's code (or the code of its dependencies), the fact that these are serialized into the Pulumi state means that on later hydration, a program that worked before might not work again.
  • Related to the problem of state serialization is the fact that dynamic provider state is only updated when the program runs. It is therefore not possible in general to e.g. change the code of a dynamic provider and expect an operation like destroy (which does not run the program) to pick up the changes.

(parameterized-providers)=

Parameterized providers

Parameterized providers are a feature of Pulumi that allows a caller to change a provider's behaviour at runtime in response to a Parameterize call. Where a Configure call allows a caller to influence provider behaviour at a high level (e.g. by specifying the region in which an AWS provider should operate), a Parameterize call may change the set of resources and functions that a provider offers (that is, its schema). Two examples of where this is useful are:

  • Dynamically bridging Terraform providers. The pulumi-terraform-bridge can be used to build a Pulumi provider that wraps a Terraform provider. This is an "offline" or "static" process -- provider authors write a Go program that imports the bridge library and uses it to wrap a specific Terraform provider. The resulting provider can then be published as a Pulumi plugin and its GetSchema method used to generate language-specific SDKs, which are also published. Generally, the Go program that authors write is the same (at least in structure) for many if not all providers. pulumi-terraform-provider is a parameterized provider that exploits this to implement a provider that can bridge an arbitrary Terraform provider at runtime. pulumi-terraform-provider accepts the name of the Terraform provider to bridge and uses the existing pulumi-terraform-bridge machinery to perform the bridging and schema loading in response to the Parameterize call. Subsequent calls to GetSchema and other lifecycle methods will then behave as if the provider had been statically bridged.

  • Managing Kubernetes clusters with custom resource definitions (CRDs). Kubernetes allows users to define their own resource types outside the standard set of APIs (Pod, Service, and so on). By default, the Pulumi Kubernetes provider does not know about these resources, and so cannot expose them in its schema and by extension offer SDK/code completion for interacting with them. Parameterization offers the possibility for the provider to accept a parameter describing a set of CRDs, enabling it to then extend its schema to expose them to programs and SDK generation.
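
To make the idea of a schema that changes at runtime concrete, here is a minimal sketch loosely modelled on the CRD example. `ToyKubernetesProvider`, its method shapes, and the `example.com/v1` group are hypothetical; the real Kubernetes provider's parameter format is not described here.

```typescript
// Toy model of a parameterized provider whose schema grows after Parameterize.
// The class, method shapes, and CRD naming are hypothetical.
interface Schema {
  name: string;
  resources: string[];
}

class ToyKubernetesProvider {
  private schema: Schema = {
    name: "kubernetes",
    resources: ["kubernetes:core/v1:Pod", "kubernetes:core/v1:Service"],
  };

  // Extends the schema with one resource per custom resource kind.
  parameterize(crdKinds: string[]): void {
    const extra = crdKinds.map((kind) => `kubernetes:example.com/v1:${kind}`);
    this.schema = {
      ...this.schema,
      resources: [...this.schema.resources, ...extra],
    };
  }

  // Returns the schema as it stands after any parameterization.
  getSchema(): Schema {
    return this.schema;
  }
}

const k8s = new ToyKubernetesProvider();
const before = k8s.getSchema().resources.length;
k8s.parameterize(["CronTab"]);
const after = k8s.getSchema().resources.length;
console.log(before, after);
```

The point of the sketch is the ordering: a schema request made after parameterization sees the extended set of resources, which is what lets SDK generation pick them up.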

As hinted at by the above examples, a Parameterize request encodes a provider-specific parameter that is used to influence the provider's behaviour. The parameter passed in the request can take two forms, corresponding to the two contexts in which parameterization typically occurs:

  • When generating an SDK (e.g. using a pulumi package add command), we need to boot up a provider and parameterize it using only information from the command-line invocation. In this case, the parameter is a string array representing the command-line arguments (args).
  • When interacting with a provider as part of program execution, the parameter is embedded in the SDK, so as to free the program author from having to know whether a provider is parameterized or not. In this case, the parameter is a provider-specific bytestring (value). This is intended to allow a provider to store arbitrary data that may be more efficient or practical at program execution time, after SDK generation has taken place. This value is base64-encoded when embedded in the SDK.
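
The two forms can be modelled as a tagged union that a provider resolves to the same internal parameter. The sketch below is an assumption-laden illustration: the `Parameter` and `Bridged` types, the JSON payload layout, and the helper names are all hypothetical, and a real provider is free to choose any byte encoding for its value.

```typescript
// Toy model of the two parameter forms: command-line args at SDK generation
// time, and an opaque byte value embedded in the SDK for program execution.
// The JSON payload layout is an assumption; real providers choose their own.
type Parameter =
  | { kind: "args"; args: string[] }
  | { kind: "value"; value: Uint8Array };

interface Bridged {
  provider: string;
  version: string;
}

// Encodes the resolved parameter as bytes (ASCII-only JSON in this sketch).
function encode(b: Bridged): Uint8Array {
  const json = JSON.stringify(b);
  const bytes = new Uint8Array(json.length);
  for (let i = 0; i < json.length; i++) {
    bytes[i] = json.charCodeAt(i);
  }
  return bytes;
}

// Decodes bytes produced by encode back into the resolved parameter.
function decode(bytes: Uint8Array): Bridged {
  let json = "";
  for (let i = 0; i < bytes.length; i++) {
    json += String.fromCharCode(bytes[i]);
  }
  return JSON.parse(json) as Bridged;
}

// Resolves either form to the same internal representation.
function parameterize(p: Parameter): Bridged {
  switch (p.kind) {
    case "args": {
      const [provider, version] = p.args;
      if (provider === undefined || version === undefined) {
        throw new Error("expected arguments: <provider> <version>");
      }
      return { provider, version };
    }
    case "value":
      return decode(p.value);
  }
}

const fromArgs = parameterize({ kind: "args", args: ["aws", "5.0.0"] });
const fromValue = parameterize({ kind: "value", value: encode(fromArgs) });
console.log(fromArgs, fromValue);
```

Here the value form is simply the encoded result of the args form, which mirrors the idea that the value is produced at SDK generation time and replayed at program execution time.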

:::{warning}
In the absence of parameterized providers, it is generally safe to assume that a resource's package name matches exactly the name of the provider plugin that provides that package. For example, an aws:s3:Bucket resource could be expected to be managed by the aws provider plugin, which in turn would live in a binary named pulumi-resource-aws. In the presence of parameterized providers, this is not necessarily the case. Dynamic Terraform providers are a great example of this -- if a user were to dynamically bridge an AWS Terraform provider, the same aws:s3:Bucket resource might be provided by the terraform provider plugin (with a parameter of aws:<version> or similar, for example).
:::
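
The warning above can be illustrated with a small lookup from a resource type token to a plugin binary name. This is a sketch only: the `parameterizedBy` table and both helper functions are hypothetical, and the engine's actual plugin resolution is more involved.

```typescript
// Toy lookup from a resource type token to the plugin binary that serves it.
// The parameterizedBy table is a hypothetical stand-in for whatever records
// which base plugin provides a parameterized package.
function packageOf(typeToken: string): string {
  // The package is the first colon-separated segment, e.g. "aws" in "aws:s3:Bucket".
  return typeToken.split(":")[0];
}

function pluginBinary(
  typeToken: string,
  parameterizedBy: Map<string, string>,
): string {
  const pkg = packageOf(typeToken);
  // Without parameterization, the plugin name matches the package name.
  const plugin = parameterizedBy.get(pkg) ?? pkg;
  return `pulumi-resource-${plugin}`;
}

const plain = pluginBinary("aws:s3:Bucket", new Map());
const bridged = pluginBinary("aws:s3:Bucket", new Map([["aws", "terraform"]]));
console.log(plain, bridged);
```

The same type token resolves to different binaries depending on whether the package is served by a parameterized plugin, which is why the package name alone is not a reliable guide.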

(replacement-extension-providers)=

Replacement and extension parameterization