14 KiB
(providers)=
Providers
The term "provider" can mean different things in different contexts. When we
talk about Pulumi programs, we often talk about provider resources such as
that provided by the aws.Provider
class in the @pulumi/aws
NodeJS/TypeScript
package. Or, we might simply mean a cloud provider, such as AWS or GCP. In the
context of the wider Pulumi architecture though, a provider (specifically, a
resource provider) is a Pulumi plugin that implements a
standardized gRPC interface for handling
communication with a third-party service (usually a cloud service, such as AWS,
GCP, or Azure):
- Configuration methods are designed to allow a consumer to configure a provider instance in some way. The call is the most common example of this, allowing a caller to e.g. specify the AWS region that a provider should use operate in. is a similar method that operates at a higher level, allowing a caller to influence more deeply how a provider works (see the section on parameterized providers for more).
- Schema endpoints allow a caller to interrogate the resources and functions that a provider exposes. The method returns a provider's schema, which includes the set of resources and functions that the provider supports, as well as the properties and inputs that each resource and function expects. This is the primary driver for the various code generation processes that Pulumi uses, such as that underpinning SDK generation.
- Lifecycle methods expose the typical , , , and (CRUD) operations that allow clients to manage provider resources. The , , and methods also fall into this category, as discussed in resource registration.
- Functions can be invoked on a provider through the call, or on specific resources by using the operation. Functions are typically used to perform operations that don't fit into the CRUD model, such as retrieving a list of availability zones, or available regions, etc.
While any program which implements the interface can be interfaced with by the Pulumi engine, in practice most Pulumi providers are built in a handful of ways:
- Bridged providers wrap a Terraform provider using the Pulumi Terraform bridge. The majority of Pulumi providers are built in this way.
- The
pulumi-go-provider
library provides a high-level API for writing a provider in Go, without having to worry about low-level gRPC details. - Dynamic providers can be written as part of a Pulumi program. They are discussed in more detail later in this document.
- Each language SDK provides a
provider
package that offers low-level primitives for writing a provider in that language (e.g. gh-file:pulumi#sdk/nodejs/provider for NodeJS, gh-file:pulumi#sdk/python/lib/pulumi/provider for Python, and gh-file:pulumi#sdk/go/pulumi/provider for Go). These packages are in varying states of completeness and neatness and are generally only used for building multi-language components (MLCs).
A provider binary is typically named pulumi-resource-<provider-name>
;
pulumi-resource-aws
is one example.
(default-providers)=
Default providers
A default provider for a package and version is the provider instance that Pulumi will use to manage resources that do not have a provider explicitly specified (either directly as a resource option or indirectly via a parent, for instance). Consider for example the following TypeScript program that creates an S3 bucket in AWS:
import * as aws from "@pulumi/aws"
new aws.s3.Bucket("my-bucket")
The Bucket
constructor will yield a such
as the following:
RegisterResourceRequest{
type: "aws:s3/bucket:Bucket",
name: "my-bucket",
parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
custom: true,
object: {},
version: "4.16.0",
}
The absence of a provider
field in this request will cause the engine to use a
default provider for the aws
package at version 4.16.0. The engine's
implementation ensures that only a single default
provider instance exists for each package version, and only creates default
provider instances on demand (that is, when a resource that requires one is
registered). Default provider instances are created by synthesizing appropriate
RegisterResourceEvent
s with inputs sourced from the stack's configuration
values for the relevant provider package. In the example above, the default AWS
provider would be configured using any stack configuration values whose keys
begin with aws:
(e.g. aws:region
).
Changing the above example to use an explicit provider will prevent a default provider from being used:
import * as aws from "@pulumi/aws"
const usWest2 = new aws.Provider("us-west-2", { region: "us-west-2" })
new aws.s3.Bucket("my-bucket", {}, { provider: usWest2 })
This will yield a RegisterResourceRequest
whose provider
field references
the explicitly constructed entity:
RegisterResourceRequest{
type: "aws:s3/bucket:Bucket",
name: "my-bucket",
parent: "urn:pulumi:dev::project::pulumi:pulumi:Stack::project",
custom: true,
object: {},
provider: "urn:pulumi:dev::project::pulumi:providers:aws::us-west-2::308b79ee-8249-40fb-a203-de190cb8faa8",
version: "4.16.0",
}
Note that the explicit provider itself is registered as a resource, and its
constructor will emit its own RegisterResourceRequest
with the appropriate
name, type, parent, and so on.
(dynamic-providers)=
Dynamic providers
Dynamic providers are a Pulumi feature that allows the core logic of a provider to be defined and managed within the context of a Pulumi program. This is in contrast to a normal ("real", sometimes "side-by-side") provider, whose logic is encapsulated as a separate plugin for use in any program. Dynamic providers are presently only supported in NodeJS/TypeScript and Python. They work as follows:
-
The SDK defines two types:
- That of dynamic providers -- objects with methods for the lifecycle methods that a gRPC provider would normally offer (CRUD, diff, etc.).
- That of dynamic resources -- those that are managed by a dynamic provider.
This type specialises (e.g. by subclassing in NodeJS and Python) the SDK's
core resource type so that all dynamic resources have the same Pulumi
package --
pulumi-nodejs
for NodeJS andpulumi-python
for Python.
These are located in gh-file:pulumi#sdk/nodejs/dynamic/index.ts in NodeJS/TypeScript and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/dynamic.py in Python.
-
The SDK also defines a "real" provider that implements the gRPC interface and manages the lifecycle of dynamic resources. This provider is named according to the single package name used for all dynamic resources. See gh-file:pulumi#sdk/nodejs/cmd/dynamic-provider/index.ts for NodeJS and gh-file:pulumi#sdk/python/lib/pulumi/dynamic/__main__.py for Python.
-
A user extends the types defined by the SDK in order to implement one or more dynamic providers and resources that belong to those providers. They use these resources in their program like any other.
-
When a dynamic resource class is instantiated, it captures the provider instance that manages it and serializes this provider instance as part of the resource's properties.
- In NodeJS, serialization is performed by capturing and mangling the source code of the provider and any dependencies by (ab)using v8 primitives -- see gh-file:pulumi#sdk/nodejs/runtime/closure for the gory details.
- In Python, serialization is performed by pickling the dynamic provider
instance -- see gh-file:pulumi#sdk/python/lib/pulumi/dynamic/dynamic.py's
use of
dill
for more on this.
-
The serialized provider state is then stored as a property on the dynamic resource. It is consequently sent to the engine as part of lifecycle calls (check, diff, create, etc.) like any other property.
-
When the engine receives requests pertaining to dynamic resources, the fixed package (
pulumi-nodejs
orpulumi-python
) will cause it to make provider calls against the "real" provider defined in the SDK. -
The provider proxies these calls to the code the user wrote by deserializing and hydrating the provider instance from the resource's properties and invoking the appropriate code.
These implementation choices impose a number of limitations:
- Serialized/pickled code is brittle and simply doesn't work in all cases. Some features are supported and some aren't, depending on the language and surrounding context. Dependency management (both within the user's program and as it relates to third-party packages such as those from NPM or PyPi) is challenging.
- Even when code works once, or in one context, it might not work later on. If e.g. absolute paths specific to one machine form part of the provider's code (or the code of its dependencies), the fact that these are serialized into the Pulumi state means that on later hydration, a program that worked before might not work again.
- Related to the problem of state serialization is the fact that dynamic
provider state is only updated when the program runs. It is therefore not
possible in general to e.g. change the code of a dynamic provider and expect
an operation like
destroy
(which does not run the program) to pick up the changes.
(parameterized-providers)=
Parameterized providers
Parameterized providers are a feature of Pulumi that allows a caller to change a provider's behaviour at runtime in response to a call. Where a call allows a caller to influence provider behaviour at a high level (e.g. by specifying the region in which an AWS provider should operate), a call may change the set of resources and functions that a provider offers (that is, its schema). A couple of examples of where this is useful are:
-
Dynamically bridging Terraform providers. The
pulumi-terraform-bridge
can be used to build a Pulumi provider that wraps a Terraform provider. This is an "offline" or "static" process -- provider authors write a Go program that imports the bridge library and uses it to wrap a specific Terraform provider. The resulting provider can then be published as a Pulumi plugin and its method used to generate language-specific SDKs which are also published. Generally, the Go program that authors write is the same (at least in structure) for many if not all providers.pulumi-terraform-provider
is a parameterized provider that exploits this to implement a provider that can bridge an arbitrary Terraform provider at runtime.pulumi-terraform-provider
accepts the name of the Terraform provider to bridge and uses the existingpulumi-terraform-bridge
machinery to perform the bridging and schema loading in response to theParameterize
call. Subsequent calls toGetSchema
and other lifecycle methods will then behave as if the provider had been statically bridged. -
Managing Kubernetes clusters with custom resource definitions (CRDs). Kubernetes allows users to define their own resource types outside the standard set of APIs (
Pod
,Service
, and so on). By default, the Pulumi Kubernetes provider does not know about these resources, and so cannot expose them in its schema and by extension offer SDK/code completion for interacting with them. Parameterization offers the possibility for the provider to accept a parameter describing a set of CRDs, enabling it to then extend its schema to expose them to programs and SDK generation.
As hinted at by the above examples, encodes a provider-specific parameter that is used to influence the provider's behaviour. The parameter passed in the can take two forms, corresponding to the two contexts in which parameterization typically occurs:
- When generating an SDK (e.g. using a
pulumi package add
command), we need to boot up a provider and parameterize it using only information from the command-line invocation. In this case, the parameter is a string array representing the command-line arguments (args
). - When interacting with a provider as part of program execution, the parameter
is embedded in the SDK, so as to free the program author from having to know
whether a provider is parameterized or not. In this case, the parameter is a
provider-specific bytestring (
value
). This is intended to allow a provider to store arbitrary data that may be more efficient or practical at program execution time, after SDK generation has taken place. This value is base-64-encoded when embedded in the SDK.
:::{warning}
In the absence of parameterized providers, it is generally safe to assume that a
resource's package name matches exactly the name of the provider
plugin that provides that package. For example, an aws:s3:Bucket
resource could be expected to be managed by the aws
provider plugin, which in
turn would live in a binary named pulumi-resource-aws
. In the presence of
parameterized providers, this is not necessarily the case. Dynamic Terraform
providers are a great example of this -- if a user were to dynamically bridge an
AWS Terraform provider, the same aws:s3:Bucket
resource might be provided by
the terraform
provider plugin (with a parameter of aws:<version>
or similar,
for example).
:::
(replacement-extension-providers)=