1. 'readID' was never assigned to and was always the default value,
leading the refresh source to believe a resource was deleted
2. The refresh source could hang when a resource is deleted.
Some time ago, we introduced the concept of the initialization error to
Pulumi (i.e., an error where the resource was successfully created but
failed to fully initialize). This was originally implemented in `Create`
and `Update` methods of the resource provider interface; when we
detected an initialization failure, we'd pack the live version of the
object into the error, and return that to the engine.
Omitted from this initial implementation was a similar semantics for
`Read`. There are many implications of this, but one of them is that a
`pulumi refresh` will erase any initialization errors that had
previously been observed, even if the initialization errors still exist
in the resource.
This commit will introduce the initialization error semantics to `Read`,
fixing this issue.
### First-Class Providers
These changes implement support for first-class providers. First-class
providers are provider plugins that are exposed as resources via the
Pulumi programming model so that they may be explicitly and multiply
instantiated. Each instance of a provider resource may be configured
differently, and configuration parameters may be source from the
outputs of other resources.
### Provider Plugin Changes
In order to accommodate the need to verify and diff provider
configuration and configure providers without complete configuration
information, these changes adjust the high-level provider plugin
interface. Two new methods for validating a provider's configuration
and diffing changes to the same have been added (`CheckConfig` and
`DiffConfig`, respectively), and the type of the configuration bag
accepted by `Configure` has been changed to a `PropertyMap`.
These changes have not yet been reflected in the provider plugin gRPC
interface. We will do this in a set of follow-up changes. Until then,
these methods are implemented by adapters:
- `CheckConfig` validates that all configuration parameters are string
or unknown properties. This is necessary because existing plugins
only accept string-typed configuration values.
- `DiffConfig` either returns "never replace" if all configuration
values are known or "must replace" if any configuration value is
unknown. The justification for this behavior is given
[here](https://github.com/pulumi/pulumi/pull/1695/files#diff-a6cd5c7f337665f5bb22e92ca5f07537R106)
- `Configure` converts the config bag to a legacy config map and
configures the provider plugin if all config values are known. If any
config value is unknown, the underlying plugin is not configured and
the provider may only perform `Check`, `Read`, and `Invoke`, all of
which return empty results. We justify this behavior becuase it is
only possible during a preview and provides the best experience we
can manage with the existing gRPC interface.
### Resource Model Changes
Providers are now exposed as resources that participate in a stack's
dependency graph. Like other resources, they are explicitly created,
may have multiple instances, and may have dependencies on other
resources. Providers are referred to using provider references, which
are a combination of the provider's URN and its ID. This design
addresses the need during a preview to refer to providers that have not
yet been physically created and therefore have no ID.
All custom resources that are not themselves providers must specify a
single provider via a provider reference. The named provider will be
used to manage that resource's CRUD operations. If a resource's
provider reference changes, the resource must be replaced. Though its
URN is not present in the resource's dependency list, the provider
should be treated as a dependency of the resource when topologically
sorting the dependency graph.
Finally, `Invoke` operations must now specify a provider to use for the
invocation via a provider reference.
### Engine Changes
First-class providers support requires a few changes to the engine:
- The engine must have some way to map from provider references to
provider plugins. It must be possible to add providers from a stack's
checkpoint to this map and to register new/updated providers during
the execution of a plan in response to CRUD operations on provider
resources.
- In order to support updating existing stacks using existing Pulumi
programs that may not explicitly instantiate providers, the engine
must be able to manage the "default" providers for each package
referenced by a checkpoint or Pulumi program. The configuration for
a "default" provider is taken from the stack's configuration data.
The former need is addressed by adding a provider registry type that is
responsible for managing all of the plugins required by a plan. In
addition to loading plugins froma checkpoint and providing the ability
to map from a provider reference to a provider plugin, this type serves
as the provider plugin for providers themselves (i.e. it is the
"provider provider").
The latter need is solved via two relatively self-contained changes to
plan setup and the eval source.
During plan setup, the old checkpoint is scanned for custom resources
that do not have a provider reference in order to compute the set of
packages that require a default provider. Once this set has been
computed, the required default provider definitions are conjured and
prepended to the checkpoint's resource list. Each resource that
requires a default provider is then updated to refer to the default
provider for its package.
While an eval source is running, each custom resource registration,
resource read, and invoke that does not name a provider is trapped
before being returned by the source iterator. If no default provider
for the appropriate package has been registered, the eval source
synthesizes an appropriate registration, waits for it to complete, and
records the registered provider's reference. This reference is injected
into the original request, which is then processed as usual. If a
default provider was already registered, the recorded reference is
used and no new registration occurs.
### SDK Changes
These changes only expose first-class providers from the Node.JS SDK.
- A new abstract class, `ProviderResource`, can be subclassed and used
to instantiate first-class providers.
- A new field in `ResourceOptions`, `provider`, can be used to supply
a particular provider instance to manage a `CustomResource`'s CRUD
operations.
- A new type, `InvokeOptions`, can be used to specify options that
control the behavior of a call to `pulumi.runtime.invoke`. This type
includes a `provider` field that is analogous to
`ResourceOptions.provider`.
When a resource fails to initialize (i.e., it is successfully created,
but fails to transition to a fully-initialized state), and a user
subsequently runs `pulumi update` without changing that resource, our
CLI will fail to warn the user that this resource is not initialized.
This commit begins the process of allowing our CLI to report this by
storing a list of initialization errors in the checkpoint.
This commit adds CLI support for resource providers to provide partial
state upon failure. For resource providers that model resource
operations across multiple API calls, the Provider RPC interface can now
accomodate saving bags of state for resource operations that failed.
This is a common pattern for Terraform-backed providers that try to do
post-creation steps on resource as part of Create or Update resource
operations.
As of this change, the engine will run all Configure calls in parallel.
This improves startup performance, since otherwise, we would block
waiting for all plugins to be configured before proceeding to run a
program. Emperically, this is about 1.5-2s for AWS programs, and
manifests as a delay between the purple "Previewing update of stack"
being printed, and the corresponding grey "Previewing update" message.
This is done simply by using a Goroutine for Configure, and making sure
to synchronize on all actual CRUD operations. I toyed with using double
checked locking to eliminate lock acquisitions -- something we may want
to consider as we add more fine-grained parallelism -- however, I've
kept it simple to avoid all the otherwise implied memory model woes.
I made the judgment call that GetPluginInfo may proceed before
Configure has settled. (Otherwise, we'd immediately call it and block
after loading the plugin, obviating the parallelism benefits.) I also
made the judgment call to do this in the engine, after flip flopping
several times about possibly making it a provider's own decision.
The RPC provider interface needs a way to convey back to the engine
that a resource being read no longer exists. To do this, we'll return
the ID property that was read back. If it is empty, it means the
resource is gone. If it is non-empty, we expect it to match the input.
This commit changes two things about our resource model:
* Stop performing Pulumi Engine-side diffing of resource state.
Instead, we defer to the resource plugins themselves to determine
whether a change was made and, if so, the extent of it. This
manifests as a simple change to the Diff function; it is done in
a backwards compatible way so that we continue with legacy diffing
for existing resource provider plugins.
* Add a Read RPC method for resource providers. It simply takes a
resource's ID and URN, plus an optional bag of further qualifying
state, and it returns the current property state as read back from
the actual live environment. Note that the optional bag of state
must at least include enough additional properties for resources
wherein the ID is insufficient for the provider to perform a lookup.
It may, however, include the full bag of prior state, for instance
in the case of a refresh operation.
This is part of pulumi/pulumi#1108.
* Improve the error message arising from missing required configs for
resource providers
If the resource provider that we are speaking to is new enough, it will send
across a list of keys and their descriptions alongside an error
indicating that the provider we are configuring is missing required
config. This commit packages up the list of missing keys into an error
that can be presented nicely to the user.
* Code review feedback: renaming simplification and correcting errors in comments
* Send structured errors across RPC boundaries
This brings us closer to gRPC best practices where we send structured
errors with error codes across RPC endpoints. The new "rpcerrors"
package can wrap errors from RPC endpoints, so RPC servers can attach
some additional context as to why a request failed.
* Code review feedback:
1. Rename rpcerrors -> rpcerror, better package name
2. Rename RPCError -> Error, RPCErrorCause -> ErrorCause, names
suggested by gometalinter to improve their package-qualified names
3. Fix import organization in rpcerror.go
* Improve error messages output by the CLI
This fixes a couple known issues with the way that we present errors
from the Pulumi CLI:
1. Any errors from RPC endpoints were bubbling up as they were to
the top-level, which was unfortunate because they contained
RPC-specific noise that we don't want to present to the user. This
commit unwraps errors from resource providers.
2. The "catastrophic error" message often got printed twice
3. Fatal errors are often printed twice, because our CLI top-level
prints out the fatal error that it receives before exiting. A lot of
the time this error has already been printed.
4. Errors were prefixed by PU####.
* Feedback: Omit the 'catastrophic' error message and use a less verbose error message as the final error
* Code review feedback: interpretRPCError -> resourceStateAndError
* Code review feedback: deleting some commented-out code, error capitalization
* Cleanup after rebase
This API was introduced to aid the refactoring, but it isn't something
we want to support long term. Remove it and for a few places, push
passing config.Key around more, instead of converting to the old type
eagerly.
Previously, the checkpoint manifest contained the full path to a plugin
binary, in places of its friendly name. Now that we must move to a model
where we install plugins in the PPC based on the manifest contents, we
actually need to store the name, in addition to the version (which is
already there). We still also capture the path for debugging purposes.
This adds support for two things:
* Installing all plugins that a project requires with a single command:
$ pulumi plugin install
* Listing the plugins that this project requires:
$ pulumi plugin ls --project
$ pulumi plugin ls -p
This brings back the Node.js language plugin's GetRequiredPlugins
function, reimplemented in Go now that the language host has been
rewritten from JavaScript. Fairly rote translation, along with
some random fixes required to get tests passing again.
This change adds a GetRequiredPlugins RPC method to the language
host, enabling us to query it for its list of plugin requirements.
This is language-specific because it requires looking at the set
of dependencies (e.g., package.json files).
It also adds a call up front during any update/preview operation
to compute the set of plugins and require that they are present.
These plugins are populated in the cache and will be used for all
subsequent plugin-related operations during the engine's activity.
We now cache the language plugins, so that we may load them
eagerly too, which we never did previously due to the fact that
we needed to pass the monitor address at load time. This was a
bit bizarre anyhow, since it's really the Run RPC function that
needs this information. So, to enable caching and eager loading
-- which we need in order to invoke GetRequiredPlugins -- the
"phone home" monitor RPC address is passed at Run time.
In a subsequent change, we will switch to faulting in the plugins
that are missing -- rather than erroring -- in addition to
supporting the `pulumi plugin install` CLI command.
This change introduces a workspace.GetPluginPath function that probes
the central workspace cache of plugins for a matching plugin binary that
matches the desired kind, name, and, optionally, version. It also permits
overriding this with $PATH for developer scenarios.
The analyzer, language, and resource plugin logic now uses this function
for deciding which binary path to load at runtime.
My previous change to stop supplying unknown properties to providers
broke `pulumi preview` in the case of unknown inputs. This change
restores the previous behavior for previews only; the new unknown-free
behavior remains for applies.
Fixes#790.
Before these changes, we were inconsistent in our treatment of unknown
property values across the resource provider RPC interface. `Check` and
`Diff` were retaining unknown properties in inputs and outputs;
`Create`, `Update`, and `Delete` were not. This interacted badly with
recent changes to `Check` to return all provider inputs--i.e. not just
defaults--from that method: if an unknown input was provided, it would
be present in the returned inputs, which would eventually confuse the
differ by giving the appearance of changes where none were present.
These changes remove unknowns from the provider interface entirely:
unknown property values are never passed to a provider, and a provider
must never return an unknown property value.
This is the primary piece of the fix for pulumi/pulumi-terraform#93.
This change adds a bit more tracing context to RPC marshaling
logging so that it's easier to attribute certain marshaling calls.
Prior to this, we'd just have a flat list of "marshaled property X"
without any information about what the marshaling pertained to.
This change adds rudimentary delete-before-create support (see
pulumi/pulumi#450). This cannot possibly be complete until we also
implement pulumi/pulumi#624, becuase we may try to delete a resource
while it still has dependent resources (which almost certainly will
fail). But until then, we can use this to manually unwedge ourselves
for leaf-node resources that do not support old and new resources
living side-by-side.
As documented in issue #616, the inputs/defaults/outputs model we have
today has fundamental problems. The crux of the issue is that our
current design requires that defaults present in the old state of a
resource are applied to the new inputs for that resource.
Unfortunately, it is not possible for the engine to decide which
defaults remain applicable and which do not; only the provider has that
knowledge.
These changes take a more tactical approach to resolving this issue than
that originally proposed in #616 that avoids breaking compatibility with
existing checkpoints. Rather than treating the Pulumi inputs as the
provider input properties for a resource, these inputs are first
translated by `Check`. In order to accommodate provider defaults that
were chosen for the old resource but should not change for the new,
`Check` now takes the old provider inputs as well as the new Pulumi
inputs. Rather than the Pulumi inputs and provider defaults, the
provider inputs returned by `Check` are recorded in the checkpoint file.
Put simply, these changes remove defaults as a first-class concept
(except inasmuch as is required to retain the ability to read old
checkpoint files) and move the responsibilty for manging and
merging defaults into the provider that supplies them.
Fixes#616.
This change adds a new manifest section to the checkpoint files.
The existing time moves into it, and we add to it the version of
the Pulumi CLI that created it, along with the names, types, and
versions of all plugins used to generate the file. There is a
magic cookie that we also use during verification.
This is to help keep us sane when debugging problems "in the wild,"
and I'm sure we will add more to it over time (checksum, etc).
For example, after an up, you can now see this in `pulumi stack`:
```
Current stack is demo:
Last updated at 2017-12-01 13:48:49.815740523 -0800 PST
Pulumi version v0.8.3-79-g1ab99ad
Plugin pulumi-provider-aws [resource] version v0.8.3-22-g4363e77
Plugin pulumi-langhost-nodejs [language] version v0.8.3-79-g77bb6b6
Checkpoint file is /Users/joeduffy/dev/code/src/github.com/pulumi/pulumi-aws/.pulumi/stacks/webserver/demo.json
```
This addresses pulumi/pulumi#628.
Note: for the purposes of this discussion, archives will be treated as
assets, as their differences are not particularly meaningful.
Currently, the identity of an asset is derived from the hash and the
location of its contents (i.e. two assets are equal iff their contents
have the same hash and the same path/URI/inline value). This means that
changing the source of an asset will cause the engine to detect a
difference in the asset even if the source's contents are identical. At
best, this leads to inefficiencies such as unnecessary updates. This
commit changes asset identity so that it is derived solely from an
asset's hash. The source of an asset's contents is no longer part of
the asset's identity, and need only be provided if the contents
themselves may need to be available (e.g. if a hash does not yet exist
for the asset or if the asset's contents might be needed for an update).
This commit also changes the way old assets are exposed to providers.
Currently, an old asset is exposed as both its hash and its contents.
This allows providers to take a dependency on the contents of an old
asset being available, even though this is not an invariant of the
system. These changes remove the contents of old assets from their
serialized form when they are passed to providers, eliminating the
ability of a provider to take such a dependency. In combination with the
changes to asset identity, this allows a provider to detect changes to
an asset simply by comparing its old and new hashes.
This is half of the fix for [pulumi/pulumi-cloud#158]. The other half
involves changes in [pulumi/pulumi-terraform].
It's legal and possible for undefined properties to show up in
objects, since that's an idiomatic JavaScript way of initializing
missing properties. Instead of failing for these during deployment,
we should simply skip marshaling them to Terraform and let it do
its thing as usual. This came up during our customer workload.
This change adds the capability for a resource provider to indicate
that, where an action carried out in response to a diff, a certain set
of properties would be "stable"; that is to say, they are guaranteed
not to change. As a result, properties may be resolved to their final
values during previewing, avoiding erroneous cascading impacts.
This avoids the ever-annoying situation I keep running into when demoing:
when adding or removing an ingress rule to a security group, we ripple
the impact through the instance, and claim it must be replaced, because
that instance depends on the security group via its name. Well, the name
is a great example of a stable property, in that it will never change, and
so this is truly unfortunate and always adds uncertainty into the demos.
Particularly since the actual update doesn't need to perform replacements.
This resolvespulumi/pulumi#330.
This change enables us to make progress on exposing data sources
(see pulumi/pulumi-terraform#29). The idea is to have an Invoke
function that simply takes a function token and arguments, performs
the function lookup and invocation, and then returns a return value.
This change improves our output formatting by generally adding
fewer prefixes. As shown in pulumi/pulumi#359, we were being
excessively verbose in many places, including prefixing every
console.out with "langhost[nodejs].stdout: ", displaying full
stack traces for simple errors like missing configuration, etc.
Overall, this change includes the following:
* Don't prefix stdout and stderr output from the program, other
than the standard "info:" prefix. I experimented with various
schemes here, but they all felt gratuitous. Simply emitting
the output seems fine, especially as it's closer to what would
happen if you just ran the program under node.
* Do NOT make writes to stderr fail the plan/deploy. Previously
we assumed that any console.errors, for instance, meant that
the overall program should fail. This simply isn't how stderr
is treated generally and meant you couldn't use certain
logging techniques and libraries, among other things.
* Do make sure that stderr writes in the program end up going to
stderr in the Pulumi CLI output, however, so that redirection
works as it should. This required a new Infoerr log level.
* Make a small fix to the planning logic so we don't attempt to
print the summary if an error occurs.
* Finally, add a new error type, RunError, that when thrown and
uncaught does not result in a full stack trace being printed.
Anyone can use this, however, we currently use it for config
errors so that we can terminate with a pretty error message,
rather than the monstrosity shown in pulumi/pulumi#359.
This includes a few changes:
* The repo name -- and hence the Go modules -- changes from pulumi-fabric to pulumi.
* The Node.js SDK package changes from @pulumi/pulumi-fabric to just pulumi.
* The CLI is renamed from lumi to pulumi.
This change finishes the conversion of LUMIDL over to the new
runtime model, with the appropriate code generation changes.
It turns out the old model actually had a flaw in it anyway that we
simply didn't hit because we hadn't been stressing output properties
nearly as much as the new model does. This resulted in needing to
plumb the rejection (or allowance) of computed properties more
deeply into the resource property marshaling/unmarshaling logic.
As of these changes, I can run the GitHub provider again locally.
This change fixespulumi/pulumi-fabric#332.
As explained in pulumi/pulumi-fabric#293, we were a little ad-hoc in
how configuration was "applied" to resource providers.
In fact, config wasn't ever communicated directly to providers; instead,
the resource providers would simply ask the engine to read random heap
locations (via tokens). Now that we're on a plan where configuration gets
handed to the program at startup, and that's that, and where generally
speaking resource providers never communicate directly with the language
runtime, we need to take a different approach.
As such, the resource provider interface now offers a Configure RPC
method that the resource planning engine will invoke at the right
times with the right subset of configuration variables filtered to
just that provider's package. This fixespulumi/pulumi#293.
This change simplifies the provider RPC interface slightly:
1) Eliminate Get. We really don't need it anymore. There are
several possibly-interesting scenarios down the road that may
demand it, but when we get there, we can consider how best to
bring this back. Furthermore, the old-style Get remains mostly
incompatible with Terraform anyway.
2) Pass URNs, not type tokens, across the RPC boundary. This gives
the provider access to more interesting information: the type,
still, but also the name (which is no longer an object property).
We are renaming Lumi to Pulumi Fabric. This change simply renames the
pulumi/lumi repo to pulumi/pulumi-fabric, without the CLI tools and other
changes that will follow soon afterwards.
This changes the RPC interfaces between Lumi and provider ever so
slightly, so that we can track default properties explicitly. This
is required to perform accurate diffing between inputs provided by
the developer, inputs provided by the system, and outputs. This is
particularly important for default values that may be indeterminite,
such as those we use in the bridge to auto-generate unique IDs.
Otherwise, we fail to reapply defaults correctly, and trick the
provider into thinking that properties changed when they did not.
This is a small step towards pulumi/lumi#306, in which we will defer
even more responsibility for diffing semantics to the providers.
This change serializes unknown properties anywhere in the entire
property structure, including deeply embedded inside object maps, etc.
This is now done in such a way that we can recover both the computed
nature of the serialized property, along with its expected eventual
type, on the other side of the RPC boundary.
This will let us have perfect fidelity with the new bridge's view on
computed properties, rather than special casing them on "one side".
As part of the bridge bringup, I've discoverd that the property state
returned from Creates does *not* always equal the state that is then
read from calls to Get. (I suspect this is a bug and that they should
be equivalent, but I doubt it's fruitfal to try and track down all
occurrences of this; I bet it's widespread). To cope with this, we will
return state from Create and Update, instead of issuing a call to Get.
This was a design we considered to start with and frankly didn't have
a super strong reason to do it the current way, other than that it seemed
elegant to place all of the Get logic in one place.
Note that providers may choose to return nil, in which case we will read
state from the provider in the usual Get style.
This change recognizes assets and archives as 1st class resource
property values. This is necessary to support them in the new bridge
work, and lays the foundation for fixing pulumi/lumi#153.
I also took the opportunity to clean up some old cruft in the
resource properties area.