* Lift snapshot management out of the engine
This PR is a prerequisite for parallelism by addressing a major problem
that the engine has to deal with when performing parallel resource
construction: parallel mutation of the global snapshot. This PR adds
a `SnapshotManager` type that is responsible for maintaining and
persisting the current resource snapshot. It serializes all reads and
writes to the global snapshot and saves the snapshot to persistent
storage after every write.
As a side-effect of this, the core engine no longer needs to know about
snapshot management at all; all snapshot operations can be handled as
callbacks on deployment events. This will greatly simplify the
parallelization of the core engine.
Worth noting is that the core engine will still need to be able to read
the current snapshot, since it is interested in the dependency graphs
contained within. The full implications of that are out of scope of this
PR.
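As a sketch of the shape such a manager might take (the BeginMutation
and End names come from the tests noted below; everything else here is
illustrative, not the final API):

    // SnapshotManager serializes access to the snapshot: the engine
    // asks for permission before each mutation, and the manager
    // persists the snapshot to storage when the mutation ends.
    type SnapshotManager interface {
        BeginMutation() (SnapshotMutation, error)
        Close() error
    }

    // SnapshotMutation represents a single in-flight write.
    type SnapshotMutation interface {
        End() error
    }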
Remove dead code: Steps no longer need a reference to the plan iterator that created them
Fix various issues that arise when bringing up pulumi-aws
Line length broke the build
Code review: remove dead field, fix yaml name error
Rebase against master, provide implementation of StackPersister for cloud backend
Code review feedback: comments on MutationStatus, style in snapshot.go
Code review feedback: move SnapshotManager to pkg/backend, change the engine to use a SnapshotManager interface
Code review feedback: use a channel for synchronization
Add a comment and a new test
* Maintain two checkpoints, an immutable base and a mutable delta, and
periodically merge the two to produce snapshots
* Add a lot of tests - covers all of the non-error paths of BeginMutation and End
* Fix a test resource provider
* Add a few tests, fix a few issues
* Rebase against master, fixed merge
We've seen failures in CI where DNS lookups fail, causing our
operations against the service to fail, as well as other sorts of
timeouts.
Add a set of helper methods in a new httputil package that helps us do
retries on these operations, and then update our client library to use
them when we are doing GET requests. We also provide a way for non-GET
requests to be retried, and use this when updating a lease (since it
is safe to retry multiple requests in this case).
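In sketch form, the helper looks something like this (the name and
signature are illustrative, not the actual httputil API):

    import (
        "net/http"
        "time"
    )

    // DoWithRetry retries a request until it succeeds or the attempt
    // budget is exhausted. Only requests that are safe to replay
    // (GETs, idempotent lease renewals) should be passed here.
    func DoWithRetry(client *http.Client, req *http.Request,
        attempts int) (*http.Response, error) {

        var res *http.Response
        var err error
        for i := 0; i < attempts; i++ {
            res, err = client.Do(req)
            if err == nil {
                return res, nil
            }
            time.Sleep(time.Duration(i+1) * 500 * time.Millisecond) // simple linear backoff
        }
        return nil, err
    }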
The RPC provider interface needs a way to convey back to the engine
that a resource being read no longer exists. To do this, we'll return
the ID property that was read back. If it is empty, it means the
resource is gone. If it is non-empty, we expect it to match the input.
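Engine-side, the check amounts to something like this (a sketch;
markDeleted is a hypothetical stand-in for however the engine records
the deletion):

    resp, err := provider.Read(ctx, req)
    if err != nil {
        return err
    }
    switch {
    case resp.GetId() == "":
        // The resource no longer exists in the live environment.
        return markDeleted(req.GetUrn())
    case resp.GetId() != req.GetId():
        return errors.Errorf("provider returned unexpected ID %q", resp.GetId())
    }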
This change skips unknown IDs during read operations. This can happen
when a read is performed using the output property of another resource
during planning. This is intentionally supported via ID being an
Input<ID>; all we need to do for this to work correctly is skip the
actual provider RPC, and the runtime will propagate unknown outputs as
usual.
This change lets plugin versions float in two ways:
1) If a `pulumi plugin install` detects a newer version is available
already, there's no need to download and install the older version.
2) If the engine attempts to load a plugin at a particular version and
   a newer version is available, it will be accepted without error.
As part of this, we permit $PATH to have the final say when determining
which version to accept. That is, it can always override the choice.
Note that I highly suspect, in the limit, that we'll want to stop doing
this for major version incompatibilities. For now, since we don't
envision any such version changes imminently, this will suffice.
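The acceptance rule reduces to a simple semver comparison, roughly
(the helper name is invented; this assumes the blang/semver package):

    import "github.com/blang/semver"

    // accepts reports whether a discovered plugin can satisfy a
    // request for a particular version: equal or newer is acceptable.
    func accepts(requested, candidate semver.Version) bool {
        return candidate.GTE(requested)
    }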
This change wires up the new Read RPC method in such a manner that
Pulumi programs can invoke it. This is technically not required for
refreshing state programmatically (as in pulumi/pulumi#1081); however,
it's a feature we had eons ago and have wanted since (see
pulumi/pulumi#83), and it will allow us to write code like
let vm = aws.ec2.Instance.get("my-vm", "i-07043cd97bd2c9cfc");
// use any property from here on out ...
The way this works is simply by bridging the Pulumi program via its
existing RPC connection to the engine, much like Invoke and
RegisterResource RPC requests already do, and then invoking the proper
resource provider in order to read the state. Note that some resources
cannot be uniquely identified by their ID alone, and so an extra
resource state bag may be provided with just those properties required.
This came almost for free (okay, not exactly) and will come in handy as
we start gaining experience with reading live state from resources.
This commit changes two things about our resource model:
* Stop performing Pulumi Engine-side diffing of resource state.
Instead, we defer to the resource plugins themselves to determine
whether a change was made and, if so, the extent of it. This
manifests as a simple change to the Diff function; it is done in
a backwards compatible way so that we continue with legacy diffing
for existing resource provider plugins.
* Add a Read RPC method for resource providers. It simply takes a
resource's ID and URN, plus an optional bag of further qualifying
state, and it returns the current property state as read back from
the actual live environment. Note that the optional bag of state
must at least include enough additional properties for resources
wherein the ID is insufficient for the provider to perform a lookup.
It may, however, include the full bag of prior state, for instance
in the case of a refresh operation.
This is part of pulumi/pulumi#1108.
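In sketch form, the provider interface gains something like the
following (an illustrative signature using the engine's existing
resource types, not the exact plugin interface):

    type Provider interface {
        // ...existing methods (Check, Diff, Create, Update, Delete)...

        // Read fetches a resource's live state. The state bag carries
        // extra identifying properties for resources whose ID alone is
        // insufficient for the provider to perform the lookup.
        Read(urn resource.URN, id resource.ID,
            state resource.PropertyMap) (resource.PropertyMap, error)
    }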
* Improve the error message arising from missing required configs for
resource providers
If the resource provider that we are speaking to is new enough, it will send
across a list of keys and their descriptions alongside an error
indicating that the provider we are configuring is missing required
config. This commit packages up the list of missing keys into an error
that can be presented nicely to the user.
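In sketch form (the helper is hypothetical):

    import (
        "bytes"
        "fmt"

        "github.com/pkg/errors"
    )

    // missingConfigError renders the provider's missing keys and their
    // descriptions as a single, readable error.
    func missingConfigError(provider string, missing map[string]string) error {
        var buf bytes.Buffer
        fmt.Fprintf(&buf, "provider %q is missing required configuration keys:\n", provider)
        for key, desc := range missing {
            fmt.Fprintf(&buf, "    %s: %s\n", key, desc)
        }
        return errors.New(buf.String())
    }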
* Code review feedback: renaming simplification and correcting errors in comments
* Send structured errors across RPC boundaries
This brings us closer to gRPC best practices where we send structured
errors with error codes across RPC endpoints. The new "rpcerrors"
package can wrap errors from RPC endpoints, so RPC servers can attach
some additional context as to why a request failed.
* Code review feedback:
1. Rename rpcerrors -> rpcerror, better package name
2. Rename RPCError -> Error, RPCErrorCause -> ErrorCause, names
suggested by gometalinter to improve their package-qualified names
3. Fix import organization in rpcerror.go
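For reference, the underlying mechanism is the standard gRPC status
API, which rpcerror wraps rather than replaces (a sketch):

    import (
        "fmt"

        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    func example() {
        // Server side: attach a code and a message to the failure.
        err := status.Error(codes.InvalidArgument, "resource URN is malformed")

        // Client side: recover the structured details.
        if st, ok := status.FromError(err); ok {
            fmt.Printf("RPC failed with %v: %s\n", st.Code(), st.Message())
        }
    }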
This change includes a bunch of refactorings I made in prep for
doing refresh (first, the command, see pulumi/pulumi#1081):
* The primary change is to change the way the engine's core update
functionality works with respect to deploy.Source. This is the
way we can plug in new sources of resource information during
planning (and, soon, diffing). The way I intend to model refresh
is by having a new kind of source, deploy.RefreshSource, which
will let us do virtually everything about an update/diff the same
way with refreshes, avoiding otherwise-duplicative effort.
This includes changing the planOptions (née deployOptions) to
take a new SourceFunc callback, which is responsible for creating
a source specific to the kind of plan being requested (see the
sketch after this list).
Preview, Update, and Destroy are now primarily differentiated by
the kind of deploy.Source that they return, rather than sprinkling
things like `if Destroying` throughout. This tidies up some logic
and, more importantly, gives us precisely the refresh hook we need.
* Originally, we used the deploy.NullSource for Destroy operations.
This simply returns nothing, which is how Destroy works. For some
reason, we were no longer doing this, and instead had some
`if Destroying` cases sprinkled throughout the deploy.EvalSource.
I think this is a vestige of some old way we did configuration, at
least judging by a comment, which is apparently no longer relevant.
* Move diff and diff-printing logic within the engine into its own
pkg/engine/diff.go file, to prepare for upcoming work.
* I keep noticing benign diffs anytime I regenerate protobufs. I
suspect this is because we're also on different versions. I changed
generate.sh to also dump the version into grpc_version.txt. At
least we can understand where the diffs are coming from, decide
whether to take them (i.e., a newer version), and ensure that as
a team we are monotonically increasing, and not going backwards.
* I also tidied up some tiny things I noticed while in there, like
comments, incorrect types, lint suppressions, and so on.
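The SourceFunc callback mentioned above might look roughly like this
(a sketch; planOptions' fields are elided and the deploy package types
are the engine's existing ones):

    // SourceFunc creates the deploy.Source appropriate to the kind of
    // plan being requested: an EvalSource for preview/update, a
    // NullSource for destroy, and eventually a RefreshSource.
    type SourceFunc func(opts planOptions) (deploy.Source, error)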
This change uses the prior checkpoint's deployment manifest to pre-
populate all plugins required to complete the destroy operation. This
allows for subsequent attempts to load a resource's plugin to match the
already-loaded version. This approach obviously doesn't work in a
hypothetical future world where plugins for the same resource provider
are loaded side-by-side, but we already know that.
This takes the existing `apitype.Checkpoint` type and renames it to
`apitype.CheckpointV1`, locking in its shape. In addition, we introduce
an `apitype.VersionedCheckpoint` type, which holds a version number and
a JSON document representing a checkpoint at that version. Now, when
reading a checkpoint, the CLI can determine if it's in a format it
understands, and fail gracefully if it is not.
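In sketch form (close to, though not necessarily byte-for-byte, the
apitype definitions):

    import "encoding/json"

    // VersionedCheckpoint wraps a checkpoint of any version: readers
    // inspect Version before deserializing Checkpoint, and can fail
    // gracefully on versions they do not understand.
    type VersionedCheckpoint struct {
        Version    int             `json:"version"`
        Checkpoint json.RawMessage `json:"checkpoint"`
    }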
While the CLI understands the older checkpoint version, it always
writes the newest version format, meaning that if you manage a
fire-and-forget stack with this version of the CLI, it will be
un-readable by previous versions.
Stacks managed by Pulumi.com are not impacted by this change.
Fixes: #887
* Improve error messages output by the CLI
This fixes a couple known issues with the way that we present errors
from the Pulumi CLI:
1. Any errors from RPC endpoints were bubbling up as they were to
the top-level, which was unfortunate because they contained
RPC-specific noise that we don't want to present to the user. This
commit unwraps errors from resource providers.
2. The "catastrophic error" message often got printed twice.
3. Fatal errors are often printed twice, because our CLI top-level
prints out the fatal error that it receives before exiting. A lot of
the time this error has already been printed.
4. Errors were prefixed by PU####.
* Feedback: Omit the 'catastrophic' error message and use a less verbose error message as the final error
* Code review feedback: interpretRPCError -> resourceStateAndError
* Code review feedback: deleting some commented-out code, error capitalization
* Cleanup after rebase
When a stack has secrets, we now take the secret values and construct
a regular expression which is just an alternation of all the secret
values. Then, before pushing any string data into an Event, we run the
regular expression and replace all matches with '[secret]'.
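In sketch form (the helper names are invented):

    import (
        "regexp"
        "strings"
    )

    // newSecretScrubber builds a single regexp that matches any of the
    // stack's secret values. Callers should skip scrubbing entirely
    // when there are no secrets.
    func newSecretScrubber(secrets []string) *regexp.Regexp {
        quoted := make([]string, len(secrets))
        for i, s := range secrets {
            quoted[i] = regexp.QuoteMeta(s)
        }
        return regexp.MustCompile(strings.Join(quoted, "|"))
    }

    // scrub replaces every occurrence of a secret with a placeholder.
    func scrub(re *regexp.Regexp, msg string) string {
        return re.ReplaceAllString(msg, "[secret]")
    }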
Fixes #747
As it stands, we allow plugin load requests to race. Not only does this
create a situation in which we may load and then immediately throw away
a plugin (potentially leaking its process), it also creates the
possibility of races when reading from/writing to the various plugin
caches. These changes serialize all plugin loads and cache accesses by
running all accesses for a particular host in a single goroutine.
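The scheme reduces to a small, generic pattern, roughly (a sketch, not
the actual plugin host code):

    // pluginLoader funnels every load and cache access for a host
    // through a single goroutine, so none of them can race.
    type pluginLoader struct {
        requests chan func()
    }

    func newPluginLoader() *pluginLoader {
        l := &pluginLoader{requests: make(chan func())}
        go func() {
            for req := range l.requests {
                req() // loads and cache accesses run one at a time
            }
        }()
        return l
    }

    // load submits work to the goroutine and waits for its result.
    func (l *pluginLoader) load(do func() error) error {
        done := make(chan error)
        l.requests <- func() { done <- do() }
        return <-done
    }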
Fixes #1020.
This helper method is only really used for testing, but we should not
allow it to create a Key whose namespace contains a colon (as ParseKey
would never build such a key).
This API was introduced to aid the refactoring, but it isn't something
we want to support long term. Remove it and, in a few places, push
config.Key around further instead of converting to the old type
eagerly.
When serializing config.Keys, we now write them as <package>:<name>
instead of <package>:config:<name>. We continue to support reading the
older format for compatibility with older files.
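A reader that accepts both forms looks roughly like this (parseKey is
invented; the real code lives in the config package):

    import (
        "strings"

        "github.com/pkg/errors"
    )

    // parseKey accepts the new <package>:<name> form as well as the
    // legacy <package>:config:<name> form.
    func parseKey(s string) (namespace, name string, err error) {
        switch parts := strings.Split(s, ":"); {
        case len(parts) == 2:
            return parts[0], parts[1], nil
        case len(parts) == 3 && parts[1] == "config":
            return parts[0], parts[2], nil
        default:
            return "", "", errors.Errorf("could not parse %q as a config key", s)
        }
    }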
config.Key has become a pair of namespace and name. Because the whole
world has not changed yet, there continues to be a way to convert
between a tokens.ModuleMember and a config.Key; however, the conversion
from tokens.ModuleMember can now fail (when the module member is not of
the form `<package>:config:<name>`).
I'll be changing the structure of the representation of config.Key, so
let's write some tests first to ensure we can continue to treat
everything as JSON and YAML.
Right now, config.Key is a type alias for tokens.ModuleMember. I did a
pass over the codebase such that we use config.Key everywhere it
looked like the value did not leak to some external process (e.g., a
resource provider or a langhost).
Doing this makes it a little clearer (hopefully) where code is
depending on a module member structure (e.g. <package>:config:<value>)
instead of just an opaque type.
As it stands, we only configure those providers for which configuration
is present. This can lead to surprising failure modes if those providers
are then used to create resources. These changes ensure that all
resource providers that are not configured during plan initialization
are configured upon first load.
Fixes #758.
A hold-over from a previous experiment (LumiIDL) which we don't use
anymore. If we decide to bring that back, we can easily restore these
types, but for now, let's just remove this dead code.
Most of the errors in this package are holdovers from our previous
system where we had our own custom compiler and evaluator, and they are
no longer needed. The few we still use during plan application (via the
diagnostics system, which is another component from the old system
that we still use) have been promoted into the diag package. Doing so
allows us to not have to import "github.com/pkg/errors" as "goerr" in
some parts of the engine, a nice cleanup.
Despite our good progress moving towards having an apitype package,
where our exchange types live and can be shared among the engine and
our services, there were a few major types that were still duplicated.
Resource was the biggest example -- and indeed, the apitype variant
was missing the new Dependencies property -- but there were others,
like Manifest, PluginInfo, etc. These too had semi-random omissions.
This change merges all of these types into the apitype package. This
not only cleans up the redundancy and missing properties, but will
"force the issue" with respect to keeping them in sync and properly
versioning the information in a backwards compatible way.
The resource/stack package still exists as a simple marshaling layer
to and from the engine's core data types.
Finally, I've made the controversial change to share the actual
Deployment data structure at the apitype layer also. This will force
us to confront differences in that data structure similarly, and will
allow us to leverage the strong typing throughout to catch issues.
Previously, we would prefer a plugin on the $PATH, which is more or
less always present for people hacking on `pulumi`. Later, when we
went to check that the loaded plugin's version matched the one we
requested, we would fail.
Now, if we have a version, we'll first consult the local plugin
cache. If that fails, we'll fall back to the $PATH as we used to.
When we are loading a plugin without a version, we continue to use the
one on the $PATH (without testing the cache) on the assumption it is
newer.
In addition, we've turned the "plugin versions are mismatched" case
from an error into a warning. We expect that we'll only ever see this
warning when something strange is going on (since in the normal case,
we'll have found the exact version in the cache), but having it not
hard-fail does help in development cases.
Fixes #977
This change adds a basic Python langhost RPC server. It's fairly
barebones and merely acts as a jumping off point for the Pulumi engine
to spawn a Python program. The host is written in Go, in contrast to
implementing the host in Python, and more closely resembles how I
expect the Node.js language host to work once pulumi/pulumi#331 is done.
1. Various idiomatic Go and TypeScript fixes
2. Add an integration test that end-to-end roundtrips dependency
information for a simple Pulumi program
3. Add an additional assertion that verifies that dependency
information comes from the language host as expected
This change makes the engine backwards compatible with older
language host binaries, by simply ignoring GetRequiredPlugins
calls when the RPC server has not yet implemented it. This
is benign, since we will eventually fault plugins in on demand,
although it does mean that commands like `pulumi plugin install`
will become no-ops (which, thankfully, is what we want).
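The compatibility check is the standard gRPC idiom, roughly (a sketch,
assuming the generated pulumirpc client):

    import (
        "context"

        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"

        pulumirpc "github.com/pulumi/pulumi/sdk/proto/go"
    )

    func requiredPlugins(ctx context.Context, host pulumirpc.LanguageRuntimeClient,
        req *pulumirpc.GetRequiredPluginsRequest) ([]*pulumirpc.PluginDependency, error) {

        resp, err := host.GetRequiredPlugins(ctx, req)
        if err != nil {
            if status.Code(err) == codes.Unimplemented {
                // Older language host: treat as "no required plugins";
                // the engine will fault plugins in on demand instead.
                return nil, nil
            }
            return nil, err
        }
        return resp.GetPlugins(), nil
    }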
This commit does two things:
1. All dependencies of a resource, both implicit and explicit, are
communicated directly to the engine when registering a resource. The
engine keeps track of these dependencies and ultimately serializes
them out to the checkpoint file upon successful deployment.
2. Once a successful deployment is done, the new `pulumi stack
graph` command reads the checkpoint file and outputs the dependency
information within it in the DOT format.
Keeping track of dependency information within the checkpoint file is
desirable for a number of reasons, most notably delete-before-create,
where we want to delete resources before we have created their
replacement when performing an update.
Previously, the checkpoint manifest contained the full path to a plugin
binary, in place of its friendly name. Now that we must move to a model
where we install plugins in the PPC based on the manifest contents, we
actually need to store the name, in addition to the version (which is
already there). We still also capture the path for debugging purposes.
We have had a long-standing bug in here where we waited on a
stdout channel that never got populated when the language plugin
failed to load entirely. This would lead to hung processes. The
fix is simple: only wait for stdout/stderr channels to drain if they
have actually been wired up with the requisite signaling.
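In sketch form, the fix amounts to guarding each wait (the channel
names are invented):

    // Only wait on channels that were actually wired up; a nil channel
    // here means the corresponding stream was never connected, and
    // receiving from it would block forever.
    if stdoutDone != nil {
        <-stdoutDone
    }
    if stderrDone != nil {
        <-stderrDone
    }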
This adds support for two things:
* Installing all plugins that a project requires with a single command:
$ pulumi plugin install
* Listing the plugins that this project requires:
$ pulumi plugin ls --project
$ pulumi plugin ls -p
This brings back the Node.js language plugin's GetRequiredPlugins
function, reimplemented in Go now that the language host has been
rewritten from JavaScript. Fairly rote translation, along with
some random fixes required to get tests passing again.
This change adds a GetRequiredPlugins RPC method to the language
host, enabling us to query it for its list of plugin requirements.
This is language-specific because it requires looking at the set
of dependencies (e.g., package.json files).
It also adds a call up front during any update/preview operation
to compute the set of plugins and require that they are present.
These plugins are populated in the cache and will be used for all
subsequent plugin-related operations during the engine's activity.
We now cache the language plugins, so that we may load them
eagerly too, which we never did previously because we needed to
pass the monitor address at load time. This was a bit bizarre
anyhow, since it's really the Run RPC function that needs this
information. So, to enable caching and eager loading -- which we
need in order to invoke GetRequiredPlugins -- the "phone home"
monitor RPC address is now passed at Run time.
In a subsequent change, we will switch to faulting in the plugins
that are missing -- rather than erroring -- in addition to
supporting the `pulumi plugin install` CLI command.
This change introduces a workspace.GetPluginPath function that probes
the central workspace cache for a plugin binary that matches the
desired kind, name, and, optionally, version. It also permits
overriding this with $PATH for developer scenarios.
The analyzer, language, and resource plugin logic now uses this function
for deciding which binary path to load at runtime.
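A typical call site might look like this (a sketch; the argument list
and the kind constant are assumptions, not the exact signature):

    // Probe the workspace's plugin cache, letting $PATH override the
    // result for developer scenarios.
    path, err := workspace.GetPluginPath(workspace.ResourcePlugin, "aws", version)
    if err != nil {
        return err
    }
    // ...exec the plugin binary found at path...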
This addresses pulumi/pulumi#446: what we used to call "package" is
now called "project". This has gotten more confusing over time, now
that we're doing real package management.
Also fixes pulumi/pulumi#426, while in here.