Commit Graph

219 Commits

Author SHA1 Message Date
Fraser Waters 7ff5c18216
Add GetRequiredPackages to the language host ()
Next part of https://github.com/pulumi/pulumi/issues/17507

This adds a new `GetRequiredPackages` method to the language hosts. This
will replace `GetRequiredPlugins`. This change set adds the method, and
the required handling in the engine without making use of the new
information returned by the package form, and without updating the
language hosts to return that information.

This should continue to work by hitting all the compatibility paths.
I'll follow this change up with updating the conformance tests to check
against this, and fixing the language hosts to return the new package
information.
2024-12-02 20:24:23 +00:00
Will Jones 6e04a284a5
Generate more accurate reproductions when fuzzing ()
Fuzzed lifecycle tests work slightly differently to their handwritten
counterparts. A handwritten test will typically "start from nothing". An
initial snapshot is built from an empty state and starting program,
before subsequent operations are executed on this state to test
behaviour. In constrast, fuzzed tests create arbitrary starting
snapshots "out of thin air", before running an operation to see if a bug
can be triggered. Ideally, any starting state conjured by a fuzz test is
actually reproducible from an empty state and some combination of
operations, but it may be that this is not the case, or that the number
of operations required to reach the state is very high. In such cases,
it is handy to have the exact code the fuzz test used to hand when
reproducing and isolating behaviour. To this end, this commit extends
the `reprogen` functionality of the suite to generate this code as well
as the existing "handwritten" approximation. This should also aid in
minimising failing test cases quickly when bugs are found.
2024-11-15 09:16:46 +00:00
Will Jones 8314293d6e
Support renaming providers in targeted operations ()
When executing a `--target`ed operation, Pulumi considers all providers
to be "implicitly targeted", whether their URNs are mentioned or not. As
a result, *all* Pulumi operations update provider resources,
`--target`ed or not. This can lead to some tricky scenarios that Pulumi
needs to handle or detect in order not to break the user's program or
state.

One such case is where a user tries to alter the provider of an
untargeted resource. Consider the following starting program/state:

```typescript
const parent = new sdk.Component("parent")
const provider = new sdk.Provider("provider")
const resource = new sdk.Resource("resource", { provider })
```

Let us modify the program so that `provider` becomes a child of
`parent`. In doing so, `provider`'s URN will be changed on the next
update to include `parent`'s type (so e.g. `provider` will become
something akin to `parent$provider`):

```typescript
const parent = new sdk.Component("parent")
const provider = new sdk.Provider("provider", { parent })
const resource = new sdk.Resource("resource", { provider })
```

Now we run `pulumi up --target "**non-existent**"`. That is, we specify
a target glob that will not target *any* of the resources in the
program. Our update will proceed as follows:

* A resource registration for `parent` is sent. `parent` is not targeted
(it is not matched by the `--target` glob, and it is not a provider
resource), so we emit a `SameStep` for it to copy it unchanged from the
old state to the new state.
* A resource registration for `provider` is sent, with a parent of
`parent`. This resource *is* targeted since it is a provider, and thus
is implicitly targeted. Pulumi constructs the URN `parent$provider` for
the provider resource and looks to see if there is any old state for it.
* Pulumi finds no old state (the existing provider state is registered
under the URN `provider` [no `parent$`]) and so decides that `provider`
must be `Create`d. It emits a `CreateStep` and moves on.
* A resource registration for `resource` is sent. `resource` is not
targeted (it is not matched by the `--target` glob, and it is not a
provider resource), so we want to emit a `SameStep` for it to copy it
unchanged from the old state to the new state.
* *However*, as part of checking whether we can emit a `SameStep` for
resource, we find that it references a provider that we have not seen a
registration for (`provider` -- recall that we have only seen
`parent$provider`). We thus abort with an error informing the user that
they have to run an untargeted update to update their provider before
targeted updates can proceed.

If the user genuinely wants to rename the provider, and not actually
delete the old one and create a new one whose parent happens to be
`parent`, they can of course use an alias:

```typescript
const parent = new sdk.Component("parent")
const provider = new sdk.Provider(
  "provider",
  {
    parent,
    aliases: [new Alias({ parent: undefined })],
  }
)

const resource = new sdk.Resource("resource", { provider })
```

At least, that's what they might expect. Presently, attempting to do
this will yield a `panic`, neither tripping the existing error
machinery, nor successfully completing the `--target`ed operation.

This commit fixes this panic by making it work. We'd previously
considered making this case trigger the existing "no provider exists"
error flow, but this is not in fact correct -- in the case an alias is
supplied, the provider *does* exist and we should be able to find it and
complete the rename. Here we make that so.

Fixes 
2024-11-13 16:54:16 +00:00
Will Jones db67d58b74
Don't copy deleted dependencies of untargeted resources ()
When using `--target` to target specific resources during an update, we
use the list of targets to decide which steps to generate given a set of
resource registrations. Specifically:

* If the registration event names a resource that is targeted, we
process it as usual.
* If the registration event names a resource that _is not_ targeted, we
emit a `SameStep` for it.

In the latter case, the emission of a `SameStep` means that the old
state for the resource will be copied across to the new state. This is
the desired behaviour -- the resource was not targeted and so the new
state should contain the resource exactly as it was prior to the update.
In , this behaviour was extended so that any *dependencies* of a
resource being copied were *also* `SameStep`ped, to avoid copying over
states referring to resources no longer registered by the program (and
thus not already implicitly in the new state -- see  for a
concrete example).

Manually traversing dependencies in this manner introduces a new bug
however -- by walking "back up" the state as step generation is
proceeding, we are able to observe certain intermediate states that we
would not be able to ordinarily. One such intermediate state is that of
resources which are marked for deletion as part of a
create-before-replace operation. Consider the following starting
program/state:

```typescript
const p = new Resource("parent")
const a = new Resource("a", { parent: p })
const b = new Resource("b", { dependency: a })
```

We have three resources in our state: `p`, `a` (whose URN will be
something like `p$a` since it has `p` as its parent and will thus embed
`p`'s type in its URN), and `b`. We change the program as follows:

```typescript
const p = new Resource("parent")
const a = new Resource("a", { aliases: [new Alias({ parent: p })] })
const b = new Resource("b", { dependency: a })
```

We have modified `a` to no longer have `p` as a parent and added an
alias to the URN it would have had when `p` *was* its parent. With this
change made, we run a *targeted* update that *only targets `a`*.
Moreover, as part of this update, the following will occur:

* `a` will be diffed and found to be needing *replacement*;
* The replacement will be a "create-before-replace", meaning that a new
A will be created *before* the old A is deleted;
* As part of this operation, the creation of the new A will succeed,
while the deletion of the old A will fail.

As a result, the operation unfolds thus:

* A resource registration is received for `p`. `p` is not targeted, so a
`SameStep` is emitted for it.
* A resource registration is received for `a`, with no parent and an
alias. The alias means that the old `a` with URN of the form `p$a` will
be found and become `a`'s `old` during step generation.
* `a` is targeted, so we diff it and find we must perform a
create-before-replace.
* A new `a`, with the parentless URN, is created successfully.
* The old `a`, with the parented (`p$a`) URN (since the alias enabled us
to find this), is marked for deletion -- its `Delete` field is set to
`true`. This permits Pulumi to retry deletions of replaced resources
should they fail.
* Indeed, this deletion fails, so the old `a` (with `Delete: true`) will
be left in the state to be cleaned up in a subsequent operation.
* A resource registration is received for `b`. `b` is not targeted, so a
`SameStep` is emitted for it. Prior to this however, `b`'s dependencies
are traversed so that `SameStep`s can be emitted for them as part of
.
* `b`'s dependencies contain a reference to *`a`'s old URN* (of the form
`p$a`).
* Ordinarily, this URN would be *normalized* (rewritten to the new URN)
as part of snapshot persistence -- this happens to all resources,
targeted or not, if aliased resources are targeted as part of an
operation.
* In this case however, we are manually traversing dependencies prior to
snapshot persistence and so can observe the old resource (which is now
marked for deletion with `Delete: true`).
* We emit a `SameStep` for the old `a`, which causes a panic since
`SameStep`s are not supposed to be constructed for `Delete`d resources.

Despite the wordy explanation, the fix is (it seems) straightforward --
just as we skip `Delete`d resources when building the list of `old`s
that will take part in step generation, we skip them when we encounter
them as we manually traverse dependencies. The snapshot persistence
layer then takes care of them later on, copying them over. Note that
they are copied over *after* the resources that supposedly depend on
them, but due to URN normalization, those resources will be rewritten to
have their dependencies point to the new, renamed resources, which
*will* appear *before* them as is necessary.

Fixes 
2024-11-13 09:36:15 +00:00
Will Jones 51a62a9265
Add Go tests for fuzzing the engine ()
In  through  and some follow-up PRs, we built out a
framework for fuzzing lifecycle tests in order to help track down
snapshot integrity violations in the Pulumi engine. All that remains now
is to actually provide ways to trigger a fuzzing run in useful ways.
This commit kicks this off by introducing two Go test functions that can
be run with `go test` or our `Makefile`:

* `TestFuzz` -- this runs the fuzzer and generates a brand new set of
scenarios (1,000 by default) and checks whether any of them result in a
snapshot integrity error. This test is skipped unless an environment
variable is set (which the `Makefile` handles if one runs `make
test_lifecycle_fuzz`). The intended purpose of this test is to back one
or more CI workflows that will run periodically in order to slowly
explore the state space.

* `TestFuzzFromStateFile` -- this accepts a path to a JSON state file
(such as that produced by a `pulumi stack export`) and uses that state
to seed the fuzzer, subsequently trying to find provider and operation
configurations that lead to a snapshot integrity error. This test is
skipped unless a state file path is set using the relevant environment
variable. The intended purpose of this test is to make it possible to
find root causes for user issues when all we have is a state and we'd
like to guess the program/provider configurations that led to an issue.

Alongside introducing these two tests, we bulk out the fuzzing
documentation a bit to help engineers run them, and link to the new
sections from the existing docs on snapshot integrity issues.
2024-11-08 13:11:03 +00:00
Will Jones f9de64c2cb
Bump gRPC dependencies and migrate `grpc.Dial` ()
We'd like to bump `pulumi-java` to 0.17.0, but in order to do so we'll
need to first bump our gRPC dependencies to match. Doing so incurs
dealing with a deprecation -- `grpc.Dial` has been deprecated in favour
of `grpc.NewClient`. This commit makes that upgrade and migration
separately so that we can bump Java in a subsequent piece of work.
2024-11-06 18:36:10 +00:00
Will Jones 79c5d97afa
Normalize URNs in `DeletedWith` references ()
It's often the case that we want to move or rename resources in our
Pulumi programs. Doing so can result in a change in the resource's URN,
which means that Pulumi will, by default, see the move as a deletion of
the old resource and a creation of the new resource. We can tell Pulumi
that a resource has been renamed by using *aliases*, whereby a resource
can be annotated with its previous URNs. Pulumi will then use these URNs
in several places:

* When looking for a resource's previous state, Pulumi will try to find
it using the new URN first and any aliases second.
* When writing out a new snapshot, Pulumi will *normalize* all URNs
(e.g. those in provider, parent and dependency references) to remove
aliases and replace them with the new URNs they resolve to.

Alas, a familiar story presents itself in the second case -- Pulumi does
not currently normalize `DeletedWith` references. This can result in
snapshot integrity errors as Pulumi leaves stale references in the
snapshot before writing it. This commit addresses this omission, using
the now-preferred `GetAllDependencies` method introduced in  to
hopefully stop this from happening again in this part of the codebase.

Fixes 
2024-11-05 13:27:11 +00:00
Will Jones acfec7a8a8
Generate reproductions for fuzz failures ()
In  and the PRs that followed it, we added fuzz testing
capabilities to the engine's lifecycle test suite, with a view to
randomly generating test cases in the hopes of proactively finding
snapshot integrity bugs in our code. This commit extends the fuzzing
library to generate a reproducing lifecycle test case in the event that
a snapshot integrity error is found, hopefully aiding in debugging and
pinning down the exact cause of the error. Code is written to a file in
a temporary directory, which may optionally be overridden using an
environment variable. This might be useful in e.g. a GitHub action that
fuzzes periodically so that failing cases' reproductions can be made
available as artifacts for download.

Part of 
2024-11-01 18:46:54 +00:00
Will Jones 85e39e09f6
Add the ability to fuzz parents and aliases ()
In  and the PRs that followed it, we added fuzz testing
capabilities to the engine's lifecycle test suite, with a view to
randomly generating test cases in the hopes of proactively finding
snapshot integrity bugs in our code. This commit extends the fuzzer so
that it randomly parents, re-parents, and aliases reparented resources,
covering the various parent/child relationships that these actions lead
to and which can result in snapshot integrity issues if handled
improperly. In particular, we can now fuzz the following:

* Randomly parenting a resource to another resource in an initial
snapshot.
* Randomly updating a resource in a program to either add, remove, or
change an existing parent.
* Randomly aliasing a resource whose parent (and consequently URN) has
been changed by a program to point back to the URN it had originally.

Part of 
2024-11-01 10:32:56 +00:00
Will Jones 28b5774366
Support generating random fixtures in lifecycle tests ()
Snapshot integrity errors are very problematic when they occur and can
be hard to spot and prevent. To this end,  outlines a plan to
introduce [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) to our suite
of lifecycle tests in order to find cases and executions which might
violate snapshot integrity. This commit extends the `fuzzing` package of
the suite to support generating random fixtures and adds documentation
for the tactics employed when doing so.

A fixture comprises an initial snapshot, a program to run against that
initial snapshot, a set of providers to use during that program's
execution, and an operation to run. Fixtures can then be generated as
part of a Rapid property test, allowing us to fuzz random combinations
in order to hunt down snapshot integrity issues.

Part of 
2024-10-31 15:16:38 +00:00
Will Jones 2aad59df18
Spot skipped-create dependencies even when inputs don't change ()
`Create`s happen in response to new resources being present in a Pulumi
program but not in Pulumi's state snapshot. For instance, take the
following TypeScript program:

```typescript
const b = new Resource("b", { x: 2 })
const c = new Resource("c", { x: 3 }, { deletedWith: b })
```

When asked to run this program on an empty state with no arguments,
Pulumi will issue two `Create` calls -- one for `b`, and one for `c`.
The call for `c` will *depend on* `b`'s due to the need for
`deletedWith` to refer to `b`'s URN.

If instead of passing no arguments we ask Pulumi to perform a *targeted*
operation that *only targets `c`*, Pulumi will throw an error and refuse
to execute the program. This is because `c`'s `Create` would depend on
`b`'s `Create`, but since `b` has not been targeted, its `Create` will
not happen and thus `c`'s call cannot be built. Internally, we call
`b`'s omission a "skipped create", and keep track of these so that we
can trigger the error above appropriately.

So far, so good. Now, consider that we have executed the above program
with no targets, so that Pulumi's state contains both `b` and `c`, with
`c` depending on `b` via a `deletedWith` link. We now add the missing
`a` to the program and modify `b` to depend on it:

```typescript
const a = new Resource("a", { x: 1 })
const b = new Resource("b", { x: 2 }, { deletedWith: a })
const c = new Resource("c", { x: 3 }, { deletedWith: b })
```

If we were to run the program with no targets, we would expect a state
in which all of `a`, `b` and `c` existed. `c` would depend on `b` (as it
did before) and `b` would have a new dependency on `a`. Lovely.

Of course, we are not about to run the program with no targets. We are
instead interested in *targeting `b` only*. When we do, the following
sequence of events will unfold:

* `a`, not existing in the state already and not being targeted, will
yield a skipped create.
* `b` depends on a skipped create. However, its *inputs* have not
changed (`x` was `2` before, and it's still `2` after). Consequently,
`b` will have yielded a `SameStep`. We *incorrectly* assume this means
that skipped creates are not a problem (`SameStep`s mean no changes,
right?) and allow the deployment to proceed.
* We write the new `b` to the state, with a dependency on the skipped
create, yielding a snapshot integrity error.
* 💥

We might ask "why assume that `SameStep`s are safe?". From looking at
the existing test cases, it seems likely that this was designed to cover
the case where a skipped create is depended on *by another skipped
create* (which is internally represented as a `SameStep`) -- in such
cases, we don't need to error and the deployment can proceed. However,
this bug shows that there are cases where `SameStep` does not imply
skipped create. This commit thus tightens up the logic appropriately,
checking explicitly for `SameStep`s whose `IsSkippedCreate` method
returns true.
2024-10-30 16:17:30 +00:00
Will Jones 555c051186
Support generating random providers in lifecycle tests ()
Snapshot integrity errors are very problematic when they occur and can
be hard to spot and prevent. To this end,  outlines a plan to
introduce [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) to our suite
of lifecycle tests in order to find cases and executions which might
violate snapshot integrity. This commit extends the `fuzzing` package of
the suite to support generating random providers. A provider may
configure operations such as `Create`, `Diff`, etc. to fail or succeed
in a number of ways (e.g. no diff, changes, replaces, etc.) on a per-URN
basis, so as to exercise a variety of code paths in step
generation/execution.

Part of 
2024-10-29 17:51:29 +00:00
Will Jones f3b163052d
Support generating random programs in lifecycle tests ()
Snapshot integrity errors are very problematic when they occur and can
be hard to spot and prevent. To this end,  outlines a plan to
introduce [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) to our suite
of lifecycle tests in order to find cases and executions which might
violate snapshot integrity. This commit extends the `fuzzing` package of
the suite to support generating random programs. A program is based upon
a snapshot and may randomly append and prepend new resources and copy,
drop or update existing ones.

Part of 
2024-10-29 15:32:53 +00:00
Will Jones 92df1f6a94
Support generating random snapshots in lifecycle tests ()
Snapshot integrity errors are very problematic when they occur and can
be hard to spot and prevent. To this end,  outlines a plan to
introduce [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) to our suite
of lifecycle tests in order to find cases and executions which might
violate snapshot integrity. This commit extends the `fuzzing` package of
the suite to support generating random valid snapshots, the idea being
that this will enable us to generate good starting states for our
property tests. A snapshot contains a number of random resources which
may depend on one another in valid ways. Snapshots may also contain
deleted resources which have not yet been cleaned from the snapshot.

Part of 
2024-10-29 14:27:19 +00:00
Will Jones 9343f54695
Support generating random resources in lifecycle tests ()
Snapshot integrity errors are very problematic when they occur and can
be hard to spot and prevent. To this end,  outlines a plan to
introduce [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) to our suite
of lifecycle tests in order to find cases and executions which might
violate snapshot integrity. This commit kicks this off by introducing a
`fuzzing` package to the suite and adding types and generators (from the
`pgregory.net/rapid` library) for generating random resources. The idea
is that from resources we can progress to generating random snapshots;
from there to programs and provider configurations and so on; and with
all these pieces execute random tests in an attempt to find snapshot
integrity bugs before our customers do.

Part of 
2024-10-29 13:22:47 +00:00
Will Jones 26ae5768a8
Factor out the lifecycle testing framework ()
Lifecycle tests exist to test the Pulumi engine, allowing us to mock
providers and programs and resource registrations before executing
specific operations. We'd like to extend our lifecycle test suite to
support fuzzing the engine -- that is, generating test cases to exercise
edge cases that we might otherwise struggle to find manually. Presently,
the framework underpinning the lifecycle tests is defined in a single
`test_plan.go` module that sits alongside the `_test.go` files
themselves. This means that it's not really possible to factor out or
neatly extend the framework (e.g. by adding a `fuzzing` submodule)
without introducing circular imports (since tests in `lifecycletest`
will import `fuzzing`, which imports the framework from
`lifecycletest`).

This commit therefore pulls out `test_plan.go` into a dedicated
`framework` submodule, which is now explicitly imported and used by
existing tests. It also adapts the framework slightly to define the
subset of `testing.T` functionality it needs more precisely, so that
when the time comes for us to fuzz we can e.g. pass a `rapid.T` just as
easily. No behavioural changes are made to the tests themselves -- this
is purely a structural change to facilitate subsequent work on fuzzing.
2024-10-28 11:58:59 +00:00
Will Jones c496921d44
Enable some more linting rules ()
Issue  lists a number of extra linting checks that we could enable
in order to make our Go code more robust. This commit implements as many
as seem sensible:

* `durationcheck`, which checks for multiplication of `time.Duration`s,
which can lead to unexpected behaviour (e.g. `time.Second * time.Second`
is *not* one second)
* `goprintffuncname`, which checks that `Printf`-like functions are
appropriately suffixed with `f` to indicate as such
* `tenv`, which checks for `os.Setenv` in tests where `t.Setenv` is
generally a better solution
* `wastedassign`, which checks for assignments whose values are never
used (such as initial values before an `if` where both branches then
overwrite the value)
* `whitespace`, which checks for blank lines at the beginning and end of
blocks such as functions, `if`s, `for`s and so on.

This commit does *not* enable the following checks listed in :

* `wrapcheck`, which insists that third-party library errors are always
`%w`rapped -- we have a lot of cases where we don't do this and it's
probably a bit more involved than "just wrap them" in terms of making
sure we don't break anything (maybe)
* `predeclared`, which checks for shadowing of existing Go identifiers
-- we use `old` and `new` a lot, especially in step generation, so this
is probably a slightly bigger clean-up/one we might want to opt out of
* `mnd` (magic number detection) -- we have a lot of failures on this
* `nilnil` -- we only have a couple of failures on this; these could
probably be handled with `//nolint` but for now I've opted not to take
this route.
2024-10-03 17:37:13 +00:00
Fraser Waters 205515d4dc
Add a lifecycle test to show that ImportID works with parameterized providers ()
We didn't have a test confirming that ImportID worked with parameterized
providers. We thought it did, but a test to prove it is worth having.
2024-09-25 19:51:02 +00:00
Will Jones dc6f275f03
Fix dependency traversal for untargeted skipped creates ()
When performing a `--target`ed operation, it is possible that the new
program introduces new resources that are not targeted. In such cases,
we can't emit `CreateStep`s (since that would not respect the
`--target`s), but we need to track the creations so that e.g. later
resources that will come to depend on these would-be resources can be
handled appropriately. The step generator maintains a list of so-called
"skipped creates" to this end.

Unfortunately, when later traversing the list of skipped creates to see
if any resources depend on them, "depend on" does not take into account
property dependencies and deleted-with relationships. This results in
snapshot integrity errors as we end up writing the broken dependencies
to the state. This commit rewrites the code to use the new
`GetAllDependencies()` method on `resource.State` to take account of the
other possible relationships, adding lifecycle tests for these new cases
and cleaning up the existing tests to be more specific about the errors
they expect.
2024-09-23 15:02:37 +00:00
Fraser Waters 2ebaf039ec
Add a test for ()
Add a test for https://github.com/pulumi/pulumi/issues/14254, which
happened to have been "fixed" in
https://github.com/pulumi/pulumi/pull/15716. We didn't notice this at
the time, and almost regressed this fix in
https://github.com/pulumi/pulumi/pull/17245.

Adding a test to ensure we don't regress this.

Fixes https://github.com/pulumi/pulumi/issues/14254.
2024-09-13 07:58:21 +00:00
Julien 68524af701
Enable goheader rule and add missing license headers ()
This commit adds the `goheader` rule to `golangci-lint` to enforce that
all our Go source code includes appropriate licence headers, fixing up
files that currently fail that check.

---------

Co-authored-by: Will Jones <will@sacharissa.co.uk>
2024-09-09 12:05:45 +00:00
Will Jones 079f98be15
Hide unnecessary rows in non-interactive mode ()
When using a `--non-interactive` renderer, it seems that we don't
respect the presence or absence of options like `--show-sames`, instead
displaying all rows. While we do correctly compute whether or not a row
should be hidden, we don't check that computed boolean later on unless
we are performing an interactive render. This commit fixes that.

Fixes 
2024-09-09 07:44:27 +00:00
Will Jones 391fe212ae
Document testing in `pulumi/pulumi` ()
This commit begins the process of documenting how we test Pulumi, from
unit tests to language conformance. It's nowhere near complete yet but
the various sections have been stubbed out and anything that already
exists has been ported over and linked to other new pages where
possible. This commit is a good example of where we can start to take
advantage of Sphinx's rich cross-references, placing e.g. `README.md`
files close to the things they document (such as
`cmd/pulumi-test-language/README.md`) while folding them into a larger
section on testing using Sphinx's tables of contents (TOCs).
2024-09-05 11:51:32 +00:00
Fraser Waters 49d8298e84
Fix snapshot integrity on pending replacement ()
This fixes a snapshot integrity issue with delete before replace, failed
creates, and multiple updates.

This was caused by https://github.com/pulumi/pulumi/pull/16510 where we
started removing pending replace resources as part of their replacement
steps. A well intentioned fix to stop trying to delete resources we'd
already deleted in a previous update.

The bug would manifest if the following happened:
1. An update goes to replace a resource with `deleteBeforeCreate` set.
It deletes the old resource (and saves the state as pending replace) but
then failed to create the new replacement resource.
2. A second update is run to try and create the replacement resource. At
this point the bug manifest and we delete the pending replace resource
from state, but then if the new resource fails to create again we end up
with an invalid snapshot.

The fix is very simple. If a resource is already pending replacement we
just don't issue a delete/remove step at all. The pending replace
resource will get cleaned up at the end by the create step creating the
new version.

Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
2024-09-04 10:52:43 +00:00
dropbigfish e5b28b7218
chore: fix some function names ()
Signed-off-by: dropbigfish <fillfish@foxmail.com>
Co-authored-by: Thomas Gummerer <t.gummerer@gmail.com>
2024-09-03 08:57:30 +00:00
Fraser Waters 7ab5fad61e
Use cached loader in RPC codegen functions ()
Also close the loader client connection when finished with it.
2024-09-01 05:12:16 +00:00
Will Jones 217187d9a8
Propagate deleted parents of untargeted resources ()
When using `--target` to target specific resources during an update, we
use the list of targets to decide which steps to generate given a set of
resource registrations. Specifically, *prior to #16247*:

* If a registration event named a resource that was targeted, we would
process it as usual.
* If a registration event named a resource that was not targeted, we
would emit a SameStep for it.

In , this behaviour was changed so that we'd consider the entire
dependency graph of an untargeted resource, in order to avoid missing
copying resources that had been removed from the program but not
targeted (and thus whose old states would be copied over later on,
potentially *after* the states of resources depending on them). As part
of this work, parent-child relationships were not handled. This did not
manifest in errors in the cases tested, but in the presence of aliases
and other edge cases, parent-child relationships *can* result in
snapshot integrity issues and thus must be handled too. This commit
implements those changes.

Fixes  (I believe)
2024-08-30 14:31:50 +00:00
Artur Laksberg 44a95c434a
Support PULUMI_SHOW_COPILOT_LINK to enable 'explainFailure' link () 2024-08-29 20:43:29 +00:00
Fraser Waters 153d830947
Add loader_target to RunRequest ()
This adds a schema loader target to the language run request, similar to
the generate project/package requests. This allows a language runtime to
connect back to the engine to load schema information in _exactly_ the
same way as the engine.

We'll want to switch YAML over to using this loader so that it will work
with paramaterized packages and conformance tests.
2024-08-29 09:14:39 +00:00
Artur Laksberg df7b3e8027
Add link to Copilot in CLI printDiagnostics ()
Add a deep link to Pulumi Console from CLI.
Details:
https://docs.google.com/document/d/1QJucNEnO4hcnhqwADyMYXEHMuyhrB6tyulpoxbYX7Xg/edit?usp=sharing
2024-08-28 18:43:01 +00:00
Julien eefd0ce92f
Use int32 in Go interfaces that map to protobufs using int32 ()
We use plain `int`, which can be 32 or 64 bit based on the system, in
some of the Go interfaces that map to protobuf using int32. Update the
Go interfaces to use int32 to prevent potential integer overflow.

These potential integer conversion overflows are flagged by the latest
golangci-lint. Once this PR is merged, we can upgrade our version of
golangci-lint, which will unblock us from using Go 1.23 in CI.

This will require some small updates in providers, mostly removing casts
`int32(...)`.
2024-08-28 13:45:17 +00:00
Thomas Gummerer bc744d2ae5
create unknowns when the provider is not known during construct ()
Currently when a provider is not known when trying to call construct, we
error out. A provider not being known can however legitimately happen
during a preview, for example when setting up an eks cluster, and then
using that kubeconfig to set up a kubernetes provider.

Users can currently work around that by using `--skip-preview`, and only
going through the actual up, where the provider is known. This is
however not a great user experience.

Return a specific error from the provider Construct call, which we can
then use in the resource monitor to fake constructing a resource with
all unknown outputs. This way the preview can still succeed, but any
dependencies will also end up being unknown.

Fixes https://github.com/pulumi/pulumi/issues/16331
2024-08-19 07:47:31 +00:00
Fraser Waters c9c30f0939
Ensure internal provider state doesn't clash with user config ()
Provider internal state is now separated from provider user config. This
allows providers to use a config key like "pluginDownloadURL" which
previously would have conflicted with the engines internal state. It
also allows the engine to add new internal state fields without concern
for clashing with existing or future provider config keys.

"version" is maintained as a root level key because providers already
look at this field, and we use it in conformance tests. However it's
been fixed to be the package version, not the plugin version. That's the
same for normal providers, but for a parameterised provider it will be
the version of the parameterised package, not the base plugin version.
As part of this I've made schema more strict that a provider can't
define its own version field. This would have clashed with the use of
version to set the plugin version to download, and is better set by
users by using the version resource option.

Fixes https://github.com/pulumi/pulumi/issues/16757.

---------

Co-authored-by: Will Jones <will@sacharissa.co.uk>
2024-07-30 12:22:32 +00:00
Fraser Waters 0061a577eb
Support explicit parameterized providers ()
source_eval wasn't looking at package refs for explicit provider
resources, only when building default providers. This meant that
explicit provider resources for parameterized packages would fail to
construct as they wouldn't be able to find their plugin.

This fixes source_eval to check for PackageRef and set all the required
fields from the provider request it maps to.
2024-07-29 14:34:44 +00:00
Will Jones 33e6d626cb
Test parameterized `Construct` support in the engine ()
Parameterization refers to the ability for a provider to vary its schema
based on a parameter that is passed to a new `Parameterize` call on the
provider interface. The package reference that is returned may then be
used to interact with the bespoke schema/packages within.
Paramterization is key to e.g. dynamically bridging providers. In this
instance we can manage and release a single "bridge" provider that
accepts a parameter defining the upstream provider to bridge, and
returns a reference to a dynamically constructed package whose schema
reflects the upstream as needed.

In this commit, we add test cases for `Construct`, the call which powers
multi-language components (MLCs). The nature of `Construct` means that
it already works by virtue of the changes made to `RegisterResource`
(where it is handled when `remote` is `true`) but this commit tests this
code path explicitly.
2024-07-29 08:44:05 +00:00
Will Jones 1afd76989c
Add parameterized `Call` support to the engine ()
Parameterization refers to the ability for a provider to vary its schema
based on a parameter that is passed to a new `Parameterize` call on the
provider interface. The package reference that is returned may then be
used to interact with the bespoke schema/packages within.
Paramterization is key to e.g. dynamically bridging providers. In this
instance we can manage and release a single "bridge" provider that
accepts a parameter defining the upstream provider to bridge, and
returns a reference to a dynamically constructed package whose schema
reflects the upstream as needed.

In this commit, we extend the engine so that `Call` calls can accept a
package reference and thus be parameterized.
2024-07-26 15:36:53 +00:00
Will Jones 5c0ea39b24
Update lifecycle test APIs to match provider APIs ()
In , provider methods were normalised to the form
`Method(context.Context, MethodRequest) (MethodResponse, error)`. This
commit makes the same changes to the lifecycle tests, both for
consistency and discoverability when writing stubs for provider methods.
2024-07-26 12:14:45 +00:00
Fraser Waters 9f7134039d
Add parameterized read support to the engine ()
This enables the engine to receive ReadResource for parameterized
providers.
2024-07-26 10:03:18 +00:00
Fraser Waters fcb0571b02
Fix default provider names for parameterized providers ()
These were using the base package version rather than the parameterized
package version.
2024-07-25 12:56:17 +00:00
Fraser Waters 0758303dc9
Add parameterized invoke support to the engine ()
This enables the engine to receive Invokes for parameterized providers.
2024-07-24 17:23:13 +00:00
Thomas Gummerer 0febb7b096
continue-on-error: fix integrity issues in up with changed dependencies ()
In `pulumi up --continue-on-error`, we have to make sure to not delete
resources that are still needed in the snapshot because an earlier step
failed. E.g. if B depends on A, and A fails to be deleted, we need to
make sure B is not deleted as well, otherwise we get a snapshot
integrity failure because B is missing A.

We do this by checking the dependencies of B in the *current program*,
e.g. only checking `dep.Res()`. However it is possible that the
dependencies in the new program are not the same.

For example, say resource A depends on provider B. Now we change the
program and resource A depends on provider C. We try to `pulumi up
--continue-on-error` the latter program, but the Update of resource A
fails. But because the new resource A only depends on C, and not on B,
we go ahead and delete B, resulting in a snapshot integrity failure,
because A has not been updated.

Fix this by also checking the `res.Old()` for dependencies, and not
deleting those either if `res` had a failure.

Note that this isn't an issue for creates/updates, because we only need
to check wheter a *new* resource is dependent on the existing resource.

Fixes https://github.com/pulumi/pulumi/issues/16720
2024-07-22 11:25:33 +00:00
Fraser Waters 99a4025866
Rename 'package' to 'packageRef' ()
See comment at
https://github.com/pulumi/pulumi/pull/16392#discussion_r1683541340 for
context.

We want to call this "thing" the same in each language ideally, and have
settled on 'packageRef' as the name for it, rather than 'package', or
'pkgref'.
2024-07-20 10:08:44 +00:00
Will Jones ba43cb6fc7
Don't set `PendingReplacement` until `Delete` succeeds ()
There are a number of use cases for `Delete` steps in a Pulumi
operation. Aside from direct deletions, where a resource has been
removed from a program, they are also a key component of _replacements_,
whereby a resource has changed in a manner where it cannot simply be
updated in the provider but instead must be completely removed and
reconstructed. In such cases, Pulumi offers two options:

* "Delete after replace", in which a new resource is first created
before the old one is deleted.
* "Delete before replace", in which the old resource is first deleted
before a replacement is created.

Delete-after-replace is the default, since where possible it is
typically the preferred option by users, enabling e.g. zero-downtime
deployments. However, there are cases where delete-after-replace is not
possible (e.g. because the new and old resources would clash in some
manner), and so delete-before-replace is an option that can be opted
into.

In cases where the deletion must happen first, we must be careful how we
handle the Pulumi state. Either or both of the delete and create calls
could fail, and we always want a state file that tells us how to resume
a failed call to yield the desired outcome. In the case of
delete-before-replace operations, a key component of resumable state is
the `PendingReplacement` field on a resource. `PendingReplacement`
indicates that a resource has been deleted in the provider, but that
this deletion is part of a replacement (and thus that a create call will
subsequently occur). In this way, the deleted resource can remain in the
state file throughout the operation, meaning that e.g. resources that
depend on the deleted resource won't have their dependencies violated
(causing a snapshot integrity error).

Alas, until this point, `PendingReplacement` was set unconditionally on
the creation of a delete-before-replace step, meaning that if the
provider delete failed, we'd elide the delete on a retry and end up with
a bad state failing snapshot integrity checks. This commit fixes this by
moving the setting of `PendingReplacement` inside `DeleteStep.Apply`, so
that it occurs only if the provider delete call succeeds. It also adds a
lifecycle test to test this case and hopefully guard against
regressions.

Fixes 
Part of 
2024-07-18 12:27:06 +00:00
Fraser Waters 5cbf4cf728
Extend the TestReplacementParameterizedProvider test ()
The test now checks that destroy works, and also that the parameterized
provider state is what we expect.

This required flipping around some of the state managment around reading
and writing provider inputs.
2024-07-15 08:33:36 +00:00
Thomas Gummerer 34b0409d07
continue-on-error: respect all dependency relationships ()
When running `pulumi up --continue-on-error`, we can't bring up new
resources that have dependencies that have failed, or have been skipped.
Since the resource would have a failed dependency, that dependency would
not be in the snapshot, resulting in a snapshot integrity failure. Also
it simply does not make sense to `up` a resource that has a failed
dependency.

We took care of that for the regular dependency relationship, however at
the time we missed doing the same for other types of dependencies,
namely parent-child relationships, deleted with relationships and
property dependencies.

This can result in snapshot integrity failures for example when a parent
fails to be created, but we still do the resource creation of the child,
such as what happened in https://github.com/pulumi/pulumi/issues/16638.

Fix this by skipping the step when a resource with any type of
dependency relationship fails or is skipped beforehand.

Fixes https://github.com/pulumi/pulumi/issues/16638
2024-07-12 09:42:57 +00:00
Thomas Gummerer b8226b7058
implement engine support for invoke transforms ()
We've recently introduced resource transforms, which allow users to
update any resource options and properties for the duration of a program
using a callback. We want to introduce similar functionality for Invokes
(and eventually also StreamInvokes, Read and Calls). This can help users
e.g. set default providers through transforms consistently for all
components.

While this PR only implements the engine parts of invoke transforms, the
API for this will look very similar to what the API for resource
transforms looks like. For example in TypeScript:

```
pulumi.runtime.registerInvokeTransform(args => {
[...]
});
```

---------

Co-authored-by: Will Jones <will@sacharissa.co.uk>
2024-07-11 16:01:44 +00:00
Fraser Waters a0e0208dd1
Change parameterization to be bytes based ()
Making this change has a couple of benefits.

Firstly (and honestly the main one) is it make codegen much simpler. We
just to emit a base64 string and/or other embedded byte array to the
generated SDK. Currently we need emit a full `proto.Value` expression
for each language, and that's not hard but it's also not trivial and
means more combinations of things per test per language.

Secondly it will allow providers to use more efficient encodings of
their parameter than JSON if there is one. I imagine some providers
might make the parameter value a protobuf message and parse that
(similar to what we do for transform functions), but they can easily
fallback to just treating the bytes as a JSON string if they want.

The only downside to this is the value is obfuscated in the generated
SDK and in the state file. Neither of those are really expected to be
viewed by users, so this feels like a minor loss.
2024-07-10 11:15:35 +00:00
Fraser Waters f079f6d7f5
Keep plugin values together in RegisterPackageRequest ()
Also improve some of the field and message naming while we've got a
chance before this becomes stable.
2024-07-08 10:55:08 +00:00
Thomas Gummerer 804c9bc7f3
fix hang when continue-on-error is used with import resource option ()
This is a very similar fix to
https://github.com/pulumi/pulumi/pull/16371, but for imports instead of
create and updates.

When using the import resource option with continue-on-error, and there
is a diff in the import, the import fais, but we still also return the
completion function from the `Apply` call. This results in the engine
trying to call `Done` twice, which in turn results in it running
indefinitely, trying to write the result to a channel that's no longer
being read from.

Fix this by not returning the completion function from `Apply` for the
ImportStep when there is an error.

(I went through the other steps as well to double check we don't need a
similar fix there, and it looks like they are all fine).

Fixes: https://github.com/pulumi/pulumi/issues/16570
2024-07-03 10:24:26 +00:00
Will Jones 37caaba2ff
Don't call `Diff` when refreshing external resources ()
External resources are resources whose state is tracked but not managed.
That is, they feature in the state file, but Pulumi does not issue
create, read or update operations for them. Typically they arise as the
result of generated `.get` functions and methods in Pulumi programs
(these should not be confused with specific `getX` functions/methods,
which are backed by `Invoke`s and designed to reflect data
sources/arbitrary function calls).

In , `refresh` operations were modified to use actual bonafide
`Diff` calls to compute differences in a manner that would reflect "what
will happen on the next `preview`". In general, this aligns better with
what users expect when running `refresh`, and fixes a number of issues
that have long plagued `refresh`. However, it falls down in the case of
external resources. The existing `Diff` contract takes (somewhat
strangely) _three_ key inputs:

* The old inputs ("what the program used to say")
* The old outputs ("what we got back from the last update")
* The new inputs ("what the program says now")

Intuitively, `Diff`'s job is to return the difference between the old
and new inputs. Aside from old outputs being a bit of a third wheel,
this contract makes sense for an `update` operation, where the inputs
are driving the outputs and changes there will yield changes elsewhere.
In a `refresh` operation, however, we have a different situation -- one
in which _both_ old and new inputs _and_ outputs are available. Alas, we
don't currently have a "four-argument `Diff`" (or a two-argument one we
can call twice with no fear of repercussions), and so the usage in
`refresh` is a little bit of a clever hack that ends up biasing inputs
over outputs (see  or the implementation for details).

For external resources, the old behaviour (where we just compared
outputs) seems like a better fit. This commit therefore adds this
special case and a test to ensure that when we see external resources,
we simply compare outputs and don't call `Diff`.

Fixes 
Fixes 

---------

Co-authored-by: Thomas Gummerer <t.gummerer@gmail.com>
2024-07-02 10:11:10 +00:00