This includes a few changes:
* The repo name -- and hence the Go modules -- changes from pulumi-fabric to pulumi.
* The Node.js SDK package changes from @pulumi/pulumi-fabric to just pulumi.
* The CLI is renamed from lumi to pulumi.
As explained in pulumi/pulumi-fabric#293, we were a little ad-hoc in
how configuration was "applied" to resource providers.
In fact, config wasn't ever communicated directly to providers; instead,
the resource providers would simply ask the engine to read random heap
locations (via tokens). Now that we're on a plan where configuration gets
handed to the program at startup, and that's that, and where generally
speaking resource providers never communicate directly with the language
runtime, we need to take a different approach.
As such, the resource provider interface now offers a Configure RPC
method that the resource planning engine will invoke at the right
times with the right subset of configuration variables filtered to
just that provider's package. This fixes pulumi/pulumi#293.
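A minimal sketch of the filtering step described above. It assumes configuration keys are tokens whose package is everything before the first ":"; the message does not spell out the key format, so that layout, the `filterConfig` helper, and the sample keys are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// filterConfig returns only the entries whose key belongs to pkg.
// Assumption: a key's package is everything before its first ":".
func filterConfig(all map[string]string, pkg string) map[string]string {
	out := make(map[string]string)
	for k, v := range all {
		if strings.HasPrefix(k, pkg+":") {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Hypothetical configuration handed to the program at startup.
	all := map[string]string{
		"aws:config:region":     "us-west-2",
		"azure:config:location": "westus",
	}
	// The engine would pass only this subset to the aws provider's Configure.
	fmt.Println(filterConfig(all, "aws"))
}
```

Matching on `pkg+":"` rather than on `pkg` alone keeps a package named `aws` from picking up keys that belong to a package such as `awsx`.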
We are renaming Lumi to Pulumi Fabric. This change simply renames the
pulumi/lumi repo to pulumi/pulumi-fabric, without the CLI tools and other
changes that will follow soon afterwards.
This change fixes a few things:
* Most importantly, we need to place a leading "." in the paths
to Gometalinter, otherwise some sub-linters just silently skip
the directory altogether. errcheck is one such linter, which
is a very important one!
* Use an explicit Gometalinter.json file to configure the various
settings. This flips on a few additional linters that aren't
on by default (like line length checking). Sadly, a few that
I'd like to enable take waaaay too much time, so in the future
we may consider a nightly job (this includes code similarity,
unused parameters, unused functions, and others that generally
require global analysis).
* Now that we're running more, however, linting takes a while!
The core Lumi project now takes 26 seconds to lint on my laptop.
That's not terrible, but it's long enough that we don't want to
do the silly "run them twice" thing our Makefiles were previously
doing. Instead, we shall deploy some $$($${PIPESTATUS[1]}-1))-fu
to rely on the fact that grep returns 1 on "zero lines".
* Finally, fix the many issues that this turned up.
I think(?) we are done, except, of course, for needing to drive
down some of the cyclomatic complexity issues (which I'm possibly
going to punt on; see pulumi/lumi#259 for more details).
This change enables parallelism for our tests.
It also introduces a `test_core` Makefile target to just run the
core engine tests, and not the providers, since they take a long time.
This is intended only as part of the inner developer loop.
Right now, we reject dashes in package names. I've hit this a few times and it annoys
me each time. (It would seem to make sense to permit hyphens in package names, given
that [almost?] every other package manager on Earth does...)
No more! Hyphens welcome!
This change redoes the way module exports are represented. The old
mechanism -- although laudable for its attempt at consistency -- was
wrong. For example, consider this case:
let v = 42;
export { v };
The old code would silently add *two* members, both with the name "v",
one of which would be dropped since the entries in the map collided.
It would be easy enough just to detect collisions, and update the
above to mark "v" as public, when the export was encountered. That
doesn't work either, as the following example demonstrates:
let v = 42;
export { v as w };
let x = w; // error!
This demonstrates:
* Exporting "v" with a different name, "w" to consumers of the
module. In particular, it should not be possible for module
consumers to access the member through the name "v".
* An inability to access the exported name "w" from within the
module itself. This is solely for external consumption.
Because of this, we will use an export table approach. The exports
live alongside the members, and we are smart about when to consult
the export table, versus the member table, during name binding.
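One way the two tables might fit together is sketched below. The `Module` type, its fields, and the two lookup methods are hypothetical names chosen for illustration, and an `int` stands in for a real symbol.

```go
package main

import "fmt"

// Module keeps members and exports in separate tables, so an export can
// rename a member without adding a second member entry.
type Module struct {
	Members map[string]int    // member name -> value (stand-in for a symbol)
	Exports map[string]string // exported name -> member name
}

// lookupInternal binds a name used inside the module: members only.
func (m *Module) lookupInternal(name string) (int, bool) {
	v, ok := m.Members[name]
	return v, ok
}

// lookupExternal binds a name used by a consumer: exports only.
func (m *Module) lookupExternal(name string) (int, bool) {
	member, ok := m.Exports[name]
	if !ok {
		return 0, false
	}
	v, ok := m.Members[member]
	return v, ok
}

func main() {
	// Models: let v = 42; export { v as w };
	m := &Module{
		Members: map[string]int{"v": 42},
		Exports: map[string]string{"w": "v"},
	}
	_, insideSeesW := m.lookupInternal("w")  // "w" is not visible inside
	_, outsideSeesV := m.lookupExternal("v") // "v" is not visible outside
	w, _ := m.lookupExternal("w")
	fmt.Println(insideSeesW, outsideSeesV, w) // false false 42
}
```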
In dynamic scenarios, property keys could be arbitrary strings, including
invalid identifiers. This change reflects that in the runtime representation
of objects and also in the object literal and overall indexing logic.
This change fixes a whole host of issues with our current token binding
logic. There are two primary aspects of this change:
First, the prior token syntax was ambiguous, due to our choice of
delimiter characters. For instance, "/" could be used both as a module
member delimiter and as a valid character within sub-module names.
The result is that we could not look at a token and know for certain
which kind it is. There was also some annoyance with "." being the
delimiter for class members in addition to being the leading character
for special names like ".this", ".super", and ".ctor". Now, we just use
":" as the delimiter character for everything. The result is unambiguous.
Second, the simplistic token table lookup really doesn't work. This is
for three reasons: 1) decorated types like arrays, maps, pointers, and
functions shouldn't need token lookup in the classical sense; 2) largely
because of decorated naming, the mapping of token pieces to symbolic
information isn't straightforward and requires parsing; 3) default modules
need to be expanded and the old method only worked for simple cases and,
in particular, would not work when combined with decorated names.
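To illustrate why a single ":" delimiter removes the ambiguity, here is a rough sketch that classifies a token purely by counting its ":"-separated parts. The layout `package[:module[:member[:classMember]]]`, the `kindOf` helper, and the sample tokens are assumptions for illustration; decorated types are deliberately left out, since, as noted above, they need real parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// kindOf classifies a token by how many ":"-delimited parts it has.
// Assumed layout: package[:module[:member[:classMember]]].
// It does not validate the individual parts.
func kindOf(tok string) string {
	switch len(strings.Split(tok, ":")) {
	case 1:
		return "package"
	case 2:
		return "module"
	case 3:
		return "module member"
	case 4:
		return "class member"
	default:
		return "invalid"
	}
}

func main() {
	// "/" only ever appears inside a name here, never as a delimiter.
	for _, tok := range []string{
		"aws",
		"aws:s3/bucket",
		"aws:s3/bucket:Bucket",
		"aws:s3/bucket:Bucket:.ctor",
	} {
		fmt.Printf("%s -> %s\n", tok, kindOf(tok))
	}
}
```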
This change refactors the interpreter hooks into a first class interface
with many relevant event handlers (including enter/leave functions for
packages, modules, and functions -- something necessary to generate object
monikers). It also includes a rudimentary start for tracking actual object
allocations and their dependencies, a step towards creating a MuGL graph.
This change dumps the evaluation state after evaluation completes, at
log-level 5. This includes which modules and classes were initialized,
in addition to the values for all global variables.
In addition to this, we rename a few things:
* Rename Object's Data field to Value.
* Rename the Object.T() methods to Object.TValue(). This more clearly
indicates what they are doing (i.e., fetching the value from the object)
and also avoids object.String() conflicting with fmt.Stringer's String().
* Rename Reference to Pointer, so it's consistent with everything else.
* Rename the GetValueReference/InitValueReference/etc. family of methods
to GetValueAddr/InitValueAddr/etc., since this reflects what they are
actually doing: manipulating a variable slot's address.
Now that we have introduced a full blown token map -- new as of just
a few changes ago -- we can start using it for all of our symbol binding.
This also addresses some order-dependent issues, like intra-module
references looking up symbols that have been registered in the token map
but not necessarily stored in the relevant parent symbols just yet.
Plus, frankly, it's much simpler and uses a hashmap lookup instead of
a fairly complex recursive tree walk.
I've kept the tree walk case, however, to improve diagnostics upon
failure. This allows us to tell developers, for example, that the reason
a binding failed was due to a missing package.
This change revives some compiler tests that are still lingering around
from the old architecture, before our latest round of ship burning.
It also fixes up some bugs uncovered during this:
* Don't claim that a symbol's kind is incorrect in the binder error
message when it wasn't found. Instead, say that it was missing.
* Do not attempt to compile if an error was issued during workspace
resolution and/or loading of the Mufile. This leads to trying to
load an empty path and badness quickly ensues (crash).
* Issue an error if the Mufile wasn't found (this got lost apparently).
* Rename the ErrorMissingPackageName message to ErrorInvalidPackageName,
since missing names are now caught by our new fancy decoder that
understands required versus optional fields. We still need to guard
against illegal characters in the name, including the empty string "".
* During decoding, reject !src.IsValid elements. This represents the
zero value and should be treated equivalently to a missing field.
* Do not permit empty strings "" as Names or QNames. The old logic
accidentally permitted them because regexp.FindString("") == "", no
matter the regex!
* Move the TestDiagSink abstraction to a new pkg/util/testutil package,
allowing us to share this common code across multiple package tests.
* Fix up a few messages that needed tidying or to use Infof vs. Info.
The binder tests -- deleted in this change -- are about to come back; however,
I am splitting up the changes, since this represents a passing fixed point.
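The empty-string pitfall mentioned in the list above is easy to reproduce. The sketch below assumes the old check compared the result of `FindString` against the input; the pattern and both helper names are illustrative rather than taken from the actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative identifier pattern; the real one may differ.
var nameRegexp = regexp.MustCompile(`[A-Za-z_][A-Za-z0-9_]*`)

// isNameBuggy accepts "": FindString returns "" when nothing matches,
// and "" == "" holds no matter what the regex is.
func isNameBuggy(s string) bool {
	return nameRegexp.FindString(s) == s
}

// isName rejects the empty string explicitly before comparing.
func isName(s string) bool {
	return s != "" && nameRegexp.FindString(s) == s
}

func main() {
	fmt.Println(isNameBuggy(""), isName(""), isName("foo"), isName("1foo"))
	// prints: true false true false
}
```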
This change looks up the main module from a package, and that module's
entrypoint, when performing evaluation. In any case, the arguments are
validated and bound to the resulting function's parameters.
This change eliminates the scope-based symbol table. Because we now
require that all module, type, function, and variable elements are
encoded as fully qualified tokens, there is no need for the scope-based
lookups. Instead, the languages themselves decide how the names bind
to locations and just encode that information directly.
The scope is still required for local variables, however, since those
don't have a well-defined "fixed" notion of name. This is also how
we will ensure the evaluator stores values correctly -- including
discarding them -- in a lexically scoped manner.
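A lexically scoped store for locals could be as small as a chain of maps, as in the sketch below. The `Scope` type and its methods are hypothetical and use `int` values in place of real runtime objects.

```go
package main

import "fmt"

// Scope is one lexical frame of local variables, chained to its parent.
type Scope struct {
	parent *Scope
	vars   map[string]int
}

// NewScope opens a frame nested inside parent (nil for the outermost).
func NewScope(parent *Scope) *Scope {
	return &Scope{parent: parent, vars: make(map[string]int)}
}

// Set stores a value in the current frame only.
func (s *Scope) Set(name string, v int) { s.vars[name] = v }

// Lookup searches the current frame first, then each enclosing frame.
func (s *Scope) Lookup(name string) (int, bool) {
	for cur := s; cur != nil; cur = cur.parent {
		if v, ok := cur.vars[name]; ok {
			return v, true
		}
	}
	return 0, false
}

// Pop leaves the frame; its values become unreachable and are discarded.
func (s *Scope) Pop() *Scope { return s.parent }

func main() {
	outer := NewScope(nil)
	outer.Set("x", 1)

	inner := NewScope(outer)
	inner.Set("y", 2)
	fmt.Println(inner.Lookup("x")) // found through the parent frame

	back := inner.Pop()
	fmt.Println(back.Lookup("y")) // no longer visible after leaving the frame
}
```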
This change completes my testing of decorator parsing for now. It tests the token
`*[]map[string]map[()*(bool,string,test/package:test/module/Crazy)number][][]test/package:test/module/Crazy`.
This turned up some bugs, most notably in the way we returned the "full" token for
the parsed types. We need to extract the subset of the token consumed by the parsing
routine, rather than the entire thing. To do this, we introduce a tokenBuffer type
that allows for convenient parsing of tokens (eating, advancing, extraction, etc).
This isn't comprehensive yet; however, it caught two bugs:
1. parseNextType should operate on "rest" in most cases, not "tok".
2. We must eat the "]" map separator before moving on to the element type.
Part of the token grammar permits so-called "decorated" types. These
are tokens that are pointer, array, map, or function types. For example:
* `*any`: a pointer to anything.
* `[]string`: an array of primitive strings.
* `map[string]number`: a map from strings to numbers.
* `(string,string)bool`: a function with two string parameters and a
boolean return type.
* `[]aws:s3/Bucket`: an array of objects whose class is `Bucket` from
the package `aws` and its module `s3`.
This change introduces this notion into the parsing and handling of
type tokens. In particular, it uses recursive parsing to handle complex
nested structures, and the binder.bindTypeToken routine has been updated
to call out to these as needed, in order to produce the correct symbol.
This change includes some tests for token parsing and conversions. It
also fixes a bug where we treated Type tokens like ClassMembers, when
we ought to have been treating them like ModuleMembers.
This change performs typechecking during binding. This is less about
typechecking per se -- since higher level languages will have presumably
given us well-typed IL -- and more about preparing the AST so that we
can evaluate the fully bound nodes to produce a MuGL graph. It also
serves as a "verifier" for the incoming MuIL, however.
This is clearly incomplete, as the dozens of TODOs will make obvious.
But it's a clean checkpoint that does enough interesting typechecking
that I am landing it now.
This change rearranges the old way we dealt with URLs. In the old system,
virtually every reference to an element, including types, was fully qualified
with a possible URL-like reference. (The old pkg/tokens/Ref type.) In the
new model, only dependency references are URL-like. All maps and references
within the MuPack/MuIL format are token and name based, using the new
pkg/tokens/Token and pkg/tokens/Name family of related types.
As such, this change renames Ref to PackageURLString, and RefParts to
PackageURL. (The convenient name is given to the thing with "more" structure,
since we prefer to deal with structured types and not strings.) It moves
out of the pkg/tokens package and into pkg/pack, since it is exclusively
there to support package resolution. Similarly, the Version, VersionSpec,
and related types move out of pkg/tokens and into pkg/pack.
This change cleans up the various binder, package, and workspace logic.
Most of these changes are a natural fallout of this overall restructuring,
although in a few places we remained sloppy about the difference between
Token, Name, and URL. Now the type system supports these distinctions and
forces us to be more methodical about any conversions that take place.
I was sloppy in my use of names versus tokens in the original AST.
Now that we're actually binding things to concrete symbols, etc., we
need to be more precise. In particular, names are just identifiers
that must be "interpreted" in a given lexical context for them to
make any sense; whereas, tokens stand alone and can be resolved without
context other than the set of imported packages, modules, and overall
module structure. As such, names are much simpler than tokens.
As explained in the comments, tokens.Names are simple identifiers:
Name = [A-Za-z_][A-Za-z0-9_]*
and tokens.QNames are fully qualified identifiers delimited by "/":
QName = [ <Name> "/" ]* <Name>
The legal grammar for a token depends on the subset of symbols that
token is meant to represent. However, the most general case, that
accepts all specializations of tokens, is roughly as follows:
Token = <Name> |
        <PackageName>
            [ ":" <ModuleName>
                [ "/" <ModuleMemberName>
                    [ "." <ClassMemberName> ]
                ]
            ]
where:
PackageName = <QName>
ModuleName = <QName>
ModuleMemberName = <Name>
ClassMemberName = <Name>
Please refer to the comments in pkg/tokens/tokens.go for more details.
This change further merges the new AST and MuPack/MuIL formats and
abstractions into the core of the compiler. A good amount of the old
code is gone now; I decided against ripping it all out in one fell
swoop so that I can methodically check that we are preserving all
relevant decisions and/or functionality we had in the old model.
The changes are too numerous to outline in this commit message;
however, here are the noteworthy ones:
* Split up the notion of symbols and tokens, resulting in:
- pkg/symbols for true compiler symbols (bound nodes)
- pkg/tokens for name-based tokens, identifiers, constants
* Several packages move underneath pkg/compiler:
- pkg/ast becomes pkg/compiler/ast
- pkg/errors becomes pkg/compiler/errors
- pkg/symbols becomes pkg/compiler/symbols
* pkg/ast/... becomes pkg/compiler/legacy/ast/...
* pkg/pack/ast becomes pkg/compiler/ast.
* pkg/options goes away, merged back into pkg/compiler.
* All binding functionality moves underneath a dedicated
package, pkg/compiler/binder. The legacy.go file contains
cruft that will eventually go away, while the other files
represent a halfway point between new and old, but are
expected to stay roughly in the current shape.
* All parsing functionality is moved underneath a new
pkg/compiler/metadata namespace, and we adopt new terminology
"metadata reading" since real parsing happens in the MetaMu
compilers. Hence, Parser has become metadata.Reader.
* In general, phases of the compiler no longer share access to
the actual compiler.Compiler object. Instead, shared state is
moved to the core.Context object underneath pkg/compiler/core.
* Dependency resolution during binding has been rewritten to
the new model, including stashing bound package symbols in the
context object, and detecting import cycles.
* Compiler construction does not take a workspace object. Instead,
creation of a workspace is entirely hidden inside of the compiler's
constructor logic.
* There are three Compile* functions on the Compiler interface, to
support different styles of invoking compilation: Compile() auto-
detects a Mu package, based on the workspace; CompilePath(string)
loads the target as a Mu package and compiles it, regardless of
the workspace settings; and, CompilePackage(*pack.Package) will
compile a pre-loaded package AST, again regardless of workspace.
* Delete the _fe, _sema, and parsetree phases. They are no longer
relevant and the functionality is largely subsumed by the above.
...and so very much more. I'm surprised I ever got this to compile again!