Tuesday, October 23, 2018

Puppet 6 Type System - Object and Custom Data Types

Object and Custom Data Types in Puppet

Type System In Retrospect

In 2014 and 2015, when I was busy implementing the Puppet Type System, I was not sure how it would be received. Today I am very happy with how it turned out: it has been very well received and is now used extensively in modules for both Puppet and Bolt. (I am talking about data types like Integer, String, Hash, Array, parameterized types like Array[String], plus all the more specialized data types. This blog post is not about those - you can read about all of them in the official Puppet documentation on data types.) This post is about something much more exciting. Jumping back in history a bit…

A couple of things were bothering us:

  • There was no way to extend the type system with custom data types; you really would have to contribute the data type to Puppet’s code base for that to work.
  • We were using RGen for meta modeling (how to describe a model using another model (“meta model”)). RGen is an implementation of UML/Ecore metamodeling, and we were not super happy with the performance and implications of using it. The primary use case for us was to model the > 100 classes in the Puppet AST; the data structure being the result of parsing Puppet Language Logic. While RGen is great, it was not a perfect fit.
  • Serialization in Puppet just sucked in general, and was especially difficult to use with data types not having a 1:1 representation in JSON.

As we were discussing this back and forth (“we” being me and Thomas Hallgren at Puppet), he came up with the brilliant idea to implement meta modeling based on the Puppet Type System, and that we should replace RGen with our own implementation rather than trying to fit UML/Ecore style modeling into the Puppet type system. One major incompatibility and headache in trying to marry the type system with RGen was that the type system assumes immutability and RGen/Ecore does not. Further, Ecore sprang more or less from the Java type system, and while Java/Ecore has generics, they are nowhere near as powerful as the Puppet Type System’s parameterized types.

In 2015 and 2016 we worked out the design for what we call Pcore - a term we now use as the name of the Puppet type system. The specification for Pcore turned out to be a major opus with lots of things to work out and explain. While I was working on that, Thomas Hallgren did a herculean job on the Pcore implementation and the related serialization protocols (i.e. my already brilliant implementation 😎 got even more so as a result).

We had two early use cases for Pcore; we built the puppet generate types feature for environment isolation using Pcore, and we had one internal project using it to model network devices. Our major use case was however to update the Puppet AST from using RGen to using Pcore. That was committed in February 2016 - and now the Puppet Language AST became implemented in the Puppet Language 🤯- read the source of ast.pp here. In Puppet 5.0.0 we switched and dropped the use of RGen.

In the Puppet 5.x time frame it was possible to experiment with the features by using rich_data=true in the configuration - but this only worked for puppet apply and for puppet resource, you still could not use this in an agent/master scenario. And, while it worked to send rich data to PDB, it was not exactly what we wanted.

Now in the fall of 2018, with the Puppet 6.0.0 release, rich_data=true is the default and the work we started in 2015 can finally be used! 🎉

Earlier, it was not terribly meaningful to blog about the wonderful things you can do with Pcore since - well, you could not really use it in practice. But now you can!

A Blog Series about Pcore

I intend to blog about Pcore in a series of posts - this being the first.

There are a lot of things to cover, as you can see if you go and read the quite long specification (74 pages), but I am going to take a more pragmatic approach and show useful examples rather than serving you reference material. I also have a lot of work to do taking the Pcore specification in its current form and turning it into a more formal specification for the puppet-specifications repository. That work is quite tedious, so I am going to mix it with blogging about the features.

The Object Data Type

At the heart of the type system there is the Object data type. You can create one in Puppet if you like, or in Ruby. It can be a simple object only having attributes, or a more complex one also supporting callable methods.

A Car data type in Puppet

For simplicity this example is a single manifest. Your real data types in Puppet should be placed in files of the form <moduleroot>/types/<typename>.pp, as that makes them autoloaded.

In example1.pp:

type MyModule::Car = Object[{
  attributes => {
    reg_nbr => String,
    color => String,
  }
}]
$my_car = MyModule::Car('abc123', 'pink')
notice $my_car

Notices the car:

$ puppet apply example1.pp
Notice: Scope(Class[main]): MyModule::Car({'reg_nbr' => 'abc123', 'color' => 'pink'})

You can use code like this while compiling. Puppet will even autoload the data type just like it does with type aliases - i.e. something like type MyType = Array[String]. You can however not yet use such a data type on the agent side because there is no pluginsync of data types defined in the Puppet Language. If you try you will get an error like this:

Could not intern from rich_data_json: No implementation mapping found for Puppet Type MyModule::Car

It does however work if your data type is implemented in Ruby since everything under lib/puppet in your module is synced to the agent! Let’s implement the same data type in Ruby.

A Car data type in Ruby

In <mymodule>/lib/puppet/datatypes/car.rb:

Puppet::DataTypes.create_type('MyModule::Car') do
  interface <<-PUPPET
    attributes => {
      reg_nbr => String,
      color => String,
    }
  PUPPET
end

A note about file location: as you see it is under lib/puppet/datatypes since lib/puppet/types is for resource types (for historical reasons).
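To make the two file locations concrete, here is a small plain-Ruby sketch that maps a qualified data type name to the paths described in this post. The helper expected_type_paths is hypothetical (it is not a Puppet API), and it assumes that the module directory is simply the lowercased first name segment.

```ruby
# Illustrative only: maps a qualified data type name to the file locations
# described in this post. The helper name is hypothetical, not a Puppet API.
# Assumption: the module directory is the lowercased first name segment.
def expected_type_paths(qualified_name)
  module_name, *rest = qualified_name.split('::').map(&:downcase)
  relative = rest.join('/')
  {
    # Data type defined in the Puppet Language
    puppet: File.join(module_name, 'types', "#{relative}.pp"),
    # Data type defined in Ruby (synced to agents since it is under lib/puppet)
    ruby: File.join(module_name, 'lib', 'puppet', 'datatypes', "#{relative}.rb")
  }
end

paths = expected_type_paths('MyModule::Car')
puts paths[:puppet] # mymodule/types/car.pp
puts paths[:ruby]   # mymodule/lib/puppet/datatypes/car.rb
```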

Now we try applying a manifest using that - site.pp:

notify { "example":
  message => MyModule::Car("abc123", "pink")
}

Which we can try out most easily with apply:

puppet apply site.pp

Which results in this:

Notice: /Stage[main]/Main/Notify[example]/message: defined 'message' as MyModule::Car({
  'reg_nbr' => 'abc123',
  'color' => 'pink'
})

You can try that with an agent as well - you should get the same result.

If you look inside the catalog - the notify looks like this:

    {
      "type": "Notify",
      "title": "example",
      "tags": [
        "notify",
        "example",
        "class"
      ],
      "line": 8,
      "exported": false,
      "parameters": {
        "message": {
          "__ptype": "MyModule::Car",
          "reg_nbr": "abc123",
          "color": "pink"
        }
      }
    }

This is the rich_data serialization format which is a “Pcore in human readable JSON” serialization. If you want to learn everything there is to know about serialization and the rich-data format look at the specification for Pcore Data Representation.
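As a rough illustration of the shape of that format, the following plain-Ruby sketch reads the message value from the catalog above and separates the __ptype tag from the attribute values. It only uses Ruby's json library; Puppet's own deserialization does much more than this, so treat it as a way to look at the data, not as how Puppet does it.

```ruby
require 'json'

# The "message" parameter value as it appears in the catalog above.
serialized = <<~JSON
  {
    "__ptype": "MyModule::Car",
    "reg_nbr": "abc123",
    "color": "pink"
  }
JSON

data = JSON.parse(serialized)

# Separate the type tag from the attribute values.
type_name  = data['__ptype']
attributes = data.reject { |key, _value| key == '__ptype' }

puts type_name              # MyModule::Car
puts attributes.keys.inspect # ["reg_nbr", "color"]
```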

An Object data type with methods

In the Puppet Language you cannot yet implement methods of an Object data type. While it is possible to specify the interface for methods in Puppet, the data type cannot be used unless there is an implementation available for the methods.

We can do this in Ruby however. There are a couple of options:

  • The implementation can be done inside the code block given to create_type. This is what I am showing in this blog post.
  • The implementation can be any Ruby class that implements (at least) the interface.
  • The implementation can autoload the implementation from inside the module’s lib. (This should be the last resort as you must use the same version in all environments).

Defining methods in implementation

Methods for instances of the data type are easily added inside
a block given to a call to the implementation method.

The first kind of method I am showing is one that is needed when we declare an attribute to be of kind derived. Note that the specification for the attribute age is now a hash with more details than just the type. The kind derived means that the value of the attribute is computed/derived from other attributes and that it cannot be given when creating an instance of the data type. Since it needs to be computed, there must be an implementation of that computation.

Puppet::DataTypes.create_type('MyModule::Person') do
  interface <<-PUPPET
    attributes => {
      name => String,
      year_of_birth => Integer,
      age => { type => Integer, kind => derived }
    }
  PUPPET

  implementation do
    def age
      # Time is part of Ruby core, so no extra require is needed
      Time.now.year - @year_of_birth
    end
  end
end

We can use that in a manifest like this:

$p = MyModule::Person('Henrik', 1959) # yeah, that old...
notice "Name: ${p.name}, Age: ${p.age}"

As you may have figured out, with this approach, Pcore will automatically provide a constructor and methods to get the attributes - all we had to do was to supply the missing age computation.

The constructor takes either positional arguments, given in the order
they are specified in the interface, or a Hash of attribute name to value.
Thus we can create the same Person like this: MyModule::Person('name' => 'Henrik', 'year_of_birth' => 1959)
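If you want a mental model for the two calling styles, this plain-Ruby toy class accepts either positional values or a single Hash keyed by attribute name. It is a stand-in written for this explanation, not the constructor Pcore generates, and the class and constant names are made up.

```ruby
# Toy stand-in for a generated constructor; not Pcore's implementation.
class PersonSketch
  ATTRIBUTES = %i[name year_of_birth].freeze
  attr_reader(*ATTRIBUTES)

  def initialize(*args)
    values =
      if args.size == 1 && args.first.is_a?(Hash)
        # Hash form: look up each attribute by its name.
        ATTRIBUTES.map { |attr| args.first.fetch(attr.to_s) }
      else
        # Positional form: values arrive in interface order.
        unless args.size == ATTRIBUTES.size
          raise ArgumentError, "expected #{ATTRIBUTES.size} arguments, got #{args.size}"
        end
        args
      end
    @name, @year_of_birth = values
  end
end

by_position = PersonSketch.new('Henrik', 1959)
by_name     = PersonSketch.new({ 'name' => 'Henrik', 'year_of_birth' => 1959 })
puts by_position.name == by_name.name # true
```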

We can use this to add additional methods - they must be specified in the interface if you want them to be available in the Puppet Language. Methods
not specified in the interface are still available to the Ruby code.

Defining functions in the interface

In order to enable calling methods on a data type (other than those implied
by the attributes, and the general API of all objects) they must be defined in the
data type’s interface.

Puppet::DataTypes.create_type('MyModule::Image') do
  interface <<-PUPPET
    attributes => {
      image_url => URI,
    }
    functions => {
      # resize is an operation that takes two integers (min 1)
      # for x, and y, and returns a new MyModule::Image
      # for the resized result.
      resize => Callable[[Integer[1], Integer[1]], MyModule::Image],
      # image_bytes returns a Binary containing the image
      image_bytes => Callable[[], Binary]
    }
  PUPPET

  implementation do
    def resize(x, y)
      # an imaginary service uploads the image, resizes
      # it, and provides an url to the resized image
      new_url = SomeService::process(@image_url, 'resize', x, y)
      # Return a new MyModule::Image based on the new url
      self.class.new(new_url)
    end
    def image_bytes()
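      # an imaginary service gets the image as Base64 encoded string
      bits_base_64 = SomeService::process(@image_url, 'get')
      # Binary(...) is Puppet Language syntax; from Ruby, build the value
      # with Puppet's Binary class instead
      Puppet::Pops::Types::PBinaryType::Binary.from_base64(bits_base_64)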
    end
  end
end

Summary

This post introduced the Object data type and showed how it is defined in Puppet and in Ruby, and how the Ruby implementation also allows defining behavior in methods that can be used from the Puppet Language.

Objects with methods provide a richer extension mechanism for Puppet than functions do. When they are implemented with the provided support, they are completely (Puppet) environment friendly, since each environment can have a different version of the implementation. (Still, any external gems you require must be the same for all environments.)

While there is a lot more to say about how you can specify attributes, their data type, default values, derived values, and how to define operations/methods, and how to map an object data type to an existing Ruby class - I hope this blog post gives you enough to be able to experiment.

Look out for more posts in this series.

Monday, October 15, 2018

Puppet PAL wants to be your friend

Puppet PAL wants to be your friend.

PAL stands for Puppet As-a Library and it is a new Ruby API in Puppet giving an application written in Ruby access to an API for Puppet Language related operations ranging from full scale features such as compiling a catalog to fine grained parsing and evaluating Puppet Language logic.

PAL was introduced as an experimental feature in the 5.x series (primarily to support Bolt). Now with both Puppet 6.0 and Bolt 1.0 having been released the experimental status of PAL is lifted and it will now follow Semver. And - it is about time this post got written to make the features of PAL more widely known.

This first blog post introduces PAL and contains reference material for its use. I will come back with more posts with additional examples as this blog post is already quite long…

Yet another API ?

You may ask why PAL is needed when Puppet already has APIs for (almost) everything. I would characterize the problem like this: the existing APIs are either too high level or too low level:

  • the high level APIs are not flexible enough - sure you can ask for a catalog just like the agent does, but you have very little say over how that is done and it is very hard to mix in your custom variations.
  • the lower level APIs naturally work, but using them is like getting a dump of Lego pieces to assemble any way you like.

As a result of this, those that wanted some kind of variation of a “puppet apply”, or “puppet master compile” application would typically copy long sequences of code from one of the implementations in Puppet (yes there are several). This creates a problem because it also means copying bugs, and missing features and then having to play catch up whenever the implementation in Puppet changes.

A design goal for PAL was to come up with an API that would work even if the underlying implementation of Puppet was written in another language, or for a remote service. That in turn means that PAL cannot expose the underlying implementation classes directly to the user of the API.

I think we succeeded with the ambitions for PAL, but as always time constraints required us to make a couple of trade-offs. The one that comes to mind first is that PAL still requires the Puppet settings system to be initialized, so it is not free from concerns from the rest of Puppet. A number of helper classes used in Puppet do not have wrappers, and it did not make sense to create those. They may need to change in some distant future; at this point it is, if anything, a bit strange/ugly/confusing to see the odd class from deeper down in the puppet module hierarchy popping up in PAL. Notably, an API for querying the catalog is missing (the Catalog does have an API, but it exposes your logic to many implementation details). We wish to fix these things in future versions of PAL.

A Conceptual View of Puppet Internals

The following graph is an illustration of what is going on inside Puppet when a catalog is being compiled (or, for that matter, when doing something as seemingly trivial as getting the result of a Puppet Language expression like 1+1).

[Figure: graph of the parts involved in a compilation. Nodes: Node, Environment, Facts, TopScope, Settings, Code, Compiler, ModulePath, Hiera, Modules, Certificate, Evaluator, Parser, Lexer, Result, Catalog, EppEvaluator, AST, Context. Edge labels: “defines what is loadable”, “produces”, “with side effect”, “evaluates”.]

PAL is an API that abstracts this complex internal configuration. While the parts have their own API it is difficult to assemble them correctly (and in the right order). (Note that the graph is a simplification as many of the arrows are bidirectional).

The Context requires a note as it is something that exists in the Puppet Implementation - it is simply a way to set and override what can be thought of as global variables - key/value bindings that can be obtained anywhere inside the code in a particular context. The context is used to enable access to things that would otherwise have to be passed around in every call inside Puppet.
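The idea can be sketched in a few lines of plain Ruby: a stack of key/value bindings where an inner override shadows the outer value for the duration of a block. The MiniContext class and the key used below are made up for this illustration and are not Puppet's implementation.

```ruby
# Toy model of a context: a stack of key/value bindings where inner
# overrides shadow outer ones. Class and key names are hypothetical.
class MiniContext
  def initialize
    @stack = [{}]
  end

  # Push bindings for the duration of the block, then restore the previous ones.
  def override(bindings)
    @stack.push(@stack.last.merge(bindings))
    begin
      yield
    ensure
      @stack.pop
    end
  end

  # Look up a binding that is visible in the current (innermost) context.
  def lookup(key)
    @stack.last.fetch(key) { raise KeyError, "no binding for #{key}" }
  end
end

ctx = MiniContext.new
seen = []
ctx.override({ current_environment: 'outer' }) do
  seen << ctx.lookup(:current_environment)
  ctx.override({ current_environment: 'inner' }) do
    seen << ctx.lookup(:current_environment)
  end
  seen << ctx.lookup(:current_environment)
end
puts seen.inspect # ["outer", "inner", "outer"]
```

Code anywhere inside the block can call lookup without the value having been passed down through every method call, which is the point of having a context.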

Script and Catalog Compilers

PAL has the concept of a Compiler - being either a ScriptCompiler or a CatalogCompiler. As you can guess, the catalog compiler produces a Catalog, and the script compiler does not. The script compiler is more lightweight and allows use of tasks, plans, and the apply keyword but not any of the catalog building expressions (except when they are inside an apply clause).

While some operations can be done with PAL directly, you almost always will need one of the compilers.

Examples

Evaluating a string from the command line

This small sample is all that is needed to evaluate a string of Puppet Language logic given on the command line (similar to what a puppet apply -e does):

eval_arg_script.rb:

require 'puppet_pal'
Puppet.initialize_settings
result = Puppet::Pal.in_tmp_environment('pal_env',
  modulepath: [],
  facts: {}
  ) do |pal|
    pal.with_script_compiler {|c| c.evaluate_string(ARGV[0])}
  end
puts result

Let’s try it out on the command line:

bundle exec ruby eval_arg_script.rb '1+1'
2

Note: I am leaving out all things related to setting up an environment
with puppet and its dependencies, getting a Ruby of a particular
version etc. etc. as that requires a series of blog posts on its own. I have rbenv
installed, I run puppet from source, and I use bundle install (or update) as I shift
between puppet versions. You will most likely install puppet as a gem and use that. (Note that puppet_pal comes from the puppet gem - there is another gem that has nothing to do with this PAL that is named puppet_pal.)

Here is a breakdown of the example:

require 'puppet_pal'

Here PAL is required, and it will in turn require puppet. This is done this way since right now a require 'puppet' will require almost everything inside puppet, and we may modify that so only the relevant parts of puppet are required when using PAL.

Puppet.initialize_settings

Sadly, this is needed as we did not have time to change the puppet code base to get values from settings in such a way that they can be given to PAL. Thus, a full initialization of the settings is required. This in turn requires a configured puppet installation - from which the settings are read.

result = Puppet::Pal.in_tmp_environment('pal_env',

Here we are telling PAL that we are going to do things in a temporary environment. We let PAL create a temporary location for an environment that we name pal_env. This environment will be empty. As you will see later there are other ways of specifying an environment to operate in. The name of the environment is not really important here, but you may want to avoid production just to make it not be confused with the environment with the same name that is default in Puppet.

  modulepath: [],
  facts: {}

Here we give the environment two important inputs - we don’t have any modules we want to use anywhere so we use an empty array. We also initialize the facts to an empty hash - this is done to speed up loading as PAL runs facter to obtain the facts if they are not specified. This can take something like 0.5-1sec. The downside is naturally that $facts will be empty. There are other ways to specify the facts. As you can see in the diagram, a node is actually required in most situations - and in our simple example we did not specify anything related to node - and PAL will then assume that the host the script is running on is the node to use. Thus, in the example we get “localhost” (whatever its name is), and an empty set of facts. More about this later.

  ) do |pal|
    pal.with_script_compiler {|c| c.evaluate_string(ARGV[0])}
  end

Here we give a lambda to the call to in_tmp_environment, it gets an instance of PAL as its argument - pal thus represents the environment in which we are going to be doing something. We then call with_script_compiler to get a script compiler, and it takes a lambda which is called with an instantiated compiler - thus c is our interface to getting things done. We call evaluate_string with ARGV[0] (the puppet language string from the command line). The evaluate_string will lex and parse, and validate the resulting AST before evaluating it. The result is returned. And we are back at:

result = Puppet::Pal.in_tmp_environment('pal_env',

We now have the result, and the script ends with:

puts result

Which prints the result (the output “2” in the example above).

Getting a catalog in JSON

Now, a slightly more elaborate example where we want the Catalog that is built as a side effect of evaluating Puppet Language logic. We will now use the catalog compiler instead of the script compiler and we want the built Catalog in JSON as a result. In to_catalog.rb:

require 'puppet_pal'
Puppet.initialize_settings
result = Puppet::Pal.in_tmp_environment('pal_env', modulepath: [], facts: {}) do |pal|
  pal.with_catalog_compiler do |c|
    c.evaluate_string(ARGV[0])
    c.compile_additions # eval lazy constructs and validate again
    c.with_json_encoding { |encoder| encoder.encode }
  end
end
puts result

As you can see this has the same structure. Here are the details for the differences:

pal.with_catalog_compiler do |c|

Here we use with_catalog_compiler instead of with_script_compiler since we want a catalog to be built. The next line is the same - it evaluates the argument string.

c.compile_additions # eval lazy constructs and validate again

Then we call compile_additions to make PAL evaluate all lazy constructs and expected subsequent side effects to the catalog that were introduced by the call to evaluate_string. For example, if the evaluated logic declares a user defined type, that resource would not be evaluated unless compile_additions was called.

As you will see later there are other ways to specify the puppet logic “the code” to evaluate that does not require compile_additions to be called. It is only required when evaluating extra snippets of logic like in this example.

What actually happens in the example is that when the string is evaluated there is already an almost empty compiled catalog, and compile_additions integrates the side effects of the just evaluated string into that catalog.

Since compile_additions also validates the result for dangling resource references, any references to resources that are not yet in the catalog when it is called raise an error.
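To show what a dangling reference means, here is a toy check in plain Ruby that reports edge targets that are missing from a list of resource references. It is only a sketch of the concept with made-up data and a hypothetical helper name; PAL's actual validation works on the real catalog.

```ruby
# Toy check, not PAL's validation: report edge targets that are
# missing from the list of resources in the catalog.
def dangling_references(resources, edges)
  edges.map { |edge| edge[:target] }.reject { |ref| resources.include?(ref) }
end

resources = ['Stage[main]', 'Class[main]', 'Notify[awesome]']
edges = [
  { source: 'Class[main]', target: 'Notify[awesome]' },
  { source: 'Notify[awesome]', target: 'File[/tmp/later]' } # never declared
]

dangling = dangling_references(resources, edges)
puts dangling.inspect # ["File[/tmp/later]"]
```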

c.with_json_encoding { |encoder| encoder.encode }

This gets a “json encoder” for the catalog. This encoder’s encode will produce the desired JSON representation of the catalog. By default the result is a pretty printed JSON string. Since this is the last thing in the block, that string becomes the result, and it is assigned to result. At the very end this is output to stdout with puts.

So, when we try this out on the command line:

bundle exec ruby to_catalog.rb 'notify { "awesome": }'

We get this output:

{
  "tags": [
    "settings"
  ],
  "name": "example.com",
  "version": 1539340088,
  "code_id": null,
  "catalog_uuid": "7d80fa68-05eb-4684-93e2-6f61529b7571",
  "catalog_format": 1,
  "environment": "production",
  "resources": [
    {
      "type": "Stage",
      "title": "main",
      "tags": [
        "stage",
        "class"
      ],
      "exported": false,
      "parameters": {
        "name": "main"
      }
    },
    {
      "type": "Class",
      "title": "Settings",
      "tags": [
        "class",
        "settings"
      ],
      "exported": false
    },
    {
      "type": "Class",
      "title": "main",
      "tags": [
        "class"
      ],
      "exported": false,
      "parameters": {
        "name": "main"
      }
    },
    {
      "type": "Notify",
      "title": "awesome",
      "tags": [
        "notify",
        "awesome",
        "class"
      ],
      "line": 1,
      "exported": false
    }
  ],
  "edges": [
    {
      "source": "Stage[main]",
      "target": "Class[Settings]"
    },
    {
      "source": "Stage[main]",
      "target": "Class[main]"
    },
    {
      "source": "Class[main]",
      "target": "Notify[awesome]"
    }
  ],
  "classes": [
    "settings"
  ]
}

Variations on “environment”

The examples used in_tmp_environment, but there are other ways to specify the environment to use.

Using a tmp environment

The in_tmp_environment method takes an environment name (required) and a set of optional named arguments:

  • env_name String – a name to use for the temporary environment (required) - this only shows up in errors
  • modulepath Array[String] – an array of directory paths containing Puppet modules, may be empty, defaults to empty array
  • settings_hash Hash – a hash of settings - currently not used, defaults to empty hash
  • facts Hash – map of fact name to fact value - if not given will initialize the facts (which is a slow operation)
  • variables Hash – optional map of fully qualified variable name to value

It returns:

  • Any – returns what the given block returns

It yields:

  • Puppet::Pal pal – a context that responds to Puppet::Pal methods

Sadly, the settings part did not get done. In the future this will be how settings are fed into PAL instead of requiring a call to Puppet.initialize_settings.

It should be quite clear what the purposes of the options are. One note though: the variables option allows setting any fully qualified variable in any scope. This can be used to test a snippet that has references to variables that would be set by included classes when used in a real compilation - i.e. there is nothing stopping you from passing in {'apache::port' => 666} and thus allowing the tested logic to reference $apache::port without having a complete apache class declared in the catalog. (Naturally, also including the class would result in errors as the variable would already be set.)

Using a named, real environment

The alternative to using a tmp environment is to use an existing configured environment on disk that is found on the environment path.

The name of an environment (env_name) is always given. The location of that environment on disk is then either constructed by:

  • searching a given envpath where name is a child of a directory on that path, or…
  • it is the directory given in env_dir (which must exist).
  • (The env_dir and envpath options are mutually exclusive.)
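The two ways of locating an environment can be sketched in plain Ruby as follows. The resolve_env_dir helper is hypothetical and only mirrors the rules above (search the envpath, or use env_dir, but never both); it is not PAL's code.

```ruby
require 'tmpdir'
require 'fileutils'

# Illustrative resolution logic; the helper is hypothetical, not PAL's code.
def resolve_env_dir(env_name, envpath: nil, env_dir: nil)
  raise ArgumentError, 'env_dir and envpath are mutually exclusive' if envpath && env_dir

  if env_dir
    raise ArgumentError, "#{env_dir} does not exist" unless File.directory?(env_dir)
    return env_dir
  end

  # Search each directory on the path for a child directory named env_name.
  envpath.to_s.split(File::PATH_SEPARATOR)
         .map { |dir| File.join(dir, env_name) }
         .find { |candidate| File.directory?(candidate) }
end

# Set up a throwaway directory layout with one environment named "dev".
root = Dir.mktmpdir
Dir.mkdir(File.join(root, 'dev'))
search_path = [File.join(root, 'missing'), root].join(File::PATH_SEPARATOR)

found_by_search = resolve_env_dir('dev', envpath: search_path)
found_by_dir    = resolve_env_dir('dev', env_dir: File.join(root, 'dev'))
puts found_by_search == found_by_dir # true

FileUtils.remove_entry(root)
```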

The in_environment method takes an environment name (required) which must be an existing environment on disk, and the following optional named arguments:

  • modulepath Array[String] – an array of directory paths containing Puppet
    modules, overrides the modulepath of an existing env. Defaults to
    {env_dir}/modules if env_dir is given.
  • pre_modulepath Array[String] – like modulepath, but is prepended to the modulepath
  • post_modulepath Array[String] – like modulepath, but is appended to the modulepath
  • settings_hash Hash – a hash of settings - currently not used for anything, defaults to empty hash
  • env_dir String – a reference to a directory being the named environment (mutually exclusive with envpath)
  • envpath String – a path of directories in which there are environments to search for env_name (mutually exclusive with env_dir). Should be a single directory, or several directories separated with platform specific File::PATH_SEPARATOR character.
  • facts Hash – optional map of fact name to fact value - if not given will initialize the facts (which is a slow operation).
  • variables Hash – optional map of fully qualified variable name to value

Returns:

  • Any – returns what the given block returns

Yields:

  • Puppet::Pal pal – a context that responds to Puppet::Pal methods

In practice:

  • either:
    • use an environment name and let PAL search the envpath
    • or give an environment directory that does not have to be on an environment path
  • and either:
    • specify the module path
    • or use the default module path (defined by the environment, or is the ./modules directory given as the env_dir)
    • and then use one of:
      • pre_modulepath to push additional modules first on the path
      • post_modulepath to push additional modules last on the path
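As a sketch of how those choices combine into one effective module path, consider this plain-Ruby helper. The name effective_modulepath and all the directory names are made up for the example; it only mirrors the rules listed above.

```ruby
# Hypothetical helper that mirrors the rules listed above: an explicit
# modulepath replaces the default, then pre/post entries are added around it.
def effective_modulepath(default:, modulepath: nil, pre_modulepath: [], post_modulepath: [])
  base = modulepath || default
  pre_modulepath + base + post_modulepath
end

path = effective_modulepath(
  default: ['/envs/dev/modules'],
  pre_modulepath: ['/work/patched'],
  post_modulepath: ['/opt/shared']
)
puts path.inspect
# ["/work/patched", "/envs/dev/modules", "/opt/shared"]
```

Modules found earlier on the path win, which is why pre_modulepath is handy for trying out a patched copy of a module.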

Inside the PAL context (advanced)

I included this for those that have some familiarity with the internals of Puppet - you can safely skip this section…

Before PAL calls the block given to in_tmp_environment or in_environment it will set values in the global Puppet context like this:

environments: environments, # The env being used is the only one...
pal_env: env, # provide as convenience
pal_current_node: node, # to allow it to be picked up instead of created
pal_variables: variables, # common set of variables across several inner contexts
pal_facts: facts # common set of facts across several inner contexts (or nil)

Thus Puppet.lookup() (not to be confused with hiera lookup) can get those values when needed.

The keys in the context are part of the PAL API, but the values are not. The values for environments and pal_env are not part of PAL as they expose classes in Puppet that may or may not be strictly specified as API.

The pal_current_node allows code to override the automatically created Node object with a custom created one by pushing this onto a context wrapping further operations. This cannot be done from outside PAL as a Node needs some of the other components when it is created. (Not perfect, but this is how far we got on this).

The API of the Compilers

The ScriptCompiler and CatalogCompiler share many methods in an abstract Compiler class. The script compiler is created with a call to PAL’s with_script_compiler, and the catalog compiler with a call to with_catalog_compiler. Both methods take exactly the same (optional) named arguments:

  • configured_by_env Boolean – if the environment in use (as determined by the call to PAL) determines manifest/code to evaluate. Defaults to false.
  • manifest_file String – the path to a .pp file to use as the main manifest.
  • code_string String – a string with puppet logic.
  • facts Hash[String, Any] – a Hash of facts. If not given PAL will run facter to get the facts for localhost.
  • variables Hash[String, Any] – a Hash of variable names (can be fully qualified) to values that will be set before any evaluation takes place.

The parameters code_string, manifest_file and configured_by_env are mutually exclusive.
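The mutual exclusion rule can be pictured with a small plain-Ruby check. The select_code_source helper is hypothetical and only shows the kind of validation implied by the rule; it is not how PAL implements it.

```ruby
# Hypothetical validation of the "mutually exclusive" rule; returns the
# name of the single option in use, or nil when none is given.
def select_code_source(configured_by_env: false, manifest_file: nil, code_string: nil)
  options = {
    configured_by_env: configured_by_env,
    manifest_file: manifest_file,
    code_string: code_string
  }
  given = options.select { |_name, value| value }
  if given.size > 1
    raise ArgumentError, "mutually exclusive options given: #{given.keys.join(', ')}"
  end
  given.keys.first
end

puts select_code_source(code_string: 'notice(1 + 1)').inspect # :code_string
puts select_code_source.inspect                               # nil
```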

Here is a look at what you can do with both of the compilers:

Call a function

call_function(function_name, *args, &block)

Calls a function given by name with arguments specified in an Array, and optionally accepts a code block.

  • function_name String – the name of the function to call.
  • *args Any – the arguments to the function.
  • block Proc – an optional callable block that is given to the called function.

Returns:

  • Any – what the called function returns.

Get a function signature

function_signature(function_name)

Returns a Puppet::Pal::FunctionSignature object or nil if function is not found. The returned FunctionSignature has information about all overloaded signatures of the function.

# returns true if 'myfunc' is callable with
# three integer arguments 1, 2, 3
compiler.function_signature('myfunc').callable_with?([1,2,3])

List available functions

list_functions(filter_regex = nil, error_collector = nil)

Returns an array of TypedName objects (see below) for all functions, optionally filtered by a regular expression. The returned array has more information than just the leaf name - the typical thing is to just get the name, as shown in the example further down.

Errors that occur during function discovery will either be logged as warnings or added to the optional error_collector array. When provided, it will be appended with Puppet::DataTypes::Error instances describing each error in detail and no warnings will be logged.

# getting the names of all functions
puts compiler.list_functions.map {|tn| tn.name }

Parameters:

  • filter_regex Regexp – an optional regexp that filters based on name (matching names are included in the result).
  • error_collector Array[Puppet::DataTypes::Error] – an optional array that will get errors during load appended.

Returns:

  • Array[Puppet::Pops::Loader::TypedName] – an array of typed names.
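The filter_regex and error_collector contract described above can be sketched in plain Ruby. SketchCompiler and its Entry struct are hypothetical stand-ins written for this illustration and are not part of PAL; the errors here are plain strings, whereas PAL appends Puppet::DataTypes::Error instances.

```ruby
# Hypothetical stand-in for a PAL compiler (illustration only); it mimics
# just the filter_regex / error_collector behavior described in the text.
class SketchCompiler
  Entry = Struct.new(:name)

  def initialize(loadable, broken)
    @loadable = loadable # names that are discovered without problems
    @broken = broken     # names that fail during discovery
  end

  def list_functions(filter_regex = nil, error_collector = nil)
    @broken.each do |name|
      message = "could not load #{name}"
      if error_collector
        error_collector << message # collected, so nothing is logged
      else
        warn(message)              # no collector given: log a warning
      end
    end
    # Names matching the regexp are included in the result
    names = filter_regex ? @loadable.grep(filter_regex) : @loadable
    names.map { |n| Entry.new(n) }
  end
end

compiler = SketchCompiler.new(
  %w[mymodule::first mymodule::second other::third],
  %w[broken::func]
)

errors = []
names = compiler.list_functions(/\Amymodule::/, errors).map { |tn| tn.name }
```

Passing an array as the second argument is what turns warnings into collected errors, so a tool can decide for itself how to report them.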

A TypedName is, as the name suggests, a combination of name and data type. A typed name has the methods name and type, which are self-explanatory. It also has the method name_parts, which is an array of each part of a qualified / name-spaced name, the method name_authority, which is a reference to what defined this type, and finally compound_name, which is a unique identifier. Instances of TypedName are suitable as keys in hashes and are used extensively by the loaders.
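To illustrate what "suitable as keys in hashes" means in practice, here is a minimal sketch using a Ruby Struct. SketchTypedName is a hypothetical stand-in and not Puppet's TypedName class; it only models type, name and name_parts, and the way it splits the name is an assumption made for the example.

```ruby
# Hypothetical stand-in for Puppet::Pops::Loader::TypedName (illustration only).
# A Struct derives ==, eql? and hash from its members, so two instances with
# the same type and name behave as the same hash key.
SketchTypedName = Struct.new(:type, :name) do
  # Each segment of a qualified name such as 'mymodule::myfunc'
  def name_parts
    name.split('::')
  end
end

registry = {}
registry[SketchTypedName.new(:function, 'mymodule::myfunc')] = 'loaded function'

# A separately created but equal key finds the same entry
lookup_key = SketchTypedName.new(:function, 'mymodule::myfunc')
found = registry[lookup_key]

# Same name but a different type is a different key
missing = registry[SketchTypedName.new(:plan, 'mymodule::myfunc')]

parts = lookup_key.name_parts
```

Because the type is part of the key, a function and a plan with the same name do not collide in such a registry.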

Evaluate a string

evaluate_string(puppet_code, source_file = nil)

Evaluates a string of Puppet Language code in top scope. A “source_file” reference to a source can be given - if not an actual file name, by convention the name should be bracketed with < > to indicate it is something symbolic; for example <commandline> if the string was given on the command line.

If the given puppet_code is nil or an empty string, nil is returned; otherwise the result of evaluating the puppet language string is returned.

The given string must form a complete and valid expression/statement as an error is raised otherwise. That is, it is not possible to divide a compound expression by line and evaluate each line individually.

Parameters:

  • puppet_code Optional[String] – the puppet language code to evaluate, must be a complete expression/statement.
  • source_file Optional[String] – an optional reference to a source (a file or symbolic name/location).

Returns:

  • Any – what the puppet_code evaluates to.

Evaluate a file

evaluate_file(file)

Evaluates a Puppet Language file in top scope. The file must exist and contain valid Puppet Language code or an error is raised.

Parameters:

  • file String – an absolute path to a file with puppet language code, must exist.

Returns:

  • Any – what the last evaluated expression in the file evaluated to.

Evaluate AST

evaluate(ast)

Evaluates an AST obtained from parse_string or parse_file in top scope. If the ast is a Puppet::Pops::Model::Program (what is returned from the parse methods), any definitions in the program (that is, any functions, plans, etc.) are available for use.

Parameters:

  • ast Puppet::Pops::Model::PopsObject – typically the returned Program from the parse methods, but can be any Expression if you want to evaluate only part of the returned AST.

Returns:

  • Any – whatever the ast evaluates to.

AST stands for Abstract Syntax Tree - which is the result from parsing. The Puppet AST is described using Puppet Pcore and it is thus a model – a term often used interchangeably with AST when it is clear from context that the only model it could refer to is a particular AST. See Introduction to Modeling for more about modeling.

Evaluate a literal value

evaluate_literal(ast)

Produces a literal value if the AST obtained from parse_string or parse_file does not require any actual evaluation. Raises an error if the given ast does not represent a literal value.

This method is useful if it is expected that the user gives a literal value in puppet form and thus that the AST represents literal values such as string, integer, float, boolean, regexp, array, hash, etc. This is for example the case after having read a string representation of an array or hash from the command line or from values in some file.

Parameters:

  • ast Puppet::Pops::Model::PopsObject – typically the returned Program from the parse methods, but can be any Expression.

Returns:

  • Any – whatever literal value the ast evaluates to.

Parse a String

parse_string(code_string, source_file = nil)

Parses and validates a puppet language string and returns an instance of Puppet::Pops::Model::Program on success (i.e. AST). If the content is not valid an error is raised.

Parameters:

  • code_string String – a puppet language string to parse and validate.
  • source_file Optional[String] – an optional reference to a file or other location in angled brackets, only used for information.

Returns:

  • Puppet::Pops::Model::Program – returns a Program instance on success

Parse a File

parse_file(file)

Parses and validates a puppet language file and returns an instance of Puppet::Pops::Model::Program on success. If the content is not valid an error is raised.

Parameters:

  • file String – a file with puppet language content to parse and validate.

Returns:

  • Puppet::Pops::Model::Program – returns a Program instance on success.

Parse a data type

type(type_string)

Parses a puppet data type given in string format and returns that type, or raises an error. A type is needed in calls to new to create an instance of the data type, or to perform type checking of values - typically using type.instance?(obj) to check if obj is an instance of the type.

# Verify if obj is an instance of a data type
pal.type('Enum[red, blue]').instance?("blue") # returns true

Parameters:

  • type_string String – a puppet language data type.

Returns:

  • Type – the data type

Create a data type

create(data_type, *arguments) – Creates a new instance of a given data type.

Parameters:

  • data_type Variant[String, Type] – the data type as a data type or in String form.
  • *arguments Any – one or more arguments to the called new function.

Returns:

  • Any – an instance of the given data type, or raises an error if it was not possible to parse data type or create an instance.

# Create an instance of a data type (using an already created type)
t = pal.type('Car')
pal.create(t, 'color' => 'black', 'make' => 't-ford')

# same thing, but type is given in String form
pal.create('Car', 'color' => 'black', 'make' => 't-ford')

Check if this is a catalog compiler

has_catalog? – Returns true if this is a compiler that compiles a catalog.

Script Compiler

The Script Compiler has these additional methods:

Get the signature of a plan by name

plan_signature(plan_name)

Parameters:

  • plan_name String – the name of the plan to get the signature of.

Returns:

  • Optional[Puppet::Pal::PlanSignature] – returns a PlanSignature, or nil if plan is not found.

Get a list of available plans with optional filtering on name

list_plans(filter_regex = nil, error_collector = nil)

Returns an array of TypedName objects for all plans, optionally filtered by a regular expression. The returned array has more information than just the leaf name although the typical thing is to just get the name as shown in the following example.

Errors that occur during plan discovery will either be logged as warnings or collected in the optional error_collector array. When provided, it will get Puppet::DataTypes::Error instances appended (i.e. the data type known as Error in the Puppet language) describing each error in detail and no warnings will be logged.

# Example: getting the names of all plans
puts compiler.list_plans.map {|tn| tn.name }

Parameters:

  • filter_regex Regexp – an optional regexp that filters based on name (matching names are included in the result).
  • error_collector Array[Error] – an optional array that will get errors appended during load.

Returns:

  • Array[Puppet::Pops::Loader::TypedName] – an array of typed names.

Get the signature of a task by name

task_signature(task_name)

Returns the callable signature of the given task (that is, the arguments it accepts, and the data type it returns).

Parameters:

  • task_name String – the name of the task to get the signature of.

Returns:

  • Optional[Puppet::Pal::TaskSignature] – returns a TaskSignature, or nil if task is not found.

Get a list of available tasks with optional filtering on name

list_tasks(filter_regex = nil, error_collector = nil)

Returns an array of TypedName objects for all tasks, optionally filtered by a regular expression. The returned array has more information than just the leaf name - the typical thing is to just get the name as shown in the following example:

# Example: getting the names of all tasks
compiler.list_tasks.map {|tn| tn.name }

Errors that occur during task discovery will either be logged as warnings or appended to the optional error_collector array. When provided, it will get Error instances appended describing each error in detail and no warnings will be logged.

Parameters:

  • filter_regex Regexp – an optional regexp that filters based on name (matching names are included in the result).
  • error_collector Array[Error] – an optional array that will get errors appended during load.

Returns:

  • Array[Puppet::Pops::Loader::TypedName] – an array of typed names.

Catalog Compiler methods

Produce a Catalog in JSON

with_json_encoding(pretty: true, exclude_virtual: true)

Calls a block of code and yields a configured JsonCatalogEncoder to the block.

Parameters:

  • pretty Boolean – if the resulting JSON should be pretty printed or not. Defaults to true.
  • exclude_virtual Boolean – if the resulting catalog should have virtual resources filtered out or not. The default is true.

# Example: get resulting catalog as pretty printed JSON
Puppet::Pal.in_environment() do |pal|
  pal.with_catalog_compiler() do |compiler|
    compiler.with_json_encoding {| encoder | encoder.encode }
  end
end

Compile additions to the catalog - handle lazy evaluation

compile_additions()

Compiles the result of additional evaluation taking place in a PAL catalog compilation. This will evaluate all lazy constructs until all have been evaluated, and then validate the resulting catalog.

This should be called when having evaluated strings or files of puppet logic after the initial compilation took place by giving PAL a manifest or code-string.

This method should be called when a series of evaluations is thought to have reached a valid state (at a point where there should be no relationships to resources that do not exist).

As an alternative, the method evaluate_additions can be called without any requirements on consistency, followed by a call to validate at the end. (Both can be called multiple times).

Note: A Catalog compilation needs to start by creating a catalog and declaring some initial things. The standard compilation then continues to evaluate either what was given as the main manifest, or as a string of Puppet Language code (internally this is referred to as “the initial import”). Normally this defines the entire compilation as the main manifest + definitions from ENC includes all of the wanted classes (and then what they include etc.) via autoloading. When using PAL you may have a use case where you want to do that first, and then continue with additions, or you may want the initial compilation to be as small as possible and build the catalog from a series of calls you make to PAL. Again depending on use case, you may require that what you include in the catalog has been fully evaluated before taking the next step, or you can simply finalize your catalog building at the very end with a compile_additions.
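The relationship between compile_additions, evaluate_additions and validate can be sketched with a toy model. SketchCatalogCompiler and its add_resource, add_relationship and defer helpers are hypothetical and written only for this illustration; this is not how PAL or the Puppet compiler is implemented, it just mirrors the described behavior: lazy constructs are evaluated until none remain, and validation checks that relationships only refer to resources that exist.

```ruby
# Toy model (illustration only, not PAL's implementation)
class SketchCatalogCompiler
  def initialize
    @resources = []     # titles of resources in the catalog
    @relationships = [] # [from, to] pairs of resource titles
    @lazy = []          # pending lazy constructs
  end

  def add_resource(title)
    @resources << title
  end

  def add_relationship(from, to)
    @relationships << [from, to]
  end

  # Registers work to be done later; it may itself defer more work
  def defer(&block)
    @lazy << block
  end

  # Evaluates lazy constructs until none remain; no validation
  def evaluate_additions
    @lazy.shift.call(self) until @lazy.empty?
  end

  # Raises if a relationship refers to a resource that does not exist
  def validate
    missing = @relationships.flatten.uniq - @resources
    raise ArgumentError, "unknown resources: #{missing.join(', ')}" unless missing.empty?
    true
  end

  # Evaluate everything that is pending, then validate the result
  def compile_additions
    evaluate_additions
    validate
  end
end

compiler = SketchCatalogCompiler.new
compiler.add_resource('File[a]')
compiler.add_relationship('File[a]', 'File[b]')
compiler.defer { |c| c.add_resource('File[b]') }

# Validating before the lazy construct has run fails
early_error = begin
  compiler.validate
  nil
rescue ArgumentError => e
  e.message
end

# After evaluating the additions the catalog is consistent
result = compiler.compile_additions
```

In this model, calling evaluate_additions repeatedly and validate once at the end gives the same final state as a single compile_additions, which is the trade-off the note above describes.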

Validating the catalog (after additions)

validate()

Validates the state of the catalog (without performing evaluation of any elements requiring lazy evaluation). Can be called multiple times. Call this if you want to validate the catalog’s state after having done one or more calls to evaluate_additions(). Will raise an error if the catalog is not valid.

Evaluate additions, but do not validate

evaluate_additions()

Evaluates all lazy constructs that were produced as a side effect of evaluating puppet logic. Can be called multiple times. Call this instead of compile_additions() if you want to hold off with the validation of the catalog’s state. May raise an error from the evaluation.

Summary

Oh my, that turned out to be one long post! Sorry about that - simply a lot to cover…
There is probably a lot more you would like to know about how you can use this, and especially if you are interested in writing tooling around language stuff. While I have written about Language internals and modeling in past blog posts, I will probably come back with examples of useful utilities that can easily be written using PAL. Ping me in comments below, or hit me up on one of the #puppet channels on Slack if there is something you would like to see.