Tuesday, October 23, 2018

Puppet 6 Type System - Object and Custom Data Types

Object and Custom Data Types in Puppet

Type System In Retrospect

In 2014 and 2015, when I was busy implementing the Puppet Type System, I was not sure how it would be received. Today I am very happy with how it turned out: it has been very well received and is now used extensively in modules for both Puppet and Bolt. (I am talking about data types like Integer, String, Hash, Array, parameterized types like Array[String], plus all the more specialized data types. This blog post is not about those; you can read about all of them in the official documentation, Puppet Documentation - Data Types.) This post is about something much more exciting. Jumping back in history a bit…

A couple of things were bothering us:

  • There was no way to extend the type system with custom data types; you really would have to contribute the data type to Puppet’s code base for that to work.
  • We were using RGen for meta modeling (how to describe a model using another model, a “meta model”). RGen is an implementation of UML/Ecore metamodeling, and we were not super happy with the performance and implications of using it. The primary use case for us was to model the more than 100 classes in the Puppet AST, the data structure that results from parsing Puppet Language logic. While RGen is great, it was not a perfect fit.
  • Serialization in Puppet just sucked in general, and was especially difficult to use with data types not having a 1:1 representation in JSON.

As we were discussing this back and forth (“we” being me and Thomas Hallgren at Puppet), he came up with the brilliant idea to implement meta modeling based on the Puppet Type System, and that we should replace RGen with our own implementation rather than trying to fit UML/Ecore style modeling into the Puppet type system. One major incompatibility and headache in trying to marry the type system with RGen was that the type system assumes immutability and RGen/Ecore does not. Further, Ecore sprang more or less from the Java type system, and while Java/Ecore has generics, they are nowhere near as powerful as the Puppet Type System’s parameterized types.

In 2015 and 2016 we worked out the design for what we call Pcore - a term we now use as the name of the Puppet type system. The specification for Pcore turned out to be a major opus with lots of things to work out and explain, and while I was working on that, Thomas Hallgren did a herculean job on the Pcore implementation and the related serialization protocols (i.e. my already brilliant implementation 😎 got even more so as a result).

We had two early use cases for Pcore: we built the puppet generate types feature for environment isolation using Pcore, and we had one internal project using it to model network devices. Our major use case, however, was to update the Puppet AST from using RGen to using Pcore. That was committed in February 2016, and with that the Puppet Language AST was itself implemented in the Puppet Language 🤯 - read the source of ast.pp here. In Puppet 5.0.0 we switched and dropped the use of RGen.

In the Puppet 5.x time frame it was possible to experiment with the features by using rich_data=true in the configuration, but this only worked for puppet apply and for puppet resource; you still could not use it in an agent/master scenario. And, while it worked to send rich data to PDB, it was not exactly what we wanted.

Now in the fall of 2018, with the Puppet 6.0.0 release, rich_data=true is on by default and the work we started in 2015 can finally be used! 🎉

Earlier, it was not terribly meaningful to blog about the wonderful things you can do with Pcore since - well, you could not really use it in practice. But now you can!

A Blog Series about Pcore

I intend to blog about Pcore in a series of posts - this being the first.

There are a lot of things to cover, as you can see if you go and read the quite long specification (74 pages), but I am going to take a more pragmatic approach and show useful examples rather than serving you reference material. I also have a lot of work to do taking the Pcore specification in its current form and turning it into a more formal specification for the puppet-specifications repository. That work is quite tedious, so I am going to mix it with blogging about the features.

The Object Data Type

At the heart of the type system there is the Object data type. You can create one in Puppet if you like, or in Ruby. It can be a simple object only having attributes, or a more complex one also supporting callable methods.

A Car data type in Puppet

For simplicity, this example is a single manifest. Your real data types in Puppet should be placed in files of the form <moduleroot>/types/<typename>.pp as that makes them autoloaded.

In example1.pp:

type MyModule::Car = Object[{
  attributes => {
    reg_nbr => String,
    color => String,
  }
}]
$my_car = MyModule::Car('abc123', 'pink')
notice $my_car

Applying it notices the car:

$ puppet apply example1.pp
Notice: Scope(Class[main]): MyModule::Car({'reg_nbr' => 'abc123', 'color' => 'pink'})

You can use code like this while compiling. Puppet will even autoload the data type just like it does with type aliases - i.e. something like type MyType = Array[String]. You can however not yet use such a data type on the agent side because there is no pluginsync of data types defined in the Puppet Language. If you try you will get an error like this:

Could not intern from rich_data_json: No implementation mapping found for Puppet Type MyModule::Car

It does however work if your data type is implemented in Ruby since everything under lib/puppet in your module is synced to the agent! Let’s implement the same data type in Ruby.

A Car data type in Ruby

In <mymodule>/lib/puppet/datatypes/car.rb:

Puppet::DataTypes.create_type('MyModule::Car') do
  interface <<-PUPPET
    attributes => {
      reg_nbr => String,
      color => String,
    }
  PUPPET
end

A note about file location: as you see it is under lib/puppet/datatypes since lib/puppet/types is for resource types (for historical reasons).

Now we try applying a manifest using that - site.pp:

notify { "example":
  message => MyModule::Car("abc123", "pink")
}

Which we can try out most easily with apply:

puppet apply site.pp

Which results in this:

Notice: /Stage[main]/Main/Notify[example]/message: defined 'message' as MyModule::Car({
  'reg_nbr' => 'abc123',
  'color' => 'pink'
})

You can try that with an agent as well - you should get the same result.

If you look inside the catalog - the notify looks like this:

    {
      "type": "Notify",
      "title": "example",
      "tags": [
        "notify",
        "example",
        "class"
      ],
      "line": 8,
      "exported": false,
      "parameters": {
        "message": {
          "__ptype": "MyModule::Car",
          "reg_nbr": "abc123",
          "color": "pink"
        }
      }
    }

This is the rich_data serialization format which is a “Pcore in human readable JSON” serialization. If you want to learn everything there is to know about serialization and the rich-data format look at the specification for Pcore Data Representation.
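
To make the shape of that format concrete, here is a minimal Ruby sketch that reads the parameters of the resource above with the standard json library and picks out values tagged with __ptype. The helper names rich_value? and split_rich_value are hypothetical, and this is not how Puppet's own deserializer works; it only illustrates that a rich value travels as an ordinary JSON object carrying its type name under the __ptype key.

```ruby
require 'json'

# The "parameters" part of the catalog entry shown above.
CATALOG_FRAGMENT = <<~JSON
  {
    "message": {
      "__ptype": "MyModule::Car",
      "reg_nbr": "abc123",
      "color": "pink"
    }
  }
JSON

# Hypothetical helper: true when a parsed JSON value is a hash tagged with __ptype.
def rich_value?(value)
  value.is_a?(Hash) && value.key?('__ptype')
end

# Hypothetical helper: splits a tagged hash into [type name, attribute hash].
def split_rich_value(value)
  attributes = value.reject { |key, _| key == '__ptype' }
  [value['__ptype'], attributes]
end

params = JSON.parse(CATALOG_FRAGMENT)
message = params['message']
if rich_value?(message)
  type_name, attributes = split_rich_value(message)
  puts "#{type_name}: #{attributes}"
end
```

A real consumer would then have to map the type name to an implementation, which is exactly the step that fails with the "No implementation mapping found" error when the agent does not know the data type.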

An Object data type with methods

In the Puppet Language you cannot yet implement methods of an Object data type. While it is possible to specify the interface for methods in Puppet, the data type cannot be used unless there is an implementation available for the methods.

We can do this in Ruby however. There are a couple of options:

  • The implementation can be done inside the code block given to create_type. This is what I am showing in this blog post.
  • The implementation can be any Ruby class that implements (at least) the interface.
  • The implementation can autoload the implementation from inside the module’s lib. (This should be the last resort as you must use the same version in all environments).

Defining methods in implementation

Methods for instances of the data type are easily added inside a block given to a call to the implementation method.

The first kind of method I am showing is one that is needed when we declare an attribute to be of kind derived. Note that the specification for the attribute age is now a hash with more details than just the type. The kind derived means that the value of the attribute is computed/derived from other attributes and that it cannot be given when creating an instance of the data type. Since it needs to be computed, there must be an implementation of that computation.

Puppet::DataTypes.create_type('MyModule::Person') do
  interface <<-PUPPET
    attributes => {
      name => String,
      year_of_birth => Integer,
      age => { type => Integer, kind => derived }
    }
  PUPPET

  implementation do
    def age
      # Time is part of core Ruby, so no require is needed here
      Time.now.year - @year_of_birth
    end
  end
end

We can use that in a manifest like this:

$p = MyModule::Person('Henrik', 1959) # yeah, that old...
notice "Name: ${p.name}, Age: ${p.age}"

As you may have figured out, with this approach, Pcore will automatically provide a constructor and methods to get the attributes - all we had to do was to supply the missing age computation.

The constructor takes either positional arguments, given in the order they are specified in the interface, or a Hash of attribute name to value. Thus we can create the same Person like this: MyModule::Person('name' => 'Henrik', 'year_of_birth' => 1959)
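
The behavior described above can be mimicked in plain Ruby to see what Pcore saves you from writing. The sketch below is not Pcore code: the class name PersonSketch is made up, and the constructor and readers that Pcore generates from the interface are written out by hand here. The optional current_year parameter is also my own addition, there only to make the computation easy to check.

```ruby
# Plain-Ruby illustration only; PersonSketch is a hypothetical name and
# none of this uses Pcore or Puppet.
class PersonSketch
  # Readers corresponding to the two given attributes.
  attr_reader :name, :year_of_birth

  # Accepts either positional arguments in interface order,
  # or a single Hash of attribute name to value.
  def initialize(*args)
    if args.size == 1 && args.first.is_a?(Hash)
      attrs = args.first
      @name = attrs.fetch('name')
      @year_of_birth = attrs.fetch('year_of_birth')
    elsif args.size == 2
      @name, @year_of_birth = args
    else
      raise ArgumentError, "expected (name, year_of_birth) or a Hash, got #{args.size} argument(s)"
    end
    freeze # mirrors the type system's assumption of immutable values
  end

  # Derived attribute: computed from year_of_birth, never passed in.
  def age(current_year = Time.now.year)
    current_year - @year_of_birth
  end
end

by_position = PersonSketch.new('Henrik', 1959)
by_name = PersonSketch.new({ 'name' => 'Henrik', 'year_of_birth' => 1959 })
puts by_position.name == by_name.name
puts by_position.age(2018)
```

Note that age is not accepted by the constructor in either form, which matches what kind => derived means for the real data type.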

We can use the same approach to add more methods; they must be specified in the interface if you want them to be available in the Puppet Language. Methods not specified in the interface are still available to the Ruby code.

Defining functions in the interface

In order to enable calling methods on a data type (other than those implied by the attributes, and the general API of all objects) they must be defined in the data type’s interface.

Puppet::DataTypes.create_type('MyModule::Image') do
  interface <<-PUPPET
    attributes => {
      image_url => URI,
    }
    functions => {
      # resize is an operation that takes two integers (min 1)
      # for x, and y, and returns a new MyModule::Image
      # for the resized result.
      resize => Callable[[Integer[1], Integer[1]], MyModule::Image],
      # image_bytes returns a Binary containing the image
      image_bytes => Callable[[], Binary]
    }
  PUPPET

  implementation do
    def resize(x, y)
      # an imaginary service uploads the image, resizes
      # it, and provides an url to the resized image
      new_url = SomeService::process(@image_url, 'resize', x, y)
      # Return a new MyModule::Image based on the new url
      self.class.new(new_url)
    end
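    def image_bytes()
      # an imaginary service gets the image as Base64 encoded string
      bits_base_64 = SomeService::process(@image_url, 'get')
      # Binary() is the Puppet Language constructor; from Ruby, the Binary
      # value class in Puppet's type system is used instead
      Puppet::Pops::Types::PBinaryType::Binary.from_base64(bits_base_64)
    end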
  end
end
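
Since SomeService is imaginary, the method bodies above cannot run as shown. One way to experiment with the same pattern outside Puppet is to stub the service in plain Ruby. Everything in this sketch is hypothetical (FakeImageService, ImageSketch, and the URL scheme it invents), and a binary Ruby String stands in for Puppet's Binary value.

```ruby
require 'base64'

# Hypothetical stand-in for the imaginary SomeService in the example above.
module FakeImageService
  def self.process(url, operation, *args)
    case operation
    when 'resize'
      width, height = args
      "#{url}?w=#{width}&h=#{height}" # pretend this is the resized image's url
    when 'get'
      Base64.strict_encode64("bytes of #{url}") # pretend these are image bytes
    else
      raise ArgumentError, "unknown operation #{operation}"
    end
  end
end

# Hypothetical plain-Ruby counterpart of MyModule::Image.
class ImageSketch
  attr_reader :image_url

  def initialize(image_url)
    @image_url = image_url
    freeze
  end

  # Returns a new instance instead of modifying the receiver.
  def resize(x, y)
    self.class.new(FakeImageService.process(@image_url, 'resize', x, y))
  end

  # Decodes the Base64 payload into a binary String.
  def image_bytes
    Base64.strict_decode64(FakeImageService.process(@image_url, 'get'))
  end
end

original = ImageSketch.new('http://example.com/cat.png')
resized = original.resize(100, 50)
puts resized.image_url
puts original.image_url
```

The point of the design is that resize leaves the receiver untouched and hands back a fresh value, which fits the immutability the type system assumes.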

Summary

This post introduced the Object data type and showed how it is defined in Puppet and in Ruby, and how the Ruby implementation also lets you define behavior in methods that can be used from the Puppet Language.

The use of Objects with methods provides a richer extension mechanism to Puppet than functions. When they are implemented with the provided support, they are completely (Puppet) environment friendly since each environment can have a different version of the implementation (still: any external gems you require must be the same for all environments).

While there is a lot more to say about how you can specify attributes, their data type, default values, derived values, and how to define operations/methods, and how to map an object data type to an existing Ruby class - I hope this blog post gives you enough to be able to experiment.

Look out for more posts in this series.
