Sunday, January 25, 2015

The Puppet 4x Function API - part 2

In the first post about the 4x Function API I showed the fundamentals of the new API. In this post I am going to show how you can write more advanced functions that take a code block / lambda as an argument, and how you can call this block from Ruby. This can be used to create your own iterative functions, or functions that make it possible to write Puppet code in a more function-oriented style.

Accepting a Code Block / Lambda

A 4x function can accept a code block / lambda. You can make it required by calling required_block_param in the definition of the dispatcher, or optional by calling optional_block_param.

Here is an example of a simple function called then, which takes one argument and a block, and calls the block with the argument unless the argument is nil.

Puppet::Functions.create_function(:then) do
  dispatch :then do
    param 'Any', :x
    required_block_param
  end

  def then(x)
    x.nil? ? nil : yield(x)
  end
end

Note that Puppet blocks are passed the same way as Ruby blocks, so we can simply yield to the given block. Just as with Ruby blocks, the block can be captured in a parameter by declaring a &block parameter last, the block_given? method can be used, etc.
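Both ways of receiving the block can be tried outside Puppet. The following is a minimal plain-Ruby sketch, not the Puppet API itself: then_value and then_value_captured are hypothetical names, an ordinary Ruby block stands in for the Puppet lambda, and returning the value unchanged when no block is given is an assumption made for illustration.

```ruby
# Plain-Ruby sketch; the method names are hypothetical.

# Implicit block: yield only when there is a value to pass on.
def then_value(x)
  x.nil? ? nil : yield(x)
end

# Explicit block: the lambda arrives as a Proc in &block.
# Assumption: with no block, the value is returned unchanged.
def then_value_captured(x, &block)
  return x unless block_given?

  x.nil? ? nil : block.call(x)
end
```

The second form is the one to reach for when the block has to be stored or handed on to another method, since a captured Proc is an ordinary object.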

The then function is useful when looking up a nested value in a hash, as it removes the need to check intermediate results for undef. Say there may or may not be a value at $hash[a][b][c], and we just want that value, or undef if any of a, b, or c is not found, instead of an error when we, say, try to look up c in undef (because b did not exist).

For that we use the then function we just defined, like this:

$result = $hash
 .then |$x| { $x[a] }
 .then |$x| { $x[b] }
 .then |$x| { $x[c] }

And for completeness, if you were to write that without the function, you end up with something like this:

$result =
if $hash[a] != undef and $hash[a][b] != undef and $hash[a][b][c] != undef {
  $hash[a][b][c]
}

...or worse, if you start using variables for the intermediate steps.
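To see why the chained form never fails on a missing level, here is the same lookup as a plain-Ruby sketch. It is only an analogue: then_value and lookup_abc are hypothetical names, Ruby's nil plays the role of undef, and string keys stand in for the bare words a, b and c.

```ruby
# Plain-Ruby sketch of the nil-propagating chain; names are hypothetical.
def then_value(x)
  x.nil? ? nil : yield(x)
end

def lookup_abc(hash)
  # Each step runs only if the previous step produced a value.
  step_a = then_value(hash)   { |x| x['a'] }
  step_b = then_value(step_a) { |x| x['b'] }
  then_value(step_b) { |x| x['c'] }
end
```

Once one step produces nil, every later block is skipped, so nothing ever tries to index into a missing value.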

The block's number of parameters and their types

If nothing is specified about the number of parameters and types expected in the accepted block, the user can give the function any block. This is what you get by just calling required_block_param or optional_block_param with no arguments. You still get type checking, but this takes place when the block is called.

If you want to involve the number of parameters and their types in the dispatching, i.e. selecting which Ruby method to call based on what the user defined in the block, you can do so by stating the Callable type of the block. (The Callable type was added in Puppet 3.7, and is described in this blog post.) In brief, Callable[2,2] means something that can be called with exactly two arguments of any type.
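The idea of picking a method from the shape of the block can be sketched in plain Ruby. This is only a rough analogue: Puppet matches the block against a Callable type, whereas the hypothetical visit_hash below merely inspects Proc#arity, and it covers just the two Hash cases.

```ruby
# Hypothetical plain-Ruby stand-in for dispatching on the block's parameter count.
def visit_hash(hash, &block)
  raise ArgumentError, 'a block is required' unless block

  case block.arity
  when 2
    # Roughly what Callable[2,2] selects: key and value as separate arguments.
    hash.each_pair { |key, value| block.call(key, value) }
  when 1
    # Roughly what Callable[1,1] selects: one [key, value] pair per call.
    hash.each_pair { |key, value| block.call([key, value]) }
  else
    raise ArgumentError, "block must take exactly 1 or 2 parameters (arity was #{block.arity})"
  end
  hash
end
```

Doing the selection up front means a block with the wrong number of parameters is rejected before any element is visited, rather than part of the way through the iteration.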

Here is the dispatcher part of the each function (from the Puppet source code), followed by the first of the methods it dispatches to:

Puppet::Functions.create_function(:each) do
  dispatch :foreach_Hash_2 do
    param 'Hash[Any, Any]', :hash
    required_block_param 'Callable[2,2]', :block
  end

  dispatch :foreach_Hash_1 do
    param 'Hash[Any, Any]', :hash
    required_block_param 'Callable[1,1]', :block
  end

  dispatch :foreach_Enumerable_2 do
    param 'Any', :enumerable
    required_block_param 'Callable[2,2]', :block
  end

  dispatch :foreach_Enumerable_1 do
    param 'Any', :enumerable
    required_block_param 'Callable[1,1]', :block
  end

  def foreach_Hash_1(hash)
    enumerator = hash.each_pair
    hash.size.times do
      yield(enumerator.next)
    end
    # produces the receiver
    hash
  end

And to be complete, here are the remaining methods the dispatchers call - the actual implementation of the each function. As you can see, each variation on how this function can be called (with an Array, a Hash, or a String, and with a block that takes one or two arguments) is now handled in a small and precise method. It is really just Hash that needs special treatment; all others are handled as enumerables, i.e. whatever the Puppet Type System has defined as something that can be enumerated / iterated over in the Puppet Language.

  def foreach_Hash_2(hash)
    enumerator = hash.each_pair
    hash.size.times do
      yield(*enumerator.next)
    end
    # produces the receiver
    hash
  end

  def foreach_Enumerable_1(enumerable)
    enum = asserted_enumerable(enumerable)
    begin
      loop { yield(enum.next) }
    rescue StopIteration
    end
    # produces the receiver
    enumerable
  end

  def foreach_Enumerable_2(enumerable)
    enum = asserted_enumerable(enumerable)
    index = 0
    begin
      loop do
        yield(index, enum.next)
        index += 1
      end
    rescue StopIteration
    end
    # produces the receiver
    enumerable
  end

  def asserted_enumerable(obj)
    unless enum = Puppet::Pops::Types::Enumeration.enumerator(obj)
      raise ArgumentError, "#{self.class.name}(): wrong argument type (#{obj.class}); must be something enumerable."
    end
    enum
  end
end
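The index-passing variant can be tried on its own with an ordinary Ruby enumerator. This is a sketch under assumptions, not the Puppet code: each_with_position is a hypothetical name, and Object#to_enum is used in place of Puppet::Pops::Types::Enumeration.enumerator. Kernel#loop rescues StopIteration by itself, so the sketch leaves out the explicit rescue.

```ruby
# Plain-Ruby sketch of the external-enumerator loop; the method name is hypothetical.
def each_with_position(enumerable)
  enum = enumerable.to_enum # stand-in for Puppet's enumerator lookup
  index = 0
  loop do
    # enum.next raises StopIteration when exhausted, which ends the loop.
    yield(index, enum.next)
    index += 1
  end
  # produces the receiver, like the original
  enumerable
end
```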

What about Dependent Types and Type Parameters?

If you read the above example carefully, or if you are already used to working with a rich type system, you may wonder about type parameters and whether it is possible to use dependent types.

The short answer is no. The Puppet type system is capable of describing rich types, but we have not added the ability to use type parameters. They would be really useful - take the hash example, where we, instead of:

    param 'Hash[Any, Any]', :hash
    required_block_param 'Callable[2,2]', :block

could specify that the block must accept the key and value type of the given Hash - e.g. something like:

    param 'Hash[K Any, V Any]', :hash
    required_block_param 'Callable[K,V]', :block

This however requires quite a lot of complexity, both in the type system itself and in what users are exposed to. The syntax has to be something more elaborate than what is shown above, since the references to K and V must naturally find the declared K and V somehow - in the sample that is solved by magic :-)

If we do provide a mechanism to reference the type parameters of the actual types given in a call, we could fully support dependent types. As an example, this would enable declaring that a function takes two arrays of equal length.
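Without such a mechanism, a relationship like "two arrays of equal length" has to be checked by hand inside the method body. A minimal plain-Ruby sketch of that manual check follows; pair_up is a hypothetical helper, not part of Puppet.

```ruby
# Hypothetical helper: the length relationship is enforced at run time,
# because the types alone cannot express it.
def pair_up(left, right)
  unless left.is_a?(Array) && right.is_a?(Array)
    raise ArgumentError, 'both arguments must be arrays'
  end
  unless left.size == right.size
    raise ArgumentError, "arrays must be of equal length (got #{left.size} and #{right.size})"
  end

  left.zip(right)
end
```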

How about Return Type?

Return type is also something we decided to leave out for the time being. In hindsight it should have been added from the start as this enables both advanced type inference and type checking to be performed. For this reason we may add this into the dispatch API early in the 4x series. The most difficult part will be figuring out the syntax for the Callable type since it also needs to be able to describe the return type of the callable.

1 comment:

  1. Post updated with the latest changes to the 4x function API. The Puppet Language lambda/block is now given the same way as when coding in Ruby - the method gets a Proc either implicitly (and yields to it), or declares it with &block and calls it with block.call(). This greatly simplifies testing, since Ruby Procs can be used directly when writing rspec tests.
