Some Default Moose Coercions

Having coded in a number of programming languages, I find that nothing makes me happier than applying ‘use Moose’ to a package in Perl and then hacking.  The declarative style is really solid, and aids in writing well-designed, even if not documented, libraries and tools.

One issue that I’ve tried to work around a few times has been the auto-classing of deep hash structures.  I don’t like using raw hashes for anything that isn’t unknown data – if you know the structure of the hash, it should be defined as a class.  Keeping it in a raw hash means that other developers need to go around and search for the potential contents of the hash, which can be unknown if you are mutating that hash in other parts of the code.  Also, the actions that are used to modify that hash are not encapsulated anywhere, so it’s impossible to know what can be done.  I covered more of my opinions on this in Classify Your Hashes.

When you have deep hash structures returned in JSON format, though, things can get a little tricky from time to time.  I want to be able to pass the entire JSON object into a root class and have the entire structure converted into classes for me automagically.  Of course, this can be done in a constructor by explicitly converting the hash structure into objects (in Moose, you can use the BUILD method to do this).  I don’t like to do this, because I code in Moose explicitly so that I can avoid writing constructors in the first place.

The main parts where you might end up with issues are when an object points to an array or hash/map of other objects, or when an object may or may not exist, based on the request.  For instance, consider the following:

{
    "a": "1",
    "b": "2",
    "c": {
        "c1": "3",
        "c2": "4"
    }
}

And then imagine that the element ‘c’ is optional.

A starting point that will handle this JSON document could look like this:

package MyA;

use Moose;
use MyA::C;

has [qw/a b/] => (
    is => 'ro',
    isa => 'Num',
    required => 1,
);

has 'c' => (
    is => 'ro',
    isa => 'MyA::C',
);

package MyA::C;

use Moose;

has [qw/c1 c2/] => (
    is => 'ro',
    isa => 'Num',
    required => 1,
);

The ultimate goal is to pass the entire hash structure into MyA->new and get back a fully classified object structure.  However, if you try that with the code above (which is out of order and not split into the separate files it would need in order to work), you will get an error when Moose tries to figure out what to do with the ‘c’ attribute.  It complains because ‘c’ is a HashRef, but you requested that it be a MyA::C, so it can’t build your class. To make this coerce properly in the default case (‘c’ is present), you would add the following:

package MyA::C;

use Moose;
use Moose::Util::TypeConstraints;

# Build a MyA::C from a plain HashRef whenever one is supplied.
coerce __PACKAGE__,
    from 'HashRef',
    via { __PACKAGE__->new($_) };

Then add coerce => 1 to the attribute ‘c’ in the MyA definition. Now, when you pass that decoded JSON document in, the element ‘c’ will automatically become an object of type MyA::C for you.
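Put together in a single file, the pieces so far can be sketched as below. Two assumptions of mine: JSON::PP's decode_json stands in for whichever JSON decoder you use, and MyA::C is declared first so that its coercion is registered before MyA's attribute refers to it.

```perl
use strict;
use warnings;

package MyA::C;
use Moose;
use Moose::Util::TypeConstraints;

# Teach Moose how to turn a plain HashRef into a MyA::C object.
coerce __PACKAGE__,
    from 'HashRef',
    via { __PACKAGE__->new($_) };

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

# coerce => 1 is what lets the HashRef under 'c' become a MyA::C.
has 'c' => (
    is     => 'ro',
    isa    => 'MyA::C',
    coerce => 1,
);

package main;
use JSON::PP qw(decode_json);   # assumption: any decoder returning a hashref works

my $document = '{"a":"1","b":"2","c":{"c1":"3","c2":"4"}}';
my $t = MyA->new(decode_json($document));

print ref($t->c), "\n";   # prints "MyA::C"
print $t->c->c2, "\n";    # prints "4"
```

Without coerce => 1 on ‘c’, the same call dies with a type constraint error, which is the failure described above.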

But what if ‘c’ is completely optional? This is something I had trouble with for a while, and I came up with the following coercion to deal with it the other day while parsing the JSON output of a MySQL query plan:

subtype __PACKAGE__ . '::Maybe',
    as 'Maybe[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::Maybe',
    from 'HashRef',
    via { __PACKAGE__->new($_); };

In my original code, I also included:

coerce __PACKAGE__ . '::Maybe',
    from 'Undef',
    via { $_ };

But it isn’t necessary: undef already satisfies the Maybe[...] constraint, so there is nothing for Moose to coerce.

Now the original attribute in package MyA looks like:

has 'c' => (
    is => 'ro',
    isa => 'MyA::C::Maybe',
    coerce => 1,
    default => undef,
);

And that attribute is optional. Now take a JSON document like:

{
    "a": "1",
    "b": "2"
}

or one like:

{
    "a": "1",
    "b": "2",
    "c": null
}

In both cases the entry for ‘c’ is accepted just fine as undef. Of course, this messes with predicates (has_c would give different results for the two documents), but at least your automatic coercion of the deep hash doesn’t die on this response (assuming it is retrieved from an API).
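The behaviour for the missing, null, and populated cases can be checked with a small sketch like the one below. It departs from the attribute shown earlier in two ways, both my own choices for illustration: it leaves out default => undef, and it adds a predicate named has_c so that the difference between "missing" and "null" is visible. JSON::PP is again an assumed decoder.

```perl
use strict;
use warnings;

package MyA::C;
use Moose;
use Moose::Util::TypeConstraints;

# A named subtype that accepts either undef or a MyA::C object.
subtype __PACKAGE__ . '::Maybe',
    as 'Maybe[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::Maybe',
    from 'HashRef',
    via { __PACKAGE__->new($_) };

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

has 'c' => (
    is        => 'ro',
    isa       => 'MyA::C::Maybe',
    coerce    => 1,
    predicate => 'has_c',   # added for this sketch only
);

package main;
use JSON::PP qw(decode_json);

my $missing = MyA->new(decode_json('{"a":"1","b":"2"}'));
my $null    = MyA->new(decode_json('{"a":"1","b":"2","c":null}'));
my $full    = MyA->new(decode_json('{"a":"1","b":"2","c":{"c1":"3","c2":"4"}}'));

# 'c' was never supplied, so the predicate is false.
printf "missing: has_c=%s\n", $missing->has_c ? 'yes' : 'no';
# 'c' was supplied as null, so the slot is set (to undef) and the predicate is true.
printf "null:    has_c=%s\n", $null->has_c ? 'yes' : 'no';
# A real hash is still coerced into an object.
printf "full:    c is a %s\n", ref $full->c;
```

If the distinction between an absent key and an explicit null matters to your callers, test defined($obj->c) rather than the predicate.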

The other two I like to add are generally to handle HashRefs and ArrayRefs of objects, like:

{
    "a": "1",
    "b": "2",
    "c": [
        {
            "c1": "3",
            "c2": "4"
        },
        {
            "c1": "5",
            "c2": "6"
        }
    ]
}

An array like this can be handled with a coercion in package MyA::C, using the following:

subtype __PACKAGE__ . '::ArrayRef',
    as 'ArrayRef[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::ArrayRef',
    from 'ArrayRef[HashRef]',
    via { [ map { __PACKAGE__->new($_) } @$_ ] };

coerce __PACKAGE__ . '::ArrayRef',
    from 'Undef',
    via { [ ] };

The second coercion is there ‘just in case’ we get the value c: null. Arguably an API that promises an array should return an empty array rather than null or undef. That argument could be one for the ages, with plenty of logic on the other side, and I have no proof either way. Regardless, this coercion turns the null into an empty array for you.

Now you can use an attribute structure like:

has 'c' => (
    is => 'ro',
    isa => 'MyA::C::ArrayRef',
    coerce => 1,
);
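As a quick check of the array case, here is a self-contained sketch that feeds both a populated array and a null through the attribute. JSON::PP is an assumed decoder; everything else follows the definitions above.

```perl
use strict;
use warnings;

package MyA::C;
use Moose;
use Moose::Util::TypeConstraints;

subtype __PACKAGE__ . '::ArrayRef',
    as 'ArrayRef[' . __PACKAGE__ . ']';

# Each hash in the incoming array becomes one MyA::C object.
coerce __PACKAGE__ . '::ArrayRef',
    from 'ArrayRef[HashRef]',
    via { [ map { __PACKAGE__->new($_) } @$_ ] };

# A null/undef value becomes an empty array.
coerce __PACKAGE__ . '::ArrayRef',
    from 'Undef',
    via { [] };

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

has 'c' => (
    is     => 'ro',
    isa    => 'MyA::C::ArrayRef',
    coerce => 1,
);

package main;
use JSON::PP qw(decode_json);

my $list = MyA->new(decode_json(
    '{"a":"1","b":"2","c":[{"c1":"3","c2":"4"},{"c1":"5","c2":"6"}]}'
));
my $empty = MyA->new(decode_json('{"a":"1","b":"2","c":null}'));

print scalar(@{ $list->c }), "\n";    # prints "2"
print $list->c->[1]->c1, "\n";        # prints "5"
print scalar(@{ $empty->c }), "\n";   # prints "0"
```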

For HashRef, it’s much the same:

subtype __PACKAGE__ . '::HashRef',
    as 'HashRef[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::HashRef',
    from 'HashRef[HashRef]',
    via {
        my $arg = $_;
        my %data = map { $_ => __PACKAGE__->new($arg->{$_}) } keys %$arg;
        \%data;
    };

coerce __PACKAGE__ . '::HashRef',
    from 'Undef',
    via { { } };
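The hash case can be exercised the same way. In this sketch the keys "first" and "second" are made up for illustration, JSON::PP is an assumed decoder, and the empty-hash coercion is written with an explicit return so the braces cannot be mistaken for a block.

```perl
use strict;
use warnings;

package MyA::C;
use Moose;
use Moose::Util::TypeConstraints;

subtype __PACKAGE__ . '::HashRef',
    as 'HashRef[' . __PACKAGE__ . ']';

# Keep the keys, and turn each value hash into a MyA::C object.
coerce __PACKAGE__ . '::HashRef',
    from 'HashRef[HashRef]',
    via {
        my $arg = $_;
        my %data = map { $_ => __PACKAGE__->new($arg->{$_}) } keys %$arg;
        \%data;
    };

# A null/undef value becomes an empty hash.
coerce __PACKAGE__ . '::HashRef',
    from 'Undef',
    via { return {} };

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

has 'c' => (
    is     => 'ro',
    isa    => 'MyA::C::HashRef',
    coerce => 1,
);

package main;
use JSON::PP qw(decode_json);

my $keyed = MyA->new(decode_json(
    '{"a":"1","b":"2","c":{"first":{"c1":"3","c2":"4"},"second":{"c1":"5","c2":"6"}}}'
));
my $empty = MyA->new(decode_json('{"a":"1","b":"2","c":null}'));

print $keyed->c->{second}->c2, "\n";        # prints "6"
print scalar(keys %{ $empty->c }), "\n";    # prints "0"
```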

Now, declaring these subtypes directly in the class can sometimes be problematic in Moose, but I do it quite often in my own code. There are only a few cases where the location of a subtype, or declaring the same subtype twice, causes issues.

The nice thing is, though, that I can just copy these coercions into any class and expect them to work. If I could find a way to shorthand these into a standard coercion set, I would. I’m often working with APIs that return deep documents, and I like for everything to have classes and definitions so that it is easy to tell what their structure is right from the beginning, and so that they are documented, if only in code.
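One possible shorthand, offered only as a sketch, is a small helper that registers the same set of subtypes and coercions for whatever class name it is given. The package name MyCoercions and its install function are hypothetical names of mine, not part of Moose, and I have not tried this beyond simple cases like the one shown.

```perl
use strict;
use warnings;

package MyCoercions;   # hypothetical helper, not part of Moose
use Moose::Util::TypeConstraints;

# Register the ::Maybe, ::ArrayRef and ::HashRef subtypes, plus their
# coercions, for $class. Call it after 'use Moose' in that class.
sub install {
    my ($class) = @_;

    coerce( $class, from('HashRef'), via { $class->new($_) } );

    subtype( $class . '::Maybe', as( 'Maybe[' . $class . ']' ) );
    coerce( $class . '::Maybe', from('HashRef'), via { $class->new($_) } );

    subtype( $class . '::ArrayRef', as( 'ArrayRef[' . $class . ']' ) );
    coerce( $class . '::ArrayRef',
        from('ArrayRef[HashRef]'),
        via { [ map { $class->new($_) } @$_ ] } );
    coerce( $class . '::ArrayRef', from('Undef'), via { [] } );

    subtype( $class . '::HashRef', as( 'HashRef[' . $class . ']' ) );
    coerce( $class . '::HashRef',
        from('HashRef[HashRef]'),
        via {
            my $in = $_;
            return { map { $_ => $class->new( $in->{$_} ) } keys %$in };
        } );
    coerce( $class . '::HashRef', from('Undef'), via { return {} } );

    return;
}

package MyA::C;
use Moose;

MyCoercions::install(__PACKAGE__);

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

has 'c' => (
    is     => 'ro',
    isa    => 'MyA::C::ArrayRef',
    coerce => 1,
);

package main;

my $t = MyA->new({ a => 1, b => 2, c => [ { c1 => 3, c2 => 4 } ] });
print ref($t->c->[0]), "\n";   # prints "MyA::C"
```

The trade-off is that the type names are no longer visible in the class that owns them, so a reader has to know what the helper registers.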

Also, writing it this way allows me to avoid using curly brackets everywhere.

my $t = JSON::decode_json($document);
$t->{c}->{c2};

# becomes

my $t = MyA->new(JSON::decode_json($document));
$t->c->c2;

Also, now the class can have all kinds of additional methods associated with it, which you can’t do when just pushing a HashRef all over the place. My highly opinionated side is going to tell everyone not to use HashRefs all over the place unless you are counting words; if you know the structure of the document, code it. If you are using Moose, then doing so is ridiculously simple. You don’t have to write constructors that do type checking, as Moose::Util::TypeConstraints can do all of that for you. You can teach the code how to deal with these edge cases on its own. You don’t need to parse the document to build objects; instead, just hand the entire document to the root class and let it deal with that nonsense.
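To make the point about additional methods concrete, here is a minimal sketch in which the nested class gains a method of its own. The method name total is invented for illustration, and JSON::PP is an assumed decoder.

```perl
use strict;
use warnings;

package MyA::C;
use Moose;
use Moose::Util::TypeConstraints;

coerce __PACKAGE__,
    from 'HashRef',
    via { __PACKAGE__->new($_) };

has [qw/c1 c2/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

# Hypothetical behaviour that now lives with the data it works on.
sub total {
    my ($self) = @_;
    return $self->c1 + $self->c2;
}

package MyA;
use Moose;

has [qw/a b/] => (
    is       => 'ro',
    isa      => 'Num',
    required => 1,
);

has 'c' => (
    is     => 'ro',
    isa    => 'MyA::C',
    coerce => 1,
);

package main;
use JSON::PP qw(decode_json);

my $t = MyA->new(decode_json('{"a":"1","b":"2","c":{"c1":"3","c2":"4"}}'));
print $t->c->total, "\n";   # prints "7"
```

With a raw HashRef, the same sum would have to be recomputed, or wrapped in a free-standing function, wherever it was needed.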

The entire code chunk that I used to test all of this looks more or less like the following:

package MyA::C;

use Moose;

use Moose::Util::TypeConstraints;

coerce __PACKAGE__,
    from 'HashRef',
    via { __PACKAGE__->new($_) };

subtype __PACKAGE__ . '::Maybe',
    as 'Maybe[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::Maybe',
    from 'HashRef',
    via { __PACKAGE__->new($_) };

subtype __PACKAGE__ . '::ArrayRef',
    as 'ArrayRef[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::ArrayRef',
    from 'ArrayRef[HashRef]',
    via { [ map { __PACKAGE__->new($_) } @$_ ] };

coerce __PACKAGE__ . '::ArrayRef',
    from 'Undef',
    via { [ ] };

subtype __PACKAGE__ . '::HashRef',
    as 'HashRef[' . __PACKAGE__ . ']';

coerce __PACKAGE__ . '::HashRef',
    from 'HashRef[HashRef]',
    via {
        my $arg = $_;
        my %data = map { $_ => __PACKAGE__->new($arg->{$_}) } keys %$arg;
        \%data;
    };

coerce __PACKAGE__ . '::HashRef',
    from 'Undef',
    via { { } };

has [qw/c1 c2/] => (
    is => 'ro',
    isa => 'Num',
    required => 1,
);

package MyA;

use Moose;

has [qw/a b/] => (
    is => 'ro',
    isa => 'Num',
    required => 1,
);

# This could be ::ArrayRef, ::HashRef, or ::Maybe based on needs.
has 'c' => (
    is => 'ro',
    isa => 'MyA::C::HashRef',
    coerce => 1,
);

package main;

use strict;
use warnings;

my $t = MyA->new(...);

Of course, including that many coercions in EVERY class is quite a bit of overkill. Try to include just what you need. The nice thing about this model (__PACKAGE__ . 'whatever') is that you can very rapidly prototype and understand a document, and quickly encapsulate functionality.
