Document Transformations in Marten with Javascript

In all likelihood, Marten would garner much more rapid adoption if we were able to build on top of Sql Server 2016 instead of Postgresql. Hopefully, .Net folks will be willing to try switching databases after they see how many helpful capabilities that Postgresql has that Sql Server can’t match yet. This blog post, yet one more stop along the way to Marten 1.0, demonstrates how we’re taking advantage of Postgresql’s built in Javascript engine (PLV8). 

A Common .Net Approach to “Readside” Views

Let’s say that you’re building HTTP services, and some of your HTTP endpoints will need to return some sort of “readside” representation of your persisted domain model. For the purpose of making Marten shine, let’s say that you’re going to need to work with hierarchical data. In a common .Net technology stack, you’d:

  1. Load the top level model object through Entity Framework or some other kind of ORM. EF would issue a convoluted SQL query with lots of OUTER JOIN’s so that it can make a single call to the database to fetch the entire hierarchy of data you need from various tables. EF would then proceed to iterate through the sparsely populated recordset coming back and turn that into the actual domain model object represented by the data with lots of internally generated code.
  2. You’d then use something like AutoMapper to transform the domain model object into a “read side” Data Transfer Object (view models, etc.) that’s more suitable to going over the wire to clients outside of your service.
  3. Serialize your DTO to a JSON string and write that out to the HTTP response

Depending on how deep your hierarchy is, #1 can be expensive in the database query. The serialization in #3 is also somewhat CPU intensive.

As a contrast, here’s an example of how you might approach that exact same use case with Marten:

    var json = session.Query<User>()
        .Where(x => x.Id == user.Id)
        .TransformToJson("get_fullname").Single();

In the usage above, I’m retrieving the data for a single User document from Marten and having Postgresql transform the persisted JSON data to the format I need for the client with a pre-loaded Javascript transformation. In the case of Marten, the workflow is to:

  1. Find the entire hierarchical document JSON in a single database row by its primary key
  2. Apply a Javascript function to transform the persisted JSON to the format that the client needs and return a JSON representation as a String
  3. Stream the JSON from the Linq query directly to the HTTP response without any additional serialization work

Not to belabor the point too much, but the Marten mechanics are simpler and probably much more efficient at runtime because:

  • The underlying database query is much simpler if all the data is in one field in one row
  • The Javascript transformation probably isn’t that much faster or slower than the equivalent AutoMapper mechanics, so let’s call that a wash
  • You don’t have the in memory allocations to load a rich model object just to immediately transform that into a completely different model object
  • You avoid the now unnecessary cost of serializing the DTO view models to a JSON string

A couple additional points:

  • Jimmy Bogard reviewed this and pointed out that in some cases you could bypass the Domain Model to DTO transformation by selecting straight to the DTO, but that wouldn’t cover all cases by any means. The same limitations apply to Marten and its Select() transformation features.
  • To get even more efficient in your Marten usage, the Javascript transformations can be used inside of Marten’s Compiled Query feature to avoid the CPU cost of repetitively parsing Linq statements. You can also do Javascript transformations inside of batched queries – which can of course, also be combined with the aforementioned compiled queries;)

 

Now, let’s see how it all works…

 

Building the Javascript Function

The way this works in Marten is that you write your Javascript function into a single file and export the main function with the “module.exports = ” CommonJS syntax. Marten is expecting the main function to have the signature “function(doc)” and return the transformed document.

Here’s a sample Javascript function I used to test this feature that works against a User document type:

module.exports = function(doc) {
    return {fullname: doc.FirstName + ' ' + doc.LastName};
}

Given the persisted JSON for a User document, this transformation would return a different object that would then be streamed back to the client as a JSON string.

There is some thought and even infrastructure for doing Javascript transformations with multiple, related documents, but that feature won’t make it into Marten 1.0.

To load the function into a Javascript-enabled Postgresql schema, Marten exposes this method:

    var store = DocumentStore.For(_ =>
    {
        _.Connection(ConnectionSource.ConnectionString);

        // Let Marten derive the transform name
        // from the file name
        _.Transforms.LoadFile("get_fullname.js");

        // or override the transform name
        _.Transforms.LoadFile("get_fullname.js", "fullname");
    });

Internally, Marten will wrap a PLV8 function wrapper around your Javascript function like this:

CREATE OR REPLACE FUNCTION public.mt_transform_get_fullname(doc jsonb)
  RETURNS jsonb AS
$BODY$

  var module = {export: {}};

module.exports = function (doc) {
    return {fullname: doc.FirstName + ' ' + doc.LastName};
}

  var func = module.exports;

  return func(doc);

$BODY$
  LANGUAGE plv8 IMMUTABLE STRICT;


My intention with the approach shown above was to allow users to write simple Javascript functions and be able to test their transformations in simple test harnesses like Mocha. By having Marten wrap the raw Javascript in a generated PLV8 function, users won’t have to be down in the weeds and worrying about Postgresql mechanics.

Depending on the configuration, Marten is good enough to build, or rebuild, the function to match the current version of the Javascript code on the first usage of that transformation. The Javascript transforms are also part of our schema management support for database migrations.

 

Transformations

The persisted JSON documents in Marten are a reflection of your .Net classes. Great, that makes it absurdly easy to keep the database schema in synch with your application code at development time — especially compared to the typical development process against a relational database. However, what happens when you really do need to make breaking changes or additions to a document type but you already have loads of persisted documents in your Marten database with the old structure?

To that end, Marten allows you to use Javascript functions to alter the existing documents in the database. As an example, let’s go back to the User document type and assume for some crazy reason that we didn’t immediately issue a user name to some subset of users. As a default, we might just assign their user names by combining their first and last names like so:

module.exports = function (doc) {
    doc.UserName = (doc.FirstName + '.' + doc.LastName).toLowerCase();

    return doc;
}

To apply this transformation to existing rows in the database, Marten exposes this syntax:

    var store = DocumentStore.For(_ =>
    {
        _.Connection(ConnectionSource.ConnectionString);

        _.Transforms.LoadFile("default_username.js");
    });

    store.Transform
        .Where<User>("default_username", x => x.UserName == null);

When you run the code above, Marten will issue a single SQL statement that issues an UPDATE to the rows matching the given criteria by applying the Javascript function above to alter the existing document. No data is ever fetched or processed in the actual application tier.

One thought on “Document Transformations in Marten with Javascript

Leave a comment