Native queries for MongoDB

Once upon a time, object databases were a hot topic. They suffered from the same problem as modern databases: the query language was string based, which led to runtime errors after typos, code changes and so on. One of these databases experimented with native queries for persistent objects in Java and C#. Native queries were much easier to work with, since they were written in the same language as the rest of the code and benefitted from type checking, code completion etc.

Now fast forward to today. There are new languages and document databases, such as MongoDB, with the same inconvenience — queries are usually either string based or general dictionaries (such as JSON or BSON). In this post I'll try to implement native queries for MongoDB. I'll use Julia because it has nice macros. A similar solution would work for Rust (via macros) and Go (via its built-in code generation support).

We first define the type for persistent objects:

Base.@kwdef struct Person
    name::String
    age::Int
end

Instances of Person will be converted into BSON when storing them and back from BSON when retrieving them.
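
The conversion code is not shown in this post, so here is a minimal sketch of the round trip under a simplifying assumption: a plain Dict stands in for the BSON document, so the example runs without a database driver. The helper names todict and fromdict are hypothetical and only illustrate the idea of mapping field names to document keys.

```julia
Base.@kwdef struct Person
    name::String
    age::Int
end

# Hypothetical helper: turn any struct into a field-name => value dictionary.
function todict(obj)
    return Dict{String,Any}(String(f) => getfield(obj, f) for f in fieldnames(typeof(obj)))
end

# Hypothetical helper: rebuild an object of type T from such a dictionary,
# passing the values to the constructor in field order.
function fromdict(doc::AbstractDict, ::Type{T}) where {T}
    return T((doc[String(f)] for f in fieldnames(T))...)
end

doc = todict(Person(name = "John", age = 30))
john = fromdict(doc, Person)
println(doc)
println(john)
```

A real implementation would also have to deal with nested objects, optional fields and the document's _id, which this sketch ignores.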

Let's now assume we have a MongoDB collection in the collection variable. We want to be able to use something like:

for obj in run(
        collection,
        @query p::Person -> p.name == "John" && p.age >= 30
    )
    println(obj)
end

The query looks like a usual lambda expression in Julia, but of course this can't be passed to the MongoDB server. Under the hood the @query macro translates the lambda expression into a BSON document. More specifically, it creates an instance of the following structure:

struct Query
    doc::Mongoc.BSON
    type::Type
    fn::Function
end

The doc field contains the corresponding BSON document, the type field contains the type of the objects we want to fetch and the fn field is the original lambda expression.

For example, in the above code the lambda expression is translated into the following BSON query:

{ "name" : "John", "age" : { "$gte" : 30 } }

The retrieved documents are then converted into instances of Person (the type is taken from p::Person) and an array of these instances is returned.

Under the hood

The macro is defined as follows:

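macro query(fn)
    # The argument must be an anonymous function such as `p::Person -> ...`.
    if !Meta.isexpr(fn, :(->))
        error("query must be an anonymous function")
    end
    sig = fn.args[1]
    if !Meta.isexpr(sig, :(::))
        error("query function must take one typed argument")
    end
    var = sig.args[1]   # argument name, e.g. :p
    type = sig.args[2]  # argument type, e.g. :Person
    body = fn.args[2]
    if !Meta.isexpr(body, :block)
        error("query function body must be a block")
    end
    # body.args[1] is a line number node; the predicate itself is body.args[2].
    doc = queryfromast(body.args[2], var)
    quote
        # Escape the user's type and lambda so they resolve at the call site.
        Query(Mongoc.BSON($doc), $(esc(type)), $(esc(fn)))
    end
end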

The queryfromast function takes an expression (here, the body of the anonymous function) and the name of the lambda's argument, and returns an Expr (an abstract syntax tree) that, when evaluated, yields the attribute-value pairs of the BSON query.
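
Since queryfromast itself is not listed here, the following is only a sketch of how such a function could look, not the actual implementation. It assumes that the generated expression builds a Dict{String,Any} (which the macro would then hand to Mongoc.BSON), and it handles only &&, == and the basic comparison operators on fields of the lambda's argument. The OPERATORS table is a hypothetical name. Anything the sketch does not understand becomes an empty filter, which matches every document.

```julia
# Hypothetical mapping from Julia comparison operators to MongoDB query operators.
const OPERATORS = Dict(:(>) => "\$gt", :(>=) => "\$gte",
                       :(<) => "\$lt", :(<=) => "\$lte", :(!=) => "\$ne")

# Translate the predicate `ex` over the variable `var` into an expression
# that builds a Dict with the query's attribute-value pairs.
function queryfromast(ex, var::Symbol)
    if Meta.isexpr(ex, :&&)
        # Conjunction: translate both sides and merge the resulting filters.
        lhs = queryfromast(ex.args[1], var)
        rhs = queryfromast(ex.args[2], var)
        return :(merge($lhs, $rhs))
    elseif Meta.isexpr(ex, :call) && length(ex.args) == 3
        op, field, value = ex.args
        # Only comparisons of the form `var.field OP value` are translated.
        if Meta.isexpr(field, :.) && field.args[1] == var
            name = String(field.args[2].value)
            if op == :(==)
                return :(Dict{String,Any}($name => $value))
            elseif haskey(OPERATORS, op)
                return :(Dict{String,Any}($name => Dict{String,Any}($(OPERATORS[op]) => $value)))
            end
        end
    end
    # Fallback: an empty filter, which is less specific than the predicate.
    return :(Dict{String,Any}())
end

ast = queryfromast(:(p.name == "John" && p.age >= 30), :p)
filterdoc = eval(ast)
println(filterdoc)
```

Note that merge keeps only one condition per field, so two conditions on the same field collapse into one. The resulting filter is then broader than the predicate, never narrower, so a later check against the original lambda still gives correct results.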

The run function is relatively simple:

function run(collection::Mongoc.Collection, query::Query)
    println("running with query BSON: $(query.doc)")
    objects = Vector{query.type}()
    # The server applies the (possibly less specific) BSON filter ...
    for doc in Mongoc.find(collection, query.doc)
        obj = objectfrombson(doc, query.type)
        # ... and the original lambda decides whether the object really matches.
        if query.fn(obj)
            push!(objects, obj)
        end
    end
    return objects
end

Note that the lambda expression (stored in query.fn) is evaluated for every fetched object. This is because the BSON query might be less specific than the lambda expression.
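
To illustrate why the second check matters, here is a small sketch that runs without MongoDB. It assumes a predicate in which only the first half can be expressed as a filter, and it uses a vector of Dicts plus a hypothetical matches function as a stand-in for the server, supporting equality on top-level fields only.

```julia
struct Person
    name::String
    age::Int
end

# Stand-in for the server: a document matches if every filter field is equal.
function matches(doc, filterdoc)
    return all(doc[field] == value for (field, value) in filterdoc)
end

docs = [
    Dict("name" => "John", "age" => 35),
    Dict("name" => "John", "age" => 40),
    Dict("name" => "Jane", "age" => 41),
]

# Predicate: p.name == "John" && isodd(p.age).
# Assume only the first half could be translated into a filter.
filterdoc = Dict{String,Any}("name" => "John")
fn = p -> p.name == "John" && isodd(p.age)

# Stage 1: the "server" returns a superset of the wanted objects.
candidates = [Person(d["name"], d["age"]) for d in docs if matches(d, filterdoc)]
# Stage 2: the original lambda removes the false positives.
result = [p for p in candidates if fn(p)]

println(length(candidates), " candidates, ", length(result), " result")
```

The filter only has to be at least as broad as the predicate: the database narrows the candidates down cheaply, and the lambda guarantees that the final result is exact.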

Written by Petr Homola

Studied physics & CS; PhD in NLP; interested in AI, HPC & PLT
