JSON schema and code generation

One of our goals is to build a communication platform for devices to connect to the cloud. To be of any use, it must be language-agnostic and easy to use. To achieve that, we decided to use JSON schema to describe data structures and to generate stubs for particular programming languages from those descriptions.

JSON schema itself is a nice way to describe what you want your piece of JSON to look like. It supports a wide variety of constraints that you can use to distinguish good values from bad ones.

When it comes to generating code for a statically typed language, the most important thing to know about a value is its type. JSON schema lets you specify that with the "type" keyword, which can take one of the values "null" (not very useful), "string", "number", "integer", "boolean", "array" or "object". For objects, you can specify a schema for each property and generate structure types recursively. For arrays, the two most common cases are a list, where all items are of the same type, and a tuple, where there is a fixed number of items, each with its own type.
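As a minimal sketch of that first step, the scalar values of "type" can be mapped to Go types with a simple switch. The function name goTypeFor is hypothetical, and the choice of int64 and float64 simply follows the structs shown later in this post; a real generator may choose differently.

```go
package main

import "fmt"

// goTypeFor maps a JSON schema "type" value to a Go type name.
// The mapping is an illustrative assumption, not a fixed rule.
func goTypeFor(schemaType string) (string, error) {
	switch schemaType {
	case "string":
		return "string", nil
	case "number":
		return "float64", nil
	case "integer":
		return "int64", nil
	case "boolean":
		return "bool", nil
	case "null", "array", "object":
		// These need more context (item schemas, properties) to map.
		return "", fmt.Errorf("type %q needs structural handling", schemaType)
	default:
		return "", fmt.Errorf("unknown type %q", schemaType)
	}
}

func main() {
	for _, t := range []string{"string", "number", "integer", "boolean"} {
		g, _ := goTypeFor(t)
		fmt.Printf("%s -> %s\n", t, g)
	}
}
```

Objects and arrays are deliberately left out here, since they depend on the nested schemas discussed below.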

We're using Go for our server-side software, so this post will focus on that, but the same approach translates easily to other languages.

Let's get straight to examples. A simple schema might look like this:

{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "quantity": {"type": "integer"}
    }
}

It's quite straightforward to generate a struct type for Go (or any language that has struct types):

type MyStruct struct {
	Name     string
	Quantity int64
}

Add a bit of boilerplate to transform it to JSON and back, and it's already a big step forward from messing with raw JSON to the beautiful world of static type checks. Now the programmer only needs to care about native data types, with the compiler doing the type checking, while all the code that works with JSON is generated, reducing the likelihood of bugs and typos.
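To give an idea of what that boilerplate could amount to in Go, here is a sketch that relies on the standard encoding/json package and struct tags. The tags and the helper decodeMyStruct are assumptions made for illustration; the code our generator actually emits is not shown in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyStruct mirrors the schema above; the json tags are an assumption
// about what generated boilerplate might look like.
type MyStruct struct {
	Name     string `json:"name"`
	Quantity int64  `json:"quantity"`
}

// decodeMyStruct parses raw JSON into a MyStruct.
func decodeMyStruct(data []byte) (MyStruct, error) {
	var v MyStruct
	err := json.Unmarshal(data, &v)
	return v, err
}

func main() {
	v, err := decodeMyStruct([]byte(`{"name": "widget", "quantity": 3}`))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	v.Quantity++ // plain, type-checked field access
	out, _ := json.Marshal(v)
	fmt.Println(string(out))
}
```

A value of the wrong type, such as "quantity": "three", is rejected at decoding time instead of surfacing later in application code.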

If you have something a bit more complex, like nested objects, you can apply the same algorithm recursively:

{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tag": {
            "type": "object",
            "properties": {
                "key": {"type": "string"},
                "value": {"type": "string"}
            }
        }
    }
}

type MyStruct struct {
	Name string
	Tag  MyStructTag
}

type MyStructTag struct {
	Key   string
	Value string
}
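The recursion itself can be sketched in a few dozen lines. Everything here is hypothetical: the Schema type is a pared-down stand-in for a parsed schema, and naming a nested struct by appending the field name to the parent type name is an assumption inferred from the MyStructTag example above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Schema is a hypothetical, minimal view of a JSON schema node.
type Schema struct {
	Type       string
	Properties map[string]*Schema
}

// exported turns a property name such as "tag" into "Tag".
func exported(name string) string {
	if name == "" {
		return name
	}
	return strings.ToUpper(name[:1]) + name[1:]
}

// generate returns Go source for the struct named typeName, followed by
// structs for any nested objects, named by appending the field name.
func generate(typeName string, s *Schema) string {
	var body, nested strings.Builder
	names := make([]string, 0, len(s.Properties))
	for name := range s.Properties {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	fmt.Fprintf(&body, "type %s struct {\n", typeName)
	for _, name := range names {
		prop := s.Properties[name]
		field := exported(name)
		var goType string
		switch prop.Type {
		case "string":
			goType = "string"
		case "integer":
			goType = "int64"
		case "number":
			goType = "float64"
		case "boolean":
			goType = "bool"
		case "object":
			goType = typeName + field
			nested.WriteString(generate(goType, prop)) // recurse into the nested object
		default:
			goType = "interface{}" // not handled in this sketch
		}
		fmt.Fprintf(&body, "\t%s %s\n", field, goType)
	}
	body.WriteString("}\n")
	return body.String() + nested.String()
}

func main() {
	schema := &Schema{Type: "object", Properties: map[string]*Schema{
		"name": {Type: "string"},
		"tag": {Type: "object", Properties: map[string]*Schema{
			"key":   {Type: "string"},
			"value": {Type: "string"},
		}},
	}}
	fmt.Print(generate("MyStruct", schema))
}
```

For the nested schema above, this prints the two struct declarations (without gofmt-style column alignment). Properties are emitted in sorted order, which happens to match the schema here but would not in general.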

Lists are also straightforward if the target language has arrays. What is more interesting is how to represent tuples. Let's take a look at the schema first:

{
    "type": "array",
    "items": [
        {"type": "string"},
        {"type": "number"}
    ],
    "additionalItems": false,
    "minItems": 2
}

This schema declares that the value:

"type": must be an array;
"items": must have a string as its first item and a number as its second, if those items are present;
"additionalItems": must have no items beyond the ones described in "items";
"minItems": must have at least 2 items.

In short, the value must be a 2-tuple consisting of a string and a number.

Few popular languages have both native tuples and static type checks, and Go is not one of them, so we must reinvent the wheel here:

type MyStruct struct {
	First  string
	Second float64
}

A bit awkward, but it does the trick. Our code generator allows up to five elements in a tuple; beyond that, the confusion about which value goes where outweighs the simplicity of the JSON representation, and switching to an object lets you give each item a meaningful name.

JSON schema defines quite a few keywords, and supporting all of them in a code generator is impractical, as constraints can be quite complex and involve regular expressions or cross-checks between multiple properties of an object. This means that the actual input language supported by the code generator will be a subset of JSON schema, and you'd better document which keywords are accounted for. It's still useful to allow all keywords to be used, for a few reasons:

it gives the reader a better idea of what kind of data is expected, in addition to the human-readable documentation;
you can still perform proper validation before copying JSON data into native data structures, even if some of the constraints are not easily expressible in the target language;
once you add support for a keyword, there's no need to go over all the schemas to check whether it makes sense to use it: where it does, it will already be there.
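One lightweight way to keep track of the supported subset is to have the generator report the keywords it accepts but does not act on. The sketch below is hypothetical: the supported set and the function ignoredKeywords are made up for illustration, and only the top level of a schema is inspected to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// supported lists the keywords this hypothetical generator acts on.
var supported = map[string]bool{
	"type": true, "properties": true, "items": true,
	"additionalItems": true, "minItems": true,
}

// ignoredKeywords returns the sorted top-level keywords of a schema
// that the generator would accept but not act on.
func ignoredKeywords(schema []byte) ([]string, error) {
	var node map[string]json.RawMessage
	if err := json.Unmarshal(schema, &node); err != nil {
		return nil, err
	}
	var ignored []string
	for key := range node {
		if !supported[key] {
			ignored = append(ignored, key)
		}
	}
	sort.Strings(ignored)
	return ignored, nil
}

func main() {
	schema := []byte(`{"type": "string", "pattern": "^[a-z]+$", "maxLength": 16}`)
	ignored, err := ignoredKeywords(schema)
	if err != nil {
		fmt.Println("invalid schema:", err)
		return
	}
	fmt.Println("accepted but not used for code generation:", ignored)
}
```

The schema stays valid and usable for validation; the report only tells you which constraints the generated types do not enforce.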

In summary, JSON schema is great for describing data structures represented as JSON, and the "type" keyword is the most valuable piece of it for code generation.

