Chapter 4. JSON Schema

In Chapter 3, we covered the JSON data types and discussed why data types are important and useful. Knowing ahead of time what something is and what it is for (remember the kid with the hammer) makes a world of difference.

In most scenarios with data interchange formats, the data is being created to send across the Internet or a network to another party. That party usually has a desired format for the document that they are expecting, including structure and data types. They will usually provide documentation that explains the format and provides examples.

Even when the most detailed, beautiful documentation is provided, it is not difficult to create errors in your data. To be clear, these aren’t syntax errors we are talking about here. These are errors of misunderstanding, like “I sent an apple, and you were expecting an orange.” In this book, I will refer to checking for this type of error as conformity validation so that it may be distinguished from syntax validation.

In this scenario, the process usually plays out in the following steps:

  1. You are finished creating your data and you feel confident.
  2. You send your data across the Internet to the other party. Your Internet connection is slow today, and the data file you are sending is huge, so it takes several minutes.
  3. You get an error response because your data was not formatted how they were expecting it. Confidence deflated. If you’re lucky, the error response will tell you something meaningful, like what you did wrong.
  4. You pore over their documentation, find what you think you did wrong, fix it, and start back at step 1.

This scenario has existed with data interchange since before JSON existed. Fortunately, the people of the technology industry are problem solvers, and the concept of the schema was born.

Contracts with Validation Magic

In the real world, we often use contracts between two parties where the outcome is important. When I sign a contract that says I will complete a project for someone, the details are outlined in that contract. I agree that I will deliver the spaceship by August 31st, and that the final product will be a fully functional spaceship with life support, lasers, and three engines.

Imagine now that we live in a world of wizards and magic. When the company I’m doing the project for handed me the contract, they added a bit of magic. At any time, I can tap my wand on the contract and it will tell me whether I’ve met my end of the bargain. I’d never have to walk into the meeting to proclaim “I’m done!” and be met with the embarrassing response of “What about the third engine you promised to put on the spaceship? Where is it?” At any time I can verify that I am really done with the project and walk into the meeting with confidence.

A data interchange schema is much like that imagined world of wizards and magic. Before we send our data, we can at any time validate it for conformity with the schema and find out whether our data is acceptable. When we are interchanging data with a schema, the process is quite different from our scenario without the schema:

  1. You validate the conformity of your data with your schema and fix any errors found. You are usually given useful information about the errors.
  2. You are finished creating your data and you feel confident.
  3. You send your data across the Internet and you get a success response. Mission complete.

Additionally, the JSON schema can be used on the other end of the transaction by the party that is accepting the data. A JSON schema can be a first line of defense in accepting data, to verify that the data conforms. It can answer all of these questions before the data is processed:

Are the data types of the values correct?
We can specify that a value has to be a number, string, etc.
Does this include the required data?
We can specify what data is required, and what is not.
Are the values in the format that I require?
We can specify ranges, minimum and maximum.
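
As a rough sketch of this first line of defense, assuming Python and only its standard library, the receiving party might parse the incoming text first (syntax validation) and then check it for conformity before doing any processing. The `EXPECTED_TYPES` table, the property names in it, and the `accept` function are hypothetical stand-ins for a real schema and validator; they are not part of JSON Schema itself.

```python
import json

# Hypothetical expectations for incoming data: property name -> Python type.
# A real JSON Schema expresses this in JSON; this table is only a stand-in.
EXPECTED_TYPES = {"username": str, "subscribed": bool}


def accept(raw_text):
    """Return (accepted, reason) for a piece of incoming JSON text."""
    try:
        data = json.loads(raw_text)  # syntax validation
    except json.JSONDecodeError:
        return False, "syntax error"
    if not isinstance(data, dict):
        return False, "expected a JSON object"
    for name, expected in EXPECTED_TYPES.items():  # conformity validation
        if name not in data:
            return False, f"missing {name}"
        if not isinstance(data[name], expected):
            return False, f"wrong type for {name}"
    return True, "ok"
```

Text that is valid JSON can still be rejected here, which is the distinction between syntax validation and conformity validation.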

Introduction to JSON Schema

While JSON is fairly mature, JSON Schema is still under development. As of April 2015, JSON Schema is in draft 4. This doesn’t mean that you shouldn’t use JSON Schema—it just means it’s still evolving to better serve the world.

A JSON Schema is written with JSON, so reading or writing one is only a few steps away. In the very first name-value pair of our JSON, we declare it as a schema document (Example 4-1).

Example 4-1. The name for this declaration will always be “$schema,” and the value will always be the link for the draft version
{
    "$schema": "http://json-schema.org/draft-04/schema#"
}

The second name-value pair in our JSON Schema Document will be the title (see Example 4-2).

Example 4-2. Format for a document that represents a cat
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Cat"
}

In the third name-value pair of our JSON Schema Document, we will define the properties that we want to be included in the JSON. The "properties" value is essentially a skeleton of the name-value pairs of the JSON we want. Instead of a literal value, we have an object that defines the data type, and optionally the description (Example 4-3).

Example 4-3. Defining the properties for a cat
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": {
      "type": "string"
    },
    "age": {
      "type": "number",
      "description": "Your cat's age in years."
    },
    "declawed": {
      "type": "boolean"
    }
  }
}

We can then validate that our JSON conforms to the JSON Schema (Example 4-4).

Example 4-4. This JSON conforms to our JSON Schema for “Cat”
{
  "name": "Fluffy",
  "age": 2,
  "declawed": false
}
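
To make the type check concrete, here is a minimal sketch in Python (standard library only) of what a validator does for the "type" entries of Example 4-3. It is not a JSON Schema implementation: the `JSON_TYPES` table and the `type_errors` function are names invented for this illustration, and only the three types used by the cat schema are handled.

```python
# Map JSON Schema type names to the Python types produced by json.loads.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
}

# The "properties" part of Example 4-3, written as a Python dictionary.
CAT_SCHEMA = {
    "title": "Cat",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "declawed": {"type": "boolean"},
    },
}


def type_errors(schema, document):
    """List the names of properties whose value has the wrong data type."""
    errors = []
    for name, rules in schema["properties"].items():
        if name not in document:
            continue  # an absent property is not a type error
        value = document[name]
        expected = JSON_TYPES[rules["type"]]
        # In Python, bool is a subclass of int, so exclude it from "number".
        if rules["type"] == "number" and isinstance(value, bool):
            errors.append(name)
        elif not isinstance(value, expected):
            errors.append(name)
    return errors
```

Calling `type_errors(CAT_SCHEMA, {"name": "Fluffy", "age": 2, "declawed": False})` returns an empty list, because every value has the declared type.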

Earlier I stated that a JSON Schema can answer the following questions:

Are the data types of the values correct?
We can specify that a value has to be a number, string, etc.
Does this include the required data?
We can specify what data is required, and what is not.
Are the values in the format that I require?
We can specify ranges, minimum and maximum.

With the very simple cat example, the first question was answered. We were able to validate that the JSON for the cat “Fluffy” has the correct data types for the values of name, age, and declawed. Let’s answer the second question: does this include the required data?

When we ask for data, there are often properties (or fields) that we must have values for, and others that are optional. For example, when I create a new account on a shopping website, I need to complete a shipping address form. That address form requires my name, street, city, state, and zip code. Optionally, I can include a company name, apartment number, and a second line for a street address. If I leave out one of the required fields, I cannot move forward with the account creation.

To achieve this required logic in the JSON schema, we add a fourth name-value pair after "$schema", "title", and "properties". This name-value pair has the name "required" and a value of the array data type. The array includes the fields we require.

In Example 4-5, we first add another field for "description". Next, we add a fourth name-value pair, "required", whose value is an array of the required names. "name", "age", and "declawed" are required, so we add them to this array. We leave out "description" because it’s not required.

Example 4-5. Defining the required fields
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": {
      "type": "string"
    },
    "age": {
      "type": "number",
      "description": "Your cat's age in years."
    },
    "declawed": {
      "type": "boolean"
    },
    "description": {
      "type": "string"
    }
  },
  "required": [
    "name",
    "age",
    "declawed"
  ]
}

With the addition of "required" to our JSON schema, the JSON in Example 4-6 is valid. This JSON conforms to our JSON schema for "Cat" with the required fields of "name", "age", and "declawed". We are including the optional name-value pair, "description".

Example 4-6. Valid JSON
{
  "name": "Fluffy",
  "age": 2,
  "declawed": false,
  "description" : "Fluffy loves to sleep all day."
}

We may also leave out the "description" field, as it’s not included in the list of required fields. The JSON in Example 4-7 conforms to our JSON Schema for "Cat" with the required fields of "name", "age", and "declawed".

Example 4-7. Valid JSON without the “description” field
{
  "name": "Fluffy",
  "age": 2,
  "declawed": false
}

It is important to note that if you do not include the "required" name-value pair in your JSON schema with the array of required names, then nothing is required. A JSON object with no name-value pairs inside it would be considered valid. Without the array of "required", the JSON in Example 4-8 is considered valid for the "Cat" JSON Schema.

Example 4-8. Valid JSON
{}
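
The effect of "required" can be sketched in a few lines of Python (standard library only). The `missing_required` function is a hypothetical helper written for this illustration, not part of any JSON Schema library; it treats a schema without a "required" array as requiring nothing, as described above.

```python
def missing_required(schema, document):
    """Return the required names that are absent from the document."""
    # With no "required" array in the schema, nothing is required.
    return [name for name in schema.get("required", []) if name not in document]


# The relevant part of Example 4-5 as a Python dictionary.
CAT_WITH_REQUIRED = {
    "title": "Cat",
    "required": ["name", "age", "declawed"],
}

# The same schema with the "required" array left out.
CAT_WITHOUT_REQUIRED = {"title": "Cat"}
```

With `CAT_WITH_REQUIRED`, the empty object of Example 4-8 is missing all three names, while with `CAT_WITHOUT_REQUIRED` nothing is reported as missing.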

The third and final question we can answer with our JSON schema is: are the values in the format I require? We answered the question about the data types of our values, but we often need a specific format for the type. For example, I require a username, but the username should not exceed 20 characters. Additionally, I might ask you to think of a number between 10 and 100. We can express these specific requirements in our JSON schema.

In the cat JSON, we have requirements such as name being a string and age being a number. However, we do not want someone giving us data with a really long cat name, a really short cat name, or a negative number for the cat’s age. In our JSON schema, we can define a minimum length and a maximum length for a string, and a minimum for a number.

In Example 4-9, validation has been added to ensure that the cat’s name is a minimum of 3 characters and a maximum of 20 characters. Additionally, we ensure that the age of the cat submitted is not a negative number.

Example 4-9. Validating the cat JSON
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 3,
      "maxLength" : 20
    },
    "age": {
      "type": "number",
      "description": "Your cat's age in years.",
      "minimum" : 0
    },
    "declawed": {
      "type": "boolean"
    },
    "description": {
      "type": "string"
    }
  },
  "required": [
    "name",
    "age",
    "declawed"
  ]
}

The JSON in Example 4-10 is not valid with the "Cat" JSON Schema because the name value exceeds the "maxLength", and the age value is less than the "minimum".

Example 4-10. Invalid JSON
{
  "name": "Fluffy the greatest cat in the whole wide world",
  "age": -2,
  "declawed": false,
  "description" : "Fluffy loves to sleep all day."
}

The JSON in Example 4-11 is valid with the cat JSON Schema and conforms to the requirements for the values.

Example 4-11. This JSON is valid
{
  "name": "Fluffy",
  "age": 2,
  "declawed": false,
  "description" : "Fluffy loves to sleep all day."
}
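
The length and minimum checks from Example 4-9 can be sketched in Python (standard library only) as follows. This is a simplified illustration rather than a full validator: `CAT_RULES` copies only the "name" and "age" rules from the schema, and `constraint_errors` is a hypothetical function that reports each failure as a pair of property name and failed keyword.

```python
# The "name" and "age" rules from Example 4-9 as a Python dictionary.
CAT_RULES = {
    "name": {"type": "string", "minLength": 3, "maxLength": 20},
    "age": {"type": "number", "minimum": 0},
}


def constraint_errors(properties, document):
    """Return (property name, failed keyword) pairs for the document."""
    errors = []
    for name, rules in properties.items():
        if name not in document:
            continue  # absence is a matter for "required", not for ranges
        value = document[name]
        if isinstance(value, str):
            if "minLength" in rules and len(value) < rules["minLength"]:
                errors.append((name, "minLength"))
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append((name, "maxLength"))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            # "minimum" is inclusive: a value equal to it is accepted.
            if "minimum" in rules and value < rules["minimum"]:
                errors.append((name, "minimum"))
    return errors
```

For the JSON in Example 4-10 this reports a "maxLength" failure for name and a "minimum" failure for age; for Example 4-11 it reports nothing.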

If we return to the comparison of a schema to a contract, you can see that the details of our contract can be very specific. The examples provided in this chapter are introductory and just the tip of the iceberg. JSON Schema even supports regular expressions (character patterns, such as an email address format) and enum (a list of possible values). If you wish to become a master of JSON Schema, visit the following pages, where you can find links to the specifications:

There is a long and growing list of JSON Schema libraries and projects for specific programming languages and frameworks. A quick Google search of “JSON Schema Validation [insert programming language name here]” should get you what you need if you’d like to integrate JSON Schema validation into a project. Additionally, there are a few online validators, which are programming language agnostic and great for experimenting with JSON Schema:

If I go to the JSON Schema Lint website, I will be presented with two text areas: one for the JSON schema, and another for the JSON document to be validated. If I paste in the schema from Example 4-9 and the JSON from Example 4-10, I will see the following errors:

  • Field: data.name, Error: has longer length than allowed , Value: “Fluffy the greatest cat in the whole wide world”
  • Field: data.age, Error: is less than minimum, Value: -2

If I go to the JSON Schema Validator website, I am also presented with the same two text areas. Once again, if I paste in the schema from Example 4-9 and the JSON from Example 4-10, I will see errors. Additionally, the line numbers of the JSON will display a red x, showing us where the errors are in the JSON.

  • Message: String ‘Fluffy the greatest cat in the whole wide world’ exceeds maximum length of 20, Schema Path: #/properties/name/maxLength
  • Message: Integer -2 is less than minimum value of 0, Schema Path: #/properties/age/minimum

The JSON Schema Validator not only points us to the line numbers where the error takes place, but also gives us the paths to the schema requirements that are causing the validation to fail. The two validators may have described the errors a bit differently, but both found the same errors.

Key Terms and Concepts

This chapter covered the following key term:

JSON Schema
A virtual contract for data interchange.

We also discussed these key concepts:

  • A JSON validator provides syntax validation, while JSON Schema provides conformity validation.
  • JSON Schema can serve as a first line of defense for the party accepting data, or as a time-saving (and sanity-saving) tool for the party providing the data, ensuring that their data conforms to what is accepted.
  • A JSON Schema can answer the following three questions for conformity validation:
    Are the data types of the values correct?
    We can specify that a value has to be a number, string, etc.
    Does this include the required data?
    We can specify what data is required, and what is not.
    Are the values in the format that I require?
    We can specify ranges, minimum and maximum.
