Learn JSON Schema
JSON Schema specification
The JSON Schema specification is in DRAFT status in the IETF. However, it is widely used today and is practically considered a de facto standard.
JSON Schema establishes a set of rules that model and validate a data structure. The following example defines a schema that models a simple data structure with two fields: id and value. It also indicates that id is mandatory and that no additional fields are allowed.
{
"type": "object",
"additionalProperties": false,
"required": [
"id"
],
"properties": {
"id": {"type":"string"},
"value": {"type":"integer"}
}
}
VALID JSON OBJECT
{
"id": "id_1",
"value": 23
}
INVALID JSON OBJECTS
{
"value": 3 // id is not defined and is mandatory
}
{
"id": "id_3",
"value": 3,
"count": 5 // additional properties are not allowed
}
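The behavior shown in these examples can be reproduced with a few lines of code. The following is a minimal sketch in Python that hand-checks only the three keywords used in this schema (required, additionalProperties, and type inside properties). The function name check_simple is hypothetical, and a real project would more likely rely on a full validator library.

```python
import json

SCHEMA = json.loads("""
{
  "type": "object",
  "additionalProperties": false,
  "required": ["id"],
  "properties": {
    "id": {"type": "string"},
    "value": {"type": "integer"}
  }
}
""")

# Map the JSON Schema type names used here to Python type checks.
# bool is excluded from "integer" because Python treats True/False as ints.
# Simplification: floats such as 3.0 are rejected by this sketch.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
}


def check_simple(instance, schema=SCHEMA):
    """Return a list of error messages; an empty list means valid."""
    if not isinstance(instance, dict):
        return ["instance is not an object"]
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required property: {key}")
    properties = schema.get("properties", {})
    for key, value in instance.items():
        if key not in properties:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"additional property not allowed: {key}")
            continue
        expected = properties[key]["type"]
        if not TYPE_CHECKS[expected](value):
            errors.append(f"{key} is not of type {expected}")
    return errors


print(check_simple({"id": "id_1", "value": 23}))
print(check_simple({"value": 3}))
print(check_simple({"id": "id_3", "value": 3, "count": 5}))
```

The first call returns an empty list, while the other two each report one error, matching the valid and invalid objects above.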
JSON SCHEMA ONLINE VALIDATOR
You can test this behavior using this online and interactive JSON Schema validator.
Creating a JSON Schema
The following example is by no means definitive of all the value JSON Schema can provide. For that, you will need to go deep into the specification itself. Learn more at the JSON Schema specification.
Let’s pretend we’re interacting with a JSON-based car registration. This registration has a car which has:
- A manufacturer identifier: chassisNumber
- Identification of country of registration: licensePlate
- Number of kilometers driven: mileage
- An optional set of tags: tags
For example:
{
"chassisNumber": 72837362,
"licensePlate": "8256HYN",
"mileage": 60000,
"tags": [ "semi-new", "red" ]
}
While generally straightforward, the example leaves some open questions. Here are just a few of them:
- What is chassisNumber?
- Is licensePlate required?
- Can the mileage be less than zero?
- Are all of the tags string values?
When you’re talking about a data format, you want to have metadata about what keys mean, including the valid inputs for those keys. JSON Schema is a proposed IETF standard for answering those questions about data.
Starting the schema
To start a schema definition, let’s begin with a basic JSON schema.
We start with five properties called keywords, which are expressed as JSON keys.
Yes, the standard uses a JSON data document to describe data documents. Most often those are also JSON data documents, but they could be in any number of other content types, like text/xml.
- The $schema keyword states that this schema is written according to a specific draft of the standard and used for a variety of reasons, primarily version control.
- The $id keyword defines a URI for the schema, and the base URI that other URI references within the schema are resolved against.
- The title and description annotation keywords are descriptive only. They do not add constraints to the data being validated. The intent of the schema is stated with these two keywords.
- The type validation keyword defines the first constraint on our JSON data and in this case it has to be a JSON Object.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object"
}
We introduce the following pieces of terminology when we start the schema:
- Schema Keyword: $schema and $id.
- Schema Annotations: title and description.
- Validation Keyword: type.
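To make the distinction between these keyword groups concrete, here is a small Python sketch, with hypothetical helper names, that separates the keys of the starting schema and applies the only constraint defined so far. It is an illustration under those assumptions, not how a validator is actually structured.

```python
starting_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/car.schema.json",
    "title": "Car",
    "description": "A registered car",
    "type": "object",
}

# Grouping taken from the terminology list in this section.
SCHEMA_KEYWORDS = {"$schema", "$id"}
ANNOTATIONS = {"title", "description"}


def constraining_keywords(schema):
    """Return the keys that actually restrict the data being validated."""
    return sorted(set(schema) - SCHEMA_KEYWORDS - ANNOTATIONS)


def is_valid_so_far(instance, schema=starting_schema):
    """Apply the only constraint defined at this stage: type must be object."""
    if schema.get("type") == "object":
        return isinstance(instance, dict)
    return True


print(constraining_keywords(starting_schema))
print(is_valid_so_far({}))
print(is_valid_so_far({"anything": "goes"}))
print(is_valid_so_far([1, 2, 3]))
```

At this stage any JSON object is accepted, including an empty one, and anything that is not an object is rejected.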
Defining the properties
chassisNumber is a numeric value that uniquely identifies a car. Since this is the canonical identifier for a car, it doesn’t make sense to have a car without one, so it is required.
In JSON Schema terms, we update our schema to add:
- The properties validation keyword.
- The chassisNumber key.
  - description schema annotation and type validation keyword is noted – we covered both of these in the previous section.
- The required validation keyword listing chassisNumber.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object",
"properties": {
"chassisNumber": {
"description": "Manufacturer's serial number",
"type": "integer"
}
},
"required": [ "chassisNumber" ]
}
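As a rough illustration of what the schema above now demands of chassisNumber, the following Python sketch checks presence and the integer type by hand. The helper names are hypothetical. One detail worth noting: Python treats booleans as integers, so the sketch excludes them explicitly, and it accepts floats with a zero fractional part, which is how recent drafts are understood to define integer.

```python
def is_json_integer(value):
    """True for JSON numbers with no fractional part.

    bool is excluded because Python treats True/False as ints; floats such
    as 1.0 are accepted (assumption: this mirrors recent drafts).
    """
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


def check_chassis_number(car):
    """Return an error message, or None when chassisNumber is acceptable."""
    if "chassisNumber" not in car:
        return "chassisNumber is required"
    if not is_json_integer(car["chassisNumber"]):
        return "chassisNumber must be an integer"
    return None


print(check_chassis_number({"chassisNumber": 72837362}))
print(check_chassis_number({}))
print(check_chassis_number({"chassisNumber": "72837362"}))
```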
- licensePlate is a string value that acts as a secondary identifier. Since there isn’t a car without a registration, it is also required.
- Since the required validation keyword is an array of strings, we can note multiple keys as required; we now include licensePlate.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object",
"properties": {
"chassisNumber": {
"description": "Manufacturer's serial number",
"type": "integer"
},
"licensePlate": {
"description": "Identification of country of registration",
"type": "string"
}
},
"required": [ "chassisNumber", "licensePlate" ]
}
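Because required is just an array of key names, checking it amounts to a membership test per name. A minimal Python sketch, with a hypothetical function name, might look like this. Note that required only concerns presence: a key whose value is null still counts as present.

```python
REQUIRED = ["chassisNumber", "licensePlate"]


def missing_required(instance, required=REQUIRED):
    """List required keys that are absent, in the order the schema names them."""
    return [key for key in required if key not in instance]


print(missing_required({"chassisNumber": 72837362, "licensePlate": "8256HYN"}))
print(missing_required({"chassisNumber": 72837362}))
print(missing_required({"licensePlate": None}))
```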
Going deeper with properties
According to the car registry, a car cannot have negative mileage.
- The mileage key is added with the usual description schema annotation and type validation keywords covered previously. It is also included in the array of keys defined by the required validation keyword.
- We specify that the value of mileage must be greater than or equal to zero using the minimum validation keyword.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object",
"properties": {
"chassisNumber": {
"description": "Manufacturer's serial number",
"type": "integer"
},
"licensePlate": {
"description": "Identification of country of registration",
"type": "string"
},
"mileage": {
"description": "Number of kilometers driven",
"type": "number",
"minimum": 0
}
},
"required": [ "chassisNumber", "licensePlate", "mileage" ]
}
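The effect of combining type and minimum on mileage can be sketched in Python as follows. The function name check_mileage is hypothetical, and the sketch covers only these two keywords.

```python
def check_mileage(value):
    """Return an error message, or None when the mileage is acceptable."""
    # "type": "number" covers integers and floats, but not booleans.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return "mileage must be a number"
    # "minimum" is inclusive: 0 itself is allowed.
    if value < 0:
        return "mileage must be >= 0"
    return None


print(check_mileage(60000))
print(check_mileage(0))
print(check_mileage(-1))
print(check_mileage("60000"))
```

Since minimum is inclusive, a brand-new car with a mileage of 0 passes. A schema that needed a strictly positive value would use the exclusiveMinimum keyword instead.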
Next, we come to the tags key.
The car registry has established the following:
- If there are tags, there must be at least one tag.
- All tags must be unique; no duplication within a single car.
- All tags must be text.
- Tags are nice but they aren’t required to be present.
Therefore:
- The tags key is added with the usual annotations and keywords.
- This time the type validation keyword is array.
- We introduce the items validation keyword so we can define what appears in the array. In this case: string values via the type validation keyword.
- The minItems validation keyword is used to make sure there is at least one item in the array.
- The uniqueItems validation keyword notes all of the items in the array must be unique relative to one another.
- We did not add this key to the required validation keyword array because it is optional.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object",
"properties": {
"chassisNumber": {
"description": "Manufacturer's serial number",
"type": "integer"
},
"licensePlate": {
"description": "Identification of country of registration",
"type": "string"
},
"mileage": {
"description": "Number of kilometers driven",
"type": "number",
"minimum": 0
},
"tags": {
"description": "Tags for the car",
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
}
},
"required": [ "chassisNumber", "licensePlate", "mileage" ]
}
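The four rules for tags map onto a handful of checks. The Python sketch below, with hypothetical function names, applies them by hand. It is a simplification: it relies on Python equality for uniqueness, which is adequate here because every tag must be a string.

```python
def check_tags(tags):
    """Return error messages for the tags array; an empty list means valid."""
    if not isinstance(tags, list):
        return ["tags must be an array"]
    errors = []
    # minItems: 1
    if len(tags) < 1:
        errors.append("tags must contain at least one item")
    # items: {"type": "string"}
    if not all(isinstance(tag, str) for tag in tags):
        errors.append("every tag must be a string")
    # uniqueItems: true. Items are compared pairwise so that unhashable
    # values (lists, dicts) would not raise an error.
    for i, tag in enumerate(tags):
        if tag in tags[:i]:
            errors.append("tags must be unique")
            break
    return errors


def check_car_tags(car):
    """tags is optional, so the rules only apply when the key is present."""
    if "tags" not in car:
        return []
    return check_tags(car["tags"])


print(check_car_tags({"tags": ["semi-new", "red"]}))
print(check_car_tags({}))
print(check_car_tags({"tags": []}))
print(check_car_tags({"tags": ["red", "red"]}))
```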
Nesting data structures
Up until this point we’ve been dealing with a very flat schema – only one level. This section demonstrates nested data structures.
- The dimensions key is added using the concepts we’ve previously discovered. Since the type validation keyword is object we can use the properties validation keyword to define a nested data structure.
  - We omitted the description annotation keyword for brevity in the example. While it’s usually preferable to annotate thoroughly, in this case the structure and key names are fairly familiar to most developers.
- You will note the scope of the required validation keyword is applicable to the dimensions key and not beyond.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/car.schema.json",
"title": "Car",
"description": "A registered car",
"type": "object",
"properties": {
"chassisNumber": {
"description": "Manufacturer's serial number",
"type": "integer"
},
"licensePlate": {
"description": "Identification of country of registration",
"type": "string"
},
"mileage": {
"description": "Number of kilometers driven",
"type": "number",
"minimum": 0
},
"tags": {
"description": "Tags for the car",
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
},
"dimensions": {
"type": "object",
"properties": {
"length": {
"type": "number"
},
"width": {
"type": "number"
},
"height": {
"type": "number"
}
},
"required": [ "length", "width", "height" ]
}
},
"required": [ "chassisNumber", "licensePlate", "mileage" ]
}
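The scoping of required can be illustrated with a short Python sketch that keeps one list per level, mirroring the two required arrays in the schema above. The names used are hypothetical, and only the presence of keys is checked.

```python
TOP_REQUIRED = ["chassisNumber", "licensePlate", "mileage"]
DIMENSIONS_REQUIRED = ["length", "width", "height"]


def missing_by_level(car):
    """Report missing keys, applying each required list only at its own level."""
    missing = [key for key in TOP_REQUIRED if key not in car]
    dimensions = car.get("dimensions")
    # The nested list is consulted only when a dimensions object is present;
    # dimensions itself is not in TOP_REQUIRED, so omitting it is fine.
    if isinstance(dimensions, dict):
        missing += [
            f"dimensions.{key}" for key in DIMENSIONS_REQUIRED if key not in dimensions
        ]
    return missing


base = {"chassisNumber": 1, "licensePlate": "8256HYN", "mileage": 60000}
print(missing_by_level(base))
print(missing_by_level({**base, "dimensions": {"length": 4.005}}))
```

A car with no dimensions key reports nothing missing, while a car with a partial dimensions object reports the absent nested keys.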
Taking a look at data for our defined JSON Schema
We’ve certainly expanded on the concept of a car since our earliest sample data (scroll up to the top). Let’s take a look at data which matches the JSON Schema we have defined.
{
"chassisNumber": 1,
"licensePlate": "8256HYN",
"mileage": 60000,
"tags": [ "semi-new", "red" ],
"dimensions": {
"length": 4.005,
"width": 1.932,
"height": 1.425
}
}
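To tie the pieces together, the following Python sketch walks a schema recursively and checks the data above against it. It supports only the validation keywords used in this walkthrough (type, properties, required, minimum, items, minItems, uniqueItems), leaves out the annotation and schema keywords since they do not constrain data, and uses hypothetical names throughout. It is an illustration of the idea, not a substitute for a complete validator.

```python
CAR_SCHEMA = {
    "type": "object",
    "properties": {
        "chassisNumber": {"type": "integer"},
        "licensePlate": {"type": "string"},
        "mileage": {"type": "number", "minimum": 0},
        "tags": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "uniqueItems": True,
        },
        "dimensions": {
            "type": "object",
            "properties": {
                "length": {"type": "number"},
                "width": {"type": "number"},
                "height": {"type": "number"},
            },
            "required": ["length", "width", "height"],
        },
    },
    "required": ["chassisNumber", "licensePlate", "mileage"],
}


def _is_number(v):
    # Python booleans are ints, so they are excluded explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
    "number": _is_number,
    "integer": lambda v: _is_number(v) and (isinstance(v, int) or v.is_integer()),
}


def validate(instance, schema, path="$"):
    """Return error messages for the subset of keywords listed above."""
    expected = schema.get("type")
    if expected and not TYPE_CHECKS[expected](instance):
        return [f"{path}: expected {expected}"]
    errors = []
    if _is_number(instance) and "minimum" in schema and instance < schema["minimum"]:
        errors.append(f"{path}: below minimum {schema['minimum']}")
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property {key}")
        # Recurse only into properties that are actually present.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors += validate(instance[key], subschema, f"{path}.{key}")
    if isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer than {schema['minItems']} items")
        if schema.get("uniqueItems") and any(
            item in instance[:i] for i, item in enumerate(instance)
        ):
            errors.append(f"{path}: items are not unique")
        if "items" in schema:
            for i, item in enumerate(instance):
                errors += validate(item, schema["items"], f"{path}[{i}]")
    return errors


car = {
    "chassisNumber": 1,
    "licensePlate": "8256HYN",
    "mileage": 60000,
    "tags": ["semi-new", "red"],
    "dimensions": {"length": 4.005, "width": 1.932, "height": 1.425},
}
print(validate(car, CAR_SCHEMA))
print(validate({**car, "mileage": -1, "tags": []}, CAR_SCHEMA))
```

The sample data produces no errors, while the modified copy reports one error for the negative mileage and one for the empty tags array.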