Validate client-submitted data using JSON Schema documents and convert JSON Schema documents into different data-interchange formats.
Contents
- Installation
- Usage
- Data Validation
- Data Validation CLI
- Data Validation API
- Structured Message Generation
- Supported Data-Interchange Formats
- Avro
- Data-Interchange CLI
- Data-Interchange API
- Testing
- Additional Resources
- Future Considerations
- Maintainers
- Contributing
- License
Why aptos?
- Validate client-submitted data
- Convert JSON Schema documents into different data-interchange formats
- Simple syntax
- CLI support for data validation and JSON Schema conversion
Installation
via pip
$ pip install aptos
via git
$ git clone https://github.com/pennsignals/aptos.git && cd aptos
$ python setup.py install
Usage
`aptos` supports the following capabilities:
- Data Validation: Validate client-submitted data using validation keywords described in the JSON Schema specification.
- Schema Conversion: Convert JSON Schema documents into different data-interchange formats. See the list of supported data-interchange formats for more information.
usage: aptos [arguments] SCHEMA
aptos is a tool for validating client-submitted data using the JSON Schema
vocabulary and converts JSON Schema documents into different data-interchange
formats.
positional arguments:
schema JSON document containing the description
optional arguments:
-h, --help show this help message and exit
Arguments:
{validate,convert}
validate Validate a JSON instance
convert Convert a JSON Schema into a different data-interchange
format
More information on JSON Schema: http://json-schema.org/
Data Validation
Here is a basic example of a JSON Schema:
{
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "age": {
      "description": "Age in years",
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["firstName", "lastName"]
}
Given a JSON Schema, `aptos` can validate client-submitted data to ensure that it satisfies the constraints the schema defines.
JSON Schema validation keywords such as `minimum` and `required` impose requirements that an instance must meet to validate successfully. In the JSON Schema above, both the `firstName` and `lastName` properties are required, and the `age` property MUST have a value greater than or equal to 0.
Valid Instance :heavy_check_mark: | Invalid Instance :heavy_multiplication_x: |
---|---|
`{"firstName": "John", "lastName": "Doe", "age": 42}` | `{"firstName": "John", "age": -15}` (missing required property `lastName`, and `age` is not greater than or equal to 0) |
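The two checks described above can be illustrated in plain Python. This is a minimal hand-written sketch for the `Person` schema only, not how `aptos` implements validation; the helper name `check_person` is hypothetical and exists only for this example.

```python
# Illustrative only: a hand-rolled check of the `required` and `minimum`
# keywords for the Person schema. This is NOT aptos's implementation.
PERSON_SCHEMA = {
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}


def check_person(instance, schema=PERSON_SCHEMA):
    """Return a list of human-readable violations (empty if there are none)."""
    errors = []
    # `required`: every listed property name must be present in the instance.
    for name in schema.get("required", []):
        if name not in instance:
            errors.append("missing required property %r" % name)
    # `minimum`: a value that is present must be >= the stated bound.
    for name, subschema in schema.get("properties", {}).items():
        if name in instance and "minimum" in subschema:
            if instance[name] < subschema["minimum"]:
                errors.append("%r must be >= %s" % (name, subschema["minimum"]))
    return errors


print(check_person({"firstName": "John", "lastName": "Doe", "age": 42}))
print(check_person({"firstName": "John", "age": -15}))
```

The first call reports no violations; the second reports both the missing `lastName` and the negative `age`, matching the two instances in the table.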
`aptos` can validate client-submitted data using either the CLI or the API:
Data Validation CLI
$ aptos validate -instance INSTANCE SCHEMA
Arguments:
- INSTANCE: JSON document being validated
- SCHEMA: JSON document containing the description
Data Validation API
import json

from aptos.parser import SchemaParser
from aptos.visitor import ValidationVisitor

with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)

# Invalid client-submitted data (instance)
instance = {
    'firstName': 'John'
}
try:
    component.accept(ValidationVisitor(instance))
except AssertionError as e:
    print(e)  # instance {'firstName': 'John'} is missing required property 'lastName'
Structured Message Generation
Given a JSON Schema, `aptos` can generate different structured messages.
:warning: Note: The JSON Schema being converted MUST be a valid JSON Object.
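Because conversion requires the schema to be a JSON object, it can help to check this before handing the schema to `aptos`. The following is a minimal sketch using only the standard library; the function name `load_schema_object` is hypothetical and is not part of `aptos`.

```python
import json


def load_schema_object(text):
    """Parse JSON text and insist that the top-level value is an object."""
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid JSON.
    schema = json.loads(text)
    if not isinstance(schema, dict):
        raise ValueError(
            "expected a JSON object, got %s" % type(schema).__name__
        )
    return schema


schema = load_schema_object('{"title": "Person", "type": "object"}')
print(schema["title"])
```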
Supported Data-Interchange Formats
Format | Supported | Notes |
---|---|---|
Apache Avro | :heavy_check_mark: | |
Protocol Buffers | :heavy_multiplication_x: | Planned for future releases |
Apache Thrift | :heavy_multiplication_x: | Planned for future releases |
Apache Parquet | :heavy_multiplication_x: | Planned for future releases |
Avro
Using the `Person` schema in the previous example, `aptos` can convert the schema into the Avro data-interchange format using either the CLI or the API.
`aptos` maps the following JSON Schema types to Avro types:
JSON Schema Type | Avro Type |
---|---|
string | string |
boolean | boolean |
null | null |
integer | long |
number | double |
object | record |
array | array |
JSON Schema documents containing the `enum` validation keyword are mapped to an Avro `enum`, with the allowed values listed in its `symbols` attribute.
JSON Schema documents whose `type` keyword is an array are mapped to Avro union types.
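The mapping above can be written down as a small lookup. The sketch below is illustrative and is not taken from the `aptos` source; `avro_type` is a hypothetical helper that handles only the `type` keyword (a string or an array of strings) and returns type names, writing a union as a list in the way Avro's JSON notation does.

```python
# Type names from the table above: JSON Schema type -> Avro type.
JSON_TO_AVRO = {
    "string": "string",
    "boolean": "boolean",
    "null": "null",
    "integer": "long",
    "number": "double",
    "object": "record",
    "array": "array",
}


def avro_type(json_type):
    """Map a JSON Schema `type` value to an Avro type name or union."""
    if isinstance(json_type, list):
        # A `type` array becomes an Avro union, written as a JSON array.
        return [JSON_TO_AVRO[t] for t in json_type]
    return JSON_TO_AVRO[json_type]


print(avro_type("integer"))
print(avro_type(["null", "number"]))
```

A full conversion would also need to fill in record fields, array items, and names; this sketch covers only the type-name lookup.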
Data-Interchange CLI
$ aptos convert -format FORMAT SCHEMA
Arguments:
- FORMAT: Data-interchange format
- SCHEMA: JSON document containing the description
Data-Interchange API
import json

from aptos.parser import SchemaParser
from aptos.schema.visitor import AvroSchemaVisitor

with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)
record = component.accept(AvroSchemaVisitor())
print(json.dumps(record, indent=2))
The above code generates the following Avro schema:
{
  "type": "record",
  "fields": [
    {
      "doc": "",
      "type": "string",
      "name": "lastName"
    },
    {
      "doc": "",
      "type": "string",
      "name": "firstName"
    },
    {
      "doc": "Age in years",
      "type": "long",
      "name": "age"
    }
  ],
  "name": "Person"
}
Testing
All unit tests exist in the tests directory.
To run tests, execute the following command:
$ python setup.py test
Additional Resources
- Stop Being a “Janitorial” Data Scientist - A blog post explaining why aptos was created
- Understanding JSON Schema - An excellent guide for schema authors, from the Space Telescope Science Institute
Future Considerations
- Swagger support
- Additional data-interchange formats
Maintainers
Jason Walsh
Contributing
Contributions welcome! Please read the contributing.json file first.