gemseo.third_party.fastjsonschema

Installation

pip install fastjsonschema

Only Python 3.3 and higher is supported.

About

fastjsonschema validates JSON documents against a JSON schema. The library implements JSON schema drafts 04, 06 and 07. Its main purpose is to be a really fast implementation. See some numbers:

  • jsonschema, probably the most popular library, can take up to 5 seconds for valid inputs and 1.2 seconds for invalid inputs.

  • json-spec, the second most popular, is even worse, with up to 7.2 and 1.7 seconds.

  • validictory, now deprecated, is much better with 370 or 23 milliseconds, but it does not follow all standards and can still be slow for some purposes.

With this library you can gain big improvements as fastjsonschema takes only about 25 milliseconds for valid inputs and 2 milliseconds for invalid ones. Pretty amazing, right? :-)

Technically, it works by generating the most stupid code on the fly: code that is fast but hard to write by hand. The best efficiency is achieved when a schema is compiled once and used many times, of course, much as with regular expressions. You can also generate the code to a file, which is even slightly faster.
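The idea of generating flat code once and reusing it can be sketched with the standard library alone. This is a toy illustration, not fastjsonschema's real generator or its real output: the names `generate_source`, `compile_schema` and `PYTHON_TYPES` are hypothetical, and only a tiny subset of JSON schema (`type`, `required`, `properties`) is handled.

```python
# Toy sketch: NOT fastjsonschema's real generator or its real output.
# All names below are hypothetical and exist only for this illustration.
PYTHON_TYPES = {"object": "dict", "string": "str", "array": "list"}


def _check(condition, message):
    """Return two source lines: an `if` test and the `raise` it guards."""
    return [f"    if {condition}:", f"        raise ValueError({message!r})"]


def generate_source(schema):
    """Emit flat Python source for a tiny subset of JSON schema."""
    lines = ["def validate(data):"]
    if "type" in schema:
        py_type = PYTHON_TYPES[schema["type"]]
        lines += _check(
            f"not isinstance(data, {py_type})",
            f"data must be {schema['type']}",
        )
    for key in schema.get("required", []):
        lines += _check(f"{key!r} not in data", f"data must contain {key}")
    for key, sub in schema.get("properties", {}).items():
        if "type" in sub:
            py_type = PYTHON_TYPES[sub["type"]]
            lines += _check(
                f"{key!r} in data and not isinstance(data[{key!r}], {py_type})",
                f"data.{key} must be {sub['type']}",
            )
    lines.append("    return data")
    return "\n".join(lines)


def compile_schema(schema):
    """Generate the source once and turn it into a plain Python function."""
    namespace = {}
    exec(generate_source(schema), namespace)
    return namespace["validate"]


schema = {
    "type": "object",
    "required": ["name"],
    "properties": {"name": {"type": "string"}},
}

# Pay the generation cost once, then reuse the function for every document.
validate = compile_schema(schema)
for document in ({"name": "a"}, {"name": "b"}):
    validate(document)

# The same source could be written to a file instead of passed to exec().
print(generate_source(schema))
```

The generated function contains no loops over the schema and no dispatch: every check is spelled out as a plain `if`, which is tedious to write by hand but cheap to run.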

You can measure the performance on your own computer or server with the included script:

$ make performance
fast_compiled        valid      ==> 0.030474655970465392
fast_compiled        invalid    ==> 0.0017561429995112121
fast_file            valid      ==> 0.028758891974575818
fast_file            invalid    ==> 0.0017655809642747045
fast_not_compiled    valid      ==> 4.597834145999514
fast_not_compiled    invalid    ==> 1.139162228035275
jsonschema           valid      ==> 5.014410221017897
jsonschema           invalid    ==> 1.1362981660058722
jsonspec             valid      ==> 8.1144932230236
jsonspec             invalid    ==> 2.0143173419637606
validictory          valid      ==> 0.4084212710149586
validictory          invalid    ==> 0.026061681972350925

This library follows and implements JSON schema draft-04, draft-06, and draft-07. The standard is sometimes not perfectly clear, so I also recommend checking out Understanding JSON Schema.

Note that there are some differences compared to JSON schema standard:

  • Regular expressions are full Python ones, not only what JSON schema allows. It’s easier to allow everything, and it’s also faster to compile without limits. So keep in mind that a more advanced regular expression may not work with other libraries or in other languages.

  • Because Python's $ also matches just before a trailing newline (a$ matches both a and a\n), \Z is used instead of $: every unescaped dollar in your regular expression is changed to \Z. When you want to use a dollar as a regular character, you have to escape it (\$).

  • JSON schema says you can use the keyword default to provide default values. This implementation applies those defaults and always returns the transformed input data.
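The dollar behaviour described in the list above can be checked with Python's `re` module alone. The helper `replace_dollars` is a hypothetical, simplified stand-in for the rewrite the library performs, assuming that only dollars not preceded by a backslash are replaced.

```python
import re

# In Python, "$" also matches just before a trailing newline.
print(bool(re.search(r"a$", "a")))     # True
print(bool(re.search(r"a$", "a\n")))   # True

# "\Z" matches only at the very end of the string.
print(bool(re.search(r"a\Z", "a")))    # True
print(bool(re.search(r"a\Z", "a\n")))  # False


def replace_dollars(pattern):
    """Hypothetical helper: swap every "$" not preceded by a backslash for "\\Z"."""
    return re.sub(r"(?<!\\)\$", lambda match: r"\Z", pattern)


print(replace_dollars(r"a$"))   # a\Z
print(replace_dollars(r"a\$"))  # a\$  (escaped dollar is left alone)

# An escaped dollar still matches a literal "$" character.
print(bool(re.search(replace_dollars(r"a\$"), "a$")))  # True
```

A pattern written for this library may therefore behave differently in a validator that keeps the plain `$` semantics, which is worth remembering when sharing schemas across languages.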

API