⚓ T368131 ArrayDef: add convenience methods for generating schemas

ArrayDef: add convenience methods for generating schemas
Closed, Resolved, Public

Description

ArrayDef supports validation of array structures using JSON schema (draft 4). However, JSON schema can be cumbersome to write. We should provide convenience methods for covering the simple use case. For example, it should be possible to declare a list of strings as follows:

"foobar" => [
    ParamValidator::PARAM_TYPE => 'array',
    Handler::PARAM_SOURCE => 'body',
    ArrayDef::PARAM_SCHEMA => ArrayDef::makeListSchema( 'string' ),
]

Calls to the convenience methods can be nested to create more complex schemas, e.g. to create a list of objects that each contain a string and a map of booleans:

ArrayDef::PARAM_SCHEMA => ArrayDef::makeListSchema(
    ArrayDef::makeObjectSchema( [
        'something' => 'string',
        'otherthing' => ArrayDef::makeMapSchema( 'boolean' )
    ] )
),
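
To make the nesting concrete, here is a minimal sketch of how the list and map helpers could expand their arguments. The function names (sketchNormalizeSchema, sketchMakeListSchema, sketchMakeMapSchema) are hypothetical stand-ins, not the ArrayDef methods themselves, and the sketch assumes that a string argument is shorthand for a type name.

```php
<?php
// Hypothetical helper: expands a type-name shorthand into a schema array.
function sketchNormalizeSchema( $schema ): array {
    return is_string( $schema ) ? [ 'type' => $schema ] : $schema;
}

// Sketch of a list helper: every item must match $itemSchema.
function sketchMakeListSchema( $itemSchema ): array {
    return [
        'type' => 'array',
        'items' => sketchNormalizeSchema( $itemSchema ),
    ];
}

// Sketch of a map helper: every entry value must match $entrySchema.
function sketchMakeMapSchema( $entrySchema ): array {
    return [
        'type' => 'object',
        'additionalProperties' => sketchNormalizeSchema( $entrySchema ),
    ];
}

$listOfStrings = sketchMakeListSchema( 'string' );

// Nested call: a list whose items are maps of booleans.
$listOfBoolMaps = sketchMakeListSchema( sketchMakeMapSchema( 'boolean' ) );
echo json_encode( $listOfBoolMaps ), "\n";
// prints {"type":"array","items":{"type":"object","additionalProperties":{"type":"boolean"}}}
```

Because each helper returns a plain array, the result of one call can be passed directly as the argument of another.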

The following three convenience methods should be supported:

makeListSchema( $itemSchema ): returns a JSON schema of type array with the given schema for each item. If $itemSchema is a string, it will be interpreted as a type (so the schema would be [ 'type' => $itemSchema ]).

makeMapSchema( $entrySchema ): returns a JSON schema of type object with the given schema for each entry (using additionalProperties). If $entrySchema is a string, it will be interpreted as a type (so the schema would be [ 'type' => $entrySchema ]). We may need to use some custom hacks to support numeric keys.

makeObjectSchema( $requiredProperties, $optionalProperties = [], $additionalProperties = false ): returns a JSON schema of type object with properties defined as follows (details to be decided):

  • $optionalProperties is a map of parameter names to their respective schemas. If a schema is given as a string, it's considered to be shorthand for a type, as in makeListSchema and makeMapSchema.
  • $requiredProperties is a map like $optionalProperties, or it's a list of names of required properties. $optionalProperties must not be a list.
  • properties is defined by merging $requiredProperties and $optionalProperties (if $requiredProperties is a list, use only $optionalProperties).
  • required is defined by taking the array keys of $requiredProperties (or just $requiredProperties itself, if it's a list).
  • additionalProperties is taken from $additionalProperties, which defaults to false. If it's a string, it's considered to be the type of the schema, as above.
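
The rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the eventual implementation: the names sketchToSchema and sketchMakeObjectSchema are hypothetical, a "list" is detected by comparing the array with array_values() of itself (so an empty array counts as a list), and the sketch always emits additionalProperties, including the default false.

```php
<?php
// Hypothetical helper: a string is shorthand for [ 'type' => <string> ].
function sketchToSchema( $schema ): array {
    return is_string( $schema ) ? [ 'type' => $schema ] : $schema;
}

function sketchMakeObjectSchema(
    array $requiredProperties,
    array $optionalProperties = [],
    $additionalProperties = false
): array {
    // Assumption: sequential integer keys mean "list of property names".
    $isList = $requiredProperties === array_values( $requiredProperties );

    if ( $isList ) {
        // Names only: the schemas must come from $optionalProperties.
        $required = $requiredProperties;
        $properties = $optionalProperties;
    } else {
        // Map: the keys are the required names, the values are schemas.
        $required = array_keys( $requiredProperties );
        $properties = $requiredProperties + $optionalProperties;
    }

    $schema = [ 'type' => 'object' ];
    if ( $required ) {
        $schema['required'] = $required;
    }
    if ( $properties ) {
        // array_map() preserves string keys when given a single array.
        $schema['properties'] = array_map( 'sketchToSchema', $properties );
    }
    $schema['additionalProperties'] = is_string( $additionalProperties )
        ? [ 'type' => $additionalProperties ]
        : $additionalProperties;

    return $schema;
}

// The map form and the list form describe the same schema.
$fromMap = sketchMakeObjectSchema(
    [ 'a' => 'integer' ],
    [ 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ]
);
$fromList = sketchMakeObjectSchema(
    [ 'a' ],
    [ 'a' => 'integer', 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ]
);
var_dump( $fromMap === $fromList );
```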

Some examples:

ArrayDef::makeObjectSchema( [ 'a' => 'integer', 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ] ) becomes:

{
    "type": "object",
    "required": [ "a", "b" ],
    "properties": {
        "a": { "type": "integer" },
        "b": { "enum": [ "x", "y", "z" ] }
    }
}

ArrayDef::makeObjectSchema( [ 'a' => 'integer' ], [ 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ] ) becomes:

{
    "type": "object",
    "required": [ "a" ],
    "properties": {
        "a": { "type": "integer" },
        "b": { "enum": [ "x", "y", "z" ] }
    }
}

ArrayDef::makeObjectSchema( [ 'a' ], [ 'a' => 'integer', 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ] ) would be the same.

ArrayDef::makeObjectSchema( [ 'a' ], [], 'string' ) becomes:

{
    "type": "object",
    "required": [ "a" ],
    "additionalProperties": { "type": "string" }
}

Event Timeline

BPirkle renamed this task from "ArrayDef: add convenince methods for generating schemas" to "ArrayDef: add convenience methods for generating schemas". Jun 22 2024, 5:04 PM
BPirkle changed the task status from Open to In Progress. Jun 24 2024, 5:20 PM
BPirkle triaged this task as Medium priority.
BPirkle moved this task from Incoming (Needs Triage) to In Progress on the MW-Interfaces-Team board.

Change #1050064 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/core@master] Add ArrayDef convenience methods for JSON Schema generation

https://gerrit.wikimedia.org/r/1050064

Per synchronous discussion, we agreed to change the signature of makeObjectSchema to take three parameters as follows:

  • required: array of name/schema pairs for required parameters (defaults to empty array, meaning no required parameters)
  • optional: array of name/schema pairs for optional parameters (defaults to empty array, meaning no optional parameters)
  • additional: false to exclude additional parameters, a string or array to specify the schema of additional parameters, or true to allow arbitrary parameters (this last case just omits the "additionalProperties" section of the schema)
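
A minimal sketch of the agreed three-parameter form, for illustration only: sketchExpandType and sketchObjectSchema are hypothetical names, and the merged patch may differ in its details.

```php
<?php
// Hypothetical helper: a string is shorthand for [ 'type' => <string> ].
function sketchExpandType( $schema ): array {
    return is_string( $schema ) ? [ 'type' => $schema ] : $schema;
}

function sketchObjectSchema(
    array $required = [],
    array $optional = [],
    $additional = false
): array {
    $schema = [ 'type' => 'object' ];

    // Both arguments are name => schema maps; only the first is "required".
    if ( $required ) {
        $schema['required'] = array_keys( $required );
    }
    $properties = $required + $optional;
    if ( $properties ) {
        $schema['properties'] = array_map( 'sketchExpandType', $properties );
    }

    // true: allow anything, expressed by leaving the keyword out.
    // false: forbid extra properties. string/array: schema for extras.
    if ( $additional !== true ) {
        $schema['additionalProperties'] = $additional === false
            ? false
            : sketchExpandType( $additional );
    }

    return $schema;
}

$strict = sketchObjectSchema(
    [ 'a' => 'integer', 'b' => [ 'enum' => [ 'x', 'y', 'z' ] ] ]
);
$open = sketchObjectSchema( [], [ 'c' => 'string' ], true );
$typed = sketchObjectSchema( [], [], 'string' );
echo json_encode( $typed ), "\n";
// prints {"type":"object","additionalProperties":{"type":"string"}}
```

Requiring both arguments to be maps removes the list-or-map ambiguity of the original proposal: the required names are always the keys of the first argument.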

I do have one question, probably due to my formative understanding of the details of JSON Schema. What is the difference between these two schemas (represented here as PHP arrays)?

			[
				'type' => 'object',
				'required' => [ 'a', 'b' ],
				'properties' => [
					'a' => [ 'type' => 'integer' ],
					'b' => [ 'enum' => [ 'x', 'y', 'z' ] ],
				],
				'additionalProperties' => false
			]

This one is identical, except for how 'b' is defined:

			[
				'type' => 'object',
				'required' => [ 'a', 'b' ],
				'properties' => [
					'a' => [ 'type' => 'integer' ],
					'b' => [ 'type' => [ 'enum' => [ 'x', 'y', 'z' ] ] ],
				],
				'additionalProperties' => false
			]

In local development, I tried coding makeObjectSchema() to generate both, and the same set of validations passed for both (and failed in the same way for both when I sent invalid input). Are they valid alternate forms, or do they mean different things and my tests were insufficient to reveal that? If they are equivalent, is one preferred over the other?

What is the difference between these two schemas (represented here as PHP arrays)?

As far as I know, the first is valid, and the second is not. My understanding is that type can only refer to one of the seven basic types (or a list of basic types): https://json-schema.org/understanding-json-schema/reference/type. enum specifies a list of allowed values, independent of the concept of "types". "enum": ['x', 12, null, 0.7] is allowed, I think.
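
The rule described here can be written as a small structural check. This is only a sketch of the rule as stated in this comment (a type name or a list of type names); the function name sketchIsValidTypeKeyword is hypothetical, and the check says nothing about how any particular validator library treats such schemas.

```php
<?php
// The seven basic JSON Schema type names.
const SKETCH_BASIC_TYPES = [
    'array', 'boolean', 'integer', 'null', 'number', 'object', 'string',
];

// Returns true if $type is a basic type name or a non-empty list of them.
function sketchIsValidTypeKeyword( $type ): bool {
    if ( is_string( $type ) ) {
        return in_array( $type, SKETCH_BASIC_TYPES, true );
    }
    // Anything else must be a non-empty list (sequential integer keys).
    if ( !is_array( $type ) || $type === [] || $type !== array_values( $type ) ) {
        return false;
    }
    foreach ( $type as $name ) {
        if ( !is_string( $name ) || !in_array( $name, SKETCH_BASIC_TYPES, true ) ) {
            return false;
        }
    }
    return true;
}

// 'a' in both schemas: a plain type name.
var_dump( sketchIsValidTypeKeyword( 'integer' ) );
// 'b' in the second schema: an enum nested under 'type'.
var_dump( sketchIsValidTypeKeyword( [ 'enum' => [ 'x', 'y', 'z' ] ] ) );
```

Under this reading, the first call returns true and the second returns false, because the nested value is a map containing an enum rather than a type name or a list of type names.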

What is the difference between these two schemas (represented here as PHP arrays)?

As far as I know, the first is valid, and the second is not.

Thanks. That was my impression from reading the spec, but (just confirmed again) both pass our validator library and work as expected - the second form actually does validate against the enum and give expected errors if the input includes a value that is not present.

I'll code for the first version.

Change #1050064 merged by jenkins-bot:

[mediawiki/core@master] Add ArrayDef convenience methods for JSON Schema generation

https://gerrit.wikimedia.org/r/1050064

Change #1055181 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] REST: showcase used of ArrayDef convenience functions

https://gerrit.wikimedia.org/r/1055181

Change #1055181 merged by jenkins-bot:

[mediawiki/core@master] REST: showcase usage of ArrayDef convenience functions

https://gerrit.wikimedia.org/r/1055181