Metadata-Version: 2.1
Name: json_strong_typing
Version: 0.1.7
Summary: Type-safe data interchange for Python data classes
Home-page: https://github.com/hunyadi/strong_typing
Author: Levente Hunyadi
Author-email: hunyadi@gmail.com
License: MIT
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Description-Content-Type: text/markdown
License-File: LICENSE

# Type-safe data interchange for Python data classes

JSON is a popular message interchange format employed in API design for its simplicity, readability, flexibility and wide support. However, `json.dump` and `json.load` offer no direct support when working with Python data classes employing type annotations. This package offers services for working with strongly-typed Python classes: serializing objects to JSON, deserializing JSON to objects, and producing a JSON schema that matches the data class, e.g. to be used in an OpenAPI specification.
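
The gap can be reproduced with the standard library alone. The sketch below uses a hypothetical `Point` data class (not part of this package) to show that `json.dumps` rejects a data class instance, and that going through `dataclasses.asdict` returns a plain dictionary rather than a typed object.

```python
import dataclasses
import json


@dataclasses.dataclass
class Point:  # hypothetical class, for illustration only
    x: int
    y: int


try:
    json.dumps(Point(1, 2))
    failed = False
except TypeError:
    # the default JSON encoder does not know how to serialize data classes
    failed = True

# workaround: convert the instance to a plain dict before encoding
encoded = json.dumps(dataclasses.asdict(Point(1, 2)))

# decoding yields a plain dict; the type information of Point is gone
decoded = json.loads(encoded)
```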

This package differs from [pydantic dataclasses](https://pydantic-docs.helpmanual.io/usage/dataclasses/) in that you don't need to have `BaseModel` in your class inheritance chain, making it suitable for operating on classes defined in third-party modules.

## Features

This package offers the following services:

* JSON serialization and de-serialization
    * Generate a JSON object from a Python object (`serialization.object_to_json`)
    * Parse a JSON object into a Python object (`serialization.json_to_object`)
* JSON schema
    * Generate a JSON schema from a Python type (`schema.classdef_to_schema`)
    * Validate a JSON object against a Python type (`schema.validate_object`)
* Type information
    * Extract documentation strings (a.k.a. docstring) from types (`docstring.parse_type`)
    * Inspect types, including generics (package `inspection`)

These services come with full support for complex types like data classes, named tuples and generics.

In the context of this package, a *JSON object* is the (intermediate) Python object representation produced by `json.loads` from a *JSON string*. In contrast, a *JSON string* is the string representation generated by `json.dumps` from the (intermediate) Python object representation.
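
The distinction can be illustrated with the standard `json` module only. The sample payload is made up for this sketch.

```python
import json

# a JSON string: text as it travels over the wire or sits in a file
json_string = '{"name": "example", "tags": ["a", "b"], "count": 3}'

# JSON string -> JSON object (plain dict, list, str, int, float, bool or None)
json_object = json.loads(json_string)

# JSON object -> JSON string
round_tripped = json.dumps(json_object)
```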

## Use cases

* Writing a cloud function (lambda) that communicates with JSON messages received as HTTP payload or websocket text messages
* Verifying if an API endpoint receives well-formed input
* Generating a type schema for an OpenAPI specification to impose constraints on what messages an API can receive
* Parsing JSON configuration files into a Python object

## Usage

Consider the following class definition:

```python
import datetime
import uuid
from dataclasses import dataclass


@dataclass
class Example:
    "A simple data class with multiple properties."

    bool_value: bool = True
    int_value: int = 23
    float_value: float = 4.5
    str_value: str = "string"
    datetime_value: datetime.datetime = datetime.datetime(1989, 10, 23, 1, 45, 50)
    guid_value: uuid.UUID = uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
```

First, we serialize the object to JSON with
```python
from strong_typing.serialization import json_to_object, object_to_json

source = Example()
json_obj = object_to_json(source)
```

Here, the variable `json_obj` has the value:
```python
{
    "bool_value": True,
    "int_value": 23,
    "float_value": 4.5,
    "str_value": "string",
    "datetime_value": "1989-10-23T01:45:50",
    "guid_value": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
}
```

Next, we restore the object from JSON with
```python
target = json_to_object(Example, json_obj)
```

Here, `target` holds the restored data class object:
```python
Example(
    bool_value=True,
    int_value=23,
    float_value=4.5,
    str_value="string",
    datetime_value=datetime.datetime(1989, 10, 23, 1, 45, 50),
    guid_value=uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"),
)
```
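
To make the round trip less of a black box, here is a hand-written sketch of the same idea for a flat data class, using only the standard library. It is not the package's implementation: the names `to_json_sketch`, `from_json_sketch` and the class `Event` are hypothetical, and only `datetime.datetime` and `uuid.UUID` fields get special treatment.

```python
import dataclasses
import datetime
import uuid


@dataclasses.dataclass
class Event:  # hypothetical class, for illustration only
    name: str
    when: datetime.datetime
    id: uuid.UUID


def to_json_sketch(obj):
    "Convert a flat data class instance to a JSON object."
    result = {}
    for field in dataclasses.fields(obj):
        value = getattr(obj, field.name)
        if isinstance(value, datetime.datetime):
            value = value.isoformat()  # ISO 8601 string
        elif isinstance(value, uuid.UUID):
            value = str(value)  # canonical hyphenated form
        result[field.name] = value
    return result


def from_json_sketch(cls, data):
    "Rebuild a flat data class instance, using field types to pick a parser."
    kwargs = {}
    for field in dataclasses.fields(cls):
        value = data[field.name]
        if field.type is datetime.datetime:
            value = datetime.datetime.fromisoformat(value)
        elif field.type is uuid.UUID:
            value = uuid.UUID(value)
        kwargs[field.name] = value
    return cls(**kwargs)


source = Event(
    "launch",
    datetime.datetime(1989, 10, 23, 1, 45, 50),
    uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"),
)
json_obj = to_json_sketch(source)
target = from_json_sketch(Event, json_obj)
```

The type annotations drive deserialization: the JSON object alone cannot tell a timestamp from an ordinary string, but the field type can.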

We can also produce the JSON schema corresponding to the Python class:
```python
import json

from strong_typing.schema import classdef_to_schema

json_schema = json.dumps(classdef_to_schema(Example), indent=4)
```
which yields
```json
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "bool_value": {
            "type": "boolean",
            "default": true
        },
        "int_value": {
            "type": "integer",
            "default": 23
        },
        "float_value": {
            "type": "number",
            "default": 4.5
        },
        "str_value": {
            "type": "string",
            "default": "string"
        },
        "datetime_value": {
            "type": "string",
            "format": "date-time",
            "default": "1989-10-23T01:45:50"
        },
        "guid_value": {
            "type": "string",
            "format": "uuid"
        }
    },
    "additionalProperties": false,
    "required": [
        "bool_value",
        "int_value",
        "float_value",
        "str_value",
        "datetime_value",
        "guid_value"
    ],
    "title": "A simple data class with multiple properties."
}
```

If a type has a Python docstring, then `title` and `description` fields in the JSON schema are populated from the text in the documentation string.

## Standards

For producing a JSON schema, the following JSON schema standards are supported:

* [Draft 7](https://json-schema.org/specification-links.html#draft-7)
* [Draft 2019-09](https://json-schema.org/specification-links.html#draft-2019-09-formerly-known-as-draft-8)
* [Draft 2020-12](https://json-schema.org/specification-links.html#2020-12)

## Conversion table

The following table shows the conversion types the package employs:

| Python type | JSON schema type | Behavior |
| -- | -- | -- |
| None | null | |
| bool | boolean | |
| int | integer | |
| float | number | |
| str | string | |
| decimal.Decimal | number | |
| bytes | string | represented with Base64 content encoding |
| datetime | string | constrained to match ISO 8601 format `2018-11-13T20:20:39+00:00` |
| date | string | constrained to match ISO 8601 format `2018-11-13` |
| time | string | constrained to match ISO 8601 format `20:20:39+00:00` |
| UUID | string | constrained to match UUID format `f81d4fae-7dec-11d0-a765-00a0c91e6bf6` |
| Enum | *value type* | stores the enumeration value type (typically integer or string) |
| Optional[**T**] | *depends on inner type* | reads and writes **T** if present |
| Union[**T1**, **T2**, ...] | *depends on concrete type* | serializes to the appropriate inner type; deserializes from the first matching type |
| List[**T**] | array | recursive in **T** |
| Dict[**K**, **V**] | object | recursive in **V**, keys are coerced into string |
| Dict[Enum, **V**] | object | recursive in **V**, keys are of enumeration value type and coerced into string |
| Set[**T**] | array | recursive in **T**, container has uniqueness constraint |
| Tuple[**T1**, **T2**, ...] | array | array has fixed length, each element has specific type |
| Literal[**const**] | *type matching* **const** | exports the literal value as a constant value |
| data class | object | iterates over fields of data class |
| named tuple | object | iterates over fields of named tuple |
| regular class | object | iterates over `dir(obj)` |
| JsonArray | array | untyped JSON array |
| JsonObject | object | untyped JSON object |
| Any | oneOf | a union of all basic JSON schema types |
| Annotated[**T**, ...] | *depends on* **T** | outputs value for **T**, applies constraints and format based on auxiliary type information |

## JSON schema examples

### Simple basic types

| Python type | JSON schema |
| -- | -- |
| bool | `{"type": "boolean"}` |
| int | `{"type": "integer"}` |
| float | `{"type": "number"}` |
| str | `{"type": "string"}` |
| bytes | `{"type": "string", "contentEncoding": "base64"}` |

### Simple built-in types

| Python type | JSON schema |
| -- | -- |
| decimal.Decimal | `{"type": "number"}` |
| datetime.date | `{"type": "string", "format": "date"}` |
| uuid.UUID | `{"type": "string", "format": "uuid"}` |

### Enumeration types

```python
class Side(enum.Enum):
    LEFT = "L"
    RIGHT = "R"
```
```json
{"enum": ["L", "R"], "type": "string"}
```
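
The schema above can be derived from the enumeration members directly. The following is a minimal sketch, assuming all member values share one type that is either string or integer; the helper name `enum_schema_sketch` is hypothetical and not part of the package.

```python
import enum


class Side(enum.Enum):
    LEFT = "L"
    RIGHT = "R"


def enum_schema_sketch(enum_type):
    "Build an enumeration schema from member values (string or integer only)."
    values = [member.value for member in enum_type]
    # assumption: every member value has the same type as the first one
    json_type = "string" if isinstance(values[0], str) else "integer"
    return {"enum": values, "type": json_type}


side_schema = enum_schema_sketch(Side)
```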

### Container types

| Python type | JSON schema |
| -- | -- |
| List[int] | `{"type": "array", "items": {"type": "integer"}}` |
| Dict[str, int] | `{"type": "object", "additionalProperties": {"type": "integer"}}` |
| Set[int] | `{"type": "array", "items": {"type": "integer"}, "uniqueItems": true}` |
| Tuple[int, str] | `{"type": "array", "minItems": 2, "maxItems": 2, "prefixItems": [{"type": "integer"}, {"type": "string"}]}` |

### Annotated types

Range:
```python
Annotated[int, IntegerRange(23, 82)]
```
```json
{
    "type": "integer",
    "minimum": 23,
    "maximum": 82
}
```

Precision:
```python
Annotated[decimal.Decimal, Precision(9, 6)]
```
```json
{
    "type": "number",
    "multipleOf": 0.000001,
    "exclusiveMinimum": -1000,
    "exclusiveMaximum": 1000
}
```
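
The numbers in the precision example follow from simple arithmetic: 9 significant digits with 6 decimal digits leave 3 integer digits, so the value is a multiple of 10^-6 strictly between -10^3 and 10^3. The sketch below spells this out; the function name and parameter names are hypothetical, and the mapping is inferred from the example above rather than taken from the package source.

```python
import decimal


def precision_bounds_sketch(significant_digits, decimal_digits):
    "Derive schema constraints for a fixed-precision decimal number."
    # smallest representable step, e.g. 0.000001 for 6 decimal digits
    step = decimal.Decimal(10) ** -decimal_digits
    # digits left of the decimal point determine the exclusive bound
    limit = 10 ** (significant_digits - decimal_digits)
    return {
        "multipleOf": step,
        "exclusiveMinimum": -limit,
        "exclusiveMaximum": limit,
    }


bounds = precision_bounds_sketch(9, 6)
```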

### Fixed-width types

Fixed-width integer (e.g. `uint64`) and floating-point (e.g. `float32`) types are annotated types defined in the package `strong_typing.auxiliary`. Their signature is recognized when generating a schema, and a `format` property is written instead of minimum and maximum constraints.

`int32`:
```python
int32 = Annotated[int, Signed(True), Storage(4), IntegerRange(-2147483648, 2147483647)]
```

```json
{"format": "int32", "type": "integer"}
```

`uint64`:
```python
uint64 = Annotated[int, Signed(False), Storage(8), IntegerRange(0, 18446744073709551615)]
```

```json
{"format": "uint64", "type": "integer"}
```
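
The range arguments in the two definitions above follow from the storage size and signedness. As a quick check, this sketch computes them with a hypothetical helper `integer_range_sketch` (not part of the package):

```python
def integer_range_sketch(signed, byte_width):
    "Return the (minimum, maximum) values of a fixed-width integer."
    bits = 8 * byte_width
    if signed:
        # two's complement: one bit is used for the sign
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


int32_range = integer_range_sketch(True, 4)
uint64_range = integer_range_sketch(False, 8)
```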

### Any type

```json
{
    "oneOf": [
        {"type": "null"},
        {"type": "boolean"},
        {"type": "number"},
        {"type": "string"},
        {"type": "array"},
        {"type": "object"}
    ]
}
```

## Custom serialization and de-serialization

If a composite object (e.g. a dataclass or a plain Python class) has a `to_json` member function, then this function is invoked to produce a JSON object representation from an instance.

If a composite object has a `from_json` class function (a.k.a. `@classmethod`), then this function is invoked, passing the JSON object as an argument, to produce an instance of the corresponding type.
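
A class that provides both hooks might look like the sketch below. The class `Temperature` and its JSON layout are hypothetical, and the example calls the hooks directly; per the description above, the package's serialization functions would invoke them in place of field-by-field conversion.

```python
import dataclasses


@dataclasses.dataclass
class Temperature:  # hypothetical class, for illustration only
    kelvin: float

    def to_json(self):
        "Produce a JSON object whose layout differs from the Python fields."
        return {"value": self.kelvin, "unit": "K"}

    @classmethod
    def from_json(cls, data):
        "Create an instance from the JSON object produced by `to_json`."
        if data["unit"] != "K":
            raise ValueError(f"unsupported unit: {data['unit']}")
        return cls(kelvin=data["value"])


json_obj = Temperature(293.15).to_json()
restored = Temperature.from_json(json_obj)
```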

## Custom types

It is possible to declare custom types when generating a JSON schema. For example, the following class definition carries the decorator `@json_schema_type`, which registers a JSON schema subtype definition under the path `#/definitions/AzureBlob`. Properties of this type then refer to the definition with `$ref`:

```python
import re

_regexp_azure_url = re.compile(
    r"^https?://([^.]+)\.blob\.core\.windows\.net/([^/]+)/(.*)$")

@dataclass
@json_schema_type(
    schema={
        "type": "object",
        "properties": {
            "mimeType": {"type": "string"},
            "blob": {
                "type": "string",
                "pattern": _regexp_azure_url.pattern,
            },
        },
        "required": ["mimeType", "blob"],
        "additionalProperties": False,
    }
)
class AzureBlob(Blob):
    ...
```

You can use `@json_schema_type` without the `schema` parameter to register the type name but have the schema definition automatically derived from the Python type. This is useful if the type is reused across the type hierarchy:

```python
@json_schema_type
class Image:
    ...

class Study:
    left: Image
    right: Image
```

Here, the two properties of `Study` (`left` and `right`) will refer to the same subtype `#/definitions/Image`.

## Name mangling

If a Python class has a property whose name ends with an underscore (`_`), as recommended by [PEP 8](https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles) to avoid a conflict with a Python keyword (e.g. `for` or `in`), the underscore is removed when reading from or writing to JSON.
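
The mapping can be sketched in a few lines. The helper name `json_name_sketch` is hypothetical, and restricting the rule to names that would otherwise be keywords is an assumption of this sketch; the package may strip the underscore under different conditions.

```python
import keyword


def json_name_sketch(python_name):
    "Map a Python property name to the key used in JSON."
    # assumption: only strip the underscore if the rest is a Python keyword
    if python_name.endswith("_") and keyword.iskeyword(python_name[:-1]):
        return python_name[:-1]
    return python_name


names = [json_name_sketch(name) for name in ("for_", "in_", "name")]
```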
