Basic Usage

This document provides an overview of nti.externalization and shows some simple examples of its usage.

Reading through the Glossary before beginning is highly recommended.

Motivation and Use-cases

This package provides a supporting framework for transforming between Python objects and an arbitrary binary or textual representation. That representation is self-describing and intended to be stable. Uses for that representation include:

  • Communicating with browsers or other clients over HTTP (e.g., AJAX) or sockets;
  • Storing persistently on disk for later reconstituting the Python objects;
  • Using as a human-readable configuration format (with the proper choice of representation).

We expect that there will be lots of such objects in an application, and we want to make it as easy as possible to communicate them. Ideally, we want to be able to completely automate that, handing the entire task off to this package.

It is also important that when we read external input, we validate that it meets any internal constraints before further processing. For example, numbers should be within their allowed range, and references to other objects should actually result in an object of the expected type.

Finally, we don’t want to have to intermingle any code for reading and writing objects with our actual business (application) logic. Reading and writing should be kept as separate as possible from our model objects. Ideally, we should be able to seamlessly use third-party objects that we have no control over in both external and internal data.

Getting Started

In its simplest form, there are two functions you’ll use to externalize and internalize objects:

>>> from nti.externalization import to_external_object
>>> from nti.externalization import update_from_external_object

We can define an object that we want to externalize:

Caution

The examples in this section are not representative of best practices or preferred patterns. Please keep reading.

class InternalObject(object):

    def __init__(self, id=''):
        self._field1 = 'a'
        self._field2 = 42
        self._id = id

    def toExternalObject(self, request=None, **kwargs):
        return {'A Letter': self._field1, 'The Number': self._field2}

    def __repr__(self):
        return '<%s %r letter=%r number=%d>' % (
            self.__class__.__name__, self._id, self._field1, self._field2)

And we can externalize it with to_external_object:

>>> from pprint import pprint
>>> pprint(to_external_object(InternalObject()))
{'A Letter': 'a', 'The Number': 42}

If we want to update it, we need to write the corresponding method:

class UpdateInternalObject(InternalObject):

    def updateFromExternalObject(self, external_object, context=None):
        self._field1 = external_object['A Letter']
        self._field2 = external_object['The Number']

Updating it uses update_from_external_object:

>>> internal = UpdateInternalObject('internal')
>>> internal
<UpdateInternalObject 'internal' letter='a' number=42>
>>> update_from_external_object(internal, {'A Letter': 'b', 'The Number': 3})
<UpdateInternalObject 'internal' letter='b' number=3>

That’s Not Good Enough

Notice that we had to procedurally define the input and output steps in our classes. For some (small) applications, that may be good enough, but it doesn’t come anywhere close to meeting our motivations:

  1. By mingling the externalization code into our business objects, it makes them larger and muddies their true purpose.
  2. There’s nothing doing any validation. Any such checking is left up to the object itself.
  3. It’s manual code to write and test for each of the many objects we can communicate. There’s nothing automatic about it.

Let’s see how this package helps us address each of those concerns in turn.

Adapters and Configuration

This package makes heavy use of the Zope Component Architecture to abstract away details and separate concerns. Most commonly this is configured using ZCML, and this package ships with a configure.zcml that you should load:

>>> from zope.configuration import xmlconfig
>>> import nti.externalization
>>> xmlconfig.file('configure.zcml', nti.externalization)
<zope.configuration.config.ConfigurationMachine ...>

The toExternalObject method is defined by the nti.externalization.interfaces.IInternalObjectExternalizer interface, and the updateFromExternalObject method is defined by the nti.externalization.interfaces.IInternalObjectUpdater interface. Because it is common that one object both reads and writes the external representation, the two interfaces are joined together in nti.externalization.interfaces.IInternalObjectIO. Let’s create a new internal object:

class InternalObject(object):
    def __init__(self, id=''):
        self._field1 = 'a'
        self._field2 = 42
        self._id = id

    def __repr__(self):
        return '<%s %r letter=%r number=%d>' % (
            self.__class__.__name__, self._id, self._field1, self._field2)

Now we will write an IInternalObjectIO adapter for it:

from zope.interface import implementer
from zope.component import adapter

from nti.externalization.interfaces import IInternalObjectIO
from nti.externalization.datastructures import StandardInternalObjectExternalizer

@implementer(IInternalObjectIO)
@adapter(InternalObject)
class InternalObjectIO(StandardInternalObjectExternalizer):
    def __init__(self, context):
        super(InternalObjectIO, self).__init__(context)
        # Setting this is optional, if we don't like the default
        self.__external_class_name__ = 'ExternalObject'

    def toExternalObject(self, **kwargs):
        result = super(InternalObjectIO, self).toExternalObject(**kwargs)
        result.update({
            'Letter': self.context._field1,
            'Number': self.context._field2
        })
        return result

    def updateFromExternalObject(self, external_object, context=None):
        self.context._field1 = external_object['Letter']
        self.context._field2 = external_object['Number']

Tip

It is a best practice for custom externalizers to either extend an existing datastructure, typically StandardInternalObjectExternalizer for simple cases (as in the example above), or to begin with result = to_standard_external_dictionary(self.context) and update that mapping in place. (The StandardInternalObjectExternalizer calls to_standard_external_dictionary under the covers.)

Caution

The signature for toExternalObject is poorly defined right now. The suitable keyword arguments should be enumerated and documented, but they are not. See https://github.com/NextThought/nti.externalization/issues/54

We can register the adapter (normally this would be done in ZCML) and use it:

<configure xmlns="http://namespaces.zope.org/zope">
    <include package="nti.externalization" />
    <adapter factory=".InternalObjectIO" />
</configure>

Because we don’t have a Python package to put this ZCML in, we’ll register it manually.

>>> from zope import component
>>> component.provideAdapter(InternalObjectIO, provides=IInternalObjectIO)
>>> internal = InternalObject('original')
>>> internal
<InternalObject 'original' letter='a' number=42>
>>> pprint(to_external_object(internal))
{'Class': 'ExternalObject', 'Letter': 'a', 'Number': 42}
>>> update_from_external_object(internal, {'Letter': 'b', 'Number': 3})
<InternalObject 'original' letter='b' number=3>

Notice that the external form included the Class key; this is one of the StandardExternalFields automatically recognized by the built-in externalizers, whose value is taken from the corresponding key named in StandardInternalFields. There are others:

>>> internal.creator = u'sjohnson'
>>> internal.createdTime = 123456
>>> pprint(to_external_object(internal))
{'Class': 'ExternalObject',
 'CreatedTime': 123456,
 'Creator': 'sjohnson',
 'Letter': 'b',
 'Number': 3}

Note

Notice how the names of the internal fields, creator and createdTime, became Creator and CreatedTime in the external object. The convention used by this library is that fields that cannot be modified directly by the client are always capitalized. Your custom externalizers and interface definitions should follow this convention.
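
An application that wants to enforce this convention on incoming data could filter capitalized keys out of client input before applying it. The following is a minimal sketch using only the standard library; the helper name strip_read_only_keys is hypothetical and is not part of nti.externalization, whose own updaters handle the standard fields in their own way.

```python
def strip_read_only_keys(external):
    """Return a copy of *external* without keys that start with a capital letter.

    Hypothetical helper for illustration only; not part of nti.externalization.
    """
    return {key: value for key, value in external.items()
            if not key[:1].isupper()}


# A client tries to overwrite server-controlled fields along with a normal one.
incoming = {'Creator': 'mallory', 'CreatedTime': 0, 'alias': 'Steve'}
client_editable = strip_read_only_keys(incoming)
print(client_editable)
```

Only the lower-case alias key survives, so the server-controlled values are left untouched.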

By using adapters like this, we can separate out externalization from our core logic. Of course, that’s still a lot of manual code to write.
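
The separation that adapters provide can be pictured without the Zope Component Architecture. The following is a minimal sketch that uses functools.singledispatch from the standard library as a stand-in for adapter lookup; the names Point and externalize are hypothetical, and this is not how nti.externalization is implemented.

```python
from functools import singledispatch


class Point(object):
    """A model object with no knowledge of its external form."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


@singledispatch
def externalize(obj):
    # Fallback used when no externalizer is registered for the object's type.
    raise TypeError('No externalizer registered for %r' % (type(obj),))


@externalize.register(Point)
def _externalize_point(point):
    # Lives outside Point, much like an adapter registered for a class.
    return {'Class': 'Point', 'x': point.x, 'y': point.y}


print(externalize(Point(1, 2)))
```

As with the adapter above, Point itself stays free of externalization code, and third-party classes could be supported by registering another function for them.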

Using Schemas for Validation and Automatic Externalization

Most application objects will implement one or more interfaces. When those interfaces contain attributes from zope.schema.field or nti.schema.field, they are also called schemas. This package can automate the entire externalization process, including validation, based on the schemas an object implements.
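
To make the idea concrete, here is a toy sketch, using only the standard library, of how declared fields can drive both externalization and validation. It assumes dataclasses as a stand-in for schemas; ToyAddress, toy_externalize and toy_update are hypothetical names, and the real mechanism in this package is based on zope.schema fields rather than anything shown here.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ToyAddress:
    # 'max_length' in the metadata plays the role of a schema constraint.
    city: str = field(default='', metadata={'max_length': 30})
    state: str = field(default='', metadata={'max_length': 10})


def toy_externalize(obj):
    """Build the external dictionary from the declared fields."""
    result = {'Class': type(obj).__name__}
    for f in fields(obj):
        result[f.name] = getattr(obj, f.name)
    return result


def toy_update(obj, external):
    """Validate every incoming value first, then assign them all."""
    updates = {}
    for f in fields(obj):
        if f.name not in external:
            continue
        value = external[f.name]
        max_length = f.metadata.get('max_length')
        if max_length is not None and len(value) > max_length:
            raise ValueError('%s is too long' % (f.name,))
        updates[f.name] = value
    # Nothing is assigned unless every value passed validation.
    for name, value in updates.items():
        setattr(obj, name, value)
    return obj


address = ToyAddress(city='Cupertino', state='CA')
toy_update(address, {'city': 'Salem', 'state': 'MA'})
print(toy_externalize(address))
```

Neither function mentions a specific field by name; everything is derived from the declarations, which is the property the rest of this section relies on.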

Let’s start by writing a simple schema.

from zope.interface import Interface
from zope.interface import taggedValue

from nti.schema.field import ValidTextLine

class IAddress(Interface):

    full_name = ValidTextLine(title=u"First name", required=True)

    street_address_1 = ValidTextLine(title=u"Street line 1",
                                     max_length=75, required=True)

    street_address_2 = ValidTextLine(title=u"Street line 2",
                                     required=False, max_length=75)

    city = ValidTextLine(title=u"City name", required=True)

    state = ValidTextLine(title=u"State name",
                          required=False, max_length=10)

    postal_code = ValidTextLine(title=u"Postal code",
                                required=False, max_length=30)

    country = ValidTextLine(title=u"Nation name", required=True)

And now an implementation of that interface.

from nti.schema.fieldproperty import createDirectFieldProperties
from nti.schema.schema import SchemaConfigured

@implementer(IAddress)
class Address(SchemaConfigured):
    createDirectFieldProperties(IAddress)

Externalizing (and updating!) based on the schema is done with InterfaceObjectIO. We’ll create a subclass to configure it.

from nti.externalization.datastructures import InterfaceObjectIO

@adapter(IAddress)
class AddressIO(InterfaceObjectIO):
    _ext_iface_upper_bound = IAddress

Now we can register and use it as before:

>>> component.provideAdapter(AddressIO)
>>> address = Address(full_name=u'Steve Jobs',
...    street_address_1=u'One Infinite Loop',
...    city=u'Cupertino',
...    state=u'CA',
...    postal_code=u'95014',
...    country=u'USA')
>>> external = to_external_object(address)
>>> pprint(external)
{'Class': 'Address',
 'city': 'Cupertino',
 'country': 'USA',
 'full_name': 'Steve Jobs',
 'postal_code': '95014',
 'state': 'CA',
 'street_address_1': 'One Infinite Loop',
 'street_address_2': None}

Oops, One Infinite Loop was Apple’s old address. They’ve since moved into their new headquarters:

>>> external['street_address_1'] = u'One Apple Park Way'
>>> _ = update_from_external_object(address, external)
>>> address.street_address_1
'One Apple Park Way'

Notice that our schema declared a number of constraints. For instance, the full_name is required, and the state cannot be longer than ten characters. Let’s see what happens when we try to violate these conditions:

>>> external['state'] = u'Commonwealth of Massachusetts'
>>> update_from_external_object(address, external)
Traceback (most recent call last):
...
TooLong: ('State is too long.', 'state', u'Commonwealth of Massachusetts')
>>> external['state'] = u'CA'
>>> external['full_name'] = None
>>> update_from_external_object(address, external)
Traceback (most recent call last):
...
zope.schema._bootstrapinterfaces.RequiredMissing: full_name

Much better! We get validation of our constraints and we didn’t have to write much code. But we still had to write some code: one class for each object we’re externalizing. Can we do better?

autoPackageIO: Handing responsibility to the framework

The answer is yes, we can do much better, with the ext:registerAutoPackageIO ZCML directive.

Note

ext:registerAutoPackageIO is biased toward a conventional single-package layout: one or more root interfaces in interfaces.py, and one or more modules defining factories (classes) implementing those interfaces. To an extent this can be changed using the iobase argument.

The above example schema is taken from the tests distributed with this package in nti.externalization.tests.benchmarks. That package provides the schema (as shown above), an implementation of it, and the ZCML file that pulls it all together with one directive.

Here’s the schema, along with several other schemas that define a rich user profile, in interfaces.py:

# -*- coding: utf-8 -*-
"""
A rich user profile.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from zope.interface import Interface
from zope.interface import taggedValue

from zope.schema import Object
from zope.schema import URI

from nti.schema.field import Dict
from nti.schema.field import TextLine
from nti.schema.field import ValidTextLine
from nti.schema.field import DecodingValidTextLine

from nti.externalization.tests.benchmarks.bootstrapinterfaces import IRootInterface
from nti.externalization.tests.benchmarks.bootstrapinterfaces import checkEmailAddress
from nti.externalization.tests.benchmarks.bootstrapinterfaces import checkRealname

# pylint:disable=inherit-non-class
class IFriendlyNamed(Interface):

    alias = TextLine(title=u'Display alias',
                     description=u"Enter preferred display name alias, e.g., johnnyboy. "
                     u"Your site may impose limitations on this value.",
                     required=False)

    realname = TextLine(title=u'Full Name aka realname',
                        description=u"Enter full name, e.g. John Smith.",
                        required=False,
                        constraint=checkRealname)

class IAvatarURL(Interface):
    """
    Something that features a display URL.
    """

    avatarURL = URI(title=u"URL of your avatar picture",
                    description=u"If not provided, one will be generated for you.",
                    required=False)


class IBackgroundURL(Interface):

    backgroundURL = URI(title=u"URL of your background picture",
                        description=u"If not provided, one will be generated for you.",
                        required=False)


class IProfileAvatarURL(IAvatarURL, IBackgroundURL):
    pass


class IAddress(IRootInterface):

    full_name = ValidTextLine(title=u"First name", required=True)

    street_address_1 = ValidTextLine(title=u"Street line 1",
                                     max_length=75, required=True)

    street_address_2 = ValidTextLine(title=u"Street line 2",
                                     required=False, max_length=75)

    city = ValidTextLine(title=u"City name", required=True)

    state = ValidTextLine(title=u"State name",
                          required=False, max_length=10)

    postal_code = ValidTextLine(title=u"Postal code",
                                required=False, max_length=30)

    country = ValidTextLine(title=u"Nation name", required=True)
    taggedValue('__external_class_name__',
                'Address')


class IUserContactProfile(Interface):

    addresses = Dict(title=u"A mapping of address objects.",
                     key_type=DecodingValidTextLine(title=u"Address key"),
                     value_type=Object(IAddress),
                     min_length=0,
                     required=False)

    phones = Dict(title=u"A mapping of phone numbers objects.",
                  key_type=ValidTextLine(title=u"Phone key"),
                  value_type=ValidTextLine(title=u"A phone"),
                  min_length=0,
                  required=False)

    contact_emails = Dict(title=u"A mapping of contact emails.",
                          key_type=DecodingValidTextLine(title=u"Email key"),
                          value_type=ValidTextLine(title=u'Email',
                                                   constraint=checkEmailAddress),
                          min_length=0,
                          required=False)

class IUserProfile(IProfileAvatarURL, # pylint:disable=too-many-ancestors
                   IUserContactProfile,
                   IFriendlyNamed,
                   IRootInterface):
    """A user profile"""
    taggedValue('__external_class_name__',
                'UserProfile')

They are implemented in objects.py very simply (as above):

from zope import interface
from zope.schema.fieldproperty import createFieldProperties
from nti.schema.eqhash import EqHash
from nti.schema.fieldproperty import createDirectFieldProperties
from nti.schema.schema import SchemaConfigured
from nti.externalization.representation import WithRepr
from nti.externalization.tests.benchmarks import interfaces

@interface.implementer(interfaces.IAddress)
@EqHash('full_name', 'street_address_1', 'postal_code')
@WithRepr
class Address(SchemaConfigured):
    createDirectFieldProperties(interfaces.IAddress)

@interface.implementer(interfaces.IUserProfile)
@EqHash('addresses', 'alias', 'phones', 'realname')
@WithRepr
class UserProfile(SchemaConfigured):
    createFieldProperties(interfaces.IUserProfile)

Finally, the ZCML file contains one directive that ties everything together:

<configure xmlns="http://namespaces.zope.org/zope"
           xmlns:ext="http://nextthought.com/ntp/ext">

    <include package="zope.component" />

    <include package="nti.externalization" file="meta.zcml" />
    <include package="nti.externalization" />

    <ext:registerAutoPackageIO
        root_interfaces=".interfaces.IRootInterface"
        modules=".objects"
        />
</configure>

If we configure this file, we can create and update addresses. We’ll do so through their container object, the UserProfile, thus demonstrating that nested schemas and objects are possible.

>>> import nti.externalization.tests.benchmarks
>>> _ = xmlconfig.file('configure.zcml', nti.externalization.tests.benchmarks)
>>> from nti.externalization.tests.benchmarks.objects import Address
>>> from nti.externalization.tests.benchmarks.objects import UserProfile
>>> home_address = Address(
...     full_name=u'Steve Jobs',
...     street_address_1=u'1313 Mockingbird Lane',
...     city=u'Salem',
...     state=u'MA',
...     postal_code=u'6666',
...     country=u'USA',
... )
>>> work_address = Address(
...     full_name=u'Apple',
...     street_address_1=u'1 Infinite Loop',
...     city=u'Cupertino',
...     state=u'CA',
...     postal_code=u'55555',
...     country=u'USA',
...  )
>>> user_profile = UserProfile(
...     addresses={u'home': home_address, u'work': work_address},
...     phones={u'home': u'405-555-1212', u'work': u'405-555-2323'},
...     contact_emails={u'home': u'steve.jobs@gmail.com', u'work': u'steve@apple.com'},
...     avatarURL='http://apple.com/steve.png',
...     backgroundURL='https://apple.com/bg.jpeg',
...     alias=u'Steve',
...     realname=u'Steve Jobs',
... )
>>> external = to_external_object(user_profile)
>>> pprint(external)
{'Class': 'UserProfile',
 'MimeType': 'application/vnd.nextthought.benchmarks.userprofile',
 'addresses': {'home': {'Class': 'Address',
                        'MimeType': 'application/vnd.nextthought.benchmarks.address',
                        'city': 'Salem',
                        'country': 'USA',
                        'full_name': 'Steve Jobs',
                        'postal_code': '6666',
                        'state': 'MA',
                        'street_address_1': '1313 Mockingbird Lane',
                        'street_address_2': None},
               'work': {'Class': 'Address',
                        'MimeType': 'application/vnd.nextthought.benchmarks.address',
                        'city': 'Cupertino',
                        'country': 'USA',
                        'full_name': 'Apple',
                        'postal_code': '55555',
                        'state': 'CA',
                        'street_address_1': '1 Infinite Loop',
                        'street_address_2': None}},
 'alias': 'Steve',
 'avatarURL': 'http://apple.com/steve.png',
 'backgroundURL': 'https://apple.com/bg.jpeg',
 'contact_emails': {'home': 'steve.jobs@gmail.com', 'work': 'steve@apple.com'},
 'phones': {'home': '405-555-1212', 'work': '405-555-2323'},
 'realname': 'Steve Jobs'}

Notice that there are some additional bits of data in the external form that are not specified in the interface. Here, that’s Class and MimeType. These are two of the Standard Fields.

Let’s make a change to the work address:

>>> external['addresses'][u'work']['street_address_1'] = u'One Apple Park Way'
>>> _ = update_from_external_object(user_profile, external)
>>> user_profile.addresses['work'].street_address_1
'One Apple Park Way'

Importantly, note that, by default, the nested objects are created fresh and not mutated.

>>> user_profile.addresses['work'] is work_address
False

This is described in more detail in Factories.
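
One way to picture why the nested objects are new instances is a toy factory lookup keyed on the MimeType standard field. The following is a sketch using only the standard library; Place, TOY_FACTORIES and toy_internalize are hypothetical names, the MIME type is made up, and the package's actual factory lookup may differ from this.

```python
class Place(object):
    """A stand-in for a nested model object such as an address."""

    def __init__(self, city=''):
        self.city = city


# Maps the external 'MimeType' to a callable that creates a fresh object.
TOY_FACTORIES = {'application/vnd.example.place': Place}


def toy_internalize(external):
    """Create a new object for *external* and fill it in."""
    factory = TOY_FACTORIES[external['MimeType']]
    obj = factory()
    obj.city = external['city']
    return obj


original = Place(city='Cupertino')
external_place = {'MimeType': 'application/vnd.example.place',
                  'city': 'Cupertino'}
rebuilt = toy_internalize(external_place)

# Equal data, but a different object.
print(rebuilt.city == original.city)
print(rebuilt is original)
```

Code that keeps a reference to the old nested object (here, original) will not observe later updates, so it is usually safer to re-read nested values from their container after an update.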

Representations

Being able to get a Python dictionary from an object, and to update an object given a Python dictionary, is nice, but it doesn’t fully meet the goal of this package: interoperating with remote clients using a text-based (or byte-based) stream.

For that, we have the nti.externalization.representation module, and its key interface IExternalObjectIO.

A representation is a format that can serialize Python dictionaries to text, and given that text, produce a Python dictionary. This package provides two representations by default, JSON and YAML. These are named utilities providing IExternalObjectIO. The function nti.externalization.to_external_representation is a shortcut for dumping to a string:

>>> from nti.externalization import to_external_representation
>>> from nti.externalization.interfaces import EXT_REPR_JSON, EXT_REPR_YAML
>>> to_external_representation(address, EXT_REPR_JSON)
'{"Class": "Address", "city": "Cupertino",...
>>> to_external_representation(address, EXT_REPR_YAML)
"{Class: Address, city: Cupertino, country: USA,...

Loading from a string doesn’t have a shortcut; we need to use the utility:

>>> from nti.externalization.interfaces import IExternalObjectIO
>>> external = to_external_object(address)
>>> yaml_io = component.getUtility(IExternalObjectIO, EXT_REPR_YAML)
>>> ext_yaml_str = yaml_io.dump(external)
>>> external_from_yaml = yaml_io.load(ext_yaml_str)
>>> external_from_yaml == external
True
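
What a representation has to do can also be seen with only the standard library json module. This sketch does not use the package's JSON utility; the external_form dictionary and its tags key are made up for illustration, and the point is simply that a round trip is lossless only for types the text format can express.

```python
import json

external_form = {
    'Class': 'Address',
    'city': 'Cupertino',
    'street_address_2': None,
    'tags': ('work', 'hq'),  # a tuple, included purely for illustration
}

# Dump the dictionary to text, then load it back.
text = json.dumps(external_form, sort_keys=True)
loaded = json.loads(text)

# None survives the trip (as JSON null), but the tuple comes back as a list.
print(loaded['street_address_2'] is None)
print(loaded['tags'])
print(loaded == external_form)
```

The final comparison is False only because of the tuple; an external form built from dictionaries, lists, strings, numbers and None compares equal after the round trip.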