nti.externalization.integer_strings: Short readable strings from large integers

Functions to represent potentially large integers as the shortest possible human-readable and writable strings. The motivation is to take integer ids, as produced by a zc.intid.IIntId utility, and produce something that a human can write down and type in. To this end, the strings produced have to be:

  • One-to-one and onto the integer domain;
  • As short as possible;
  • While not being easily confused;
  • Or accidentally permuted.

To meet those goals, we define an alphabet consisting of the ASCII digits and the upper- and lowercase letters, leaving out troublesome look-alikes: zero, upper- and lowercase oh, and uppercase Q; one and upper- and lowercase ell. (In fact, the characters in each troublesome group all map to the same character.)
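The alphabet idea described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the names ZERO_ALIKES, ONE_ALIKES, ALPHABET, NORMALIZE, encode and decode are hypothetical, the character ordering is an assumption, and the sketch omits the version marker, so its output will not match the strings produced by to_external_string.

```python
import string

# Hypothetical look-alike groups, taken from the prose description.
ZERO_ALIKES = "oOQ"   # easily confused with the digit 0
ONE_ALIKES = "lL"     # easily confused with the digit 1

# Digits plus letters, minus the look-alikes (the ordering is an assumption).
ALPHABET = "".join(
    c for c in string.digits + string.ascii_letters
    if c not in ZERO_ALIKES + ONE_ALIKES
)
BASE = len(ALPHABET)

# On input, fold every look-alike onto its canonical digit.
NORMALIZE = str.maketrans(
    ZERO_ALIKES + ONE_ALIKES,
    "0" * len(ZERO_ALIKES) + "1" * len(ONE_ALIKES),
)


def encode(number):
    """Write a non-negative integer in base len(ALPHABET)."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode(text):
    """Invert encode(), accepting look-alike characters as well."""
    number = 0
    for char in text.translate(NORMALIZE):
        index = ALPHABET.find(char)
        if index < 0:
            raise ValueError("illegal character: %r" % char)
        number = number * BASE + index
    return number
```

Because look-alikes are folded onto one canonical character before decoding, a reader who types an uppercase oh where a zero was written still arrives at the same integer.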

We also put a version marker at the end of the string so we can evolve this algorithm gracefully but still honor codes in the wild.
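One way a trailing version marker could allow older codes to keep working is sketched below. Everything here is an assumption made for illustration: the marker character "a", the DECODERS table, and the wrap and unwrap helpers are hypothetical, and the stand-in decoder uses plain base 10 rather than the module's alphabet.

```python
# Hypothetical marker for the current encoding scheme.
CURRENT_VERSION = "a"


def _decode_version_a(body):
    # Stand-in for the real alphabet decoding: plain base 10 for brevity.
    return int(body)


# One decoder per marker; a future scheme would add an entry, not replace one.
DECODERS = {"a": _decode_version_a}


def wrap(body):
    """Append the current version marker to an encoded body."""
    return body + CURRENT_VERSION


def unwrap(key):
    """Pick the decoder named by the last character of the key."""
    if not key:
        raise ValueError("empty key")
    body, marker = key[:-1], key[-1]
    if marker not in DECODERS:
        raise ValueError("unknown version marker: %r" % marker)
    return DECODERS[marker](body)
```

Under this scheme, changing the algorithm means registering a new marker and decoder while keeping the old entry, so codes already written down remain decodable.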

to_external_string(integer)

Turn an integer into a native string representation.

>>> from nti.externalization.integer_strings import to_external_string
>>> to_external_string(123)
'xk'
>>> to_external_string(123456789)
'kVxr5'

from_external_string(key)

Turn the string in key into an integer.

>>> from nti.externalization.integer_strings import from_external_string
>>> from_external_string('xkr')
6773

Parameters:
  • key (str) – A native string, as produced by to_external_string. (On Python 2, unicode keys are also valid.)

Raises:
  • ValueError – If the key is invalid or contains illegal characters.
  • UnicodeDecodeError – If the key is a unicode object and contains non-ASCII characters (which wouldn’t be valid anyway).