wsl - Python 3 library for WSL databases

This library provides an easy-to-use API to read and write WSL databases with built-in and user-defined datatypes. It uses str for parsing and formatting throughout.

This library is experimental. API changes are to be expected.

The wsl library in 1 minute:

Read a WSL database from a file with included schema. The built-in types ID and String are used to construct meaningful domains. These domains in turn are used to define tables. [Foo Bar] is the notation for the standard WSL string type. Its advantage is having separate opening and closing delimiters.

$ cat db.wsl
% DOMAIN Person ID
% DOMAIN Comment String
% TABLE person Person Comment
% TABLE parent Person Person
% KEY Person person P *
% REFERENCE Person1OfParent parent P * => person P *
% REFERENCE Person2OfParent parent * P => person P *
person foo [Foo Bar]
parent foo bar

import wsl

dbfilepath = "db.wsl"
schema, tables = wsl.parse_db(dbfilepath=dbfilepath)
for person in tables['person']:
    print(person)
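To illustrate why separate opening and closing delimiters are convenient, here is a minimal sketch of how a bracketed string such as [Foo Bar] could be lexed. The function lex_bracket_string is a hypothetical helper written for this illustration, not part of the wsl library, and it assumes that no escape sequences occur inside the brackets.

```python
def lex_bracket_string(text, i):
    """Lex a [bracketed] string that starts at text[i].

    Hypothetical helper, not the wsl implementation.
    Returns (j, value), where j is the index just past the closing bracket.
    Escape sequences are not handled in this sketch.
    """
    if i >= len(text) or text[i] != "[":
        raise ValueError("expected '[' at position %d" % i)
    end = text.find("]", i + 1)
    if end == -1:
        raise ValueError("missing ']' for string starting at position %d" % i)
    return end + 1, text[i + 1:end]


line = "person foo [Foo Bar]"
start = line.index("[")
end, value = lex_bracket_string(line, start)
print(value)  # Foo Bar
```

Because the closing delimiter differs from the opening one, the lexer can tell where the value ends without guessing whether a bracket starts or finishes a string.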

Read a WSL database from a Python 3 string. Here, the schema is given separately.

import wsl

schemastr = """\
DOMAIN Person ID
DOMAIN Comment String
TABLE person Person Comment
TABLE parent Person Person
KEY Person person P *
REFERENCE Person1OfParent parent P * => person P *
REFERENCE Person2OfParent parent * P => person P *
"""

dbstr = """\
person foo [Foo Bar]
parent foo bar
"""

schema, tables = wsl.parse_db(dbstr=dbstr, schemastr=schemastr)
for person in tables['person']:
    print(person)

Given a parsed schema and a suitable tables dict, we can encode the database back to a text string:

include_schema = True
text = wsl.format_db(schema, tables, include_schema)
print(text, end='')

User-defined datatypes

Custom datatypes are quite easy to add. We need a decoder and an encoder for values in database tuples. The decoder gets an already lexed token. It returns the decoded value or raises wsl.ParseError.

The encoder is the inverse. It takes a value and returns a (str) token or raises wsl.FormatError.

Let’s make a decoder / encoder pair for base64 encoded data.

import wsl
import base64
import binascii

def Base64_decode(token):
    try:
        v = base64.b64decode(token, validate=True)
    except binascii.Error as e:
        # Report the offending token and chain the original error for context.
        raise wsl.ParseError('Failed to parse base64 literal "%s"' % (token,)) from e
    return v

def Base64_encode(x):
    return base64.b64encode(x).decode('ascii')
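A quick way to check that a decoder and an encoder really are inverses is to round-trip a value through both. The sketch below does this for the base64 pair. So that it runs without wsl installed, it defines a local ParseError class as a stand-in for wsl.ParseError; the lower-case function names are likewise only for this illustration.

```python
import base64
import binascii


class ParseError(Exception):
    """Local stand-in for wsl.ParseError (assumption, for a standalone run)."""


def base64_decode(token):
    # validate=True rejects characters outside the base64 alphabet.
    try:
        return base64.b64decode(token, validate=True)
    except binascii.Error as e:
        raise ParseError('Failed to parse base64 literal "%s"' % token) from e


def base64_encode(value):
    return base64.b64encode(value).decode("ascii")


token = base64_encode(b"hello")
print(token)                 # aGVsbG8=
print(base64_decode(token))  # b'hello'

try:
    base64_decode("not base64!")
except ParseError as e:
    print("rejected:", e)
```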

Furthermore we need a domain parser. A domain parser gets a parameterization string (on a single line) and returns a domain object.

The parameterization string is what comes after the name of the datatype in the DOMAIN declaration line of the database schema.

A domain object holds the decoder and encoder callables as well as a lexer and an unlexer. The latter two can usually be taken from the wsl library.

In this example, we don’t add any parameterizability. But later, we might want to specify other characters instead of + and /.

def parse_Base64_domain(line):
    """Parser for Base64 domain declarations.

    No special syntax is recognized. Only the bare "Base64" is allowed.
    # TODO: Allow other characters instead of + and /
    """
    if line:
        raise wsl.ParseError('Construction of Base64 domain does not receive any arguments')
    class Base64Datatype:
        wsllex = wsl.lex_wsl_identifier
        wslunlex = wsl.unlex_wsl_identifier
        decode = Base64_decode
        encode = Base64_encode
    return Base64Datatype
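As a sketch of the parameterization hinted at in the TODO, the variant below accepts an optional two-character parameterization string that replaces + and /. The parameter syntax, the function name and the use of SimpleNamespace are assumptions made for this illustration; the local ParseError stands in for wsl.ParseError so the sketch runs on its own, and the lexer and unlexer are left out.

```python
import base64
import binascii
from types import SimpleNamespace


class ParseError(Exception):
    """Local stand-in for wsl.ParseError (assumption, for a standalone run)."""


def parse_base64_domain_with_altchars(line):
    """Hypothetical domain parser: `line` is empty or two replacement characters."""
    altchars = None
    if line:
        if len(line) != 2 or not line.isascii():
            raise ParseError("Base64 domain expects exactly two ASCII characters")
        altchars = line.encode("ascii")

    def decode(token):
        try:
            return base64.b64decode(token, altchars=altchars, validate=True)
        except binascii.Error as e:
            raise ParseError('Failed to parse base64 literal "%s"' % token) from e

    def encode(value):
        return base64.b64encode(value, altchars=altchars).decode("ascii")

    # A real domain object would also carry wsllex and wslunlex.
    return SimpleNamespace(decode=decode, encode=encode)


standard = parse_base64_domain_with_altchars("")
urlsafe = parse_base64_domain_with_altchars("-_")
print(standard.encode(b"\xfb\xff"))  # +/8=
print(urlsafe.encode(b"\xfb\xff"))   # -_8=
```

With such a parser, a schema line like DOMAIN Data Base64 -_ would pass "-_" as the parameterization string, assuming the library hands over everything after the datatype name as described above.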

Now we can parse a database using our custom parser:

schemastr = """\
DOMAIN Filename String
DOMAIN Data Base64
TABLE pic Filename Data
"""

dbstr = """\
pic [cat.png] bGDOgm10Dm+5ZPjfNmuP4kalHWUlqT3ZAK7WdP9QniET60y5aO4WmxDCxZUTD/IKOrC2DTSLSb/tLWkb7AyYfP1oMqdw08AFEVTdl8EEA2xldYPF4FY9WB5N+87Ymmjo7vVMpiFvcMJkZZv0zOQ6eeMpCUH2MoTPrrkTHOHx/yPA2hO32gKnOGpoCZQ7q6wUS/M1oHd6DRu1CyIMeJTAZAQjJz74oYAfr8Qt1GOWVswzLkojZlODE1WcVt8nrfm3+Kj3YNS43g2zNGwf7mb2Z7OZwzMqtQNnCuDJgXN3
"""

dps = wsl.get_builtin_domain_parsers()
dps['Base64'] = parse_Base64_domain
schema, tables = wsl.parse_db(dbstr=dbstr, schemastr=schemastr, domain_parsers=dps)
include_schema = True
print(wsl.format_db(schema, tables, include_schema), end='')

API listing

Schema

class wsl.Schema(spec, domains, tables, keys, foreignkeys)

Schema information for a WSL database.

spec

str – Textual specification used to construct this schema.

domains

A dict mapping domain names to SchemaDomain objects.

tables

A dict mapping table names to SchemaTable objects.

keys

A dict mapping unique key names to SchemaKey objects.

foreignkeys

A dict mapping foreign key names to SchemaForeignKey objects.

is_key(table, columns)

Test whether the given columns are a key of the given table.

Parameters:
  • table (str) – The name of a table in this schema
  • columns (tuple) – 0-based column indices of the table
Returns:

Whether the given columns form a key (not necessarily a candidate key).

Return type:

bool

Raises:
  • ValueError – If the table is not in the schema
  • ValueError – If the given column indices don’t match the given table.
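The remark "not necessarily a candidate key" suggests that any superset of a declared key also counts as a key. The sketch below shows one way such a test could work under that reading. The function columns_form_key and the layout of declared_keys are hypothetical and are not taken from the library.

```python
def columns_form_key(declared_keys, table, columns):
    """Return True if `columns` covers all columns of some declared key of `table`.

    declared_keys: iterable of (table_name, column_indices) pairs
    (hypothetical layout chosen for this sketch).
    """
    wanted = set(columns)
    return any(
        key_table == table and set(key_columns) <= wanted
        for key_table, key_columns in declared_keys
    )


declared = [("person", (0,))]
print(columns_form_key(declared, "person", (0,)))    # True
print(columns_form_key(declared, "person", (0, 1)))  # True, superset of a key
print(columns_form_key(declared, "person", (1,)))    # False
```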
class wsl.SchemaDomain(name, spec, funcs)

Domain object

name

str – Name of the domain as used e.g. in table declarations.

spec

str – Spec of the domain (parameterization from the definition).

funcs

Object holding the value decoder and encoder functions.

class wsl.SchemaTable(name, spec, columns, colnames)

Table object

name

str – Name of the table.

spec

str – Spec of the table (definition string).

columns

A tuple of domain names indicating the columns of the table.

colnames

A list containing tuples of column names. The list may be empty. Each tuple must be of the same length as columns. It holds a possible naming of the columns. Each name may be None, in which case there is no name available for the column in this naming.

class wsl.SchemaKey(name, spec, table, columns)

Unique key object

name

str – Name of the key.

spec

str – Spec of the key (definition string).

table

str – Name of the table on which the unique key constraint is placed.

columns

Tuple of 0-based column indices in strictly ascending order. These are the columns on which the table rows must be unique.

class wsl.SchemaForeignKey(name, spec, table, columns, reftable, refcolumns)

Foreign key object

name

str – Name of the foreign key.

spec

str – Spec of the foreign key (definition string).

table

str – Name of the table on which the constraint is placed.

columns

Tuple of 0-based column indices in strictly ascending order. These are the columns which serve as index into the foreign table.

reftable

str – Name of the foreign table.

refcolumns

Tuple of 0-based column indices in strictly ascending order. The number and types of the columns must be identical to those in columns.

Integrity

wsl.check_database_integrity(schema, tables)

Check integrity of a database.

Parameters:
  • schema (wsl.Schema) – WSL schema
  • tables – A dict mapping each table name defined in the schema to a table (a list of rows)
Raises:

wsl.IntegrityError – if the integrity check failed.

This function will check for violations of KEY and REFERENCE constraints.
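To make the two kinds of violation concrete, here is a minimal sketch of a KEY check and a REFERENCE check on plain lists of tuples. Both helper functions are hypothetical and only illustrate the idea; the library's own check may be organized differently.

```python
def find_key_violations(rows, key_columns):
    """Return rows whose values on key_columns repeat those of an earlier row."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            duplicates.append(row)
        seen.add(key)
    return duplicates


def find_reference_violations(rows, columns, ref_rows, ref_columns):
    """Return rows whose values on columns have no matching row in ref_rows."""
    targets = {tuple(r[c] for c in ref_columns) for r in ref_rows}
    return [row for row in rows
            if tuple(row[c] for c in columns) not in targets]


# Data from the introductory example: one person, one parent row.
person = [("foo", "Foo Bar")]
parent = [("foo", "bar")]

print(find_key_violations(person, (0,)))                      # []
print(find_reference_violations(parent, (0,), person, (0,)))  # []
print(find_reference_violations(parent, (1,), person, (0,)))  # [('foo', 'bar')]
```

In this data the second column of parent refers to a person bar that does not exist, so a check of the Person2OfParent reference from the introductory schema would be expected to fail.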

Database

class wsl.Database(schema, reference_followers=None)

Database instance.

schema

wsl.Schema – The WSL schema of which this database is an instance.

tuple_types

dict – The dynamically created tuple types. There is one for each table in the schema.

tables

A dict mapping table names to lists of table rows.

indices

A dict mapping (tablename, column indices) pairs to table rows.

__init__(schema, reference_followers=None)

Create a Database instance.

Parameters:
  • schema (wsl.Schema) – The WSL schema of which a database instance should be created.
  • reference_followers – A list of (foreign_key_name, member) pairs, where foreign_key_name is the name of a foreign key reference in the given schema and member is the name of a property to create on tuples of the table that is constrained by the foreign key reference.

reference_followers indicates which follower properties should be created. If it is None, reference followers for all foreign keys in the WSL schema are created in their respective tuple types. In that case the member name is ref_KEYNAME, where KEYNAME is the name of the key on the foreign table (which must always exist).

Exceptions

exception wsl.WslValueError

Base class for all WSL exceptions

exception wsl.LexError(lexicaltype, text, startpos, errorpos, errormsg)

LexError represents WSL text format token lexing errors

lexicaltype

str – Name of the lexical type of the value that could not be lexed.

text

str – The str buffer from which the value could not be lexed.

startpos

int – Position in text from which the lexing of the value started.

errorpos

int – Position in text where the lexing error occurred.

errormsg

str – Description of the lexing error.

exception wsl.UnlexError(lexicaltype, token, errormsg)

UnlexError represents errors that occurred while unlexing tokens to WSL text format

exception wsl.ParseError(context, text, startpos, errorpos, errormsg)

Raised on general parsing errors

exception wsl.FormatError(context, value, errormsg)

Raised on database formatting errors

exception wsl.IntegrityError

Raised on database inconsistencies

Lexing

wsl.lex_wsl_space(text, i)

Lex a single space character.

Parameters:
  • text (str) – Where to lex the space from.
  • i (int) – An index into text where the space is supposed to be.
Returns:

If the lex succeeds, (i, None) where i is the index of the next character following the space.

Raises:

wsl.LexError – If no space is found.

wsl.lex_wsl_newline(text, i)

Lex a single newline character.

Parameters:
  • text (str) – Where to lex the newline from.
  • i (int) – An index into text where the newline is supposed to be.
Returns:

If the lex succeeds, (i, None) where i is the index of the next character following the newline.

Raises:

wsl.LexError – If no newline is found.
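The lexers share one convention: they take a text and a start index and return the index of the next unconsumed character together with a value. The sketch below mimics that convention with local stand-ins, so that it runs without wsl, and chains them to walk over one database row. The names lex_space, lex_newline, lex_word and the local LexError are assumptions for illustration only.

```python
class LexError(Exception):
    """Local stand-in for wsl.LexError (assumption, for a standalone run)."""


def lex_space(text, i):
    if i >= len(text) or text[i] != " ":
        raise LexError("expected space at position %d" % i)
    return i + 1, None


def lex_newline(text, i):
    if i >= len(text) or text[i] != "\n":
        raise LexError("expected newline at position %d" % i)
    return i + 1, None


def lex_word(text, i):
    """Hypothetical lexer: consume characters up to the next space or newline."""
    j = i
    while j < len(text) and text[j] not in " \n":
        j += 1
    if j == i:
        raise LexError("expected a word at position %d" % i)
    return j, text[i:j]


text = "parent foo bar\n"
i, table = lex_word(text, 0)
i, _ = lex_space(text, i)
i, first = lex_word(text, i)
i, _ = lex_space(text, i)
i, second = lex_word(text, i)
i, _ = lex_newline(text, i)
print(table, first, second, i)  # parent foo bar 15
```

Returning the new index from every lexer lets the caller thread the position through a sequence of calls without any shared state.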

Parsing

wsl.parse_db(dbfilepath=None, dbstr=None, schema=None, schemastr=None, domain_parsers=None)

Convenience function to parse a WSL database.

This routine parses a database given schema information.

At most one of schema or schemastr may be given. If schema is None, the schema is parsed from a schema string. If schemastr is also None, the schema string is assumed to be inline before the database contents.

One, and only one, of dbfilepath or dbstr should be given.

Parameters:
  • dbfilepath (str) – Path to the file that contains the database.
  • dbstr (str) – A string that holds the database.
  • schema (wsl.Schema) – Optional schema. If not given, the schema is expected to be given in text form (either in schemastr or inline as part of the database).
  • schemastr (str) – Optional schema specification. If not given, the schema is expected to be given either in schema or inline as part of the database (each line prefixed with %).
  • domain_parsers (dict) – Optional domain parsers for the domains used in the database. If not given, the built-in parsers are used.
Returns:

A tuple (schema, tables). schema is the parsed wsl.Schema and tables is a dict mapping each table name to a list of database rows (parsed values).

Raises:

wsl.ParseError – if the parse failed.
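When neither schema nor schemastr is given, the schema lines come first in the database text, each prefixed with %. The following sketch shows how such a text could be separated into a schema string and the remaining rows. The function split_inline_schema is hypothetical, and dropping one space after the % is an assumption based on the db.wsl example at the top of this document.

```python
def split_inline_schema(text):
    """Separate leading %-prefixed schema lines from the database rows.

    Hypothetical helper, not the wsl implementation.
    Returns (schemastr, dbstr).
    """
    lines = text.splitlines(keepends=True)
    schema_lines = []
    n = 0
    while n < len(lines) and lines[n].startswith("%"):
        stripped = lines[n][1:]
        # Assumption: a single space may follow the "%" and is dropped.
        if stripped.startswith(" "):
            stripped = stripped[1:]
        schema_lines.append(stripped)
        n += 1
    return "".join(schema_lines), "".join(lines[n:])


text = """\
% DOMAIN Person ID
% TABLE person Person
person foo
"""
schemastr, dbstr = split_inline_schema(text)
print(schemastr, end="")
print(dbstr, end="")
```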

wsl.parse_schema(schemastr, domain_parsers=None)

Parse a WSL schema (without % escapes)

Parameters:
  • schemastr (str) – The schema string to parse
  • domain_parsers (dict) – maps parser names to parsers
Returns:

The parsed schema object

Return type:

wsl.Schema

Raises:

wsl.ParseError – If the parse failed

wsl.parse_row(text, i, lexers_of_relation)

Parse a database row (a relation name and according tuple of tokens).

This function lexes a relation name, which is used to look up the lexers of its columns in lexers_of_relation. These are then used to call parse_tokens().

Parameters:
  • text (str) – holds a database tuple.
  • i (int) – An index into text where the row starts.
  • lexers_of_relation (dict) – maps relation names to the list of the lexers of their according columns.
Returns:

The index of the first unconsumed character and a 2-tuple holding the lexed relation name and lexed tokens.

Return type:

(int, (str, tuple))

Raises:

wsl.ParseError – if the lex failed.

wsl.get_builtin_domain_parsers()

Get a dict containing all domain parsers built-in to this library.

The dict is freshly created, so can be modified by the caller.

Returns:

A dictionary mapping the names of all built-in parsers to the parsers.

Return type:

dict

Formatting

wsl.format_db(schema, tables, inline_schema)

Convenience function for formatting a WSL database.

Parameters:
  • schema (wsl.Schema) – schema object
  • tables (dict) – A dict mapping each table name to a table (list of rows)
  • inline_schema (bool) – Whether to include the schema (in escaped form).
Returns:

The formatted database

Return type:

str

Raises:

wsl.FormatError – if formatting fails.

wsl.format_schema(schema, escape=False)

Encode a schema object as a WSL schema string.

Parameters:
  • schema (wsl.Schema) – The schema object
  • escape (bool) – Whether the resulting string should be escaped for inline schema notation.
Returns:

The textual representation of the schema. Currently, this is just the spec attribute of the schema object. If escape=True, each line is prepended with %, so the schema string can be used inline in a text file.

Return type:

str
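The escaping described above amounts to prefixing each schema line. A minimal sketch follows; escape_schema is a hypothetical name, and the single space after the % is an assumption taken from the db.wsl example at the top of this document.

```python
def escape_schema(schemastr):
    """Prefix every schema line with "% " for inline notation.

    Hypothetical helper; the space after "%" is an assumption.
    """
    return "".join("% " + line for line in schemastr.splitlines(keepends=True))


schemastr = "DOMAIN Person ID\nTABLE person Person\n"
print(escape_schema(schemastr), end="")
# % DOMAIN Person ID
# % TABLE person Person
```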

wsl.format_row(table, row, encoders)

Encode a WSL database row (including leading table name).

Parameters:
  • table (str) – Name of the table this row belongs to.
  • row (tuple) – Values according to the columns of table
  • encoders (tuple) – Encoders according to the columns of table
Returns:

A single line (including the terminating newline character).

Return type:

str

Raises:

wsl.FormatError – if formatting fails.
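As a rough picture of what row formatting involves, the sketch below applies one encoder per column and joins the resulting tokens with single spaces after the table name. The function format_row_sketch and the two toy encoders are hypothetical. In particular, the bracket delimiters are added by the toy string encoder here, which may not match how the library divides work between encoders and unlexers.

```python
def format_row_sketch(table, row, encoders):
    """Join the table name and the encoded values with single spaces."""
    if len(row) != len(encoders):
        raise ValueError("row and encoders differ in length")
    tokens = [encode(value) for encode, value in zip(encoders, row)]
    return " ".join([table] + tokens) + "\n"


def encode_id(value):
    # Toy encoder: identifiers are written as they are.
    return value


def encode_string(value):
    # Toy encoder: wrap the text in the bracket notation.
    return "[" + value + "]"


line = format_row_sketch("person", ("foo", "Foo Bar"), (encode_id, encode_string))
print(line, end="")  # person foo [Foo Bar]
```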

wsl.format_values(row, encoders)

Encode a WSL database row (without leading table name)

Parameters:
  • row (tuple) – Some values to encode
  • encoders – Encoders according to the values in row.
Returns:

A single line (including the terminating newline character).

Return type:

str

Raises:

wsl.FormatError – if formatting fails.