Recipes

Parsed classes

For some classes, the easiest way to write them in a YAML file is as a string representing all the values in the class. For example, you may want to have a namespaced name that looks like ns.subns.name in the YAML text format, but on the Python side represent it as a class having one attribute namespaces that is a list of namespaces, and an attribute name that is a string.

To do this, you need to override recognition to tell YAtiML to recognise a string (because by default it will expect a mapping), and then add a savorizing function that parses the string and generates a mapping, whose attributes will then be passed to your constructor. To save objects of your class as strings, you’ll need to add a sweetening function too. The complete solution looks like this:

docs/examples/parsed_classes.py
from typing import List

from ruamel import yaml

import yatiml


class Identifier:
    def __init__(self, namespaces: List[str], name: str) -> None:
        self.namespaces = namespaces
        self.name = name

    @classmethod
    def _yatiml_recognize(cls, node: yatiml.UnknownNode) -> None:
        node.require_scalar(str)

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        text = str(node.get_value())
        parts = text.split('.')
        node.make_mapping()

        # We need to make a yaml.SequenceNode by hand here, since
        # set_attribute doesn't take lists as an argument.
        start_mark = yaml.error.StreamMark('generated node', 0, 0, 0)
        end_mark = yaml.error.StreamMark('generated node', 0, 0, 0)
        item_nodes = list()
        for part in parts[:-1]:
            item_nodes.append(yaml.ScalarNode('tag:yaml.org,2002:str', part,
                                              start_mark, end_mark))
        ynode = yaml.SequenceNode('tag:yaml.org,2002:seq', item_nodes,
                                  start_mark, end_mark)
        node.set_attribute('namespaces', ynode)
        node.set_attribute('name', parts[-1])

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        namespace_nodes = node.get_attribute('namespaces').seq_items()
        namespaces = list(map(yatiml.Node.get_value, namespace_nodes))
        namespace_str = '.'.join(namespaces)

        name = node.get_attribute('name').get_value()
        node.set_value('{}.{}'.format(namespace_str, name))


load = yatiml.load_function(Identifier)
dumps = yatiml.dumps_function(Identifier)

yaml_text = 'yatiml.logger.setLevel\n'
doc = load(yaml_text)

print(type(doc))
print(doc.namespaces)
print(doc.name)

doc = Identifier(['yatiml'], 'add_to_loader')
yaml_text = dumps(doc)

print(yaml_text)

The tricky part here is in the savorize and sweeten functions. Savorize needs to build a list of strings, which YAtiML doesn’t help with, so it has to construct node objects of the underlying YAML library directly (the example imports them from ruamel.yaml, a derivative of PyYAML). For each namespace item, it builds a yaml.ScalarNode, which represents a scalar and has a tag describing the type and a value, which is a string. It also requires a start mark and an end mark, for which we use dummy values, since this node was generated and therefore has no position in the file. The library will raise an exception if you leave them out. The item nodes are then collected in a yaml.SequenceNode, which is set as the value of the namespaces attribute.

Sweeten does the reverse: it gets a list of yatiml.Node objects representing the items in the namespaces attribute, extracts the string values using yatiml.Node.get_value(), joins them with periods, and finally combines the result with the name. Since we’re only altering the top-level node here, we do not need to build a yaml.ScalarNode ourselves but can just use yatiml.Node.set_value().
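
The string handling that savorize and sweeten perform can be tried out in isolation. The following is a minimal sketch in plain Python, without YAtiML or any YAML nodes; the helper names split_identifier and join_identifier are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def split_identifier(text: str) -> Tuple[List[str], str]:
    """Split 'ns.subns.name' into (['ns', 'subns'], 'name')."""
    parts = text.split('.')
    # Everything before the last period is a namespace, the rest is the name.
    return parts[:-1], parts[-1]


def join_identifier(namespaces: List[str], name: str) -> str:
    """Reverse of split_identifier: join all parts with periods."""
    return '.'.join(namespaces + [name])


namespaces, name = split_identifier('yatiml.logger.setLevel')
print(namespaces)                           # ['yatiml', 'logger']
print(name)                                 # setLevel
print(join_identifier(namespaces, name))    # yatiml.logger.setLevel
```

Joining the namespaces and the name in a single step means that, in this sketch, an identifier without any namespaces round-trips as just the name, with no leading period.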

Timestamps and dates

YAML has a timestamp type, which represents a point in time. The underlying YAML library parses this into a Python datetime.date object, and will serialise such an object back to a YAML timestamp. YAtiML supports this as well, so all you need to do to use a timestamp or a date is to use datetime.date in your class definition.

Note that the object created by YAtiML may be an instance of datetime.date (if no time is given) or an instance of datetime.datetime (if a time is given), which is a subclass of datetime.date. Because of this subclass relationship, an attribute annotated as datetime.date also accepts a datetime.datetime, so you cannot currently specify in the type that you want only a date, without a time attached to it.

If this is an attribute in a class, and date-with-time is not a legal value, then you should add a check to the __init__ method that raises an exception if the given value is an instance of datetime.datetime. That way, you can’t accidentally make an instance of the class in Python with an incorrect value either.
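
A minimal sketch of such a check, using only the standard library, could look as follows. The class name Birthday and its attribute are hypothetical, and raising a ValueError is an assumption made for this example; pick whichever exception fits your code.

```python
import datetime


class Birthday:
    def __init__(self, day: datetime.date) -> None:
        # datetime.datetime is a subclass of datetime.date, so the type
        # annotation alone does not rule out a date with a time attached.
        if isinstance(day, datetime.datetime):
            raise ValueError('Expected a date without a time')
        self.day = day


print(issubclass(datetime.datetime, datetime.date))    # True
print(Birthday(datetime.date(2020, 2, 29)).day)        # 2020-02-29

try:
    Birthday(datetime.datetime(2020, 2, 29, 12, 0))
except ValueError as e:
    print(e)    # Expected a date without a time
```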

Dashed keys

Some YAML-based formats (like CFF) use dashes in their mapping keys. This is a problem for YAtiML, because keys get mapped to parameters of __init__, which are identifiers, and Python identifiers cannot contain dashes. So some kind of conversion has to be made, and YAtiML’s seasoning mechanism is just the way to do it. yatiml.Node has two methods that convert all dashes in a mapping’s keys to underscores and back: dashes_to_unders_in_keys() and unders_to_dashes_in_keys(). All you need to do is use underscores instead of dashes when defining your classes, and add seasoning functions. Here’s an example:

docs/examples/dashed_keys.py
import yatiml


# Create document class
class Dashed:
    def __init__(self, an_attribute: int, another_attribute: str) -> None:
        self.an_attribute = an_attribute
        self.another_attribute = another_attribute

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        node.dashes_to_unders_in_keys()

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        node.unders_to_dashes_in_keys()


# Create loader
load = yatiml.load_function(Dashed)

# Create dumper
dumps = yatiml.dumps_function(Dashed)

# Load YAML
yaml_text = ('an-attribute: 42\n'
             'another-attribute: with-dashes\n')
doc = load(yaml_text)

print(type(doc))
print(doc.an_attribute)
print(doc.another_attribute)


# Dump YAML

dumped_text = dumps(doc)
print(dumped_text)

If you’ve been paying very close attention, then you may be wondering why this example passes through the recognition stage. After all, the names of the keys do not match those of the __init__ parameters. YAtiML is a bit flexible in this regard, and will match a key to a parameter if it is identical after dashes have been replaced with underscores. The flexibility is only in the recognition stage, not in the type checking stage, so you do need the seasoning functions. (The reason to not completely automate this is that YAtiML cannot know if the YAML side should have dashes or underscores. So you need to specify this somehow in order to be able to dump correctly, and then it’s better to specify it on loading as well for symmetry.)
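
To make the key conversion concrete, here is a rough sketch of the same transformation applied to a plain dict. This is not YAtiML's implementation, which operates on YAML nodes; the function names dashes_to_unders and unders_to_dashes are hypothetical.

```python
from typing import Any, Dict


def dashes_to_unders(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with dashes in the keys replaced by underscores."""
    return {key.replace('-', '_'): value for key, value in mapping.items()}


def unders_to_dashes(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with underscores in the keys replaced by dashes."""
    return {key.replace('_', '-'): value for key, value in mapping.items()}


loaded = {'an-attribute': 42, 'another-attribute': 'with-dashes'}
python_side = dashes_to_unders(loaded)
print(python_side)
# {'an_attribute': 42, 'another_attribute': 'with-dashes'}
print(unders_to_dashes(python_side) == loaded)    # True
```

Only the keys change; the value with-dashes is left alone. The round trip is lossless here only because no key contained an underscore to begin with, which illustrates why the direction of the conversion has to be specified explicitly.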

Seasoning enumerations

By default, YAtiML will use an enum member’s name to write to the YAML file, and that’s what it will recognise on loading as well. Sometimes that’s not what you want, however. Maybe you want to use the values, or you want to have the names on the Python side in uppercase (because PEP-8 says so) while using a lowercase version in the YAML file. In that case, you can apply YAtiML’s seasoning mechanisms like this:

docs/examples/enum_use_values.py
# Example from https://github.com/GooglingTheCancerGenome/sv-gen
import enum

import yatiml


class Genotype(enum.Enum):
    HOMO_NOSV = 'hmz'
    HOMO_SV = 'hmz-sv'
    HETERO_SV = 'htz-sv'

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        # enum.Enum has a __members__ attribute which contains its
        # members, which we reverse here to make a look-up table that
        # converts values in the YAML file to names expected by YAtiML.
        yaml_to_python = {
                v.value: v.name for v in cls.__members__.values()}

        # Remember that the node can be anything here. We only convert
        # if it's a string with an expected value, otherwise we leave
        # it alone so that a useful error message can be generated.
        if node.is_scalar(str):
            if node.get_value() in yaml_to_python:
                node.set_value(yaml_to_python[node.get_value()])

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        # Here we just use cls.__members__ directly to convert.
        node.set_value(cls.__members__[node.get_value()].value)

or like this:

docs/examples/enum_lowercase.py
import enum

import yatiml


class Color(enum.Enum):
    """Demonstrates lowercased Enum names in YAML."""
    RED = 0
    GREEN = 1
    BLUE = 2

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        if node.is_scalar(str):
            node.set_value(node.get_value().upper())

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        node.set_value(node.get_value().lower())
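
Both recipes boil down to a mapping between the string in the YAML file and the member name. These mappings can be inspected with the standard library alone. The sketch below repeats the Genotype and Color definitions from above without the seasoning methods, and does not involve YAtiML at all.

```python
import enum


class Genotype(enum.Enum):
    HOMO_NOSV = 'hmz'
    HOMO_SV = 'hmz-sv'
    HETERO_SV = 'htz-sv'


class Color(enum.Enum):
    RED = 0
    GREEN = 1
    BLUE = 2


# Savorize direction: value found in the YAML file -> member name.
yaml_to_python = {m.value: m.name for m in Genotype.__members__.values()}
print(yaml_to_python['htz-sv'])                  # HETERO_SV

# Sweeten direction: member name -> value to write to the YAML file.
print(Genotype.__members__['HOMO_SV'].value)     # hmz-sv

# Lowercase variant: the name is simply upper- or lowercased.
print(Color['green'.upper()])                    # Color.GREEN
print(Color.BLUE.name.lower())                   # blue
```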