Recipes
Parsed classes
For some classes, the easiest way to write them in a YAML file is as a string
representing all the values in the class. For example, you may want to have a
namespaced name that looks like ns.subns.name in the YAML text format, but
on the Python side represent it as a class having one attribute namespaces
that is a list of namespaces, and an attribute name that is a string.
To do this, you need to override recognition to tell YAtiML to recognise a string (because by default it will expect a mapping), and then add a savorizing function that parses the string and generates a mapping. The attributes of that mapping will then be fed to your constructor. In order to save objects of your class as strings, you’ll need to add a sweetening function too. The complete solution looks like this:
docs/examples/parsed_classes.py
from typing import List

from ruamel import yaml
import yatiml


class Identifier:
    def __init__(self, namespaces: List[str], name: str) -> None:
        self.namespaces = namespaces
        self.name = name

    @classmethod
    def _yatiml_recognize(cls, node: yatiml.UnknownNode) -> None:
        node.require_scalar(str)

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        text = str(node.get_value())
        parts = text.split('.')
        node.make_mapping()

        # We need to make a yaml.SequenceNode by hand here, since
        # set_attribute doesn't take lists as an argument.
        start_mark = yaml.error.StreamMark('generated node', 0, 0, 0)
        end_mark = yaml.error.StreamMark('generated node', 0, 0, 0)
        item_nodes = list()
        for part in parts[:-1]:
            item_nodes.append(yaml.ScalarNode('tag:yaml.org,2002:str', part,
                                              start_mark, end_mark))
        ynode = yaml.SequenceNode('tag:yaml.org,2002:seq', item_nodes,
                                  start_mark, end_mark)
        node.set_attribute('namespaces', ynode)
        node.set_attribute('name', parts[-1])

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        namespace_nodes = node.get_attribute('namespaces').seq_items()
        namespaces = list(map(yatiml.Node.get_value, namespace_nodes))
        namespace_str = '.'.join(namespaces)
        name = node.get_attribute('name').get_value()
        node.set_value('{}.{}'.format(namespace_str, name))


load = yatiml.load_function(Identifier)
dumps = yatiml.dumps_function(Identifier)

yaml_text = ('yatiml.logger.setLevel\n')
doc = load(yaml_text)

print(type(doc))
print(doc.namespaces)
print(doc.name)

doc = Identifier(['yatiml'], 'add_to_loader')
yaml_text = dumps(doc)

print(yaml_text)
The tricky part here is in the savorize and sweeten functions. Savorize needs to
build a list of strings, which YAtiML doesn’t help with, so it needs to
construct PyYAML objects directly. For each namespace item, it builds a
yaml.ScalarNode, which represents a scalar and has a tag to describe the type,
and a value, a string. It also requires a start and end mark, for which we use
dummy values, as this node was generated and is therefore not in the file. The
PyYAML library will raise an Exception if you do not add those. The item
nodes are then added to a yaml.SequenceNode, and the whole thing set as the
value of the namespaces attribute.
Sweeten does the reverse of course, getting a list of yatiml.Node objects
representing the items in the namespaces attribute, extracting the string
values using yatiml.Node.get_value(), then joining them with periods and
finally combining them with the name. Since we’re only altering the top-level
node here, we do not need to build a yaml.ScalarNode ourselves but can just
use yatiml.Node.set_value().
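The string handling at the heart of these two functions can be tried out without YAtiML. The following is a minimal sketch in plain Python; the helper names split_identifier and join_identifier are hypothetical and not part of YAtiML, and the sketch leaves out the node manipulation entirely.

```python
from typing import List, Tuple


def split_identifier(text: str) -> Tuple[List[str], str]:
    """Split 'ns.subns.name' into (['ns', 'subns'], 'name')."""
    parts = text.split('.')
    return parts[:-1], parts[-1]


def join_identifier(namespaces: List[str], name: str) -> str:
    """Inverse of split_identifier (hypothetical helper)."""
    # Joining one combined list avoids a leading period when the
    # list of namespaces is empty.
    return '.'.join(namespaces + [name])


namespaces, name = split_identifier('yatiml.logger.setLevel')
print(namespaces)                           # ['yatiml', 'logger']
print(name)                                 # setLevel
print(join_identifier(namespaces, name))    # yatiml.logger.setLevel
print(join_identifier([], 'name'))          # name
```

Note that formatting with '{}.{}', as the sweeten function in the example does, yields '.name' when the list of namespaces is empty. If identifiers without a namespace are legal in your format, joining a single combined list as in this sketch is one way to handle that case.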
Timestamps and dates
YAML has a timestamp type, which represents a point in time. The PyYAML library parses this into a Python datetime.date object, and will serialise such an object back to a YAML timestamp. YAtiML supports this as well, so all you need to do to use a timestamp or a date is to use datetime.date in your class definition.
Note that the object created by YAtiML may be an instance of datetime.date (if no time is given) or an instance of datetime.datetime (if a time is given), which is a subclass of datetime.date. Because of this subclass relationship, a parameter annotated with datetime.date also accepts a date with a time, so you cannot currently specify in the type that you want only a date, without a time attached to it.
If this is an attribute in a class, and date-with-time is not a legal value, then you should add a check to the __init__ method that raises an exception if the given value is an instance of datetime.datetime. That way, you can’t accidentally make an instance of the class in Python with an incorrect value either.
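Such a check could look like the following minimal sketch. The Event class and its attribute names are hypothetical, and raising ValueError is an assumption; use whichever exception fits your code.

```python
import datetime


class Event:
    """Hypothetical class with a date-only attribute."""
    def __init__(self, name: str, day: datetime.date) -> None:
        # datetime.datetime is a subclass of datetime.date, so check
        # for it explicitly to reject values with a time attached.
        if isinstance(day, datetime.datetime):
            raise ValueError('day must be a date without a time')
        self.name = name
        self.day = day


event = Event('release', datetime.date(2020, 1, 31))
print(event.day)    # 2020-01-31

try:
    Event('release', datetime.datetime(2020, 1, 31, 12, 0))
except ValueError as e:
    print(e)        # day must be a date without a time
```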
Dashed keys
Some YAML-based formats (like CFF) use dashes in their mapping keys. This is a
problem for YAtiML, because keys get mapped to parameters of __init__, which
are identifiers, and those are not allowed to contain dashes in Python. So
some kind of conversion will have to be made. YAtiML’s seasoning mechanism is
just the way to do it: yatiml.Node has two methods to convert all dashes in a
mapping’s keys to underscores and back: dashes_to_unders_in_keys() and
unders_to_dashes_in_keys(), so all you need to do is use underscores instead
of dashes when defining your classes, and add seasoning functions. Here’s an
example:
docs/examples/dashed_keys.py
import yatiml


# Create document class
class Dashed:
    def __init__(self, an_attribute: int, another_attribute: str) -> None:
        self.an_attribute = an_attribute
        self.another_attribute = another_attribute

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        node.dashes_to_unders_in_keys()

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        node.unders_to_dashes_in_keys()


# Create loader
load = yatiml.load_function(Dashed)

# Create dumper
dumps = yatiml.dumps_function(Dashed)

# Load YAML
yaml_text = ('an-attribute: 42\n'
             'another-attribute: with-dashes\n')
doc = load(yaml_text)

print(type(doc))
print(doc.an_attribute)
print(doc.another_attribute)

# Dump YAML
dumped_text = dumps(doc)
print(dumped_text)
If you’ve been paying very close attention, then you may be wondering why this
example passes through the recognition stage. After all, the names of the keys
do not match those of the __init__ parameters. YAtiML is a bit flexible in
this regard, and will match a key to a parameter if it is identical after dashes
have been replaced with underscores. The flexibility is only in the recognition
stage, not in the type checking stage, so you do need the seasoning functions.
(The reason to not completely automate this is that YAtiML cannot know if the
YAML side should have dashes or underscores. So you need to specify this
somehow in order to be able to dump correctly, and then it’s better to specify
it on loading as well for symmetry.)
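The effect of the two conversions, and the ambiguity mentioned in the parenthesis, can be illustrated on plain dictionaries. This is only a sketch: the helper functions dashes_to_unders and unders_to_dashes are hypothetical stand-ins written for this illustration, not YAtiML’s implementation, which operates on YAML nodes.

```python
from typing import Any, Dict


def dashes_to_unders(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with dashes in the keys replaced by underscores."""
    return {key.replace('-', '_'): value for key, value in mapping.items()}


def unders_to_dashes(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with underscores in the keys replaced by dashes."""
    return {key.replace('_', '-'): value for key, value in mapping.items()}


loaded = {'an-attribute': 42, 'another-attribute': 'with-dashes'}
python_side = dashes_to_unders(loaded)
print(python_side)
# {'an_attribute': 42, 'another_attribute': 'with-dashes'}

# Only the keys change, and the round trip restores the original.
print(unders_to_dashes(python_side) == loaded)    # True

# A key written with an underscore on the YAML side does not survive
# the round trip, so the direction has to be chosen explicitly.
print(unders_to_dashes(dashes_to_unders({'an_attribute': 42})))
# {'an-attribute': 42}
```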
Seasoning enumerations
By default, YAtiML will use an enum member’s name to write to the YAML file, and that’s what it will recognise on loading as well. Sometimes, that’s not what you want however. Maybe you want to use the values, or you want to have the names on the Python side in uppercase (because PEP-8 says so) while you want to use a lower-case version in the YAML file. In that case, you can apply YAtiML’s seasoning mechanisms like this:
docs/examples/enum_use_values.py
# Example from https://github.com/GooglingTheCancerGenome/sv-gen
import enum

import yatiml


class Genotype(enum.Enum):
    HOMO_NOSV = 'hmz'
    HOMO_SV = 'hmz-sv'
    HETERO_SV = 'htz-sv'

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        # enum.Enum has a __members__ attribute which contains its
        # members, which we reverse here to make a look-up table that
        # converts values in the YAML file to names expected by YAtiML.
        yaml_to_python = {
                v.value: v.name for v in cls.__members__.values()}

        # Remember that the node can be anything here. We only convert
        # if it's a string with an expected value, otherwise we leave
        # it alone so that a useful error message can be generated.
        if node.is_scalar(str):
            if node.get_value() in yaml_to_python:
                node.set_value(yaml_to_python[node.get_value()])

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        # Here we just use cls.__members__ directly to convert.
        node.set_value(cls.__members__[node.get_value()].value)
or like this:
docs/examples/enum_lowercase.py
import enum

import yatiml


class Color(enum.Enum):
    """Demonstrates lowercased Enum names in YAML."""
    RED = 0
    GREEN = 1
    BLUE = 2

    @classmethod
    def _yatiml_savorize(cls, node: yatiml.Node) -> None:
        if node.is_scalar(str):
            node.set_value(node.get_value().upper())

    @classmethod
    def _yatiml_sweeten(cls, node: yatiml.Node) -> None:
        node.set_value(node.get_value().lower())
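The look-ups these seasoning functions rely on can be checked with the standard library alone. The sketch below repeats the Genotype members without the YAtiML hooks and only shows the conversions themselves, not loading or dumping.

```python
import enum


class Genotype(enum.Enum):
    HOMO_NOSV = 'hmz'
    HOMO_SV = 'hmz-sv'
    HETERO_SV = 'htz-sv'


# Value-to-name table, built the same way as in _yatiml_savorize.
yaml_to_python = {m.value: m.name for m in Genotype.__members__.values()}
print(yaml_to_python['hmz-sv'])                     # HOMO_SV

# Name-to-value look-up, as used in _yatiml_sweeten.
print(Genotype.__members__['HETERO_SV'].value)      # htz-sv

# The lowercase variant only changes the case of the member name.
print('red'.upper(), 'RED'.lower())                 # RED red
```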