Problem solving
This section collects some useful information about solving problems you may encounter when working with YAtiML.
Enabling logging
YAtiML uses the standard Python logging system, using a logger named yatiml
.
YAtiML produces log messages at the INFO and DEBUG levels, which makes them
invisible at Python’s default log level of WARNING. To be able to see what
YAtiML does, you can either lower the general Python log level (which may cause
other parts of your program to produce (more) log output as well), or you can
lower YAtiML’s log level specifically. To do the latter, use this:
import logging
import yatiml
yatiml.logger.setLevel(logging.INFO)
Or you can use logging.DEBUG
for very detailed output.
It seems that you may have to do a logging.debug()
call to get any output at
all, maybe because that causes Python to set up something it needs. There’s
probably a good explanation and a better fix for this. If you know, please
contribute.
In order to understand how to interpret the output, it helps to have an idea of how YAtiML processes a YAML file into Python objects. See The YAtiML pipeline below for more on that.
Unions with bool
While defining your classes, you may want to have an attribute that can be of
multiple types. As described in the tutorial, you would use a Union
for
this. For example something like
class Setting:
def __init__(name: str,
value: Union[str, int, float, bool]
) -> None:
self.name = name
self.value = value
is likely to occur in many YAML-based configuration file formats.
There is a problem with the above code, in that it will give an error message if you try to read the following input on Python < 3.7, saying that it could not recognise the bool value:
name: test
value: true
(Scroll down for the fix, if you don’t care for the explanation.)
Arguably, this is a bug in Python’s type handling, and the developers of
Python’s typing
module seem to agree, because they have fixed this in Python
3.7. What happens here is that in Python, bool
is a subtype of int
, in
other words, any bool
value is also an int
value. Furthermore, the
Union
class will automatically normalise the types it is passed by removing
redundant types. If you put in a type twice then one copy will be removed for
instance, and also, if you put in a type and also a subtype of that type, then
the subtype will be removed. This makes some sense: if every bool
is an
int
, then just int
will already match boolean values, and bool
is
redundant.
While this works for Python, it’s problematic in YAML, where bool
and
int
are unrelated types. Indeed, YAtiML will not accept a boolean value in
the YAML file if you declare the attribute to be an int
. And that’s where
we get into trouble: Python normalises the above Union to Union[str, int,
float]
, and YAtiML reads this and generates an error if you feed it a boolean.
In Python 3.7, the behaviour of Union has changed. While mypy still does the normalisation internally when checking types, the runtime Union object no longer normalises. Since the runtime object is what YAtiML reads, this problem does not occur on Python 3.7 (and hopefully versions after that, the typing module is not entirely stable yet).
A fix for Python < 3.7
So, this is fixed in Python 3.7, but what if you’re running on an older version?
In that case you need a work-around, and YAtiML provides one called
bool_union_fix
. It works like this:
from yatiml import bool_union_fix
class Setting:
def __init__(name: str,
value: Union[str, int, float, bool, bool_union_fix]
) -> None:
self.name = name
self.value = value
All you need to do is import bool_union_fix
and add it to the list of types
in the Union
, and now things will work as expected (also in Python 3.7).
bool_union_fix
is essentially a dummy type that is recognised by YAtiML and
treated just like bool
. Since it’s a separate type, it isn’t merged into the
int
, so it’ll still be there for YAtiML to read. Note that you do need the
bool
in there as well, to avoid mypy complaining if you try to create a
Setting object in your code with a bool for the value attribute.
The YAtiML pipeline
With plain PyYAML or ruamel.yaml, the loading process has two stages. First, the text is parsed into a parse tree, which consists of nodes. Each node has a tag and a value. Second, objects are constructed from the nodes, with the type of the object decided based on the tag, and the contents of the object coming from the value.
YAtiML inserts three additional stages in between the two existing ones: recognition, savourising, and type checking.
Recognition determines, for each node, as which type YAtiML will try to process
it. This is mostly based on the object model given to the custom loader. In our
ongoing example, the value corresponding to the name
attribute is expected
to be a string, so YAtiML will try to recognise only a string here. The age
attribute has a union type, and for those YAtiML will look at the value given
and see if it matches one of the types in the union. If it matches exactly one,
it is recognised as that type. If it matches none of them, or multiple, an error
message is given.
When recognising a node that according to the object model is of a custom class
type, YAtiML will try to recognise a mapping node with keys and values according
to the custom class’s __init__
method’s parameters. If the custom class has
subclasses which are also registered with the loader, then those will be
recognised as well at this point in the document. If both a class and its
subclass are matched, the node is recognised as being of the subclass, i.e.
the recognition process prefers the most derived class. If there are multiple
matching sibling subclasses, the node is declared ambiguous and an error is
raised. Recognition for a custom class can be overridden using a
_yatiml_recognize()
method.
Incidentally, a technical term for what the recognition process does is type inference, which explains the name YAtiML: it inserts type inference in the middle of the YAML processing pipeline.
The second and third stages, savourising and type checking, only apply to custom
classes. To savourise a recognised node, YAtiML calls that node’s
_yatiml_savorize()
method, after calling those of its base classes, if any.
Savourising is entirely defined by the custom class, the default is to do
nothing. After savourising, the resulting mapping is type checked against the
__init__
signature, since Python does not do run-time type checking itself.
This is a safety feature, since the read-in YAML document will often be
untrusted, or if it is, at least a convenience feature, in that what you see in
the __init__
signature is guaranteed to be what you get, thus applying the
principle of least surprise.
Note that no type check is done for built-in types, but for built-in types the default recognition process is effectively a type check, and it cannot be overridden. Another way of looking at the type check for custom classes is that it reduces the requirements on custom recognition functions: they need to merely disambiguate between derived classes and in unions, rather than performing a complete type check. That makes it easier to write them.
Error messages
This section contains some error messages that you may encounter when using YAtiML, and potential solutions to try if you do. If you run into an error that you cannot figure out, please make an issue describing the error message and what you are doing (a short example really helps!). Contributions directly to the documentation are of course also welcome! See the Development section for information on how to contribute.
_yatiml_recognize missing required argument
If you get
TypeError: _yatiml_recognize() missing 1 required positional argument: 'node'
or
TypeError: _yatiml_savorize() missing 1 required positional argument: 'node'
then you have probably forgotten to add the @classmethod
decorator to your
_yatiml_recognize()
or _yatiml_savorize()
function.