Verifying and Translating Entries
Verification
Your application may require consistency guarantees. Instead of committing
fraud when a transaction in your bookkeeping system doesn’t add up to zero,
you might want to add a verification step to make sure that this doesn’t
happen in the first place. More prosaically, the statement “The door is
locked” is either True or False. (However, you should always be prepared
for an answer of “No idea”, aka None. That’s unavoidable.)
Types
Type entries may contain a “schema” attribute with a JSON Schema that
verifies the data. They may also contain a “code” attribute which forms
the body of a validation procedure. The variable “value” contains the
value in question.
Type entries are hierarchical: an (“int”,“percent”) type is first validated against (None,“type”,“int”), then against (None,“type”,“int”,“percent”).
Type checkers cannot modify data.
Type check entries must be accompanied by “good” and “bad” values, which
must be non-empty arrays of values which pass or fail this type check. For
subordinate types, both kinds must pass the supertype check: if you
add a type “float percentage”, the “bad” list may contain values like -1.2
or 123.45, but not "hello".
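The rules above can be sketched in plain Python. This is a minimal illustration, not DistKV's implementation: the helper names (make_checker, validate, passes, self_test) and the TYPES table are hypothetical, and wrapping the “code” body in a function with exec is only an assumption about how such a body could be run.

```python
def make_checker(body):
    """Wrap a "code" body in a function that receives `value` (assumed mechanism)."""
    src = "def check(value):\n"
    src += "".join("    " + line + "\n" for line in body.splitlines())
    namespace = {}
    exec(src, namespace)
    return namespace["check"]

# Hypothetical stand-in for the (None, "type", ...) subtree.
TYPES = {
    ("int",): "if not isinstance(value,int): raise ValueError('not an int')",
    ("int", "percent"): "if not 0<=value<=100: raise ValueError('not a percentage')",
}

def validate(type_path, value):
    """Run every checker from the outermost supertype down to the type itself."""
    for depth in range(1, len(type_path) + 1):
        make_checker(TYPES[type_path[:depth]])(value)

def passes(type_path, value):
    """Return True if the value survives the whole chain of checks."""
    try:
        validate(type_path, value)
    except (ValueError, TypeError):
        return False
    return True

def self_test(type_path, good, bad):
    """Check the "good" and "bad" lists the way the text describes them."""
    assert good and bad, "both lists must be non-empty"
    supertype = type_path[:-1]
    for value in good:
        assert passes(type_path, value)
    for value in bad:
        assert passes(supertype, value)      # must still satisfy the supertype
        assert not passes(type_path, value)  # but fail this type's own check

self_test(("int",), good=[0, 2], bad=[None, "foo"])
self_test(("int", "percent"), good=[0, 100, 50], bad=[-1, 555])
```

Putting "hello" into the “bad” list of the percent type would make self_test fail, because that value does not pass the supertype check.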
Beware that restricting an existing type is dangerous. The DistKV server does not check whether existing entries still pass the restricted type. In pedantic mode, your network may then fail to load its data or to converge.
Matches
The (None,“match”) hierarchy mirrors the actual object tree, except that wildcards are allowed:
- “#” matches any number of levels
- “+” matches exactly one level
This matches MQTT’s behavior.
Unlike MQTT, there may be more than one “#” wildcard.
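As an illustration of these wildcard rules, here is a small matcher in Python. The function name path_matches is hypothetical, and the sketch assumes that “#” may also match zero levels; it says nothing about how DistKV chooses between several matching entries.

```python
def path_matches(pattern, path):
    """Return True if `path` fits `pattern`, both given as tuples of levels."""
    if not pattern:
        return not path
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # "#" may swallow any number of levels (assumed: including none),
        # so try every possible split point. This also handles several "#".
        return any(path_matches(rest, path[i:]) for i in range(len(path) + 1))
    if not path:
        return False
    if head == "+" or head == path[0]:
        # "+" consumes exactly one level, whatever its name.
        return path_matches(rest, path[1:])
    return False

print(path_matches(("foo", "+", "bar"), ("foo", "dud", "bar")))
print(path_matches(("#", "x", "#"), ("a", "x", "b", "c")))
```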
Be aware that adding matches for existing entries, or modifying such matches, is dangerous. The DistKV server does not check whether existing entries still pass validation. In pedantic mode, your network may then fail to load its data or to converge.
Putting it all together
Given the following structure, values stored at (“foo”, anything, “bar”) must be integers between 0 and 100:
    _: 123
    null:
      match:
        foo:
          +:
            bar:
              _:
                type:
                - int
                - percent
      type:
        int:
          _:
            bad: [none, "foo"]
            code: 'if not isinstance(value,int): raise ValueError(''not an int'')'
            good: [0,2]
          percent:
            _:
              bad: [-1,555]
              code: 'if not 0<=value<=100: raise ValueError(''not a percentage'')
                '
              good: [0,100,50]
    foo:
      dud:
        bar:
          _: 55
The above is the server content at the end of the testcase
tests/test_feature_typecheck.py::test_72_cmd, when dumped with the command
distkv client get -rd_.
Translation
Sometimes, clients need special treatment. For instance, an IoT-MQTT message
that reports turning on a light might send “ON” to topic
/home/state/bath/light, while what you’d really like to do is to change
the Boolean “state” attribute of home.bath.light. Or maybe the value
is a percentage and you’d like to ensure that the stored value is 0.5
instead of “50%”, and that no rogue client can set it to -20 or “gotcha”.
To ensure this, DistKV employs a two-level type mechanism.
- “type” entries describe the type of entry (“this is an integer between 0 and 42”).
- “match” entries describe the path position to which that type applies.
In addition, a similar mechanism may be used to convert clients’ values to DistKV entries and back.
- “codec” entries describe distinct converters (“50%” => 0.5; “ON” => ‘set the entry’s “state” property to True’).
- “map” entries are activated per client (via command, or controlled by its login) and describe the path position to which a codec applies.
Codecs
Codec entries contain “decode” and “encode” attributes which form the
bodies of procedures that rewrite external data to DistKV values and vice
versa, respectively, using the “value” parameter as input. The “decode”
procedure gets an additional “prev” variable which contains the old
value. That value must not be modified; create a copy or (preferably)
use distkv.util.combine_dict() to assemble the result.
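To show what leaving the old value untouched means in practice, here is a hypothetical decoder for the “ON” example from the introduction. The function name decode_light and the shape of the stored value are assumptions. The sketch copies the old value instead of calling distkv.util.combine_dict(), because that helper’s signature is not described here.

```python
import copy

def decode_light(value, prev):
    """Hypothetical decode procedure: map "ON"/"OFF" onto a "state" attribute."""
    if value not in ("ON", "OFF"):
        raise ValueError("expected ON or OFF")
    # Work on a copy so that the old value stays exactly as it was.
    result = copy.deepcopy(prev) if isinstance(prev, dict) else {}
    result["state"] = (value == "ON")
    return result

old = {"state": False, "name": "bath"}
new = decode_light("ON", old)
print(new)
print(old)
```

The second print shows that the old value still carries its original “state”.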
Codecs may be named hierarchically for convenience; if you want to call the “parent” codec, put the common code in a module and import that.
Codecs also require “in” and “out” attributes, each of which must contain a list of 2-tuples with that conversion’s source value and its result. “in” corresponds to decoding, “out” to encoding – much like Python’s binary codecs.
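The role of the “in” and “out” lists can be sketched as a self-test that runs each source value through the matching procedure and compares the result. This is an illustration only: the helpers make_proc and check_codec are hypothetical, wrapping the bodies with exec is an assumed mechanism, and passing None as the old value during the test is likewise an assumption.

```python
def make_proc(body, params):
    """Wrap a procedure body in a function with the given parameter names."""
    src = "def proc({}):\n".format(", ".join(params))
    src += "".join("    " + line + "\n" for line in body.splitlines())
    namespace = {}
    exec(src, namespace)
    return namespace["proc"]

# Hypothetical in-memory copy of an "int" codec entry.
CODEC = {
    "decode": "assert isinstance(value,str); return int(value)",
    "encode": "return str(value)",
    "in": [("1", 1), ("2", 2), ("3", 3)],
    "out": [(1, "1"), (2, "2"), (-3, "-3")],
}

def check_codec(codec):
    """Verify both procedures against their test vectors and return them."""
    decode = make_proc(codec["decode"], ["value", "prev"])
    encode = make_proc(codec["encode"], ["value"])
    for source, expected in codec["in"]:
        assert decode(source, None) == expected   # "in": external -> DistKV
    for source, expected in codec["out"]:
        assert encode(source) == expected         # "out": DistKV -> external
    return decode, encode

decode, encode = check_codec(CODEC)
print(decode("10", None), repr(encode(10)))
```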
Converters
While the (None,"map") subtree contains a single mapping, (None,"conv")
uses an additional single level of codec group names. A mapping must be
applied to a user (by adding a “conv=GROUPNAME” to the user’s aux data
field) before it is used. This change is instantaneous, i.e. an existing
user does not need to reconnect.
Below that, converter naming works like that for mappings. Of course, the
pointing attribute is named “codec” instead of “type”.
Putting it all together
Given the following data structure, the user “con” will only be able to write stringified integers under keys below the “inty” key, which will be stored as integers:
    null:
      auth:
        _:
          current: _test
        _test:
          user:
            con:
              _:
                _aux:
                  conv: foo
            std:
              _:
                _aux: {}
      codec:
        int:
          _:
            decode: assert isinstance(value,str); return int(value)
            encode: return str(value)
            in:
            - [ '1', 1 ]
            - [ '2', 2 ]
            - [ '3', 3 ]
            out:
            - [ 1, '1' ]
            - [ 2, '2' ]
            - [ -3, '-3' ]
      conv:
        foo:
          inty:
            '#':
              _:
                codec:
                - int
    inty:
      _: hello
      ten:
        _: 10
      yep:
        yepyepyep:
          _: 13
          yep:
            _: 99
The above is the server content at the end of the testcase
tests/test_feature_convert.py::test_71_basic, when dumped with the command
distkv client get -rd_.
Paths
Currently, DistKV does not offer automatic path translation. If you need
that, the best way is to code two active object hierarchies, and
let their “set_value” methods shuffle data to the “other” side.
There are some caveats:
- All such data are stored twice.
- Don’t write a value that hasn’t actually changed; if you do, you’ll generate an endless loop.
- You need to verify that the two trees match when you start up, and decide
  which is more correct. (The “tock” stamp will help you here.) Don’t
  accidentally overwrite changes that arrive while you do that.
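The loop-avoidance caveat can be illustrated with two in-memory stand-ins for the object hierarchies. Everything here is hypothetical: the Mirror class, its translate callback and the link helper are not part of DistKV, and the start-up reconciliation via “tock” is left out.

```python
class Mirror:
    """Hypothetical stand-in for one of the two active object hierarchies."""

    def __init__(self, translate):
        self.translate = translate  # maps a local path to the other side's path
        self.data = {}
        self.other = None
        self.writes = 0

    def set_value(self, path, value):
        if path in self.data and self.data[path] == value:
            # Nothing changed: stop here, or both sides would
            # keep forwarding the same value to each other forever.
            return
        self.data[path] = value
        self.writes += 1
        if self.other is not None:
            self.other.set_value(self.translate(path), value)

def link(a, b):
    """Connect two mirrors so that each forwards changes to the other."""
    a.other, b.other = b, a

# One side stores ("bath", "light"), the other ("state", "bath", "light").
plain = Mirror(lambda path: ("state",) + path)
prefixed = Mirror(lambda path: path[1:])
link(plain, prefixed)

plain.set_value(("bath", "light"), True)
print(prefixed.data)
print(plain.writes, prefixed.writes)
```

Each side records exactly one write: the value travels to the other tree once, and the echo coming back is dropped because it is unchanged.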