Filter: Update trie documentation

2020-04-06 14:20:16 +02:00 · 2020-04-06 14:20:16 +02:00 · dd61278c9d
commit dd61278c9d
parent 562a2b8c29
1 changed file with 69 additions and 44 deletions
--- a/filter/trie.c
+++ b/filter/trie.c
@@ -1,7 +1,8 @@
 /*
 *	Filters: Trie for prefix sets
 *
- *	Copyright 2009 Ondrej Zajicek <santiago@crfreenet.org>
+ *	(c) 2009--2020 Ondrej Zajicek <santiago@crfreenet.org>
+ *	(c) 2009--2020 CZ.NIC z.s.p.o.
 *
 *	Can be freely distributed and used under the terms of the GNU GPL.
 */
@@ -9,53 +10,68 @@
 /**
 * DOC: Trie for prefix sets
 *
- * We use a (compressed) trie to represent prefix sets. Every node
- * in the trie represents one prefix (&addr/&plen) and &plen also
- * indicates the index of the bit in the address that is used to
- * branch at the node. If we need to represent just a set of
- * prefixes, it would be simple, but we have to represent a
- * set of prefix patterns. Each prefix pattern consists of
- * &ppaddr/&pplen and two integers: &low and &high, and a prefix
- * &paddr/&plen matches that pattern if the first MIN(&plen, &pplen)
- * bits of &paddr and &ppaddr are the same and &low <= &plen <= &high.
+ * We use a (compressed) trie to represent prefix sets. Every node in the trie
+ * represents one prefix (&addr/&plen) and &plen also indicates the index of
+ * bits in the address that are used to branch at the node. Note that such a
+ * prefix is not necessarily a member of the prefix set; it is just a canonical
+ * prefix associated with a node. Prefix lengths of nodes are aligned to
+ * multiples of &TRIE_STEP (4) and there is 16-way branching in each
+ * node. Therefore, we say that a node is associated with a range of prefix
+ * lengths (&plen .. &plen + TRIE_STEP - 1).
 *
- * We use a bitmask (&accept) to represent accepted prefix lengths
- * at a node. As there are 33 prefix lengths (0..32 for IPv4), but
- * there is just one prefix of zero length in the whole trie so we
- * have &zero flag in &f_trie (indicating whether the trie accepts
- * prefix 0.0.0.0/0) as a special case, and &accept bitmask
+ * The prefix set is not just a set of prefixes; it is defined by a set of
+ * prefix patterns. Each prefix pattern consists of &ppaddr/&pplen and two
+ * integers: &low and &high. The tested prefix &paddr/&plen matches that pattern
+ * if the first MIN(&plen, &pplen) bits of &paddr and &ppaddr are the same and
+ * &low <= &plen <= &high.
+ *
+ * There are two ways to represent accepted prefixes for a node. First, there is
+ * a bitmask &local, which represents independently all 15 prefixes that extend
+ * the canonical prefix of the node and are within a range of prefix lengths
+ * associated with the node. E.g., for node 10.0.0.0/8 they are 10.0.0.0/8,
+ * 10.0.0.0/9, 10.128.0.0/9, .. 10.224.0.0/11. This order (first by length, then
+ * lexicographically) is used for indexing the bitmask &local, starting at
+ * position 1. I.e., index is 2^(plen - base) + offset within the same length,
+ * see function trie_local_mask6() for details.
+ *
+ * Second, we use a bitmask &accept to represent accepted prefix lengths at a
+ * node. A set bit means that all prefixes of the given length that are either
+ * subprefixes or superprefixes of the canonical prefix are accepted. There are
+ * 33 prefix lengths (0..32 for IPv4), but only one prefix of zero length in the
+ * whole trie, so we have a &zero flag in &f_trie (indicating whether the trie
+ * accepts prefix 0.0.0.0/0) as a special case, and the &accept bitmask
 * represents accepted prefix lengths from 1 to 32.
 *
- * There are two cases in prefix matching - a match when the length
- * of the prefix is smaller that the length of the prefix pattern,
- * (&plen < &pplen) and otherwise. The second case is simple - we
- * just walk through the trie and look at every visited node
- * whether that prefix accepts our prefix length (&plen). The
- * first case is tricky - we don't want to examine every descendant
- * of a final node, so (when we create the trie) we have to propagate
- * that information from nodes to their ascendants.
+ * One complication is handling of prefix patterns with unaligned prefix length.
+ * When such a pattern is added, we add a primary node above (with rounded
+ * down prefix length &nlen) and a set of secondary nodes below (with rounded up
+ * prefix lengths &slen). Accepted prefix lengths of the original prefix pattern
+ * are then represented in different places based on their lengths. For prefixes
+ * shorter than &nlen, it is &accept bitmask of the primary node, for prefixes
+ * between &nlen and &slen - 1 it is &local bitmask of the primary node, and for
+ * prefixes of length &slen or more it is &accept bitmasks of secondary nodes.
 *
- * Suppose that we have two masks (M1 and M2) for a node. Mask M1
- * represents accepted prefix lengths by just the node and mask M2
- * represents accepted prefix lengths by the node or any of its
- * descendants. Therefore M2 is a bitwise or of M1 and children's
- * M2 and this is a maintained invariant during trie building.
- * Basically, when we want to match a prefix, we walk through the trie,
- * check mask M1 for our prefix length and when we came to
- * final node, we check mask M2.
+ * There are two cases in prefix matching: a match when the length of the
+ * prefix is smaller than the length of the prefix pattern (&plen < &pplen), and
+ * otherwise. The second case is simple: we just walk through the trie and check
+ * at every visited node whether that node accepts our prefix length (&plen).
+ * The first case is tricky: we do not want to examine every descendant of a
+ * final node, so (when we create the trie) we have to propagate that
+ * information from nodes to their ancestors.
 *
- * There are two differences in the real implementation. First,
- * we use a compressed trie so there is a case that we skip our
- * final node (if it is not in the trie) and we came to node that
- * is either extension of our prefix, or completely out of path
- * In the first case, we also have to check M2.
+ * There are two kinds of propagation: propagation from the child's &accept
+ * bitmask to the parent's &accept bitmask, and propagation from the child's
+ * &accept bitmask to the parent's &local bitmask. The first kind is simple: as
+ * all superprefixes of a parent are also superprefixes (of appropriate length)
+ * of a child, we can just add (by bitwise or) the child &accept mask, masked by
+ * the parent prefix length mask, to the parent &accept mask. This handles
+ * prefixes shorter than the node &plen.
 *
- * Second, we really need not to maintain two separate bitmasks.
- * Checks for mask M1 are always larger than &applen and we need
- * just the first &pplen bits of mask M2 (if trie compression
- * hadn't been used it would suffice to know just $applen-th bit),
- * so we have to store them together in &accept mask - the first
- * &pplen bits of mask M2 and then mask M1.
+ * The second kind of propagation is necessary to handle superprefixes of a
+ * child that are represented by the parent &local mask, i.e., those in the range
+ * of prefix lengths associated with the parent. For each prefix length from that
+ * range accepted by the child &accept mask, we need to set the appropriate bit
+ * in the parent &local mask. See function trie_amask_to_local() for details.
 *
 * There are four cases when we walk through a trie:
 *
@@ -65,8 +81,7 @@
 * - we are beyond the end of path (node length > &plen)
 * - we are still on path and keep walking (node length < &plen)
 *
- * The walking code in trie_match_prefix() is structured according to
- * these cases.
+ * The walking code in trie_match_net() is structured according to these cases.
 */
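
As a reading aid, the matching rule stated in the DOC comment above (the first MIN(&plen, &pplen) bits are equal and &low <= &plen <= &high) can be sketched for IPv4 as follows. This is a minimal standalone sketch under stated assumptions, not code from trie.c: struct prefix_pattern, len_to_mask() and pattern_matches() are hypothetical names, addresses are plain host-order uint32_t values, and no trie is involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type (not from trie.c): pattern ppaddr/pplen{low,high}, IPv4 only. */
struct prefix_pattern {
  uint32_t ppaddr;      /* pattern address, host byte order */
  unsigned pplen;       /* pattern prefix length, 0..32 */
  unsigned low, high;   /* accepted range of tested prefix lengths */
};

/* Netmask with the first n bits set; n == 0 is handled separately because
   shifting a 32-bit value by 32 is undefined in C. */
static uint32_t len_to_mask(unsigned n)
{
  assert(n <= 32);
  return n ? (uint32_t) (~(uint32_t) 0 << (32 - n)) : 0;
}

/* Returns 1 if paddr/plen matches the pattern, 0 otherwise. */
static int pattern_matches(const struct prefix_pattern *pp, uint32_t paddr, unsigned plen)
{
  unsigned n = (plen < pp->pplen) ? plen : pp->pplen;   /* MIN(plen, pplen) */
  uint32_t mask = len_to_mask(n);

  return ((paddr ^ pp->ppaddr) & mask) == 0 && pp->low <= plen && plen <= pp->high;
}
```

For example, with the pattern 10.0.0.0/8{8,24}, the prefix 10.1.0.0/16 matches, while 10.1.2.3/32 does not because 32 exceeds &high. A tested prefix shorter than the pattern, such as 10.0.0.0/8 against 10.1.0.0/16{4,24}, also matches, which is the "tricky" case the comment handles by propagation.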

 #include "nest/bird.h"
@@ -166,6 +181,10 @@ attach_node(struct f_trie_node *parent, struct f_trie_node *child, int v4)
 }


+/*
+ * Compute the mask representing prefix px/plen in the local bitmask of a node
+ * with prefix length nlen. Assumes that nlen <= plen < (nlen + TRIE_STEP).
+ */
 static inline uint
 trie_local_mask4(ip4_addr px, uint plen, uint nlen)
 {
@@ -182,6 +201,12 @@ trie_local_mask6(ip6_addr px, uint plen, uint nlen)
  return 1u << pos;
 }
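
The indexing of the &local bitmask described in the DOC comment (index 2^(plen - base) + offset within the same length) can be illustrated with a small standalone IPv4 sketch. The names get_bits(), local_index() and local_mask() are hypothetical and operate on plain uint32_t addresses; they are not the trie.c helpers, whose full bodies are not shown in this diff. TRIE_STEP is taken as 4, as stated in the comment.

```c
#include <assert.h>
#include <stdint.h>

#define TRIE_STEP 4   /* value stated in the DOC comment */

/* Extract `count` bits of addr starting at bit `pos`, counted from the
   most significant bit. Hypothetical helper for this sketch only. */
static uint32_t get_bits(uint32_t addr, unsigned pos, unsigned count)
{
  if (count == 0)
    return 0;
  assert(pos + count <= 32);
  return (uint32_t) (addr << pos) >> (32 - count);
}

/* Position of px/plen in the local bitmask of a node with length nlen:
   2^(plen - nlen) plus the offset among prefixes of the same length. */
static unsigned local_index(uint32_t px, unsigned plen, unsigned nlen)
{
  assert(nlen <= plen && plen < nlen + TRIE_STEP);
  unsigned step = plen - nlen;
  return (1u << step) + get_bits(px, nlen, step);
}

/* Single-bit mask selecting that position (positions 1..15 are used). */
static uint32_t local_mask(uint32_t px, unsigned plen, unsigned nlen)
{
  return (uint32_t) 1 << local_index(px, plen, nlen);
}
```

For the node 10.0.0.0/8 this gives index 1 for 10.0.0.0/8, indexes 2 and 3 for 10.0.0.0/9 and 10.128.0.0/9, and index 15 for 10.224.0.0/11, which agrees with the ordering (first by length, then lexicographically) given in the comment.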

+/*
+ * Compute an appropriate local mask (for a node with prefix length nlen)
+ * representing prefixes of px that are accepted by amask and fall within the
+ * range associated with that node. Used for propagation of child accept mask
+ * to parent local mask.
+ */
 static inline uint
 trie_amask_to_local(ip_addr px, ip_addr amask, uint nlen)
 {