Commit graph

3833 commits

Author SHA1 Message Date
Ondrej Zajicek (work)
e818f16448 Netlink: Enable strict checking for KRT dumps
Add strict checking for netlink KRT dumps to avoid PMTU cache records
from FNHE table dump along with KRT.

Linux Kernel added FNHE table dump to the netlink API in patch:

8d3b68cd37.1561131177.git.sbrivio@redhat.com/

Therefore, since Linux 5.3 these route cache entries are dumped together
with regular routes during periodic KRT scans, which in some cases may be
huge amount of useless data. This can be avoided by using strict checking
for netlink dumps:

https://lore.kernel.org/netdev/20181008031644.15989-1-dsahern@kernel.org/

The patch mitigates the risk of receiving unknown and potentially large
number of FNHE records that would block BIRD I/O in each sync. There is a
known issue caused by the GRE tunnels on Linux that seems to be creating
one FNHE record for each destination IP address that is routed through
the tunnel, even when the PMTU equals to GRE interface MTU.

Thanks to Tomas Hlavacek for the original patch.
2022-01-14 21:53:40 +01:00
Ondrej Zajicek (work)
d0dd1d20cd Netlink: Explicitly skip received cloned routes
Kernel uses cloned routes to keep route cache entries, but reports them
together with regular routes. They were skipped implicitly as they
do not have rtm_protocol filled. Add explicit check for cloned flag
and skip such routes explicitly.

Also, improve debug logs of skipped routes.
2022-01-14 19:07:57 +01:00
Ondrej Zajicek (work)
60e9def9ef BGP: Add option 'free bind'
The BGP 'free bind' option applies the IP_FREEBIND/IPV6_FREEBIND
socket option for the BGP listening socket.

Thanks to Alexander Zubkov for the idea.
2022-01-09 02:44:32 +01:00
Alexander Zubkov
87a02489f3 IO: Support nonlocal bind in socket interface
Add option to socket interface for nonlocal binding, i.e. binding to an
IP address that is not present on interfaces. This behaviour is enabled
when SKF_FREEBIND socket flag is set. For Linux systems, it is
implemented by IP_FREEBIND socket flag.

Minor changes done by commiter.
2022-01-08 19:02:31 +01:00
Ondrej Zajicek (work)
bcb25084d3 Test: Activate some remaining build tests 2022-01-05 20:07:27 +01:00
Ondrej Zajicek (work)
f5c8fb5fba Netlink: Do not ignore dead routes from BIRD
Currently, BIRD ignores dead routes to consider them absent. But it also
ignores its own routes and thus it can not correctly manage such routes
in some cases. This patch makes an exception for routes with proto bird
when ignoring dead routes, so they can be properly updated or removed.

Thanks to Alexander Zubkov for the original patch.
2022-01-05 19:25:42 +01:00
Ondrej Zajicek (work)
77d032c71f Netlink: Improve multipath parsing errors
Function nl_parse_multipath() should handle errors internally.
2022-01-05 18:46:41 +01:00
Ondrej Zajicek (work)
29dda184e5 Conf: Fix parsing full-length IPv6 addresses
Lexer expression for bytestring was too loose, accepting also
full-length IPv6 addresses. It should be restricted such that
colon is used between every byte or never.

Fix the regex and also add some test cases for it.

Thanks to Alexander Zubkov for the bugreport
2022-01-05 16:38:49 +01:00
Matous
75aceadaf7 gitlab-ci.yml: failing gitlab runner fixed.
'registry.labs.nic.cz' -> 'registry.nic.cz' changed
2022-01-05 04:13:39 +01:00
Alexander Zubkov
77042292ff Doc: Document min/max operators for lists 2021-12-28 04:09:36 +01:00
Alexander Zubkov
0e1fd7ea6a Filter: Add operators to find minimum and maximum element of sets
Add operators .min and .max to find minumum or maximum element in sets
of types: clist, eclist, lclist. Example usage:

bgp_community.min
bgp_ext_community.max
filter(bgp_large_community, [(as1, as2, *)]).min

Signed-off-by: Alexander Zubkov <green@qrator.net>
2021-12-28 04:07:09 +01:00
Alexander Zubkov
e15e465720 Doc: Document community components access operators 2021-12-28 04:07:09 +01:00
Alexander Zubkov
a2a268da4f Filter: Add operators to pick community components
Add operators that can be used to pick components from
pair (standard community) or lc (large community) types.
For example:

(10, 20).asn --> 10
(10, 20).data --> 20

(10, 20, 30).asn --> 10
(10, 20, 30).data1 --> 20
(10, 20, 30).data2 --> 30

Signed-off-by: Alexander Zubkov <green@qrator.net>
2021-12-28 04:07:00 +01:00
Ondrej Zajicek (work)
a39cd2cc0b BSD: Assume onlink flag on ifaces with only host addresses
The BSD kernel does not support the onlink flag and BIRD does not use
direct routes for next hop validation, instead depends on interface
address ranges. We would like to handle PtMP cases with only host
addresses configured, like:

  ifconfig wg0 192.168.0.10/32
  route add 192.168.0.4 -iface wg0
  route add 192.168.0.8 -iface wg0

To accept BIRD routes with onlink next-hop, like:

  route 192.168.42.0/24 via 192.168.0.4%wg0 onlink

BIRD would dismiss the route when receiving from the kernel, as the
next-hop 192.168.0.4 is not part of any interface subnet and onlink
flag is not kept by the BSD kernel.

The commit fixes this by assuming that for routes received from the
kernel, any next-hop is onlink on ifaces with only host addresses.

Thanks to Stefan Haller for the original patch.
2021-12-27 21:00:04 +01:00
Job Snijders
b9f38727a7 RPKI: Add contextual out-of-bound checks in RTR Prefix PDU handler
RFC 6810 and RFC 8210 specify that the "Max Length" value MUST NOT be
less than the Prefix Length element (underflow). On the other side,
overflow of the Max Length element also is possible, it being an 8-bit
unsigned integer allows for values larger than 32 or 128. This also
implicitly ensures there is no overflow of "Length" value.

When a PDU is received where the Max Length field is corrputed, the RTR
client (BIRD) should immediately terminate the session, flush all data
learned from that cache, and log an error for the operator.

Minor changes done by commiter.
2021-12-18 16:35:28 +01:00
Simon Ruderich
00410fd6c1 Doc: bgp: remove "advertise ipv4"
The option was removed in d15b0b0a ("BGP redesign", 2016-12-07)
but the documentation wasn't updated.
2021-12-18 03:17:48 +01:00
Ondrej Zajicek (work)
b21104c97e Nest: Do not ignore secondary flag changes in ifa updates
Compare all IA_* flags that are set by sysdep iface code.

The old code ignores IA_SECONDARY flag when comparing whether iface
address updates from kernel changed anything. This is usually not an
issue as kernel removes all secondary addresses due to removal of the
primary one, but it breaks when sysctl 'promote_secondaries' is enabled
and kernel promotes secondary addresses to primary ones.

Thanks to 'Alexander' for the bugreport.
2021-12-18 01:09:52 +01:00
Ondrej Zajicek (work)
78ddfd2600 Trie: Clarify handling of less-common net types
For convenience, Trie functions generally accept as input values not only
NET_IPx types of nets, but also NET_VPNx and NET_ROAx types. But returned
values are always NET_IPx types.
2021-12-02 03:35:29 +01:00
Maria Matejka
f772afc525 Memory statistics split into Effective and Overhead
This feature is intended mostly for checking that BIRD's allocation
strategies don't consume much memory space. There are some cases where
withdrawing routes in a specific order lead to memory fragmentation and
this output should give the user at least a notion of how much memory is
actually used for data storage and how much memory is "just allocated"
or used for overhead.

Also raising the "system allocator overhead estimation" from 8 to 16
bytes; it is probably even more. I've found 16 as a local minimum in
best scenarios among reachable machines. I couldn't find any reasonable
method to estimate this value when BIRD starts up.

This commit also fixes the inaccurate computation of memory overhead for
slabs where the "system allocater overhead estimation" was improperly
added to the size of mmap-ed memory.
2021-11-27 22:54:15 +01:00
Ondrej Zajicek (work)
14fc24f3a5 Trie: Implement longest-prefix-match queries and walks
The prefix trie now supports longest-prefix-match query by function
trie_match_longest_ipX() and it can be extended to iteration over all
covering prefixes for a given prefix (from longest to shortest) using
TRIE_WALK_TO_ROOT_IPx() macro.
2021-11-26 03:26:36 +01:00
Maria Matejka
644e9ca94e Directly mapped pages are kept for future use if temporarily not needed 2021-11-24 19:42:52 +00:00
Ondrej Zajicek (work)
062e69bf52 Trie: Implement trie walking code
Trie walking allows enumeration of prefixes in a trie in the usual
lexicographic order. Optionally, trie enumeration can be restricted
to a chosen subnet (and its descendants).
2021-11-19 18:04:32 +01:00
Ondrej Zajicek (work)
71c18d9f53 Trie: Simplify network matching code
Introduce ipX_prefix_equal() and use it to simplify network matching code.
2021-11-13 21:11:18 +01:00
Ondrej Zajicek (work)
9f24fef5e9 Conf: Fix crash during shutdown
BIRD implements shutdown by reconfiguring to fake empty configuration.
Such fake config structure is created from the last running config and
shares some data, including symbol table. This allows access to (removed)
routing tables and causes crash when 'show route' command is used during
shutdown.

Clean up symbol table, table list and links to default tables, so removed
routing tables cannot be accessed during shutdown.
2021-10-20 01:51:28 +02:00
Ondrej Zajicek (work)
067f69a56d Filter: Add prefix trie benchmarks
Add trie tests intended as benchmarks that use external datasets
instead of generated prefixes. As datasets are not included, they
are commented out by default.
2021-09-25 16:06:43 +02:00
Ondrej Zajicek (work)
e709dc09e6 Filter: Improve prefix trie tests
Add tests explicitly matching insides and outsides of trie and update
tests to do testing of both IPv4 and IPv6 tries.
2021-09-25 16:06:43 +02:00
Ondrej Zajicek (work)
dd61278c9d Filter: Update trie documentation 2021-09-25 16:06:43 +02:00
Ondrej Zajicek (work)
562a2b8c29 Filter: Fix trie test
Generated prefixes must be valid.
2021-09-25 16:06:43 +02:00
Ondrej Zajicek (work)
13225f1dbf Filter: Faster prefix sets
Use 16-way (4bit) branching in prefix trie instead of basic binary
branching. The change makes IPv4 prefix sets almost 3x faster, but
with more memory consumption and much more complicated algorithm.

Together with a previous filter change, it makes IPv4 prefix sets
about ~4.3x faster and slightly smaller (on my test data).
2021-09-25 16:06:43 +02:00
Ondrej Zajicek (work)
f761be6b30 Nest: Clean up main channel handling
Remove assumption that main channel is the only channel.
2021-06-17 16:56:51 +02:00
Ondrej Zajicek (work)
1b9bf4e192 Nest: Fix export of tmpattrs through pipes
Pipes copy the original rte with old values, so they require rte to be
exported with stored tmpattrs. Other protocols access stored attributes
using eattr list, so they require rte to be exported with expanded
tmpattrs. This is temporary hack, we plan to remove whoe tmpattr mechanism.

Thanks to Paul Donohue for the bugreport.
2021-06-14 20:02:50 +02:00
Ondrej Zajicek (work)
3ebabab277 Revert "Nest: Fix export of tmpattrs through pipes"
This reverts commit f8e273b5e7.
2021-06-14 17:58:37 +02:00
Ondrej Zajicek (work)
f8e273b5e7 Nest: Fix export of tmpattrs through pipes
In most cases of export there is no need to store back temporary
attributes to rte, as receivers (protocols) access eattr list anyway.
But pipe copies the original rte with old values, so we should store
tmpattrs also during export.

Thanks to Paul Donohue for the bugreport.
2021-06-14 16:30:59 +02:00
Ondrej Zajicek (work)
3f19100f5a CI: Allow Babel tests 2021-06-11 01:31:10 +02:00
Ondrej Zajicek (work)
596f2e32e3 Nest: Allow both 'password' and 'key' keywords for authentication keys 2021-06-09 19:54:01 +02:00
Ondrej Zajicek (work)
6d26f85395 Babel: Simplify auth expiration
Just use hello_expiry for that, keep init_expiry for initial
unauthentized neighbors.
2021-06-09 19:31:55 +02:00
Ondrej Zajicek (work)
8eea396baf Nest: Fix password list parsing code
One of previous patches broke password list parsing code, fix that.
2021-06-06 19:10:33 +02:00
Ondrej Zajicek (work)
ee9516dbe8 Lib: Fix static assert macro 2021-06-06 17:23:45 +02:00
Ondrej Zajicek (work)
b174cc0abc Babel: Add MAC authentication support - update
Some cleanups and bugfixes to the previous patch, including:

 - Fix rate limiting in index mismatch check

 - Fix missing BABEL_AUTH_INDEX_LEN in auth_tx_overhead computation

 - Fix missing auth_tx_overhead recalculation during reconfiguration

 - Fix pseudoheader construction in babel_auth_sign() (sport vs fport)

 - Fix typecasts for ptrdiffs in log messages

 - Make auth log messages similar to corresponding RIP/OSPF ones

 - Change auth log messages for events that happen during regular
   operation to debug messages

 - Switch meaning of babel_auth_check*() functions for consistency
   with corresponding RIP/OSPF ones

 - Remove requirement for min/max key length, only those required by
   given MAC code are enforced
2021-06-06 16:28:18 +02:00
Toke Høiland-Jørgensen
b218a28f61 Babel: Add MAC authentication support
This implements support for MAC authentication in the Babel protocol, as
specified by RFC 8967. The implementation seeks to follow the RFC as close
as possible, with the only deliberate deviation being the addition of
support for all the HMAC algorithms already supported by Bird, as well as
the Blake2b variant of the Blake algorithm.

For description of applicability, assumptions and security properties,
see RFC 8967 sections 1.1 and 1.2.
2021-06-06 16:28:18 +02:00
Toke Høiland-Jørgensen
69d10132a6 Babel: Refactor TLV parsing code for easier reuse
In preparation for adding authentication checks, refactor the TLV
walking code so it can be reused for a separate pass of the packet
for authentication checks.
2021-06-06 16:28:18 +02:00
Toke Høiland-Jørgensen
589f7d1e4f Nest: Allow MAC algorithms to specify min/max key length
Add min/max key length fields to the MAC algorithm description and
validate configured keys before they are used.
2021-06-06 16:28:18 +02:00
Toke Høiland-Jørgensen
35f88b305a Nest: Allow specifying security keys as hex bytes as well as strings
Add support for specifying a password in hexadecimal format, The result
is the same whether a password is specified as a quoted string or a
hex-encoded byte string, this just makes it more convenient to input
high-entropy byte strings as MAC keys.
2021-06-06 16:28:18 +02:00
Toke Høiland-Jørgensen
f1a824190c Lib: Add tests for blake2s and blake2b
Import the blake2-kat.h header with test vector output from the blake
reference implementation, and add tests to mac_test.c to compare the
output of the Bird MAC algorithm implementations with that reference
output.

Since the reference implementation only has test vectors for the full
output size, there are no tests for the smaller-sized output variants.
2021-06-06 16:28:09 +02:00
Toke Høiland-Jørgensen
725d9af94a Lib: Add Blake2s and Blake2b hash functions
The Babel MAC authentication RFC recommends implementing Blake2s as one of
the supported algorithms. In order to achieve do this, add the blake2b and
blake2s hash functions for MAC authentication. The hashing function
implementations are the reference implementations from blake2.net.

The Blake2 algorithms allow specifying an arbitrary output size, and the
Babel MAC spec says to implement Blake2s with 128-bit output. To satisfy
this, we add two different variants of each of the algorithms, one using
the default size (256 bits for Blake2s, 512 bits for Blake2b), and one
using half the default output size.

Update to BIRD coding style done by committer.
2021-06-06 16:26:58 +02:00
Ondrej Zajicek (work)
e5724f71d2 sysdep: Add wrapper to get random bytes - update
Simplify the code and fix an issue with getentropy() return value.
2021-06-06 16:26:06 +02:00
Toke Høiland-Jørgensen
c48ebde5ce sysdep: Add wrapper to get random bytes
Add a wrapper function in sysdep to get random bytes, and required checks
in configure.ac to select how to do it. The configure script tries, in
order, getrandom(), getentropy() and reading from /dev/urandom.
2021-06-06 16:26:06 +02:00
Ondrej Zajicek (work)
91d0458389 BGP: Ensure that freed neighbor entry is not accessed
Routes from downed protocols stay in rtable (until next rtable prune
cycle ends) and may be even exported to another protocol. In BGP case,
source BGP protocol is examined, although dynamic parts (including
neighbor entries) are already freed. That may lead to crash under some
race conditions. Ensure that freed neighbor entry is not accessed to
avoid this issue.
2021-06-01 02:20:26 +02:00
Maria Matejka
ebd5751cde Babel: Seqno requests are properly decoupled from neighbors when the underlying interface disappears
When an interface disappears, all the neighbors are freed as well. Seqno
requests were anyway not decoupled from them, leading to strange
segfaults. This fix adds a proper seqno request list inside neighbors to
make sure that no pointer to neighbor is kept after free.
2021-05-30 13:29:21 +02:00
Ondrej Zajicek (work)
10498b8e89 OSPF: Fix OSPFv3 in IPv4 mode with multiple areas
Some area handling code got confused by IPv4 setup in OSPFv3 mode.
2021-05-26 18:57:32 +02:00