833 commits

Author SHA1 Message Date
Ondrej Zajicek 543c8ba097 BSD: Fix krt socket code w.r.t. rte/rta changes 2022-11-30 02:43:39 +01:00
Ondrej Zajicek bbac9ca958 Conf: Make 'configure check' command restricted
While it does not directly change BIRD state, it can trigger reading
arbitrary files and eating significant memory.
2022-11-09 22:02:46 +01:00
Ondrej Zajicek 371eb49043 Conf: Free stored old config before parsing new one
BIRD keeps a previous (old) configuration for the purpose of undo. The
existing code frees it after a new configuration is successfully parsed
during reconfiguration. That causes memory usage spikes as there are
temporarily three configurations (old, current, and new). The patch
changes it to free the old one before parsing the new one (as the user
has already requested a new config). The disadvantage is that undo is
not available after a failed reconfiguration.
2022-11-09 21:54:45 +01:00
Maria Matejka 57308fb277 Page allocator: Fixed minor bugs and added commentary 2022-11-03 12:38:57 +01:00
Maria Matejka 9d03c3f56c Memory pages are not munmapped, instead we just madvise()
Memory unmapping causes slow address space fragmentation, leading in
extreme cases to failing to allocate pages at all. This removes the
problem by keeping all the pages allocated to us while calling madvise()
to let the kernel dispose of them.

This adds a little complexity and overhead, as we have to keep the
pointers to the free pages. For example, to hold 1 GB of 4K pages with
8B pointers, we have to store 2 MB of data.
2022-11-02 12:56:54 +01:00
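
A minimal sketch of the idea in 9d03c3f56c, not BIRD's actual allocator: the page stays mapped while madvise() tells the kernel that its contents may be discarded. The names `release_page` and `free_list_overhead` are hypothetical, and the use of MADV_DONTNEED is an assumption, since the commit message only says madvise(). The second helper reproduces the overhead arithmetic from the message.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: give the page contents back to the kernel but keep
 * the mapping itself, so the address space does not fragment.
 * MADV_DONTNEED is an assumption; the commit only mentions madvise(). */
int release_page(void *page, size_t page_size)
{
  assert(page != NULL && page_size > 0);
  return madvise(page, page_size, MADV_DONTNEED);
}

/* Bytes needed to remember every free page by one pointer each. */
uint64_t free_list_overhead(uint64_t total_bytes, uint64_t page_size,
                            uint64_t ptr_size)
{
  assert(page_size > 0);
  return (total_bytes / page_size) * ptr_size;
}
```

Under these assumptions, 1 GB of 4K pages is 262144 pages, and at 8 bytes per pointer the free list costs 2 MB, matching the figure in the commit message.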
Ondrej Zajicek 3242529750 Netlink: Parse onlink flag even on direct routes
While onlink flag is meaningful only with explicit next hops, it can be
defined also on direct routes. Parse it also in this case to avoid
periodic updates of the same route.

Thanks to Marcin Saklak for the bugreport.
2022-10-12 17:57:26 +02:00
Alexander Zubkov 0f2be469f8 KRT: Fix setting default preference
Changes in commit eb937358 broke setting of channel preference for alien
routes learned during scan. The preference was set only for async routes.
Move common attribute processing part of functions krt_learn_scan() and
krt_learn_async() to a separate function to have only one place for such
changes.
2022-09-27 11:33:41 +02:00
Maria Matejka d2c1036a42 Merge branch 'mq-fix-eattr-setting' into backport 2022-08-18 22:07:50 +02:00
Maria Matejka dc28c6ed1c Simplified the protocol hookup code in Makefiles 2022-08-18 22:07:30 +02:00
Maria Matejka 16ac6c3c74 Fixed initialization of Linux kernel route attributes 2022-08-18 17:44:00 +02:00
Ondrej Zajicek 082905a833 Merge branch 'master' into backport 2022-07-27 00:47:24 +02:00
Ondrej Zajicek ddb1bdf281 Netlink: Restrict route replace for IPv6
Seems like the previous patch was too optimistic, as route replace is
still broken even in Linux 4.19 LTS (but fixed in Linux 5.10 LTS) for:

  ip route add 2001:db8::/32 via fe80::1 dev eth0
  ip route replace 2001:db8::/32 dev eth0

It ends with two routes instead of just the second.

The issue is limited to direct and special type (e.g. unreachable)
routes, so the patch restricts route replace to cases when the new route
is a regular route (with a next hop address).
2022-07-26 18:45:20 +02:00
Ondrej Zajicek 722daa9500 Netlink: Simplify handling of IPv6 ECMP routes
When IPv6 ECMP support first appeared in the Linux kernel, it used a
different API than IPv4 ECMP. Individual next hops were updated and announced
separately, instead of using RTA_MULTIPATH as in IPv4. This has several
drawbacks and requires complex code to merge received notifications to
one multipath route.

When Linux came with IPv6 RTA_MULTIPATH support, the initial versions
were somewhat buggy, so we kept using the old API for updates (splitting
multipath routes to sequences of route updates), while accepting both
old-style routes and RTA_MULTIPATH routes in scans / notifications.

As IPv6 RTA_MULTIPATH support has been available for a long time, this patch fully
switches Netlink to the IPv6 RTA_MULTIPATH API and removes old complex
code for handling individual next hop announces.

The required Linux version is at least 4.11 for reliable operation.

Thanks to Daniel Gröber for the original patch.
2022-07-25 00:11:40 +02:00
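
For illustration, an RTA_MULTIPATH payload is a sequence of `struct rtnexthop` records that can be walked with the RTNH_* macros from `<linux/rtnetlink.h>`. The sketch below only counts the next hops; `count_nexthops` is a hypothetical name and this is not the parser used in BIRD.

```c
#include <linux/rtnetlink.h>

/* Count well-formed next hop records in an RTA_MULTIPATH payload.
 * Stops at the first truncated or malformed record. */
int count_nexthops(const void *payload, int len)
{
  const struct rtnexthop *nh = payload;
  int count = 0;

  while (len >= (int) sizeof(*nh) && RTNH_OK(nh, len))
  {
    count++;
    len -= RTNH_ALIGN(nh->rtnh_len);  /* records are padded to alignment */
    nh = RTNH_NEXT(nh);
  }

  return count;
}
```

A real parser would also read the nested attributes of each record (such as the gateway address), which is omitted here.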
Ondrej Zajicek 534d0a4b44 KRT: Scan routing tables separately on Linux to avoid congestion
Remove compile-time sysdep option CONFIG_ALL_TABLES_AT_ONCE, replace it
with runtime ability to run either separate table scans or shared scan.

On Linux, use separate table scans by default when the netlink socket
option NETLINK_GET_STRICT_CHK is available, but retreat to shared scan
when it fails.

Running separate table scans has advantages where some routing tables are
managed independently, e.g. when multiple routing daemons are running on
the same machine, as kernel routing table modification performance is
significantly reduced when the table is modified while it is being
scanned.

Thanks to Daniel Gröber for the original patch and Toke Høiland-Jørgensen
for suggestions.
2022-07-24 02:15:20 +02:00
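
The runtime choice described in 534d0a4b44 might look roughly like the sketch below. The enum and function names are hypothetical, and treating a failed setsockopt() as the signal to fall back to a shared scan is an assumption about what "when it fails" refers to.

```c
#include <sys/socket.h>
#include <linux/netlink.h>

enum scan_mode { SCAN_SHARED, SCAN_SEPARATE };  /* hypothetical names */

/* Prefer per-table scans when strict checking can be enabled on the
 * netlink socket, otherwise retreat to one shared scan of all tables. */
enum scan_mode choose_scan_mode(int nl_fd)
{
#if defined(SOL_NETLINK) && defined(NETLINK_GET_STRICT_CHK)
  int one = 1;
  if (setsockopt(nl_fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                 &one, sizeof(one)) == 0)
    return SCAN_SEPARATE;
#else
  (void) nl_fd;  /* headers too old to know the option */
#endif
  return SCAN_SHARED;
}
```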
Maria Matejka 2e5bfeb73a Merge remote-tracking branch 'origin/master' into backport 2022-07-11 11:08:10 +02:00
Maria Matejka d429bc5c84 Merge commit 'beb5f78a' into backport 2022-07-11 10:41:17 +02:00
Maria Matejka 7e9cede1fd Merge version 2.0.10 into backport 2022-07-10 14:19:24 +02:00
Ondrej Zajicek (work) 946cedfcfe Filter: Implement soft scopes
Soft scopes are anonymous scopes that most likely do not contain any
symbol, so allocating a regular scope is postponed until it is really
needed.
2022-06-27 21:13:31 +02:00
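
One way to picture soft scopes is as a counter on the enclosing real scope, with allocation happening only when a symbol is actually defined. The sketch below is a simplified model under that assumption; the struct layout and all function names are hypothetical and not taken from the BIRD sources.

```c
#include <stdlib.h>

struct scope {
  struct scope *parent;
  unsigned soft_scopes;  /* soft scopes opened on top of this real scope */
  unsigned symbols;      /* symbols defined directly in this scope */
};

/* Allocate a real scope nested in cur. */
struct scope *push_real(struct scope *cur)
{
  struct scope *s = calloc(1, sizeof(*s));
  if (!s)
    abort();
  s->parent = cur;
  return s;
}

/* Opening a soft scope allocates nothing, it only bumps a counter. */
struct scope *push_soft(struct scope *cur)
{
  cur->soft_scopes++;
  return cur;
}

/* Closing a scope: drop a pending soft scope first, else free a real one. */
struct scope *pop_scope(struct scope *cur)
{
  if (cur->soft_scopes > 0)
  {
    cur->soft_scopes--;
    return cur;
  }
  struct scope *parent = cur->parent;
  free(cur);
  return parent;
}

/* Defining a symbol turns the innermost soft scope into a real one. */
struct scope *define_symbol(struct scope *cur)
{
  if (cur->soft_scopes > 0)
  {
    cur->soft_scopes--;
    cur = push_real(cur);
  }
  cur->symbols++;
  return cur;
}
```

In this model a block that declares no symbols costs a single increment and decrement instead of an allocation.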
Maria Matejka beb5f78ada Preexport callback now takes the channel instead of protocol as argument
Passing protocol to preexport was in fact a historical relic from the
old times when channels weren't a thing. Refactoring that to match
current extensibility needs.
2022-06-27 19:04:24 +02:00
Ondrej Zajicek b867c798c3 NEWS and version update 2022-06-16 02:58:37 +02:00
Ondrej Zajicek f39e9aa203 IO: Improve resolution of latency debugging messages 2022-06-04 17:54:08 +02:00
Maria Matejka 097f157182 Merge commit '692055e3df6cc9f0d428d3b0dd8cdd8e825eb6f4' into haugesund-to-2.0 2022-05-30 15:17:52 +02:00
Maria Matejka 9eec503b25 Fixed a munmap abort bug
When BIRD was munmapping too many pages, it sometimes aborted, saying
that munmap failed with "Not enough memory" as the address space was
getting more and more fragmented.

There is a workaround in place, simply keeping that page for future use,
yet it has never been compiled in because I somehow forgot to include
errno.h. And because I also thought that somebody may have ENOMEM not
defined (why?!), there was a check which quietly omitted that
workaround.

Anyway, ENOMEM is POSIX. It's utter nonsense to check for its
existence. If it doesn't exist, something is broken.
2022-04-13 11:36:54 +02:00
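
The workaround described in 9eec503b25 can be sketched as follows: if munmap() fails with ENOMEM, the page is kept for later reuse instead of aborting. The enum and the function name are hypothetical; only munmap(), errno and ENOMEM come from the commit message.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

enum page_fate { PAGE_UNMAPPED, PAGE_KEPT, PAGE_ERROR };  /* hypothetical */

/* Try to unmap a page. On ENOMEM the mapping is left intact and the
 * caller is expected to keep the page for future allocations. */
enum page_fate unmap_or_keep(void *page, size_t size)
{
  if (munmap(page, size) == 0)
    return PAGE_UNMAPPED;

  if (errno == ENOMEM)
    return PAGE_KEPT;

  return PAGE_ERROR;  /* any other failure is a real bug */
}
```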
Maria Matejka 4a23ede2b0 Protocols have their own explicit init routines 2022-04-06 18:14:08 +02:00
Ondrej Zajicek (work) 4b1aa37f93 Netlink: Remove superfluous sysdep/linux/netlink.c.orig
Thanks to Vincent Bernat for the notice.
2022-03-16 23:16:26 +01:00
Maria Matejka 4e60b3ee72 Fixed a static assert in page allocator 2022-03-09 13:28:03 +01:00
Maria Matejka 19e727a248 Merge commit '60880b539b8886f76961125d89a265c6e1112b7a' into haugesund 2022-03-09 11:29:56 +01:00
Maria Matejka 24773af9e0 Merge commit 'e42eedb9' into haugesund 2022-03-09 11:02:55 +01:00
Maria Matejka 83d9920f90 Merge commit '5cff1d5f' into haugesund
Conflicts:
      proto/bgp/attrs.c
      proto/pipe/pipe.c
2022-03-09 10:56:06 +01:00
Maria Matejka ff47cd80dd Merge commit 'd5a32563' into haugesund 2022-03-09 10:50:38 +01:00
Maria Matejka 9e60a1fbc3 Fixed resource initialization in unit tests 2022-03-09 10:30:42 +01:00
Maria Matejka eeec9ddbf2 Merge commit '0c59f7ff' into haugesund 2022-03-09 09:13:55 +01:00
Maria Matejka 0c59f7ff01 Revert "Bound allocated pages to resource pools with page caches to avoid unnecessary syscalls"
This reverts commit 7f0e598208.
2022-03-09 09:13:31 +01:00
Maria Matejka 1c7df2c240 Revert "Multipage allocation"
This reverts commit 6cd3771378.
2022-03-09 09:13:20 +01:00
Maria Matejka c78247f9b9 Single-threaded version of sark-branch memory page management 2022-03-09 09:10:44 +01:00
Maria Matejka 48bf1322aa Introducing a universal temporary linpool flushed after every task 2022-03-02 12:13:49 +01:00
Maria Matejka d071aca7aa Merge commit '2c13759136951ef0e70a3e3c2b2d3c9a387f7ed9' into haugesund 2022-03-02 10:01:44 +01:00
Ondrej Zajicek (work) 71c9484b00 NEWS and version update 2022-02-09 03:47:49 +01:00
Ondrej Zajicek (work) 2fc8b4c4ba Alloc: Use posix_memalign() instead of aligned_alloc()
For compatibility with older systems use posix_memalign(). We can
switch to aligned_alloc() when we commit to C11 for multithreading.
2022-02-08 22:42:00 +01:00
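
A small sketch of an aligned allocation through posix_memalign(), as named in 2fc8b4c4ba. The wrapper names `alloc_aligned` and `is_aligned` are hypothetical. Unlike aligned_alloc(), posix_memalign() reports failure through its return value, and the alignment must be a power of two that is a multiple of sizeof(void *).

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/* Return a block of `size` bytes aligned to `alignment`, or NULL. */
void *alloc_aligned(size_t alignment, size_t size)
{
  void *ptr = NULL;

  /* posix_memalign() returns an error number and does not set errno */
  if (posix_memalign(&ptr, alignment, size) != 0)
    return NULL;

  return ptr;
}

/* Check that a pointer is a multiple of `alignment`. */
int is_aligned(const void *ptr, size_t alignment)
{
  return ((uintptr_t) ptr % alignment) == 0;
}
```

Memory obtained this way is released with plain free().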
Ondrej Zajicek (work) ef614f2984 Netlink: Minor cleanup 2022-02-08 22:21:08 +01:00
Ondrej Zajicek (work) 81ee6cda2e Netlink: Add option to specify netlink socket receive buffer size
Add option 'netlink rx buffer' to specify netlink socket receive buffer
size. Uses SO_RCVBUFFORCE, so it can override rmem_max limit.

Thanks to Trisha Biswas and Michal for the original patches.
2022-01-17 05:11:29 +01:00
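
A hedged sketch of how a receive buffer size can be forced with SO_RCVBUFFORCE. The function name is hypothetical, and the fallback to plain SO_RCVBUF (which is capped by rmem_max) is an assumption added here so the sketch also works without CAP_NET_ADMIN; the commit message does not say whether BIRD falls back.

```c
#include <sys/socket.h>

/* Request a receive buffer of `bytes` on socket `fd`.
 * Returns 0 on success, -1 on failure. */
int set_rx_buffer(int fd, int bytes)
{
#ifdef SO_RCVBUFFORCE
  /* Privileged variant, may exceed the rmem_max limit */
  if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &bytes, sizeof(bytes)) == 0)
    return 0;
#endif

  /* Assumed fallback: unprivileged variant, limited by rmem_max */
  return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}
```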
Ondrej Zajicek (work) bbc33f6ec3 Netlink: Add another workaround for older kernel headers
Unfortunately, SOL_NETLINK is both recently added and arch-dependent,
so we cannot just define it.
2022-01-15 22:39:40 +01:00
Ondrej Zajicek (work) 8988264a64 Netlink: Add workaround for older kernel headers 2022-01-14 23:15:05 +01:00
Ondrej Zajicek (work) e818f16448 Netlink: Enable strict checking for KRT dumps
Add strict checking for netlink KRT dumps to avoid receiving PMTU cache
records from the FNHE table dump along with KRT.

Linux Kernel added FNHE table dump to the netlink API in patch:

8d3b68cd37.1561131177.git.sbrivio@redhat.com/

Therefore, since Linux 5.3 these route cache entries are dumped together
with regular routes during periodic KRT scans, which in some cases may be
a huge amount of useless data. This can be avoided by using strict checking
for netlink dumps:

https://lore.kernel.org/netdev/20181008031644.15989-1-dsahern@kernel.org/

The patch mitigates the risk of receiving unknown and potentially large
number of FNHE records that would block BIRD I/O in each sync. There is a
known issue caused by the GRE tunnels on Linux that seems to be creating
one FNHE record for each destination IP address that is routed through
the tunnel, even when the PMTU equals the GRE interface MTU.

Thanks to Tomas Hlavacek for the original patch.
2022-01-14 21:53:40 +01:00
Ondrej Zajicek (work) d0dd1d20cd Netlink: Explicitly skip received cloned routes
Kernel uses cloned routes to keep route cache entries, but reports them
together with regular routes. They were skipped implicitly as they
do not have rtm_protocol filled. Add explicit check for cloned flag
and skip such routes explicitly.

Also, improve debug logs of skipped routes.
2022-01-14 19:07:57 +01:00
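
The explicit check mentioned in d0dd1d20cd amounts to testing the RTM_F_CLONED bit in the route message header. A minimal sketch, with a hypothetical function name:

```c
#include <linux/rtnetlink.h>

/* Nonzero when the kernel marks the route as a cloned (cache) entry. */
int route_is_cloned(const struct rtmsg *rtm)
{
  return (rtm->rtm_flags & RTM_F_CLONED) != 0;
}
```

A scan loop would skip any message for which this returns nonzero, regardless of its rtm_protocol value.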
Alexander Zubkov 87a02489f3 IO: Support nonlocal bind in socket interface
Add option to socket interface for nonlocal binding, i.e. binding to an
IP address that is not present on interfaces. This behaviour is enabled
when SKF_FREEBIND socket flag is set. For Linux systems, it is
implemented by IP_FREEBIND socket flag.

Minor changes done by committer.
2022-01-08 19:02:31 +01:00
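
On Linux, nonlocal binding for IPv4 can be requested with the IP_FREEBIND socket option before calling bind(). The sketch below shows only that step; `enable_freebind` is a hypothetical name and the mapping from BIRD's SKF_FREEBIND flag to this call is not reproduced here.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Allow `fd` to bind to an address that is not configured locally.
 * Returns 0 on success, -1 on failure. */
int enable_freebind(int fd)
{
#ifdef IP_FREEBIND
  int one = 1;
  return setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &one, sizeof(one));
#else
  (void) fd;
  errno = ENOPROTOOPT;  /* option not known on this platform */
  return -1;
#endif
}
```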
Ondrej Zajicek (work) f5c8fb5fba Netlink: Do not ignore dead routes from BIRD
Currently, BIRD ignores dead routes to consider them absent. But it also
ignores its own routes and thus it cannot correctly manage such routes
in some cases. This patch makes an exception for routes with proto bird
when ignoring dead routes, so they can be properly updated or removed.

Thanks to Alexander Zubkov for the original patch.
2022-01-05 19:25:42 +01:00
Ondrej Zajicek (work) 77d032c71f Netlink: Improve multipath parsing errors
Function nl_parse_multipath() should handle errors internally.
2022-01-05 18:46:41 +01:00
Ondrej Zajicek (work) a39cd2cc0b BSD: Assume onlink flag on ifaces with only host addresses
The BSD kernel does not support the onlink flag and BIRD does not use
direct routes for next hop validation, instead depends on interface
address ranges. We would like to handle PtMP cases with only host
addresses configured, like:

  ifconfig wg0 192.168.0.10/32
  route add 192.168.0.4 -iface wg0
  route add 192.168.0.8 -iface wg0

We would also like to accept BIRD routes with an onlink next hop, like:

  route 192.168.42.0/24 via 192.168.0.4%wg0 onlink

BIRD would dismiss the route when receiving from the kernel, as the
next-hop 192.168.0.4 is not part of any interface subnet and onlink
flag is not kept by the BSD kernel.

The commit fixes this by assuming that for routes received from the
kernel, any next-hop is onlink on ifaces with only host addresses.

Thanks to Stefan Haller for the original patch.
2021-12-27 21:00:04 +01:00
Maria Matejka 644e9ca94e Directly mapped pages are kept for future use if temporarily not needed 2021-11-24 19:42:52 +00:00