Commit graph

169 commits

Author SHA1 Message Date
Ondrej Zajicek (work)
943478b00f Basic VRF support
Add basic VRF (virtual routing and forwarding) support. Protocols can be
associated with VRFs, such protocols will be restricted to interfaces
assigned to the VRF (as reported by Linux kernel) and will use sockets
bound to the VRF. E.g., different multihop BGP instances can use diffent
kernel routing tables to handle BGP TCP connections.

The VRF support is preliminary, currently there are several limitations:

- Recent Linux kernels (4.11) do not handle correctly sockets bound
to interaces that are part of VRF, so most protocols other than multihop
BGP do not work. This will be fixed by future kernel versions.

- Neighbor cache ignores VRFs. Breaks config with the same prefix on
local interfaces in different VRFs. Not much problem as single hop
protocols do not work anyways.

- Olock code ignores VRFs. Breaks config with multiple BGP peers with the
same IP address in different VRFs.

- Incoming BGP connections are not dispatched according to VRFs.
Breaks config with multiple BGP peers with the same IP address in
different VRFs. Perhaps we would need some kernel API to read VRF of
incoming connection? Or probably use multiple listening sockets in
int-new branch.

- We should handle master VRF interface up/down events and perhaps
disable associated protocols when VRF goes down. Or at least disable
associated interfaces.

- Also we should check if the master iface is really VRF iface and
not some other kind of master iface.

- BFD session request dispatch should be aware of VRFs.

- Perhaps kernel protocol should read default kernel table ID from VRF
iface so it is not necessary to configure it.

- Perhaps we should have per-VRF default table.
2017-09-06 17:38:48 +02:00
Ondrej Zajicek (work)
98bb80a243 KRT: Fix IPv6 ECMP handling with Linux 4.11+
Starting from Linux 4.11, IPv6 ECMP routes are now notified using
RTA_MULTIPATH, like IPv4 ones. The patch adds support for RTA_MULTIPATH
parsing for IPv6 routes. This also enables to parse ECMP alien routes
correctly.

Thanks to Vincent Bernat for the original patch.
2017-09-05 00:02:20 +02:00
Ondrej Zajicek (work)
c253ec3a9c Some more autoconf cleanups
Replace integer type width detection with C99 fixed-width types.
Also remove some unused or obsolete code.

Thanks to Ruben Kerkhof for the patchset.
2017-05-16 12:59:22 +02:00
Jan Moskyto Matejka
2c33da5070 Netlink: fix occasional netlink hangs on busy machines 2016-12-20 20:36:56 +01:00
Ondrej Zajicek (work)
c8cafc8ebb Minor code cleanups 2016-11-08 17:46:29 +01:00
Jan Moskyto Matejka
3e236955c9 Build: switch on -Wextra, get rid of most of the warnings
There are several unresolved -Wmissing-field-initializers on older
versions of GCC than 5.1, all of them false positive.
2016-11-01 14:52:54 +01:00
Jan Moskyto Matejka
ccd2a3eda2 Kernel socket missing err_hook fix
Thanks to Tim Weippert for bugreport.
2016-09-29 13:21:16 +02:00
Ondrej Zajicek (work)
6e75d0d27f KRT: Add krt_scope attribute
Add a new route attribute, krt_scope, to expose the Linux kernel route
scope. Constants from /etc/iproute2/rt_scopes (prefixed by "ips_") are
expected to be used with the attribute. Both import and export are
supported.

Also, the patch fixes device route export to the kernel, by setting link
scope automatically.
2016-09-19 12:29:56 +02:00
Ondrej Zajicek (work)
4adcb9df1b KRT: Add kernel metric protocol option
Kernel routes with different metrics do not clash with each other,
therefore using dedicated metric value is a reliable way to avoid
overwriting routes from other sources (e.g. kernel device routes).

Although kernel route metric could already be set as a route attribute by
filters, that is not consistent with the way how Linux kernel handles
route metric - not just a route attribute, but a part of a route key.
2016-09-15 14:59:06 +02:00
Ondrej Zajicek (work)
2feaa6931b KRT: Support for IPv6 ECMP
Linux represents IPv6 ECMP routes as a sequence of unipath routes with
the same prefix. We have to translate between our representation (one
route with multipath next hop) and the Linux representation in both
directions.

Proper learning of alien IPv6 ECMP routes still not supported.

Thanks to Mikhail Sennikovskii for the original patch.
2016-09-14 11:53:54 +02:00
Ondrej Zajicek (work)
f9f2e280ea KRT: Forbid path merging on BSD
We support ECMP routes only on Linux. Exported routes are checked in
krt_capable(), but a route generated during path merging avoids this
check.
2016-08-30 12:43:46 +02:00
Ondrej Zajicek (work)
a08a81c6b4 Netlink: Fix build with older headers missing IFA_FLAGS 2016-07-20 15:31:25 +02:00
Ondrej Zajicek (work)
e37d2e3e70 Netlink: Ignore tentative addresses
Ignore tentative IPv6 addresses and wait until finish of Duplicate
Address Detection (We got notification when an address is no longer
tentative) to avoid problems when protocols try to use interfaces
with tentative link-local addresses.

Based on patch from Jan Moskyto Matejka
2016-07-20 15:06:57 +02:00
Stijn Tintel
31e9e10144 netlink: update struct msghdr
The netlink code assumes an order for the members of struct msghdr.
This breaks recvmsg and sendmsg with musl libc on mips64. Fix this by
using designated initializers instead.

Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be>
2016-05-10 16:05:16 +02:00
Ondrej Zajicek (work)
a7baa09862 BSD: Add the IPsec SA/SP database entries control
Add code for manipulation with TCP-MD5 keys in the IPsec SA/SP database
at FreeBSD systems. Now, BGP MD5 authentication (RFC 2385) keys are
handled automatically on both Linux and FreeBSD.

Based on patches from Pavel Tvrdik.
2016-04-13 14:37:09 +02:00
Ondrej Zajicek (work)
e86cfd41d9 KRT: Fix route learn scan when route changed
When a kernel route changed, function krt_learn_scan() noticed that and
replaced the route in internal kernel FIB, but after that, function
krt_learn_prune() failed to propagate the new route to the nest, because
it confused the new route with the (removed) old best route and decided
that the best route did not changed.

Wow, the original code (and the bug) is almost 17 years old.
2016-04-06 11:46:25 +02:00
Jan Moskyto Matejka
ad27615760 Netlink: attribute validation before parsing
Wanted netlink attributes are defined in a table, specifying
their size and neediness. Removing the long conditions that did the
validation before.

Also parsing IPv4 and IPv6 versions regardless on the IPV6 macro.
2015-11-24 14:30:20 +01:00
Ondrej Zajicek (work)
1e4891e48e Nest: Fix bug in device proto
If an interface address notification is received during device protocol
shutdown/restart, BIRD crashed.

Thanks to Wei Huang for the bugreport.
2015-11-23 11:13:40 +01:00
Pavel Tvrdík
fce764f90e Fix compiling with --enable-debug option 2015-11-11 11:46:38 +01:00
Jan Moskyto Matejka
9ddbfbddf8 Netlink: Allow more than 256 routing tables.
Since 2.6.19, the netlink API defines RTA_TABLE routing attribute to
allow 32-bit routing table IDs. Using this attribute to index routing
tables at Linux, instead of 8-bit rtm_table field.
2015-11-11 11:40:49 +01:00
Ondrej Zajicek (work)
acb04cfdc5 Minor changes 2015-10-17 14:43:37 +02:00
Ondrej Zajicek
641172c6e5 Netlink: Fixes uninitialized variable
Thanks to Pavel Tvrdik for the bugfix
2015-07-28 12:36:03 +02:00
Ondrej Zajicek
78a2cc289f KRT: Fixes some minor bugs in kernel protocol 2015-06-08 02:24:08 +02:00
Pavel Tvrdík
ae80a2de95 unsigned [int] -> uint 2015-06-08 02:24:08 +02:00
Ondrej Zajicek
38e835dede Fix in the last commit 2015-05-13 13:19:26 +02:00
Ondrej Zajicek
9fdf9d29b6 KRT: Add support for plenty of kernel route metrics
Linux kernel route metrics (RTA_METRICS netlink route attribute) are
represented and accessible as new route attributes:

krt_mtu, krt_window, krt_rtt, krt_rttvar, krt_sstresh, krt_cwnd, krt_advmss,
krt_reordering, krt_hoplimit, krt_initcwnd, krt_rto_min, krt_initrwnd,
krt_quickack, krt_lock_mtu, krt_lock_window, krt_lock_rtt, krt_lock_rttvar,
krt_lock_sstresh, krt_lock_cwnd, krt_lock_advmss, krt_lock_reordering,
krt_lock_hoplimit, krt_lock_rto_min, krt_feature_ecn, krt_feature_allfrag
2015-05-12 16:42:22 +02:00
Ondrej Zajicek
16a3254c4c Understand IFF_MULTICAST flag on ifaces in Linux
Unfortunately, some interfaces support multicast but do not have
this flag set, so we use it only as a positive hint.

Thanks to Clint Armstrong for noticing the problem.
2015-03-31 23:59:40 +02:00
Ondrej Zajicek
86c3eea0f3 Use AF_UNSPEC for RTM_GETLINK
This value is specified in documentation.
2015-02-21 21:19:49 +01:00
Ondrej Zajicek
1123e70740 Implements token bucket filter for rate limiting. 2014-10-02 12:52:50 +02:00
Ondrej Zajicek
8945f73d94 Ensures that msg_controllen includes last padding.
Although RFC 3542 allows both cases, Theo de Raadt thinks
he knows better, and msg_controllen without last padding
fails on OpenBSD.

Thanks to Job Snijders for the bugreport.
2014-06-26 13:30:27 +02:00
Ondrej Zajicek
05476c4d04 IPv4/IPv6 integrated socket code. 2014-05-18 11:42:26 +02:00
Ondrej Zajicek
eb5ea6bdd6 Fixes build on some old systems. 2014-03-31 13:21:13 +02:00
Ondrej Zajicek
3216eb03dd Fixes longstanding issue with interfaces staying in IF_TMP_DOWN.
Thanks to Pierluigi Rolando and others for the bugreport.
2014-02-26 12:52:00 +01:00
Ondrej Zajicek
48e5f32db6 Many changes in I/O and OSPF sockets and packet handling.
I/O:
 - BSD: specify src addr on IP sockets by IP_HDRINCL
 - BSD: specify src addr on UDP sockets by IP_SENDSRCADDR
 - Linux: specify src addr on IP/UDP sockets by IP_PKTINFO
 - IPv6: specify src addr on IP/UDP sockets by IPV6_PKTINFO
 - Alternative SKF_BIND flag for binding to IP address
 - Allows IP/UDP sockets without tx_hook, on these
   sockets a packet is discarded when TX queue is full
 - Use consistently SOL_ for socket layer values.

OSPF:
 - Packet src addr is always explicitly set
 - Support for secondary addresses in BSD
 - Dynamic RX/TX buffers
 - Fixes some minor buffer overruns
 - Interface option 'tx length'
 - Names for vlink pseudoifaces (vlinkX)
 - Vlinks use separate socket for TX
 - Vlinks do not use fixed associated iface
 - Fixes TTL for direct unicast packets
 - Fixes DONTROUTE for OSPF sockets
 - Use ifa->ifname instead of ifa->iface->name
2014-02-06 17:46:01 +01:00
Ondrej Zajicek
283c7dfada Merge branch 'master' into add-path 2013-11-25 18:42:47 +01:00
Ondrej Zajicek
e237b28a4d Changes primary addr selection on BSD to respect SIOCGIFADDR ioctl() result.
Thanks to Alexander V. Chernikov for the original patch.
2013-11-25 01:21:39 +01:00
Ondrej Zajicek
65194bd1eb Removes workaround related to import of kernel device routes.
Thanks to Benjamin Cama for notification.
2013-11-23 22:48:27 +01:00
Ondrej Zajicek
736e143fa5 Merge branch 'master' into add-path
Conflicts:

	filter/filter.c
	nest/proto.c
	nest/rt-table.c
	proto/bgp/bgp.h
	proto/bgp/config.Y
2013-11-23 11:50:34 +01:00
Ondrej Zajicek
f83ce94d5e Fixes missing unregister of kernel table handling code.
And some minor fixes.

Thanks to Sergey Popovich for the patch.
2013-09-26 17:33:00 +02:00
Ondrej Zajicek
c6964c305b Makes krt.c much more readable. 2013-07-04 18:02:22 +02:00
Ondrej Zajicek
70e212f913 Implements TTL security for OSPF and RIP.
Interfaces for OSPF and RIP could be configured to use (and request)
TTL 255 for traffic to direct neighbors.

Thanks to Simon Dickhoven for the original patch for RIPng.
2013-06-25 15:39:44 +02:00
Ondrej Zajicek
ef4a50be10 Better packet priority and traffic class handling.
Implements support for IPv6 traffic class, sets higher priority for OSPF
and RIP outgoing packets by default and allows to configure ToS/DS/TClass
IP header field and the local priority of outgoing packets.
2013-06-24 16:37:30 +02:00
Ondrej Zajicek
9810d05562 Fixes problems with routing table scans on some platforms.
Negative bit shifts are definitely undefined oprations.
2013-05-28 10:44:44 +02:00
Ondrej Zajicek
094d2bdb79 Implements ADD-PATH extension for BGP.
Allows to send and receive multiple routes for one network by one BGP
session. Also contains necessary core changes to support this (routing
tables accepting several routes for one network from one protocol).
It needs some more cleanup before merging to the master branch.
2012-08-14 16:46:43 +02:00
Ondrej Zajicek
c06de722dd Some minor fixes. 2012-08-06 11:09:13 +02:00
Ondrej Zajicek
47c447c42e Minor cleanups. 2012-05-11 12:10:21 +02:00
Ondrej Zajicek
95616c8202 Cleanup in sysdep KRT code, part 4.
Adding some files that was accidentally removed
(instead of moved) in cleanup part 2.
2012-05-04 16:38:25 +02:00
Ondrej Zajicek
f1aceff59b Cleanup in sysdep KRT code, part 2.
Remove support for historic Linux kernels,
merge krt-iface, krt-set and krt-scan stub headers.
2012-04-30 22:25:24 +02:00
Ondrej Zajicek
396dfa9042 Cleanup in sysdep KRT code, part 1.
OS-dependent functions renamed to be more consistent,
prepared to merge krt-set and krt-scan headers.

Name changes:

struct krt_if_params -> struct kif_params
struct krt_if_status -> struct kif_status
struct krt_set/scan_params -> struct krt_params
struct krt_set/scan_status -> struct krt_status

krt_if_params_same -> kif_sys_reconfigure
krt_if_copy_params -> kif_sys_copy_config
krt_set/scan_params_same -> krt_sys_reconfigure
krt_set/scan_copy_params -> krt_sys_copy_config

krt_if_scan -> kif_do_scan
krt_set_notify -> krt_do_notify
krt_scan_fire -> krt_do_scan

krt_if_ -> kif_sys_
krt_scan_ -> krt_sys_
krt_set_ -> krt_sys_
2012-04-30 15:31:32 +02:00
Ondrej Zajicek
3589546af4 Merge commit 'origin/master' 2012-04-24 23:37:01 +02:00