Documentation (and minor fixes) for BGP graceful restart.
parent
0c791f873a
commit
6eda3f135f
4 changed files with 274 additions and 70 deletions
|
@ -157,6 +157,9 @@ options. The most important ones are:
|
||||||
|
|
||||||
<tag>-f</tag>
|
<tag>-f</tag>
|
||||||
run bird in foreground.
|
run bird in foreground.
|
||||||
|
|
||||||
|
<tag>-R</tag>
|
||||||
|
apply graceful restart recovery after start.
|
||||||
</descrip>
|
</descrip>
|
||||||
|
|
||||||
<p>BIRD writes messages about its work to log files or syslog (according to config).
|
<p>BIRD writes messages about its work to log files or syslog (according to config).
|
||||||
|
@ -187,6 +190,7 @@ configuration, but it is generally easy -- BIRD needs just the
|
||||||
standard library, privileges to read the config file and create the
|
standard library, privileges to read the config file and create the
|
||||||
control socket and the CAP_NET_* capabilities.
|
control socket and the CAP_NET_* capabilities.
|
||||||
|
|
||||||
|
|
||||||
<chapt>About routing tables
|
<chapt>About routing tables
|
||||||
|
|
||||||
<p>BIRD has one or more routing tables which may or may not be
|
<p>BIRD has one or more routing tables which may or may not be
|
||||||
|
@ -242,6 +246,20 @@ using comparison and ordering). Minor advantage is that routes are
|
||||||
shown sorted in <cf/show route/, minor disadvantage is that it is
|
shown sorted in <cf/show route/, minor disadvantage is that it is
|
||||||
slightly more computationally expensive.
|
slightly more computationally expensive.
|
||||||
|
|
||||||
|
<sect>Graceful restart
|
||||||
|
|
||||||
|
<p>When BIRD is started after restart or crash, it repopulates routing tables in
|
||||||
|
an uncoordinated manner, like after a clean start. This may be impractical in some
|
||||||
|
cases, because if the forwarding plane (i.e. kernel routing tables) remains
|
||||||
|
intact, then its synchronization with BIRD would temporarily disrupt packet
|
||||||
|
forwarding until protocols converge. Graceful restart is a mechanism that could
|
||||||
|
help with this issue. Generally, it works by starting protocols and letting them
|
||||||
|
repopulate routing tables while deferring route propagation until protocols
|
||||||
|
acknowledge their convergence. Note that graceful restart behavior has to be
|
||||||
|
configured for all relevant protocols and requires protocol-specific support
|
||||||
|
(currently implemented for Kernel and BGP protocols). It is activated for a
|
||||||
|
particular boot by option <cf/-R/.
|
||||||
|
|
||||||
|
|
||||||
<chapt>Configuration
|
<chapt>Configuration
|
||||||
|
|
||||||
|
@ -371,6 +389,12 @@ protocol rip {
|
||||||
would accept IPv6 routes only). Such behavior was default in
|
would accept IPv6 routes only). Such behavior was default in
|
||||||
older versions of BIRD.
|
older versions of BIRD.
|
||||||
|
|
||||||
|
<tag>graceful restart wait <m/number/</tag>
|
||||||
|
During graceful restart recovery, BIRD waits for convergence of routing
|
||||||
|
protocols. This option sets a timeout for the recovery to
|
||||||
|
prevent waiting indefinitely if some protocols cannot converge. Default:
|
||||||
|
240 seconds.
|
||||||
|
|
||||||
<tag>timeformat route|protocol|base|log "<m/format1/" [<m/limit/ "<m/format2/"]</tag>
|
<tag>timeformat route|protocol|base|log "<m/format1/" [<m/limit/ "<m/format2/"]</tag>
|
||||||
This option allows to specify a format of date/time used by
|
This option allows to specify a format of date/time used by
|
||||||
BIRD. The first argument specifies for which purpose such
|
BIRD. The first argument specifies for which purpose such
|
||||||
|
@ -1493,6 +1517,8 @@ extended communities
|
||||||
(RFC 4360<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4360.txt">),
|
(RFC 4360<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4360.txt">),
|
||||||
route reflectors
|
route reflectors
|
||||||
(RFC 4456<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4456.txt">),
|
(RFC 4456<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4456.txt">),
|
||||||
|
graceful restart
|
||||||
|
(RFC 4724<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4724.txt">),
|
||||||
multiprotocol extensions
|
multiprotocol extensions
|
||||||
(RFC 4760<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4760.txt">),
|
(RFC 4760<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4760.txt">),
|
||||||
4B AS numbers
|
4B AS numbers
|
||||||
|
@ -1502,9 +1528,7 @@ and 4B AS numbers in extended communities
|
||||||
|
|
||||||
|
|
||||||
For IPv6, it uses the standard multiprotocol extensions defined in
|
For IPv6, it uses the standard multiprotocol extensions defined in
|
||||||
RFC 2283<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc2283.txt">
|
RFC 4760<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc4760.txt">
|
||||||
including changes described in the
|
|
||||||
latest draft<htmlurl url="ftp://ftp.rfc-editor.org/internet-drafts/draft-ietf-idr-bgp4-multiprotocol-v2-05.txt">
|
|
||||||
and applied to IPv6 according to
|
and applied to IPv6 according to
|
||||||
RFC 2545<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc2545.txt">.
|
RFC 2545<htmlurl url="ftp://ftp.rfc-editor.org/in-notes/rfc2545.txt">.
|
||||||
|
|
||||||
|
@ -1716,6 +1740,26 @@ for each neighbor using the following configuration parameters:
|
||||||
capability and accepts such requests. Even when disabled, BIRD
|
capability and accepts such requests. Even when disabled, BIRD
|
||||||
can send route refresh requests. Default: on.
|
can send route refresh requests. Default: on.
|
||||||
|
|
||||||
|
<tag>graceful restart <m/switch/|aware</tag>
|
||||||
|
When a BGP speaker restarts or crashes, neighbors will discard all
|
||||||
|
received paths from the speaker, which disrupts packet forwarding even
|
||||||
|
when the forwarding plane of the speaker remains intact. RFC 4724
|
||||||
|
specifies an optional graceful restart mechanism to alleviate this
|
||||||
|
issue. This option controls the mechanism. It has three states:
|
||||||
|
Disabled, when no support is provided. Aware, when the graceful restart
|
||||||
|
support is announced and the support for restarting neighbors is
|
||||||
|
provided, but no local graceful restart is allowed (i.e. receiving-only
|
||||||
|
role). Enabled, when the full graceful restart support is provided
|
||||||
|
(i.e. both restarting and receiving role). Note that proper support for
|
||||||
|
local graceful restart also requires configuration of other protocols.
|
||||||
|
Default: aware.
|
||||||
|
|
||||||
|
<tag>graceful restart time <m/number/</tag>
|
||||||
|
The restart time is announced in the BGP graceful restart capability
|
||||||
|
and specifies how long the neighbor would wait for the BGP session to
|
||||||
|
re-establish after a restart before deleting stale routes. Default:
|
||||||
|
120 seconds.
|
||||||
|
|
||||||
<tag>interpret communities <m/switch/</tag> RFC 1997 demands
|
<tag>interpret communities <m/switch/</tag> RFC 1997 demands
|
||||||
that BGP speaker should process well-known communities like
|
that BGP speaker should process well-known communities like
|
||||||
no-export (65535, 65281) or no-advertise (65535, 65282). For
|
no-export (65535, 65281) or no-advertise (65535, 65282). For
|
||||||
|
@ -2063,25 +2107,36 @@ overcome using another routing table and the pipe protocol.
|
||||||
<sect1>Configuration
|
<sect1>Configuration
|
||||||
|
|
||||||
<p><descrip>
|
<p><descrip>
|
||||||
<tag>persist <m/switch/</tag> Tell BIRD to leave all its routes in the
|
<tag>persist <m/switch/</tag>
|
||||||
routing tables when it exits (instead of cleaning them up).
|
Tell BIRD to leave all its routes in the routing tables when it exits
|
||||||
<tag>scan time <m/number/</tag> Time in seconds between two consecutive scans of the
|
(instead of cleaning them up).
|
||||||
kernel routing table.
|
|
||||||
<tag>learn <m/switch/</tag> Enable learning of routes added to the kernel
|
|
||||||
routing tables by other routing daemons or by the system administrator.
|
|
||||||
This is possible only on systems which support identification of route
|
|
||||||
authorship.
|
|
||||||
|
|
||||||
<tag>device routes <m/switch/</tag> Enable export of device
|
<tag>scan time <m/number/</tag>
|
||||||
routes to the kernel routing table. By default, such routes
|
Time in seconds between two consecutive scans of the kernel routing
|
||||||
are rejected (with the exception of explicitly configured
|
table.
|
||||||
device routes from the static protocol) regardless of the
|
|
||||||
export filter to protect device routes in kernel routing table
|
|
||||||
(managed by OS itself) from accidental overwriting or erasing.
|
|
||||||
|
|
||||||
<tag>kernel table <m/number/</tag> Select which kernel table should
|
<tag>learn <m/switch/</tag>
|
||||||
this particular instance of the Kernel protocol work with. Available
|
Enable learning of routes added to the kernel routing tables by other
|
||||||
only on systems supporting multiple routing tables.
|
routing daemons or by the system administrator. This is possible only on
|
||||||
|
systems which support identification of route authorship.
|
||||||
|
|
||||||
|
<tag>device routes <m/switch/</tag>
|
||||||
|
Enable export of device routes to the kernel routing table. By default,
|
||||||
|
such routes are rejected (with the exception of explicitly configured
|
||||||
|
device routes from the static protocol) regardless of the export filter
|
||||||
|
to protect device routes in kernel routing table (managed by OS itself)
|
||||||
|
from accidental overwriting or erasing.
|
||||||
|
|
||||||
|
<tag>kernel table <m/number/</tag>
|
||||||
|
Select which kernel table should this particular instance of the Kernel
|
||||||
|
protocol work with. Available only on systems supporting multiple
|
||||||
|
routing tables.
|
||||||
|
|
||||||
|
<tag>graceful restart <m/switch/</tag>
|
||||||
|
Participate in graceful restart recovery. If this option is enabled and
|
||||||
|
a graceful restart recovery is active, the Kernel protocol will defer
|
||||||
|
synchronization of routing tables until the end of the recovery. Note
|
||||||
|
that import of kernel routes to BIRD is not affected.
|
||||||
</descrip>
|
</descrip>
|
||||||
|
|
||||||
<sect1>Attributes
|
<sect1>Attributes
|
||||||
|
|
154
nest/proto.c
|
@ -51,6 +51,8 @@ static char *c_states[] = { "HUNGRY", "???", "HAPPY", "FLUSHING" };
|
||||||
static void proto_flush_loop(void *);
|
static void proto_flush_loop(void *);
|
||||||
static void proto_shutdown_loop(struct timer *);
|
static void proto_shutdown_loop(struct timer *);
|
||||||
static void proto_rethink_goal(struct proto *p);
|
static void proto_rethink_goal(struct proto *p);
|
||||||
|
static void proto_want_export_up(struct proto *p);
|
||||||
|
static void proto_fell_down(struct proto *p);
|
||||||
static char *proto_state_name(struct proto *p);
|
static char *proto_state_name(struct proto *p);
|
||||||
|
|
||||||
static void
|
static void
|
||||||
|
@ -151,21 +153,20 @@ extern pool *rt_table_pool;
|
||||||
* @t: routing table to connect to
|
* @t: routing table to connect to
|
||||||
* @stats: per-table protocol statistics
|
* @stats: per-table protocol statistics
|
||||||
*
|
*
|
||||||
* This function creates a connection between the protocol instance @p
|
* This function creates a connection between the protocol instance @p and the
|
||||||
* and the routing table @t, making the protocol hear all changes in
|
* routing table @t, making the protocol hear all changes in the table.
|
||||||
* the table.
|
|
||||||
*
|
*
|
||||||
* The announce hook is linked in the protocol ahook list and, if the
|
* The announce hook is linked in the protocol ahook list. Announce hooks are
|
||||||
* protocol accepts routes, also in the table ahook list. Announce
|
* allocated from the routing table resource pool and when protocol accepts
|
||||||
* hooks are allocated from the routing table resource pool, they are
|
* routes also in the table ahook list. They are linked to the table ahook list
|
||||||
* unlinked from the table ahook list after the protocol went down,
|
* and unlinked from it depending on export_state (in proto_want_export_up() and
|
||||||
* (in proto_schedule_flush()) and they are automatically freed after the
|
* proto_want_export_down()) and they are automatically freed after the protocol
|
||||||
* protocol is flushed (in proto_fell_down()).
|
* is flushed (in proto_fell_down()).
|
||||||
*
|
*
|
||||||
* Unless you want to listen to multiple routing tables (as the Pipe
|
* Unless you want to listen to multiple routing tables (as the Pipe protocol
|
||||||
* protocol does), you needn't to worry about this function since the
|
* does), you needn't to worry about this function since the connection to the
|
||||||
* connection to the protocol's primary routing table is initialized
|
* protocol's primary routing table is initialized automatically by the core
|
||||||
* automatically by the core code.
|
* code.
|
||||||
*/
|
*/
|
||||||
struct announce_hook *
|
struct announce_hook *
|
||||||
proto_add_announce_hook(struct proto *p, struct rtable *t, struct proto_stats *stats)
|
proto_add_announce_hook(struct proto *p, struct rtable *t, struct proto_stats *stats)
|
||||||
|
@ -183,7 +184,7 @@ proto_add_announce_hook(struct proto *p, struct rtable *t, struct proto_stats *s
|
||||||
h->next = p->ahooks;
|
h->next = p->ahooks;
|
||||||
p->ahooks = h;
|
p->ahooks = h;
|
||||||
|
|
||||||
if (p->rt_notify && (p->export_state == ES_READY))
|
if (p->rt_notify && (p->export_state != ES_DOWN))
|
||||||
add_tail(&t->hooks, &h->n);
|
add_tail(&t->hooks, &h->n);
|
||||||
return h;
|
return h;
|
||||||
}
|
}
|
||||||
|
@ -659,16 +660,59 @@ proto_rethink_goal(struct proto *p)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* DOC: Graceful restart recovery
|
||||||
|
*
|
||||||
|
* Graceful restart of a router is a process when the routing plane (e.g. BIRD)
|
||||||
|
* restarts but both the forwarding plane (e.g. kernel routing table) and routing
|
||||||
|
* neighbors keep proper routes, and therefore uninterrupted packet forwarding
|
||||||
|
* is maintained.
|
||||||
|
*
|
||||||
|
* BIRD implements graceful restart recovery by deferring export of routes to
|
||||||
|
* protocols until routing tables are refilled with the expected content. After
|
||||||
|
* start, protocols generate routes as usual, but routes are not propagated to
|
||||||
|
* them, until protocols report that they generated all routes. After that,
|
||||||
|
* graceful restart recovery is finished and the export (and the initial feed)
|
||||||
|
* to protocols is enabled.
|
||||||
|
*
|
||||||
|
* When graceful restart recovery need is detected during initialization, then
|
||||||
|
* enabled protocols are marked with @gr_recovery flag before start. Such
|
||||||
|
* protocols then decide how to proceed with graceful restart, participation is
|
||||||
|
* voluntary. Protocols could lock the recovery by proto_graceful_restart_lock()
|
||||||
|
* (stored in @gr_lock flag), which means that they want to postpone the end of
|
||||||
|
* the recovery until they converge and then unlock it. They also could set
|
||||||
|
* @gr_wait before advancing to %PS_UP, which means that the core should defer
|
||||||
|
* route export to that protocol until the end of the recovery. This should be
|
||||||
|
* done by protocols that expect their neighbors to keep the proper routes
|
||||||
|
* (kernel table, BGP sessions with BGP graceful restart capability).
|
||||||
|
*
|
||||||
|
* The graceful restart recovery is finished when either all graceful restart
|
||||||
|
* locks are unlocked or when graceful restart wait timer fires.
|
||||||
|
*
|
||||||
|
*/
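The lock-and-timer logic described in the comment above can be sketched as a small standalone model. This is a minimal illustration under stated assumptions, not BIRD's implementation: `struct gr_core`, `struct gr_proto`, `GRS_DONE` and the functions `gr_lock()`, `gr_unlock()`, `gr_activate()`, `gr_timeout()` and `gr_done()` are hypothetical names chosen for this example, and the wait timer is reduced to an explicit call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the core state; not BIRD's real structures. */
enum gr_state { GRS_NONE, GRS_INIT, GRS_ACTIVE, GRS_DONE };

struct gr_proto {
  bool gr_recovery;     /* protocol takes part in the recovery */
  bool gr_lock;         /* protocol holds a recovery lock */
  bool gr_wait;         /* export to the protocol is deferred */
  bool export_enabled;  /* routes are propagated to the protocol */
};

struct gr_core {
  enum gr_state state;
  int locks;            /* number of locks currently held */
  struct gr_proto *protos;
  int nprotos;
};

/* End of recovery: wake waiting protocols and clear all related state. */
void gr_done(struct gr_core *c)
{
  for (int i = 0; i < c->nprotos; i++) {
    struct gr_proto *p = &c->protos[i];
    if (p->gr_wait)
      p->export_enabled = true;
    p->gr_recovery = p->gr_lock = p->gr_wait = false;
  }
  c->locks = 0;
  c->state = GRS_DONE;
}

/* Only valid in the initial phase, i.e. from a protocol start hook. */
void gr_lock(struct gr_core *c, struct gr_proto *p)
{
  assert(c->state == GRS_INIT && p->gr_recovery);
  if (!p->gr_lock) {
    p->gr_lock = true;
    c->locks++;
  }
}

/* Start the active phase; with no locks there is nothing to wait for. */
void gr_activate(struct gr_core *c)
{
  c->state = GRS_ACTIVE;
  if (c->locks == 0)
    gr_done(c);
}

/* Release a lock; the last unlock in the active phase ends the recovery. */
void gr_unlock(struct gr_core *c, struct gr_proto *p)
{
  if (!p->gr_lock)
    return;
  p->gr_lock = false;
  if (--c->locks == 0 && c->state == GRS_ACTIVE)
    gr_done(c);
}

/* Wait timer hook: finish even though some locks may remain. */
void gr_timeout(struct gr_core *c)
{
  if (c->state == GRS_ACTIVE)
    gr_done(c);
}
```

In this model a protocol that sets `gr_wait` only sees exports once every lock holder has unlocked or the timeout has fired, which matches the two termination conditions named in the comment.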
|
||||||
|
|
||||||
static void graceful_restart_done(struct timer *t UNUSED);
|
static void graceful_restart_done(struct timer *t);
|
||||||
static void proto_want_export_up(struct proto *p);
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* graceful_restart_recovery - request initial graceful restart recovery
|
||||||
|
*
|
||||||
|
* Called by the platform initialization code if the need for recovery
|
||||||
|
* after graceful restart is detected during boot. Has to be called
|
||||||
|
* before protos_commit().
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
graceful_restart_recovery(void)
|
graceful_restart_recovery(void)
|
||||||
{
|
{
|
||||||
graceful_restart_state = GRS_INIT;
|
graceful_restart_state = GRS_INIT;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* graceful_restart_init - initialize graceful restart
|
||||||
|
*
|
||||||
|
* When graceful restart recovery was requested, the function starts an active
|
||||||
|
* phase of the recovery and initializes graceful restart wait timer. The
|
||||||
|
* function has to be called after protos_commit().
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
graceful_restart_init(void)
|
graceful_restart_init(void)
|
||||||
{
|
{
|
||||||
|
@ -689,6 +733,15 @@ graceful_restart_init(void)
|
||||||
tm_start(gr_wait_timer, config->gr_wait);
|
tm_start(gr_wait_timer, config->gr_wait);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* graceful_restart_done - finalize graceful restart
|
||||||
|
*
|
||||||
|
* When there are no locks on graceful restart, the function finalizes the
|
||||||
|
* graceful restart recovery. Protocols postponing route export until the end of
|
||||||
|
* the recovery are awakened and the export to them is enabled. All other
|
||||||
|
* related state is cleared. The function is also called when the graceful
|
||||||
|
* restart wait timer fires (but there are still some locks).
|
||||||
|
*/
|
||||||
static void
|
static void
|
||||||
graceful_restart_done(struct timer *t UNUSED)
|
graceful_restart_done(struct timer *t UNUSED)
|
||||||
{
|
{
|
||||||
|
@ -727,7 +780,19 @@ graceful_restart_show_status(void)
|
||||||
cli_msg(-24, " Wait timer is %d/%d", tm_remains(gr_wait_timer), config->gr_wait);
|
cli_msg(-24, " Wait timer is %d/%d", tm_remains(gr_wait_timer), config->gr_wait);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Just from start hook */
|
/**
|
||||||
|
* proto_graceful_restart_lock - lock graceful restart by protocol
|
||||||
|
* @p: protocol instance
|
||||||
|
*
|
||||||
|
* This function allows a protocol to postpone the end of graceful restart
|
||||||
|
* recovery until it converges. The lock is removed when the protocol calls
|
||||||
|
* proto_graceful_restart_unlock() or when the protocol is stopped.
|
||||||
|
*
|
||||||
|
* The function has to be called during the initial phase of graceful restart
|
||||||
|
* recovery and only for protocols that are part of graceful restart (i.e. their
|
||||||
|
* @gr_recovery is set), which means it should be called from protocol start
|
||||||
|
* hooks.
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
proto_graceful_restart_lock(struct proto *p)
|
proto_graceful_restart_lock(struct proto *p)
|
||||||
{
|
{
|
||||||
|
@ -741,6 +806,13 @@ proto_graceful_restart_lock(struct proto *p)
|
||||||
graceful_restart_locks++;
|
graceful_restart_locks++;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* proto_graceful_restart_unlock - unlock graceful restart by protocol
|
||||||
|
* @p: protocol instance
|
||||||
|
*
|
||||||
|
* This function unlocks a lock from proto_graceful_restart_lock(). It is also
|
||||||
|
* automatically called when the lock-holding protocol goes down.
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
proto_graceful_restart_unlock(struct proto *p)
|
proto_graceful_restart_unlock(struct proto *p)
|
||||||
{
|
{
|
||||||
|
@ -867,29 +939,6 @@ protos_build(void)
|
||||||
proto_flush_event->hook = proto_flush_loop;
|
proto_flush_event->hook = proto_flush_loop;
|
||||||
proto_shutdown_timer = tm_new(proto_pool);
|
proto_shutdown_timer = tm_new(proto_pool);
|
||||||
proto_shutdown_timer->hook = proto_shutdown_loop;
|
proto_shutdown_timer->hook = proto_shutdown_loop;
|
||||||
proto_shutdown_timer = tm_new(proto_pool);
|
|
||||||
proto_shutdown_timer->hook = proto_shutdown_loop;
|
|
||||||
}
|
|
||||||
|
|
||||||
static void
|
|
||||||
proto_fell_down(struct proto *p)
|
|
||||||
{
|
|
||||||
DBG("Protocol %s down\n", p->name);
|
|
||||||
|
|
||||||
u32 all_routes = p->stats.imp_routes + p->stats.filt_routes;
|
|
||||||
if (all_routes != 0)
|
|
||||||
log(L_ERR "Protocol %s is down but still has %d routes", p->name, all_routes);
|
|
||||||
|
|
||||||
bzero(&p->stats, sizeof(struct proto_stats));
|
|
||||||
proto_free_ahooks(p);
|
|
||||||
|
|
||||||
if (! p->proto->multitable)
|
|
||||||
rt_unlock_table(p->table);
|
|
||||||
|
|
||||||
if (p->proto->cleanup)
|
|
||||||
p->proto->cleanup(p);
|
|
||||||
|
|
||||||
proto_rethink_goal(p);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
static void
|
static void
|
||||||
|
@ -1066,6 +1115,10 @@ proto_request_feeding(struct proto *p)
|
||||||
{
|
{
|
||||||
ASSERT(p->proto_state == PS_UP);
|
ASSERT(p->proto_state == PS_UP);
|
||||||
|
|
||||||
|
/* Do nothing if we are still waiting for feeding */
|
||||||
|
if (p->export_state == ES_DOWN)
|
||||||
|
return;
|
||||||
|
|
||||||
/* If we are already feeding, we want to restart it */
|
/* If we are already feeding, we want to restart it */
|
||||||
if (p->export_state == ES_FEEDING)
|
if (p->export_state == ES_FEEDING)
|
||||||
{
|
{
|
||||||
|
@ -1220,6 +1273,27 @@ proto_falling_down(struct proto *p)
|
||||||
proto_graceful_restart_unlock(p);
|
proto_graceful_restart_unlock(p);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static void
|
||||||
|
proto_fell_down(struct proto *p)
|
||||||
|
{
|
||||||
|
DBG("Protocol %s down\n", p->name);
|
||||||
|
|
||||||
|
u32 all_routes = p->stats.imp_routes + p->stats.filt_routes;
|
||||||
|
if (all_routes != 0)
|
||||||
|
log(L_ERR "Protocol %s is down but still has %d routes", p->name, all_routes);
|
||||||
|
|
||||||
|
bzero(&p->stats, sizeof(struct proto_stats));
|
||||||
|
proto_free_ahooks(p);
|
||||||
|
|
||||||
|
if (! p->proto->multitable)
|
||||||
|
rt_unlock_table(p->table);
|
||||||
|
|
||||||
|
if (p->proto->cleanup)
|
||||||
|
p->proto->cleanup(p);
|
||||||
|
|
||||||
|
proto_rethink_goal(p);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* proto_notify_state - notify core about protocol state change
|
* proto_notify_state - notify core about protocol state change
|
||||||
|
|
|
@ -1110,6 +1110,21 @@ rt_examine(rtable *t, ip_addr prefix, int pxlen, struct proto *p, struct filter
|
||||||
return v > 0;
|
return v > 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* rt_refresh_begin - start a refresh cycle
|
||||||
|
* @t: related routing table
|
||||||
|
* @ah: related announce hook
|
||||||
|
*
|
||||||
|
* This function starts a refresh cycle for given routing table and announce
|
||||||
|
* hook. The refresh cycle is a sequence where the protocol sends all its valid
|
||||||
|
* routes to the routing table (by rte_update()). After that, all protocol
|
||||||
|
* routes (more precisely routes with @ah as @sender) not sent during the
|
||||||
|
* refresh cycle but still in the table from the past are pruned. This is
|
||||||
|
* implemented by marking all related routes as stale by REF_STALE flag in
|
||||||
|
* rt_refresh_begin(), then marking all related stale routes with REF_DISCARD
|
||||||
|
* flag in rt_refresh_end() and then removing such routes in the prune loop.
|
||||||
|
*/
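The refresh cycle described above (mark stale, re-announce, mark discard, prune) can be sketched on a flat array of routes. This is a simplified model under stated assumptions: `struct route`, the integer `sender` field, the flag values, and the functions `refresh_begin()`, `route_refresh()`, `refresh_end()` and `prune()` are hypothetical stand-ins, and re-announcement only touches routes already in the table rather than going through rte_update().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; only the names follow the comment above. */
#define REF_STALE   0x1u
#define REF_DISCARD 0x2u

struct route {
  int sender;       /* stand-in for the announce hook that sent the route */
  int prefix;       /* stand-in for the network the route belongs to */
  unsigned flags;
};

/* Begin of the cycle: every route of the sender becomes stale. */
void refresh_begin(struct route *tab, size_t n, int sender)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].sender == sender)
      tab[i].flags |= REF_STALE;
}

/* Re-announcement of an existing route clears its stale mark.
 * Returns 1 if the route was found, 0 otherwise. */
int route_refresh(struct route *tab, size_t n, int sender, int prefix)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].sender == sender && tab[i].prefix == prefix) {
      tab[i].flags &= ~REF_STALE;
      return 1;
    }
  return 0;
}

/* End of the cycle: routes still stale are marked for removal. */
void refresh_end(struct route *tab, size_t n, int sender)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].sender == sender && (tab[i].flags & REF_STALE))
      tab[i].flags |= REF_DISCARD;
}

/* Prune step: compact the array, dropping discarded routes.
 * Returns the new number of routes. */
size_t prune(struct route *tab, size_t n)
{
  assert(tab != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (!(tab[i].flags & REF_DISCARD))
      tab[kept++] = tab[i];
  return kept;
}
```

Splitting the work into a marking phase and a separate prune pass means routes from other senders are never touched, and a route re-announced during the cycle survives without being withdrawn and re-added.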
|
||||||
void
|
void
|
||||||
rt_refresh_begin(rtable *t, struct announce_hook *ah)
|
rt_refresh_begin(rtable *t, struct announce_hook *ah)
|
||||||
{
|
{
|
||||||
|
@ -1126,6 +1141,14 @@ rt_refresh_begin(rtable *t, struct announce_hook *ah)
|
||||||
FIB_WALK_END;
|
FIB_WALK_END;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* rt_refresh_end - end a refresh cycle
|
||||||
|
* @t: related routing table
|
||||||
|
* @ah: related announce hook
|
||||||
|
*
|
||||||
|
* This function ends a refresh cycle for given routing table and announce
|
||||||
|
* hook. See rt_refresh_begin() for description of refresh cycles.
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
rt_refresh_end(rtable *t, struct announce_hook *ah)
|
rt_refresh_end(rtable *t, struct announce_hook *ah)
|
||||||
{
|
{
|
||||||
|
@ -1405,6 +1428,19 @@ again:
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* rt_prune_table - prune a routing table
|
||||||
|
*
|
||||||
|
* This function scans the routing table @tab and removes routes belonging to
|
||||||
|
* flushing protocols, discarded routes and also stale network entries, in a
|
||||||
|
* similar fashion to rt_prune_loop(). Returns 1 when all such routes are
|
||||||
|
* pruned. Contrary to rt_prune_loop(), this function is not a part of the
|
||||||
|
* protocol flushing loop, but it is called from rt_event() for just one routing
|
||||||
|
* table.
|
||||||
|
*
|
||||||
|
* Note that rt_prune_table() and rt_prune_loop() share (for each table) the
|
||||||
|
* prune state (@prune_state) and also the pruning iterator (@prune_fit).
|
||||||
|
*/
|
||||||
static inline int
|
static inline int
|
||||||
rt_prune_table(rtable *tab)
|
rt_prune_table(rtable *tab)
|
||||||
{
|
{
|
||||||
|
@ -1415,16 +1451,15 @@ rt_prune_table(rtable *tab)
|
||||||
/**
|
/**
|
||||||
* rt_prune_loop - prune routing tables
|
* rt_prune_loop - prune routing tables
|
||||||
*
|
*
|
||||||
* The prune loop scans routing tables and removes routes belonging to
|
* The prune loop scans routing tables and removes routes belonging to flushing
|
||||||
* flushing protocols and also stale network entries. Returns 1 when
|
* protocols, discarded routes and also stale network entries. Returns 1 when
|
||||||
* all such routes are pruned. It is a part of the protocol flushing
|
* all such routes are pruned. It is a part of the protocol flushing loop.
|
||||||
* loop.
|
|
||||||
*
|
*
|
||||||
* The prune loop runs in two steps. In the first step it prunes just
|
* The prune loop runs in two steps. In the first step it prunes just the routes
|
||||||
* the routes with flushing senders (in explicitly marked tables) so
|
* with flushing senders (in explicitly marked tables) so the route removal is
|
||||||
* the route removal is propagated as usual. In the second step, all
|
* propagated as usual. In the second step, all remaining relevant routes are
|
||||||
* remaining relevant routes are removed. Ideally, there shouldn't be
|
* removed. Ideally, there shouldn't be any, but it happens when pipe filters
|
||||||
* any, but it happens when pipe filters are changed.
|
* are changed.
|
||||||
*/
|
*/
|
||||||
int
|
int
|
||||||
rt_prune_loop(void)
|
rt_prune_loop(void)
|
||||||
|
|
|
@ -51,6 +51,16 @@
|
||||||
* and bgp_encode_attrs() which does the converse. Both functions are built around a
|
* and bgp_encode_attrs() which does the converse. Both functions are built around a
|
||||||
* @bgp_attr_table array describing all important characteristics of all known attributes.
|
* @bgp_attr_table array describing all important characteristics of all known attributes.
|
||||||
* Unknown transitive attributes are attached to the route as %EAF_TYPE_OPAQUE byte streams.
|
* Unknown transitive attributes are attached to the route as %EAF_TYPE_OPAQUE byte streams.
|
||||||
|
*
|
||||||
|
* BGP protocol implements graceful restart in both restarting (local restart)
|
||||||
|
* and receiving (neighbor restart) roles. The first is handled mostly by the
|
||||||
|
* graceful restart code in the nest, BGP protocol just handles capabilities,
|
||||||
|
* sets @gr_wait and locks graceful restart until end-of-RIB mark is received.
|
||||||
|
* The second is implemented by internal restart of the BGP state to %BS_IDLE
|
||||||
|
* and protocol state to %PS_START, but keeping the protocol up from the core
|
||||||
|
* point of view and therefore maintaining received routes. Routing table
|
||||||
|
* refresh cycle (rt_refresh_begin(), rt_refresh_end()) is used for removing
|
||||||
|
* stale routes after reestablishment of BGP session during graceful restart.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#undef LOCAL_DEBUG
|
#undef LOCAL_DEBUG
|
||||||
|
@ -431,6 +441,17 @@ bgp_conn_enter_idle_state(struct bgp_conn *conn)
|
||||||
bgp_conn_leave_established_state(p);
|
bgp_conn_leave_established_state(p);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* bgp_handle_graceful_restart - handle detected BGP graceful restart
|
||||||
|
* @p: BGP instance
|
||||||
|
*
|
||||||
|
* This function is called when a BGP graceful restart of the neighbor is
|
||||||
|
* detected (when the TCP connection fails or when a new TCP connection
|
||||||
|
* appears). The function activates processing of the restart - starts routing
|
||||||
|
* table refresh cycle and activates BGP restart timer. The protocol state goes
|
||||||
|
* back to %PS_START, but changing BGP state back to %BS_IDLE is left for the
|
||||||
|
* caller.
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
bgp_handle_graceful_restart(struct bgp_proto *p)
|
bgp_handle_graceful_restart(struct bgp_proto *p)
|
||||||
{
|
{
|
||||||
|
@ -448,6 +469,16 @@ bgp_handle_graceful_restart(struct bgp_proto *p)
|
||||||
rt_refresh_begin(p->p.main_ahook->table, p->p.main_ahook);
|
rt_refresh_begin(p->p.main_ahook->table, p->p.main_ahook);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* bgp_graceful_restart_done - finish active BGP graceful restart
|
||||||
|
* @p: BGP instance
|
||||||
|
*
|
||||||
|
* This function is called when the active BGP graceful restart of the neighbor
|
||||||
|
* should be finished - either successfully (the neighbor sends all paths and
|
||||||
|
* reports end-of-RIB on the new session) or unsuccessfully (the neighbor does
|
||||||
|
* not support BGP graceful restart on the new session). The function ends
|
||||||
|
* routing table refresh cycle and stops BGP restart timer.
|
||||||
|
*/
|
||||||
void
|
void
|
||||||
bgp_graceful_restart_done(struct bgp_proto *p)
|
bgp_graceful_restart_done(struct bgp_proto *p)
|
||||||
{
|
{
|
||||||
|
@ -457,6 +488,15 @@ bgp_graceful_restart_done(struct bgp_proto *p)
|
||||||
rt_refresh_end(p->p.main_ahook->table, p->p.main_ahook);
|
rt_refresh_end(p->p.main_ahook->table, p->p.main_ahook);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* bgp_graceful_restart_timeout - timeout of graceful restart 'restart timer'
|
||||||
|
* @t: timer
|
||||||
|
*
|
||||||
|
* This function is a timeout hook for @gr_timer, implementing BGP restart time
|
||||||
|
* limit for reestablishment of the BGP session after the graceful restart. When
|
||||||
|
* fired, we just proceed with the usual protocol restart.
|
||||||
|
*/
|
||||||
|
|
||||||
static void
|
static void
|
||||||
bgp_graceful_restart_timeout(timer *t)
|
bgp_graceful_restart_timeout(timer *t)
|
||||||
{
|
{
|
||||||
|
@ -968,7 +1008,7 @@ bgp_start(struct proto *P)
|
||||||
p->remote_id = 0;
|
p->remote_id = 0;
|
||||||
p->source_addr = p->cf->source_addr;
|
p->source_addr = p->cf->source_addr;
|
||||||
|
|
||||||
if (P->gr_recovery)
|
if (p->p.gr_recovery && p->cf->gr_mode)
|
||||||
proto_graceful_restart_lock(P);
|
proto_graceful_restart_lock(P);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
|