[Bug 1778] New: Skipping garbage collection in nf_conncount.c stops working when jiffies wrap around
bugzilla-daemon at netfilter.org
Thu Nov 7 12:43:50 CET 2024
https://bugzilla.netfilter.org/show_bug.cgi?id=1778
Bug ID: 1778
Summary: Skipping garbage collection in nf_conncount.c stops
working when jiffies wrap around
Product: netfilter/iptables
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5
Component: nf_conntrack
Assignee: netfilter-buglog at lists.netfilter.org
Reporter: njensen at akamai.com
This previous patch skips garbage collection for nf_conncount if we already ran
garbage collection in the same jiffy:
https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee
In our testing, this patch stops working once the low 32 bits of jiffies wrap
around. This happens after the kernel has run for 5 minutes, since
INITIAL_JIFFIES is set to -300*HZ. When it happens, we observe a massive
slowdown for ct counts.
To reproduce, add a simple ruleset that uses ct count, such as:
table inet filter {
	chain input {
		type filter hook input priority 0;
		ct state { established, related } accept
		reject
	}
	chain OUTPUT {
		type filter hook output priority 0;
		ct count over 100000 drop
		accept
	}
}
To show the effect more clearly, I added a debugging print to the kernel:
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 4890af4dc263..b39fb3c10c06 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -134,6 +134,8 @@ static int __nf_conncount_add(struct net *net,
 	if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
 		goto add_new_node;
+	if ((u32)jiffies == list->last_gc)
+		printk(KERN_INFO "Already did GC this jiffy, but not skipping. (u32)jiffies=%d, (unsigned long)list->last_gc=%lu, jiffies=%lu", (u32)jiffies, (unsigned long)list->last_gc, jiffies);
 	/* check the saved connections */
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
After the kernel has run for 5 minutes we see the following logged when quickly
sending a few SYNs:
Already did GC this jiffy, but not skipping. (u32)jiffies=2541, (unsigned long)list->last_gc=2541, jiffies=4294969837
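The problem seems to be that last_gc in struct nf_conncount_list is a u32,
while jiffies is an unsigned long, which is 8 bytes on my systems. The check
casts last_gc to unsigned long, which zero-extends it, so the comparison only
works while jiffies still fits in 32 bits. Once jiffies passes 2^32, the stored
last_gc always looks far in the past and garbage collection is never skipped.
The problematic check is here: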
https://github.com/torvalds/linux/blob/master/net/netfilter/nf_conncount.c#L135.
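The failure can be illustrated outside the kernel with a small userspace model.
This is only a sketch: it assumes a 64-bit unsigned long (modelled here as
uint64_t), and the helper names model_time_after_eq() and model_gc_skipped()
are made up for this example and do not exist in the kernel.

```c
#include <stdint.h>

/*
 * Userspace model of the kernel's time_after_eq(a, b) for 64-bit jiffies:
 * true when a is at or after b, decided by the sign of the difference.
 */
static inline int model_time_after_eq(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) >= 0;
}

/*
 * Hypothetical model of the check in __nf_conncount_add(): last_gc is
 * stored as a 32-bit value and zero-extended before being compared with
 * the full 64-bit jiffies counter.
 */
static inline int model_gc_skipped(uint32_t last_gc, uint64_t jiffies)
{
	return model_time_after_eq((uint64_t)last_gc, jiffies);
}
```

With the values from the log above, model_gc_skipped(2541, 4294969837) returns
0, so GC is not skipped even though the low 32 bits of jiffies equal last_gc.
While jiffies is still below 2^32, a matching last_gc makes it return 1.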
One fix could be to check whether last_gc matches the current (u32)jiffies, as
below:
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -132,7 +132,7 @@
 	struct nf_conn *found_ct;
 	unsigned int collect = 0;
-	if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
+	if ((u32)jiffies == list->last_gc)
 		goto add_new_node;
 	/* check the saved connections */
@@ -234,7 +234,7 @@
 	bool ret = false;
 	/* don't bother if we just did GC */
-	if (time_is_after_eq_jiffies((unsigned long)READ_ONCE(list->last_gc)))
+	if ((u32)jiffies == READ_ONCE(list->last_gc))
 		return false;
 	/* don't bother if other cpu is already doing GC */
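The proposed equality check can be modelled in userspace in the same way. This
is a sketch under the same assumption of a 64-bit jiffies counter (uint64_t
here); model_store_last_gc() and model_gc_skipped_fixed() are hypothetical
names used only for illustration.

```c
#include <stdint.h>

/* Model of storing the current jiffies into the 32-bit last_gc field. */
static inline uint32_t model_store_last_gc(uint64_t jiffies)
{
	return (uint32_t)jiffies;
}

/*
 * Model of the proposed check: truncate jiffies to 32 bits and compare
 * for equality, so both sides of the comparison have the same width.
 */
static inline int model_gc_skipped_fixed(uint32_t last_gc, uint64_t jiffies)
{
	return (uint32_t)jiffies == last_gc;
}
```

With the logged values, model_gc_skipped_fixed(2541, 4294969837) returns 1, so
GC is skipped within the same jiffy, and it returns 0 again one jiffy later.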