[Bug 1778] New: Skipping garbage collection in nf_conncount.c stops working when jiffies wrap around

Thu Nov 7 12:43:50 CET 2024

https://bugzilla.netfilter.org/show_bug.cgi?id=1778

            Bug ID: 1778
           Summary: Skipping garbage collection in nf_conncount.c stops
                    working when jiffies wrap around
           Product: netfilter/iptables
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: nf_conntrack
          Assignee: netfilter-buglog at lists.netfilter.org
          Reporter: njensen at akamai.com

This previous patch skips garbage collection for nf_conncount if we already ran
garbage collection in the same jiffy:
https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee

In our testing this patch stops working once the low 32 bits of jiffies wrap
around. This happens after the kernel has been running for 5 minutes, because
INITIAL_JIFFIES is set to -300*HZ. When this happens we observe a massive
slowdown for rules that use ct count.
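
As a rough illustration of the 5 minute figure, the userspace sketch below
mirrors the kernel's INITIAL_JIFFIES definition with unsigned long modelled as
uint64_t. DEMO_HZ and the demo_* helpers are hypothetical names used only for
this example, and HZ=1000 is an assumed configuration value.

```c
#include <stdint.h>

/* Assumption: HZ is a build-time constant; 1000 is only an example value. */
#define DEMO_HZ 1000

/*
 * Mirrors ((unsigned long)(unsigned int)(-300*HZ)) from
 * include/linux/jiffies.h, with unsigned long modelled as uint64_t.
 */
static uint64_t demo_initial_jiffies(void)
{
        return (uint64_t)(uint32_t)(-300 * DEMO_HZ);
}

/* Number of ticks until the low 32 bits of jiffies roll over to zero. */
static uint64_t demo_ticks_until_u32_wrap(void)
{
        return (UINT64_C(1) << 32) - demo_initial_jiffies();
}
```

Dividing demo_ticks_until_u32_wrap() by DEMO_HZ gives the number of seconds of
uptime before a u32 copy of jiffies wraps.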

To reproduce, add a simple ruleset that uses ct count, such as:
table inet filter {
        chain input {
                type filter hook input priority 0;
                ct state { established, related } accept
                reject
        }

        chain OUTPUT {
                type filter hook output priority 0;
                ct count over 100000 drop
                accept
        }
}

To better show the effect I have modified the kernel like below with a
debugging print:

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 4890af4dc263..b39fb3c10c06 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -134,6 +134,8 @@ static int __nf_conncount_add(struct net *net,

        if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
                goto add_new_node;
+       if ((u32)jiffies == list->last_gc)
+               printk(KERN_INFO "Already did GC this jiffy, but not skipping. (u32)jiffies=%u, (unsigned long)list->last_gc=%lu, jiffies=%lu\n", (u32)jiffies, (unsigned long)list->last_gc, jiffies);

        /* check the saved connections */
        list_for_each_entry_safe(conn, conn_n, &list->head, node) {

After the kernel has run for 5 minutes we see the following logged when quickly
sending a few SYNs:
Already did GC this jiffy, but not skipping. (u32)jiffies=2541, (unsigned long)list->last_gc=2541, jiffies=4294969837

The problem seems to be that last_gc in the nf_conncount_list struct is a u32,
but jiffies is an unsigned long, which is 8 bytes on my systems. Comparing the
two only works while jiffies still fits in 32 bits. Once it no longer does, the
zero-extended last_gc always looks like a time in the past, so the check never
skips garbage collection. The problematic check is here:
https://github.com/torvalds/linux/blob/master/net/netfilter/nf_conncount.c#L135
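
A minimal userspace sketch of that comparison, assuming a 64-bit unsigned long
modelled as uint64_t. The demo_* names are hypothetical. The arithmetic follows
the time_after_eq() and time_is_after_eq_jiffies() macros in
include/linux/jiffies.h, with jiffies passed as a parameter rather than read
from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models time_after_eq(a, b): true if a is at or after b. */
static bool demo_time_after_eq(uint64_t a, uint64_t b)
{
        return (int64_t)(a - b) >= 0;
}

/*
 * Models time_is_after_eq_jiffies((unsigned long)last_gc).
 * The u32 last_gc is zero-extended to 64 bits before the comparison.
 */
static bool demo_skip_gc_current(uint32_t last_gc, uint64_t jiffies)
{
        return demo_time_after_eq((uint64_t)last_gc, jiffies);
}
```

With the values from the log above, demo_skip_gc_current(2541, 4294969837)
returns false even though GC already ran in that jiffy. Before the low 32 bits
wrap, for example with last_gc and jiffies both equal to 4294667396, it returns
true.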

One fix could be to check whether last_gc matches the current (u32)jiffies, as
below:
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -132,7 +132,7 @@
        struct nf_conn *found_ct;
        unsigned int collect = 0;

-       if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
+       if ((u32)jiffies == list->last_gc)
                goto add_new_node;

        /* check the saved connections */
@@ -234,7 +234,7 @@
        bool ret = false;

        /* don't bother if we just did GC */
-       if (time_is_after_eq_jiffies((unsigned long)READ_ONCE(list->last_gc)))
+       if ((u32)jiffies == READ_ONCE(list->last_gc))
                return false;

        /* don't bother if other cpu is already doing GC */
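
The proposed check can be tried out in userspace in the same way. This is only
a sketch, assuming a 64-bit unsigned long modelled as uint64_t.
demo_skip_gc_proposed is a hypothetical name, and jiffies is passed in as a
parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the proposed check, (u32)jiffies == list->last_gc:
 * jiffies is truncated to its low 32 bits before comparing.
 */
static bool demo_skip_gc_proposed(uint32_t last_gc, uint64_t jiffies)
{
        return (uint32_t)jiffies == last_gc;
}
```

With the logged values, demo_skip_gc_proposed(2541, 4294969837) returns true, so
GC would be skipped within the same jiffy, and it returns false again one tick
later.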
