<html>
    <head>
      <base href="https://bugzilla.netfilter.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Skipping garbage collection in nf_conncount.c stops working when jiffies wrap around"
   href="https://bugzilla.netfilter.org/show_bug.cgi?id=1778">1778</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Skipping garbage collection in nf_conncount.c stops working when jiffies wrap around
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>netfilter/iptables
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P5
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>nf_conntrack
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>netfilter-buglog@lists.netfilter.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>njensen@akamai.com
          </td>
        </tr></table>
      <p>
        <div>
<pre>An earlier patch makes nf_conncount skip garbage collection if it already ran
in the same jiffy:
<a href="https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee">https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee</a>

In our testing, this optimization stops working once the low 32 bits of jiffies
wrap around. That happens after the kernel has been up for 5 minutes, because
INITIAL_JIFFIES is set to -300*HZ. From that point on we observed a massive
slowdown for ct count rules.
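
As a rough check of the 5-minute figure, the sketch below redoes the arithmetic
in userspace C. It assumes the INITIAL_JIFFIES definition from
include/linux/jiffies.h, a 32-bit unsigned int, and a 64-bit jiffies counter.
The helper names are made up for illustration and are not kernel functions.

```c
typedef unsigned int u32;        /* assumed to be 32 bits wide */
typedef unsigned long long u64;  /* stands in for a 64-bit unsigned long */

/*
 * Boot-time value of jiffies, following
 * INITIAL_JIFFIES = (unsigned long)(unsigned int)(-300*HZ).
 * Hypothetical helper, for illustration only.
 */
static u64 initial_jiffies(int hz)
{
        return (u64)(u32)(-300 * hz);
}

/* Seconds of uptime until the low 32 bits of jiffies wrap to zero. */
static u64 seconds_until_low32_wrap(int hz)
{
        u64 wrap = 0x100000000ULL;  /* 2^32 */

        return (wrap - initial_jiffies(hz)) / (u64)hz;
}
```

With hz=1000, initial_jiffies() is 2^32 - 300000, so the wrap comes 300 seconds
after boot. The result is the same for any HZ, since both terms scale with it.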

To reproduce, load a simple ruleset that uses ct count, such as:
table inet filter {
        chain input {
                type filter hook input priority 0;
                ct state { established, related } accept
                reject
        }

        chain OUTPUT {
                type filter hook output priority 0;
                ct count over 100000 drop
                accept
        }
}

To make the effect visible, I added a debugging print to the kernel:
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 4890af4dc263..b39fb3c10c06 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -134,6 +134,8 @@ static int __nf_conncount_add(struct net *net,

        if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
                goto add_new_node;
+       if ((u32)jiffies == list->last_gc)
+         printk(KERN_INFO "Already did GC this jiffy, but not skipping. (u32)jiffies=%u, (unsigned long)list->last_gc=%lu, jiffies=%lu\n", (u32)jiffies, (unsigned long)list->last_gc, jiffies);

        /* check the saved connections */
        list_for_each_entry_safe(conn, conn_n, &list->head, node) {

After the kernel has been up for 5 minutes, the following is logged when quickly
sending a few SYNs:
Already did GC this jiffy, but not skipping. (u32)jiffies=2541, (unsigned long)list->last_gc=2541, jiffies=4294969837

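The problem seems to be that last_gc in struct nf_conncount_list is a u32, while
jiffies is an unsigned long, which is 8 bytes on my systems. The cast to
unsigned long zero-extends last_gc, so once jiffies exceeds 32 bits it is always
larger than last_gc and the check never skips GC again. In the log above,
jiffies is 2^32 + 2541: its low 32 bits equal last_gc, but the full values
differ. The comparison therefore only works until the low 32 bits of jiffies
wrap around. The problematic check is here:
<a href="https://github.com/torvalds/linux/blob/master/net/netfilter/nf_conncount.c#L135">https://github.com/torvalds/linux/blob/master/net/netfilter/nf_conncount.c#L135</a>.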
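
The failing comparison can be reproduced outside the kernel. The sketch below
is a userspace approximation, not the kernel code: it copies the arithmetic of
time_after_eq() from include/linux/jiffies.h onto fixed 64-bit types, and the
function names are hypothetical. It assumes a 32-bit unsigned int.

```c
typedef unsigned int u32;        /* assumed to be 32 bits wide */
typedef unsigned long long u64;  /* stands in for a 64-bit unsigned long */

/* Same arithmetic as time_after_eq(a, b): true if a is at or after b. */
static int time_after_eq_sketch(u64 a, u64 b)
{
        return (long long)(a - b) >= 0;
}

/*
 * Mirrors time_is_after_eq_jiffies((unsigned long)list->last_gc),
 * i.e. time_after_eq(last_gc, jiffies) with last_gc zero-extended.
 * A nonzero result means GC would be skipped.
 */
static int skip_gc_current(u64 jiffies, u32 last_gc)
{
        return time_after_eq_sketch((u64)last_gc, jiffies);
}
```

Called with the logged values, skip_gc_current(4294969837ULL, 2541) returns 0,
so GC is not skipped even though it already ran in this jiffy. Before the wrap,
for example skip_gc_current(4294667396ULL, 4294667396U), it returns 1.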

One possible fix is to check whether last_gc equals the current (u32)jiffies, as
below:
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -132,7 +132,7 @@
        struct nf_conn *found_ct;
        unsigned int collect = 0;

-       if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
+       if ((u32)jiffies == list->last_gc)
                goto add_new_node;

        /* check the saved connections */
@@ -234,7 +234,7 @@
        bool ret = false;

        /* don't bother if we just did GC */
-       if (time_is_after_eq_jiffies((unsigned long)READ_ONCE(list->last_gc)))
+       if ((u32)jiffies == READ_ONCE(list->last_gc))
                return false;

        /* don't bother if other cpu is already doing GC */</pre>
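<pre>
The replacement check in the patch above can be tried the same way in
userspace. This is only a sketch under the same assumptions (32-bit unsigned
int, 64-bit jiffies); skip_gc_proposed is a hypothetical name, not a kernel
function.

```c
typedef unsigned int u32;        /* assumed to be 32 bits wide */
typedef unsigned long long u64;  /* stands in for a 64-bit unsigned long */

/*
 * Mirrors the proposed check (u32)jiffies == list->last_gc.
 * A nonzero result means GC would be skipped.
 */
static int skip_gc_proposed(u64 jiffies, u32 last_gc)
{
        return (u32)jiffies == last_gc;
}
```

With the logged values, skip_gc_proposed(4294969837ULL, 2541) returns 1, so GC
is skipped within the same jiffy, and it returns 0 again one jiffy later. Note
that the truncated value also matches exactly 2^32 jiffies later; whether that
matters in practice is not evaluated here.</pre>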
        </div>
      </p>


    </body>
</html>