[Bug 1082] New: Hard lockup when inserting nft rules (esp. ct rule)

bugzilla-daemon at netfilter.org
Wed Aug 17 19:52:13 CEST 2016


            Bug ID: 1082
           Summary: Hard lockup when inserting nft rules (esp. ct rule)
           Product: nftables
           Version: unspecified
          Hardware: x86_64
                OS: Debian GNU/Linux
            Status: NEW
          Severity: blocker
          Priority: P5
         Component: kernel
          Assignee: pablo at netfilter.org
          Reporter: larkwang at gmail.com

We are switching from OpenVPN to strongSwan (IPsec) for the VPN links between
our branch offices and headquarters.

We use nftables for better performance and a cleaner ruleset. The ruleset is:

#!/usr/sbin/nft -f

flush ruleset

table inet filter {
        set allowed_addr {
                type ipv4_addr
                elements = { <about 40+ IPs> }
        }

        set allowed_port {
                type inet_service
                elements = { 80,443,<other about 10 ports> }
        }

        chain forward {
                type filter hook forward priority 0;
                ip saddr { 10.xx.210.0-10.xx.217.255, 10.xx.0.12 } ip daddr 10.xx.0.0/16 counter accept
                ip saddr 10.xx.0.0/16 ip daddr @allowed_addr tcp dport @allowed_port counter accept
                ip saddr 10.xx.0.0/16 ip daddr { 10.xx.254.0/24, 10.xx.yy.zz } counter accept
                ip saddr 10.xx.0.0/16 ip daddr ip protocol tcp ct state invalid,new counter reject
        }
}

The VPN server (Debian jessie with backports) runs these package versions:

linux-image  4.6.4-1~bpo8+1 (also 4.5.5-1)
nftables     0.6-1~bpo8+1
libnftnl4    1.0.6-1~bpo8+1
libmnl0      1.0.3-5

The ruleset loaded without problems before we began migrating the VPN links.
After migrating all links, we wanted to update the ruleset to allow one more
IP address. But loading the modified ruleset caused an immediate hard lockup
of the machine.

We then had to move the high-load VPN link back to the OpenVPN server. With the
remaining VPN links, we can reproduce the hard lockup 100% of the time.

After some quick narrowing down, we are sure of the following:

1. The unmodified ruleset can cause the lockup too
2. The lockup is caused by the last "ct state" rule (with it commented out,
   there is no lockup)
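
The narrowing-down in point 2 can be sketched as a small helper that produces a
variant of a ruleset file with every "ct state" rule commented out, so the two
variants can be loaded in turn. This is only an illustration, not the exact
procedure used here: the function name and the sample ruleset (including its
10.0.0.0/16 address) are hypothetical, and it assumes each rule sits on a
single line rather than being wrapped as in this mail.

```python
import re


def comment_out_ct_rules(ruleset: str) -> str:
    """Return a copy of the ruleset with each 'ct state' rule commented out."""
    out = []
    for line in ruleset.splitlines():
        stripped = line.lstrip()
        # Only touch lines that match on "ct state" and are not already comments.
        if re.search(r"\bct\s+state\b", line) and not stripped.startswith("#"):
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}# {stripped}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample ruleset, much smaller than the one in this report.
SAMPLE = """\
table inet filter {
        chain forward {
                type filter hook forward priority 0;
                ip saddr 10.0.0.0/16 counter accept
                ip protocol tcp ct state invalid,new counter reject
        }
}
"""

if __name__ == "__main__":
    print(comment_out_ct_rules(SAMPLE), end="")
```

The output could be written to a file and loaded with nft -f on a test
machine, then compared against loading the unmodified file.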

After work hours we moved most of the VPN links to a backup server with the
same hardware and software. Loading the ruleset on this backup server does not
cause a hard lockup. Loading it on the original server, now relieved of its
load, does not cause a hard lockup either.

So we are also sure:

3. A certain level of traffic load is a factor in the hard lockup

Please look into this issue.
