[Bug 118] NAT stops working

bugzilla-daemon@netfilter.org
Mon, 05 Jan 2004 15:46:34 +0100


https://bugzilla.netfilter.org/cgi-bin/bugzilla/show_bug.cgi?id=118





------- Additional Comments From owlman@ss.pub.ro  2004-01-05 15:46 -------
Well, here I am again. I just spent an hour tracking down the latest instance 
of the bug. In the end I found a workaround and a new direction in which to 
search for the cause. As a reminder, the machine is a Slackware box with a 
vanilla 2.4.23 kernel. There is indeed something special about my setup: it 
also does traffic shaping, and I don't really know how to do that properly, so 
the traffic shaping scripts are quite a mess. It works, though (most of the 
time).

The internal network is attached to eth0 and uses the address range 
212.179.37.0/24 (yes, it should be 192.168..., but...). The default route goes 
out through eth1. There is also a dial-on-demand ppp0 link that serves as a 
backup route for eth1: a cron script checks the status of the eth1 connection 
and sets the default route accordingly.

This is how I want it to work (I explain it because the traffic shaping 
scripts may be confusing): everybody on the local net is supposed to use a 
squid proxy on this machine (with a complex system of speed limits, 
delay_pools) for just about everything. If a program cannot use the proxy, it 
is SNAT/MASQed. The only tc limits are imposed on forwarded traffic that is 
not ssh ;) and those limits are much tighter than the ones in squid's config 
file, to encourage use of the proxy. The traffic is classified with 
iptables -j MARK. There is also an ingress policer on ppp0, which cuts ping 
reply times under heavy load from 1 minute (without the policer) to 5 seconds 
(with it).
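
For readers unfamiliar with the MARK approach, a generic sketch of the idea 
follows. It is NOT taken from tc.init or wshaper (see the links below); the 
mark value 1, the qdisc handle 1: and the class 1:10 are assumptions chosen 
only for illustration, and the class is assumed to exist already on eth1.

```shell
# Let forwarded ssh traffic leave the mangle chain unmarked.
iptables -t mangle -A FORWARD -p tcp --dport 22 -j RETURN
iptables -t mangle -A FORWARD -p tcp --sport 22 -j RETURN

# Mark all remaining forwarded traffic (mark value 1 is an assumption).
iptables -t mangle -A FORWARD -j MARK --set-mark 1

# Send packets carrying fwmark 1 to a shaping class on the outgoing interface.
# Assumes a classful qdisc with handle 1: and a class 1:10 already exist.
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 1 fw flowid 1:10
```

The fw filter matches on the firewall mark only, so all classification logic 
stays in the iptables rules.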

The following tcpdump logs track port 5050 (one of the ports used by Yahoo 
Messenger) and my workstation (212.179.37.12). There are also snapshots of 
ip_conntrack and one iptraf snapshot watching eth0 and eth1 for port 5050 
traffic. The logs start when I discovered that my messenger could no longer 
connect on any port. They end when I discovered that an ssh client on the same 
machine connected to the outside just fine, so I reran tc.init and everything 
magically started to work. (Actually they end earlier, on a 4096-byte 
boundary, courtesy of output buffering in the tcpdump | tee pipe.)
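
A possible way to avoid the truncated captures, offered as a suggestion rather 
than something tested on this box: tcpdump's -l option makes its standard 
output line buffered, so each packet line reaches tee as soon as it is printed 
instead of waiting for a full buffer.

```shell
# Same capture as the one used for log1, with -l added so that output is
# flushed line by line rather than in large blocks.
tcpdump -l -i eth0 -n port 5050 | tee -a log1
```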

After all this, I think the bug is probably in the traffic shaping code, and 
that it is a very obscure one triggered by my particularly bad scripting. I 
would like a second opinion on this, so please comment.

tcpdump -i eth0 -n port 5050 | tee -a log1
http://www2.ict-rom.com/log1

tcpdump -i eth1 -n port 5050 | tee -a log2
http://www2.ict-rom.com/log2

cat /proc/net/ip_conntrack | grep 5050 >> log3 ; date >> log3; echo >> log3
http://www2.ict-rom.com/log3

cat /proc/net/ip_conntrack | grep 212.179.37.12 >> log4 ; date >> log4; echo >> log4
http://www2.ict-rom.com/log4
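
The two snapshot commands above could be wrapped in a small helper, sketched 
here under some assumptions. The function name snapshot_conntrack and its 
optional third argument (a saved copy of the table) are hypothetical and not 
part of the original setup. It uses grep -F so that the dots in an IP address 
are matched literally rather than as wildcards.

```shell
# snapshot_conntrack PATTERN LOGFILE [TABLE]
# Appends the conntrack entries containing PATTERN to LOGFILE, followed by a
# timestamp and a blank line. TABLE defaults to /proc/net/ip_conntrack (the
# 2.4 path used in this report); the third argument is a hypothetical addition
# that allows running against a saved copy.
snapshot_conntrack() {
    pattern=$1
    logfile=$2
    table=${3:-/proc/net/ip_conntrack}
    # -F treats the pattern as a fixed string. grep exits non-zero when
    # nothing matches, which is not an error for a snapshot.
    grep -F -- "$pattern" "$table" >> "$logfile" || true
    date >> "$logfile"
    echo >> "$logfile"
}
```

For example, snapshot_conntrack 212.179.37.12 log4 would append one snapshot 
to log4 in the same layout as the command above.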

IPTraf snapshot:
http://www2.ict-rom.com/logiptraf

Traffic shaping scripts: rc.local calls tc.init and tc.init calls wshaper:
http://www2.ict-rom.com/tc.init
http://www2.ict-rom.com/wshaper

Relevant part of rc.firewall:
http://www2.ict-rom.com/rc.firewall


