fwmark + routing == broken
David Coulson
technoir@linux.com
Sat, 25 Nov 2000 19:48:01 +0000 (GMT)
After a bit of experimentation, along with some helpful feedback from
Stephen Frost, I've come to the conclusion that fwmark-based routing is
just plain borked.
I'm trying to do something really simple, such as send all outgoing SMTP
connections over ippp0, and it mucks up which interface and source
address get used. For example:
root:~# telnet dev.themes.org 25
Trying 198.186.203.68...
It's stuck there waiting for a SYN-ACK to come back after it has sent out
its SYN packet.
19:29:42.590733 > 194.222.178.203.1204 > 198.186.203.68.smtp: S
2514155076:2514155076(0) win 5840 <mss 1460,sackOK,timestamp 400376
0,nop,wscale 0> (DF)
19:29:42.790250 < 198.186.203.68.smtp > 10.1.0.2.1204: S
2673422574:2673422574(0) ack 2514155077 win 32120 <mss
1460,sackOK,timestamp 207001074 400376,nop,wscale 0> (DF)
194.222.178.203 is my ippp0 interface IP, which shows that the SNAT is
working right, but 10.1.0.2 is the local IP for the route onto my router,
which is the default route.
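
For reference, a fwmark setup of the kind described above usually comes
down to a handful of commands: mark the SMTP packets, point marked packets
at a separate routing table, give that table a default route out ippp0,
and SNAT on the way out. The sketch below is not a copy of the rules in
use here. The mark value (1) and table number (100) are arbitrary
assumptions, and the function and variable names are made up for
illustration. It only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only. MARK and TABLE are arbitrary assumed values;
# DEV and SNAT_IP are taken from the trace above.
MARK=1
TABLE=100
DEV=ippp0
SNAT_IP=194.222.178.203

# Print each command by default; execute it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

setup_fwmark_smtp() {
    # Mark locally generated SMTP packets.
    run iptables -t mangle -A OUTPUT -p tcp --dport 25 -j MARK --set-mark "$MARK"
    # Mark forwarded SMTP packets (e.g. from hosts behind this box).
    run iptables -t mangle -A PREROUTING -p tcp --dport 25 -j MARK --set-mark "$MARK"
    # Route marked packets using a dedicated table.
    run ip rule add fwmark "$MARK" table "$TABLE"
    # That table sends everything out the ISDN interface.
    run ip route add default dev "$DEV" table "$TABLE"
    # Rewrite the source address of whatever leaves via that interface.
    run iptables -t nat -A POSTROUTING -o "$DEV" -j SNAT --to-source "$SNAT_IP"
}

setup_fwmark_smtp
```

Printing first makes it easy to compare the intended rules against what
`iptables -t mangle -L -n` and `ip rule show` actually report.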
An inspection of netstat shows:
tcp 0 1 10.1.0.2:1205 198.186.203.68:25 SYN_SENT
which is obviously wrong (presumably it should have the IP of the local
ippp0 interface, as that is the route which the packet is using to leave
the machine, correct?)
If I do source-based routing it works fine, so there is something not
doing its job properly. The weird thing is that if I open a connection
from, say, 10.1.1.2 I get the following on the gateway:
19:33:32.148284 > 194.222.178.203.39109 > 198.186.203.68.smtp: S
2759507136:2759507136(0) win 5840 <mss 1460,sackOK,timestamp 214334125
0,nop,wscale 0> (DF)
19:33:32.351011 < 198.186.203.68.smtp > 10.1.1.2.39109: S
2931433374:2931433374(0) ack 2759507137 win 32120 <mss
1460,sackOK,timestamp 207024028 214334125,nop,wscale 0> (DF)
19:33:35.777618 < 198.186.203.68.smtp > 10.1.1.2.39109: S
2931433374:2931433374(0) ack 2759507137 win 32120 <mss
1460,sackOK,timestamp 207024371 214334125,nop,wscale 0> (DF)
which, to me, looks right (in that it's going to and from the right IPs
and ports).
However, the packets never get to 10.1.1.2, and
19:35:41.352043 10.1.1.2.39119 > 198.186.203.68.25: S
2886017640:2886017640(0) win 5840 <mss 1460,sackOK,timestamp 214347047
0,nop,wscale 0> (DF)
is all I get in the tcpdump for 10.1.1.2 (there is a connection in netstat
stuck on SYN_SENT).
There is nothing relating to that connection in /proc/net/ip_conntrack, so
I'm presuming that something isn't registering the outgoing connection,
and is then just dropping the packets as they come back in. There doesn't
seem to be an obvious way to log packets dropped by the kernel, so I can't
be 100% certain that this is the reason for the occurrences I'm
experiencing.
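
The conntrack check can be written as a small helper, sketched below. The
function name is made up for illustration, and the matching assumes the
usual `src=... dst=... sport=... dport=...` layout of the conntrack
entries. The table path is an optional third argument, so the same
function can be pointed at a saved copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: print conntrack entries that mention a given
# destination IP and destination port. An entry lists both directions of
# a connection, so either direction can match.
# Usage: conntrack_lookup DST_IP DST_PORT [TABLE_FILE]
conntrack_lookup() {
    dst_ip=$1
    dst_port=$2
    table=${3:-/proc/net/ip_conntrack}
    # The trailing space keeps 198.186.203.6 from matching 198.186.203.68
    # and port 25 from matching port 250.
    grep -F "dst=$dst_ip " "$table" | grep -F "dport=$dst_port "
}
```

For the connection above that would be
`conntrack_lookup 198.186.203.68 25`. No output means there is no entry
for that destination and port.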
Anyway, that's about as far as I can investigate without any more ideas. I
know some people have said they have it working, but I'm 100% sure I'm not
doing anything dumb and I'm pretty sure Stephen is doing it as it should
be done, so my thoughts head towards a kernel problem.
Anyone got any more suggestions, or know what on earth the kernel is doing
(or not, in this case)?
Thanks.
--
David Coulson
technoir@sourceforge.net