Policy routing based on source address
Julian Anastasov
ja@ssi.bg
Sat, 12 Jan 2002 23:48:21 +0000 (GMT)
Hello,
On Fri, 11 Jan 2002, Marc Evans wrote:
> I am trying to use the 2.4 kernel with iproute2 and/or iptables to perform
> policy routing of packets based on the source IP address. I am not
> succeeding, based on using the command "traceroute -s 192.168.1.40 hostname"
> as my method of test.
traceroute needs fixing: it has a bug where it does not bind() to
the IP address specified with -s. I fixed mine a long time ago but
never published a patch. IIRC, the problem was that the bind() call
was wrapped in #ifndef IP_HDRINCL, which should be removed (1.4a12,
traceroute.c). IP_HDRINCL has nothing to do with routing. It seems
nobody uses "very" advanced routing, so this problem goes unnoticed :)
But there are cases where an unfixed traceroute fails very badly, and
your setup is one of them (even if you follow the rules mentioned
below). traceroute simply needs fixing.
> My configuration is as follows. I have 1 physical ethernet interface with
> 3 sub interfaces, resulting in the IP addresses 192.168.1.{1,2,3,4}/24 on
> eth0 eth0:{1,2,3} respectively. Attached to the same LAN are 4 routers
> (that perform NAT among other things) at 192.168.1.{40,41,42,43}. My goal
> in this exercise is that packets with the source address of 192.168.1.1
> are always routed to 192.168.1.40, and likewise down the list:
>
> src address 192.168.1.1 route via 192.168.1.40
> src address 192.168.1.2 route via 192.168.1.41
> src address 192.168.1.3 route via 192.168.1.42
> src address 192.168.1.4 route via 192.168.1.43
Such a setup has one problem: the kernel absolutely refuses
to support it :) It is considered insecure if used with publicly
available addresses. Since you are using private addresses,
this is not a problem. And I can say that this is a recommended
setup for end hosts selecting the best alive default gateway.
And the problem is that the kernels (2.2, 2.4, all of them)
do not handle very well the case where two or more gateways are
reachable through the same interface. The kernel gurus will recommend
that, when you are using private addresses, you send the traffic
to a smart NAT router that will use a multipath route to split the
traffic across all border gateways (if needed). My question: what
happens if this gateway fails but there are still other alive
border gateways? This is one of the main reasons I maintain
some patches extending the routing:
http://www.linuxvirtualserver.org/~julian/#routes
Read dgd-usage.txt or anything related that you find useful
near this URL: dgd.txt. I know, my docs are not very good, so
you can look at Christoph's: nano.txt
Another possible problem is that all gateways are in the same
subnet, so the kernel cannot deduce the right preferred source
IP for the traffic through each gateway. But I see that your
traffic is already bound to the specified source IP addresses,
so this is not a problem (you mention only traffic already
bound to a specific source address). There will be a problem if you
try to add a multipath route with 4 nexthops without specifying a
preferred source IP. In your case, if you need a multipath route, the
best thing would be to define a preferred source IP, for example
192.168.1.1, and of course only when all gateways agree to route
this traffic, not only 192.168.1.40. The multipath route is needed
only if you need to utilize the links attached to the mentioned
gateways and if you want to make this decision in the end host.
Another solution is to use the four gateways by adding
alternative default routes, provided the gateways are smart enough to
use a multipath route together with NAT. So, you have 2 options
if you are using the above patches (I don't see a solution without
applying them):
- use the first alive gateway (the gateway can decide whether
to use one output line or to split the traffic to other
border gateways)
- use all gateways at the same time (multipath route); you can even
load them differently by specifying a relative ratio (the route's
"weight" parameter). Then the border gateways (NAT routers) can
simply use only their own lines and not send your traffic through
other gateways.
In all cases the patched kernel will use only alive gateways.
Of course, there are some details mentioned in nano.txt.
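As a rough illustration of the second option, here is a minimal sketch that builds one multipath default route with a preferred source address and per-gateway weights. The helper name build_multipath_route, the weights, and the use of table 200 are assumptions made for the example; it only prints the command so it can be reviewed before running it as root. Dead gateway detection still depends on the patches mentioned above.

```shell
#!/bin/sh
# Sketch only: prints the "ip route" command instead of executing it.
# build_multipath_route is a hypothetical helper; weights are illustrative.
# Usage: build_multipath_route <preferred-src> <gateway:weight>...
build_multipath_route() {
    src=$1
    shift
    cmd="ip route add default src $src table 200"
    for hop in "$@"; do
        gw=${hop%%:*}       # part before the colon: gateway address
        weight=${hop##*:}   # part after the colon: relative weight
        cmd="$cmd nexthop via $gw dev eth0 weight $weight"
    done
    printf '%s\n' "$cmd"
}

# Load the first gateway twice as heavily as the other three.
build_multipath_route 192.168.1.1 \
    192.168.1.40:2 192.168.1.41:1 192.168.1.42:1 192.168.1.43:1
```

The printed line can be pasted into a root shell, or the output piped to sh, once the gateways and weights match the real setup.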
> I have setup routing tables like this:
>
> ip rule add from 192.168.1.1 table transit1
> ip rule add from 192.168.1.2 table transit2
> ip rule add from 192.168.1.3 table transit3
> ip rule add from 192.168.1.4 table transit4
> ip route add default via 192.168.1.40 table transit1
> ip route add default via 192.168.1.41 table transit2
> ip route add default via 192.168.1.42 table transit3
> ip route add default via 192.168.1.43 table transit4
> ip route flush cache
The above rules are wrong. You have to look at
dgd-usage.txt for the right order of rules+routes when symmetric
routes are used (the most secure and error-free way). In short,
the order is:
- link routes (hosts reachable without gateway), selected by
destination, in table main
- routes via gateway for specific sources (your routes go here)
- routes via gateway without specific sources (mostly duplicated
from the previous ones)
As a result you can:
# remove any default routes from table main, then
# try first the traffic that does not use gateways:
ip rule add prio 50 table main
# then start with the routes via gateway (they can be with equal
# rule priority but should be with different tables)
ip rule add prio 101 from 192.168.1.1 table 101
ip route add default via 192.168.1.40 dev eth0 src 192.168.1.1 table 101
# similar for 102, 103 and 104
# then add your default routes for traffic with unknown source address:
ip rule add prio 200 table 200
ip route append default via 192.168.1.40 dev eth0 src 192.168.1.1 table 200
ip route append default via 192.168.1.41 dev eth0 src 192.168.1.2 table 200
ip route append default via 192.168.1.42 dev eth0 src 192.168.1.3 table 200
ip route append default via 192.168.1.43 dev eth0 src 192.168.1.4 table 200
# Note the "append" in the above commands, it
# is explained in dgd-usage.txt. We just added alternative default
# routes. Note that the last routes can be a multipath route with
# specified preferred source IP address (one of the sources)
# if you decide the end hosts to be smarter than the gateways :)
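The commands above, expanded for all four source/gateway pairs, can be sketched as a small dry-run script. This is only an illustration: the function name print_policy_commands is made up, and it assumes the address mapping from the original question (192.168.1.N routed via 192.168.1.(39+N)) and tables 101-104 plus 200. It prints the commands rather than running them, and it still expects any default routes to be removed from table main first.

```shell
#!/bin/sh
# Dry run: prints the rule/route commands; nothing is changed on the host.
# print_policy_commands is a hypothetical helper name.
print_policy_commands() {
    # 1. link routes first (table main, no default routes in it)
    echo "ip rule add prio 50 table main"

    # 2. routes via gateway for specific sources, one table per source
    i=1
    while [ "$i" -le 4 ]; do
        src="192.168.1.$i"
        gw="192.168.1.$((39 + i))"
        table=$((100 + i))
        echo "ip rule add prio $table from $src table $table"
        echo "ip route add default via $gw dev eth0 src $src table $table"
        i=$((i + 1))
    done

    # 3. alternative default routes for traffic with unknown source
    echo "ip rule add prio 200 table 200"
    i=1
    while [ "$i" -le 4 ]; do
        src="192.168.1.$i"
        gw="192.168.1.$((39 + i))"
        echo "ip route append default via $gw dev eth0 src $src table 200"
        i=$((i + 1))
    done
}

print_policy_commands
```

Here each per-source rule gets a priority equal to its table number, which is one possible choice; as noted above, the priorities could also all be equal as long as the tables differ.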
> I would have expected that to work by itself, and not require any iptables
> interaction. Sadly it doesn't appear to. Instead the default gateway from
iptables plays no role here, at least as long as you are not
acting as a NAT router. For that you can read nano.txt.
> the main table always seems to be used. Reading through some of the
Always check with ip route get, for example:
# out traffic to remote host (hitting table 200)
ip route get 1.2.3.4
# out traffic with known source (hitting 101-104)
ip route get from 192.168.1.1 to 1.2.3.4
# out to host onlink
ip route get 192.168.1.40
# incoming traffic from remote hosts (universe)
ip route get from 1.2.3.4 to 192.168.1.1 iif eth0
# local incoming traffic
ip route get from 192.168.1.40 to 192.168.1.1 iif eth0
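The per-source checks can be repeated for all four addresses with a short loop. The sketch below assumes the source-to-gateway mapping from the original question and that ip route get prints the chosen gateway as "via <address>"; both function names are hypothetical, and 1.2.3.4 is just the placeholder destination used above.

```shell
#!/bin/sh
# route_uses_gateway "<output of ip route get>" <gateway>
# Succeeds if the output contains "via <gateway>" as a whole word.
route_uses_gateway() {
    case " $1 " in
        *" via $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_all_sources: query the routing decision for each bound source
# address and report whether the expected gateway was selected.
check_all_sources() {
    i=1
    while [ "$i" -le 4 ]; do
        src="192.168.1.$i"
        gw="192.168.1.$((39 + i))"
        out=$(ip route get from "$src" to 1.2.3.4 2>/dev/null) || out=""
        if route_uses_gateway "$out" "$gw"; then
            echo "OK       $src -> $gw"
        else
            echo "MISMATCH $src (expected $gw): $out"
        fi
        i=$((i + 1))
    done
}
```

Calling check_all_sources on the configured host prints one line per source address; a MISMATCH line shows the route that was actually returned.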
> archived messages for the iproute2 and iptables lists however make me
> wonder if I need to be marking the packets with iptables and then routing
> them based on the marks?
No, no. This adds asymmetry and can have fatal results
if not configured correctly. But I won't be surprised if someone
comes up with such a solution for your setup.
> - Marc
Regards
--
Julian Anastasov <ja@ssi.bg>