Connection tracking and filtering in TCP streams

Paul Rusty Russell Paul.Russell@rustcorp.com.au
Tue, 28 Sep 1999 20:21:57 +0930


In message <E11VtvX-0004qB-00@taurus.cus.cam.ac.uk> you write:
> Paul Rusty Russell writes ("Re: Connection tracking and filtering in TCP streams"):
> > Yep.  You can do this at the moment 8-).  You can try to match up
> > PASVs with responses, or record the last \r\n from the server, and add
> > this to the string check.
> 
> I'm not convinced it's as easy as that - it doesn't cope well with
> overlapping packets or packets received out of order. Consider this:
> 
> Client sends packet 2 of 2, containing data B
> Client sends packet 1 of 2, containing data A
> Client sends packet 2 of 2, containing data C
> 
> Now, does the server application receive AB or AC? In the case of FTP,
> I don't think this matters, but in the generic case, the firewall
> either needs to know or it needs to prevent this situation happening.
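
A minimal sketch of the ambiguity in the three-packet sequence above,
assuming each packet carries exactly one byte and that the receiver
resolves overlaps with one of two simple policies.  The names (struct
seg, reassemble(), FIRST_WINS, LAST_WINS) are invented for illustration;
real stacks each have their own overlap rules, which is exactly the
firewall's problem.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not from any real stack. */
enum overlap_policy { FIRST_WINS, LAST_WINS };

struct seg {
	size_t off;        /* byte offset of the payload in the stream */
	const char *data;  /* payload bytes */
	size_t len;        /* payload length */
};

/*
 * Copy segments, in arrival order, into out[0..outlen-1].
 * Bytes never covered by any segment are left as '?'.
 * out must have room for outlen + 1 bytes (NUL terminator).
 */
void reassemble(const struct seg *segs, size_t nsegs,
		enum overlap_policy policy, char *out, size_t outlen)
{
	unsigned char seen[64] = { 0 };
	size_t i, j;

	assert(outlen <= sizeof(seen));
	memset(out, '?', outlen);
	out[outlen] = '\0';

	for (i = 0; i < nsegs; i++) {
		for (j = 0; j < segs[i].len; j++) {
			size_t pos = segs[i].off + j;

			if (pos >= outlen)
				break;
			/* First-wins keeps the earlier copy of a byte. */
			if (seen[pos] && policy == FIRST_WINS)
				continue;
			out[pos] = segs[i].data[j];
			seen[pos] = 1;
		}
	}
}
```

Feeding it B at offset 1, A at offset 0, then C at offset 1 gives "AB"
under FIRST_WINS and "AC" under LAST_WINS.  A filter that guesses a
different policy from the end host inspects a different stream than the
application receives.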

Oh... the GENERIC case?  I started the userspace implementation of
this (to prove that it shouldn't be done), so I have some clue how hard
this is to solve.  Regular-expression replacement on TCP streams.

It simply isn't possible.  Even if you strip SACK (too hard) and large
windows (too easy to DoS) you have the case of a partial match on a
packet when the window size is 1 packet; you've lost.
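
A rough sketch of the partial-match case, using a plain fixed-string
matcher (KMP) rather than a regular expression to keep it short.  The
names (struct stream_match, sm_init(), sm_feed()) are hypothetical.  The
matcher carries its state from one segment to the next, so detecting a
string split across packets is straightforward.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAT_MAX 32

/* Hypothetical streaming matcher; names are illustrative only. */
struct stream_match {
	char pat[PAT_MAX];
	size_t fail[PAT_MAX]; /* KMP failure table */
	size_t plen;          /* pattern length */
	size_t state;         /* pattern bytes matched so far */
};

void sm_init(struct stream_match *m, const char *pat)
{
	size_t i, k = 0;

	m->plen = strlen(pat);
	assert(m->plen > 0 && m->plen < PAT_MAX);
	memcpy(m->pat, pat, m->plen);
	m->state = 0;
	m->fail[0] = 0;
	for (i = 1; i < m->plen; i++) {
		while (k > 0 && pat[i] != pat[k])
			k = m->fail[k - 1];
		if (pat[i] == pat[k])
			k++;
		m->fail[i] = k;
	}
}

/*
 * Feed one segment's payload.  Returns the number of complete matches
 * that ended inside it.  Afterwards, m->state > 0 means the segment
 * ended partway through a possible match.
 */
size_t sm_feed(struct stream_match *m, const char *data, size_t len)
{
	size_t i, hits = 0;

	for (i = 0; i < len; i++) {
		while (m->state > 0 && data[i] != m->pat[m->state])
			m->state = m->fail[m->state - 1];
		if (data[i] == m->pat[m->state])
			m->state++;
		if (m->state == m->plen) {
			hits++;
			m->state = m->fail[m->state - 1];
		}
	}
	return hits;
}
```

After feeding "xxPA" with the pattern "PASV", state is 2 and no match
has been reported yet.  A read-only filter just waits for the next
segment.  A rewriting filter, presumably, is stuck: forward the segment
and it is too late to replace "PA"; hold it and, with a one-packet
window, the sender has nothing more it is allowed to send.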

<SIGH>.  Is there any serious possibility of a server being convinced
to emit \r\n inside the filename?  If so, even the \r\n is not a
solution.

> I'd go further than that and say that in the context of a firewall,
> much of the existing packet-based code actually needs to process the
> TCP stream. Whether or not it belongs in the kernel is a matter for
> debate. Ideally, it shouldn't be, but if the firewall is handling many
> connections, it may be necessary in order to get the latency down.

I draw the line at looking inside the data.  Call me
old-fashioned. 8-)

> > (see below).  Someone "just" has to write the library (+7 points),
> > then we can move this crock out of the kernel altogether (Airplane
> > mode: "WE CAN MOVE THIS CROCK OUT OF THE KERNEL").
> 
> I don't think it solves my problem at all, but it is clearly a Good
> Thing (TM) to do.

Yes, it does.  The library presents you with a datastream, not a
packet stream (that's the point).  Reading generically is possible.
Modifying isn't.

> > but what is lacking is the
> > setsockopts for userspace to set up expecting connections etc.  They
> > can be added fairly easily though.
> 
> I don't understand what you mean by "set up expecting connections". 
> I presume you mean adding an entry to your equivalent of the
> (de-)masquerading table so a hole is opened in the firewall? That
> still leaves the primary connection which handles most of the
> data. Once the interesting bit is over, I don't want the connection
> going through a helper.

Yes.

rsh helper in the kernel:
       1) Registers a setsockopt() (IP_SO_RSH_CTL)
       2) Hands all packets to userspace.
       3) Stops handing to userspace on that connection when
          setsockopt(IP_SO_RSH_CTL) called.
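
The three steps above can be sketched as a per-connection flag.  This is
a userspace toy, not kernel code: the numeric value of IP_SO_RSH_CTL,
struct rsh_conn, and both function names are assumptions made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value; the real option number is not specified. */
#define IP_SO_RSH_CTL 0x1000

enum verdict { TO_USERSPACE, FORWARD_IN_KERNEL };

/* Hypothetical per-connection state kept by the helper. */
struct rsh_conn {
	bool released; /* set once userspace has finished with it */
};

/* Step 2: every packet goes to userspace until the connection is released. */
enum verdict rsh_helper_packet(const struct rsh_conn *c)
{
	assert(c != NULL);
	return c->released ? FORWARD_IN_KERNEL : TO_USERSPACE;
}

/* Steps 1 and 3: stand-in for the handler the helper registers. */
int rsh_helper_setsockopt(struct rsh_conn *c, int optname)
{
	assert(c != NULL);
	if (optname != IP_SO_RSH_CTL)
		return -1; /* not our option */
	c->released = true; /* stop handing this connection to userspace */
	return 0;
}
```

Until the option is set, rsh_helper_packet() returns TO_USERSPACE; after
it, the same connection gets FORWARD_IN_KERNEL and the helper is out of
the data path.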

See?  You can even fake a RST to the hosts if you want, but this
(having a real connection set up) is far easier than...

> a) redirect the rsh connection to a user-space helper           (easy)
> b) read the usernames/port number from the data stream          (easy)
> c) if the connection is prohibited, return an error and close   (easy)
> d) make a connection to the original destination using the 
>    original source address (this is important for r-protocols)  (hard)
> e) send the usernames                                           (easy)
> f) read the error return, and send the error back to the client (easy)
> g) set up the second connection                                 (hard)
> h) splice the two connections together and let the kernel get on
>    with forwarding packets                                      (hard)

ICK.  Connection splicing is something that is THEORETICALLY possible,
but once again, what if the TCP options are incompatible?  It's not
something I want to get into:

 You CAN mimic the incoming TCP options in your outgoing
 connection, but you have to use the same replies (ie. you can't reply
 to the incoming TCP until you've got a reply from the server
 yourself).  Or you can try lowest-common-denominator options, at which
 point your performance is going to suck hard anyway.
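
 A sketch of the second strategy, under loud assumptions: struct
 tcp_opts covers only a few options, lcd_opts() and splice_compatible()
 are invented names, and treating "compatible" as "identical on both
 legs" is a simplification chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of the options negotiated on SYN/SYN-ACK. */
struct tcp_opts {
	uint16_t mss;     /* maximum segment size */
	bool sack_ok;     /* SACK permitted */
	bool timestamps;  /* timestamp option in use */
	bool wscale_ok;   /* window scaling in use */
	uint8_t wscale;   /* shift count, meaningful only if wscale_ok */
};

/* Lowest common denominator: offer only what any peer can handle. */
struct tcp_opts lcd_opts(void)
{
	/* 536 is the MSS a TCP assumes when no MSS option is sent. */
	struct tcp_opts o = { 536, false, false, false, 0 };
	return o;
}

/* Assumption: two legs can be spliced only if their options are identical. */
bool splice_compatible(const struct tcp_opts *a, const struct tcp_opts *b)
{
	assert(a != NULL && b != NULL);
	return a->mss == b->mss &&
	       a->sack_ok == b->sack_ok &&
	       a->timestamps == b->timestamps &&
	       a->wscale_ok == b->wscale_ok &&
	       (!a->wscale_ok || a->wscale == b->wscale);
}
```

 Two legs both set up with lcd_opts() always pass the check, but they
 run with a 536-byte MSS and no SACK or window scaling, which is the
 performance cost described above.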

> Some of the r-protocols (I can't remember which, but it's two out of
> rsh, rlogin and rexec) set up a second connection for stderr and
> passing signals. In addition, I'd like to implement additional policy
> in user-space, such as allowing users to rlogin in/out, but not
> allowing lp, OutBox, guest, root, etc.

You sound like the man to write the userspace library... think, you'll
be top of the scoreboard (7 for rsh extension, 7+7 bonus for the
library, 1 for the doco, 1 for a patch in correct form with Changelog
= 23 points + any bugfixes along the way).

I'd love something vaguely like:

#include "netfilter_pjblib.h"

int main()
{
	char buffer[128];
	pjb_handle h;
	pjb_limitedrx *seek = pjb_alloc_rx("stringIwant");

	h = pjb_setup(seek);

	while (pjb_get(h, buffer, 128)) {
	      ....
	}
}

<GRIN>,
Rusty.
--
Hacking time.