[PATCH] Race conditions in QUEUE handling on SMP builds

Alexander Demenshin aldem-nf@aldem.net
Fri, 21 Jul 2000 17:22:32 +0200


On Fri, Jul 21, 2000 at 10:58:35PM +1000, Rob Thomas wrote:

> This is the one-line patch jamesm sent to you, and the list? I don't

  No, it's not true, patch that was sent by James was different
  (just compare them).

> understand the need for the skb get and free either.  James, Rusty?

  Easy. First of all, in case if I have several QUEUE targets in
  _different_ tables, like:
  
  	iptables -t mangle -I OUTPUT -j QUEUE
  	iptables -t filter -I OUTPUT -j QUEUE
  	
  Packet will recach user space _two_ times, and it will be the same
  skb. So, first call to nf_reinject() (through set_verdict()) will free
  it, then... You will get OOPS (just try it - but on SMP build please).
  
  Additionally, we have to make sure that skbs will not vanish while they
  are on queue awaiting for decision from user space (it could happen
  in case I described above, it also raises some race conditions under
  heavy load on SMP).

> (BTW, last time I looked, james's patch had been put into CVS, so if you
> had been watching the CVS tree, you would have saved a lot of time on your
> behalf)

  Again, my solution is _different_ - NO ONE was able to fix problems reported
  by me, so I had to do it by myself (because I need it). And even more, James
  told me that he cannot reproduce problem on his system and asked me to test
  his patch (I did, and it did not work - easy to check if you want).
  
  Patch that was sent by James was different, related to locking code,
  my patch is only ensures that skbs are not freed while on ip_queue,
  and this _fixes_ all problems I had, with price of only few instructions.

> 
> > @@ -521,6 +536,7 @@
> >  	case NF_QUEUE:
> >  		nf_queue(skb, elem, info->pf, info->hook, 
> >  			 info->indev, info->outdev, info->okfn);
> > +		break;
> >  
> >  	case NF_DROP:
> >  		kfree_skb(skb);
> > 
> 
> I woule never have spotted that myself, so I'm guessing that james put a
> -lot- of time and effort into searching that one down.

  This one was sent in patch by James, true - but it is not my case if you will
  read my mails - I do not use NF_QUEUE verdict from user space, and since I've
  not seen this change in latest test5-pre series I included this here (but even
  without this my skb_get() _fix_ the problems). My changes are clearly described
  in code, with some explanations.

> I'm a bit grumpy because you immediately responded to james's patch with a
> 'No, that won't work', and then suddenly you arrive with this magic 'Ooh,
> look, this fixes it!'

  Hey, read cerafully all my correspondence. If you think that I am stealing
  someone else work - it is up to you, but I've spent a lot of sleepless
  nights looking through code and testing _why_ I am getting random errors
  (remember, they only exists on SMP builds, and seems that not so many people
  experimenting with ip_queue on SMP).
  
  Please, read again all our mails - you will find out that my work is different,
  moreover it does not need patch from James (with additional blocking), his
  patch does _not_ fix the problem (you are free to test it and feel by yourself),
  but mine _does_.
  
  After all, as for me it is regardless who has the priority - the only what I need
  is _stable_ system. Once I did something like hunting this annoying bug, so if you
  are curious you may find my name in net/ipv4/ip_output.c - problem was similar
  that I had here...
  
/Al