[PATCH] Race conditions in QUEUE handling on SMP builds
Alexander Demenshin
aldem-nf@aldem.net
Fri, 21 Jul 2000 17:22:32 +0200
On Fri, Jul 21, 2000 at 10:58:35PM +1000, Rob Thomas wrote:
> This is the one-line patch jamesm sent to you, and the list? I don't
No, it's not true, patch that was sent by James was different
(just compare them).
> understand the need for the skb get and free either. James, Rusty?
Easy. First of all, in case if I have several QUEUE targets in
_different_ tables, like:
iptables -t mangle -I OUTPUT -j QUEUE
iptables -t filter -I OUTPUT -j QUEUE
Packet will recach user space _two_ times, and it will be the same
skb. So, first call to nf_reinject() (through set_verdict()) will free
it, then... You will get OOPS (just try it - but on SMP build please).
Additionally, we have to make sure that skbs will not vanish while they
are on queue awaiting for decision from user space (it could happen
in case I described above, it also raises some race conditions under
heavy load on SMP).
> (BTW, last time I looked, james's patch had been put into CVS, so if you
> had been watching the CVS tree, you would have saved a lot of time on your
> behalf)
Again, my solution is _different_ - NO ONE was able to fix problems reported
by me, so I had to do it by myself (because I need it). And even more, James
told me that he cannot reproduce problem on his system and asked me to test
his patch (I did, and it did not work - easy to check if you want).
Patch that was sent by James was different, related to locking code,
my patch is only ensures that skbs are not freed while on ip_queue,
and this _fixes_ all problems I had, with price of only few instructions.
>
> > @@ -521,6 +536,7 @@
> > case NF_QUEUE:
> > nf_queue(skb, elem, info->pf, info->hook,
> > info->indev, info->outdev, info->okfn);
> > + break;
> >
> > case NF_DROP:
> > kfree_skb(skb);
> >
>
> I woule never have spotted that myself, so I'm guessing that james put a
> -lot- of time and effort into searching that one down.
This one was sent in patch by James, true - but it is not my case if you will
read my mails - I do not use NF_QUEUE verdict from user space, and since I've
not seen this change in latest test5-pre series I included this here (but even
without this my skb_get() _fix_ the problems). My changes are clearly described
in code, with some explanations.
> I'm a bit grumpy because you immediately responded to james's patch with a
> 'No, that won't work', and then suddenly you arrive with this magic 'Ooh,
> look, this fixes it!'
Hey, read cerafully all my correspondence. If you think that I am stealing
someone else work - it is up to you, but I've spent a lot of sleepless
nights looking through code and testing _why_ I am getting random errors
(remember, they only exists on SMP builds, and seems that not so many people
experimenting with ip_queue on SMP).
Please, read again all our mails - you will find out that my work is different,
moreover it does not need patch from James (with additional blocking), his
patch does _not_ fix the problem (you are free to test it and feel by yourself),
but mine _does_.
After all, as for me it is regardless who has the priority - the only what I need
is _stable_ system. Once I did something like hunting this annoying bug, so if you
are curious you may find my name in net/ipv4/ip_output.c - problem was similar
that I had here...
/Al