Monday, April 23, 2007

Multi-core and signal handling: Unable to create sub named "" at

This little snippet of code reliably fails on a multi-core machine, but not on a single-core one:

perl -e 'use strict;$SIG{CHLD}=sub {1};while (! system ("true")) {local $SIG{CHLD}="DEFAULT"}' & while kill -n 17 $! ;do true;done

Single-core behavior can be simulated on my Core Duo with the taskset(1) command, pinning both processes to CPU 0:
taskset -c 0 perl -e 'use strict;$SIG{CHLD}=sub {1};while (! system ("true")) {local $SIG{CHLD}="DEFAULT"}' & taskset -c 0 bash -c "while kill -n 17 $! ;do true;done"

When the second taskset(1) command is instead pinned to a second "real" processor with "taskset -c 1", the Perl process typically fails within 30 seconds with
'Unable to create sub named "" at -e line 1.'

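The fix? Don't local()ize $SIG{CHLD} inside the loop or inner scope. Chances are that when Perl restores $SIG{CHLD} on leaving the scope, it momentarily leaves it unset before returning it to the original, pre-local() global value. A SIGCHLD delivered from the other core during that window would find an empty handler name, which matches the 'sub named ""' in the error.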

Verified with perl5.8.0 and perl5.8.8-i386-linux-thread-multi
