[Xenomai] EINTR in notifier.c (mercury)

Philippe Gerum rpm at xenomai.org
Fri Apr 11 16:59:09 CEST 2014


On 04/07/2014 06:49 PM, Matthias Schneider wrote:
> ----- Original Message -----
>
>> From: Philippe Gerum <rpm at xenomai.org>
>> To: Matthias Schneider <ma30002000 at yahoo.de>; "xenomai at xenomai.org" <xenomai at xenomai.org>
>> Cc:
>> Sent: Sunday, April 6, 2014 5:04 PM
>> Subject: Re: [Xenomai] EINTR in notifier.c (mercury)
>>
>> On 04/06/2014 01:11 PM, Matthias Schneider wrote:
>>>>   On Monday, March 31, 2014 7:50 PM, Gilles Chanteperdrix
>> <gilles.chanteperdrix at xenomai.org> wrote:
>>>
>>>>>   On 03/31/2014 03:18 PM, Philippe Gerum wrote:
>>>>>     On 03/31/2014 01:27 PM, Gilles Chanteperdrix wrote:
>>>>>>     On 03/31/2014 12:34 PM, Matthias Schneider wrote:
>>>>>>>     Hi all,
>>>>>>>
>>>>>>>
>>>>>>>     still working on thread suspension in mercury, I noticed
>> that some
>>>>>>>     threadobj_suspend() and threadobj_resume() calls seemed
>> not to have the desired
>>>>>>>     effect. Analyzing the issue, I found out that sometimes
>> the read operations on
>>>>>>>     the pipe in notifier_wait() seem to return with EINTR,
>> especially in
>>>>>>>     heavily loaded systems. Restarting the read system call
>> in that case made
>>>>>>>     thread suspension a lot more reliable in my case.
>>>>>>>
>>>>>>>     I have attached a patch adding loops to deal with the
>> EINTR situation in all
>>>>>>
>>>>>>     You can probably avoid testing for EINTR if all signal
>> handlers are
>>>>>>     registered with the SA_RESTART flag.
>>>>>>
>>>>>
>>>>>     The app may not explicitly care for signals (granted, most would
>> do, but
>>>>>     we would then have to assume they do it right).
>>>>>
>>>>
>>>>   Yes, applications may want to use signals to interrupt a call to read
>>>>   and voluntarily get EINTR anyway.
>>>>                            Gilles.
>>>
>>>
>>>   Please find attached a third version of my patch dealing with
>>>   EINTR syscalls. Note that I also stumbled over unexpected behavior
>>>   in threadobj_sleep and cancel_sync due to EINTRs. Considering that
>>>   mercury is using signals for user-space round-robin and thread
>>>   suspension, even without any application signals EINTR happens
>>>   quite often.
>>
>> @@ -86,9 +88,13 @@ static void notifier_sighandler(int sig, siginfo_t *siginfo,
>> void *uc)
>>          if (nf->owner && nf->owner != tid)
>>              continue;
>>
>> -        while (__STD(read(nf->psfd[0], &c, 1)) > 0)
>> -            /* Callee must run async-safe code only. */
>> -            nf->callback(nf);
>> +        do {
>> +            ret = __STD(read(nf->psfd[0], &c, 1));
>> +            if (ret > 0)
>> +                /* Callee must run async-safe code only. */
>> +                nf->callback(nf);
>> +        } while (ret > 0 || errno == EINTR);
>>
>> In theory, we could receive a zero byte count on EOF, in which case we should
>> not test errno, rather we should bail out immediately.
>>
>> @@ -1093,7 +1093,9 @@ int threadobj_sleep(struct timespec *ts)
>>       */
>>      current->run_state = __THREAD_S_DELAYED;
>>      threadobj_save_timeout(&current->core, ts);
>> -    ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME, ts, NULL));
>> +    do
>> +        ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME, ts,
>> NULL));
>>
>> We should definitely pass and use the "remain" field in
>> clock_nanosleep(), not to restart a complete wait each time we get interrupted.
>>
>>
>> --
>> Philippe.
>>
>
> I have attached a new version of the patch leaving the loop on ret == 0. Note
> that this will be a new behavior in compared to before the patch.
>
> Gilles already pointed out that TIMER_ABSTIME allows restarting the same syscall
> in threadobj_sleep.
>

Yes, but unfortunately, that part of the patch is still broken and 
introduced a regression of other copperplate-based APÏs. Several RTOS 
implement some kind of thread/task_unblock() service, which shall cause 
sleep calls based on threadobj_sleep() to return upon receiving the 
SIGRELS notification via a signal. This is the meaning of the comment 
heading that code.

So the proper fix should go to the call site in the pSOS emulator, 
instead of changing the generic behavior in threadobj_sleep().

-- 
Philippe.




More information about the Xenomai mailing list