[Xenomai] EINTR in notifier.c (mercury)

Philippe Gerum rpm at xenomai.org
Fri Apr 11 17:07:00 CEST 2014


On 04/11/2014 04:59 PM, Philippe Gerum wrote:
> On 04/07/2014 06:49 PM, Matthias Schneider wrote:
>> ----- Original Message -----
>>
>>> From: Philippe Gerum <rpm at xenomai.org>
>>> To: Matthias Schneider <ma30002000 at yahoo.de>; "xenomai at xenomai.org"
>>> <xenomai at xenomai.org>
>>> Cc:
>>> Sent: Sunday, April 6, 2014 5:04 PM
>>> Subject: Re: [Xenomai] EINTR in notifier.c (mercury)
>>>
>>> On 04/06/2014 01:11 PM, Matthias Schneider wrote:
>>>>>   On Monday, March 31, 2014 7:50 PM, Gilles Chanteperdrix
>>> <gilles.chanteperdrix at xenomai.org> wrote:
>>>>
>>>>>>   On 03/31/2014 03:18 PM, Philippe Gerum wrote:
>>>>>>     On 03/31/2014 01:27 PM, Gilles Chanteperdrix wrote:
>>>>>>>     On 03/31/2014 12:34 PM, Matthias Schneider wrote:
>>>>>>>>     Hi all,
>>>>>>>>
>>>>>>>>
>>>>>>>>     still working on thread suspension in mercury, I noticed
>>> that some
>>>>>>>>     threadobj_suspend() and threadobj_resume() calls seemed
>>> not to have the desired
>>>>>>>>     effect. Analyzing the issue, I found out that sometimes
>>> the read operations on
>>>>>>>>     the pipe in notifier_wait() seem to return with EINTR,
>>> especially in
>>>>>>>>     heavily loaded systems. Restarting the read system call
>>> in that case made
>>>>>>>>     thread suspension a lot more reliable in my case.
>>>>>>>>
>>>>>>>>     I have attached a patch adding loops to deal with the
>>> EINTR situation in all
>>>>>>>
>>>>>>>     You can probably avoid testing for EINTR if all signal
>>> handlers are
>>>>>>>     registered with the SA_RESTART flag.
>>>>>>>
>>>>>>
>>>>>>     The app may not explicitly care for signals (granted, most would
>>> do, but
>>>>>>     we would then have to assume they do it right).
>>>>>>
>>>>>
>>>>>   Yes, applications may want to use signals to interrupt a call to
>>>>> read
>>>>>   and voluntarily get EINTR anyway.
>>>>>                            Gilles.
>>>>
>>>>
>>>>   Please find attached a third version of my patch dealing with
>>>>   EINTR syscalls. Note that I also stumbled over unexpected behavior
>>>>   in threadobj_sleep and cancel_sync due to EINTRs. Considering that
>>>>   mercury is using signals for user-space round-robin and thread
>>>>   suspension, even without any application signals EINTR happens
>>>>   quite often.
>>>
>>> @@ -86,9 +88,13 @@ static void notifier_sighandler(int sig, siginfo_t
>>> *siginfo,
>>> void *uc)
>>>          if (nf->owner && nf->owner != tid)
>>>              continue;
>>>
>>> -        while (__STD(read(nf->psfd[0], &c, 1)) > 0)
>>> -            /* Callee must run async-safe code only. */
>>> -            nf->callback(nf);
>>> +        do {
>>> +            ret = __STD(read(nf->psfd[0], &c, 1));
>>> +            if (ret > 0)
>>> +                /* Callee must run async-safe code only. */
>>> +                nf->callback(nf);
>>> +        } while (ret > 0 || errno == EINTR);
>>>
>>> In theory, we could receive a zero byte count on EOF, in which case
>>> we should
>>> not test errno, rather we should bail out immediately.
>>>
>>> @@ -1093,7 +1093,9 @@ int threadobj_sleep(struct timespec *ts)
>>>       */
>>>      current->run_state = __THREAD_S_DELAYED;
>>>      threadobj_save_timeout(&current->core, ts);
>>> -    ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME,
>>> ts, NULL));
>>> +    do
>>> +        ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE,
>>> TIMER_ABSTIME, ts,
>>> NULL));
>>>
>>> We should definitely pass and use the "remain" field in
>>> clock_nanosleep(), not to restart a complete wait each time we get
>>> interrupted.
>>>
>>>
>>> --
>>> Philippe.
>>>
>>
>> I have attached a new version of the patch leaving the loop on ret ==
>> 0. Note
>> that this will be a new behavior in compared to before the patch.
>>
>> Gilles already pointed out that TIMER_ABSTIME allows restarting the
>> same syscall
>> in threadobj_sleep.
>>
>
> Yes, but unfortunately, that part of the patch is still broken and
> introduced a regression of other copperplate-based APÏs. Several RTOS
> implement some kind of thread/task_unblock() service, which shall cause
> sleep calls based on threadobj_sleep() to return upon receiving the
> SIGRELS notification via a signal. This is the meaning of the comment
> heading that code.
>
> So the proper fix should go to the call site in the pSOS emulator,
> instead of changing the generic behavior in threadobj_sleep().
>

Or threadobj_sleep() should be made smarter, specifically detecting 
SIGRELS in the mercury case for aborting the wait.

-- 
Philippe.




More information about the Xenomai mailing list