[Xenomai] EINTR in notifier.c (mercury)
Matthias Schneider
ma30002000 at yahoo.de
Fri Apr 11 18:44:18 CEST 2014
----- Original Message -----
> From: Philippe Gerum <rpm at xenomai.org>
> To: Matthias Schneider <ma30002000 at yahoo.de>; "xenomai at xenomai.org" <xenomai at xenomai.org>
> Cc:
> Sent: Friday, April 11, 2014 5:07 PM
> Subject: Re: [Xenomai] EINTR in notifier.c (mercury)
>
> On 04/11/2014 04:59 PM, Philippe Gerum wrote:
>> On 04/07/2014 06:49 PM, Matthias Schneider wrote:
>>> ----- Original Message -----
>>>
>>>> From: Philippe Gerum <rpm at xenomai.org>
>>>> To: Matthias Schneider <ma30002000 at yahoo.de>;
> "xenomai at xenomai.org"
>>>> <xenomai at xenomai.org>
>>>> Cc:
>>>> Sent: Sunday, April 6, 2014 5:04 PM
>>>> Subject: Re: [Xenomai] EINTR in notifier.c (mercury)
>>>>
>>>> On 04/06/2014 01:11 PM, Matthias Schneider wrote:
>>>>>> On Monday, March 31, 2014 7:50 PM, Gilles Chanteperdrix
>>>> <gilles.chanteperdrix at xenomai.org> wrote:
>>>>>
>>>>>>> On 03/31/2014 03:18 PM, Philippe Gerum wrote:
>>>>>>> On 03/31/2014 01:27 PM, Gilles Chanteperdrix wrote:
>>>>>>>> On 03/31/2014 12:34 PM, Matthias Schneider
> wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> still working on thread suspension in
> mercury, I noticed
>>>> that some
>>>>>>>>> threadobj_suspend() and threadobj_resume()
> calls seemed
>>>> not to have the desired
>>>>>>>>> effect. Analyzing the issue, I found out
> that sometimes
>>>> the read operations on
>>>>>>>>> the pipe in notifier_wait() seem to return
> with EINTR,
>>>> especially in
>>>>>>>>> heavily loaded systems. Restarting the read
> system call
>>>> in that case made
>>>>>>>>> thread suspension a lot more reliable in my
> case.
>>>>>>>>>
>>>>>>>>> I have attached a patch adding loops to
> deal with the
>>>> EINTR situation in all
>>>>>>>>
>>>>>>>> You can probably avoid testing for EINTR if all
> signal
>>>> handlers are
>>>>>>>> registered with the SA_RESTART flag.
>>>>>>>>
>>>>>>>
>>>>>>> The app may not explicitly care for signals
> (granted, most would
>>>> do, but
>>>>>>> we would then have to assume they do it right).
>>>>>>>
>>>>>>
>>>>>> Yes, applications may want to use signals to interrupt a
> call to
>>>>>> read
>>>>>> and voluntarily get EINTR anyway.
>>>>>> Gilles.
>>>>>
>>>>>
>>>>> Please find attached a third version of my patch dealing with
>>>>> EINTR syscalls. Note that I also stumbled over unexpected
> behavior
>>>>> in threadobj_sleep and cancel_sync due to EINTRs. Considering
> that
>>>>> mercury is using signals for user-space round-robin and
> thread
>>>>> suspension, even without any application signals EINTR
> happens
>>>>> quite often.
>>>>
>>>> @@ -86,9 +88,13 @@ static void notifier_sighandler(int sig,
> siginfo_t
>>>> *siginfo,
>>>> void *uc)
>>>> if (nf->owner && nf->owner != tid)
>>>> continue;
>>>>
>>>> - while (__STD(read(nf->psfd[0], &c, 1)) > 0)
>>>> - /* Callee must run async-safe code only. */
>>>> - nf->callback(nf);
>>>> + do {
>>>> + ret = __STD(read(nf->psfd[0], &c, 1));
>>>> + if (ret > 0)
>>>> + /* Callee must run async-safe code only. */
>>>> + nf->callback(nf);
>>>> + } while (ret > 0 || errno == EINTR);
>>>>
>>>> In theory, we could receive a zero byte count on EOF, in which case
>>>> we should
>>>> not test errno, rather we should bail out immediately.
>>>>
>>>> @@ -1093,7 +1093,9 @@ int threadobj_sleep(struct timespec *ts)
>>>> */
>>>> current->run_state = __THREAD_S_DELAYED;
>>>> threadobj_save_timeout(¤t->core, ts);
>>>> - ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME,
>>>> ts, NULL));
>>>> + do
>>>> + ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE,
>>>> TIMER_ABSTIME, ts,
>>>> NULL));
>>>>
>>>> We should definitely pass and use the "remain" field in
>>>> clock_nanosleep(), not to restart a complete wait each time we get
>>>> interrupted.
>>>>
>>>>
>>>> --
>>>> Philippe.
>>>>
>>>
>>> I have attached a new version of the patch leaving the loop on ret ==
>>> 0. Note
>>> that this will be a new behavior in compared to before the patch.
>>>
>>> Gilles already pointed out that TIMER_ABSTIME allows restarting the
>>> same syscall
>>> in threadobj_sleep.
>>>
>>
>> Yes, but unfortunately, that part of the patch is still broken and
>> introduced a regression of other copperplate-based APÏs. Several RTOS
>> implement some kind of thread/task_unblock() service, which shall cause
>> sleep calls based on threadobj_sleep() to return upon receiving the
>> SIGRELS notification via a signal. This is the meaning of the comment
>> heading that code.
>>
>> So the proper fix should go to the call site in the pSOS emulator,
>> instead of changing the generic behavior in threadobj_sleep().
>>
>
> Or threadobj_sleep() should be made smarter, specifically detecting
> SIGRELS in the mercury case for aborting the wait.
>
>
> --
> Philippe.
>
I think I would prefer to encapsulate that in threadobj_sleep(). Would
storing a flag in threadobj_current() in unblock_sighandler() and
checking this in threadobj_sleep() be an option? Sorry for having
overlooked this,
Matthias
More information about the Xenomai
mailing list