[Xenomai] EINTR in notifier.c (mercury)

Philippe Gerum rpm at xenomai.org
Sun Apr 6 17:04:41 CEST 2014


On 04/06/2014 01:11 PM, Matthias Schneider wrote:
>> On Monday, March 31, 2014 7:50 PM, Gilles Chanteperdrix <gilles.chanteperdrix at xenomai.org> wrote:
> 
>>> On 03/31/2014 03:18 PM, Philippe Gerum wrote:
>>>   On 03/31/2014 01:27 PM, Gilles Chanteperdrix wrote:
>>>>   On 03/31/2014 12:34 PM, Matthias Schneider wrote:
>>>>>   Hi all,
>>>>>
>>>>>
>>>>>   still working on thread suspension in mercury, I noticed that some
>>>>>   threadobj_suspend() and threadobj_resume() calls seemed not to have the desired
>>>>>   effect. Analyzing the issue, I found out that sometimes the read operations on
>>>>>   the pipe in notifier_wait() seem to return with EINTR, especially in
>>>>>   heavily loaded systems. Restarting the read system call in that case made
>>>>>   thread suspension a lot more reliable in my case.
>>>>>
>>>>>   I have attached a patch adding loops to deal with the EINTR situation in all
>>>>
>>>>   You can probably avoid testing for EINTR if all signal handlers are
>>>>   registered with the SA_RESTART flag.
>>>>
>>>
>>>   The app may not explicitly care for signals (granted, most would do, but
>>>   we would then have to assume they do it right).
>>>
>>
>> Yes, applications may want to use signals to interrupt a call to read
>> and voluntarily get EINTR anyway.
>>                          Gilles.
> 
> 
> Please find attached a third version of my patch dealing with
> EINTR syscalls. Note that I also stumbled over unexpected behavior
> in threadobj_sleep and cancel_sync due to EINTRs. Considering that
> mercury is using signals for user-space round-robin and thread
> suspension, even without any application signals EINTR happens
> quite often.

@@ -86,9 +88,13 @@ static void notifier_sighandler(int sig, siginfo_t *siginfo, void *uc)
 		if (nf->owner && nf->owner != tid)
 			continue;
 
-		while (__STD(read(nf->psfd[0], &c, 1)) > 0)
-			/* Callee must run async-safe code only. */
-			nf->callback(nf);
+		do {
+			ret = __STD(read(nf->psfd[0], &c, 1)); 
+			if (ret > 0)
+				/* Callee must run async-safe code only. */
+				nf->callback(nf);
+		} while (ret > 0 || errno == EINTR);

In theory, we could receive a zero byte count on EOF, in which case we should not test errno, rather we should bail out immediately.

@@ -1093,7 +1093,9 @@ int threadobj_sleep(struct timespec *ts)
 	 */
 	current->run_state = __THREAD_S_DELAYED;
 	threadobj_save_timeout(&current->core, ts);
-	ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME, ts, NULL));
+	do
+		ret = -__RT(clock_nanosleep(CLOCK_COPPERPLATE, TIMER_ABSTIME, ts, NULL));

We should definitely pass and use the "remain" field in clock_nanosleep(), not to restart a complete wait each time we get interrupted.

-- 
Philippe.




More information about the Xenomai mailing list