[BUG] copperplate/eventobj.c ->>> eventobj_inquire() doesn't work

Philippe Gerum rpm at xenomai.org
Mon Jul 13 20:27:15 CEST 2020


On 7/13/20 7:59 PM, Jan Kiszka wrote:
> On 10.07.20 10:38, Philippe Gerum via Xenomai wrote:
>> On 7/10/20 8:04 AM, Caffreyfans via Xenomai wrote:
>>> Hi sir,
>>>
>>>      I'm trying to make another skin for Xenomai.  When working with events,
>>> I use `eventobj_inquire()` to get the event flags, but no matter what value I
>>> post, I always get 0.
>>>
>>>      I find that eventobj_inquire() is not working. I know `alchemy/event` also
>>> uses `eventobj`, so I wrote a test program using the alchemy skin. I am curious
>>> whether it is my own problem or there is an error in Xenomai.
>>>
>>
>> Most likely a bug in Xenomai. In addition, looking at cobalt_event_post(),
>> there is a blatant race condition between the signal <-> wait operations. The
>> in-kernel wait() operation serializes on the ugly big lock which is not going
>> to help much against racing with the userland counterpart in
>> cobalt_event_post(), which does this:
>>
>>     __sync_or_and_fetch(&state->value, bits); /* full barrier. */
>>
>>     if ((state->flags & COBALT_EVENT_PENDED) == 0)
>>         return 0;
>>
>> The somebody-is-waiting bit tested above should be part of some atomic
>> operation shared with the wait-side or covered by the ugly big lock, but the
>> way it is implemented today can lead to spurious waits.
>>
>> The event code was fixed months ago for another bad issue; the whole thing
>> looks fragile. You may want to review all of it.
>>
> 
> The issue Caffreyfans is describing seems more like a synchronous one. I haven't
> reproduced or analyzed it yet, but it looks more "friendly" to me.
> 

The issue in cobalt_event_post() is most likely unrelated to the problem with
the inquiry service. The serialization issue caught my eye as I was tracking
the updates to the event value for the inquiry problem.

> The one that you bring up would be nasty. But why should that happen? Do we
> fail to recheck a condition inside the syscall and therefore starve?
> 

event_wait(kernel)		event_post(user)
------------------		----------------

lock(&nklock)			update event->value
bits not in event->value:	
				!EVENT_PENDED
	raise EVENT_PENDED
	xnsynch_sleep_on	
				=> no kernel entry
(waits indefinitely)		(event_sync is missed)

And SMP is not even required to break it. So either the EVENT_PENDED
information is folded into the event value so that both can be checked
atomically as a single word, like mutexes do, or the broken optimization in
userland is replaced by a direct call to some kernel-based event_post service
(tbd). Obviously, option #1 would consume a bit in order to encode
EVENT_PENDED, limiting the effective event map to 31 bits, which would be a
problem ABI- and API-wise.
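
To make option #1 concrete, here is a minimal sketch of what folding the
pended flag into the value word could look like. It is not the Cobalt
implementation: the struct, the helper names, the choice of bit 31 and the
"any of the bits" wait semantics are all assumptions made for illustration,
and it uses plain C11 atomics rather than the real kernel/user shared state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 31 is stolen for the "somebody is waiting" flag. */
#define EVENT_PENDED	(UINT32_C(1) << 31)
#define EVENT_MASK	(~EVENT_PENDED)	/* the 31 bits left for events */

/* Hypothetical shared state: event bits and pended flag in one word. */
struct event_state {
	_Atomic uint32_t value;
};

/*
 * Post side (userland fast path). Sets the bits and, in the same atomic
 * operation, learns whether a waiter had already announced itself.
 * Returns true if the caller must enter the kernel to wake waiters.
 */
static bool event_post_needs_wakeup(struct event_state *s, uint32_t bits)
{
	assert((bits & EVENT_PENDED) == 0);	/* only 31 bits are usable */
	uint32_t old = atomic_fetch_or(&s->value, bits);
	return (old & EVENT_PENDED) != 0;
}

/*
 * Wait side. Returns false if any requested bit is already set, true if
 * the pended flag was raised and the caller has to go to sleep. The flag
 * is raised with a compare-and-swap against the very value that was
 * tested, so a post landing in between makes the swap fail and forces a
 * re-test instead of a blind sleep.
 */
static bool event_wait_must_sleep(struct event_state *s, uint32_t bits)
{
	uint32_t old = atomic_load(&s->value);

	for (;;) {
		if (old & bits & EVENT_MASK)
			return false;	/* condition already satisfied */
		if (atomic_compare_exchange_weak(&s->value, &old,
						 old | EVENT_PENDED))
			return true;	/* flag raised, nothing was posted */
		/* Swap failed: 'old' now holds the fresh value, test again. */
	}
}
```

With this layout, a post either lands before the waiter's compare-and-swap
(which then fails and re-tests the value) or after it (in which case the post
sees EVENT_PENDED and knows it has to enter the kernel). Clearing the flag on
wakeup and the actual sleep/wake calls are left out of the sketch.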

-- 
Philippe.


