[CXP] Discussing the RTDM specification

Jan Kiszka jan.kiszka at siemens.com
Tue Jan 5 20:31:43 CET 2021


On 23.12.20 11:40, Philippe Gerum wrote:
> 
> Jan Kiszka <jan.kiszka at siemens.com> writes:
> 
>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>
>>> This wiki page [1] contains a draft proposal about specifying which
>>> services from the current RTDM interface should be part of the Common
>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>
>>> - I suspect that only very few RTDM drivers are actually handling
>>>   requests from other kernel-based drivers in real world applications,
>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>
>>>   In other words, although I would agree that a few particular drivers
>>>   might want to export a couple of services to kernel-based clients in
>>>   order to provide them some sort of backchannel, it seems wrong to
>>>   require RTDM drivers to provide a kernel interface which would match
>>>   their user interface in the same terms. For these specific cases, ad
>>>   hoc code in these few drivers should be enough.
>>>
>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>   by in-tree RTDM drivers have not been tested for years, if ever.
>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>   vs user buffers, for only very few use cases.
>>>
>>>   For these reasons, I would suggest deprecating the kernel->kernel API
>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>   move.
>>
>> That's fine with me. The idea was once that something like bus drivers
>> would appear, but that never happened.
>>
>>>
>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>   lock must go. For SMP scalability reasons, this big lock was
>>>   eliminated from the EVL core, which means that all the attached
>>>   semantics will not hold there. Serializing access to shared resources
>>>   should be guaranteed by resource-specific locking, not by a giant
>>>   traffic light like the big lock implements.
>>
>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>> now make it look like we no longer need this. Maybe this is already the
>> case when using rtdm_waitqueue*, but let's convert everyone first.
> 
> Alternatively, in-tree v3 drivers could also keep relying on
> RTDM_EXECUTE_ATOMICALLY; the v4 implementation would be different for
> them. Bottom line is to exclude from the CXP the whole idea that we may
> schedule while holding a lock to protect against missed wake-ups, and
> more generally the very existence of any superlock which would cover
> everything from top to bottom when serializing. I agree that having v3
> converge towards the CXP would be better though.
> 

I'm fine with migrating to a new pattern first, then dropping the old RTDM
pattern and declaring the new one as the migration path. Same for below.

>>
>>>
>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>   condition on grabbing a mutex either means that:
>>>
>>
>> I think you are missing the use cases:
>>
>> mutex-lock-timed
>> ...
>> wait-event-timed
>> ...
>> mutex-unlock
>> (which goes along with timeout sequences)
>>
> 
> There are a couple of issues with such a use case: first, we should never
> ever sleep with a mutex held, as this would trigger SIGDEBUG if done from
> user space (a [binary] semaphore would at least prevent this problem), but
> more importantly, how would this pattern allow the event to be signaled
> given the waiter holds the lock the sender would need to acquire first?

Just look at the existing drivers for the use cases (which obviously
imply signalling without holding the mutex). The clash with user-level
debugging was indeed always there and is another reason to provide users
an alternative.
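
For illustration only, here is a minimal user-space sketch of that driver
pattern, using POSIX calls as a stand-in for the RTDM services
(pthread_mutex_timedlock for mutex-lock-timed, sem_timedwait for
wait-event-timed). The demo_chan structure and all demo_* helpers are
hypothetical names, not taken from any in-tree driver. The mutex serializes
whole request sequences between callers, while the completion side posts
the event without ever taking that mutex:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical channel state; names are illustrative only. */
struct demo_chan {
	pthread_mutex_t seq_lock; /* serializes complete request sequences */
	sem_t done;               /* posted on completion, no lock required */
};

/* Turn a relative timeout into an absolute CLOCK_REALTIME deadline. */
static void demo_deadline(struct timespec *ts, long timeout_ms)
{
	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_REALTIME, ts);
	ts->tv_sec += timeout_ms / 1000;
	ts->tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (ts->tv_nsec >= 1000000000L) {
		ts->tv_sec++;
		ts->tv_nsec -= 1000000000L;
	}
}

static int demo_chan_init(struct demo_chan *ch)
{
	int ret = pthread_mutex_init(&ch->seq_lock, NULL);

	if (ret)
		return ret;
	return sem_init(&ch->done, 0, 0) ? errno : 0;
}

/* One deadline covers both the lock acquisition and the event wait. */
static int demo_chan_transfer(struct demo_chan *ch, long timeout_ms)
{
	struct timespec deadline;
	int ret;

	demo_deadline(&deadline, timeout_ms);

	/* mutex-lock-timed: queue behind other callers, bounded in time */
	ret = pthread_mutex_timedlock(&ch->seq_lock, &deadline);
	if (ret)
		return ret;

	/* ... a real driver would start the transfer here ... */

	/* wait-event-timed: sleep until completion or deadline */
	while (sem_timedwait(&ch->done, &deadline) == -1) {
		if (errno != EINTR) {
			ret = errno;
			break;
		}
	}

	/* mutex-unlock */
	pthread_mutex_unlock(&ch->seq_lock);
	return ret;
}

/* Completion side (think IRQ handler): signals without seq_lock. */
static void demo_chan_complete(struct demo_chan *ch)
{
	sem_post(&ch->done);
}
```

Because demo_chan_complete() never takes seq_lock, a waiter sleeping with
the mutex held cannot block the signaller; only other callers queue up on
the mutex, bounded by their own deadline.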

> 
>> In fact, all that could be replaced with
>>
>> mutex-lock
>> ...
>> atomic-enter
>> mutex-unlock
>> wait-event-timed
>> mutex-lock
>> atomic-leave
>> ...
>> mutex-unlock
>>
>> but that is what we want to replace as well...
>>
> 
> Yep, that would work for v3, preventing wake-ups from being missed the
> same way.
> 

So let's define something that serves both versions and is thus
CXP-compatible. It makes no sense to define the CXP while ignoring the use
cases.
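
To make the idea concrete, here is one possible shape of such a pattern,
sketched with POSIX condition variables purely as a user-space analogy.
This is not a proposal for the actual RTDM/CXP interface, and all demo_*
names are hypothetical. The point is that the wait primitive itself
releases the lock and goes to sleep atomically, so a wake-up cannot be
missed and neither a big lock nor an explicit atomic section is needed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-device state; names are illustrative only. */
struct demo_dev {
	pthread_mutex_t lock;  /* resource-specific lock, not a global one */
	pthread_cond_t event;  /* signalled when data_ready turns true */
	bool data_ready;
};

static void demo_dev_init(struct demo_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	pthread_cond_init(&dev->event, NULL);
	dev->data_ready = false;
}

/* Wait up to timeout_ms for data; returns 0 or ETIMEDOUT. */
static int demo_dev_wait(struct demo_dev *dev, long timeout_ms)
{
	struct timespec deadline;
	int ret = 0;

	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&dev->lock);
	while (!dev->data_ready) {
		/* Unlocks, sleeps and relocks as one atomic step. */
		ret = pthread_cond_timedwait(&dev->event, &dev->lock,
					     &deadline);
		if (ret != 0)
			break;
	}
	if (dev->data_ready) {
		dev->data_ready = false; /* consume the event */
		ret = 0;
	}
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Sender side: updates the condition under the same lock. */
static void demo_dev_post(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->data_ready = true;
	pthread_cond_signal(&dev->event);
	pthread_mutex_unlock(&dev->lock);
}
```

In this sketch the sender can always take the lock, since the waiter never
sleeps while holding it, and the condition is re-checked under the lock
after every wake-up.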

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux


