[RFC] Why do xnlock_put and xnlock_clear_irqon release the xnlock directly?

Philippe Gerum rpm at xenomai.org
Fri Jul 10 11:05:21 CEST 2020


On 7/9/20 5:25 PM, alex_luca--- via Xenomai wrote:
> From: Alex_luca at 163.com
> 
> Hello, list,
> I have some doubts about the nklock (*struct xnlock*) operations.
> 
> Both *xnlock_get* and *xnlock_get_irqsave* are allowed to be called
> nested; they check whether the nklock has already been acquired by the
> current CPU. If so, they do not actually perform the "get", but return
> a value of 2.
> 
> But *xnlock_put*, which corresponds to *xnlock_get*, never performs
> that check and always releases the lock. This is different from
> *xnlock_put_irqrestore*.
> 
> What puzzles me is the following:
> {
>     ......
>     xnlock_get_irqsave(&nklock, s);
>     ......
>         xnlock_get(&nklock);
>         ......
>         xnlock_put(&nklock);
>         ......
>     xnlock_put_irqrestore(&nklock, s);
> }
> 
> First, *xnlock_get_irqsave* takes the nklock and the nested *xnlock_get*
> skips the acquisition; then *xnlock_put* is called and actually releases
> the nklock; a while later, *xnlock_put_irqrestore* is called on the same
> CPU and it also seems to release the nklock.
> 
> Can the above happen in Xenomai? I have never seen it happen.
> Or is *xnlock_get* guaranteed never to be called nested like this?
> 

This is right: there is such an asymmetry between the irq-aware form and
the plain one, which I believe was not intended but has not caused any
trouble yet, because the problematic nesting does not happen in the
current implementation. A kind of sleeping monster, if you will.
xnlock_get() is commonly used in IRQ handlers in order to skip the
redundant interrupt state fixup, so it would be the outer lock in that
case, not the inner one.
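
To make the asymmetry concrete, here is a simplified sketch of both put
paths; the type, the helpers and the bit-stealing trick below are
illustrative stand-ins, not the actual Cobalt code:

#include <linux/atomic.h>
#include <linux/irqflags.h>
#include <linux/processor.h>
#include <linux/smp.h>

struct sketch_lock {
	atomic_t owner;		/* CPU holding the lock, -1 when free */
};

/* Returns 2 when the current CPU already owns the lock (nested call). */
static int sketch_get(struct sketch_lock *lock)
{
	int cpu = raw_smp_processor_id();

	if (atomic_read(&lock->owner) == cpu)
		return 2;

	while (atomic_cmpxchg(&lock->owner, -1, cpu) != -1)
		cpu_relax();

	return 0;
}

/* The plain put releases unconditionally, even after a nested get. */
static void sketch_put(struct sketch_lock *lock)
{
	atomic_set(&lock->owner, -1);
}

/* The irq-aware pair records the nesting in the saved flags (assuming a
   spare bit is available there)... */
static unsigned long sketch_get_irqsave(struct sketch_lock *lock)
{
	unsigned long flags;

	local_irq_save(flags);
	if (sketch_get(lock) == 2)
		flags |= 2;	/* remember this was a nested acquisition */

	return flags;
}

/* ...so that only the outermost put actually drops the lock. */
static void sketch_put_irqrestore(struct sketch_lock *lock, unsigned long flags)
{
	if (!(flags & 2))
		sketch_put(lock);

	local_irq_restore(flags & ~2UL);
}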

On a general note, if you plan to remove the ugly big lock from the Xenomai
code base, I would suggest the following approach:

- rewrite the core synchronization support (xnsynch) so that testing for a
condition and sleeping on the related synchronization object when that
condition is unmet no longer have to be covered by the nklock in a single
atomic section (see the sketch after this list).

- remove all other occurrences of code which would reschedule (xnsched_run())
with the nklock held. This nesting has to go for the Xenomai core to be
SMP-scalable.
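
For the first point, this is a minimal sketch of the kind of wait path
such a change would allow, with a per-object lock instead of the nklock
and nothing held across the reschedule. All names are illustrative and
the regular Linux sleep/wake calls stand in for the core services; this
is not the xnsynch API:

#include <linux/list.h>
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct sync_object {
	raw_spinlock_t lock;		/* per-object lock, not the nklock */
	struct list_head waiters;
};

struct waiter {
	struct list_head next;
	struct task_struct *task;
};

/* Sleep until @cond_met(arg) holds. The waker must make the condition
   true and call wake_first() under the same per-object lock. */
static void sleep_on(struct sync_object *sobj, bool (*cond_met)(void *),
		     void *arg, struct waiter *w)
{
	unsigned long flags;

	INIT_LIST_HEAD(&w->next);
	w->task = current;

	for (;;) {
		raw_spin_lock_irqsave(&sobj->lock, flags);
		if (!list_empty(&w->next))	/* requeue after a spurious wakeup */
			list_del_init(&w->next);
		if (cond_met(arg)) {
			raw_spin_unlock_irqrestore(&sobj->lock, flags);
			return;
		}
		list_add_tail(&w->next, &sobj->waiters);
		set_current_state(TASK_UNINTERRUPTIBLE);
		raw_spin_unlock_irqrestore(&sobj->lock, flags);
		schedule();	/* reschedule with no lock held */
	}
}

/* Wake side, called with sobj->lock held: dequeue, then wake up. */
static void wake_first(struct sync_object *sobj)
{
	struct waiter *w;

	if (list_empty(&sobj->waiters))
		return;

	w = list_first_entry(&sobj->waiters, struct waiter, next);
	list_del_init(&w->next);
	wake_up_process(w->task);
}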

Then you would need to pick a workable locking pattern. One involves
introducing per-CPU runqueue locks covering the updates to the scheduler data,
and per-thread locks protecting updates to the thread state. The locking order
would be thread -> runqueue, with the runqueue lock held across context
switches and released in the scheduling tail by the incoming thread.
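
A rough sketch of that pattern follows; every name is hypothetical here,
with pick_next() and switch_context() standing in for the real pick-next
logic and low-level context switch:

#include <linux/spinlock.h>

struct thread;

struct runqueue {
	raw_spinlock_t lock;	/* serializes this CPU's scheduler data */
	struct thread *curr;	/* thread currently running on this CPU */
	/* ... run lists ... */
};

struct thread {
	raw_spinlock_t lock;	/* serializes this thread's state */
	struct runqueue *rq;	/* runqueue the thread is assigned to */
	int state;
};

/* Hypothetical helpers standing in for the real machinery. */
struct thread *pick_next(struct runqueue *rq);
void switch_context(struct thread *prev, struct thread *next);

/* Updating a thread's state follows the thread -> runqueue lock order. */
static void make_runnable(struct thread *t)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&t->lock, flags);
	raw_spin_lock(&t->rq->lock);

	t->state = 1 /* READY */;
	/* ... enqueue t into its runqueue's run list ... */

	raw_spin_unlock(&t->rq->lock);
	raw_spin_unlock_irqrestore(&t->lock, flags);
}

/* The runqueue lock is held across the context switch; the incoming
   thread releases it in the scheduling tail on this CPU. */
static void reschedule(struct runqueue *rq)
{
	struct thread *prev, *next;

	raw_spin_lock_irq(&rq->lock);

	prev = rq->curr;
	next = pick_next(rq);
	if (next != prev) {
		rq->curr = next;
		switch_context(prev, next); /* resumes in 'next', lock still held */
	}

	/* Scheduling tail, executed by the incoming thread. */
	raw_spin_unlock_irq(&rq->lock);
}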

This approach was followed in order to convert Cobalt into an SMP-scalable
core, which you can refer to as a working example if that helps:
https://git.evlproject.org/linux-evl.git/tree/kernel/evl/sched/core.c?h=evl/master

-- 
Philippe.


