[Xenomai] [PATCH] ipipe: Do not leak the preemption counter on interrupted migrations

Jan Kiszka jan.kiszka at siemens.com
Fri Dec 15 07:17:57 CET 2017


On 2017-12-14 20:57, Philippe Gerum wrote:
> On 12/14/2017 06:32 PM, Jan Kiszka wrote:
>> On 2017-12-14 18:13, Philippe Gerum wrote:
>>> On 12/14/2017 05:45 PM, Jan Kiszka wrote:
>>>> From: Jan Kiszka <jan.kiszka at siemens.com>
>>>>
>>>> If __schedule fails because of a pending signal, we have to re-enable
>>>> preemption before leaving __ipipe_migrate_head. Failing to do so causes
>>>> a leak and sooner or later a "scheduling while atomic" or similar
>>>> errors.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka at siemens.com>
>>>> ---
>>>>
>>>> Sending separately as it seems critical.
>>>>
>>>>  kernel/sched/core.c | 6 +++---
>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>>> index de3e8929a501..304399683e23 100644
>>>> --- a/kernel/sched/core.c
>>>> +++ b/kernel/sched/core.c
>>>> @@ -8672,10 +8672,10 @@ int __ipipe_migrate_head(void)
>>>>  	if (likely(__schedule(false)))
>>>>  		return 0;
>>>>  
>>>> -	if (signal_pending(p))
>>>> -		return -ERESTARTSYS;
>>>> +	BUG_ON(!signal_pending(p));
>>>>  
>>>> -	BUG();
>>>> +	preempt_enable();
>>>> +	return -ERESTARTSYS;
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(__ipipe_migrate_head);
>>>>  
>>>>
>>>
>>> Merged, thanks.
>>>
>>
>> BTW, we have another problem of this kind:
>>
>> With event tracing enabled, we leak preemption locks when a RT task is
>> suspended while holding that lock during a tracepoint execution. That
>> only affects x86 since 3.13 because that was the point where that arch
>> switched from per-task to per-cpu preemption counter.
>>
> 
> The gist of the matter seems to be that a preemption lock is taken from
> a context running over the head stage, isn't it?
> 

Right. We are able to re-use the upstream tracepoint infrastructure
(inlcude/linux/tracepoint.h) by allowing rcu_read_lock_sched_notrace /
rcu_read_unlock_sched_notrace over any context. We therefore have a
check that prevent actual Linux rescheduling when the unlock is called
over non-Linux contexts. Unfortunately, I missed the case of enforced
suspension back then when thinking through the boundary conditions.

At the same time, I'm not sure if suspending a thread in the middle of a
syscall may not have other surprising effects in overseen corner cases.
Therefore my idea of allowing remote suspension only at known safe points.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux



More information about the Xenomai mailing list