RT thread seems blocked

Bradley Valdenebro Peter (DC-AE/ESW52) Peter.BradleyValdenebro at boschrexroth.nl
Wed Mar 25 08:59:05 CET 2020


Thanks for your support Jan.

We have already tried ftrace, but we could not reproduce the issue while tracing, we think because of the overhead it introduces. We usually record the following events:

	sudo trace-cmd start -e ipi -e irq -e rcu -e workqueue_execute_end -e workqueue_execute_start -e mm_page_alloc -e mm_page_free \
			-e sched_migrate_task -e sched_switch -e cobalt_irq_exit -e cobalt_irq_entry -e cobalt_thread_migrate -e cobalt_switch_context \
			-e cobalt_head_sysexit -e cobalt_clock_entry -e cobalt_clock_exit

Would changing to trace-cmd record -e "cobalt*" -e sched -e signal make a difference?

We have also tried the ipipe tracer, but had no luck there either. We did not use xntrace_user_freeze(); instead, we monitored the traced IRQs-off times in /proc.
Using xntrace_user_freeze() might be worth a try.

Best regards,

Peter Bradley
-----Original Message-----
From: Jan Kiszka <jan.kiszka at siemens.com> 
Sent: 24 March 2020 12:53
To: Bradley Valdenebro Peter (DC-AE/ESW52) <Peter.BradleyValdenebro at boschrexroth.nl>; xenomai at xenomai.org
Subject: Re: RT thread seems blocked

On 24.03.20 12:43, Bradley Valdenebro Peter (DC-AE/ESW52) wrote:
> Hello,
> We ran some tests over the weekend and, although there are fewer occurrences, we do still see them.
> Besides the 1 ms stall, one interesting observation is that we sometimes see over 100 us between the IRQ and the start of the ISR.
> At this point we do not know how to proceed further. Any help or suggestion is greatly appreciated.

That is usually where you should start to look into tracing. Option A is event-level tracing via standard ftrace. Use

trace-cmd record -e "cobalt*" -e sched -e signal

to record such a trace. Ideally, instrument the point in your application where the latency is off via xnftrace_printf() (this will leave a mark in the recorded trace). Check via "trace-cmd report" whether the scheduling around that peak is what you would expect.
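
A rough C sketch of the instrumentation described above. It assumes that xnftrace_printf() is declared in <cobalt/tracing.h> and takes a printf-style format string. USE_XENOMAI_TRACE, trace_mark(), check_wakeup() and LATENCY_LIMIT_NS are hypothetical names made up for this illustration, and the 100 us limit is just the figure from the report. Without the build flag the marker falls back to stderr, so the sketch also builds on a plain host.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef USE_XENOMAI_TRACE            /* hypothetical build flag, not a Xenomai macro */
#include <cobalt/tracing.h>         /* assumed location of xnftrace_printf() */
#define trace_mark(...) ((void)xnftrace_printf(__VA_ARGS__))
#else
/* Stand-in so the sketch builds without Xenomai: print the marker to stderr. */
#define trace_mark(...) \
	((void)fprintf(stderr, __VA_ARGS__), (void)fputc('\n', stderr))
#endif

#define LATENCY_LIMIT_NS 100000LL   /* 100 us, illustrative threshold */

/*
 * Compare the expected and the actual wakeup time (both in nanoseconds
 * on the same clock). Returns 1 and leaves a marker when the wakeup was
 * later than the limit, 0 otherwise.
 */
static int check_wakeup(int64_t expected_ns, int64_t actual_ns)
{
	int64_t late_ns = actual_ns - expected_ns;

	if (late_ns <= LATENCY_LIMIT_NS)
		return 0;

	trace_mark("late wakeup: %lld ns", (long long)late_ns);
	return 1;
}
```

Calling check_wakeup() right after the periodic wait returns, with the programmed release time and the current time, should put the marker close to the peak when reading the output of "trace-cmd report".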

For digging even deeper, down to function level, there is the I-pipe tracer: specifically look into xntrace_user_freeze(), calling that from your application when you spot an excessive latency to stop the trace.
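
Freezing the function-level trace from the application could be sketched in C as follows. It assumes the signature int xntrace_user_freeze(unsigned long v, int once) from <cobalt/tracing.h>, where v is a free value stored with the trace and once asks for only the first freeze to take effect. USE_XENOMAI_TRACE, freeze_trace(), on_cycle(), worst_ns and freeze_calls are hypothetical names for this illustration. Without the build flag a counting stand-in replaces the real call.

```c
#include <stdint.h>

#ifdef USE_XENOMAI_TRACE            /* hypothetical build flag, not a Xenomai macro */
#include <cobalt/tracing.h>         /* assumed location of xntrace_user_freeze() */
/* once = 1: keep the first frozen trace instead of overwriting it later. */
#define freeze_trace(v) ((void)xntrace_user_freeze((v), 1))
#else
/* Stand-in so the sketch builds without Xenomai: only count the calls. */
static int freeze_calls;
static void freeze_trace(unsigned long v)
{
	(void)v;
	freeze_calls++;
}
#endif

static int64_t worst_ns;            /* worst latency seen so far */

/*
 * Call once per cycle with the measured latency. Freezes the trace and
 * returns 1 when the latency exceeds the limit, returns 0 otherwise.
 */
static int on_cycle(int64_t latency_ns, int64_t limit_ns)
{
	if (latency_ns > worst_ns)
		worst_ns = latency_ns;

	if (latency_ns <= limit_ns)
		return 0;

	/* Store the latency in microseconds alongside the frozen trace. */
	freeze_trace((unsigned long)(latency_ns / 1000));
	return 1;
}
```

The freeze is issued as early as possible after the excessive latency is detected, so that the interesting part of the trace is still in the buffer when it is read back from the tracer's /proc interface.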

If the result is not clear to you, share it to get a community inspection.


Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux