[Xenomai] Kernel freezes in __ipipe_sync_stage

Marco Tessore marco.tessore at axelsw.it
Tue Jun 24 18:41:28 CEST 2014


Hi,

On 20/06/2014 13:52, Gilles Chanteperdrix wrote:
> On 06/20/2014 11:11 AM, Marco Tessore wrote:
>> The kernel is version 2.6.31 for ARM architecture - specifically a
> Do you have the same problem with recent I-pipe patches, like one for
> the 3.8 or 3.10 kernel?
>

I managed to run some tests on a 3.10 kernel, but on another board with 
an imx28 CPU. As it turns out, that kernel freezes too, but I haven't 
debugged it with the JTAG debugger yet.

I do, however, have some information on the original problem, which is 
the one that worries me more.

In summary:
I have a board based on the imx25, with kernel 2.6.31, Xenomai 2.5.6 
and I-pipe patch 1.16-02.

Rarely, but often enough to be a problem, the kernel freezes at boot.
Thanks to a JTAG debugger I'm able to observe the kernel in the 
following situation: it is stuck in an infinite loop with this stack 
trace:
__ipipe_set_irqpending
xnintr_host_tick (__ipipe_propagate_irq)
xnintr_clock_handler
__ipipe_sync_stage    <- (1)
ipipe_suspend_domain
__ipipe_walk_pipeline
__ipipe_restore_pipeline_head
xnarch_next_tick_shot
clockevents_program_event
tick_dev_program_event
hrtimer_interrupt
mxc_interrupt
handle_IRQ_event
handle_level_irq
asm_do_IRQ
__ipipe_sync_stage <- (2)
ipipe_suspend_domain
__ipipe_walk_pipeline
__ipipe_restore_pipeline_head
xnpod_enable_timesource
xnpod_init
__native_skin_init
...
...

Specifically, the first call to __ipipe_sync_stage, the one marked 
(2), is working on a stage that I cannot determine; call it stage S1 
for convenience. I think it is the Linux (secondary) domain, but I'm 
not sure. This call invokes the interrupt handler of the system timer.
Continuing up the stack trace, there is a nested call to 
__ipipe_sync_stage, marked (1), which works on another stage, call it 
S2. This call in turn invokes a handler for the timer IRQ, which at 
some point calls __ipipe_propagate_irq, raising the pending flags for 
stage S1 again. As a result, the outer call to __ipipe_sync_stage (2) 
can never get out of its while loop. A deliberately simplified model 
of this cycle follows.
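
To make the cycle concrete, here is a minimal sketch in plain C. It 
is not the actual I-pipe code: the names (pending_s1, TIMER_IRQ, 
sync_stage_s1, ...) are made up, and the real code splits the mask 
into irqpend_himask/irqpend_lomask, but the shape of the loop is the 
same.

/*
 * Toy model: a handler that re-pends its own IRQ for the stage
 * currently being synced makes the sync loop spin forever.
 */
#define TIMER_IRQ 26UL               /* hypothetical IRQ number */

static unsigned long pending_s1;     /* pending mask of stage S1 */

static void propagate_to_s1(unsigned long irq)
{
        pending_s1 |= 1UL << irq;    /* role of __ipipe_propagate_irq */
}

static void timer_handler(unsigned long irq)
{
        /* The clock handler ends up re-pending the same IRQ ... */
        propagate_to_s1(irq);
}

static void sync_stage_s1(void)
{
        /* ... so this loop never sees pending_s1 reach zero. */
        while (pending_s1) {
                unsigned long irq =
                        (unsigned long)__builtin_ctzl(pending_s1);
                pending_s1 &= ~(1UL << irq);
                timer_handler(irq);  /* sets the bit again */
        }
}

int main(void)
{
        propagate_to_s1(TIMER_IRQ);
        sync_stage_s1();             /* never returns */
        return 0;
}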

I should add that I do not see the hardware interrupt for the timer in 
__ipipe_grab_IRQ. I have no idea how the cycle gets triggered, but once 
the kernel has locked up, it sits exclusively in the software infinite 
loop described above.


In the hope that you can help me understand what is going on, I would 
like to attempt a patch along these lines (a rough sketch follows the 
list):
- Store, for each nesting level of __ipipe_sync_stage, the IRQ number 
currently being handled and the stage on whose behalf it runs.
- Patch __ipipe_set_irqpending so that it does not set the flags for a 
pair (irq, stage) if that pair is already present at some level of the 
current stack. That is:
- if __ipipe_sync_stage is executing the handler for a stage, having 
already cleared the corresponding flags in irqpend_himask and 
irqpend_lomask, it should not let the handler raise the same flag for 
the same stage again.
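
Here is a rough sketch of the bookkeeping I have in mind. All the 
names (sync_frame, sync_push, already_in_sync_stack, ...) are 
hypothetical, this is not existing I-pipe API, and it ignores SMP and 
per-CPU issues for clarity:

struct ipipe_domain;                 /* real I-pipe type, opaque here */

#define MAX_SYNC_NESTING 8

struct sync_frame {
        unsigned int irq;            /* IRQ whose handler is running */
        struct ipipe_domain *ipd;    /* stage it is running for */
};

static struct sync_frame sync_stack[MAX_SYNC_NESTING];
static int sync_depth;

/* __ipipe_sync_stage would call these around each handler. */
static void sync_push(unsigned int irq, struct ipipe_domain *ipd)
{
        sync_stack[sync_depth].irq = irq;
        sync_stack[sync_depth].ipd = ipd;
        sync_depth++;
}

static void sync_pop(void)
{
        sync_depth--;
}

/*
 * __ipipe_set_irqpending would bail out early when the (irq, stage)
 * pair is already being synced somewhere up the call stack, so a
 * handler cannot re-pend the very interrupt it was invoked for.
 */
static int already_in_sync_stack(unsigned int irq,
                                 struct ipipe_domain *ipd)
{
        int i;

        for (i = 0; i < sync_depth; i++)
                if (sync_stack[i].irq == irq && sync_stack[i].ipd == ipd)
                        return 1;
        return 0;
}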

What do you think about this?

Thank you very much for any advice you can give me.

Sincerely
Marco Tessore
