[Xenomai] Problem with ipipe locking PPIs and SGIs globally not per cpu

Philippe Gerum rpm at xenomai.org
Thu May 31 15:01:20 CEST 2018


On 05/31/2018 09:25 AM, Abel Vesa wrote:
> The problem is that the ARM TWD irq (for example) is masked on that
> specific core inside the GIC, but the ipipe is locking it globally.
> 
> According to GIC architecture specification
> 
> https://www.cl.cam.ac.uk/research/srg/han/ACS-P35/zynq/arm_gic_architecture_specification.pdf
> 
> Section 2.2.1
> 
>         ...
>         Interrupt numbers ID0-ID31 are used for interrupts that are private to a CPU interface. These interrupts are
>         banked in the Distributor.
> 
>         A banked interrupt is one where the Distributor can have multiple interrupts with the same ID. A banked
>         interrupt is identified uniquely by its ID number and its associated CPU interface number. Of the banked
>         interrupt IDs:
>                 — ID0-ID15 are used for SGIs
>                 — ID16-ID31 are used for PPIs
>                 In a multiprocessor system:
>                 — A PPI is forwarded to a particular CPU interface, and is private to that interface. In prioritizing
>                 interrupts for a CPU interface the Distributor does not consider PPIs that relate to other interfaces.
>         ...
> 
> In our case, we're talking about ID29 (mapped to Linux irq 16). In
> kernel/ipipe/core.c, in the function __ipipe_lock_irq, the control bits
> should be set per cpu rather than globally whenever the hwirq is lower than 32.
> 
> A suggestion would be to use NR_CPUS bits of ipd->irqs[irq].control to
> mark the lock on a per-cpu basis, and then, depending on the type of the irq,
> set the corresponding bit for each cpu concerned (or for all cpus, in the
> case of SPIs).

Using the control bits would limit the number of supported CPUs to
__WORD_SIZE - 3, which would be too restrictive on high-end x86 or
armv8-based boards.
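
For illustration only, here is roughly what such an encoding could look
like, and where the __WORD_SIZE - 3 figure comes from. This is a sketch,
not actual I-pipe code: the CTRL_* names are made up, and the three
reserved low bits are assumed to be the flag bits the pipeline already
consumes in the control word.

/*
 * Hypothetical layout: keep the low three flag bits as they are, and
 * let each bit above them mean "locked on CPU n". This is precisely
 * what caps NR_CPUS at BITS_PER_LONG - 3.
 */
#define CTRL_FLAG_BITS		3	/* flag bits already in use (assumed) */
#define CTRL_CPU_BIT(cpu)	(1UL << (CTRL_FLAG_BITS + (cpu)))

static inline void ctrl_lock_on_cpu(unsigned long *control, unsigned int cpu)
{
	*control |= CTRL_CPU_BIT(cpu);	/* PPI/SGI: lock this CPU only */
}

static inline int ctrl_locked_on_cpu(unsigned long control, unsigned int cpu)
{
	return !!(control & CTRL_CPU_BIT(cpu));
}

The pending/sync paths would then test ctrl_locked_on_cpu() against the
current CPU before logging or replaying the IRQ, which is what the
quoted suggestion goes on to propose next.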

> And then, when
> checking if locked (in __ipipe_set_irq_pending and __ipipe_do_sync_stage),
> include the cpu in the condition. That should be all.
> 
> I'll try to come up with a patch later. I just wanted to point out the problem
> here.
> 

Thanks for looking at this. Originally, this per-IRQ pipeline-specific
bit was aimed at addressing the following scenario:

        local_irq_disable()
        -> [IRQ receipt, logged for root stage]
           disable_irq(irq)
           ...
           local_irq_enable()
              -> [spurious IRQ replay upon log sync since IRQ
                  is deemed disabled]

It turns out that this is definitely not the best way to address it. It
suffers from multiple shortcomings, including SMP races and the need to
fix up each and every mask/unmask irqchip handler in use. The current
approach even abuses the internal genirq interface by handling a corner
case related to the (shutdown|disable)_irq ops from the mask/unmask
handlers.

There is ongoing work to overhaul the integration of the interrupt
pipeline with the genirq core, which addresses this issue along with
several others.

In the meantime, you could work around this issue by dropping the
__ipipe_lock/unlock dance entirely, and instead clearing the IRQ bit of
the disabled/shutdown interrupt in each per-CPU log from within
irq(_percpu)_disable, __irq_disable and irq_shutdown. That would still
not be perfect from an SMP synchronization point of view, but no worse
than the current __ipipe_lock/unlock_irq scheme in that regard, and
definitely better in all other aspects.
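
A minimal sketch of that idea follows, assuming a per-CPU bitmap of
logged IRQs; the ipipe_pending_log name and its layout are placeholders
for the real per-CPU log, not actual I-pipe fields.

#include <linux/bitops.h>
#include <linux/cpumask.h>
#include <linux/irq.h>
#include <linux/percpu.h>

/* Placeholder for the real per-CPU log of pending IRQs. */
static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(NR_IRQS)],
		      ipipe_pending_log);

/*
 * Hypothetical helper, called from irq(_percpu)_disable(),
 * __irq_disable() and irq_shutdown() instead of the
 * __ipipe_lock_irq() dance: drop the logged event for this IRQ
 * from every per-CPU log, so that a later log sync cannot replay
 * an IRQ which genirq now deems disabled or shut down.
 */
static void ipipe_clear_pending_all_cpus(unsigned int irq)
{
	unsigned int cpu;

	for_each_possible_cpu(cpu)
		clear_bit(irq, per_cpu(ipipe_pending_log, cpu));
}

This trades the lock bit for an O(NR_CPUS) walk in the disable/shutdown
paths, which are arguably slow paths compared to IRQ delivery anyway.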

-- 
Philippe.


