[RFC][PATCH 4.19] x86/ipipe: Protect TLB flushing against context switch by head domain
rpm at xenomai.org
Thu Mar 12 17:35:52 CET 2020
On 3/12/20 5:12 PM, Jan Kiszka wrote:
> On 12.03.20 16:59, Philippe Gerum wrote:
>> On 3/12/20 2:48 PM, Jan Kiszka wrote:
>>> From: Jan Kiszka <jan.kiszka at siemens.com>
>>> A Xenomai application is very rarely triggering
>>> WARNING: CPU: 0 PID: 1997 at arch/x86/mm/tlb.c:560 [...]
>>> (local_tlb_gen > mm_tlb_gen)
>>> This could be triggered by loaded_mm and loaded_mm_asid becoming out of
>>> sync when flush_tlb_func_common is interrupted by the head domain to
>>> switch a real-time task right between the retrieval of both values, or
>>> maybe even after that but before writing mm_tlb_gen back to
>>> Avoid that case by making the retrieval atomic while keeping the TLB
>>> flush interruptible. Now, there could still be an interrupt during the
>>> flush. To avoid writing back to the wrong context, we first atomically
>>> check after the flush if nothing changed and only write if that is the
>>> case. That may mean another TLB flush is triggered needlessly, but
>>> that's rare and acceptable.
>>> Signed-off-by: Jan Kiszka <jan.kiszka at siemens.com>
>>> Due to the rare nature of this issue, we are not yet confident to have
>>> truly fixed it this way.
>>> Philippe, I'm seeing some similar attempt in dovetail but it appears to
>>> me it's missing some cases.
>> Not "some cases", but the last one in your patch specifically if I read
>> it correctly, which I assumed was not applicable, at least not the way I
>> read your change, when I worked on this a year ago. This explains why
>> that particular change is not present in the commit (3aa2fc2fb4c) you
>> seem to have cherry picked from dovetail for the 5.x kernel series. This
>> said, these are tricky issues, so as you hinted in your commit log,
>> there is likely room for improvement in any case, and I may have
>> overlooked things.
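
The scheme described in the quoted patch description (snapshot both values in one uninterruptible step, keep the flush itself interruptible, then re-check before writing back) might be sketched in plain userspace C as follows. This is only an illustration under stated assumptions, not the actual patch or kernel code: `struct mm`, `struct cpu_tlb_state`, `ctx_tlb_gen`, `hard_irq_disable()`/`hard_irq_enable()`, `flush_tlb_sketch()` and the `during_flush` hook are all hypothetical stand-ins introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ASIDS 6  /* assumed size, for illustration only */

/* Hypothetical stand-in for an address space with a TLB generation. */
struct mm { uint64_t tlb_gen; };

/* Hypothetical stand-in for the per-CPU TLB state. */
struct cpu_tlb_state {
    struct mm *loaded_mm;
    unsigned int loaded_mm_asid;
    uint64_t ctx_tlb_gen[NR_ASIDS]; /* generation last flushed, per ASID */
};

/* Hypothetical stubs: in a real pipeline these would mask the head domain. */
static int hard_irqs_off;
static void hard_irq_disable(void) { hard_irqs_off++; }
static void hard_irq_enable(void)  { hard_irqs_off--; }

/* Hook simulating the head domain preempting us during the flush. */
typedef void (*preempt_hook)(struct cpu_tlb_state *);

/* Returns true if the generation was written back, false if skipped. */
static bool flush_tlb_sketch(struct cpu_tlb_state *st, preempt_hook during_flush)
{
    struct mm *mm;
    unsigned int asid;
    uint64_t mm_gen;
    bool wrote = false;

    /* Step 1: read both values as one unit so they cannot go out of sync. */
    hard_irq_disable();
    mm = st->loaded_mm;
    asid = st->loaded_mm_asid;
    mm_gen = mm->tlb_gen;
    hard_irq_enable();
    assert(asid < NR_ASIDS);

    /* Step 2: the flush itself stays interruptible. */
    if (during_flush)
        during_flush(st);

    /* Step 3: write back only if the context is still the one we flushed. */
    hard_irq_disable();
    if (st->loaded_mm == mm && st->loaded_mm_asid == asid) {
        st->ctx_tlb_gen[asid] = mm_gen;
        wrote = true;
    }
    hard_irq_enable();

    return wrote;
}

/* Example preemption: a real-time task with another mm gets switched in. */
static struct mm other_mm = { .tlb_gen = 7 };
static void switch_to_other(struct cpu_tlb_state *st)
{
    st->loaded_mm = &other_mm;
    st->loaded_mm_asid = 1;
}
```

Calling `flush_tlb_sketch(&st, NULL)` records the generation for the current ASID, while `flush_tlb_sketch(&st, switch_to_other)` skips the write-back and leaves both per-ASID entries untouched. The skipped case corresponds to the "another TLB flush is triggered needlessly" trade-off mentioned in the commit log.
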
>>> Too bad that development was forking here
>>> and information isn't flowing smoothly yet.
>> You just demonstrated that the information is there, and that anyone can
>> access it freely by looking at the EVL development tree. I'm sorry to
> It's there but it now requires polling to extract it.
Yes, because I'm done with maintaining the I-pipe, and reviewing things
with backports to I-pipe in mind would amount to maintaining this code.
> I suspect I will
> find more interesting changes once reviewing the dovetail queue
> completely (I already found the reverse: KVM was broken in dovetail due
> to incomplete forward porting; will fix when I come along the code).
>> hear that forking my own code for the most part in order to find a
>> better approach for others to benefit from in the long run can be a
>> problem. I did not find any other way to go back to the drawing board as
>> required by the technical goals I'm pursuing with EVL, which differ from
> I've seen this with other spin-offs/rewrites/etc. of the ipipe-like
> kernel queue a couple of times: Even if colors and edges look
> different, the core concept remains the same. Thus you also share the
> conceptual problems - and often also the solutions. Doing this multiple
> times is just wasted time. That's why we really need to get Xenomai
> based on dovetail for upcoming kernels so that test results and fixes
> flow in both directions automatically again.
I understand. I'm certainly the closest illustration of a rookie
engineer, I'm not as experienced as you are, hence my stubborn tendency
to do wasteful things at times. Last time was around 2001 when starting
Xenomai, then a year later implementing the first version of the I-pipe.
I should know better, but I can't help it, always looking for fun stuff
in coding while I should be doing boring things instead, like any
responsible grown-up should do.