rt_pipe_write memory allocation bug - xenomai 3.x
jan.kiszka at siemens.com
Thu Jul 30 00:08:18 CEST 2020
On 28.07.20 15:28, Stéphane Ancelot wrote:
> On 27/07/2020 at 15:17, Jan Kiszka wrote:
>> On 27.07.20 14:44, Stéphane Ancelot via Xenomai wrote:
>>> We are using a pipe created with poolsize = 0, meaning all message
>>> allocations for this pipe are performed on the Cobalt core heap.
>>> Unfortunately, when calling rt_pipe_write() while no user task is
>>> consuming the pipe, we discovered that after many rt_pipe_write()
>>> cycles (at least 700000 in our process) the Cobalt heap and the
>>> system heap seem to become corrupted.
>>> This leads to system issues such as unexpected task crashes.
>> "3.x" implies both 3.1 and 3.0 are affected?
>> Do you see a constantly growing use of system heap (leak)? If that is
>> not the case, we might have some wrap-around issue somewhere.
> The version we are using is based on release b3e18b6d of the master
> branch.
> We don't see system memory usage increasing (using top).
> Comparing it to the latest releases, we have not found any big
> differences in the xddp code.
> Using other releases, applications, or a differently compiled kernel
> does not guarantee that we can tell whether it has been solved, since
> the memory mapping needed to reproduce it changes.
> For certification reasons, we can't validate the latest source code;
> we can only cherry-pick a localised hotfix into the Xenomai code.
>> Reproduction case would be nice.
> It is not easy. The initial problem was reported by one of our users,
> and we spent a lot of time before we managed to reproduce it in our
> context.
> Some graphical user tasks were locking up or crashing after some days
> of usage and production.
> At first, we went in the wrong direction while trying to identify
> where it came from.
> In our system, we had to go back and test each code commit in order to
> isolate the problem and understand that it became visible after almost
> 700000 rt_pipe_write calls in our case.
> As a unit test, we can provide the enclosed snippet. It is the
> extracted code that causes the problem.
Under which condition does that test_pipe.cpp cause the issue? I've
given it a quick try, and as it's late, I disabled the delay in the
loop. That so far did not trigger an issue. Is the delay important?
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux