rt_pipe_write memory allocation bug - xenomai 3.x

Stéphane Ancelot sancelot at numalliance.com
Thu Jul 30 10:43:28 CEST 2020


On 30/07/2020 at 00:08, Jan Kiszka wrote:
> On 28.07.20 15:28, Stéphane Ancelot wrote:
>>
>> On 27/07/2020 at 15:17, Jan Kiszka wrote:
>>> On 27.07.20 14:44, Stéphane Ancelot via Xenomai wrote:
>>>> Hi,
>>>>
>>>> We are using a pipe created with poolsize = 0, meaning all message
>>>> allocations for this pipe are performed on the Cobalt core heap.
>>>>
>>>> Unfortunately, when calling rt_pipe_write() while no user task is
>>>> consuming the pipe, we discovered that after many rt_pipe_write()
>>>> cycles (at least 700000 in our process) the Cobalt heap and the
>>>> system heap seem to be corrupted.
>>>>
>>>> This leads to system issues such as unexpected task crashes.
>>>>
>>>
>>> "3.x" implies both 3.1 and 3.0 are affected?
>>>
>>> Do you see a constantly growing use of system heap (leak)? If that 
>>> is not the case, we might have some wrap-around issue somewhere.
>>>
>> The version we are using is based on release b3e18b6d of the master
>> branch.
>>
>> We don't see system memory increasing (using top).
>>
>> Comparing it to the latest releases, we have not found any big
>> differences in the xddp code.
>>
>> Testing with other releases, applications or kernel builds does not
>> guarantee that we can tell whether it has been solved, since the
>> memory mapping needed to reproduce it changes.
>>
>> For certification reasons, we can't validate the latest source code;
>> we can only cherry-pick a localised hotfix into the Xenomai code.
>>
>>
>>> Reproduction case would be nice.
>>>
>> It is not easy. The initial problem was reported by one of our
>> users, and we spent a lot of time managing to reproduce it in our
>> context.
>>
>> Some graphics user tasks were locking up or crashing after some days
>> of usage and production.
>>
>> At first, we went in the wrong direction while trying to identify
>> where it could come from.
>>
>> In our system, we had to go back and test each code commit in order
>> to isolate the problem, and to understand that it became visible
>> after almost 700000 rt_pipe_write() calls in our case.
>>
>>
>> As a unit test, we can provide the enclosed snippet. It is the
>> extracted code that causes the problem.
>>
>
> Under which condition does that test_pipe.cpp cause the issue? I've 
> given it a quick try, and as it's late, I disabled the delay in the 
> loop. That so far did not trigger an issue. Is the delay important?
>
The delay is not important; what matters is the number of
rt_pipe_write() calls that are not consumed.

It is not easy to identify the memory leak in the heap.

Otherwise, use a system with low memory.

I have not tried it, but I suppose that once system memory fills up, at
some point it will crash by overwriting important system data.



Jan

>

