rt_pipe_write memory allocation bug - xenomai 3.x

Stéphane Ancelot sancelot at numalliance.com
Tue Jul 28 15:28:05 CEST 2020


On 27/07/2020 at 15:17, Jan Kiszka wrote:
> On 27.07.20 14:44, Stéphane Ancelot via Xenomai wrote:
>> Hi,
>>
>> We are using a pipe created with poolsize = 0, meaning all message
>> allocations for this pipe are performed on the Cobalt core heap.
>>
>> Unfortunately, when calling rt_pipe_write() while no user task is
>> consuming the pipe, we discovered after many rt_pipe_write() cycles
>> (at least 700000 in our process) that the Cobalt heap and the system
>> heap seem to be corrupted.
>>
>> This leads to system issues such as unexpected task crashes.
>>
>
> "3.x" implies both 3.1 and 3.0 are affected?
>
> Do you see a constantly growing use of system heap (leak)? If that is 
> not the case, we might have some wrap-around issue somewhere.
>
The version we are using is based on commit b3e18b6d of the master branch.

We don't see system memory increasing (using top).
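
Note that top only reports Linux memory, so it would not show growth of a Cobalt heap. A minimal sketch for sampling the Cobalt heap figures instead, assuming the kernel exposes them in /proc/xenomai/heap (the path and the helper names copy_stats and dump_heap_stats are assumptions for illustration, to be checked on the target):

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location of the Cobalt heap statistics; check it on the target. */
#define COBALT_HEAP_STATS "/proc/xenomai/heap"

/* Copy every byte of 'in' to 'out'; return the number of lines copied. */
int copy_stats(FILE *in, FILE *out)
{
	int c, lines = 0;

	assert(in != NULL && out != NULL);
	while ((c = fgetc(in)) != EOF) {
		fputc(c, out);
		if (c == '\n')
			lines++;
	}
	return lines;
}

/* Dump the statistics file at 'path'; return -1 if it cannot be opened. */
int dump_heap_stats(const char *path, FILE *out)
{
	FILE *in = fopen(path, "r");
	int lines;

	if (in == NULL)
		return -1;
	lines = copy_stats(in, out);
	fclose(in);
	return lines;
}
```

Calling dump_heap_stats(COBALT_HEAP_STATS, stdout) periodically from a regular Linux program while the test runs should show whether the free size of a heap keeps shrinking.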

Comparing it to the latest releases, we have not found any significant
differences in the xddp code.

Testing other releases, applications or kernel builds does not
guarantee that we can tell whether the problem has been solved, since
the memory layout needed to reproduce it changes.

For certification reasons, we can't validate the latest source code;
we can only cherry-pick a localised hotfix into the Xenomai code.


> Reproduction case would be nice.
>
It is not easy. The initial problem was reported by one of our users,
and we spent a lot of time managing to reproduce it in our context.

Some graphical user tasks were locking up or crashing after some days
of use in production.

At first, we went in the wrong direction trying to identify where it
came from.

In our system, we had to test each code commit going back in history in
order to isolate the problem, and to understand that it became visible
after about 700000 rt_pipe_write calls in our case.


As a unit test, we can provide the enclosed snippet. It is the
extracted code that causes the problem.


>>
>> Is there any way to work around this problem, such as knowing whether
>> the pipe has been opened before writing to it?
>>
> Regarding signalling that a non-RT client is connected: There is no
> mechanism for that so far. Could be added. Needs a proposal for a
> useful API.
>
> Jan
>
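Until such an API exists, one possible workaround is an application-level handshake. The sketch below is only an illustration under stated assumptions: the non-RT reader would write a control byte to the pipe after opening it and another one before closing it, and the RT side would track that state and skip rt_pipe_write() while no reader is announced. The control bytes and the names reader_gate, gate_update and gate_may_write are hypothetical and not part of Xenomai.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical control bytes sent by the non-RT reader over the pipe. */
#define CTL_READER_OPEN  'O'
#define CTL_READER_CLOSE 'C'

/* State kept by the RT writer. */
struct reader_gate {
	bool reader_present;
};

/*
 * Feed the result of a non-blocking read into the gate. In the RT loop
 * this would be, for instance:
 *   n = rt_pipe_read(&pipe, ctl, sizeof(ctl), TM_NONBLOCK);
 * A zero or negative n (nothing pending, or an error) leaves the state
 * unchanged.
 */
void gate_update(struct reader_gate *g, const char *ctl, ssize_t n)
{
	ssize_t i;

	assert(g != NULL);
	for (i = 0; i < n; i++) {
		if (ctl[i] == CTL_READER_OPEN)
			g->reader_present = true;
		else if (ctl[i] == CTL_READER_CLOSE)
			g->reader_present = false;
	}
}

/* True when the RT side may call rt_pipe_write(). */
bool gate_may_write(const struct reader_gate *g)
{
	assert(g != NULL);
	return g->reader_present;
}
```

This does not detect a reader that crashes without sending the close byte, so it only reduces the exposure; it does not replace a fix in the core.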
Stéphane ANCELOT
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strerror() */
#include <unistd.h>
#include <sys/mman.h>

#include <alchemy/task.h>
#include <alchemy/heap.h>
#include <alchemy/pipe.h>

#include <xenomai/init.h>
#include <xenomai/tunables.h>



RT_TASK TASK_AUTOMAT;   
RT_PIPE  pipe_fouettement; 



static int foo_tune(void)
{
        set_config_tunable(session_label, "numalliance/automat");
        set_config_tunable(mem_pool_size, 8000000);
        printf("mem pool size=%ld\n", (long)get_config_tunable(mem_pool_size));
        return 0; /* Success, otherwise -errno */
}

/*
 * CAUTION: we assume that all omitted handlers are zeroed
 * due to the static storage class. Make sure to initialize them
 * explicitly to NULL if the descriptor belongs to the .data
 * section instead.
 */
static struct setup_descriptor foo_setup = {
        .name = "foo",
        .tune = foo_tune,
        .parse_option = NULL,
        .help = NULL,
        .init = NULL,
        .options = NULL,
};

/* Register the setup descriptor. */
user_setup_call(foo_setup);


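/* Periodic writer: posts one byte to the pipe on every cycle. */
void automate(void *cookie)
{
	static long count = 0;
	static long errors = 0;
	char car = 'p';
	ssize_t ret;

	(void)cookie;
	printf("rt task running\n");

	for (;;) {
		/* 1 ms with the default nanosecond clock resolution */
		rt_task_sleep(1000000);
		ret = rt_pipe_write(&pipe_fouettement, &car, sizeof(car), P_NORMAL);
		if (ret < 0)
			errors++;
		count++;

		if (count % 10000 == 0) {
			printf("count %ld (write errors %ld)\n", count, errors);
			fflush(NULL);
		}
	}
}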

int main(int argc, char **argv)
{
	int ret;

	mlockall(MCL_CURRENT | MCL_FUTURE);

	/* Minor 10; poolsize 0 = messages come from the Cobalt core heap. */
	ret = rt_pipe_create(&pipe_fouettement, "FIFO_fouettemant", 10, 0);
	if (ret < 0) {
		printf("unable to create pipe_fouettemant\n[%s]\n", strerror(-ret));
		exit(1);
	}
	printf("pipe created\n");

	ret = rt_task_create(&TASK_AUTOMAT, "Nutomat",
			     100000, /* TASK_STKSZ: 0 = default */
			     97,     /* TASK_PRIO */
			     0);     /* TASK_MODE: 0 = no flags */
	if (ret != 0) {
		printf("%s:%d Task Automat creation failed ...\n", __FILE__, __LINE__);
		exit(1);
	}
	printf("task Automat created\n");

	ret = rt_task_start(&TASK_AUTOMAT, &automate, NULL);
	if (ret != 0) {
		printf("Task start failed ...\n");
		exit(1);
	}
	printf("task started\n");

	fflush(NULL);
	pause(); /* keep the process alive while the RT task runs */
	return 0;
}

