[Xenomai] Isolation of CPU (isolcpus=1). Unexpected better performance when RT thread is on core0.
Yann le Chevoir
yann.le-chevoir at etu.esisar.grenoble-inp.fr
Mon Feb 5 16:50:00 CET 2018
Hello,
I am an engineering student, and I am trying to prove that a 4000Hz hard real-time
application can run on an ARM board rather than on a more powerful machine.
I work with a dual-core i.MX6 and Xenomai Cobalt 3.0.4, using the POSIX skin.
By the way, I first installed Xenomai Cobalt 3.0.5, but early experiments
revealed that the Alchemy API did not work properly (for example, the altency
test failed). That would need more investigation, but when I went back to the
previous version, it worked. I have not tested v3.0.6.
For now, my point is that I observe some unexpected behavior when
isolating cpu1, and perhaps you can explain some of it to me.
My application looks like this:
main(){
    create a POSIX Xenomai thread1, prio = 99, cpu = 1
    /* cpu1 is isolated, see below */
    create a POSIX Xenomai thread0, prio = 98, cpu = 0
    /* I observed that thread1 latency is better if an
     * RT thread remains on core 0 */
    start both threads
    while(1){
        print_stat();
    }
}

thread1(){
    struct timespec start, stop, next, interval = 250us;
    /* Initialization of the periodicity */
    clock_gettime(CLOCK_REALTIME, &next);
    next += interval;
    while(1){
        /* Release at the specified rate */
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
        /* Get time to check jitter and execution time */
        clock_gettime(CLOCK_REALTIME, &start);
        do_job();
        /* Get time to check execution time */
        clock_gettime(CLOCK_REALTIME, &stop);
        do_stat(); /* jitter = start - next; exec_time = stop - start */
        next += interval;
    }
}

thread0(){
    while(1){
        usleep(10000);
        /* Do nothing. I just observed that thread1 latency is better
         * if an RT thread remains on core 0. Do you have an
         * explanation for that too? */
    }
}
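
For reference, here is a minimal compilable sketch of how the threads are set
up (thread0 and the statistics printing are left out, and helpers such as
timespec_add_ns()/timespec_diff_ns() are placeholders, not my exact code):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define PERIOD_NS 250000L                  /* 250us period -> 4000Hz */

/* "next += interval" on a struct timespec. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec++;
    }
}

/* a - b in nanoseconds, for the jitter/execution-time statistics. */
static long timespec_diff_ns(const struct timespec *a,
                             const struct timespec *b)
{
    return (a->tv_sec - b->tv_sec) * 1000000000L + (a->tv_nsec - b->tv_nsec);
}

static void *thread1_fn(void *arg)
{
    struct timespec next, start, stop;

    (void)arg;
    clock_gettime(CLOCK_REALTIME, &next);
    timespec_add_ns(&next, PERIOD_NS);

    for (;;) {
        /* Release at the specified rate (absolute wakeup). */
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_REALTIME, &start);
        /* do_job(); */
        clock_gettime(CLOCK_REALTIME, &stop);
        long jitter  = timespec_diff_ns(&start, &next);
        long exec_ns = timespec_diff_ns(&stop, &start);
        (void)jitter; (void)exec_ns;  /* min/max accumulated in the real code */
        timespec_add_ns(&next, PERIOD_NS);
    }
    return NULL;
}

/* Create a SCHED_FIFO thread pinned to a single CPU. */
static int create_rt_thread(pthread_t *tid, void *(*fn)(void *),
                            int prio, int cpu)
{
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = prio };
    cpu_set_t cpus;

    pthread_attr_init(&attr);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);
    CPU_ZERO(&cpus);
    CPU_SET(cpu, &cpus);
    pthread_attr_setaffinity_np(&attr, sizeof(cpus), &cpus);
    return pthread_create(tid, &attr, fn, NULL);
}

int main(void)
{
    pthread_t t1;
    int err;

    mlockall(MCL_CURRENT | MCL_FUTURE);   /* avoid page faults in the RT path */

    err = create_rt_thread(&t1, thread1_fn, 99, 1);   /* cpu = 1 (isolated) */
    if (err) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(t1, NULL);
    return 0;
}

(Built with the flags from xeno-config --posix --cflags and --ldflags, so that
these POSIX calls go through the Cobalt library.)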
In fact, I expected that my thread would perform better if it were the only
one on core1, the idea being that all the Linux stuff stays on core0.
So I added the boot argument isolcpus=1.
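The isolation can be checked after boot with something like:

cat /proc/cmdline                        # contains isolcpus=1
cat /sys/devices/system/cpu/isolated     # prints "1" on kernels exposing it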
Note that these processes remain on core1; I think that is normal:
PID PSR RSS CLS RTPRIO CMD
11 1 0 FF 99 [migration/1]
12 1 0 TS - [ksoftirqd/1]
13 1 0 TS - [kworker/1:0]
14 1 0 TS - [kworker/1:0H]
Then, I use the script /usr/xenomai/bin/dohell to load Linux (core0)
for 90 seconds: ./dohell -m /mnt 90
And I launch my process for 60 seconds.
Here is the thread list:
PID TID PSR RSS CLS RTPRIO CMD
585 585 0 1952 TS - /bin/sh ./dohell -m /mnt 90
586 586 0 1588 TS - /bin/sh ./dohell -m /mnt 90
587 587 0 1640 TS - /bin/sh ./dohell -m /mnt 90
588 588 0 1604 TS - /bin/sh ./dohell -m /mnt 90
589 589 0 352 TS - dd if=/dev/zero of=/dev/null
590 590 0 1524 TS - /bin/sh ./dohell -m /mnt 90
591 591 0 344 TS - sleep 90
881 881 0 29332 TS - ./test_4000Hz
881 890 0 29332 TS - ./test_4000Hz
881 1013 1 29332 FF 99 ./test_4000Hz
881 1014 0 29332 FF 98 ./test_4000Hz
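(The listing above was obtained with something like
ps -eLo pid,tid,psr,rss,class,rtprio,cmd.)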
TID 881 is the main thread.
I am not sure why the TID 890 thread is there. Is it a Xenomai one (main)?
TID 1013 is the thread I am interested in.
TID 1014 is the thread I keep on core0 to get better latency on core1.
Here is the point:
With the configuration described above, I have the graph
"Core1.png" showing latency and execution time.
Note that I always plot thread1 statistics; it is the only thread
I am interested in.
Reminder of the configuration when plotting "Core1.png":
Core0: Linux stressed + main + thread0
Core1: thread1
Min execution time is 32us.
Max execution time is 82us.
I am a bit disappointed by such execution-time variations.
How can we explain that?
Then, trying permutations to understand these variations, I decided to put
thread1 on CPU0. Linux, main, thread0 and dohell continue doing their stuff.
Note that the isolcpus=1 argument is still there, so nothing runs on CPU1.
I am surprised to get better execution-time statistics. Is this a known
situation, and how can we explain it? See "Core0.png".
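In terms of the sketch above, this permutation only changes the CPU passed at
creation time, e.g.:

err = create_rt_thread(&t1, thread1_fn, 99, 0);   /* core0 instead of core1 */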
Reminder of the configuration when plotting "Core0.png":
Core0: Linux stressed + main + thread0 + thread1
Core1: -
Min execution time is 32us.
Max execution time is 65us.
Then, given these results, as I had the feeling that a mono-core processor
performs better than a dual-core one, I tried to remove the isolcpus=1
argument to prove the contrary.
Here is the configuration when plotting "NoIsolation.png":
Core0: Linux stressed + main + thread0
Core1: Linux stressed + thread1
As you can see, the graph looks like the first one, but the maximum execution
time is even worse: 94us.
Is there something I am doing wrong?
Thanks for your help, and thanks for your work on Xenomai :)
Yann
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NoIsolation.png
Type: image/png
Size: 15937 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Core0.png
Type: image/png
Size: 15550 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Core1.png
Type: image/png
Size: 16104 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment-0002.png>