Fwd: Debugging system freeze, SIGXCPU

Ari Mozes arimozes at neocisinc.com
Mon Feb 25 18:59:14 CET 2019


On Mon, Feb 25, 2019 at 12:28 PM Jan Kiszka <jan.kiszka at siemens.com> wrote:
> > On Mon, Feb 25, 2019 at 11:08 AM Philippe Gerum <rpm at xenomai.org> wrote:
> >>
> >> On 2/25/19 2:32 PM, Ari Mozes via Xenomai wrote:
> >>> Resending this question with testcase.
> >>> Can someone give the testcase a try to see if it reproduces the problem I
> >>> am seeing?  Is more information needed?
> >>> It takes a couple of minutes before I see the issue occur.
> >>
> >> The random lockup is due to std::chrono::high_resolution_clock::now()
> >> invoking the vDSO form of clock_gettime().
> >>
> >> SIGXCPU aka Xenomai's SIGDEBUG may be sent by the core in various
> >> situations, but since the code does not set the T_WARNSW for any task,
> >> the only explanation is receiving a Xenomai watchdog notification. See
> >> the help information about CONFIG_XENO_OPT_WATCHDOG in your kernel
> >> configuration.
> >>
> >> After a few secs spinning in the vDSO code which may not be called from
> >> real-time context, the Xenomai core pulls the brake and sends SIGXCPU to
> >> the offending process, unless the system locks up before the watchdog
> >> could even trigger.
> >>
> >> Solution: use clock_gettime(CLOCK_HOST_REALTIME) instead of
> >> std::chrono::high_resolution_clock::now() for getting timestamps.
> >>
> >> A related discussion is available at this URL:
> >> https://www.xenomai.org/pipermail/xenomai/2018-December/040133.html
> >>
> >> --
> >> Philippe.
> >
> >
> >
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux
>
> On 25.02.19 17:57, Ari Mozes via Xenomai wrote:
> > Philippe,
> > Thank you for the information and the URL.
> > I read through the thread, and I agree with comments that it would be
> > helpful to be able to identify/blacklist/etc problematic calls when
> > porting over existing code to a true RT scenario.  In our case the
> > original code was written with "RT-like" behavior in mind, but as
> > there is a lot of code already in place, approaches to identify
> > existing problematic calls would be helpful.
>
> You could wrap such calls like we do for malloc/free in libcobalt. But wrapping
> only works if the direct caller is processed that way - and is not some
> pre-built external library.

Sure - makes sense - IMO just knowing which calls are potentially problematic is
the difficult part here.  I expect I will just continue to stumble through them
and learn more as I go.
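
For the timestamp case specifically, here is a minimal sketch of the
replacement Philippe suggests: calling clock_gettime() directly rather than
std::chrono::high_resolution_clock::now(). The helper name timestamp_ns() and
the constant kTimestampClock are hypothetical, and the fallback to
CLOCK_REALTIME is only my assumption so that the snippet also compiles on a
machine without the Xenomai headers. It is not meant as the definitive way to
do this.

```cpp
#include <cstdint>
#include <time.h>

// CLOCK_HOST_REALTIME comes from the Xenomai/Cobalt headers. Assumption: when
// it is not defined (plain Linux build), fall back to CLOCK_REALTIME so the
// sketch still compiles. That fallback is NOT safe for real-time context.
#ifdef CLOCK_HOST_REALTIME
static const clockid_t kTimestampClock = CLOCK_HOST_REALTIME;
#else
static const clockid_t kTimestampClock = CLOCK_REALTIME;
#endif

// Hypothetical helper: nanoseconds since the epoch, or -1 if the call fails.
inline std::int64_t timestamp_ns()
{
    struct timespec ts;
    if (clock_gettime(kTimestampClock, &ts) != 0)
        return -1;
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

The time-sensitive loop would then take two timestamp_ns() readings and
subtract them where it previously subtracted two
high_resolution_clock::now() values. This only helps if the program is
built with the Xenomai compile and link flags, so that clock_gettime()
resolves to the Cobalt implementation instead of the glibc/vDSO one.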

>
> Therefore: Do not use libraries that you didn't validate from within
> time-sensitive code paths. Also libstdc++ may contain more surprises.
>
> Jan
>
> > I will continue to familiarize myself with the nitty-gritty details,
> > but anything that makes the process easier is always welcome :-)
> >
> > Ari
> >
> >


-- 

Ari Mozes
Staff Software Engineer

Neocis, Inc.

Mobile: 781.266.6553


