A potential Xenomai Mutex issue

DIAO, Hanson hanson.diao at siemens.com
Thu Aug 22 20:42:33 CEST 2019


Hi all,



I hope you are doing well. I am currently working on a critical deadlock issue with the Xenomai library (version 2.6.4). I found that the mutex lock count is not reliable after we call rt_mutex_release. I have printed the following messages to show the problem. I hope a developer can help me fix this issue. I know that this version is EOL, but we still use it. Thank you so much.



Issue 1:

Before Mutex Lock Mutext addr = 0xb7c059e8,count = 0, owner = 0
This message shows the status before rt_mutex_acquire.

After Mutex Lock Mutext addr = 0xb7c059e8,count = 1, owner = 2bd
This message shows the status after calling rt_mutex_acquire. Everything looks right for rt_mutex_acquire in this scenario.

Before Mutex unLock Mutext addr = 0xb7c059e8,count = 1, owner = 2bd
This message shows the status before rt_mutex_release.

After Mutex unLock Mutext addr = 0xb7c059e8,count = 1, owner = 0
This message shows the status after rt_mutex_release. The owner is cleared, but the lock count is still 1, which does not look correct after calling rt_mutex_release.



Issue 2:

When our task locks the mutex recursively, the lock count should be greater than 1, but it is still 1.



For issue 1, I suspect that something is wrong in the release function. I highlighted the code. I am not sure whether it is the root cause.



int rt_mutex_release(RT_MUTEX *mutex)
{
#ifdef CONFIG_XENO_FASTSYNCH
        unsigned long status;
        xnhandle_t cur;

        cur = xeno_get_current();
        if (cur == XN_NO_HANDLE)
                return -EPERM;

        status = xeno_get_current_mode();
        if (unlikely(status & XNOTHER))
                /* See rt_mutex_acquire_inner() */
                goto do_syscall;

        if (unlikely(xnsynch_fast_owner_check(mutex->fastlock, cur) != 0))
                return -EPERM;

        if (mutex->lockcnt > 1) {
                mutex->lockcnt--;
                return 0;
        }

        if (likely(xnsynch_fast_release(mutex->fastlock, cur)))
        {
                return 0;
        }
do_syscall:
#endif /* CONFIG_XENO_FASTSYNCH */

        return XENOMAI_SKINCALL1(__native_muxid, __native_mutex_release, mutex);
}







For the mutex lock function, I am confused by the comments that I highlighted below. I am not sure whether it supports recursive locking.

static int rt_mutex_acquire_inner(RT_MUTEX *mutex, RTIME timeout, xntmode_t mode)
{
        int err;
#ifdef CONFIG_XENO_FASTSYNCH
        unsigned long status;
        xnhandle_t cur;

        cur = xeno_get_current();
        if (cur == XN_NO_HANDLE)
                return -EPERM;

        /*
         * We track resource ownership for non real-time shadows in
         * order to handle the auto-relax feature, so we must always
         * obtain them via a syscall.
         */
        status = xeno_get_current_mode();
        if (unlikely(status & XNOTHER))
                goto do_syscall;

        if (likely(!(status & XNRELAX))) {
                err = xnsynch_fast_acquire(mutex->fastlock, cur);
                if (likely(!err)) {
                        mutex->lockcnt = 1;
                        return 0;
                }

                if (err == -EBUSY) {
                        if (mutex->lockcnt == UINT_MAX)
                                return -EAGAIN;

                        mutex->lockcnt++;
                        return 0;
                }

                if (timeout == TM_NONBLOCK && mode == XN_RELATIVE)
                        return -EWOULDBLOCK;
        } else if (xnsynch_fast_owner_check(mutex->fastlock, cur) == 0) {
                /*
                 * The application is buggy as it jumped to secondary mode
                 * while holding the mutex. Nevertheless, we have to keep the
                 * mutex state consistent.
                 *
                 * We make no efforts to migrate or warn here. There is
                 * XENO_DEBUG(SYNCH_RELAX) to catch such bugs.
                 */
                if (mutex->lockcnt == UINT_MAX)
                        return -EAGAIN;

                mutex->lockcnt++;
                return 0;
        }
do_syscall:
#endif /* CONFIG_XENO_FASTSYNCH */

        err = XENOMAI_SKINCALL3(__native_muxid,
                                __native_mutex_acquire, mutex, mode, &timeout);

#ifdef CONFIG_XENO_FASTSYNCH
        if (!err)
                mutex->lockcnt = 1;
#endif /* CONFIG_XENO_FASTSYNCH */

        return err;
}






