This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Scheduler bug found and fixed


Hi Nick and others.

I found a bug in the mlqueue schedular. It goes something like this...

An alarm function calls cyg_thread_suspend() on the current thread, ie
the one that was running when the timer ISR went off.

cyg_thread_suspend correctly takes the thread off the run_queue[pri]
and clears the queue_map bit. The alarm func finishes, all the
remaining DSR are called. 

After this it decided to do a timeslice on the clock tick. The
timeslice function then rotates the current run queue. Unfortunatly,
the run queue is now empty, so it follows a null pointer into random
memory and makes that the head of the queue. It then forces a
reschedule. This works becasue the queue_map bit is not set on the now
corrupt run_queue. The correct thread gets to run and the system seems
happy.

Then something causes an add_thread onto the now corrupt run_queue. In
my cause a call the cyg_thread_kill(). add_thread checks the
consitancy of the run_queue. It finds the queue is not empty, but the
queue_map bit is not set and throws a CYG_ASSERT.

The fix is to change the timeslice function to check the queue it
wants to rotate actually has at least one thread on it. If not ask for
a reschedule.

Here is a patch for the fix and to put an assert into the rotate
function to check for rotating and empty queue.

        Andrew

Index: ChangeLog
===================================================================
RCS file: /cvs/ecos/ecos/packages/kernel/current/ChangeLog,v
retrieving revision 1.43
diff -c -r1.43 ChangeLog
*** ChangeLog	2000/09/19 05:53:59	1.43
--- ChangeLog	2000/09/20 12:01:30
***************
*** 1,3 ****
--- 1,8 ----
+ 2000-09-20  Andrew Lunn  <andrew.lunn@ascom.ch>
+ 
+ 	* src/sched/mlqueue.cxx (timeslice) don't rotate an empty queue. 
+ 	* src/sched/mlqueue.cxx (rotate) Assert queue is not empty
+ 
  2000-09-13  Jesper Skov  <jskov@redhat.com>
  
  	* tests/kexcept1.c (cause_exception): Use separate cause_fpe function.
Index: src/sched/mlqueue.cxx
===================================================================
RCS file: /cvs/ecos/ecos/packages/kernel/current/src/sched/mlqueue.cxx,v
retrieving revision 1.7
diff -c -r1.7 mlqueue.cxx
*** mlqueue.cxx	2000/09/11 02:43:00	1.7
--- mlqueue.cxx	2000/09/20 12:01:30
***************
*** 283,297 ****
      
          cyg_priority pri                               = thread->priority;
          Cyg_SchedulerThreadQueue_Implementation *queue = &sched->run_queue[pri];
! 
!         queue->rotate();
! 
!         if( queue->highpri() != thread )
              sched->need_reschedule = true;
! 
          timeslice_count = CYGNUM_KERNEL_SCHED_TIMESLICE_TICKS;
      }
- 
      
      CYG_ASSERT( queue_map & (1<<CYG_THREAD_MIN_PRIORITY), "Idle thread vanished!!!");
      CYG_ASSERT( !run_queue[CYG_THREAD_MIN_PRIORITY].empty(), "Idle thread vanished!!!");
--- 283,298 ----
      
          cyg_priority pri                               = thread->priority;
          Cyg_SchedulerThreadQueue_Implementation *queue = &sched->run_queue[pri];
!         if ( !queue->empty() ) {
!           queue->rotate();
!           
!           if( queue->highpri() != thread )
              sched->need_reschedule = true;
!         } else {
!             sched->need_reschedule = true;
!         }
          timeslice_count = CYGNUM_KERNEL_SCHED_TIMESLICE_TICKS;
      }
      
      CYG_ASSERT( queue_map & (1<<CYG_THREAD_MIN_PRIORITY), "Idle thread vanished!!!");
      CYG_ASSERT( !run_queue[CYG_THREAD_MIN_PRIORITY].empty(), "Idle thread vanished!!!");
***************
*** 656,662 ****
  Cyg_ThreadQueue_Implementation::rotate(void)
  {
      CYG_REPORT_FUNCTION();
!         
      queue = queue->next;
  
      CYG_REPORT_RETURN();
--- 657,664 ----
  Cyg_ThreadQueue_Implementation::rotate(void)
  {
      CYG_REPORT_FUNCTION();
!     CYG_ASSERT(queue != 0, "Rotating an empty queue");
!  
      queue = queue->next;
  
      CYG_REPORT_RETURN();



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]