This is the mail archive of the ecos-discuss@sourceware.org mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: bugs in AT91 Ethernet driver


Major bug found in TX part - see below.

Andrew Lunn wrote:
>> During debugging, I see that 'priv->tx_busy' stays 'true', because the  
>> TX-Complete bit (COMP of TSR reg) is never set - I also see that the  
>> TX-Used Bit Read (UBR of TSR reg) is set, which is normal I think  
>> because the TX buffers have not yet been released (because COMP is not  
>> set..). But the TX-GO bit is 0, meaning that transmit is not busy.
>>
>> I added code to release the TX buffers and to clear 'priv->tx_busy' when  
>> TX-GO is read 0. Then at each ARP request, there is an ARP reply, but it  
>> does not get out.. (monitoring with Wireshark)
>> What's happening - it is like the EMAC TX is blocked, but why?
>
> It might be worth printing out the value of isr in at91_eth_isr. As
> the comment says, error conditions are not handled. Maybe you are
> getting a buffer under run?
Indeed Andrew, looking at the value of the ISR lead me to the cause of error,
and hence the solution. 
  The TX status of the ISR can be read in AT91_EMAC_TSR. The problem is bit 0
of this register: UBR = "Used Bit Read". 
  The datasheet (DS) is very unclear on this point (both the ATSAM9260 DS and
ATSAM7X DS have completely the same section about their EMAC). But after a full
day debugging, I'm quite sure I'm right.
  So, that Used Bit indicates if the concerning packet buffer (with sg_list[i]
data) has already been used by the EMAC. This Used Bit must be set to 0 by the
user SW, and is set to 1 by the EMAC. In case of the TX EMAC, the Used Bit is
set to 1 by the EMAC after transmit of the concerning buffer.
  The TX EMAC holds a private counter/pointer (read-only in TBQP) to the
transmit buffer descriptor list - the start location is the write-only
TBQP. (Only the RX part is explained in the DS). Normally this counter/pointer
cycles trough the circular transmit buffer descriptor list.
  Now, after transmit of the last buffer (that last buffer has to be marked by
the SW with a EOF flag) the TX EMAC increases its TX counter/pointer to already
point the next TX buffer. But that next TX buffer - that is free of course - is
marked by the SW as "Used" because the Used Bit is set. And therefore the UBR
error is flagged. And therefore the TX EMAC *resets its counter/pointer* !!
Because the SW does not reset its tbd_idx pointer, both pointers are out of
sync so all further transmits are blocked. 
  The major bug is that the if_at91.c SW rigorously sets the Used Bit (in the
buffer descriptor) of all free buffers to 1, meaning "Used". A the end of the
EMAC DS (36.4.1.3 for SAM9260 vG and 37.4.1.3 for SAM7 vG point 2): "Mark all
entries in this list as owned by EMAC, i.e. bit 31 of word 1 set to 0."
Therefore I changed at91_tb_init() and at91_reset_tbd() accordingly.
  Solving the "Used Bit" issue is not enough. All the TX errors UBR, RLE, BEX
and UND cause the counter/pointer to be reset. Therefore I changed the
deliver() function and added an extra argument to at91_reset_tbd() to reset the
SW tbd_idx pointer. I also added those TX errors to the IRQ IER and ISR.

In attachment a diff - it also contains my previous changes. I also added some
comments, and added '#ifdef CYGINT_IO_ETH_INT_SUPPORT_REQUIRED' to the start
and stop functions. And I added more debug printing - even in the ISR..
I tested this fix during 5 minutes - long enough to see that a ping finally
works... 
  There are of course still some problems - the ping reply is not synchronized
with the ping request: it runs 1 ping iteration behind - more of it in my next
mail. 

Kind regards,
Juergen

P.S.: The current code can only work if the driver is initialized after each
TX, or because each tx uses all buffers so that the pointer is wrapped.
I guess redboot must work this way??
Or am I still missing something?

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]