This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.



RE: Enabling -O2 option of GCC]


Hi,
	I have successfully removed the memory copy on the transmit
side, from an mbuf to the network buffer. This is very easy: in the
low level driver's "HRDWR_send" routine, the driver needs to iterate
through the "sg_list" array
	- setting up the buffer descriptors to use the "sg_list" "buf"
field directly as the value written to the "data buffer" field of the
buffer descriptor.
	- setting the "L" (Last in frame) and "TC" (Transmit CRC) bits
of the buffer descriptor status field only in the last BD used.
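
	The two steps above can be sketched roughly as follows. This is
only an illustration: the "tx_bd" layout, the BD_TX_* bit values and the
tx_fill_bds() helper are made up for the example (take the real ones
from the controller manual), "sg_entry" just mirrors the buf/len pair of
an sg_list entry, and ring wrap-around is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define BD_TX_READY 0x8000u  /* buffer handed to the controller */
#define BD_TX_WRAP  0x2000u  /* last descriptor in the ring */
#define BD_TX_LAST  0x0800u  /* "L": last buffer in frame */
#define BD_TX_CRC   0x0400u  /* "TC": transmit CRC */

/* Hypothetical buffer descriptor layout. */
struct tx_bd {
    volatile uint16_t  status;
    volatile uint16_t  length;
    volatile uintptr_t data;    /* the "data buffer" field */
};

/* Stand-in for one sg_list entry (buffer address and length). */
struct sg_entry {
    uintptr_t buf;
    uintptr_t len;
};

/* Point one BD at each sg_list entry, with no copy.
   Returns the number of BDs used. */
static int tx_fill_bds(struct tx_bd *bd, const struct sg_entry *sg_list,
                       int sg_len)
{
    int i;

    assert(sg_len > 0);

    for (i = 0; i < sg_len; i++) {
        uint16_t st = bd[i].status & BD_TX_WRAP;  /* keep the wrap bit */

        if (i == sg_len - 1)
            st |= BD_TX_LAST | BD_TX_CRC;         /* last BD only */
        bd[i].data   = sg_list[i].buf;            /* use mbuf data directly */
        bd[i].length = (uint16_t)sg_list[i].len;
        bd[i].status = st;
    }

    /* Hand the BDs over last-to-first, so the first BD only becomes
       ready once the whole chain is set up. */
    for (i = sg_len - 1; i >= 0; i--)
        bd[i].status |= BD_TX_READY;

    return sg_len;
}
```

	Setting the ready bit in reverse order is a choice made for this
sketch, not something taken from my driver; the point is only that each
BD's data pointer comes straight from the sg_list.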

	My (limited) testing to date seems to show that this works fine
- since the network stack does not free the mbuf until eth_drv_tx_done()
has been called back.

	I've not tried removing the copy on the receive side yet. It's
on my list of enhancements/performance things to do once I've got the
application basically working. I've a couple of possible schemes in mind
here (I don't know if any will work! Comments welcome). All involve
allocating mbufs (1522 bytes in length) in the low level driver and
setting the buffer descriptors to use these directly, so any received
data is already in an mbuf. Then:

	a)	When the low level driver calls eth_drv_recv,
eth_drv_recv allocates an mbuf which it passes back to the low level
driver via the "HRDWR_recv" call. Actually it's not the mbuf that's
passed but an sg_list that points to the data section of the mbuf. If I
add a pointer to the allocated mbuf to the parameters of "HRDWR_recv",
the low level driver can "swap" the mbufs over: "HRDWR_recv" puts the
passed in mbuf into the buffer descriptor to replace the one it hands
to the network stack. "HRDWR_recv" would still have to copy the 14 byte
Ethernet header into the first sg_list buffer. I'd also need to make
sure that the mbuf allocated by eth_drv_recv was 1522 bytes and not the
length of the frame just received.

Or	b)	Basically do the same as a), but rather than modifying
eth_drv_recv and the calling parameters to "HRDWR_recv" (I don't want to
break my other drivers or, more importantly, deviate my eCos source from
the main line code base!) don't call eth_drv_recv from the low level
driver at all! Move the functionality of eth_drv_recv into the low level
driver, so the low level driver:
		checks the interface is up
		updates the interface stats (if_ipackets++)
		adjusts the received frame's mbuf so that the data
pointer points after the ethernet header (ie advance it by 14 bytes)
		calls ether_input() with a pointer to the ether header
and the mbuf
		allocates a replacement mbuf for the one just used and
gives it to the buffer descriptor.
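
	A rough model of the steps in b), just to make the buffer
hand-over concrete. Nothing here is real eCos or BSD code: "pkt_buf"
stands in for an mbuf, "iface" for the interface structure,
stack_input() for ether_input(), and pkt_alloc() for the mbuf
allocator. The sketch also allocates the replacement buffer before
handing the frame up (a different order from the list above), so that an
allocation failure drops the frame instead of leaving the BD empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ETH_HDR_LEN 14
#define RX_BUF_LEN  1522
#define IF_UP       0x1u          /* stand-in for the "interface up" flag */

struct pkt_buf {                  /* plays the role of an mbuf */
    uint8_t *data;                /* current data pointer */
    size_t   len;                 /* bytes valid from data */
    uint8_t  storage[RX_BUF_LEN];
};

struct iface {
    unsigned      flags;
    unsigned long ipackets;
};

struct rx_bd {
    struct pkt_buf *buf;          /* buffer the controller receives into */
};

static struct pkt_buf *pkt_alloc(void)
{
    struct pkt_buf *p = malloc(sizeof *p);
    if (p) {
        p->data = p->storage;
        p->len  = 0;
    }
    return p;
}

/* Stand-in for ether_input(): just remembers what it was handed. */
static struct pkt_buf *delivered;
static const uint8_t  *delivered_hdr;

static void stack_input(struct iface *ifp, const uint8_t *hdr,
                        struct pkt_buf *p)
{
    (void)ifp;
    delivered_hdr = hdr;
    delivered     = p;
}

/* Handle one received frame of frame_len bytes sitting in bd->buf.
   Returns 1 if the frame was passed up, 0 if it was dropped. */
static int rx_frame(struct iface *ifp, struct rx_bd *bd, size_t frame_len)
{
    struct pkt_buf *full = bd->buf;
    struct pkt_buf *fresh;

    assert(full != NULL);

    if (!(ifp->flags & IF_UP) || frame_len < ETH_HDR_LEN)
        return 0;                 /* drop; the BD keeps its buffer */

    fresh = pkt_alloc();
    if (fresh == NULL)
        return 0;                 /* drop; the BD keeps its buffer */
    bd->buf = fresh;              /* replacement goes to the BD */

    ifp->ipackets++;
    full->data = full->storage + ETH_HDR_LEN;   /* skip ethernet header */
    full->len  = frame_len - ETH_HDR_LEN;
    stack_input(ifp, full->storage, full);      /* header + payload buffer */
    return 1;
}
```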

	c)	I have another network interface which uses a FIFO, so I
have to perform a memcpy from it into an mbuf. This is a large overhead
when running in promiscuous mode and/or performing bridging. So I'm also
considering making a change like b), so the low level driver does not
call eth_drv_recv but calls ether_input directly, WITHOUT an mbuf, with
just the ethernet header. I'll need to modify ether_input (and
bridge_input) so the remainder of the Ethernet frame is only read from
the FIFO if ether_input or bridge_input decide that the frame needs to
be processed. This change is a bit more complex and involves changes to
the common network stack, not just the low level drivers, so I'm
reluctant to do this. But it would make a huge difference in promiscuous
mode.
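
	The idea in c) is "read the header, decide, then read or discard
the rest". A small model of that, under assumptions: "rx_fifo" is a
software stand-in for the hardware FIFO, fifo_discard() stands for
whatever the controller offers to throw away the rest of a frame, and
frame_wanted() is a simple placeholder for the decision that
ether_input or bridge_input would really make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN  14
#define ETH_ADDR_LEN 6

/* Software model of a receive FIFO holding one frame (hypothetical). */
struct rx_fifo {
    const uint8_t *src;
    size_t         pos;
    size_t         frame_len;
};

static void fifo_read(struct rx_fifo *f, uint8_t *dst, size_t n)
{
    memcpy(dst, f->src + f->pos, n);
    f->pos += n;
}

/* Throw away whatever is left of the current frame. */
static void fifo_discard(struct rx_fifo *f)
{
    f->pos = f->frame_len;
}

/* Placeholder filter: accept broadcast or our own station address. */
static int frame_wanted(const uint8_t *hdr, const uint8_t *station)
{
    static const uint8_t bcast[ETH_ADDR_LEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    return memcmp(hdr, bcast, ETH_ADDR_LEN) == 0 ||
           memcmp(hdr, station, ETH_ADDR_LEN) == 0;
}

/* Read the 14 byte header first; copy the payload only if the frame
   is wanted. Returns the payload bytes copied, or 0 if discarded. */
static size_t rx_from_fifo(struct rx_fifo *f, const uint8_t *station,
                           uint8_t *hdr, uint8_t *payload)
{
    size_t n;

    assert(f->frame_len > ETH_HDR_LEN);

    fifo_read(f, hdr, ETH_HDR_LEN);
    if (!frame_wanted(hdr, station)) {
        fifo_discard(f);          /* no payload copy for unwanted frames */
        return 0;
    }
    n = f->frame_len - ETH_HDR_LEN;
    fifo_read(f, payload, n);
    return n;
}
```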

	There are also a couple of bcmp (memcmp) calls in ether_input
comparing the ethernet destination address with the broadcast and
station MAC addresses. Many Ethernet controllers provide this
information automatically, so it should be possible for the low level
driver to set the appropriate M_BCAST or M_MCAST flag in the mbuf
itself. It might also be worth defining an M_STATION flag to indicate
that the destination address matched the station address.
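
	As a sketch of what the driver side might look like: the RXS_*
status bits below are invented (each controller has its own names and
positions), and the PKT_* values are stand-ins for the mbuf flags, with
PKT_STATION playing the role of the proposed M_STATION, which does not
exist in the stack today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receive status bits reported by the controller. */
#define RXS_BCAST 0x0080u   /* destination was the broadcast address */
#define RXS_MCAST 0x0040u   /* destination was a multicast address */
#define RXS_MISS  0x0020u   /* accepted only because of promiscuous mode */

/* Stand-ins for the mbuf flags; PKT_STATION is the proposed new one. */
#define PKT_BCAST   0x0100
#define PKT_MCAST   0x0200
#define PKT_STATION 0x0400

/* Translate the controller's address-match result into packet flags,
   so the stack would not need to memcmp the destination address. */
static void rx_mark_flags(uint16_t status, int *pkt_flags)
{
    assert(pkt_flags != NULL);

    if (status & RXS_BCAST)
        *pkt_flags |= PKT_BCAST;
    else if (status & RXS_MCAST)
        *pkt_flags |= PKT_MCAST;
    else if (!(status & RXS_MISS))
        *pkt_flags |= PKT_STATION;   /* unicast to our own address */
    /* else: someone else's unicast, seen in promiscuous mode */
}
```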




	There is one FUNDAMENTAL issue with this and I'm not sure of the
implications within the network stack. With many processors (PowerPC,
68K derivatives etc.) the data buffer used by the buffer descriptor must
be 4 byte or 16 byte aligned. This means that the start of the IP header
(14 bytes from the start of the Ethernet frame) will not be on a
longword (32 bit) boundary. Everything I have read/seen seems to
indicate that the network stack expects the IP header to be 32 bit
aligned, but I'm not sure of the reasons for this.
	- I know the code will be more efficient if accessing 32 bit
aligned data. But many processors (PowerPC for example) will happily
access 32 bit values that are not on 32 bit aligned addresses. The IP
header processing code may be a little slower, but the saving from not
making a memcpy should be much greater.
	I'm not sure if the network stack itself will decide the IP
header is not aligned and complain, or even worse allocate a new mbuf
and align it, performing a memcpy! So I don't think there should really
be a problem if the processor can support non-aligned data accesses,
but I think the only solution is to try it and see, unless anybody
knows more than me?
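
	The arithmetic behind the alignment issue, as a small sketch.
ip_hdr_misalignment() and read_be32() are illustrative helpers, not
stack functions. With the frame starting on any 4 byte (or 16 byte)
boundary, the IP header always lands 2 bytes past a 32 bit boundary;
only a frame start at an address of the form 4n+2 would align it, which
is exactly what the buffer descriptor alignment rule prevents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14

/* How far past a 32 bit boundary the IP header starts, given the
   address at which the Ethernet frame starts. */
static unsigned ip_hdr_misalignment(uintptr_t frame_start)
{
    return (unsigned)((frame_start + ETH_HDR_LEN) & 3u);
}

/* Read a big-endian 32 bit field byte by byte. This works at any
   address, on any processor, at the cost of four byte loads. */
static uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

	For example, ip_hdr_misalignment(0x1000) and
ip_hdr_misalignment(0x1010) both give 2, while a frame starting at
0x1002 would give 0. A byte-wise read like read_be32() is the portable
fallback where the processor cannot do unaligned 32 bit loads.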

 
	Dave Webster




-----Original Message-----
From: ecos-discuss-owner@sources.redhat.com
[mailto:ecos-discuss-owner@sources.redhat.com] On Behalf Of N.Suresh
Sent: 10 December 2001 16:10
To: andrew.lunn@ascom.ch; ecos-discuss@sources.redhat.com;
rprakash@cdotb.ernet.in; bshiva@cdotb.ernet.in
Subject: Re: [ECOS] Enabling -O2 option of GCC]

Hi,
    I added the volatile modifier to the buffer descriptor pointers.
    It is working fine.

    I am working on an 8260 based board, where I am getting an avg.
    latency of 0.7 ms for ping, with cache, MMU and optimisations
    enabled.

    I want to reduce the latency to about tens of microseconds
    (required by our application).

    Any idea where to optimize? Has anybody done away with the memory
    copy from the ethernet buffers to the mbufs?

thanx in advance,
regards


Andrew Lunn wrote:

>On Mon, Dec 10, 2001 at 06:45:39PM +0530, N.Suresh wrote:
>
>>    Are there any pointers to exactly where gcc optimises and where
>>    special care has to be taken in coding?
>>
>
>Do a search on the ecos discuss archive for "volatile".
>
>   Andrew
>
-- 
-- 
!=======================================================================
=====!
= Suresh N., Research Engineer, C-DoT, Bangalore.
=
= Call me at : OFF: 2383951(Dir) / 2263399 (268)  RES: 3334248
=
= Alternate email :  nsur_mys@email.com
=
= QOT: For fast-acting relief, try slowing down.
!=======================================================================
=====!




