This is the mail archive of the ecos-devel@sourceware.org mailing list
for the eCos project.
Re: NAND technical review
Rutger Hofman wrote:
Jonathan Larmour wrote:
Hmm, I guess the key thing here is that in E's implementation most of
the complexity has been pushed into the lower layers; at least
compared to R's. R's has a more consistent interface through the
layers. Albeit at the expense of some rigidity and noticeable function
overhead.
It's not likely E's will be able to easily share controller code, since
you don't know which chips, and therefore which chip driver APIs, the
controller will be connected to. But OTOH, maybe this isn't a big deal
since a lot of the controller-specific munging is likely to be
platform-specific anyway due to characteristics of the attached NAND
(e.g. timings etc.) and the only bits that would be sensibly shared
would potentially happen in the processor HAL anyway at startup time.
What's left may not be that much and isn't a problem in the platform
HAL. However the likely exception to that is hardware-assisted ECC. A
semi-formal API for that would be desirable.
This is the largest difference in design philosophy between E and R. Is
it OK if I expand?
Sure.
NAND chips are all identical in their wire setup. They all have a data
'bus', and control lines to indicate whether what is on the bus is a
command, an address, or data.
NAND chips differ in how their command language works, but only so far.
What is on the market now is 'regular' large-page chips that all speak
the same command language, and small-page chips that have a somewhat
different command language. ONFI chips are large-page chips except in
interrogation at startup and in bad-block marking.
As I've already noted, it may be useful to think ahead to what may come
into the market later, including things that don't fit into the known
command languages (such as existing OneNAND) - a framework which can
support wider implementations can have that advantage.
[snip example]
These 2 languages are all the variation there is for NAND chips (plus,
at another level, 2 timing values for read cycle and write cycle)! The
wide-ranging differences for devices for NAND are in the controllers.
Controllers work by accepting input like 'write a command of value
0x..', 'write an address of value 0x.....', etc., and doing their job on
the NAND chip's wires. They cannot really operate at a higher level, if
only because they must support both small-page and large-page chips
(and ONFI), and this is the level of common protocol for the chips.
So controller code has to bridge between API calls like page_read and
the interface of the controller as described above. R's implementation
presumes that a lot of the code to make this translation is generic: a
large-page read translates to the controller steps as given above in the
running example, in any controller implementation.
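To make the "generic translation" argument concrete, here is a minimal sketch of a large-page read expressed purely in terms of controller primitives. It is not taken from R's or E's code: the nand_ctrl interface, nand_lp_read_page() and the tracing controller are hypothetical names invented for illustration. The 0x00 / 0x30 command pair and the two column address cycles are the usual large-page read sequence from common datasheets; the number of row address cycles depends on the chip, so it is a parameter here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical controller interface: the bus-level operations a NAND
 * controller offers in some form. Not an eCos API. */
typedef struct nand_ctrl nand_ctrl;
struct nand_ctrl {
    void (*write_cmd)(nand_ctrl *c, uint8_t cmd);
    void (*write_addr)(nand_ctrl *c, uint8_t addr_byte);
    void (*wait_ready)(nand_ctrl *c);
    void (*read_data)(nand_ctrl *c, uint8_t *buf, size_t len);
    void *priv;
};

#define NAND_CMD_READ_SETUP   0x00u
#define NAND_CMD_READ_CONFIRM 0x30u

/* Generic large-page read from column 0: identical for any controller
 * that implements the four hooks above. */
static void nand_lp_read_page(nand_ctrl *c, uint32_t page,
                              unsigned row_cycles,
                              uint8_t *buf, size_t len)
{
    unsigned i;

    assert(row_cycles >= 1 && row_cycles <= 4);
    c->write_cmd(c, NAND_CMD_READ_SETUP);
    c->write_addr(c, 0x00);                 /* column address, low byte  */
    c->write_addr(c, 0x00);                 /* column address, high byte */
    for (i = 0; i < row_cycles; i++)        /* row address, LSB first    */
        c->write_addr(c, (uint8_t)(page >> (8 * i)));
    c->write_cmd(c, NAND_CMD_READ_CONFIRM);
    c->wait_ready(c);
    c->read_data(c, buf, len);
}

/* A stand-in "controller" that only records what it was asked to do:
 * 'C' command, 'A' address byte, 'W' wait, 'D' data transfer. */
typedef struct {
    char    kind[16];
    uint8_t value[16];
    size_t  n;
} nand_trace;

static void trace_add(nand_ctrl *c, char kind, uint8_t value)
{
    nand_trace *t = c->priv;
    if (t->n < sizeof t->kind) {
        t->kind[t->n] = kind;
        t->value[t->n] = value;
        t->n++;
    }
}
static void trace_cmd(nand_ctrl *c, uint8_t cmd) { trace_add(c, 'C', cmd); }
static void trace_addr(nand_ctrl *c, uint8_t a)  { trace_add(c, 'A', a); }
static void trace_wait(nand_ctrl *c)             { trace_add(c, 'W', 0); }
static void trace_read(nand_ctrl *c, uint8_t *buf, size_t len)
{
    memset(buf, 0xff, len);                 /* pretend the page is erased */
    trace_add(c, 'D', 0);
}

static void trace_ctrl_init(nand_ctrl *c, nand_trace *t)
{
    memset(t, 0, sizeof *t);
    c->write_cmd = trace_cmd;
    c->write_addr = trace_addr;
    c->wait_ready = trace_wait;
    c->read_data = trace_read;
    c->priv = t;
}
```

A real controller driver would supply its own four hooks in place of the tracing ones; the read routine itself would not change, which is the point being argued above.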
That's true. At the same time, have a look at E's code in
https://bugzilla.ecoscentric.com/show_bug.cgi?id=1000770
Specifically the Samsung K9 driver in
devs/nand/samsung_k9/d20090826/include/k9fxx08x0x.inl - while you could
argue the steps required are generic and can be made common (write this
address, write that command, etc.), it seems E assumes that the steps may
not really be complex enough to justify abstracting them out.
I would certainly be interested in your perspective about what E's driver
implementation lacks compared to R's. Lack of hardware ECC is one thing
certainly.
Moreover, the generic code handles spare layout: where in the spare the
application's spare data is folded, where the ECC is, and where the
bad-block mark is.
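As an illustration of what "handling spare layout" can mean in generic code, here is a rough sketch of a layout descriptor plus pack/unpack helpers. The spare_layout type, the helper names, and the example 64-byte layout values are all assumptions made up for this sketch; they are not the actual definitions from either implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of where things live in the spare area. */
typedef struct {
    size_t spare_size;   /* total spare bytes per page           */
    size_t bbm_offset;   /* bad-block marker position            */
    size_t bbm_len;
    size_t app_offset;   /* region available to the application  */
    size_t app_len;
    size_t ecc_offset;   /* region holding the ECC bytes         */
    size_t ecc_len;
} spare_layout;

/* Illustrative values for a 64-byte spare; an assumption, not a spec. */
static const spare_layout example_layout_64 = {
    64,      /* spare_size */
    0, 2,    /* bad-block marker: bytes 0..1   */
    2, 38,   /* application data: bytes 2..39  */
    40, 24   /* ECC: bytes 40..63              */
};

/* Build the spare image to be programmed alongside a page. Unused bytes
 * stay 0xff (unprogrammed), which also leaves the bad-block marker in
 * its "good block" state. Returns 0 on success, -1 if app data is too big. */
static int spare_pack(const spare_layout *l,
                      const uint8_t *app, size_t app_len,
                      const uint8_t *ecc, uint8_t *spare)
{
    assert(l->app_offset + l->app_len <= l->spare_size);
    assert(l->ecc_offset + l->ecc_len <= l->spare_size);
    if (app_len > l->app_len)
        return -1;
    memset(spare, 0xff, l->spare_size);
    memcpy(spare + l->app_offset, app, app_len);
    memcpy(spare + l->ecc_offset, ecc, l->ecc_len);
    return 0;
}

/* Extract the application's bytes from a spare image read back from
 * the chip. Returns 0 on success, -1 if more is requested than fits. */
static int spare_unpack_app(const spare_layout *l, const uint8_t *spare,
                            uint8_t *app, size_t app_len)
{
    if (app_len > l->app_len)
        return -1;
    memcpy(app, spare + l->app_offset, app_len);
    return 0;
}
```

With a descriptor like this, the layout becomes data that a board port selects, while the folding logic is written once.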
In E's implementation, the complexities of an abstracted spare layout seem
to start disappearing as you know more about what chip you've got as a lot
of the complexity has been pushed into the chip driver.
OTOH, the generic code has hooks for handling any ECC that the
controller has computed in hardware -- how ECC is supported in hardware
varies across controllers. But the way the ECC check is handled (case in
point is where a correctable bit error is flagged) is generic again.
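The split described here (controller-specific ECC hook, generic handling of the outcome) might look roughly like the sketch below. Everything in it is hypothetical: nand_ecc_hook, nand_check_page() and the scripted stand-in hook are names invented for illustration, and the per-chunk return convention (number of bits corrected, or -1 if uncorrectable) is just one plausible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller-supplied hook. For one ECC chunk it returns
 * the number of bit errors it corrected in place (0 means clean), or -1
 * if the chunk cannot be corrected. */
typedef struct nand_ecc_hook nand_ecc_hook;
struct nand_ecc_hook {
    int (*check_chunk)(nand_ecc_hook *h, unsigned chunk,
                       uint8_t *data, size_t len);
    void *priv;
};

typedef enum {
    NAND_READ_FAILED    = -1,  /* at least one chunk was uncorrectable */
    NAND_READ_OK        = 0,   /* no bit errors seen                   */
    NAND_READ_CORRECTED = 1    /* data is good, but bits were repaired */
} nand_read_status;

/* Generic part: walk the chunks of a page, let the hook check each one,
 * and turn the results into a single status the caller can act on (for
 * example, by rewriting a block that reports corrected errors). */
static nand_read_status nand_check_page(nand_ecc_hook *h,
                                        uint8_t *page, size_t page_len,
                                        size_t chunk_len,
                                        unsigned *bits_corrected)
{
    unsigned total = 0;
    unsigned chunk = 0;
    size_t off;

    assert(chunk_len > 0 && page_len % chunk_len == 0);
    for (off = 0; off < page_len; off += chunk_len, chunk++) {
        int n = h->check_chunk(h, chunk, page + off, chunk_len);
        if (n < 0) {
            *bits_corrected = total;
            return NAND_READ_FAILED;
        }
        total += (unsigned)n;
    }
    *bits_corrected = total;
    return total ? NAND_READ_CORRECTED : NAND_READ_OK;
}

/* Stand-in for a hardware ECC engine: replays per-chunk results from an
 * int array passed via priv. Only for exercising the generic code. */
static int scripted_check(nand_ecc_hook *h, unsigned chunk,
                          uint8_t *data, size_t len)
{
    const int *script = h->priv;
    (void)data;
    (void)len;
    return script[chunk];
}
```

A driver for a controller with hardware ECC would replace scripted_check() with code that reads the controller's ECC registers; the flagging of a corrected error stays in the generic routine.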
In E's case, in the EA LPC2468 port example, they have the following in
the platform HAL for a port (although it could be a package instead):
[various functions/macros defined which are used by k9fxx08x0x.inl]
#include <cyg/devs/nand/k9fxx08x0x.inl>
CYG_NAND_DEVICE(ea_nand, "onboard", &k9f8_funs, &_k9_ea_lpc2468_priv,
&linux_mtd_ecc, &nand_mtd_oob_64);
which succinctly brings together the chip driver, accessor functions, ECC
algorithm, and OOB layout. It becomes easy for a board port to choose some
different chips/layouts/ECC. There's flexibility for the future in that.
With R's implementation, there seems to be much more code involved. And I
sort of see why there's more code, and I sort of don't. Not just in the
generic layer, but in the drivers as well, at least looking at the bfin
chip, and I don't think the differences are completely explained by the
hardware properties of each NFC (but I'm very willing to be corrected!).
Comparing E's k9_read_page() along with everything it calls, with R's
bfin_nfc_data_read() along with everything it calls (and everything those
call in turn, not just in bfin_nfc.c but also in
nand_ez_kit_bf548.inc[1]), there's a huge difference in the amount of
code. If nothing else, from what I can tell this may mean a much larger
porting effort for R than for E.
I see that some of the reasons for larger code in R are due to run-time
testing of hardware properties: 8 vs 16-bit bus width, SP vs LP vs ONFI. I
also note that E's implementation doesn't do as much error checking as I
think it ought to, especially in the Samsung K9 chip driver. But that's
not all of the difference.
Anyway, I think I'm talking out loud here rather than asking anything
specific about it. It may just be something we have to put down to the
difference in design philosophy, rather than something which can be
improved. There are still advantages with R in other ways.
Jifl
[1] which should really be .inl for consistency in eCos but that's a detail
--
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine