This is the mail archive of the ecos-devel@sources.redhat.com mailing list for the eCos project.



Re: AW: contributing a failsafe update mechanism for FIS from within eCos applications


Hi,

It seems we are getting closer...

On Thursday 28 October 2004 19:23, Andrew Lunn wrote:
> On Thu, Oct 28, 2004 at 02:43:16PM +0200, Neundorf, Alexander wrote:
> > > From: Andrew Lunn [mailto:andrew@lunn.ch]
> > >
...
> It however does not need to know where in the structure they are, what
> the other members of the structure are, etc. By passing the fis_image_desc
> structure back and forth you are tying redboot and the application
> together. They both need to have the same definition of the structure.
> Maybe somebody adds a new member to the structure, recompiles redboot
> and installs it. Your application is now broken since it does not know
> about this new member and it accesses the wrong things within the
> structure. If however you just use the VV get functions everything
> works fine, because redboot knows how to get what you need from the
> structure. The implementation changes, the abstract interface stays
> the same. Classic object-oriented approach.
>
> > This way one would not have to call 8 functions to collect all
> > members for one entry.
>
> In practice you actually only need two. The base and the length. The
> other members are of no use to a filesystem.

OK, maybe the other entries (entry point, crc) might also be interesting,
maybe not.
But to create a new image, at least the following entries must be known:
-name
-flash_base
-data_length
-size
-entry_point
-mem_base
(maybe: crc)

So I would like a struct containing these entries.
I don't see a way around this.
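
To make this concrete, here is a rough sketch of what such a struct could
look like. The field names follow the list above; the type name image_info,
the helper image_info_init(), the field widths and the name length are
assumptions for illustration only and do not come from the RedBoot sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed maximum name length, including the terminating NUL. */
#define IMAGE_NAME_MAX 16

/* Hypothetical descriptor carrying only what is needed to create an image. */
struct image_info {
    char     name[IMAGE_NAME_MAX];
    uint32_t flash_base;   /* where the image lives in flash */
    uint32_t data_length;  /* bytes actually used by the image */
    uint32_t size;         /* bytes reserved in flash */
    uint32_t entry_point;  /* address to jump to after loading */
    uint32_t mem_base;     /* RAM address the image is loaded to */
    uint32_t crc;          /* optional, 0 until it has been computed */
};

/* Fill in a descriptor. Returns 0 on success, -1 if the name does not fit
 * or the data would not fit into the reserved size. On failure the
 * descriptor is left untouched. */
int image_info_init(struct image_info *info, const char *name,
                    uint32_t flash_base, uint32_t data_length, uint32_t size,
                    uint32_t entry_point, uint32_t mem_base)
{
    assert(info != NULL && name != NULL);
    if (strlen(name) >= IMAGE_NAME_MAX || data_length > size)
        return -1;

    memset(info, 0, sizeof *info);
    memcpy(info->name, name, strlen(name));
    info->flash_base  = flash_base;
    info->data_length = data_length;
    info->size        = size;
    info->entry_point = entry_point;
    info->mem_base    = mem_base;
    return 0;
}
```

Keeping this struct separate from redboot's internal table entry would mean
the two sides only have to agree on these few fields.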

...
> OK. Maybe a terminology problem here. What does your createImage() do?
> Does it create a new entry in the fis directory? Does it write a new
> image into flash at the location the fis entry says belongs to this
> image?
>
> For me these are two different things. If there was a filesystem
> creat() call it would do the first. A number of write()s would do the
> second.

createImage() creates a new entry, writes the data, and marks the new entry
as valid.

It consists of the following steps:
startUpdate  (redboot) - modify the fis table contents in RAM and flash, mark
                         them as in progress
writeData    (app) - either all at once, or in flash block sized chunks
finishUpdate (redboot) - mark the new fis table as valid in flash
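
The three steps can be sketched roughly as below. This is only a model that
uses a RAM array in place of flash; struct fake_flash, the state names and
the sizes are made up for the example, and the real redboot side would of
course have to erase and program flash blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 4096u   /* assumed size of the simulated flash */
#define BLOCK_SIZE 256u    /* assumed flash block size */

enum fis_state { FIS_INVALID = 0, FIS_IN_PROGRESS, FIS_VALID };

/* Stand-in for the flash contents plus the state of the fis table. */
struct fake_flash {
    uint8_t        data[FLASH_SIZE];
    enum fis_state table_state;
    uint32_t       image_base;
    uint32_t       image_len;
};

/* redboot side: record the new entry and mark the table as in progress. */
int start_update(struct fake_flash *f, uint32_t base, uint32_t len)
{
    assert(f != NULL);
    if (base > FLASH_SIZE || len > FLASH_SIZE - base)
        return -1;
    f->image_base  = base;
    f->image_len   = len;
    f->table_state = FIS_IN_PROGRESS;
    return 0;
}

/* app side: write one chunk, all at once or block by block. */
int write_data(struct fake_flash *f, uint32_t offset,
               const uint8_t *src, uint32_t n)
{
    assert(f != NULL && src != NULL);
    if (f->table_state != FIS_IN_PROGRESS)
        return -1;                      /* no update has been started */
    if (offset > f->image_len || n > f->image_len - offset)
        return -1;                      /* chunk would leave the image area */
    memcpy(&f->data[f->image_base + offset], src, n);
    return 0;
}

/* redboot side: the only thing left to do is flip the state to valid. */
int finish_update(struct fake_flash *f)
{
    assert(f != NULL);
    if (f->table_state != FIS_IN_PROGRESS)
        return -1;
    f->table_state = FIS_VALID;
    return 0;
}
```

The point of the split is that finish_update() does nothing but change the
state, so the window in which a power failure can hurt stays small.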

> I also want to make sure that the design you propose is flexible
> enough to support other people's needs. So it seems you have enough
> memory to hold a complete image, but I want to ensure the same design
> can do multiple writes in a clean way using the same API. I would also
> like it to work without actually needing the redundant FIS block. Not
> everybody is so paranoid about power failure, but would like to be
> able to upgrade their application from within the application.

Well, paranoid...
If it fails the device doesn't work anymore...

Without a redundant fis:
startUpdate doesn't change the flash contents; the new fis table contents are
written in finishUpdate, so it will work too (except in case of a power
failure... well, you know).

> > So how about this:
> > erase:
> > app to redboot: I want to erase foo
> > redboot: erases foo from the fis table and marks it as in progress
> > app: erases foo contents from the flash
> > app to redboot: I'm done with it
> > redboot: mark it valid
>
> I don't see why the application has to be involved in the erase. All
> you need is that the removing of the FIS entry is atomic with respect
> to a power cut. Redboot can do that.

I don't quite understand.
If the app is responsible for writing the contents of the image, it should
also be responsible for deleting the contents of the image. OK, deleting
could also mean simply removing the entry from the table, with the actual
erase done just before programming, but I don't see a big difference
here, and really erasing the flash would seem cleaner to me.

...
> > redboot: creates foo in the fis table and marks it as in progress
> > app: writes foo contents on the flash, all at once or block by block
> > app to redboot: I'm done with it
> > redboot: mark it valid
>
> You are again breaking the abstraction. You are doing the CRC creation in
> the application whereas it should be redboot doing it.

My main reason for this: I'd like to have the new fis table already completely
correct on the flash, except for the valid_flag, before the actual writing
process starts, so that the final step really only has to set the valid_flag
to valid.
Apart from that, is it possible for redboot to calculate the crc if it doesn't
have enough RAM to hold the complete image while updating and if the
application is responsible for the actual writing?
Which RAM is actually available in a VV function? (Sorry for the stupid
questions.)
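
On the RAM question: in general a CRC does not need the whole image in
memory, since it can be accumulated chunk by chunk. The sketch below shows
this with the common reflected CRC-32 (polynomial 0xEDB88320). That this is
the same variant redboot uses for its image checksums is an assumption, and
the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Start a new running CRC. */
uint32_t crc32_begin(void)
{
    return 0xFFFFFFFFu;
}

/* Feed one chunk into the running CRC, bit by bit (slow but table-free). */
uint32_t crc32_update(uint32_t state, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        state ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (state & 1u)
                state = (state >> 1) ^ 0xEDB88320u;
            else
                state >>= 1;
        }
    }
    return state;
}

/* Finish the CRC once the last chunk has been fed in. */
uint32_t crc32_end(uint32_t state)
{
    return state ^ 0xFFFFFFFFu;
}
```

Only the 32-bit running state has to be kept between chunks, so the checksum
could be updated once per flash block as the data is written.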

[OT] Why is crc32 used instead of the POSIX crc?

...
> Assumption 1. All the needed FIS entries exist.
> Assumption 2. Your boot script is:
> fis load app
> go
> fis load app.bak
> go

This second step is cool :-)

> open(/foo) does two VV calls to get the start and length of the image
> in flash and allocates the block cache.
>
> write() would copy the data into the block cache. If this fills the
> block cache it simply erases and then writes. As soon as the erase
> starts, the CRC is wrong. So in terms of redboot, this image is now
> corrupt.
>
> close() flushes the block cache. 

Is this all done in the application?

> It then does VV calls to ask redboot
> to recalculate the CRC and put it into the in memory copy of the FIS
> directory. It then calls a VV function to commit the FIS directory.
> Redboot does an atomic write, with respect to power failure, of the
> FIS directory using the valid fields in the redundant FIS blocks etc.
>
> So how do you do a safe upgrade of the application:
>
> open("app");
> write();         CRC is now wrong, so app.bak would be booted.
> write();         CRC is now wrong, so app.bak would be booted.
> write();         CRC is now wrong, so app.bak would be booted.
> close();         CRC is now valid, so the new image would be booted.
> open("app.bak");
> write();         CRC is now wrong, but it does not matter, app is valid
> write();         CRC is now wrong, but it does not matter, app is valid
> write();         CRC is now wrong, but it does not matter, app is valid
> close();         CRC is now valid and we have two identical apps.
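
The sequence above can be modelled in a few lines to check which image
would be booted at each step. This is only a sketch: struct image_slot and
the function names are invented, and it assumes that the boot script falls
through to app.bak whenever the CRC stored for app does not match the flash
contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SLOT_NONE = -1, SLOT_APP = 0, SLOT_APP_BAK = 1 };

struct image_slot {
    uint32_t stored_crc;  /* CRC recorded in the FIS directory */
    uint32_t actual_crc;  /* CRC of what is currently in flash */
};

/* Boot script model: try app first, then app.bak. */
int select_boot_slot(const struct image_slot slots[2])
{
    assert(slots != NULL);
    for (int i = 0; i < 2; i++) {
        if (slots[i].stored_crc == slots[i].actual_crc)
            return i;
    }
    return SLOT_NONE;
}

/* write(): the flash contents change, the stored CRC does not. */
void slot_write(struct image_slot *s, uint32_t crc_of_new_contents)
{
    assert(s != NULL);
    s->actual_crc = crc_of_new_contents;
}

/* close(): the CRC is recalculated and the FIS directory committed. */
void slot_close(struct image_slot *s)
{
    assert(s != NULL);
    s->stored_crc = s->actual_crc;
}
```

As long as only one slot is between write() and close() at any time,
select_boot_slot() always finds a bootable image.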

I would prefer an obviously different API for the updating process, since it is
"dangerous" for the whole system.
With my createImage(), which writes a complete image at once, it is also
ensured that there can be at most one corrupt image at a time. When splitting
this into open, write and close, there can be more than one corrupt image, so
open() for writing should check that no other file is open. Because of these
special semantics I'd prefer a non-standard API for the updating functions.
But since this is done in the application, it doesn't influence the interface
to redboot that much.
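
The check that only one image may be open for writing could look roughly
like this on the application side. The names struct updater, update_open()
and update_close() are hypothetical and not part of any existing eCos API.

```c
#include <assert.h>
#include <stddef.h>

/* Tracks the single image that may be open for writing. */
struct updater {
    const char *open_image;  /* NULL while nothing is being updated */
};

/* Refuse to start a second update while one is still in progress. */
int update_open(struct updater *u, const char *name)
{
    assert(u != NULL && name != NULL);
    if (u->open_image != NULL)
        return -1;
    u->open_image = name;
    return 0;
}

/* Finish the current update so that another image may be opened. */
int update_close(struct updater *u)
{
    assert(u != NULL);
    if (u->open_image == NULL)
        return -1;
    u->open_image = NULL;
    return 0;
}
```

With this guard, at most one image can be in the corrupt state between
open and close.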

Bye
Alex
-- 
Work: alexander.neundorf@jenoptik.com - http://www.jenoptik-los.de
Home: neundorf@kde.org                - http://www.kde.org
      alex@neundorf.net               - http://www.neundorf.net

