This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: Reduce stack usage of _vfiprintf_r()


On 10/10/2012 12:05 PM, Freddie Chopin wrote:
On 2012-10-10 18:25, Corinna Vinschen wrote:
In case of my code I
don't expect it to be ever called, because I don't use unbuffered
streams.
Now I'm puzzled.  If you're not using unbuffered IO, how come that
you notice a difference, given that the code in question is only
called for unbuffered IO?!?
From looking at the code you may come to the conclusion that it does not
get called if your stream is buffered, but - as I wrote in my first
message - __sbprintf() gets inlined (it's static and not used
anywhere else, so it will always be inlined with optimization enabled) into
_vfiprintf_r(), thus the 1024-byte buffer on the stack is allocated on EVERY
entry to _vfiprintf_r() - no matter what the stream is.

With my change, the dynamic allocation is performed only when the
execution actually reaches the code path.
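
To make the idea concrete, here is a minimal sketch of allocating the temporary buffer only on the unbuffered path. It is not the actual patch and not newlib code: struct sketch_stream, sketch_print() and SKETCH_BUFSIZ are hypothetical names invented for illustration, and the "output" is just a small array inside the struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUFSIZ 1024 /* hypothetical; stands in for the 1024-byte buffer */

/* Hypothetical stand-in for stream state; this is NOT newlib's FILE. */
struct sketch_stream {
    int unbuffered;     /* nonzero if the stream has no buffer of its own */
    size_t heap_allocs; /* counts temporary allocations, for demonstration */
    char out[64];       /* where the "output" ends up in this sketch */
};

/* Copies text to the stream; returns its length, or -1 on failure. */
int sketch_print(struct sketch_stream *s, const char *text)
{
    assert(s != NULL && text != NULL);

    size_t len = strlen(text);
    if (len >= sizeof s->out)
        return -1;

    if (!s->unbuffered) {
        /* Buffered stream: no temporary buffer, so no extra stack or heap. */
        memcpy(s->out, text, len + 1);
        return (int)len;
    }

    /* Unbuffered stream: the temporary buffer is requested only here. */
    char *tmp = malloc(SKETCH_BUFSIZ);
    if (tmp == NULL)
        return -1;
    s->heap_allocs++;

    memcpy(tmp, text, len + 1);   /* stage the output in the temporary buffer */
    memcpy(s->out, tmp, len + 1); /* then "flush" it to the stream */

    free(tmp);
    return (int)len;
}
```

With this shape, a caller that only ever uses buffered streams never pays for the 1024 bytes, either on the stack or on the heap.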

Another solution could be to make __sbprintf() not-static, so that it
would not be inlined.
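
A sketch of keeping the large buffer out of the caller's stack frame follows. Note that GCC may still inline a non-static function defined in the same translation unit, so this sketch uses GCC's noinline function attribute instead of changing linkage; that substitution is my assumption, not what the message proposes. SKETCH_NOINLINE, sketch_slow_path() and sketch_entry() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: GCC-compatible compiler; other compilers get an empty macro. */
#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

#define SKETCH_BUFSIZ 1024 /* hypothetical; mirrors the 1024-byte buffer */

/* Rare path: the big buffer lives in this function's own frame, so the
 * stack space is reserved only while this function is actually running. */
static SKETCH_NOINLINE int sketch_slow_path(char *dst, size_t dstsize,
                                            const char *text)
{
    char buf[SKETCH_BUFSIZ];
    int n = snprintf(buf, sizeof buf, "%s", text);

    /* Reject errors, truncation in buf, and results that do not fit dst. */
    if (n < 0 || (size_t)n >= sizeof buf || (size_t)n >= dstsize)
        return -1;
    memcpy(dst, buf, (size_t)n + 1);
    return n;
}

/* Common entry point: its frame stays small because the helper above
 * is not merged into it. */
int sketch_entry(int unbuffered, char *dst, size_t dstsize, const char *text)
{
    assert(dst != NULL && text != NULL);

    if (unbuffered)
        return sketch_slow_path(dst, dstsize, text);

    size_t len = strlen(text);
    if (len >= dstsize)
        return -1;
    memcpy(dst, text, len + 1);
    return (int)len;
}
```

The attribute keeps the helper static, so nothing new is exported from the library, while the buffer cost is paid only when the slow path is taken.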

I would never complain if the allocation happened only for
unbuffered streams, but it doesn't...

Keep in mind that we have to serve targets with size constraints as well
as targets which go for speed.  If you have a lot of output to an
unbuffered stream like stderr, calling malloc may slow down output
noticeably.
You can't expect unbuffered I/O to be fast... As far as I'm concerned, it can be
as slow as 1bps if it does not allocate 1kB on the stack (; On Cortex-M3,
dynamic allocation takes about a few hundred cycles, so I guess that it's
not a significant problem. By "overhead" I actually meant that
allocating 1024 bytes with malloc() can take up more memory; I was
not referring to speed, as that's not the problem here.

Bottom line: newlib _IS_ too big for ARM microcontrollers, that's why
most people from the embedded world don't use anything more than the math
library. That's why commercial IDEs/toolchains using GCC for such devices
have their own - smaller - version of libc, not newlib (just to name
CrossWorks and CodeRed). That's why AVR microcontrollers have their own
avr-libc, which would probably suit ARM microcontrollers better if it
were not targeted specifically at the AVR architecture.
One thing to keep in mind is that many of these other libc
implementations are far from complete. They implement a
small subset of capabilities.

Newlib aims for high compatibility with standards while still
being suitable for use in embedded systems.

As you note, avr-libc focuses heavily on AVRs with little (no?)
concern for other CPU architectures. My recollection is that
it also is a libc subset. Different project goal.

Corinna's suggestions to lower the buffer size or move the routine so
it isn't inlined would, on a first-order pass, both be acceptable.
It may make sense to do both.

Another design consideration which sometimes comes into
play is to limit or forbid malloc()'s after the program completes
initialization.  The malloc() solution would push against this
rule and require analysis to ensure that all paths free the
memory. [1]
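
As an illustration of the kind of analysis meant here, one common way to make "every path frees the memory" easy to check is a single cleanup label. This is a generic sketch, not code from the patch; sketch_emit() and sketch_live_allocs are hypothetical names, and the counter exists only so the property can be checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUFSIZ 1024 /* hypothetical temporary buffer size */

/* Demonstration counter of allocations that have not been freed yet. */
static int sketch_live_allocs;

/* Copies text to dst via a heap buffer; returns length or -1 on failure. */
int sketch_emit(const char *text, char *dst, size_t dstsize)
{
    int ret = -1;
    char *tmp;

    assert(text != NULL && dst != NULL);

    tmp = malloc(SKETCH_BUFSIZ);
    if (tmp == NULL)
        return -1; /* nothing was allocated, so nothing to release */
    sketch_live_allocs++;

    size_t len = strlen(text);
    if (len >= SKETCH_BUFSIZ)
        goto out; /* error path still goes through the cleanup below */
    memcpy(tmp, text, len + 1);

    if (len >= dstsize)
        goto out; /* second error path, same cleanup */
    memcpy(dst, tmp, len + 1);
    ret = (int)len;

out:
    /* Single exit point: every path after a successful malloc() ends here. */
    free(tmp);
    sketch_live_allocs--;
    return ret;
}
```

This does not remove the malloc() call itself, so it would still conflict with a strict "no allocation after initialization" rule; it only makes the free-on-all-paths review simpler.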

Sorry for the ramble.

[1] Disclaimer. I didn't review the patch in detail. I am
commenting more on how picking different top level
design rules and goals can influence the appropriateness
of a potential solution.


Regards,
FCh

