This is the mail archive of the crossgcc@sourceware.org mailing list for the crossgcc project.

See the CrossGCC FAQ for lots more information.



Re: [crosstool-NG] Design discussion


On Saturday 04 April 2009 13:14:20 Yann E. MORIN wrote:
> Hello all!
>
> Recently, I've been challenged about the design of crosstool-NG.

I.E. I blogged a couple criticisms and he read my blog:

http://landley.net/notes-2008.html#07-03-2009

(You'll notice that today's entry is a similar criticism of gcc, and at the 
end of the January 7th entry I call my own code "stupid".  I break 
everything.)

> This post is to present the overall design of crosstool-NG, how and why
> I came up with it, and to eventually serve as a base for an open discussion
> on the matter.

For comparison, my Firmware Linux project also makes cross compilers, so I 
have some experience here too.  (I spent most of Saturday getting armv4 big 
endian soft float to work, and still haven't managed to get uClibc++ to build 
under any arm EABI variant; the error is "multiple personality directive" 
from the assembler, which is just _weird_.)

My project is carefully designed in layers so you don't have to use the cross 
compilers I build.  It should be easy to use crosstool-ng output to build the 
root filesystems and system images that the later scripts produce.  (How easy 
it actually is, and whether there's any benefit in doing so, is something I 
haven't really looked into yet.)  The point is the two projects are not 
actually directly competing, or at least I don't think they are.

> Hopefully some will jump in to offer their views on the subject, and offer
> suggestions as to what should be done to improve the situation, should the need
> arise.
>
> The mail is structured this way:
> 1) Genesis of crosstool-NG
> 2) The way to fulfill the requirements
>   2.a) Ease maintenance
>   2.b) Ease configuration of the toolchain
>   2.c) Support newer versions of components
>   2.d) Add new features
>   2.e) Add alternatives where available
> 3) crosstool-NG installation
>   3.a) Setting up crosstool-NG: why using a ./configure?
>   3.b) Installing crosstool-NG: why is it required?
>   3.c) Running crosstool-NG: why can't I run make menuconfig?
> 4) crosstool-NG internals
>   4.a) Programming languages used in crosstool-NG
>   4.b) Internal API
> 5) Conclusion
>
> In advance, I do apologize for the really, really long post, and for the
> limited subset of the English language I use.
>
>
> =======================================
>
> 1) Genesis of crosstool-NG
>
> First, a little introduction to put things straight.
>
> About four years ago, I needed to generate cross-compilers for ARM and
> MIPS. One of the requirements was to be able to use various versions of
> the components (gcc, glibc, binutils...), and a second was to be able to
> switch between glibc and uClibc. Of the different tools I tested, crosstool
> was the one most closely matching the requirements, so I ended up using
> that for the following 1.5 years.
>
> But crosstool was not easy to configure, and the available versions of the
> components were most of the time lagging behind. It was glibc-centric, and
> I had to add uClibc support, which was not accepted mainstream.

I came at it from a different background.  I was playing with Linux From 
Scratch almost from the beginning, and did several glibc+coreutils based 
systems (actually predating coreutils, it was three separate packages when I 
started out).

I threw my first build system away in 2003, both because I left an employer 
that might have had some claim to work I'd done on company time and because I 
got serious about uClibc and busybox.  That's what got me into cross 
compiling: I was trying to build an x86-uClibc system from an x86-glibc 
system, and it turns out that's cross compiling.  (I didn't know this at the 
time, I just knew it was fiddly and difficult.)

I started by taking the old uClibc build wrapper apart to see what it was 
actually _doing_, and reproducing that by hand:

  http://lists.busybox.net/pipermail/uclibc/2003-August/027652.html
  http://lists.busybox.net/pipermail/uclibc/2003-September/027714.html

Of course this was so long ago, I still cared about Erik's new "buildroot" 
thing:

  http://lists.busybox.net/pipermail/uclibc/2003-August/027531.html
  http://lists.busybox.net/pipermail/uclibc/2003-August/027542.html
  http://lists.busybox.net/pipermail/uclibc/2003-August/027559.html

Anyway, the 2003 relaunch resulted in the _previous_ FWL incarnation, 
memorialized here:

  http://landley.net/code/firmware/old/

Which was thrown away and rebooted from scratch in 2006 based on proper cross 
compiling when Cross Linux From Scratch came out.

I looked at crosstool circa 2004-ish, but was turned off by the way it 
replicated huge amounts of infrastructure for every single dot release of 
every component.  (I remember it having separate patches, separate build 
scripts, and so on.  I don't even remember what it did per-target.)

I wanted something generic.  A single set of source code, a single set of 
build scripts, with all the variations between them kept as small, simple, 
and contained as possible.

I actually ended up not basing anything off of crosstool.  Instead when Cross 
Linux From Scratch came out, I learned from that, and by asking a lot of 
questions of coworkers at an embedded company I worked at for a year 
(Timesys).  But this was 2006, so it was after you'd already started with 
this.

Along the way I wrote this:
  http://landley.net/writing/docs/cross-compiling.html

Which Timesys's marketing department wound up influencing a bit.  If I was 
writing it today it would be titled "why cross compiling sucks" and would be 
a lot longer...

Anyway, the project I'm working on now is either the third or the fourth build 
system I've done, depending on how you want to count it.  The last two have 
been designed around _removing_ stuff rather than adding it.  Figuring out 
what I could do without, and how to get away with it.

> In the end, maintaining my own tree became problematic, and I decided to
> give a try at enhancing crosstool with the following main goals in mind,
> in this approximate order of importance:
>
>  a- ease overall maintenance
>  b- ease configuration of the toolchain
>  c- support newer versions of components
>  d- add new features
>  e- add alternatives where available

Can't really argue with those goals, although mostly because they're a bit 
vague.

My current build system has very careful boundaries.  I know what it _doesn't_ 
do.  This is not only because my first couple systems grew out of control 
(adding more and more packages and more and more features), but because I 
watched Erik's buildroot explode from a test harness for uClibc into an 
accidental Linux distribution.

Buildroot started when the uClibc guys decided that the build wrapper couldn't 
work (because libgcc_s.so would always leak a reference to libc.so.6 unless 
you rebuilt the compiler from source).  So they abandoned the wrapper and 
instead made a simple build script to create a uClibc-targeted compiler from 
gcc and binutils.  Then because it was easy to do and a good test of the 
compiler they'd just built, they compiled BusyBox and made a tiny root 
filesystem out of that, packaged up the resulting directory as a filesystem 
image, and built User Mode Linux to run the result.  Thus buildroot was a 
combination compiler generator and test harness for uClibc and BusyBox.

Except that every time a new package was made to work with uClibc (often 
requiring a patch or two, or special configuration), they added the ability 
to build it to the buildroot scripts, both to document how and to make 
regression testing easy.  It quickly blew up to dozens of packages, and 
buildroot discussion took over the uClibc mailing list for a few years 
(eventually I got fed up with it and created a buildroot list on the server, 
and kicked the buildroot discussion off to that list.  Then the uClibc list 
was almost dead for a while until the development community recovered.)  It 
also sucked all Erik's attention away from busybox (which is why I took the 
latter over for a while).

This is why my current system is very carefully delineated.  I know exactly 
what it does NOT do.  It builds the smallest possible system capable of 
rebuilding itself under itself.  I.E. it bootstraps a generic development 
environment for a target, within which you can build natively.  It has to do 
some cross compiling to do this, but once it's done you can _stop_ cross 
compiling, and instead fire up qemu and build natively within that.

Reality is of course slightly more complicated, but I edit down towards that 
vision fairly ruthlessly.  My project will _not_ become a Linux distro, 
although you can build one on top of it if you like (ala Mark's Gentoo From 
Scratch project).

> I mostly saw my changes as an experimental branch of crosstool, which
> would ultimately pick interesting features as they mature, while dumping
> the uninteresting ones. So crosstool would be the stable branch, while
> my work would serve as a kind of testbed. Hence the name: crosstool-NG,
> "NG" for "Next Generation".
>
> Never, at any one time, did I intend this stuff to replace crosstool.
> What happened is that, around the time I was working on this, Dan KEGEL
> became less and less responsive, and changes sent to the list (by any
> one, not just me) took ages to get applied, if they even got applied at
> all.
>
> So that was how crosstool-NG was born to the world...

Yup.  And from that set of assumptions you've done a fairly good job.  What 
I'm mostly disagreeing with is your assumptions.  (I've thrown out and 
restarted my own project several times, because I came to disagree with my 
previous iteration's initial assumptions.  What I was trying to _do_ had 
changed, and starting over was the best way to get to my new goal.)

My current codebase is driven by a desire to challenge my own assumptions (not 
just is there a better way to do this, but am I trying to do the right 
thing?) and a fairly relentless drive to remove stuff.  Just because I got it 
working and spent six months doing so is no excuse for keeping it if I figure 
out how not to need it anymore.  Recent-ish case in point:

  http://landley.net/hg/firmware/rev/606

> =======================================
>
> 2) The way to fulfill the requirements
>
> The first move I made was to start from scratch. That way, it
> sounded to me it would be easier to come up with a good layout of
> things.

I'm all for it. :)

> 2.a) Ease maintenance
>
> At the heart of crosstool was a single script. In there was all the build
> procedures for all the components, from installing the kernel headers
> up to building gdb.

We all start that way. :)

In my case, I separated my design into layers, the four most interesting of 
which are:

download.sh - download all the source code and confirm sha1sums
cross-compiler.sh - create a cross compiler for a target.
mini-native.sh - build a root filesystem containing a native toolchain
system-image.sh - package the root filesystem into something qemu can boot

Each of those layers is as independent of the others as I can make it:  you 
can wget the source yourself without needing download.sh (and it won't 
re-download code that's already there with the right sha1sums), 
cross-compiler.sh produces a reusable cross compiler you can keep and build 
other stuff with, mini-native.sh should be able to use an arbitrary cross 
compiler you happen to have installed as long as it can build appropriate 
target binaries, and system-image.sh creates a system image out of an 
arbitrary directory.
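
The "don't re-download what's already there with the right sha1sum" behavior could look roughly like the sketch below.  This is not the actual download.sh: the names sha1_ok, fetch, download, and SRCDIR are made up for illustration, and it assumes sha1sum and wget exist on the host.

```shell
#!/bin/bash
# Hypothetical sketch, not the real download.sh.
# SRCDIR (an assumed name) is where source tarballs are kept.
SRCDIR="${SRCDIR:-packages}"

# Succeed if file $1 exists and its sha1sum equals $2.
sha1_ok()
{
  [ -f "$1" ] && [ "$(sha1sum < "$1" | cut -d ' ' -f 1)" = "$2" ]
}

# Fetch URL $1 into file $2.  Kept separate so it is easy to swap out.
fetch()
{
  wget -q -O "$2" "$1"
}

# download URL SHA1: fetch only when no good copy is already in $SRCDIR.
download()
{
  local file="${SRCDIR}/${1##*/}"

  mkdir -p "$SRCDIR" || return 1
  if sha1_ok "$file" "$2"
  then
    echo "Confirmed ${file##*/}"
    return 0
  fi

  # Missing or corrupt: fetch it, then verify what arrived.
  fetch "$1" "$file" && sha1_ok "$file" "$2"
}
```

Because the check only looks at the file and its checksum, a tarball dropped into the directory by hand with wget passes just as well as one the script fetched itself.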

There's a build.sh that runs all the stages in sequence, but it's a fairly 
trivial wrapper around the other scripts.  (There's another one, 
host-tools.sh, called between download.sh and cross-compiler.sh.  That one 
exists to isolate the build from variations in the host system, but it's 
entirely optional and skipping it shouldn't change the results if your host 
distro's build environment is reasonable.)
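
To give an idea of how thin such a wrapper can be, here is a sketch.  It is not the real build.sh: the run_stages function and the TOP variable are invented, and which stages take the target name as an argument is a guess.

```shell
#!/bin/bash
# Hypothetical sketch of a build.sh-style wrapper, not the real script.
# Runs each stage in order from directory $TOP (assumed name) for the
# target named in $1, stopping at the first stage that fails.
run_stages()
{
  local arch="$1" top="${TOP:-.}"

  [ -n "$arch" ] || { echo "usage: build.sh TARGET" >&2; return 1; }

  # Assumption: the first two stages are target independent.
  "$top/download.sh" &&
  "$top/host-tools.sh" &&
  "$top/cross-compiler.sh" "$arch" &&
  "$top/mini-native.sh" "$arch" &&
  "$top/system-image.sh" "$arch"
}
```

Since each stage is an ordinary script, any of them can also be run by hand, or skipped, without the wrapper knowing or caring.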

> The first step was to split up this script into smaller ones, each
> dedicated to building a single component. This way, I hoped that it would
> be easier to maintain each build procedure on its own.

I wound up breaking the http://landley.net/code/firmware/old version into a 
dozen or so different scripts.  In my earlier versions the granularity was too 
coarse; in that one it got too fine.  I think my current one has 
the granularity about right; each script does something interesting and 
explainable.

I factored out some common code into sources/include.sh and 
sources/functions.sh, but it's all things that should be immediately obvious.

For example, the first package build in cross-compiler.sh is:

# Build and install binutils

setupfor binutils build-binutils &&
AR=ar AS=as LD=ld NM=nm OBJDUMP=objdump OBJCOPY=objcopy \
        "${CURSRC}/configure" --prefix="${CROSS}" --host=${CROSS_HOST} \
        --target=${CROSS_TARGET} --with-lib-path=lib --disable-nls \
        --disable-shared --disable-multilib --program-prefix="${ARCH}-" \
        --disable-werror $BINUTILS_FLAGS &&
make -j $CPUS configure-host &&
make -j $CPUS CFLAGS="-O2 $STATIC_FLAGS" &&
make -j $CPUS install &&
cd .. &&
mkdir -p "${CROSS}/include" &&
cp binutils/include/libiberty.h "${CROSS}/include"

cleanup binutils build-binutils

The sources/include.sh file sets all those environment variables 
(autodetecting $CPUS and getting target-specific information from the target 
you selected in sources/targets).

The functions setupfor and cleanup are in sources/functions.sh but setupfor is 
mostly doing "tar xvjf packages/binutils-*.tar.bz2" and cd-ing to the 
appropriate directory, and cleanup is more or less "rm -rf".
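
A stripped-down sketch of what such a pair of helpers might look like follows.  The real functions do more than this; SRCDIR and WORK are assumed names, the directory layout is inferred from the binutils example above, and it relies on a tar that autodetects compression and supports --strip-components (GNU tar does).

```shell
#!/bin/bash
# Hypothetical sketch of setupfor/cleanup, not the real functions.sh.
# SRCDIR holds the tarballs, WORK is a scratch area (both assumed names).
SRCDIR="${SRCDIR:-$PWD/packages}"
WORK="${WORK:-$PWD/build/temp}"

# setupfor PACKAGE [BUILDDIR]: extract PACKAGE's tarball into
# $WORK/PACKAGE, point CURSRC at it, and cd into the source directory
# (or into a separate build directory when one is named).
setupfor()
{
  local tarball

  # Pick the first tarball matching the package name.
  for tarball in "$SRCDIR/$1"-*.tar.*; do break; done
  [ -f "$tarball" ] || { echo "No tarball for $1" >&2; return 1; }

  rm -rf "${WORK:?}/$1" &&
  mkdir -p "$WORK/$1" &&
  tar -xf "$tarball" -C "$WORK/$1" --strip-components=1 || return 1
  CURSRC="$WORK/$1"

  if [ -n "$2" ]
  then
    mkdir -p "$WORK/$2" && cd "$WORK/$2"
  else
    cd "$CURSRC"
  fi
}

# cleanup DIR...: step out of the build and delete the named directories.
cleanup()
{
  local dir

  cd "$WORK" || return 1
  for dir in "$@"
  do
    rm -rf "${WORK:?}/$dir"
  done
}
```

With that layout, "setupfor binutils build-binutils" leaves you in $WORK/build-binutils with the source next door in $WORK/binutils, which is why the example can "cd .." and then copy binutils/include/libiberty.h.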

Notice there is _nothing_ target-specific in there.  All the target 
information is factored out into sources/targets.  The build scripts 
_do_not_care_ what target you're building for.

> 2.b) Ease configuration of the toolchain
>
> As it stood, configuring crosstool required editing a file containing
> shell variable assignments. There was no proper documentation of which
> variables were used, and no clear explanation of each variable's
> meaning.
>
> The need for a proper way to configure a toolchain arose, and I quite
> instinctively turned to the configuration scheme used by the Linux
> kernel. This kconfig language is easy to write. The frontends that
> then present the resulting menuconfig have limitations in some corner
> cases, but they are maintained by the kernel folks.

Yeah, I modified menuconfig for busybox a few years back so the darn visibility 
logic didn't prevent it from writing symbols out to the .config file, so I 
could create ENABLE symbols that you could reliably use with if (ENABLE) and 
thus rely on dead code elimination instead of littering the code with 
#ifdefs.  (I recreated that for my toybox project, and fed that code to Mark 
for the menuconfig he's using for Gentoo From Scratch.)
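
To illustrate why every symbol has to make it into .config, here is a sketch of turning a kconfig-style .config into ENABLE macros that are always defined as 1 or 0.  It is not the busybox or toybox implementation; the function name config_to_enable is invented, and it only handles boolean symbols.

```shell
#!/bin/bash
# Hypothetical sketch: read a kconfig-style .config on stdin and emit a
# C #define for each boolean symbol, 1 when enabled and 0 when disabled.
# Non-boolean lines (strings, numbers) are ignored.
config_to_enable()
{
  sed -n \
    -e 's/^CONFIG_\([A-Za-z0-9_]*\)=y$/#define ENABLE_\1 1/p' \
    -e 's/^# CONFIG_\([A-Za-z0-9_]*\) is not set$/#define ENABLE_\1 0/p'
}
```

A disabled symbol only gets its "0" definition if its "is not set" line is present in the input.  A symbol that is missing from .config entirely would leave the macro undefined, and C code written as "if (ENABLE_FOO)" would then fail to compile instead of being optimized away.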

I've also been doing miniconfigs for years, even tried to push an improved UI 
for them upstream into the kernel (with documentation) at one point:

  http://lwn.net/Articles/160497/
  http://lwn.net/Articles/161086/

The sources/targets directories in FWL each require three files:

  1) miniconfig-uClibc
  2) miniconfig-linux
  3) details (defines some environment variables describing the target).

And basically what you do is:

  ./build.sh targetname

Which would try to read config files from "sources/targets/targetname" so it 
can build a cross compiler, root filesystem directory, and bootable system 
image for that target.
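
The lookup step can be sketched like this.  Only the three file names and the sources/targets location come from the description above; the load_target function and the TARGETS variable are hypothetical, and the real scripts may handle this differently.

```shell
#!/bin/bash
# Hypothetical sketch of target lookup, not the real FWL code.
# TARGETS (assumed name) is the directory holding one subdirectory
# per target.
TARGETS="${TARGETS:-sources/targets}"

# load_target NAME: check that the target directory is complete, then
# source its "details" file so its variables land in the current shell.
load_target()
{
  local dir="$TARGETS/$1" file

  [ -n "$1" ] && [ -d "$dir" ] ||
    { echo "Unknown target '$1'" >&2; return 1; }

  for file in miniconfig-uClibc miniconfig-linux details
  do
    [ -f "$dir/$file" ] ||
      { echo "$1: missing $file" >&2; return 1; }
  done

  . "$dir/details"
}
```

Adding a new target under this scheme means adding a directory with those three files; none of the build scripts need to change.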

> Again, as with the build scripts above, I decided to split each component's
> configuration into separate files, with an almost 1-to-1 mapping.
>
> Of course, there are configuration sections that do not apply to a
> specific component, but to the overall toolchain: the place to install
> it, the target and its options (BE/LE, CPU variants...). And some options
> tell crosstool-NG how to behave: the place to find source tarballs, log
> verbosity, and so on...

Ok, a few questions/comments that come to mind here:

1) Why do we have to install your source code?  The tarball we download from 
your website _is_ source code, isn't it?  We already chose where to extract 
it.  The normal order of operations is "./configure; make; make install".  
With your stuff, you have to install it in a second location before you can 
configure it.  Why?  What is this step for?

2) Your configuration menu is way too granular.  You ask your users whether or 
not to use the gcc "-pipe" flag.  What difference does it make?  Why ask 
this?  Is there a real benefit to bothering them with this, rather than just 
picking one?

I want to do a more detailed critique here, but I had to reinstall my laptop a 
couple weeks ago and my quick attempt to bring up your menuconfig only made 
it this far:

./configure --prefix=/home/landley/cisco/crosstool-ng-1.3.2/walrus
Computing version string... 1.3.2
Checking for '/bin/bash'... /bin/bash
Checking for 'make'... /usr/bin/make
Checking for 'gcc'... /usr/bin/gcc
Checking for 'gawk'... not found
Bailing out...

I note that Ubuntu defaults to having "awk" installed.  Why you _need_ the GNU 
version specifically is something I don't understand.  This is an issue 
I've bumped into in other contexts, and here's my standard response:

  http://lkml.indiana.edu/hypermail/linux/kernel/0701.1/2066.html

I remember from getting crosstool-ng working last time that it wanted a bunch 
of other random stuff (none of which my build system needs to make cross 
compilers or root filesystems).

For example, you require libtool.  Why are you checking for libtool?  I note 
that libtool exists to make non-ELF systems work like ELF, I.E. it's a NOP on 
Linux, so it's actually _better_ not to have it installed at all because 
libtool often screws up cross compiling.  (In my experience, when a project 
is designed to do nothing and _fails_ to successfully do it, there's a fairly 
high chance it was written by the FSF.  One of the things my host-tools.sh 
does is make sure libtool is _not_ in the $PATH, even when it's installed on 
the host.  Pretty much the only things that use it are FSF packages, and they 
all have autoconf notice it's not there and skip it.  Except for binutils, 
which bundles its own version and doesn't use the host's anyway...)
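
One way to keep a tool like libtool out of the $PATH even when the host has it installed is to build a directory of symlinks to an explicit list of commands and point $PATH at only that directory.  The sketch below shows the idea; it is an assumption about the approach rather than a copy of host-tools.sh, and the names find_in_path and make_host_dir are invented.

```shell
#!/bin/bash
# Hypothetical sketch, not the real host-tools.sh.

# Print the full path of the first executable named $1 found in $PATH.
find_in_path()
{
  local dir IFS=:

  for dir in $PATH
  do
    dir="${dir:-.}"
    [ -x "$dir/$1" ] && [ ! -d "$dir/$1" ] &&
      { echo "$dir/$1"; return 0; }
  done
  return 1
}

# make_host_dir DIR COMMAND...: fill DIR with symlinks to the listed
# host commands.  Anything not listed stays invisible once PATH is set
# to DIR alone.
make_host_dir()
{
  local hostdir="$1" cmd path

  shift
  mkdir -p "$hostdir" || return 1
  for cmd in "$@"
  do
    path="$(find_in_path "$cmd")" ||
      { echo "Missing host tool: $cmd" >&2; return 1; }
    ln -sf "$path" "$hostdir/$cmd" || return 1
  done
}
```

Called as, say, make_host_dir build/host gcc ar as ld make sed (an illustrative list, not the real one), followed by setting PATH to just build/host, any configure script probing for libtool would come up empty no matter what the host distro has installed.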

It's 7am here and I haven't been to bed yet, so I'll pause here.  I need to 
download the new version of crosstool-ng in the morning, fight with getting 
it installed again, and pick up from here.

Rob
-- 
GPLv3 is to GPLv2 what Attack of the Clones is to The Empire Strikes Back.


