The cardinal sin of Scratchbox, a backhanded stab at the Maemo SDK+, and the promise of a pot of gold

Sunday, 7. Dec 2008

Ok, this is a long one.  This has been floating around my brain for a while, and it is time to lay it to rest.  Get a coffee and make yourself comfortable.

Scratchbox is a success, and life has adapted to its kinks.  Our software has formed calluses around the places where Scratchbox used to cause pain.

If all goes according to plan, then the Maemo SDK will move from Scratchbox 1 to Scratchbox 2 in the near future.  This might require our software to develop calluses in new places, but that is OK.  The benefits of SDK+ will hopefully outweigh the cost of switching.

So I think this is as good an opportunity as any to take a look at what, in my opinion, Scratchbox and the Maemo SDK get wrong.

The big sin of Scratchbox 1 is wanton redirection to places that are outside the target.  The Maemo SDK inherited that sin without questioning it, and I am afraid the Maemo SDK+ will continue with it, even though Scratchbox 2 would allow us to repent.

By way of example, consider how Scratchbox and the Maemo SDKs handle “autoconf”.

The nice trick of Scratchbox 1 and 2 is to put you in an environment that looks and feels like a native ARM machine (say) by emulating the CPU, but processor-hungry tools like the compiler are still natively fast.  This trick is achieved by provisioning alternative binaries of the processor-hungry tools. These alternative binaries are functionally equivalent to the real ones, but they use the instruction set of the native CPU, not the emulated one.  Scratchbox magically executes them instead of the real binaries and thus avoids the emulation overhead.

So, how would you expect Scratchbox to handle autoconf, which is a shell script?

I would expect Scratchbox to do nothing about the /usr/bin/autoconf script itself.  The desired speedup comes from having an alternative binary for /bin/sh, the interpreter.

Still, Scratchbox has autoconf in /scratchbox/tools/bin/ and accesses to /usr/bin/autoconf get redirected to /scratchbox/tools/bin/autoconf. Why?

The only reason can be bootstrapping or more generally, supporting incomplete systems.  This is another basic trick of Scratchbox: you have a running autoconf even when your /usr/bin is still empty.  This can be a great help when breaking the cyclic dependencies so typical of bootstrapping.

But, do we need the redirection magic for this?  After all, if /usr/bin is empty, the shell will not find autoconf there and will not even attempt to run /usr/bin/autoconf.  That’s why /scratchbox/tools/bin is included in your PATH by default, and autoconf and a host of other much more useful tools such as ls, cp, and mkdir are available to you from there even if your root filesystem is empty.

But, as soon as you have bootstrapped your system enough to have its own mkdir or autoconf, Scratchbox should step back and hand control over to the growing system.  The emulation magic of Scratchbox will make sure that the system’s tools can be used even when they are compiled for a different architecture.

Thus, redirection is not really useful for bootstrapping or incomplete systems, and should not be used to make ghosts of binaries available in places that are really empty.  Stepping back and handing over control can then be as simple as putting /scratchbox/tools/bin at the end of PATH.
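
To make the PATH argument concrete, here is a minimal sketch of how command lookup behaves when the tools directory comes last.  The helper name first_in_path is made up for this illustration; it only mimics the shell's own lookup and is not part of Scratchbox.

```shell
#!/bin/sh
# first_in_path NAME PATHLIST
# Print the first executable called NAME found in the colon-separated
# PATHLIST, mimicking how the shell resolves a command name.
# (Hypothetical helper for illustration, not a Scratchbox tool.)
first_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# With the tools directory last, the target's own autoconf wins as soon
# as it exists; until then the lookup falls through to the tools copy.
first_in_path autoconf "/usr/bin:/bin:/scratchbox/tools/bin" || true
```

While /usr/bin is still empty, the lookup falls through to /scratchbox/tools/bin; once the system has its own autoconf, that one is found first, and no redirection is involved at any point.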

This is not what Scratchbox does, and I frankly don’t know why.  It’s easy to fix, by changing the default path and redirection rules, but still it puzzles me why the defaults for the Maemo SDK are the way they are.  Defaults matter a lot once build bots enter the picture, since they run the Maemo SDK in its default configuration.

The way Scratchbox tries to stay in control of your growing system, you might think it is jealous and doesn’t want it to grow up enough to become independent.  With the current setup, you are bound to grow not only incomplete, but also inconsistent systems that are forever bound to run nowhere else but in Scratchbox.

As a simple example, some portable programs hardcode the location of perl into their scripts.  If you need these programs during early phases of bootstrapping, before you have added perl itself to the system, they will pick up /scratchbox/tools/bin/perl and will happily go to work bootstrapping the system further.  Later, after bringing perl into the system, you will retrace the dependency cycle so that the programs will now hardcode /usr/bin/perl into their scripts, the real location of perl in the system that you are building.  With the default Scratchbox, this is not what is happening: even on the second round, the programs will hardcode /scratchbox/tools/bin/perl.  This of course will not work when the program is running in a real Maemo device that only has /usr/bin/perl.

(Incidentally, some scripts in the Maemo SDK+ rootstraps still refer to programs in /scratchbox/tools/bin.)
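
Leaks of this kind are easy to look for.  The following is a rough sketch, not an existing SDK tool: the function name list_leaked_shebangs is invented here, and it only inspects the first line of each file, so paths hardcoded deeper inside a script go unnoticed.

```shell
#!/bin/sh
# list_leaked_shebangs ROOT [PREFIX]
# Print every regular file below ROOT whose "#!" line mentions PREFIX
# (default: /scratchbox/).  Hypothetical helper for illustration only.
list_leaked_shebangs() {
    root=$1
    prefix=${2:-/scratchbox/}
    find "$root" -type f | while IFS= read -r file; do
        # Look at the first line only; NUL bytes from binaries are dropped.
        first_line=$(head -n 1 "$file" 2>/dev/null | tr -d '\000')
        case $first_line in
            '#!'*"$prefix"*) printf '%s: %s\n' "$file" "$first_line" ;;
        esac
    done
}
```

Running it as, say, list_leaked_shebangs /path/to/rootstrap (the path is a placeholder) prints one line per offending script together with its interpreter line.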

Worse, Scratchbox’s Perl might be sufficiently different from the one you want to have in your system, courtesy of devkits being a notoriously hard to maintain and slow distribution mechanism.

The same thing continues to happen even when bootstrapping should be long over.  Essentially, what used to be a perfectly portable program has to be changed to compile correctly in the Maemo SDK.  This is a useless activity, much more useless than making a program cross-compilable.

Thus, /scratchbox/tools/bin etc. should be at the end of PATH.  But that is not enough: Scratchbox will still redirect /usr/bin/perl to /scratchbox/tools/bin/perl and you still have to fight against the core devkit.  Except now everything turns to magic, “which” is lying to you, and you can spend the better part of the day figuring out why things don’t behave the way you expect them to.

So, redirection is useless and harmful and the default PATH inside Scratchbox is wrong.

But what about the first trick, avoiding the expensive CPU emulation for processor-hungry programs like the compiler?  The speedup is crucial and it is worth tolerating a bit of redirection ugliness to get it.  If we can get the speedup in a non-ugly way, that would be better.

The first important thing to realize is that the list of binaries to speed up can be quite short and very explicit: /bin/sh, the coreutils, /usr/bin/make, /usr/bin/perl, /usr/bin/m4, /usr/lib/gcc/*/cc1, /usr/lib/gcc/*/cc1plus, /usr/bin/as, and /usr/bin/ld should get us pretty far.  It might even be enough.  There is no point in trying to speed up interpreted programs and libraries like shell scripts and Perl modules.  Bootstrapping is over: we redirect /usr/bin/perl to speed it up, not because we don’t have it yet.

The second important point is that to speed up /usr/bin/perl in this way, it has to be replaced with a functionally equivalent program. Same version, same compile time configuration, same everything except for the target instruction set.  Improving qemu to do some JITing with caching would be great.  Taking the configured source and compiling it again but this time for Intel is probably a more realistic approach for the time being.

In other words, I don’t want to use Scratchbox’s i386 Perl binary to speed up Maemo’s armel Perl binary.  I obviously want to use Maemo’s i386 Perl to speed up Maemo’s armel Perl.

I also don’t want to use Debian’s Perl to speed up Maemo’s Perl.  This is what the Maemo SDK+ is doing and it is just as wrong as using Scratchbox’s Perl.
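
One cheap sanity check for “same version” is to compare what the package databases of the two roots say.  The sketch below assumes both roots are Debian-style trees with a var/lib/dpkg/status file; the function names pkg_version and same_pkg_version are made up for this illustration.  Equal version strings are necessary but not sufficient, since they say nothing about compile-time configuration.

```shell
#!/bin/sh
# pkg_version ROOT PKG
# Print the Version field recorded for PKG in ROOT/var/lib/dpkg/status.
# (Hypothetical helper; prints nothing if PKG is not listed.)
pkg_version() {
    awk -v pkg="$2" '
        /^Package: / { current = $2 }
        /^Version: / && current == pkg { print $2; exit }
    ' "$1/var/lib/dpkg/status"
}

# same_pkg_version TARGET_ROOT TOOLS_ROOT PKG
# Succeed only if PKG is recorded with the same version in both roots.
same_pkg_version() {
    v_target=$(pkg_version "$1" "$3")
    v_tools=$(pkg_version "$2" "$3")
    [ -n "$v_target" ] && [ "$v_target" = "$v_tools" ]
}
```

With placeholder paths, a check before accelerating Perl could look like: same_pkg_version /path/to/armel /path/to/i386 perl || echo "perl differs, do not accelerate".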

Scratchbox 2 is not so much redirecting executables; what it does is better described as assembling a new filesystem according to mapping rules.  It does this in a much cleaner and much more flexible way than Scratchbox 1.

What the Maemo SDK+ does with it, however, is to again shadow parts of Maemo freely and massively with things from outside of Maemo.  Almost everything comes from a Debian etch chroot.  Why?

This time, the reason seems to be that someone tried to learn a lesson from devkits.  Devkits are hard to keep up-to-date and people continuously complained about them being behind Debian.  So, the solution was to replace the devkits with Debian.

This is still missing the point, though: For people working on Maemo, Debian is just as hard to maintain as the devkits, even more so.  Just try to seriously answer the question which Debian distribution we should be using exactly for (half of) our build dependencies.  Stable? Come on, even devkits are more recent.  Testing?  Maybe, but why not unstable?  Maybe some packages from unstable and some from testing? Maybe people can decide on their own?

The complaint that devkits are out-of-date should be countered by removing devkits from the picture, but we shouldn’t stop there. General redirection to places outside of Maemo needs to stop.  Maemo itself should contain exactly what we need.  Build dependencies in devkits: bad, build dependencies in a distribution: hmmkay-ish.  Build dependencies in our own distribution: for the win!

We should not use Perl from Debian, we should use our own Perl.  It’s part of Maemo and not using it in the SDK is schizophrenic.  If we need something from Debian, we should import it and make it part of Maemo, explicitly and in a controlled way, like we have been doing since the dawn of Maemo for packages that are primarily meant for the devices.

To put it another way: I want a simple setup that allows us to work on Maemo itself.  I don’t want an SDK that is a mix of packages from an old version of Debian and from Maemo, implemented via obscure and overly specific hacks to high-level tools like dpkg-checkbuilddeps, and having to switch between underdocumented virtualization modes depending on whether I want to run “make all” or “make install”. Maemo can and should be big enough to contain its own build dependencies, and the virtualization tools should behave in a simple way.

So here is what I did as an experiment:

  • I installed the Maemo SDK+ with both diablo_4.1.1 rootstraps.
  • I modified the configuration so that the armel rootstrap uses the i386 rootstrap as its “tool root”, and I disabled the gcc magic that Scratchbox 2 usually does.
  • I wrote a new mapping mode for Scratchbox 2 that essentially gives you the following layout:
    / -> diablo_4.1.1_armel rootstrap
    /home -> host
    /dev -> host
    /proc -> host
    /sys -> host
    /etc/passwd -> host
    /etc/resolv.conf -> host
    /usr/share/scratchbox2 -> host

At this point, you should have a working, fully emulated Diablo 4.1.1 environment. (Maybe this is the same as the “emulate” mode of Scratchbox 2, but I prefer to start from scratch in order to understand the magic better.)
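
The layout above can be read as a small path-translation function.  The sketch below is written in plain shell and is not Scratchbox 2 rule syntax; it only spells out which host path each path inside the target ends up at, assuming that specific entries take precedence over the catch-all “/” entry.  The name map_path and the ROOTSTRAP location are placeholders.

```shell
#!/bin/sh
# Placeholder location of the armel rootstrap on the host.
ROOTSTRAP=${ROOTSTRAP:-/path/to/diablo_4.1.1_armel}

# map_path PATH
# Print the host path that PATH inside the target refers to.
# Illustration only: real mapping modes are defined by Scratchbox 2.
map_path() {
    case $1 in
        # Directories passed straight through to the host.
        /home|/home/*|/dev|/dev/*|/proc|/proc/*|/sys|/sys/*)
            printf '%s\n' "$1" ;;
        # Individual files taken from the host.
        /etc/passwd|/etc/resolv.conf)
            printf '%s\n' "$1" ;;
        /usr/share/scratchbox2|/usr/share/scratchbox2/*)
            printf '%s\n' "$1" ;;
        # Everything else lives in the rootstrap.
        *)
            printf '%s\n' "$ROOTSTRAP$1" ;;
    esac
}
```

For example, map_path /etc/passwd yields the host’s /etc/passwd, while map_path /etc/fstab yields the copy inside the rootstrap.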

You can compile stuff, but it is slow. Also, qemu is less than transparent and can’t seem to run /usr/bin/make, for example, which is a bit of a show stopper. Also also, find, xargs, and md5sum are missing and some tools expect /scratchbox/tools/bin to be there… oh well.

  • The following mappings make native binaries visible inside the Scratchbox target:
    /tools -> diablo_4.1.1_i386 rootstrap, read only
    /opt/maemo/ -> host
  • Then some symlinks for speed:
    /bin/bash -> /tools/bin/bash
    /usr/bin/make -> /tools/usr/bin/make
    /usr/bin/m4 -> /tools/usr/bin/m4
    /usr/bin/perl -> /tools/usr/bin/perl
  • Also for Perl .so modules:
    /usr/lib/perl/5.8/auto/File/Glob/Glob.so -> /tools/...
    ...
  • For the compiler, I made a small wrapper since as a cross compiler, it doesn’t dare to look into the standard places for header files and libraries. This needs to be done more cleanly and more thoroughly of course and maybe I should be using Scratchbox’s magic instead. But for now I prefer it explicit:
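    #! /bin/sh
    # Wrapper around the cross compiler: add the standard header and
    # library locations that it does not search on its own.
    cc=/opt/maemo/tools/arm-2005q3/bin/arm-none-linux-gnueabi-gcc
    exec "$cc" -I/usr/include -L/usr/lib -Wl,-rpath-link -Wl,/usr/lib "$@"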

With this, “dpkg-buildpackage” runs successfully and with no real slowdown. I am happy.

Open issues:

  • Fakeroot doesn’t seem to work. (I used the fantastic “sb -R” to get around this. Love it. But fakeroot just needs to be there as well for completeness.)
  • The cross-compiler toolchain needs to move into Maemo and needs to be properly tuned.
  • Some tools need to be written to manage the symlinks, maybe in cooperation with the actual packages whose binaries we redirect.
  • The Maemo warts need to be removed, such as the missing /usr/bin/find.
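
As a starting point for the symlink tooling, here is a rough sketch of what such a helper could look like.  Everything in it is an assumption for illustration: the function names accelerate and decelerate, the “.emulated” backup suffix, and the TOOLS_PREFIX variable.  It does not address what happens when a package upgrade replaces a symlinked binary.

```shell
#!/bin/sh
# Where the native tool root is visible inside the target (as above).
TOOLS_PREFIX=${TOOLS_PREFIX:-/tools}

# accelerate ROOT PATH...
# Replace each ROOT/PATH with a symlink to TOOLS_PREFIX/PATH and keep
# the original binary next to it with an ".emulated" suffix.
accelerate() {
    root=$1
    shift
    for path in "$@"; do
        target="$root$path"
        [ -L "$target" ] && continue    # already redirected
        [ -e "$target" ] || continue    # nothing to speed up
        mv "$target" "$target.emulated" &&
            ln -s "$TOOLS_PREFIX$path" "$target"
    done
}

# decelerate ROOT PATH...
# Undo accelerate: drop the symlink and restore the original binary.
decelerate() {
    root=$1
    shift
    for path in "$@"; do
        target="$root$path"
        if [ -L "$target" ] && [ -e "$target.emulated" ]; then
            rm "$target" && mv "$target.emulated" "$target"
        fi
    done
}
```

A call such as accelerate /path/to/armel-rootstrap /bin/bash /usr/bin/make /usr/bin/m4 /usr/bin/perl (with a placeholder rootstrap path) would recreate the symlinks listed earlier, and decelerate with the same arguments would restore the fully emulated state.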

That’s it! Thanks for reading, and happy crossing!

12 Responses to “The cardinal sin of Scratchbox, a backhanded stab at the Maemo SDK+, and the promise of a pot of gold”

  1. Lauri Leukkunen said

    Interesting :)

    There seem to be three ways of approaching cross-compilation with SB2:

    * Use the host for everything, run as little ARM code and use as few files from the target directories as possible while building. This is what the default “simple” sb2 mode does.

    * Use the emulation for everything, run specific things from the host. This seems to be what you’ve built.

    * Aim for 100% maemo buildability without modifying the maemo distro itself. This is what the SDK+ project has done. This is arguably the hardest nut to crack, but the SDK+ team has done a pretty good job, employing some pretty impressive amounts of explosives while doing it.

    Which one is right depends on the requirements. Personally I like the simple approach, and I would go so far as to build my distro so that I wouldn’t rely solely on SB2 for cross-compiling. Since there are many open source components that support cross-compiling just fine, I’d use their built-in support whenever possible, then tackle a large number of others with SB2, and for the really strange ones I might simply resort to stabbing at their build systems.

    This way I can keep sb2 mapping rules reasonably clean, won’t rely excessively on qemu’s functionality, nor would I create unneeded new complications for already perfectly cross-compiling components.

    But, as these things go, I understand pretty well why SDK+ has gone the path they’ve chosen.

    I would welcome your new “emulation with native accelerations” mode to sb2, if you feel like sending it.

  2. Marius said

    Lauri, there is the additional twist that my ‘host’ is the target system again, except compiled for i386.

    Without having too much experience, I’d say that the “host for everything” approach is good for small systems that will never grow big enough to include a compiler.

    One could argue that Maemo should be such a small system, but right now it’s not.

    Yep, we have to decide whether to trim Maemo down, or whether to let it grow up. Right now, it is not really clear what it wants to be.

    I am voting to let it grow up.

    As to the reason why the SDK+ is the way it is… well. Changing Maemo is not impossible, and I know firsthand what kind of a mess can result from not doing things right because it is easier to hack in your own corner than to go across the corridor. I have made that mistake with the SSU infrastructure, and boy did I ever create a monster.

    (But then again, it might not have happened at all otherwise.)

    Before getting serious about the “emulation with native acceleration mode”, I’d need to figure out how to handle all the symlinks. Just putting them in the filesystem doesn’t really work when upgrading things…

  3. Lauri Leukkunen said

    Making Maemo a true Debian derivative might make sense today, but it certainly wasn’t the case for 770. Nokia has the headcount to pull it off, but it would require a kind of long term thinking I’m just not seeing. Although it would probably be easier than trying to continue as things are.

    A big benefit of letting Maemo grow up is that then it would be possible for Maemo developers to eat their own dogfood. Today the only environment where anybody is exposed to using the thing is on the device, and to be honest, how many minutes per month does the average Maemo developer actually use an Internet Tablet for anything other than on-target debugging? I know I haven’t used a tablet at all since August.

  4. Veli Kaksonen said

    Hello Marius,

    The general idea of SB was to always use host tools when possible. This is because of speed issues. If in some cases you want to use target tools then you just need to set the SBOX_REDIRECT_IGNORE environment variable (SB1)…

    The idea was that the default behavior always brings speed and that you can switch to target tools if this brings you trouble.

    It is probably also worth noticing that when we started with SB there was no Qemu or any other speedy emulator available. Also our hardware was quite slow. So all operations done on ARM were very slow.

    Still I see that this default is the right way to go.

    http://www.scratchbox.org/cgi-bin/darcsweb.cgi?r=1.0/scratchbox;a=headblob;f=/doc/variables.txt

    – Veli

  5. What’s really cool about your posting is that it shows that SB2 is actually manageable enough to be hacked. Nobody would have *ever* done this kind of experiment with SB1 =)

    The software in “tools” distro and “target” distro should indeed be of identical version. If the target has deviated from Debian, using “pure” Debian for tools distro doesn’t make any sense. Since, for various reasons, we can’t use Debian as target distro (size is only one of the reasons), maemo distro needs to carry all tools users could expect to use.

  6. Jussi Hakala said

    Nice experiment.

    However, I don’t agree with you on everything.

    For instance, I think the target’s tools should not be prioritized over the host ones. You may end up with your tools changing over the build process (yes, I know about shell caching the binary locations), slowing it down and quite possibly breaking the build process. In any case, the end result would be that not everything is built using the same set of tools.

    Furthermore, using the emulation for everything is just a recipe for disaster, at least with the current cpu transparency method. User space qemu is not perfect, and substituting tools native to your host with target binaries running under qemu will just increase the probability of something going wrong, like moving further when the ice under you is already thin. The user space qemu needs to be fixed or another method of cpu transparency used.

    I agree with you that maemo should be a complete distribution. Currently there’s been no real interest in providing the software sb1 is providing for initial bootstrapping process (and for faster build times) and that’s a bit of a shame.

  7. Thomas said

    Really, I wonder whether you could just have an environment suitable for Debian’s cowbuilder, schroot, qemubuilder, and qemu. Then you can develop on the host and build packages from source with a single command, test them in a chroot on the host and then compile and test in a qemu environment with only five simple commands (you also need dpkg-buildpackage -S) to learn. Even better, it’s the same commands to do stuff for Debian or Ubuntu.

  8. FelipeC said

    Cool stuff!

    I agree that the Maemo rootstrap should contain everything needed, like the “devkits”, but I don’t agree that it should contain the toolchain.

    One of the reasons I like sb2 is that I can very easily try different toolchains. This is very important for example in multimedia where you want to optimize as much as possible, and a new compiler with better NEON vectorization stuff might make sense over whatever the Maemo SDK is supposed to work with.

  9. Eero Tamminen said

    > What the Maemo SDK+ does with it, however, is to again shadow parts of Maemo freely and massively with things from outside of Maemo. Almost everything comes from a Debian etch chroot. Why?

    Are you suggesting that we would have the tools replaced with Busybox like on the device? :-)

    AFAIK about the only reason why several hundred SDK packages (most of them aren’t installed on the device, just needed for building) deviate from Debian is that the other stuff they depend on (SB1 internals) is old too. If they can be taken directly from Debian, it’s much easier to update/maintain them and they will be the same in both tools root and target root. Also, the more we can minimize packaging differences to Debian on the device itself as well, the easier this becomes.

    Once SDK+ works fine, I think the tools should be updated to Lenny, but I guess currently SDK+ team has other things to fix.

    > This is still missing the point, though: For people working on Maemo, Debian is just as hard to maintain as the devkits, even more so. Just try to seriously answer the question which Debian distribution we should be using exactly for (half of) our build dependencies. Stable? Come on, even devkits are more recent. Testing? Maybe, but why not unstable? Maybe some packages from unstable and some from testing? Maybe people can decide on their own?

    We should occasionally sync (start new branch) to Debian like all the other Debian derivatives do. If something needs even newer sources, well, I don’t see why they couldn’t/shouldn’t use them. The main point would be avoiding ancient crap (unless there’s a good reason not to).

  10. jeremiah said

    Hi,

    Great blog post – very informative.

    Why wouldn’t you want to use debian’s perl for the SDK+ perl? Debian is considered by the Perl Foundation as being the best distribution of perl out there.

  11. jeremiah said

    As far as which debian distro you should be using, the answer is simple; testing. Testing and unstable are the same, they only deviate when there is a freeze, and that happens every two years for about six months so everything is still pretty fresh anyway. Plus testing is stable, as stable as Ubuntu. That is what Shuttleworth, and others, understood – debian’s testing is a rock solid distro with the latest software. He was a debian developer after all so he could see it from the inside, he was just smart enough to start a business from it.

    If you use testing it will be up-to-date and stable, that is what developers want. :)

  12. vipc said

    very very thank you
