Building GCC

From liblfds.org
Revision as of 11:15, 1 May 2017 by Admin (talk | contribs) (→‎Problems)

Introduction

Most GCC builds on most platforms fail.

By this I mean that most builds of GCC, out of the box, without taking steps to fix the build system and/or the source base, on most platforms, will fail. Not because you have messed it up, but because they are broken as released and cannot work. x86_64 does a lot better than other platforms, and most builds there work.

I have now spent three months working towards building every released version of GCC, starting at 4.1.2, on four platforms: ARM32, ARM64, MIPS32 and x64. I originally also intended to build the matching glibc for each version of GCC.

What I have come away with from this is that it is impossible to build GCC with glibc, and that most GCC builds on platforms other than x64 are broken. They cannot be built. There are a few cases where you can fix the build system or the source code and achieve a build.

Note that you only reach the point where you can know this after many, many weeks of struggling with the build system problems which occur *prior* to the point of unrecoverable failure.

If you begin to think about building GCC and/or glibc, and Google, you will invariably find the advice to boot up an old distro and use the GCC shipped with it. This advice exists for a reason, as you have read above: only the GCC and glibc developers themselves, and perhaps the people who actually create major distros, can build GCC/glibc, and they normally only manage this by fixing the build system and/or source code until a build can occur. To build GCC, you need to know enough to do development work on the GCC source base.

This is a fundamental and profound problem for professional software development. We must be able to choose which versions of the compiler and C library we use to compile our code, not least to ensure our code continues to work on older compilers which are still in widespread use.

Where it is impossible to build glibc, and problematic to build GCC, it is not possible to control the build system, except by keeping old distro releases around and using them to build, test, and release. This is unnecessarily awkward, not least because I'm looking to test with about 40 different versions of GCC. Even finding distros for those old versions is going to be difficult, and there's no reason why I can't have multiple GCC and glibc versions on my current machine, except for the problems with their build systems. It also precludes real benchmarking, as such distros are likely to be run in virtual machines.

It cannot be, for serious, professional development, that we use version a.b.c of a compiler, and then are forced off that version by updates to our operating system, and are unable then to go back and continue using that old version (at least in parallel with the new version).

My actual advice after all of this is probably to use clang. I've not tried it yet, but from what I've seen, it has a sane and viable build system.

Dependencies

When building GCC, there are a number of tools and libraries you need to be aware of, and know how to use.

  • the GCC source code itself
  • binutils
  • glibc
  • libgmp
  • libmpfr
  • libmpc
  • libisl
  • CLooG-PPL
  • libppl

Of these, CLooG-PPL, libppl and libisl have been used at various times up to GCC 4.8.0 to implement loop optimization code. If they're not provided when building GCC, the loop optimization goodies will not be compiled in, but GCC will compile. I didn't need them, and there was already plenty of agony getting anything to compile at all, so I've never tried using them.

libgmp, libmpfr and libmpc are dependency libraries used by GCC for handling math of various kinds - I don't know much more than that. Only libgmp and libmpfr were used up to 4.5.0, and then libmpc began to be needed as well.

glibc is of course the C library.

binutils is a package which contains a set of vital tools used in conjunction with GCC, such as ar and ld.

Building

Basically, you get hold of the GCC source code, then you make a directory that you will build in, change into that directory and from there call configure in the GCC source directory. You then call make. Do not build in the GCC source directory itself - always configure and make in a separate directory.

When GCC builds, it self-bootstraps. It first builds itself from the sources using the system compiler, then uses that version of itself to build itself again, and then uses that second version to compile itself a third time.

GCC has the notion of the build system, the host system and the target system - so you can for example build on your x86_64 machine a compiler which is to be run on ARM32 which emits code which runs on MIPS32. For all the work I've done, I've been doing what's known as a native build - the build, host and target systems are identical.

Regarding libgmp, libmpfr and libmpc, these can either be provided as source code and placed into the GCC directory, and then GCC will automatically configure and build them, or you can use the versions which come with your distro (in which case you need to install the libraries and their development packages).
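
As a rough illustration of the in-tree option, the sketch below symlinks an unpacked library source directory into the GCC source tree under its unversioned name. The function name link_prereq and the directory names in the usage line are made up for illustration, and it assumes the unpacked directories are named in the form name-version (gmp-4.3.2 and so on).

```shell
# Hypothetical helper: expose an unpacked library (e.g. gmp-4.3.2) inside the
# GCC source tree under its unversioned name (e.g. gmp).
link_prereq() {
    gcc_src=$1
    lib_dir=$2
    if [ ! -d "$gcc_src" ] || [ ! -d "$lib_dir" ]; then
        echo "link_prereq: missing directory" >&2
        return 1
    fi
    lib_abs=$(cd "$lib_dir" && pwd) || return 1
    base=$(basename "$lib_abs")
    name=${base%%-*}            # gmp-4.3.2 -> gmp
    if [ -e "$gcc_src/$name" ]; then
        echo "link_prereq: $gcc_src/$name already exists" >&2
        return 1
    fi
    ln -s "$lib_abs" "$gcc_src/$name"
}

# Example (paths are placeholders):
#   link_prereq gcc-4.5.0 gmp-4.3.2
```

A symlink is used rather than a move so the same unpacked library can be tried against several GCC trees.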

I gave up trying to build glibc, so I just use the version which comes with the distro.

Binutils claims it can be built in the same way as libgmp, etc, by having its directory placed into the GCC source directory before calling GCC's configure. THIS IS NOT THE CASE. IT DOES NOT WORK. THE DOCUMENTATION IS A LIE. With libgmp, etc, they go into the root of the GCC source tree, so libgmp turns up as gmp. Binutils however has to have each subdirectory placed into the root of the GCC source tree - so ar, ld, etc, all turn up as new top level directories. However, binutils has a top level directory include - but so does GCC - and these two directories have different versions of the same files.

The advantage of building binutils in the GCC source tree is that the GCC you are building will build binutils. You can however build binutils normally (configure and make) on its own. This works, and then you need to use something like update-alternatives to be able to switch to these new versions of binutils which you produce.

For the work I've been doing, I gave up bothering with this and just used the binutils which came with the distro.

So - basics - get hold of the GCC source code, get hold of the libgmp, libmpfr and libmpc (if it's used by the version of GCC you're building) sources and put them into the root of the GCC source dir. Make a build directory, cd to it, configure and make.
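
The basic sequence above might look something like the following sketch. The function name build_gcc is made up, the example paths are placeholders, and the configure flags shown (--prefix, --enable-languages, --disable-multilib) are only an illustrative starting set, not arguments known to work for any particular version or platform.

```shell
# Hypothetical wrapper around the configure-and-make sequence. MAKE may be
# overridden in the environment; the configure flags are illustrative only.
build_gcc() {
    src_dir=$1
    build_dir=$2
    prefix=$3
    src_abs=$(cd "$src_dir" && pwd) || return 1
    mkdir -p "$build_dir" || return 1
    build_abs=$(cd "$build_dir" && pwd) || return 1
    # Never configure and make inside the source directory itself.
    if [ "$src_abs" = "$build_abs" ]; then
        echo "build_gcc: refusing to build inside the source directory" >&2
        return 1
    fi
    (
        cd "$build_abs" || exit 1
        "$src_abs/configure" --prefix="$prefix" \
            --enable-languages=c,c++ --disable-multilib || exit 1
        ${MAKE:-make} || exit 1
    )
}

# Example (paths are placeholders):
#   build_gcc "$HOME/src/gcc-4.9.4" "$HOME/build/gcc-4.9.4" "$HOME/opt/gcc-4.9.4"
```

Running configure and make in a subshell keeps the caller's working directory unchanged whether the build succeeds or fails.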

The first problems now are what arguments to give configure and which versions of libgmp, libmpfr and libmpc to use.

Problems

Oh, so many problems.

I don't know where to begin, and there have been so many, for so long, that I've forgotten half of them.

The long and short of it is this : most GCC builds on most platforms fail.

x86_64 does much better than other platforms, though.

The errors vary wildly.

So for example on MIPS32, building GCC 5.4.0 with the recommended dependency library versions leads to an error where the test suite for libgmp fails because half the test binaries do not have the execute bit set. (The libgmp support mailing list had nothing to say on the matter.) If you change to the latest version of the libraries, the build then fails because it thinks it can't build the link-time optimization code.

(GCC takes a while to build, too - each of these builds took more than twenty hours. I could in theory build on my laptop, but I don't trust the build system as it is - the idea of trying to make it do something more complex than native builds gives me the screaming heebie-jeebies.)

I think then with this build (as with most others) I have to give up. I would have to understand the source code and/or build system enough to do the necessary development work to fix it.

This will be your normal experience, except on x86_64, where things tend much more to work.

One thing you might run across quite early on when Googling for some build problem, is a build bug in the GCC bugzilla, and it's marked as a duplicate, and when you go to look at the duplicate, you find the central bug is a duplicate for dozens and dozens of other bugs (each of which has no information how to fix) and its body text says something like "these are not real problems, it's that you don't know how to build GCC".

libgmp, libmpfr, libmpc

GCC depends on these libraries. There is a script, contrib/download_prerequisites, available from GCC 4.6.0 onwards, which downloads the versions you should use. Only... the versions specified have actually never changed, and are incredibly old and now unmaintained versions of these libraries. In particular, I know from experience that the specified libmpfr version is broken on MIPS32, and that none of them work on ARM64 (as the version of autotools used is too old - I think you can fix this by running it again, but I understand it's an open question whether or not that will actually work). Also, it's not clear which versions should be used prior to 4.6.0. I figured that out from the configure page of the GCC docs for each released version, and this is what you need;

GCC             libmpfr   libgmp   libmpc
4.2.0 - 4.2.4   2.2.1     4.1.0    unused
4.3.0 - 4.3.6   2.3.0     4.1.0    unused
4.4.0 - 4.4.7   2.3.2     4.1.0    unused
4.5.0 onwards   2.4.2     4.3.2    0.8.1
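
For anyone scripting builds, the table above can be encoded as a small lookup. This is only a sketch: the function name prereq_versions and the output format are invented here, and the versions are simply those from the table.

```shell
# Hypothetical lookup of the table above: prints the library versions for a
# given GCC release, or fails for releases the table does not cover.
prereq_versions() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    # Reject anything whose major or minor part is not a plain number.
    for part in "$major" "$minor"; do
        case "$part" in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    case "$major.$minor" in
        4.2) echo "mpfr=2.2.1 gmp=4.1.0 mpc=unused" ;;
        4.3) echo "mpfr=2.3.0 gmp=4.1.0 mpc=unused" ;;
        4.4) echo "mpfr=2.3.2 gmp=4.1.0 mpc=unused" ;;
        *)
            # 4.5.0 onwards
            if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 5 ]; }; then
                echo "mpfr=2.4.2 gmp=4.3.2 mpc=0.8.1"
            else
                return 1
            fi
            ;;
    esac
}

# Example:
#   prereq_versions 4.4.7
```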

The build systems for these libraries are not what you'd call fully reliable. I've mentioned the recommended version of libmpfr being broken on MIPS32, but for a second example, libgmp fails to build on arm from about version 5.1. I've yet to find out at which point it was fixed. Generally, they seem to build on x86_64. Go to other platforms and you're rolling the dice. I also begin to think that modern versions fail with old versions of GCC; older GCCs seem to be trying to interact with them in a way which fails.

Since it's not clear what breakages exist in these libraries, and it's also not clear what problems exist in GCC, one of the games you end up playing is trying to find versions of these libraries which actually build on the platform you've got with the GCC you're trying to build.

Be aware that the libraries also depend on each other; there is not just the question of which version to use with GCC, but which versions of these libraries work with each other. You cannot independently change them. As with GCC, there is no information as such as to which versions work with which, but I have compiled (below) a table of when each was released, which can be used to get a feeling for which versions might work with which.

For example, libmpfr 2.4.2 cannot work with libgmp 5.0.0 or later.

libmpfr 2.4.2 on MIPS32

This version is broken on MIPS32 for GCC 4.4.0 and later.

(Yes. It really does mean every version of GCC released since 4.4.0 cannot build, out of the box, on MIPS32.)

The release page for this version has information and links to a patch;

libgmp configure

There is a common problem whereby the build fails with an error something like this;

checking lex output file root... flex: fatal internal error, exec of m4-not-needed failed
flex: error writing output file lex.yy.c
configure: error: cannot find output from flex; giving up

The problem is a bug in the libgmp configure. I remember it being something like the configure works when called directly from bash, but fails if you for example call it from a Python script.

To work around it, grep the libgmp configure and configure.in files and replace;

M4=m4-not-needed

with

M4=m4
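
A minimal sketch of automating that replacement is below. The function name fix_gmp_m4 is made up, and it matches either spelling of the placeholder (m4-not-needed, as in the error message, or m4-not-required) since I'm working from memory as to which one the files actually contain.

```shell
# Hypothetical helper: rewrite the M4 placeholder in libgmp's configure and
# configure.in. A temp file is used instead of sed -i, which is not portable;
# writing back with cat keeps the executable bit on configure.
fix_gmp_m4() {
    gmp_dir=$1
    for f in "$gmp_dir/configure" "$gmp_dir/configure.in"; do
        [ -f "$f" ] || continue
        tmp_file=$(mktemp) || return 1
        sed -e 's/M4=m4-not-needed/M4=m4/' \
            -e 's/M4=m4-not-required/M4=m4/' "$f" > "$tmp_file" || return 1
        cat "$tmp_file" > "$f" || return 1
        rm -f "$tmp_file"
    done
}

# Example (path is a placeholder):
#   fix_gmp_m4 gcc-4.5.0/gmp
```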

libgmp 4.3.2 test failure

There is one test, t-scan, which always spuriously fails. This means all GCCs starting with 4.5.0 when using the recommended dependency versions fail their test suite. This of course is a problem for anyone automating builds.

gnumake

Version 4.1.2 of GCC came out when gnumake 3.7.9 was the latest release. This version has slightly different built-in rules to later versions, which means 4.1.2 fails to build with later versions of gnumake.

texinfo

GCC comes with documentation. Texinfo is the package used to build the docs. Texinfo over time has of course seen improvements. Unfortunately, later versions of texinfo (5 onwards, I think) are no longer able to build the docs of older versions of GCC; a build error occurs. However, older versions of GCC only try to build their docs if texinfo is installed, so a solution is to uninstall it. However however, newer versions of GCC require texinfo to be installed, and fail to build if it is not present. Also, there are some versions in the middle which require texinfo to be installed and then fail to build their docs because of the problem with later versions of texinfo.

A solution to this which I use is to touch the .info file for the dependency libraries, after copying them into the GCC source tree, i.e.

touch [gccroot]/gmp/doc/gmp.info
touch [gccroot]/mpfr/mpfr.info (pre 3.1.1)
touch [gccroot]/mpfr/doc/mpfr.info (post 3.1.1)

This seems to stop the doc build from happening.
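
If you are scripting this, something like the following sketch touches whichever of the .info files is actually present, so the same call works for both the older and newer libmpfr layouts. The function name touch_prereq_info is made up, and it assumes the libmpfr file under doc/ is mpfr/doc/mpfr.info.

```shell
# Hypothetical helper: refresh the timestamps of the prebuilt .info files so
# that make considers them up to date. Files that do not exist are skipped
# rather than created.
touch_prereq_info() {
    gccroot=$1
    for rel in gmp/doc/gmp.info mpfr/mpfr.info mpfr/doc/mpfr.info; do
        if [ -f "$gccroot/$rel" ]; then
            touch "$gccroot/$rel"
        fi
    done
}

# Example (path is a placeholder):
#   touch_prereq_info "$HOME/src/gcc-4.9.4"
```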

Hard floating point support on Raspberry Pi

The Pi Debian distro is built with hard floating point support and as such does not ship with soft floating point header files. The compiler-set define __ARM_PCS_VFP is used to indicate whether or not the system has hard floating point support, and this define is required by a number of header files. Earlier versions of GCC do not set this define, which causes the build of GCC to think the system is using soft floating point, and so when the attempt is made to #include the soft floating point headers, the build fails (as they are not provided).

This define can be manually set in CFLAGS, i.e.

setenv CFLAGS -D__ARM_PCS_VFP
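
The setenv line above is csh syntax. For sh or bash, a rough equivalent that appends to any existing CFLAGS rather than overwriting them might look like this; the helper name add_cflag is made up for illustration.

```shell
# Hypothetical helper: append a flag to CFLAGS unless it is already present,
# then export the result for configure and make to pick up.
add_cflag() {
    case " ${CFLAGS:-} " in
        *" $1 "*) ;;                              # already there, do nothing
        *) CFLAGS="${CFLAGS:+$CFLAGS }$1" ;;
    esac
    export CFLAGS
}

# Example:
#   add_cflag -D__ARM_PCS_VFP
```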

The second problem is that the Debian distro is built with VFP and hard floating point, a combination not supported by earlier versions of GCC, which I think therefore cannot be built.

library and include search paths

GCC seems to need to be told where the system libraries and header files are, and the build fails if these are not set;

setenv CC gcc
setenv LIBRARY_PATH /usr/lib/x86_64-linux-gnu
setenv C_INCLUDE_PATH /usr/include/x86_64-linux-gnu
setenv CPLUS_INCLUDE_PATH /usr/include/x86_64-linux-gnu

Replace x86_64-linux-gnu with whatever you have for your system.
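
For sh or bash users, the same settings could be wrapped up as below, with the directory name passed in as an argument. The function name export_search_paths is made up; the example also assumes your system gcc supports -print-multiarch for reporting the directory name, which may not hold on every distro.

```shell
# Hypothetical helper: sh/bash equivalent of the setenv lines above, with the
# platform-specific directory name passed in as an argument.
export_search_paths() {
    triplet=$1
    if [ -z "$triplet" ]; then
        echo "export_search_paths: no directory name given" >&2
        return 1
    fi
    CC=gcc
    LIBRARY_PATH=/usr/lib/$triplet
    C_INCLUDE_PATH=/usr/include/$triplet
    CPLUS_INCLUDE_PATH=/usr/include/$triplet
    export CC LIBRARY_PATH C_INCLUDE_PATH CPLUS_INCLUDE_PATH
}

# Example (assumes the system gcc understands -print-multiarch):
#   export_search_paths "$(gcc -print-multiarch)"
```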

GCC's configure

Dependency Versions

I have compiled a table showing when GCC, binutils, glibc, libgmp, libmpc, libmpfr and numactl releases occurred, going back to release 4.1.2 of GCC.

These dates have been obtained by looking inside the archive files of the releases, as the file timestamps are not reliable (some archives having been republished much later than their original publication date).

For a given GCC, look backwards in time for binutils, libgmp, libmpc and libmpfr. GCC requires these tools/libraries, so the assumption is that the latest versions existing at the time of the GCC release are in use, to pick up bug fixes. This may not be true, though; it might be that any given major release (GCC versioning is "major.minor.bugfix") always sticks with the versions available at the time of its first release. Look forwards in time for glibc and numactl, as these libraries could only be compiled with a compiler which already existed at the time they were released.
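
The "look backwards in time" rule can be sketched as a small filter. This assumes you have typed up the release dates for one library yourself as lines of "YYYY-MM-DD version"; that input format and the function name latest_before are inventions for the example, and the dates in the usage comment are illustrative.

```shell
# Hypothetical filter: read "YYYY-MM-DD version" lines on stdin and print the
# version with the latest date on or before the given cutoff date. ISO dates
# compare correctly as plain strings, so no date parsing is needed.
latest_before() {
    cutoff=$1
    awk -v cutoff="$cutoff" '
        $1 <= cutoff && $1 > best { best = $1; ver = $2 }
        END { if (ver == "") exit 1; print ver }
    '
}

# Example (dates are illustrative):
#   printf '2009-11-30 2.4.2\n2010-07-10 3.0.0\n' | latest_before 2010-04-14
```

The function fails with a non-zero status if nothing was released on or before the cutoff, which is convenient when automating.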

Table of GCC, GCC Dependency, glibc and numactl Release Dates

GCC binutils glibc libmpfr numactl libgmp libmpc
2016 Sep 27 3.1.5
Aug 22 6.2.0
Aug 4 2.24
Aug 3 4.9.4 2.27
Jun 29 2.26.1
Jun 3 5.4.0
Apr 27 6.1.0
Mar 6 3.1.4
Feb 19 2.23
Jan 25 2.26
2015 Dec 10 2.0.11
Dec 4 5.3.0
Nov 1 2.22 6.1.0
Aug 5 2.22
Jul 19 3.1.3
Jul 16 5.2.0
Jul 21 2.25.1
Jun 26 4.9.3
Jun 23 4.8.5
Apr 22 5.1.0
Feb 16 1.0.3
Feb 6 2.21
2014 Dec 23 2.25
Dec 10 4.8.4
Oct 30 4.9.2
Oct 20 2.0.10
Sep 7 2.20
Jul 16 4.9.1
Jun 12 4.7.4
May 22 4.8.3
Apr 22 4.9.0
Mar 25 6.0.0a
Mar 24 6.0.0
Feb 7 2.19
Jan 15 1.0.2
2013 Dec 2 2.24
Oct 16 4.8.2
Oct 8 2.0.9
Sep 30 5.1.3
Aug 12 2.18
May 31 4.8.1
May 30 5.1.2
Apr 12 4.6.4
Apr 11 4.7.3
Mar 25 2.23.2
Mar 22 4.8.0
Mar 13 3.1.2
Feb 11 5.1.1
Feb 2 5.1.0a
2012 Dec 25 2.17
Dec 18 5.1.0
Nov 13 2.23.1
Oct 22 2.23
Oct 11 2.0.8
Sep 20 4.7.2
Sep 6 1.0.1
Jul 3 3.1.1
Jul 2 4.5.4
Jun 30 2.16.0 [1]
Jun 14 4.7.1
May 6 5.0.5
Mar 22 4.7.0
Mar 21 2.15
Mar 13 4.4.7
Mar 1 4.6.3
Feb 10 5.0.4
Jan 27 5.0.3
2011 Nov 21 2.22
Oct 28 4.6.2
Oct 7 2.14.1
Oct 3 3.1.0
Jul 19 1.0
Jun 27 4.6.1, 4.3.6 2.21.1
Jun 1 2.14
May 8 5.0.2
Apr 28 4.5.3
Apr 16 4.4.6
Apr 14 2.0.7
Apr 4 3.0.1
Mar 25 4.6.0
Feb 22 0.9
Feb 1 2.13
2010 Dec 29 2.0.6
Dec 16 4.5.2
Dec 13 2.12.2
Dec 8 2.21
Nov 29 2.11.3
Oct 4 2.0.5
Oct 1 4.4.5
Aug 3 2.12.1
Jul 31 4.5.1
Jul 28 2.0.4
Jul 10 3.0.0
May 22 4.3.5
May 19 2.11.2
May 14 0.8.2
Apr 29 4.4.4
Apr 14 4.5.0
Mar 3 2.20.1
Feb 6 5.0.1
Jan 21 4.4.3
Jan 8 5.0.0
Jan 7 4.3.2
2009 Dec 29 2.11.1
Dec 7 0.8.1
Nov 30 2.4.2
Nov 5 0.8
Nov 3 2.11
Oct 15 4.4.2
Oct 10 2.20
Sep 10 0.7
Aug 4 4.3.4
Jul 22 4.4.1
Jun 10 2.0.3
May 17 2.10.1
May 12 4.3.1
Apr 21 4.4.0
Apr 14 4.3.0
Apr 1 0.6
Mar 10 2.9
Feb 26 2.8
Feb 25 2.4.1
Feb 2 2.19.1
Jan 26 2.4.0
Jan 24 4.3.3
2008 Dec 12 0.5.2
Nov 18 0.5.1
Oct 16 2.19
Sep 20 4.2.4
Sep 17 0.5
Aug 27 4.3.2
Aug 5 2.0.2
Aug 2 4.2.3
Jun 10 2.0.1
Jun 6 4.3.1
May 19 4.2.4
May 7 2.0.0
Mar 5 4.3.0
Apr 14 1.0.3
Feb 1 4.2.3
2007 Oct 10 2.7
Oct 7 4.2.2
Sep 21 1.0.2
Sep 11 4.2.2
Aug 28 2.18
Aug 16 1.0.1, 1.0
Jul 31 2.5.1, 2.6.1
Jul 18 4.2.1
May 17 2.6
May 13 4.2.0
Feb 13 4.1.2
2006 Oct 31 0.9.11
Sep 20 2.5
Jun 23 2.17
May 24 4.2.1
Mar 26 4.2 [2]

1. Yes, 2.16.0, not 2.16 - I have no idea why.
2. Yes, 4.2, not 4.2.0