For a long time, adding packages to Red Hat-derived Linux systems has been called “RPM Hell”, for good reason. Particularly before the yum utility came about to help, getting RPM to do the right thing was often a troublesome task. I was reminded of this again today, while trying to compile a PostgreSQL extension on two nearly identical CentOS systems.
PostgreSQL provides an API named PGXS that lets you build server extensions that both leverage the code library of the server and communicate with it. We use PGXS to install our repmgr utility, and having that well-defined API let the program be developed externally from the main server core. Many popular PostgreSQL add-ons rely on PGXS to build themselves. In fact, the contrib modules that come with PostgreSQL itself are often built this way. Grabbing a similar contrib module and hacking on it from there is a well-trodden path toward building a new PostgreSQL extension.
PGXS relies upon the pg_config utility being in your PATH. pg_config comes with the postgresql-devel package, which nowadays is actually named postgresql90-devel. Unfortunately it’s not in the path for anyone by default, so the first step toward building with PGXS is to put it there. Something like this will work on most UNIX systems:
export PATH="/usr/pgsql-9.0/bin:$PATH"
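If that export ends up in a shell profile that gets sourced more than once, the directory piles up in PATH repeatedly. Here is a minimal sketch of a guard against that; the function name add_pg_bin_to_path is my own invention for illustration, and the directory is the one from the install described here.

```shell
# Hypothetical helper: put a PostgreSQL bin directory first in PATH,
# unless that exact directory is already listed.
add_pg_bin_to_path() {
    pg_bin="$1"
    case ":$PATH:" in
        *":$pg_bin:"*) ;;                  # already present, leave PATH alone
        *) PATH="$pg_bin:$PATH" ;;         # otherwise prepend it
    esac
    export PATH
}

# Directory assumed from the PGDG 9.0 packages used in this post.
add_pg_bin_to_path /usr/pgsql-9.0/bin

# Confirm the shell can now find pg_config before trying a PGXS build.
command -v pg_config || echo "pg_config not found in PATH" >&2
```

Prepending rather than appending matters if an older pg_config from another PostgreSQL version is installed elsewhere on the system, since the first match in PATH wins.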
Here’s how building repmgr looked on the working system:
[gsmith@pyramid repmgr]$ make USE_PGXS=1
gcc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/include/et -DLINUX_OOM_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -I/usr/pgsql-9.0/include -I. -I. -I/usr/pgsql-9.0/include/server -I/usr/pgsql-9.0/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include -c -o dbutils.o dbutils.c
…
This includes -m64 -mtune=generic, which are the gcc options that say to build for a 64-bit platform, but let the compiler figure out exactly which one you are on relative to the other restrictions. Nowadays the result normally comes out optimized for x86_64 if you have a 64-bit system. The auto-detection was more useful back when the choices were i386, i486, i586, and i686.
On to the troublesome system. I thought I’d put PostgreSQL on here identically, yet the build didn’t work at all:
gcc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -I/usr/include/et -DLINUX_OOM_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -I/usr/pgsql-9.0/include -I. -I. -I/usr/pgsql-9.0/include/server -I/usr/pgsql-9.0/include/internal -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include -c -o dbutils.o dbutils.c
…
/usr/bin/ld: skipping incompatible /usr/pgsql-9.0/lib/libpq.so when searching for -lpq
/usr/bin/ld: skipping incompatible /usr/lib64/libtermcap.so when searching for -ltermcap
/usr/bin/ld: skipping incompatible /usr/lib64/libtermcap.a when searching for -ltermcap
/usr/bin/ld: cannot find -ltermcap
collect2: ld returned 1 exit status
What? This is trying to build 32 bit code: “-m32 -march=i386 -mtune=generic”. Because of that, when it tries to link with all the 64-bit libraries on the server like libpq and libtermcap, it can’t. How in the world is this happening?
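When the linker says it is skipping an incompatible library, you can confirm which side of the mismatch a file is on by looking at its ELF header: the fifth byte (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit ones. The following is a rough sketch rather than a polished tool. The elf_class function name is made up for this example, it does not verify the ELF magic bytes first, and the library paths are simply the ones from the error output above.

```shell
# Hypothetical helper: report whether a file is a 32-bit or 64-bit ELF object
# by reading the EI_CLASS byte at offset 4 of the header.
elf_class() {
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' ')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Check the libraries named in the linker errors, if they exist on this host.
for lib in /usr/pgsql-9.0/lib/libpq.so /usr/lib64/libtermcap.so; do
    if [ -r "$lib" ]; then
        echo "$lib: $(elf_class "$lib")-bit"
    fi
done
```

The file utility reports the same information in a friendlier form where it is installed; the od version only relies on standard tools.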
You can see where the information that goes into a PGXS build command is coming from using pg_config. Here’s how to check the CFLAGS, which is where the bit size info lives:
$ pg_config --cflags
-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/include/et -DLINUX_OOM_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
Now I’m pissed. This is saying build for 64 bits as well, yet it’s still finding 32-bit information. Where is that coming from?
Some digging into the PGXS interface trying to trace this back eventually led me to /usr/pgsql-9.0/lib/pgxs/src/Makefile.global, and here’s where the clue started to show up. That file listed 32-bit compiler options! Where did they come from?
At this point I started looking at exactly what RPMs were installed on each server, because something had to be different between them. Here’s a handy command to know:
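The disagreement between what pg_config reports and what Makefile.global contains can be checked directly. This is a sketch under a couple of assumptions: arch_flag is a name I made up, the Makefile.global path is the one from this particular install, and the grep assumes the flags sit on lines beginning with CFLAGS.

```shell
# Hypothetical helper: print the -m32 or -m64 switch found in a string of
# compiler flags, or "none" if neither appears.
arch_flag() {
    for flag in $1; do                     # unquoted on purpose: split on whitespace
        case "$flag" in
            -m32|-m64) echo "$flag"; return 0 ;;
        esac
    done
    echo "none"
}

# Path assumed from the PGDG 9.0 packages discussed in this post.
makefile_global=/usr/pgsql-9.0/lib/pgxs/src/Makefile.global

# Only run the comparison where both sources of flags are actually present.
if command -v pg_config >/dev/null 2>&1 && [ -r "$makefile_global" ]; then
    from_pg_config=$(arch_flag "$(pg_config --cflags)")
    from_makefile=$(arch_flag "$(grep '^CFLAGS' "$makefile_global")")
    echo "pg_config: $from_pg_config, Makefile.global: $from_makefile"
fi
```

If the two values differ, the binary and the Makefile came from packages built for different architectures.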
$ rpm -qa --queryformat '%{NAME}\t%{ARCH}\n' | grep postgres | sort
compat-postgresql-libs i686
compat-postgresql-libs x86_64
postgresql90-contrib x86_64
postgresql90-devel x86_64
postgresql90-libs i386
postgresql90-libs x86_64
postgresql90-server x86_64
postgresql90 x86_64
RHEL5 is capable of running 32-bit and 64-bit applications side by side; you just have to be careful how you compile them. So it’s normal that the database library packages compat-postgresql-libs and postgresql90-libs include both architectures. You might have both 32-bit and 64-bit apps that want to talk to the same server. This is often annoying, for example when you want to delete a package and it tells you your request matches more than one and does nothing. You need --allmatches to fix that.
What do we see on the server that won’t compile? Not quite the same thing:
compat-postgresql-libs i686
compat-postgresql-libs x86_64
postgresql90-contrib x86_64
postgresql90-devel i386
postgresql90-devel x86_64
postgresql90-libs i386
postgresql90-libs x86_64
postgresql90-server x86_64
postgresql90 x86_64
What are postgresql90-devel packages for both i386 and x86_64 doing there? That doesn’t make any sense at all!
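A duplicate like this is easy to miss by eye in a longer package list, so it can be worth scripting. Below is a minimal sketch that reads name and architecture pairs and prints any -devel package present with more than one architecture. The multiarch_devel name is hypothetical, and limiting the check to packages ending in -devel is my own choice, since the -libs packages legitimately install both.

```shell
# Hypothetical helper: read "name arch" lines on stdin (space or tab separated)
# and print each -devel package that shows up with more than one architecture.
multiarch_devel() {
    awk '$1 ~ /-devel$/ && !seen[$1, $2]++ {
             archs[$1] = archs[$1] " " $2   # remember each distinct arch once
             count[$1]++
         }
         END {
             for (name in count)
                 if (count[name] > 1) print name ":" archs[name]
         }'
}

# Run against the live RPM database where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa --queryformat '%{NAME} %{ARCH}\n' | multiarch_devel
fi
```

Fed the list from the broken server, this prints a single line naming postgresql90-devel with both i386 and x86_64; fed the list from the working server, it prints nothing.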
Now, after testing to try and make sense of this, if you have either -devel package and try to install the other, it kicks back the right series of errors for files that conflict, like this:
file /usr/pgsql-9.0/lib/pgxs/src/Makefile.global from install of postgresql90-devel-9.0.2-2PGDG.rhel5.x86_64 conflicts with file from package postgresql90-devel-9.0.2-2PGDG.rhel5.i386
The packager knows perfectly well that they overwrite the same Makefile.global. How did I end up with both? After wiping everything out I found exactly how:
# yum install postgresql90-devel
…
=========================================================================
Package Arch Version Repository Size
=========================================================================
Installing:
postgresql90-devel i386 9.0.2-2PGDG.rhel5 pgdg90 1.5 M
postgresql90-devel x86_64 9.0.2-2PGDG.rhel5 pgdg90 1.6 M
Transaction Summary
===================
Install 2 Package(s)
Upgrade 0 Package(s)
Total size: 3.1 M
Total download size: 1.5 M
Is this ok [y/N]:
It certainly is not OK! yum is perfectly happy to combine them, and I must have done that without noticing before. It turns out that if you do let them both install like this, the copy of the files you’re left with may not report the right information back to PGXS, which unsurprisingly gets confused. That’s how I ended up with my problem. I was using the Makefile.global installed by the i386 version, but everything else on the system was x86_64.
So how to clean up? Given the mix of files here, you can’t really trust that just deleting the unwanted package is enough, because then you may have no copies left of the files that conflicted. The only safe choice is to nuke them both, then install just the x86_64 one, now that we know exactly which version is available from the test above:
rpm -e postgresql90-devel --allmatches
yum install postgresql90-devel-9.0.2-2PGDG.rhel5.x86_64
With this sorted out, now my PGXS extension builds just fine, and development on repmgr proceeds again, after a day of lost time to figure this all out.
Lessons for today: be careful when installing the postgresql90-devel package via yum, and do not let it put both architectures of that package there. Only use the one that matches the platform of your main postgresql90 package. And if you are trying to build a PGXS extension on a RHEL/CentOS system, and you see the “skipping incompatible” library message, start by looking at the PostgreSQL development package(s) you have installed.
We’ll probably get this particular bad combination blocked by future updates to the PostgreSQL 9.0 packages. I thought it was interesting to share anyway, because there aren’t many good examples of doing troubleshooting like this on RPM. I once wrote one titled Installing the PostgreSQL 8.2 RPMs on RHEL 5/CentOS 5 that goes through some more of the background here. But those were simpler days, before 64-bit platforms were popular, and before you could install more than one PostgreSQL version via RPM at the same time. Knowing the right RPM incantation to list packages installed with their associated architecture is a vital trick nowadays to navigating your way out of RPM hell.