Developing PostgreSQL for Windows, Part 1

January 29, 2020

As a PostgreSQL developer, I occasionally need to make my code work on Windows. As I don’t normally use Windows and don’t have a permanent Windows installation around, this has always been a bit cumbersome. I have developed a few techniques to make this easier, and I figure they are worth sharing. And actually this text became quite long, so this will be a series of several blog posts.

The first thing that is useful to understand is the different Windows build targets and how they are similar and how they differ.

Arguably the primary way to build for Windows is using the Microsoft Visual Studio compiler suite. This is what you could think of as the most native environment. Most if not all binary distributions of PostgreSQL for Windows use this build. This build does not use the normal Unix makefiles but a separate build system under src/tools/msvc/. That build system parses the makefiles, applies some custom logic, and generates “project” files specific to this tool chain, which you can then run to build the code. Let’s call this the MSVC build here. This build is prone to break in two ways: one, if the code doesn’t actually build or run on Windows, and two, if you change something in the normal (makefiles-based) build system that causes those ad-hoc scripts to break. So this is always a lot of fun to deal with.

The second way is to use MinGW. This uses the GNU toolchain (GCC, GNU binutils, GNU ld, etc.) to build native code on Windows. You can think of it as “GCC on Windows”, but in reality it contains additional shim and glue to interface with the Windows system libraries. This uses the normal Unix-ish build system using configure and makefiles, but the code it produces is native Windows code, in principle equivalent to the output of the MSVC build. This also means that if code doesn’t build or run with the MSVC build, it will also not build or run here. But the build system is the same as for Linux etc., so it will be harder to break accidentally.

The third way is Cygwin. Cygwin is a subsystem that presents a POSIX-like environment on Windows. For example, Cygwin adds users, groups, fork(), SysV shared memory, and other facilities that don’t exist on native Windows but are standard on, say, Linux. The idea is that you could take source code meant for Linux or BSD and build it under Cygwin without change (or at least with only changes that would be within the typical range of porting changes between Unix-like systems). For this reason, a Cygwin port of PostgreSQL existed long before the native Windows port, because it was a much smaller effort. In practice, the abstraction breaks down in some areas, especially in the networking code and around file naming and access, but in general the Cygwin build breaks very rarely compared to the other targets.

There used to be another way to build on Windows. There were win32.mak files that you could use directly with nmake on Windows, and there was also support for Borland compilers at some point. These were basically stopgap measures for building just libpq natively on Windows before the full native port had arrived. These have now been removed.

There is another term that appears in this context: MSYS. The reason for this is that MinGW by itself is not often useful. It’s just a compiler tool chain. But in order to build typical Unix software, you need additional tools such as bash, make, sed, grep, and all those things that are typically used from a configure script or a makefile. These tools do not all exist as native Windows ports. But you can run them on top of the Cygwin subsystem. So one way to use MinGW is from inside Cygwin. Another is MSYS, which stands for “minimal system”, which is roughly speaking a bespoke subset of Cygwin and some packaging specifically for using MinGW for building software. The original MSYS is now abandoned as far as I know, but there is a popular new alternative MSYS2. More on this in a subsequent blog post. For now just understand the relationship between all these different software packages.

Now let’s consider how the source code sees these different build environments.

A native build using MSVC or MinGW defines _WIN32. (Strangely, this is the case for both 32-bit and 64-bit builds. A 64-bit build also defines _WIN64, but this is rarely used.) The PostgreSQL source code uses WIN32 instead, but that’s specific to PostgreSQL, not done by the compiler.
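To make the relationship between _WIN32 and _WIN64 concrete, here is a minimal sketch. The function names are made up for illustration and are not part of PostgreSQL. Note that _WIN64 has to be tested first, because a 64-bit native build defines both macros.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not PostgreSQL code: reports which of the
 * compiler-predefined Windows macros were seen at compile time.
 * _WIN64 is checked first because a 64-bit native build defines
 * both _WIN64 and _WIN32.
 */
const char *
native_windows_target(void)
{
#if defined(_WIN64)
	return "native Windows, 64-bit";
#elif defined(_WIN32)
	return "native Windows, 32-bit";
#else
	return "not native Windows";
#endif
}

/* Convenience wrapper (also hypothetical): print the result on one line. */
void
print_native_windows_target(void)
{
	printf("%s\n", native_windows_target());
}
```

Calling print_native_windows_target() from a small test program built with each toolchain is a quick way to check what that compiler predefines.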

MSVC also defines _MSC_VER to some version number. This is sometimes useful to work around issues with a particular compiler version (often the kind of thing Unix builds would tend to use configure for). Note that MinGW does not define _MSC_VER, so code needs to be written carefully to handle that as well. There have been some minor bugs around this because code like #if _MSC_VER < NNNN to perhaps work around an issue with an older compiler would trigger on MinGW as well, which might not have been intended. (More correct would be #if defined(_MSC_VER) && _MSC_VER < NNNN and of course wrap into #ifdef WIN32.) MinGW defines __MINGW32__ and __MINGW64__, but these are very rarely used. Also, MinGW of course defines __GNUC__ since it’s GCC, so conditional code specific to GCC or a GCC version can also be used. In general, since MinGW uses Autoconf, those things should usually be checked in configure instead of in the code.
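The _MSC_VER pitfall can be reproduced in a few lines. This is only a sketch: the threshold 1900 is an arbitrary stand-in for NNNN, and the macro and function names are invented for the example. The key point is that the preprocessor treats an undefined identifier in #if as 0, so the unguarded comparison is true on every compiler that does not define _MSC_VER.

```c
/*
 * Naive check: where _MSC_VER is undefined (MinGW, GCC, Clang on Unix),
 * the preprocessor evaluates it as 0, so "0 < 1900" is true and the
 * workaround is enabled unintentionally.
 * (1900 is an arbitrary placeholder threshold for this sketch.)
 */
#if _MSC_VER < 1900
#define NAIVE_WORKAROUND 1
#else
#define NAIVE_WORKAROUND 0
#endif

/* Guarded check: only ever true when the compiler really is MSVC. */
#if defined(_MSC_VER) && _MSC_VER < 1900
#define GUARDED_WORKAROUND 1
#else
#define GUARDED_WORKAROUND 0
#endif

/* Hypothetical accessors so the outcome can be inspected at run time. */
int
naive_workaround_enabled(void)
{
	return NAIVE_WORKAROUND;
}

int
guarded_workaround_enabled(void)
{
	return GUARDED_WORKAROUND;
}

/* Returns 1 if this translation unit was compiled by MSVC, else 0. */
int
compiled_with_msvc(void)
{
#if defined(_MSC_VER)
	return 1;
#else
	return 0;
#endif
}
```

On MinGW or any Unix compiler, the naive variant reports the workaround as enabled while the guarded one does not, which is exactly the kind of mismatch described above.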

Cygwin defines __CYGWIN__. Notably, Cygwin does not define _WIN32, or WIN32, and so on, because it does not consider itself to be native Windows. That’s why in some code areas where Windows peeks through the Cygwin abstraction you see a lot of code with #if defined(WIN32) || defined(__CYGWIN__) to handle both cases.
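As an illustration of that combined check, here is a small sketch. Everything in it is hypothetical: the function names are invented, the directory-separator rule is just a plausible example of file naming behavior that could differ, and the first few lines are a stand-in for PostgreSQL's own WIN32 symbol, since a standalone file does not get that from the PostgreSQL headers.

```c
/*
 * Stand-in for PostgreSQL's own WIN32 symbol (an assumption made for
 * this standalone sketch): derive it from the compiler's _WIN32.
 */
#if defined(_WIN32) && !defined(WIN32)
#define WIN32 1
#endif

/*
 * Hypothetical helper: returns 1 when Windows is underneath, either
 * natively (MSVC, MinGW) or below the Cygwin layer, else 0.
 */
int
windows_peeks_through(void)
{
#if defined(WIN32) || defined(__CYGWIN__)
	return 1;
#else
	return 0;
#endif
}

/*
 * Hypothetical example of behavior that might need the combined check:
 * accept the backslash as a directory separator wherever Windows is
 * underneath, and only the forward slash elsewhere.
 */
int
is_dir_separator(char c)
{
#if defined(WIN32) || defined(__CYGWIN__)
	return c == '/' || c == '\\';
#else
	return c == '/';
#endif
}
```

Whether a given piece of code needs WIN32 alone or the combined condition depends on whether Cygwin's emulation covers the facility in question.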

(There are some dusty corners in the code that don’t always handle all these preprocessor defines in a sensible and consistent way. In some cases, this is intentional because reality is weird, in other cases it’s rotten and incorrect code that needs to be cleaned up.)

Each of these targets exists in principle as a 32-bit and a 64-bit variant. A 64-bit Windows operating system installation, which is the normal modern installation, can run both 32-bit and 64-bit software, so you can install and run both variants on such a system. A production installation should probably use a 64-bit build, and so you might choose to not bother with the 32-bit environment. In fact, Cygwin’s 32-bit variant seems to be pretty dead, and you might not be able to get it to work at all. One problem, however, is that 64-bit MinGW has some obscure bugs, so when using MinGW especially it’s sometimes better to use the 32-bit environment, unless you want to fight operating system or toolchain bugs. However, 32-bit computing is obviously mostly on its way out in general, so this is not a future-proof option.

Now the question is perhaps which one of these environments is “the best”. As far as development goes, it doesn’t really matter because all code needs to work on all of them. As I mentioned above, the MSVC build is used for most production deployments on Windows. The MinGW (or rather MSYS2) environment is nicer to develop in if you are used to a Unix-like environment, but especially the 64-bit MinGW environment seems to be somewhat buggy, so it’s difficult to recommend this for production. Cygwin might be considered by some to be a historic curiosity at this point. Running a production server under Cygwin is not recommended because the performance is quite bad. But Cygwin is actually useful in some situations. For example, Readline does not work on either of the native Windows builds, but it does on Cygwin, so if you are a psql user, it is better to use a Cygwin build for that. Also, Cygwin is useful in the situation that is the inverse of this blog post: you are a Windows-mostly developer and want to ensure your code is mostly compatible with Unix. So all three of these environments have their value and are worth maintaining at this time.

In the next part of this series, I’ll discuss some techniques to test code changes on and for Windows if it’s not your primary development environment.
