Tonnerre Lombard
November 4th, 2005
It's no secret that a lot of people strongly dislike the current PLIST approach to defining which files belong to a certain package. There are various reasons why this approach isn't widely accepted, although it almost always seems to work. For one, it puts humans into the process, which causes additional work and requires special knowledge of the way pkgsrc works if you want to build a certain package. It also requires a bunch of hacks to support all the different package options, which may cause problems with files that only exist in some configurations.
For all these reasons (and many others), there have been requests to replace the old packing-list-based mechanism with something that can't accidentally miss files. Here are some approaches that would take this into account. The basic idea is always to install the package into a sandbox directory first, tar it up and then untar it over the target file system.
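The install-to-sandbox, tar, untar cycle can be sketched as follows. This is only a minimal illustration, not pkgsrc code: the function names `package_from_sandbox` and `unpack_over_target` are hypothetical, and the sketch ignores ownership, empty directories and install scripts.

```python
import os
import tarfile


def package_from_sandbox(sandbox, archive_path):
    """Tar up every file below `sandbox`; return the sorted relative file list.

    The returned list is what a hand-written PLIST would otherwise contain.
    """
    members = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(sandbox):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, sandbox)
                # Store paths relative to the sandbox root so the archive
                # can be unpacked over any target directory.
                tar.add(full, arcname=rel)
                members.append(rel)
    return sorted(members)


def unpack_over_target(archive_path, target):
    """Unpack the archive on top of the target tree."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target)
```

The point of the sketch is that the file list falls out of the archive step for free, so nobody has to maintain it by hand.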
Replacing the installation tools is basically the most natural approach for pkgsrc, as it is the technique we already use to compile packages (where we simply substitute another gcc via the PATH variable and the like). In this case, install(1), install-info(1) and similar programs are replaced with simple versions that install into a jail instead, as are any chmod(1) calls and the like. This should be sufficient for most applications.
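As a rough sketch of what such a replacement install tool has to do, the following maps the requested destination into a staging directory before copying. The names `staged_path` and `staged_install` are hypothetical, and a real install(1) replacement would also have to handle owner, group, directory creation flags and stripping.

```python
import os
import shutil


def staged_path(stage_root, dest):
    """Map an absolute destination to the same path below stage_root."""
    return os.path.join(stage_root, os.path.abspath(dest).lstrip(os.sep))


def staged_install(stage_root, source, dest, mode=0o755):
    """Copy `source` to `dest`, but below the staging directory.

    The caller still passes the final path (for example /usr/pkg/bin/foo);
    only this wrapper knows that the file really lands in the jail.
    """
    real_dest = staged_path(stage_root, dest)
    os.makedirs(os.path.dirname(real_dest), exist_ok=True)
    shutil.copyfile(source, real_dest)
    os.chmod(real_dest, mode)
    return real_dest
```

A wrapper script placed first in PATH under the name `install` could parse its arguments and call something like `staged_install`, so the package's own Makefile needs no changes.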
The Gentoo approach would be to preload a library that overrides some libc calls, modifies the function parameters and then calls the original libc functions.
This is basically the OpenBSD method. The package installation routine runs under the surveillance of systrace(1), which modifies the target paths of file accesses so that the package ends up somewhere else, while the installation routine still believes it has done a pretty good job.
Union mounts are probably the cleanest solution so far. The root file system is mounted read-only into a subdirectory, and a temporary file system is mounted over it for writing using union mounts. All files appear in their usual place for the installation routine, but all write accesses end up in a temporary file system that you can easily tar up after the installation routine has finished.
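The read and write behaviour described above can be modelled in user space to make it concrete. This is only a toy model of the semantics, not a real union mount (which is done by the kernel); the class `UnionView` and its methods are hypothetical names.

```python
import os


class UnionView:
    """Toy model of a read-only lower layer with a writable upper layer."""

    def __init__(self, lower, upper):
        self.lower = lower  # stands in for the read-only root file system
        self.upper = upper  # stands in for the temporary file system

    def _in_layer(self, layer, path):
        return os.path.join(layer, path.lstrip(os.sep))

    def resolve_read(self, path):
        """Reads see the upper layer first and fall through to the lower one."""
        upper = self._in_layer(self.upper, path)
        if os.path.lexists(upper):
            return upper
        return self._in_layer(self.lower, path)

    def write(self, path, data):
        """Writes always land in the upper layer; the lower one stays untouched."""
        real = self._in_layer(self.upper, path)
        os.makedirs(os.path.dirname(real), exist_ok=True)
        with open(real, "wb") as handle:
            handle.write(data)
        return real

    def written_files(self):
        """Everything in the upper layer is what the installation produced."""
        found = []
        for root, _dirs, files in os.walk(self.upper):
            for name in files:
                found.append(os.path.relpath(os.path.join(root, name), self.upper))
        return sorted(found)
```

After the installation routine has finished, the upper layer contains exactly the files it wrote, which is what makes this approach attractive for generating the file list.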
This is basically the current clean-room approach pkgsrc is taking. A second root file system is installed into a chroot jail, and all packages installed into it get tracked by the print-PLIST mechanism, which basically walks through the file list and prints out the name of every file that doesn't belong to a package yet.
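The core of such a walk can be sketched like this. It is not the actual print-PLIST implementation, only an illustration of the idea; the function name `unowned_files` and the representation of already-owned files as a set of relative paths are assumptions.

```python
import os


def unowned_files(prefix, owned):
    """Return files below `prefix` that no installed package claims yet.

    `owned` is a set of paths relative to `prefix`, collected from the
    file lists of all packages already installed in the jail.
    """
    result = []
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), prefix)
            if rel not in owned:
                result.append(rel)
    return sorted(result)
```

Note that this only notices files that are new. A file that already belongs to another package and gets overwritten during installation goes undetected.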
The idea behind this method would be to extend the package lists with MD5 sums of every file the package contains. The print-PLIST mechanism would then check the MD5 sum of every file in the target, and if a file was modified, it is added to the list of files of the package. Just like the original print-PLIST mechanism, this method is only useful in dedicated pkg_comp jails.
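A minimal sketch of the checksum comparison, assuming the recorded sums are available as a mapping from relative path to MD5 digest, might look as follows. The names `md5_of`, `snapshot` and `changed_files` are hypothetical and not part of pkgsrc.

```python
import hashlib
import os


def md5_of(path):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(prefix):
    """Map every file below `prefix` (relative path) to its MD5 digest."""
    sums = {}
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            full = os.path.join(root, name)
            sums[os.path.relpath(full, prefix)] = md5_of(full)
    return sums


def changed_files(before, after):
    """Files that are new, or whose digest differs from the recorded one."""
    return sorted(rel for rel, digest in after.items() if before.get(rel) != digest)
```

Taking one snapshot before the installation and one after it, `changed_files` yields both newly created files and files whose contents were replaced, which a plain walk over unowned files would miss.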
There doesn't appear to be a method that is both free of flaws and portable. The quality of the approaches is an inverse function of their portability, which is sad. However, it is still possible to implement a number of these methods and use the more portable ones as a fallback if the underlying operating system doesn't support the better ones.
My suggestion is to use the systrace approach when running on a BSD system and to keep the PLIST files around as a fallback solution. That way, most of the likely pkgsrc users will have the new features, while the few who try to use pkgsrc on other platforms can still do so.
Also, different approaches can be applied to different packages. Since most packages only use the pkg_install framework, the modified toolchain will do perfectly well for them while still making for a portable package. Only special cases, such as Nullsoft Java installers and friends, need tricks like systrace in the first place.
Tonnerre Lombard is a cross-system kernel hacker and political activist. He founded FFII Switzerland and is trying to get a political movement for human rights and free communication into place.
You can contact the author via:
You can visit the author's homepage at http://users.bsdprojects.net/ tonnerre/.
There are no example implementations for pkgsrc known yet. However, some of the designs have prior art: