Version 3.1.3
Copyright © 1997 Tom Lees
License Notice
APT and this document are free software; you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
On Debian systems, see the file /usr/share/common-licenses/GPL for the full license.
Abstract
This document describes the minimum necessary workings for the APT dselect replacement. It gives an overall specification of what its external interface must look like for compatibility, and also gives details of some internal quirks.
Table of Contents
The basic dpkg package control file supports the following major features:-
The "dpkg status area" is the term used to refer to the directory where dpkg keeps its various status files (GNU would have you call it the dpkg shared state directory). On Debian systems, this is always /var/lib/dpkg. However, the default directory name should not be hard-coded but #define'd, so that it can be altered (it is configurable via configure in dpkg 1.4.0.9 and above). In a library, code should of course be allowed to override the default directory, but the default itself should be part of the library (so that the user may change the dpkg admin dir simply by replacing the library).
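The split between a built-in default and a caller-supplied override can be sketched as follows. This is only an illustration: the names DEFAULT_ADMIN_DIR and admin_path are hypothetical and do not come from dpkg or any existing library.

```python
import os

# Single place where the default status area is defined (the "#define").
DEFAULT_ADMIN_DIR = "/var/lib/dpkg"


def admin_path(name, admin_dir=None):
    """Return the path of a file inside the dpkg status area.

    `admin_dir` lets the caller override the library default; when it is
    omitted, the module-level default is used.
    """
    return os.path.join(admin_dir or DEFAULT_ADMIN_DIR, name)
```

A caller that needs a non-standard location would pass it explicitly, for example admin_path("status", "/mnt/target/var/lib/dpkg"); everything else picks up the default from one place.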
Dpkg keeps a variety of files in its status area. These are discussed later on in this document, but a quick summary of the files is here:-
These files are usually installed under /usr/lib/dpkg, but /usr/local/lib/dpkg is also a possibility (as Debian policy dictates). Under this directory, there is a "methods" subdirectory. The methods subdirectory in turn contains any number of subdirectories, one for each general method processor (note that one set of method scripts can be, and is, used for more than one of the methods listed under dselect).
The following files may be found in each of these subdirectories:-
As yet unwritten. You can refer to the other manuals for now. See dpkg(8).
The dpkg utility itself is required by quite a number of packages, even if they have been installed with a tool totally separate from dpkg. The reason for this is that some packages, in their pre-installation scripts, check that your version of dpkg supports certain features. This was broken from the start; it should really have been a control file header, "Dpkg-requires" or similar. What happens is that the configuration scripts abort or continue according to the exit code of a call to dpkg, which stops the packages from being wrongly configured.
These special command-line options, which simply return true or false, are all prefixed with "--assert-". Here is a list of them (without the prefix):-
Both these options check the status database to see what version of the "dpkg" package is installed, and check it against a known working version.
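A tool that wants to make the same check as a maintainer script only needs the exit status of the call. The following is a minimal sketch; the function name dpkg_asserts and the dpkg_cmd parameter are hypothetical, and the feature name is passed through unchanged rather than validated.

```python
import subprocess


def dpkg_asserts(feature, dpkg_cmd=("dpkg",)):
    """Return True if `dpkg --assert-<feature>` exits with status 0.

    `dpkg_cmd` is the command prefix to run; it exists only so that the
    call can be pointed at a different binary (or a stand-in when testing).
    """
    result = subprocess.run(
        [*dpkg_cmd, "--assert-" + feature],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For instance, dpkg_asserts("support-predepends") would be expected to mirror what a pre-installation script sees when it runs dpkg --assert-support-predepends, assuming the installed dpkg provides that assertion.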
This strange option is described as follows in the source code:
/* Print a single package which:
 *  (a) is the target of one or more relevant predependencies.
 *  (b) has itself no unsatisfied pre-dependencies.
 * If such a package is present output is the Packages file entry,
 * which can be massaged as appropriate.
 * Exit status:
 *  0 = a package printed, OK
 *  1 = no suitable package available
 *  2 = error
 */
On further inspection of the source code, it appears that what it does is this:-
Eventually, it writes out the record of all the packages to satisfy the pre-dependencies. This is used by the disk method to make sure that its dependency ordering is correct. What happens is that all pre-depending packages are first installed, then it runs dpkg -iGROEB on the directory, which installs in the order package files are found. Since pre-dependencies mean that a package may not even be unpacked unless they are satisfied, it is necessary to do this (ordinarily it is not needed, since all the package files are unpacked in one phase and then configured in another).
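One way a method script could drive the exit statuses quoted above is a simple loop: keep asking for the next installable pre-dependency until none is left. The sketch below is an assumption about how such a loop might look, not a transcription of the disk method. Both next_predep and install are hypothetical callables standing in for the real dpkg invocations.

```python
def install_predependencies(next_predep, install):
    """Install packages reported one at a time by a pre-dependency query.

    `next_predep` returns a tuple (exit_status, packages_entry);
    `install` installs the package described by that entry.
    Returns the list of entries that were installed, in order.
    """
    installed = []
    while True:
        status, entry = next_predep()
        if status == 1:
            # No suitable package available: nothing left to do.
            return installed
        if status != 0:
            # 2 (or any other value) signals an error.
            raise RuntimeError("pre-dependency query failed with status %d" % status)
        # Status 0: a Packages file entry was printed, so install it.
        install(entry)
        installed.append(entry)
```

Only after this loop finishes would the bulk installation of the directory be started.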
This chapter describes the internals of the "dpkg-deb" tool, which is used by "dpkg" as a back-end. dpkg-deb has its own tar extraction functions, which are the source of many problems, as they do not support long filenames stored using extension blocks.
The main principle of the new-format Debian archive (I won't describe the old format; for that, have a look at deb-old.5) is that the archive really is an archive, as used by "ar" and friends. However, dpkg-deb handles this format internally, rather than calling "ar". Inside this archive, there are usually the following members:-
The debian-binary member consists simply of the string "2.0", indicating the format version. control.tar.gz contains the control files (and scripts), and data.tar.gz contains the actual files to populate the filesystem with. Both tarfiles extract straight into the current directory. Information on the tar formats can be found in the GNU tar info page. Since dpkg-deb calls "tar -cf" to build packages, the Debian packages use the GNU extensions.
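Because the outer container is a plain ar archive, its members can be listed without calling "ar" at all. The sketch below walks the common ar layout (an 8-byte global magic, then a 60-byte header per member, with member data padded to an even length). The function name list_ar_members is hypothetical, and the code does not handle extended (long) member names.

```python
AR_MAGIC = b"!<arch>\n"


def list_ar_members(data):
    """Return [(name, payload)] for each member of an ar archive in `data`."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        # Every member header ends with the two bytes "`\n".
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        # Name is the first 16 bytes, space padded; some tools append "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        # Size is a decimal number in bytes 48..57 of the header.
        size = int(header[48:58].decode("ascii").strip())
        start = pos + 60
        members.append((name, data[start:start + size]))
        # Member data is padded to an even length.
        pos = start + size + (size % 2)
    return members
```

Applied to the bytes of a .deb file, this would be expected to yield debian-binary first, followed by the two tar members described above.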
dpkg-deb documents its options thoroughly via its '--help' command-line option. However, I am including a reference to them here for completeness. dpkg-deb supports the following options:-
Here is a list of the internal checks used by dpkg-deb when building packages, in the order in which they are done.
This chapter describes the internals of dpkg itself. Although the low-level formats are quite simple, what dpkg does in certain cases often does not make sense.
This describes the /var/lib/dpkg/updates directory. The function of this directory is somewhat strange, and it seems only to be used internally. A function called cleanupdates is called whenever the database is scanned. This function in turn uses scandir(3) to sort the files in this directory. Files whose names do not consist entirely of digits are discarded. dpkg also raises a fatal error if the filenames are not all the same length.
After having scanned the directory, dpkg parses each file in turn, the same way it parses the status file (scandir sorts them into numerical order). It then writes the status information back to the "status" file, and removes all the "updates" files.
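The filename rules just described (digits only, equal lengths, numerical order) can be sketched in a few lines. This is an illustration of the rules as stated here, not a port of cleanupdates; the function name ordered_update_names is hypothetical.

```python
def ordered_update_names(names):
    """Filter and order updates file names according to the rules above."""
    # Names that do not consist entirely of (ASCII) digits are discarded.
    numeric = [n for n in names if n.isascii() and n.isdigit()]
    # All remaining names must have the same length; otherwise it is fatal.
    if len({len(n) for n in numeric}) > 1:
        raise ValueError("updates file names have differing lengths")
    # With equal lengths, lexical order and numerical order coincide.
    return sorted(numeric)
```

The equal-length requirement is what makes a plain sort sufficient: "0002" and "0010" sort correctly as strings, whereas "2" and "10" would not.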
These files are created internally by dpkg's "checkpoint" function, and are cleaned up when dpkg exits cleanly.
Judging by the use of the updates directory, I would call it a journal. In order to efficiently ensure the complete integrity of the status file, dpkg will "checkpoint" or journal all of its activities in the updates directory. By merging the contents of the updates directory (in order!!) against the original status file, it can get the precise current state of the system, even in the event of a system failure while dpkg is running.
The other option would be to sync-rewrite the status file after each operation, which would kill performance.
It is very important that any program that uses the status file abort if the updates directory is not empty! The user should be informed to run dpkg manually (what options though??) to correct the situation.
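A program that reads the status file could guard itself with a check along these lines. It is a sketch only: updates_pending is a hypothetical name, and counting only all-digit file names as journal entries is an assumption based on the rule that other names are discarded.

```python
import os


def updates_pending(admin_dir="/var/lib/dpkg"):
    """Return True if the updates directory holds any journal files."""
    updates_dir = os.path.join(admin_dir, "updates")
    try:
        names = os.listdir(updates_dir)
    except FileNotFoundError:
        # No updates directory at all: nothing is pending.
        return False
    # Assumption: only all-digit names are journal entries.
    return any(n.isascii() and n.isdigit() for n in names)
```

A caller would test updates_pending() before opening the status file and stop with a message to the user if it returns True.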
First, the status file is read. This gives dpkg an initial idea of the packages that are there. Next, the updates files are read in, overriding the status file; if necessary, the status file is rewritten and the updates files are removed. Finally, the available file is read. It is read with flags which preclude dpkg from updating any status information (installed version, etc.) from it, and which tell dpkg to record the packages it reads this time as available, not installed.
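The effect of this read order can be modelled with plain dictionaries. The sketch assumes the three sources have already been parsed: status and available map a package name to its fields, and updates is a list of such mappings in journal order. The function name load_database and the "installed"/"available" slots are hypothetical and do not reflect dpkg's real in-memory structures.

```python
def load_database(status, updates, available):
    """Combine parsed records in the order: status, updates, available."""
    db = {}

    def slot(name):
        # One record per package, with separate installed/available halves.
        return db.setdefault(name, {"installed": None, "available": None})

    for name, fields in status.items():
        slot(name)["installed"] = dict(fields)
    # Later journal entries override earlier ones and the status file.
    for journal in updates:
        for name, fields in journal.items():
            slot(name)["installed"] = dict(fields)
    # The available file never touches the installed half.
    for name, fields in available.items():
        slot(name)["available"] = dict(fields)
    return db
```

The point of the separate halves is that a newer version listed in the available file cannot be mistaken for the version that is actually installed.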
More information on updates is given above.
Version numbers consist of three parts: the epoch, the upstream version, and the Debian revision. Dpkg compares these parts in that order. If the epochs are different, it returns immediately, and so on.
However, the important part is how it compares the versions, which are essentially stored as plain strings. These are compared in two distinct kinds of parts: those consisting of numerical characters (which are evaluated, and then compared), and those consisting of other characters. Non-numerical parts are compared by character value (ASCII), except that non-alphabetical characters are considered "greater than" alphabetical ones. Also note that longer strings (after excluding differences where numerical values are equal) are considered "greater than" shorter ones.
Here are a few examples of how these rules apply:-
15 > 10
0010 == 10
d.r > dsr
32.d.r == 0032.d.r
d.rnr < d.rnrn
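The comparison rules can be sketched as below. This follows the description in this chapter rather than the dpkg source, so treat it as an approximation. The names compare_fragment, split_version and compare_versions are hypothetical, and the "epoch:upstream-revision" syntax used to split a version is an assumption not spelled out in the text above.

```python
import re
from itertools import zip_longest


def _char_key(ch):
    # Letters keep their ASCII value; any other character sorts after all letters.
    return ord(ch) if ch.isascii() and ch.isalpha() else ord(ch) + 256


def compare_fragment(a, b):
    """Compare two version strings part by part; return -1, 0 or 1."""
    # Split each string into alternating (non-digit run, digit run) pairs.
    runs_a = re.findall(r"([^0-9]*)([0-9]*)", a)
    runs_b = re.findall(r"([^0-9]*)([0-9]*)", b)
    for (text_a, num_a), (text_b, num_b) in zip_longest(
            runs_a, runs_b, fillvalue=("", "")):
        key_a = [_char_key(c) for c in text_a]
        key_b = [_char_key(c) for c in text_b]
        if key_a != key_b:
            # List comparison also makes the longer of two runs with a
            # common prefix the greater one.
            return -1 if key_a < key_b else 1
        # Digit runs are evaluated as numbers, so leading zeros do not matter.
        val_a, val_b = int(num_a or "0"), int(num_b or "0")
        if val_a != val_b:
            return -1 if val_a < val_b else 1
    return 0


def split_version(version):
    """Split a version into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Compare epoch, then upstream version, then revision."""
    epoch_a, up_a, rev_a = split_version(a)
    epoch_b, up_b, rev_b = split_version(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return compare_fragment(up_a, up_b) or compare_fragment(rev_a, rev_b)
```

A result of 1 means the first version is greater, -1 that it is smaller, and 0 that the two are equal; for the examples listed above, compare_versions("d.r", "dsr") gives 1 and compare_versions("0010", "10") gives 0.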