Kubernetes Helm: What Platform Package Managers Do

Now that development for Helm version 3 has kicked off, I'm starting to hear a wide variety of opinions on what Helm should do and how it relates to what other platform package managers do. What I'm learning is that not everyone realizes how much functionality is packed into the package managers they're using.

To illustrate the features in package managers, let's take a look at APT. APT is the package manager for Debian and Debian-based platforms such as Ubuntu. It has been around since August of 1998 and has been in wide use for a long time.

Distributed Repositories

For many, the default APT repositories their operating system is configured with, such as the default set on Ubuntu, are the only ones they use. But APT is designed so that additional repositories can be added or the default set changed.

This is leveraged for a variety of reasons including:

  1. Distributing general software. For example, launchpad.net projects can have APT repositories that people can install software from.
  2. Company specific applications can be bundled up as Debian packages and installed via the typical mechanisms. Many companies do this today for both their software services and internal applications.
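
To make the idea of adding repositories concrete, here is a minimal sketch that reads one-line-style APT source entries (type, URI, suite, then components). The apt.example.com repository is hypothetical, and the parser assumes entries carry no bracketed options.

```python
# One-line-style APT source entries: type, URI, suite, then components.
# The example.com repository is a hypothetical internal company repository.
SOURCES = """\
deb http://archive.ubuntu.com/ubuntu jammy main universe
# internal company packages
deb https://apt.example.com/internal stable main
"""


def parse_sources(text):
    """Return one dict per source entry, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Assumes no bracketed options such as [arch=amd64] are present.
        kind, uri, suite, *components = line.split()
        entries.append(
            {"type": kind, "uri": uri, "suite": suite, "components": components}
        )
    return entries


for entry in parse_sources(SOURCES):
    print(entry["uri"], entry["suite"], entry["components"])
```

The default set and the company repository sit side by side in the same list, which is what lets one tool install from both.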

Passing Metadata

A typical design pattern these days is to host a central service that holds metadata and answers queries against it. Many search services work this way because downloading their data set is impractical or impossible.

APT does things a little differently. An APT repository provides package indexes that hold metadata about the packages. These are downloaded, and local applications use them to learn about the packages.

There are advantages and disadvantages to this, such as:

  • When searching a local data set for packages, you control whether a 3rd party service knows what you're searching for. When leveraging a central service for search, you cannot control the analytics they run or how they use or sell that information.
  • Large data sets can take time to transfer to local systems and, in some cases, can be too large to practically hold locally. For APT this has proven to be a theoretical issue more than a practical one.
  • A local data set can be out of sync with the latest version of the set.
  • When the data set is local it can work offline.
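
The local-index approach described above can be sketched in a few lines. The sample data below is made up, written in the "Field: value" stanza layout that package index files use. The function names are illustrative and are not APT's own API.

```python
# Sketch: search a locally cached package index without any network call.
# SAMPLE_INDEX is made-up data in the stanza format used by index files.
SAMPLE_INDEX = """\
Package: hello
Version: 2.10-2
Depends: libc6 (>= 2.14)
Description: example package that greets the user

Package: mysql-server
Version: 8.0.36-1
Depends: mysql-client, libc6
Description: MySQL database server
"""


def parse_index(text):
    """Split an index into stanzas and each stanza into a field dict."""
    packages = []
    for stanza in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line: belongs to the previous field.
                fields[key] += "\n" + line.strip()
            else:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        packages.append(fields)
    return packages


def search(packages, term):
    """Return names of packages whose name or description mentions term."""
    term = term.lower()
    return [
        p["Package"]
        for p in packages
        if term in p["Package"].lower()
        or term in p.get("Description", "").lower()
    ]


packages = parse_index(SAMPLE_INDEX)
print(search(packages, "database"))
```

Because the search runs entirely against the downloaded copy, it works offline and no outside service sees the query.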

Search metadata is a gold mine of information, and it provides opportunities for private package information to be accidentally leaked between repositories and providers, which is a security issue.

Interestingly, the extension on the package index file can indicate that it is sent as a compressed file, for example with a .gz or .xz suffix.
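
As a rough illustration of extension-driven compression, the sketch below picks a decompressor from Python's standard library based on the file name. The function name read_index and the sample bytes are assumptions made for this example, not part of APT.

```python
import bz2
import gzip
import lzma

# Map a file extension to the matching standard-library decompressor.
DECOMPRESSORS = {
    ".gz": gzip.decompress,
    ".xz": lzma.decompress,
    ".bz2": bz2.decompress,
}


def read_index(filename, raw_bytes):
    """Return index text, decompressing according to the file extension."""
    for ext, decompress in DECOMPRESSORS.items():
        if filename.endswith(ext):
            return decompress(raw_bytes).decode("utf-8")
    # No known extension: treat the bytes as an uncompressed index.
    return raw_bytes.decode("utf-8")


plain = b"Package: hello\nVersion: 2.10-2\n"
same = read_index("Packages.gz", gzip.compress(plain)) == read_index("Packages", plain)
print(same)
```

The metadata is identical either way; only the transfer size changes.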

Fetching Packages And Placing Files

APT downloads files and installs them to the right place on a system. This may be what it's best known for, but it's just one of many features.

Dependency Handling

In addition to the requested package, APT installs its dependencies and can clean up dependencies that are no longer in use.
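
Both halves of that behavior can be sketched with a small dependency graph. This is a simplified illustration that assumes a hypothetical DEPENDS table and ignores version constraints and alternatives, which a real resolver has to handle.

```python
# Hypothetical dependency data: package -> packages it depends on.
DEPENDS = {
    "mysql-server": ["mysql-client", "libc6"],
    "mysql-client": ["libc6"],
    "libc6": [],
    "hello": ["libc6"],
}


def install_order(package, depends, seen=None):
    """Return the package and its dependencies, dependencies first."""
    seen = set() if seen is None else seen
    order = []
    if package in seen:
        return order
    seen.add(package)
    for dep in depends[package]:
        order.extend(install_order(dep, depends, seen))
    order.append(package)
    return order


def unused(installed, manual, depends):
    """Return installed packages that no manually requested package needs."""
    needed = set()
    for package in manual:
        needed.update(install_order(package, depends))
    return sorted(set(installed) - needed)


print(install_order("mysql-server", DEPENDS))
installed = {"mysql-server", "mysql-client", "libc6", "hello"}
print(unused(installed, manual={"hello"}, depends=DEPENDS))
```

The first call lists dependencies ahead of the package that needs them. The second shows what could be cleaned up once only "hello" is still explicitly wanted.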

Install and Remove Events

At certain points in the install and removal of a package, an event occurs which can trigger a script. These scripts are part of the control section of the package archive.

Compared to Kubernetes, these are conceptually similar to container lifecycle hooks.
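
The event model can be sketched as an ordered list of steps with optional scripts attached. The names preinst, postinst, prerm and postrm mirror Debian's maintainer scripts. The "unpack" and "delete" steps, the run function, and the scripts themselves are hypothetical stand-ins.

```python
# Events in the order they fire. The middle step is the file operation itself.
EVENTS = {
    "install": ["preinst", "unpack", "postinst"],
    "remove": ["prerm", "delete", "postrm"],
}


def run(action, scripts):
    """Walk the events for an action, running any script bound to each one."""
    log = []
    for event in EVENTS[action]:
        log.append(event)
        script = scripts.get(event)
        if script is not None:
            log.append(script())
    return log


# Hypothetical scripts a database package might ship.
scripts = {
    "postinst": lambda: "start service",
    "prerm": lambda: "stop service",
}

print(run("install", scripts))
print(run("remove", scripts))
```

A package only supplies scripts for the events it cares about, so here the service starts after its files land and stops before they are deleted.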

Getting Information At Install Time

Have you ever installed MySQL using apt-get install? Has it prompted you for a password for the root MySQL user? Getting install time information and using it as part of the installation process is built into APT.

To make this process easier, there is a package with helper functions to set default values and collect information.
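
A rough sketch of the idea follows: each question has a default, an answer may be supplied ahead of time, and a prompt is used only when one is available. The template keys and the collect_answers function are invented for illustration and are not the helper package's real interface.

```python
# Hypothetical question templates a package might ship.
TEMPLATES = {
    "db/root_password": {"prompt": "Root password", "default": ""},
    "db/port": {"prompt": "Listen port", "default": "3306"},
}


def collect_answers(templates, preseeded=None, ask=None):
    """Resolve each question: preseeded answer, then prompt, then default."""
    preseeded = preseeded or {}
    answers = {}
    for key, template in templates.items():
        if key in preseeded:
            answers[key] = preseeded[key]
        elif ask is not None:
            # An empty reply falls back to the template's default.
            answers[key] = ask(template["prompt"]) or template["default"]
        else:
            answers[key] = template["default"]
    return answers


print(collect_answers(TEMPLATES, preseeded={"db/root_password": "s3cret"}))
```

Passing ask=input would prompt on the terminal, while leaving it out gives an unattended install that relies on preseeded values and defaults.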

Bringing It Together

Bring this together and you end up with packages that can have orchestrated installations, collect user input at install time, manage the dependencies, and download and install files.

The setup also works in a distributed manner that appears to prioritize those installing applications over those running the distributed repositories.

If you want to have a little "fun", you might dig around inside the Debian package for something popular like MySQL. You can see everything going on inside it.