A Torrent of Pulls

May 9, 2016 · By Joseph Schorr

Today at CoreOS Fest 2016, Brandon Philips, CTO of CoreOS, announced that we are bringing together BitTorrent and Quay for improved efficiency. The new quayctl tool, available as a preview, adds support for pulling appc and Docker container images using BitTorrent. quayctl integrates with both the rkt and Docker container engines, and we want you to try it out.

Improving container images in production

Containerization technologies such as rkt and the Docker engine have provided major benefits for the deployment of software, such as reproducibility, ease of development and ease of deployment. One major pain point developers encounter when starting to use containers in production is the sheer size of the container images produced. Distributing large container images to large numbers of machines can be a headache when bandwidth is limited, registries are geographically far away, or the number and size of image layers are large.

At CoreOS, we’ve been giving a great deal of thought to how to mitigate some of the headaches associated with distributing container images. Last year, we launched squashed image support in Quay for downloading container images without their intermediate layers. Squashed images have been a great benefit to our customers and ourselves, with reported download times as much as 50% shorter than a standard docker pull. However, squashed images must be fully pulled from the registry every time, which puts load on network gateways and makes no use of local resources. Machines and services are rarely started alone; what if there were a way to share images within a cluster?

Using BitTorrent with containers and Quay

Enter BitTorrent. BitTorrent provides an easy way for multiple machines within a cluster to share pieces of large binary files peer-to-peer. Many organizations, such as Twitter and Facebook, have adopted BitTorrent for deployment with significant results. With BitTorrent, companies have seen drastic reductions in download and deployment time, as well as increased stability from having multiple machines serving their binary data.

Recognizing the benefits of using BitTorrent with containers, we are excited today to announce support for pulling appc and Docker images using BitTorrent via the new quayctl tool. Developers will save time on deployment of their containers, as well as benefit from increased efficiency thanks to the ability to use BitTorrent and Quay.

A torrent of images

Introducing BitTorrent-based pulling of Quay images into your deployment process is quite straightforward using the quayctl tool.

First, download the binary version of the tool from the GitHub releases page via curl or wget.

Once installed on a machine, quayctl can be used in place of rkt fetch or docker pull to pull the image via torrent:

$ quayctl rkt torrent pull quay.io/quay/busybox
Discovering image quay.io/quay/busybox
Downloading torrent for image quay.io/quay/busybox
quay.io/quay/busybox Completed
Loading image quay.io/quay/busybox
Successfully pulled image quay.io/quay/busybox

$ quayctl docker torrent pull quay.io/quay/busybox
Downloading manifest for image quay.io/quay/busybox
Downloaded manifest for image quay.io/quay/busybox
sha256:3eecc807f63d Completed
sha256:48341aa39e69 Completed
sha256:f0a98344d604 Completed
Connecting to docker
Pulling image
f0a98344d604 [===========================] 100.00 % 0  Pull complete
3eecc807f63d [===========================] 100.00 % 0  Pull complete
48341aa39e69 [===========================] 100.00 % 0  Pull complete
Successfully pulled image quay.io/quay/busybox

For each layer of an image, quayctl will pull the layer’s contents via torrent, using local seeds where possible and falling back to a Quay-provided webseed when image data is not yet in the cluster. During the download process, each client will seed to others in the swarm.
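Because torrent pulling is a preview feature, a deployment script may want a safety net. The following is a minimal sketch, not part of quayctl itself: pull_image is a hypothetical helper name, and the fallback policy is an assumption. It uses only the quayctl invocation shown above and a plain docker pull.

```shell
#!/bin/sh
# Sketch: prefer a torrent pull, fall back to a normal registry pull.
# pull_image is a hypothetical helper; quayctl does not provide it.

pull_image() {
    image="$1"
    if quayctl docker torrent pull "$image"; then
        echo "pulled $image via torrent"
    else
        # Assumption: a non-zero exit status means the torrent pull failed.
        echo "torrent pull failed, falling back to docker pull" >&2
        docker pull "$image"
    fi
}

# Usage: pull_image quay.io/quay/busybox
```

Either path should leave the image in the local image store, so the rest of a deployment script would not need to know which one was taken.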

Once all layers have been downloaded, quayctl will then call the container engine to load the layers into the image store, allowing the image to be used as if pulled via a normal call:

$ rkt image list
ID                     NAME
sha512-82b6f101a4bf    quay.io/quay/busybox:latest

Ensuring privacy and security

Private repositories and container images provided a unique challenge for implementing BitTorrent support.

For pull credentials, quayctl will automatically read from the local rkt or Docker credentials store, allowing existing flows that make use of rkt credential config or docker login to continue working without change.
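As a rough illustration of the Docker flow, a script could check for an existing quay.io login before attempting a private pull. This is a sketch under assumptions: has_quay_login and pull_private are hypothetical helpers, the repository name in the usage line is made up, and the check only looks at the config.json file that docker login writes (honoring DOCKER_CONFIG). Exactly which files quayctl reads is not spelled out here.

```shell
#!/bin/sh
# Sketch: refuse to start a private pull unless `docker login quay.io`
# appears to have been run. Helper names are hypothetical.

has_quay_login() {
    # The Docker client keeps its config in $DOCKER_CONFIG/config.json,
    # defaulting to ~/.docker/config.json.
    config="${DOCKER_CONFIG:-$HOME/.docker}/config.json"
    [ -f "$config" ] && grep -q '"quay.io"' "$config"
}

pull_private() {
    image="$1"
    if ! has_quay_login; then
        echo "no quay.io entry in Docker config; run 'docker login quay.io' first" >&2
        return 1
    fi
    quayctl docker torrent pull "$image"
}

# Usage (hypothetical repository): pull_private quay.io/example/private
```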

When sharing pieces, Quay makes use of the Chihaya open source BitTorrent tracker to find peers. As part of this project, we implemented a custom JWT auth solution, allowing only those with the necessary signed token to access the peer list for private blocks. This allows private clusters to load peers for private images, without risk of leaking data or the IP addresses of the peers.

Seeding the cluster

To ensure maximum sharing within a cluster, it is recommended that all machines using BitTorrent seed their data once a container image has been downloaded. The quayctl tool provides a dedicated seed command which allows seeding of an image, either indefinitely or for a specified period of time:

$ quayctl rkt torrent seed quay.io/coreos/clair --duration=10m
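In a provisioning script, pulling and seeding can be chained so that every machine gives back to the swarm after it has the image. A minimal sketch, assuming a hypothetical helper named pull_and_seed and a default window of 10 minutes; it uses only the rkt pull and seed invocations shown in this post.

```shell
#!/bin/sh
# Sketch: pull an image via torrent, then seed it for a fixed window.
# pull_and_seed is a hypothetical helper, not a quayctl command.

pull_and_seed() {
    image="$1"
    duration="${2:-10m}"   # assumed default, matching the example above

    # Stop early if the pull fails; there is nothing to seed.
    quayctl rkt torrent pull "$image" || return 1

    # Assumption: seed runs in the foreground until the duration elapses,
    # so callers may prefer to background this call.
    quayctl rkt torrent seed "$image" --duration="$duration"
}

# Usage: pull_and_seed quay.io/coreos/clair 30m
```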

Squashed images

In addition to pulling Docker images layer by layer, quayctl can also pull squashed Docker images from Quay over BitTorrent (rkt images are already squashed).

Squashed images can be pulled by adding the --squashed argument to the torrent pull command.

NOTE: In order to torrent pull a squashed image, the image must have been downloaded at least once via a normal HTTP request.

$ quayctl docker torrent pull quay.io/coreos/clair --squashed
Starting download of squashed image
quay.io/coreos/clair Completed
Importing squashed image
Successfully pulled squashed image quay.io/coreos/clair

Once pulled, the image will be available under a tag of the form tag.squash, where tag is the tag that was pulled.
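Since a squashed torrent pull depends on the squashed image having been requested over HTTP at least once, a script may want to degrade gracefully when that is not yet the case. The following is a sketch under assumptions: pull_squashed_or_layers is a hypothetical helper, and it treats any non-zero exit status from the squashed pull as a reason to fall back to the layer-by-layer torrent pull.

```shell
#!/bin/sh
# Sketch: try the squashed torrent pull first, then the layered one.
# pull_squashed_or_layers is a hypothetical helper name.

pull_squashed_or_layers() {
    image="$1"
    if quayctl docker torrent pull "$image" --squashed; then
        return 0
    fi
    echo "squashed torrent pull failed; pulling layer by layer" >&2
    quayctl docker torrent pull "$image"
}

# Usage: pull_squashed_or_layers quay.io/coreos/clair
```

Note that the two paths do not leave the image under the same tag, since the squashed pull adds the .squash suffix, so a caller would need to record which path succeeded.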

Benefits of torrenting

In our initial experiments, we have found that torrenting of images yielded reduced download times when performed on clusters behind single gateways to the external internet, or on clusters geographically remote from US East (where Quay stores its data). In both cases, the ability for machines locally within the cluster to share image data helped reduce download times compared to pulling data from Quay’s storage provider across the gateway or the world.

In our next experiments, switching to torrenting of ACI or squashed images resulted in even larger reductions in download time (more than 30% shorter than a normal pull), as the single images were significantly smaller than the sum of all the layers. We also saw far less variation in download times, as most of the pieces were pulled from within the local cluster.

We believe that torrenting will result in even further reductions of download times when used in environments with very large numbers of machines starting at once (50+), very large container images (1GB+) or on providers without direct networking access to Amazon S3.
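To see why cluster size matters, consider the traffic crossing a single gateway. The sketch below is an idealized back-of-the-envelope model, not a measurement: it assumes that with direct pulls every machine fetches the full image through the gateway, while in the best case with torrenting each piece crosses the gateway once and is then shared locally. The function name gateway_traffic_mb is hypothetical.

```shell
#!/bin/sh
# Idealized model of gateway traffic, in MB. Real swarms will land
# somewhere between the two figures.

gateway_traffic_mb() {
    machines="$1"
    image_mb="$2"
    mode="$3"
    case "$mode" in
        direct)  echo $((machines * image_mb)) ;;  # every machine pulls it all
        torrent) echo "$image_mb" ;;               # best case: one copy enters
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# 50 machines pulling a 1 GB (1024 MB) image:
gateway_traffic_mb 50 1024 direct    # prints 51200
gateway_traffic_mb 50 1024 torrent   # prints 1024
```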

Get started with BitTorrent and Quay by CoreOS

Using BitTorrent and Quay to store and manage application containers saves developers time with the ability to develop and distribute large container images quickly, safely and efficiently within a cluster. Get started with the Quay container registry to get the benefits of faster development and deployment.