CHASM

Crytpograph-Hash-Algorithm-Secured Mirroring

Rsync is Wasteful

When two nodes synchronize via rsync, there is a significant amount of overhead.

Re-computation of Deltas

Consider when comparing files based upon time stamps. On each synchronization, directory entry's timestamps are compared to determine which files to exchange. When doing a synchronization between two hosts without any knowledge other than the root paths on each end, this cost is unavoidable.

In a mirroring network (like those used in Fedora, Debian, etc.), a small number of files are changed or added on each update, but updates occur frequently. If the master server were to pre-calculate the changes, then every downstream client would be able to know which files to request before connecting to an upstream server.

If all clients are querying a tracker that can advertise new changes, then this the reduced latency in each link will lead to overall reduced latency as well. Consider a network with three tiers of clients. The tracker can advertise changes to tier 1. When all tier 1 clients have the updates, it can begin to advertise the updates to tier 2 clients. At each update rollout, downstream nodes can efficiently transfer data from upstream nodes without ever having to compare the contents of the archive on each end of the connection (a simple bitfield of files will suffice).

Lack of a Checksum Cache

Fedora's sample rsync configuration explicitly turns off the checksum feature of rsync. The rationale behind this (please note that I agree with this rationale) is that it prevents remote users from triggering a checksum operation over the entire tree. This operation requires every file on the tree to be read from disk and run through the checksum algorithm used.

Thus, in the interest of saving I/O subsystems mirrors disable checksum support. Without checksum support, rsync falls back to using timestamps. For security reasons (I'll explain in a later post), timestamps are less than adequate for detecting changes.

Reliance Upon Picking a Fast Upstream

With rsync, the choice of upstream mirror is critical to maintaining an keeping up-to-date with updates. A mirror will never see updates before its upstream does. In the worst case, the leaf nodes in the network will always be d (the depth of the tree) synchronizations behind (assuming new content is pulled at each sync). If there is a critical update that must be pushed immediately, a O(6d) latency (assuming four synchronizations per day) is not acceptable.

What CHASM Will Change

In this section I will provide a brief overview of how CHASM addresses each of the above issues.

CHASM's master node generates a manifest as part of the update process. Each manifest describes the absolute state of the archive in terms of directories, symbolic links, and files. Each directory or file object has an associated traditional unix file mode (ugo-rwx). Files are specified by a cryptographic hash (where the CH in CHASM comes from).

A client simply has to read the manifest and compare it to local data on the system. Combine this with a local pool of files identified by their hash value (like Git's .git/objects/ directory), and CHASM is able to quickly identify which files in a manifest must be transferred. Even a mirror that is months out of date could easily avoid re-transferring any data that is still valid.

The solution to the re-computation of deltas adequately provides a checksum cache, and solves the issue of hard-links. As each file is identified by its checksum, every duplicate of the file will be replaced with a hard-link as a matter of course.

Mirrors can query for updates quite frequently without significant penalty. A full manifest of the Fedora project takes up less than 100MB without compression. Compression reduces this amount (more on this in a later post). Creating a diff between manifests reduces the size further.

The Fedora infrastructure already includes support for detecting closest-mirrors using BGP. We'd like to leverage this to allow clients to dynamically pull updates from close neighbors so as to minimize bandwidth costs and hopefully maximize throughput.

Looking to the Future

CHASM is still a work in progress. Everything I wrote was in the above document was written in present tense for clarity. Many of the features exist as ideas, and are not reflected in the code.

If you're looking to help, please consider submitting to our CHASM Nightly Dashboard (nightly build instructions).

Update: We're also looking for students and support in the Fedora Summer of Code (CHASM)

Thanks,

--Robert

Copyright © 2010 Robert Escriva ¦ Powered by Firmant