The Cache Array Routing Protocol (CARP)

The CARP protocol uses a hash function to decide which cache a request is to be forwarded to. When a request is to be sent out, the code takes the URL requested and feeds it through a formula that essentially generates a large number from the text of the URL. A different URL (even if it differs by only one character) is likely to end up as a very different number (it won't, for example, differ by one). If you take 50 URLs and put them through the function, the numbers generated are going to be spread far apart from one another, and would be spread evenly across a graph. The numbers generated, however, all fit in a certain range. Because the number are spread across the range evenly, we can split the range into two, and the same number of URLs will have ended up in the first half as the second.

Let's say that we create a rule that effectively says: "I have two caches. Whenever I receive a request, I want to pass it to one of these caches. I know that any number generated by the hash function will be less than X, and that numbers are as likely to fall above one-half X as below. By sending all requests that hash to less than one-half X to cache one, and the remaining requests to cache two, the load should be even."

To terminology: the total range of numbers is split into equally large number ranges (called buckets).

Let's say that we have two caches, again. This time, though, cache one is able to handle twice the load of cache two. If we split the hash space into three ranges, and allocate buckets one and three to cache one (and bucket two to cache two), a URL will have twice the chance of going to cache one as it does to cache two.

Squid caches can talk to parent caches using CARP balancing if CARP was enabled when the source was configured (using the ./configure --enable-carp command.)

The load-factor values on all cache_peer lines must add up to 1.0. The below example splits 70% of the load onto the machine bigcache.mydomain.example, leaving the other 30% up to the other cache.

Example 8-11. Using CARP Load Factor variables

cache_peer smallcache.mydomain.example parent 3128 3130 carp-load-factor=0.3
cache_peer bigcache.mydomain.example parent 3128 3130 carp-load-factor=0.70

Now that your cache is integrated into a hierarchy (or is a hierarchy!), we can move to the next section. Accelerator mode allows your cache to function as a front-end for a real web server, speeding up web page access on those old servers.

Transparent caches effectively accelerate web servers from a distance (the code, at least, to perform both functions is effectively the same.) If you are going to do transparent proxying, I suggest that you read the next two Chapters. If you aren't interested in either of these Squid features, your Squid installation should be up and running. The remainder of the book (Section III) covers cache maintenance and debugging.