To place your cache in a hierarchy, use the cache_host
directive in squid.conf to specify the parent and sibling nodes.
For example, the following squid.conf file on
childcache.example.com configures its cache to retrieve
data from one parent cache and two sibling caches:
    # squid.conf - On the host: childcache.example.com
    #
    # Format is: hostname type http_port udp_port
    #
    cache_host parentcache.example.com parent 3128 3130
    cache_host childcache2.example.com sibling 3128 3130
    cache_host childcache3.example.com sibling 3128 3130
The cache_host_domain directive allows you to specify that
certain caches are siblings or parents for certain domains:
    # squid.conf - On the host: sv.cache.nlanr.net
    #
    # Format is: hostname type http_port udp_port
    #
    cache_host electraglide.geog.unsw.edu.au parent 3128 3130
    cache_host cache1.nzgate.net.nz parent 3128 3130
    cache_host pb.cache.nlanr.net parent 3128 3130
    cache_host it.cache.nlanr.net parent 3128 3130
    cache_host sd.cache.nlanr.net parent 3128 3130
    cache_host uc.cache.nlanr.net sibling 3128 3130
    cache_host bo.cache.nlanr.net sibling 3128 3130
    cache_host_domain electraglide.geog.unsw.edu.au .au
    cache_host_domain cache1.nzgate.net.nz .au .aq .fj .nz
    cache_host_domain pb.cache.nlanr.net .uk .de .fr .no .se .it
    cache_host_domain it.cache.nlanr.net .uk .de .fr .no .se .it
    cache_host_domain sd.cache.nlanr.net .mx .za .mu .zm
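The configuration above indicates that the cache will use
pb.cache.nlanr.net and it.cache.nlanr.net for domains uk, de, fr, no, se and it,
sd.cache.nlanr.net for domains mx, za, mu and zm, and
cache1.nzgate.net.nz for domains au, aq, fj, and nz
(electraglide.geog.unsw.edu.au is also used for au).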
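The domain restrictions above can be modelled as a simple suffix match. The following Python sketch is only an illustration of that idea, not Squid's own code: the names PEER_DOMAINS and peers_for are made up for this example, and it assumes that a peer is eligible when the requested hostname ends with one of the listed domains. Peers without a cache_host_domain line (the two siblings above) are left out because they carry no domain restriction.

```python
# Hypothetical model of cache_host_domain matching; not Squid's actual code.
# The table mirrors the cache_host_domain lines from the example above.
PEER_DOMAINS = {
    "electraglide.geog.unsw.edu.au": [".au"],
    "cache1.nzgate.net.nz": [".au", ".aq", ".fj", ".nz"],
    "pb.cache.nlanr.net": [".uk", ".de", ".fr", ".no", ".se", ".it"],
    "it.cache.nlanr.net": [".uk", ".de", ".fr", ".no", ".se", ".it"],
    "sd.cache.nlanr.net": [".mx", ".za", ".mu", ".zm"],
}


def peers_for(hostname):
    """Return the restricted peers whose domain list matches the hostname."""
    host = hostname.lower()
    return [
        peer
        for peer, domains in PEER_DOMAINS.items()
        if any(host.endswith(domain) for domain in domains)
    ]


if __name__ == "__main__":
    for name in ("www.bbc.co.uk", "www.unsw.edu.au", "www.example.com"):
        print(name, "->", peers_for(name))
```

A hostname outside every listed domain returns an empty list here, which corresponds to none of the domain-restricted parents being considered for that request.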
We have a simple set of guidelines for joining the NLANR cache hierarchy.
The NLANR hierarchy can provide you with an initial source for parent or sibling caches. Joining the NLANR global cache system will frequently improve the performance of your caching service.
Just enable these options in your squid.conf and you'll be registered:
    cache_announce 24
    announce_to sd.cache.nlanr.net:3131
NOTE: announcing your cache is not the same thing as joining the NLANR cache hierarchy. You can join the NLANR cache hierarchy without registering, and you can register without joining the NLANR cache hierarchy.
Visit the NLANR cache registration database to discover other caches near you. Keep in mind that just because a cache is registered in the database does not mean its operators are willing to be your parent/sibling/child. But it can't hurt to ask...
Note: The information here is current for version 2.2.
If you are behind a firewall then you can't make direct connections to the outside world, so you must use a parent cache. Squid doesn't use ICP queries for a request if it's behind a firewall or if there is only one parent.
You can use the
never_direct access list in
squid.conf to specify which requests must be forwarded to
your parent cache outside the firewall. For example, if Squid
can connect directly to all servers that end with mydomain.com, but
must use the parent for all others, you would write:
    acl INSIDE dstdomain mydomain.com
    never_direct deny INSIDE
Note that the outside domains will not match the INSIDE acl. When there are no matches, the default action is the opposite of the last action. It's as if there is an implicit never_direct allow all as the final rule.
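The "opposite of the last action" rule is easy to get wrong, so here is a small Python sketch of it. This is a simplified model written for this FAQ, not Squid's implementation: the function name never_direct_decision is hypothetical, a rule is reduced to an (action, acl name) pair, and the behaviour with an empty rule list is an assumption.

```python
def never_direct_decision(rules, matched_acls):
    """Return True when the request must be sent to a parent.

    rules        -- list of (action, acl_name) pairs in config order,
                    where action is "allow" or "deny"
    matched_acls -- set of acl names that match the current request
    """
    # The first rule whose acl matches the request decides the outcome.
    for action, acl_name in rules:
        if acl_name in matched_acls:
            return action == "allow"
    if not rules:
        # Assumption: with no never_direct lines, nothing is forced to a parent.
        return False
    # No rule matched: the default is the opposite of the last rule's action.
    return rules[-1][0] != "allow"


if __name__ == "__main__":
    rules = [("deny", "INSIDE")]
    print(never_direct_decision(rules, {"INSIDE"}))  # request for an inside server
    print(never_direct_decision(rules, set()))       # request for an outside server
```

With the single never_direct deny INSIDE line, an inside request is not forced to a parent, while an outside request (which matches nothing) falls through to the implicit allow and must use the parent.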
You could also specify internal servers by IP address:
    acl INSIDE_IP dst 18.104.22.168/24
    never_direct deny INSIDE_IP
Note, however, that when you use IP addresses, Squid must perform a DNS lookup to convert URL hostnames to an address. Your internal DNS servers may not be able to look up external domains.
If you use never_direct and you have multiple parent caches, then you probably will want to mark one of them as a default choice in case Squid can't decide which one to use. That is done with the default keyword on a cache_peer line. For example:
cache_peer xyz.mydomain.com parent 3128 0 default
Note: The information here is current for version 2.2.
First, you need to give Squid a parent cache. Second, you need to tell Squid it can not connect directly to origin servers. This is done with three configuration file lines:
    cache_peer parentcache.foo.com parent 3128 0 no-query default
    acl all src 0.0.0.0/0.0.0.0
    never_direct allow all
Note, with this configuration, if the parent cache fails or becomes unreachable, then every request will result in an error message.
In case you want to be able to use direct connections when all the parents go down you should use a different approach:
    cache_peer parentcache.foo.com parent 3128 0 no-query
    prefer_direct off
The default behaviour of Squid in the absence of positive ICP, HTCP, etc replies is to connect to the origin server instead of using parents. The prefer_direct off directive tells Squid to try parents first.
The dnsserver processes are used by squid because the
gethostbyname(3) library routines used to
convert web site names to their internet addresses
block until the function returns (i.e., the process that calls
them has to wait for a reply). Since there is only one squid
process, everyone who uses the cache would have to wait each
time the routine was called. This is why the dnsserver is
a separate process, so that these processes can block
without causing blocking in squid.
It's very important that there are enough dnsserver processes to cope with every access you will need, otherwise squid will stop occasionally. A good rule of thumb is to make sure you have at least the maximum number of dnsservers squid has ever needed on your system, and probably add two to be on the safe side. In other words, if you have only ever seen at most three dnsserver processes in use, make at least five. Remember that a dnsserver is small and, if unused, will be swapped out.
First, find out if you have enough dnsserver processes running by looking at the Cachemanager dns output. Ideally, you should see that the first dnsserver handles a lot of requests, the second one less than the first, etc. The last dnsserver should have serviced relatively few requests. If there is not an obvious decreasing trend, then you need to increase the number of dns_children in the configuration file. If the last dnsserver has zero requests, then you definitely have enough.
Another factor which affects the dnsserver service time is the proximity of your DNS resolver. Normally we do not recommend running Squid and named on the same host. Instead you should try to use a DNS resolver (named) on a different host, but on the same LAN. If your DNS traffic must pass through one or more routers, this could be causing unnecessary delays.
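The two rules of thumb above (add two to the highest number of dnservers ever seen in use, and look for a decreasing trend in per-dnsserver request counts) can be written down as a short Python sketch. The function names and the sample numbers are invented for illustration; the request counts would have to be read off the Cachemanager dns output by hand, since this sketch does not parse that output.

```python
def recommended_dns_children(max_seen_in_use, margin=2):
    """Rule of thumb: the most dnsservers ever seen in use, plus a margin."""
    return max_seen_in_use + margin


def has_decreasing_trend(request_counts):
    """True when no dnsserver handled more requests than the one before it
    and the last one handled fewer requests than the first."""
    if len(request_counts) < 2:
        return False
    pairs = zip(request_counts, request_counts[1:])
    non_increasing = all(earlier >= later for earlier, later in pairs)
    return non_increasing and request_counts[-1] < request_counts[0]


def enough_dnsservers(request_counts):
    """An idle last dnsserver, or a clear decreasing trend, suggests enough."""
    if not request_counts:
        return False
    return request_counts[-1] == 0 or has_decreasing_trend(request_counts)


if __name__ == "__main__":
    print(recommended_dns_children(3))
    print(enough_dnsservers([900, 400, 120, 15, 2]))      # hypothetical counts
    print(enough_dnsservers([500, 510, 495, 505, 498]))   # hypothetical counts
```

The second sample list shows every dnsserver about equally busy, which is the pattern that suggests raising dns_children.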
Before you run the configure script, simply set the CACHE_HTTP_PORT environment variable.
    setenv CACHE_HTTP_PORT 8080
    ./configure
    make
    make install
With Squid-1.1 it is NOT possible. Each cache_dir is assumed to be the same size. The cache_swap setting defines the size of all cache_dir's taken together. If you have N cache_dir's then each one will hold cache_swap ÷ N Megabytes.
Most people have a disk partition dedicated to the Squid cache. You don't want to use the entire partition size. You have to leave some extra room. Currently, Squid is not very tolerant of running out of disk space.
Let's say you have a 9GB disk. Remember that disk manufacturers lie about the space available. A so-called 9GB disk usually results in about 8.5GB of raw, usable space. First, put a filesystem on it, and mount it. Then check the "available space" with your df program. Note that you lose some disk space to filesystem overheads, like superblocks, inodes, and directory entries. Also note that Unix normally keeps 10% free for itself. So with a 9GB disk, you're probably down to about 8GB after formatting.
Next, I suggest taking off another 10% or so for Squid overheads, and a "safe buffer." Squid normally puts its swap.state files in each cache directory. These grow in size until you rotate the logs, or restart squid. Also note that Squid performs better when there is more free space. So if performance is important to you, then take off even more space. Typically, for a 9GB disk, I recommend a cache_dir setting of 6000 to 7500 Megabytes:
cache_dir ... 7000 16 256
It's better to start out conservative. After the cache becomes full, look at the disk usage. If you think there is plenty of unused space, then increase the cache_dir setting a little.
If you're getting "disk full" write errors, then you definitely need to decrease your cache size.
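The sizing advice above boils down to one subtraction, sketched here in Python. The function name and the default of 10 percent are illustrative choices taken from the "another 10% or so" suggestion; the input is the available space reported by df after formatting, in Megabytes.

```python
def suggest_cache_dir_mb(available_mb, squid_overhead_percent=10):
    """Starting cache_dir size: df-reported space minus a share reserved
    for swap.state files and a safety buffer (integer Megabytes)."""
    return available_mb * (100 - squid_overhead_percent) // 100


if __name__ == "__main__":
    # About 8GB is left on a so-called 9GB disk after formatting.
    print(suggest_cache_dir_mb(8000))
    # Reserve more when performance matters more than capacity.
    print(suggest_cache_dir_mb(8000, squid_overhead_percent=25))
```

Both results fall inside the 6000 to 7500 Megabyte range recommended for a 9GB disk. Treat the number as a starting point and adjust it after watching real disk usage.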
With Squid-1.1, yes, you will lose your cache. This is because version 1.1 uses a simplistic algorithm to distribute files between cache directories.
With Squid-2, you will not lose your existing cache. You can add and delete cache_dir's without affecting any of the others.
Several people on both the fwtk-users and the squid-users mailing lists asked about using Squid in combination with http-gw from the TIS toolkit. The most elegant way in my opinion is to run an internal Squid caching proxy server which handles client requests and let this server forward its requests to the http-gw running on the firewall. Cache hits won't need to be handled by the firewall.
In this example Squid runs on the same server as the http-gw, Squid uses 8000 and http-gw uses 8080 (web). The local domain is home.nl.
Either run http-gw as a daemon from the /etc/rc.d/rc.local (Linux Slackware):
    exec /usr/local/fwtk/http-gw -daemon 8080
or run it from inetd like this:
    web stream tcp nowait.100 root /usr/local/fwtk/http-gw http-gw
I increased the watermark to 100 because a lot of people run into problems with the default value.
Make sure you have at least the following line in /usr/local/etc/netperm-table:
    http-gw: hosts 127.0.0.1
You could add the IP-address of your own workstation to this rule and make sure the http-gw by itself works, like:
    http-gw: hosts 127.0.0.1 10.0.0.1
The following settings are important:
    http_port 8000
    icp_port 0
    cache_host localhost.home.nl parent 8080 0 default
    acl HOME dstdomain .home.nl
    never_direct deny HOME
This tells Squid to use the parent for all domains other than home.nl. Below, access.log entries show what happens if you do a reload on the Squid-homepage:
    872739961.631 1566 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/ - DEFAULT_PARENT/localhost.home.nl -
    872739962.976 1266 10.0.0.21 TCP_CLIENT_REFRESH/304 88 GET http://www.nlanr.net/Images/cache_now.gif - DEFAULT_PARENT/localhost.home.nl -
    872739963.007 1299 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/Icons/squidnow.gif - DEFAULT_PARENT/localhost.home.nl -
    872739963.061 1354 10.0.0.21 TCP_CLIENT_REFRESH/304 83 GET http://www.squid-cache.org/Icons/Squidlogo2.gif - DEFAULT_PARENT/localhost.home.nl
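If you want to check such entries with a script, the following Python sketch splits one access.log line into named fields. It assumes the whitespace-separated layout seen in the entries above (timestamp, elapsed milliseconds, client address, result code/HTTP status, size, method, URL, ident, hierarchy code/peer host); the Entry type and the function name are made up for this example, and other log formats would need a different parser.

```python
from collections import namedtuple

# Field names are illustrative labels for the columns seen in the log above.
Entry = namedtuple(
    "Entry",
    "timestamp elapsed_ms client result status size method url ident hierarchy peer",
)


def parse_access_log_line(line):
    """Split one access.log line of the layout shown above into an Entry."""
    fields = line.split()
    result, status = fields[3].split("/")
    hierarchy, _, peer = fields[8].partition("/")
    return Entry(
        timestamp=float(fields[0]),
        elapsed_ms=int(fields[1]),
        client=fields[2],
        result=result,
        status=int(status),
        size=int(fields[4]),
        method=fields[5],
        url=fields[6],
        ident=fields[7],
        hierarchy=hierarchy,
        peer=peer,
    )


if __name__ == "__main__":
    sample = (
        "872739961.631 1566 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET "
        "http://www.squid-cache.org/ - DEFAULT_PARENT/localhost.home.nl -"
    )
    entry = parse_access_log_line(sample)
    print(entry.hierarchy, entry.peer)
```

For the setup described here, every entry should show DEFAULT_PARENT with localhost.home.nl as the peer, which confirms that requests are going through the http-gw.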
http-gw entries in syslog:
    Aug 28 02:46:00 memo http-gw: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
    Aug 28 02:46:00 memo http-gw: log host=localhost/127.0.0.1 protocol=HTTP cmd=dir dest=www.squid-cache.org path=/
    Aug 28 02:46:01 memo http-gw: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1
    Aug 28 02:46:01 memo http-gw: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
    Aug 28 02:46:01 memo http-gw: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/Squidlogo2.gif
    Aug 28 02:46:01 memo http-gw: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
    Aug 28 02:46:01 memo http-gw: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/squidnow.gif
    Aug 28 02:46:01 memo http-gw: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
    Aug 28 02:46:01 memo http-gw: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.nlanr.net path=/Images/cache_now.gif
    Aug 28 02:46:02 memo http-gw: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1
    Aug 28 02:46:03 memo http-gw: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=2
    Aug 28 02:46:04 memo http-gw: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=3
-- Rodney van den Oever
When a proxy-cache is used, a server does not see the connection coming from the originating client. Many people like to implement access controls based on the client address. To accommodate these people, Squid adds its own request header called "X-Forwarded-For" which looks like this:
    X-Forwarded-For: 22.214.171.124, unknown, 126.96.36.199
Entries are always IP addresses, or the word unknown if the address could not be determined or if it has been disabled with the forwarded_for configuration option.
We must note that access controls based on this header are extremely weak and simple to fake. Anyone may hand-enter a request with any IP address whatsoever. This is perhaps the reason why client IP addresses have been omitted from the HTTP/1.1 specification.
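For servers that still want to look at the header, here is a minimal Python sketch of splitting an X-Forwarded-For value into its entries. It assumes the format described above (a comma-separated list of IP addresses or the word unknown); the function name is hypothetical and the addresses in the usage example are documentation placeholders, not real clients. Given the warning above, the parsed values should be treated as untrusted hints only.

```python
import ipaddress


def parse_forwarded_for(value):
    """Parse an X-Forwarded-For header value.

    Returns a list with one item per entry: an ipaddress object for an
    address, or None for the word "unknown". Raises ValueError for anything
    else, which is a reasonable reaction to a header that is trivial to forge.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if item.lower() == "unknown":
            entries.append(None)
        else:
            entries.append(ipaddress.ip_address(item))
    return entries


if __name__ == "__main__":
    # Placeholder addresses from the documentation ranges.
    print(parse_forwarded_for("192.0.2.1, unknown, 198.51.100.7"))
```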
Yes it can; however, the way of doing it has changed from earlier versions of squid. As of squid-2.2 a more customisable method has been introduced. Please follow the instructions for the version of squid that you are using. As a default, no anonymizing is done.
If you choose to use the anonymizer you might wish to investigate the forwarded_for option to prevent the client address being disclosed. Failure to turn off the forwarded_for option will reduce the effectiveness of the anonymizer. Finally, if you filter the User-Agent header, using the fake_user_agent option can prevent some problems for users, as some sites require the User-Agent header.
With the introduction of squid 2.2 the anonymizer has become more customisable. It now allows specification of exactly which headers will be allowed to pass.
The new anonymizer uses the 'anonymize_headers' tag. It has two modes. The first mode is to deny all headers and allow only the specified ones. The following example will simulate the old paranoid mode.
    anonymize_headers allow Allow Authorization Cache-Control
    anonymize_headers allow Content-Encoding Content-Length
    anonymize_headers allow Content-Type Date Expires Host
    anonymize_headers allow If-Modified-Since Last-Modified
    anonymize_headers allow Location Pragma Accept Charset
    anonymize_headers allow Accept-Encoding Accept-Language
    anonymize_headers allow Content-Language Mime-Version
    anonymize_headers allow Retry-After Title Connection
    anonymize_headers allow Proxy-Connection
This will prevent any headers other than those listed from being passed by the proxy.
The second mode is to allow all headers and deny only the specified ones. The following example replicates the old standard mode.
    anonymize_headers deny From Referer Server
    anonymize_headers deny User-Agent WWW-Authenticate Link
It allows all headers to pass unless they are listed.
You cannot mix allow and deny in a squid configuration; it is either one or the other!
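The difference between the two modes can be illustrated with a few lines of Python. This is only a model of the filtering idea described above, not how Squid implements it: the function name filter_headers is made up, headers are represented as a plain dictionary, and header names are compared without regard to case.

```python
def filter_headers(headers, mode, names):
    """Model of the two anonymize_headers modes.

    mode "allow": pass only the listed headers (everything else is dropped).
    mode "deny":  drop only the listed headers (everything else passes).
    """
    listed = {name.lower() for name in names}
    if mode == "allow":
        return {k: v for k, v in headers.items() if k.lower() in listed}
    if mode == "deny":
        return {k: v for k, v in headers.items() if k.lower() not in listed}
    raise ValueError("mode must be 'allow' or 'deny'")


if __name__ == "__main__":
    request = {
        "Host": "www.example.com",
        "User-Agent": "ExampleBrowser/1.0",
        "Referer": "http://www.example.com/start.html",
    }
    print(filter_headers(request, "allow", ["Host", "Accept"]))
    print(filter_headers(request, "deny", ["From", "Referer", "User-Agent"]))
```

In this request both calls leave only the Host header, but for different reasons: the allow list passes it explicitly, while the deny list simply does not mention it.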
There are three modes: none, standard, and paranoid. The mode is set with the http_anonymizer configuration option.
With no anonymizing (the default), Squid forwards all request headers as received from the client, to the origin server (subject to the regular rules of HTTP).
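In the standard mode, Squid filters out a fixed set of request headers. The anonymize_headers deny example above replicates that set.
In the paranoid mode, Squid allows only a fixed set of request headers. The anonymize_headers allow example above simulates that set.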
References: Anonymous WWW
Sure, just use the always_direct access list.
For example, if you want Squid to connect directly to hotmail.com servers, you can use these lines in your config file:
    acl hotmail dstdomain .hotmail.com
    always_direct allow hotmail
Sure, there are a few things you can do.
You can use the no_cache access list to make Squid never cache any response:
    acl all src 0/0
    no_cache deny all
With Squid-2.4 and later you can use the "null" storage module:
cache_dir null -1 1000