Testing Squid

If all has gone well, we can begin to test the cache. True browser access is only covered in the next chapter, and there is a whole chapter devoted to configuring your browser. Until then, testing is done with the client program, which is included with the Squid source, and is in the /usr/local/squid/bin directory.

The client program connects to a cache, requests a page, and prints out useful timing information. Since client is available on all systems that Squid runs on, and has the same interface on all of them, we use it for the initial testing.

At this stage Squid should be in the foreground, logging everything to your terminal. Since client is a Unix program, you need access to a command prompt to run it. It's probably easiest to simply start another session (this way you can see if errors are printed in the main window).

The client program is compiled to connect to localhost on port 3128 (you can override these defaults from the command line; see the output of client -h for more details).

If you are running client on the cache server, and are using port 3128 for incoming requests, you should be able to type a command like this, and the client program will retrieve the page through the cache server:

client http://squid.nlanr.net/

If your cache is running on a different machine you will have to use the -h and -p options. The following command will connect to the machine cache.qualica.com on port 8080 and retrieve the page http://www.ora.com/ through it.

Example 5-1. Using the -h and -p client Options

cache1:~ $ /usr/local/squid/bin/client -h cache.qualica.com -p 8080 http://www.ora.com/

The client program can also be used to access web sites directly. As you may remember from reading Chapter 2, the protocol that clients use to access pages through a cache is part of the HTTP specification. The client program can be used to send both "normal" and "cache" HTTP requests. To check that your cache machine can actually connect to the outside world, it's a good idea to test access to an outside web server.

The next example will retrieve the page at http://www.ora.com/, and send the HTML contents of the page to your terminal.

If you have a firewall between you and the Internet, the request may not work, since the firewall may require authentication (or, if it's a proxy-level firewall and is not doing transparent proxying of the data, you may explicitly have to tell client to connect to the machine). To test requests through the firewall, look at the next section.

A note about the syntax of the next request: you are telling client to connect directly to the remote site, and request the page /. With a request through a cache server, you connect to the cache (as you would expect) and request a whole URL instead of just the path to a file. In essence, both normal-HTTP and cache-HTTP requests are identical; one just happens to refer to a whole URL, the other to a file.

Example 5-2. Retrieving Pages directly from a remote site with client

cache1:~ $ /usr/local/squid/bin/client -h www.ora.com -p 80 /
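The difference between the two request forms described above can be sketched in a few lines of Python. This is only an illustration, not how client is implemented: the function name build_request is hypothetical, HTTP/1.0 is assumed as the protocol version, and the Host header is an addition of ours that client may or may not send.

```python
from urllib.parse import urlsplit


def build_request(url, via_cache):
    """Return the text of a GET request for url (hypothetical helper).

    via_cache=True  -> the request line carries the whole URL (cache-HTTP)
    via_cache=False -> the request line carries only the path (normal-HTTP)
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    target = url if via_cache else path
    # Assumption: HTTP/1.0 plus a Host header, which many servers expect.
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(target, parts.netloc)


# First line of each request, as it would be sent over the connection.
print(build_request("http://www.ora.com/", via_cache=False).splitlines()[0])
print(build_request("http://www.ora.com/", via_cache=True).splitlines()[0])
```

The first line printed is the request you would send to www.ora.com on port 80; the second is the one you would send to the cache. Everything else about the two requests is the same.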

Client can also print out timing information for the download of a page. In this mode, the contents of the page aren't printed: only the timing information is. The zero in the example below indicates that client is to retrieve the page repeatedly until interrupted (with Control-C or Break). If you want to retrieve the page a limited number of times, simply replace the zero with a number.

Example 5-3. Printing timing information for a page download

cache1:~ $ /usr/local/squid/bin/client -g 0 -h www.ora.com -p 80 /
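If you would like to see what timing a repeated download amounts to, the following Python sketch may help. It is not client's implementation and does not reproduce its output format; the names time_fetches, summarize and fake_fetch are hypothetical, and fake_fetch merely sleeps so that the sketch runs without a network connection.

```python
import time


def time_fetches(fetch, count):
    """Call fetch() count times; return each elapsed time in milliseconds."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        fetch()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings):
    """Return (minimum, average, maximum) of a non-empty list of timings."""
    return min(timings), sum(timings) / len(timings), max(timings)


def fake_fetch():
    """Stand-in for a real page download (assumption: about 10 ms each)."""
    time.sleep(0.01)


low, avg, high = summarize(time_fetches(fake_fetch, 3))
print("min %.1f ms, avg %.1f ms, max %.1f ms" % (low, avg, high))
```

To time real downloads you would replace fake_fetch with a function that actually retrieves the page, either directly or through the cache.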

Testing a Cache or Proxy Server with Client

Now that you have client working, you

Example 5-4. Accessing a site through the cache

cache1:~ $ /usr/local/squid/bin/client -h cache1.domain.example -p 3128 http://www.ora.com/

If the request through the cache returned the same page as you retrieved with direct access (and you didn't receive an error message from Squid), Squid should be up and running. Congratulations!

If things aren't going so well for you, you will have received an error message here. Normally, this is because of the ACLs described in the previous chapter. First, you should have a look at the terminal where you are running Squid (or, if you are skipping ahead and have put Squid in the background, at the /usr/local/squid/logs/cache.log file). If Squid encountered some sort of problem, there should be an error or warning there. If there are no messages, you should look at the /usr/local/squid/logs/access.log file next. We haven't covered the details of this file yet, but they are covered in the next section of this chapter.

First, though, let's see if your cache can process requests to internal servers. There are many cases where a request will work to internal servers but not to external machines.

Testing Intranet Access

If you have a proxy-based firewall, Squid should be configured to pass outgoing requests to the proxy running on the firewall. This quite often presents a problem when an internal client is attempting to connect to an internal (Intranet) server, as discussed in section 2.2.5.2. To ensure that the acl-operator lists created in section 2.2.5.2 are working, you should use client to attempt to connect to a machine on the local network through the cache.

cache1:~ $ client -h cache1.domain.example -p 3128 http://www.localdomain.example

If you didn't get an error message from a command like the above, access to local servers should be working. It is possible, however, that the connection is being passed from the local cache to the parent (across a serial line), and that the parent is connecting back into the local network, slowing the connection enormously. The only way to ensure that the connection is not passing through your parent is to check the access logs, and see which server the connection is being passed to.

Access.log Basics

The access.log file logs all incoming requests. Chapter 11 covers the fields in the access.log in detail. The most important fields are the URL (field 7) and the hierarchy access type (field 9). Note that a "-" indicates that there is no data for that field.

The following example access.log entries show how the log output changes when connecting to another server without a cache, with a single parent, and with multiple parents. Though fields are separated by spaces, fields can contain sub-fields, where a "/" indicates the split.

When connecting directly to a destination server, field 9 contains two subfields: the keyword "DIRECT", followed by the name of the server that it is connecting to. Access to local servers (on your network) should always be DIRECT, even if you have a firewall, as discussed in section 3.1.2. The acl operator always_direct controls this behaviour.

905144366.259 1010 127.0.0.1 TCP_MISS/200 20868 GET http://www.ora.com/ - DIRECT/www.ora.com text/html

When you have configured only one parent cache, the hierarchy access type indicates this, and includes the name of that cache.

905144426.435 289 127.0.0.1 TCP_MISS/200 20868 GET http://www.ora.com/ - SINGLE_PARENT/cache1.ora.com text/html

There are many more types that can appear in the hierarchy access information field, but these are covered in Chapter 11.

Another useful field is the 'Log Tag' field, field four. In the following example this is the field "TCP_MISS/200".

905225025.225 609 127.0.0.1 TCP_MISS/200 10089 GET http://www.is.co.za/ - DIRECT/www.is.co.za text/html

A MISS indicates that the requested page was not already stored in the cache (or that the page contained headers indicating that the page was not to be cached). A HIT would indicate that the page was already stored in the cache. In the latter case the request time for a remote page should be substantially less than the first occurrence in the logs.

The time that Squid took to service the request is the second field. This value is in milliseconds. It should approach the time reported by client for the same request, but given operating system buffering there is likely to be a discrepancy.

The fifth field is the size of the page returned to the client. Note that an aborted request can end up downloading more than this from the origin server if the quick_abort feature set is turned on in the Squid config file.

Here is an example request direct from the origin server:

905230201.136 6642 127.0.0.1 TCP_MISS/200 20847 GET http://www.ora.com/ - DIRECT/www.ora.com text/html

If we use client to fetch the page a short time later, a HIT is returned, and the time is reduced hugely.

905230209.899 151 127.0.0.1 TCP_HIT/200 20869 GET http://www.ora.com/ - NONE/- text/html

Some of you will have noticed that the size of the hit has increased slightly.

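The fields discussed above can be pulled apart with a short script. The following Python sketch is only an illustration: it assumes every entry has exactly the ten space-separated fields seen in the examples in this section, and the names Entry and parse_entry, as well as the labels given to each field, are our own and are not defined by Squid.

```python
from collections import namedtuple

# Field labels are our own (hypothetical); Chapter 11 covers the real details.
Entry = namedtuple(
    "Entry",
    "timestamp elapsed_ms client tag status size method url ident "
    "hierarchy peer content_type",
)


def parse_entry(line):
    """Split one access.log line into its ten space-separated fields."""
    f = line.split()
    if len(f) != 10:
        raise ValueError("expected 10 fields, got %d" % len(f))
    tag, status = f[3].split("/", 1)      # field 4, e.g. TCP_MISS/200
    hierarchy, peer = f[8].split("/", 1)  # field 9, e.g. DIRECT/www.ora.com
    return Entry(float(f[0]), int(f[1]), f[2], tag, int(status), int(f[4]),
                 f[5], f[6], f[7], hierarchy, peer, f[9])


# The MISS and HIT entries shown above.
miss = parse_entry("905230201.136 6642 127.0.0.1 TCP_MISS/200 20847 GET "
                   "http://www.ora.com/ - DIRECT/www.ora.com text/html")
hit = parse_entry("905230209.899 151 127.0.0.1 TCP_HIT/200 20869 GET "
                  "http://www.ora.com/ - NONE/- text/html")

print(miss.tag, miss.hierarchy, miss.peer)
print("time saved by the hit: %d ms" % (miss.elapsed_ms - hit.elapsed_ms))
print("extra bytes in the hit: %d" % (hit.size - miss.size))
```

For the two entries above, the hit was served 6491 milliseconds faster than the miss and was 22 bytes larger.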
If you have checked the size of a request from the origin server and compared it to that of the same page through the cache, you will also note that the size of the returned data has increased very slightly. Extra headers are added to pages passing through the cache, indicating which peer the page was returned from (if applicable), age information and other information. Clients never see this information, but it can be useful for debugging.

Since Squid 1.2 has support for HTTP/1.1, extra features can be used by clients accessing a copy of a page that Squid already has. Certain extra headers are included in the HTTP headers returned in HITs, indicating support for features which are not available to clients when returning MISSes. In the above example Squid has included a header in the page indicating that range requests are supported.

If Squid is performing correctly, you should shut Squid down and add it to your startup files. Since Squid maintains an in-memory index of all objects in the cache, a kill -9 could cause corruption, and should never be used. The correct way to shut down Squid is to use the command:

cache1:~ # ~squid/bin/squid -k shutdown

Squid command-line options are covered in Chapter 10.

Addition to Startup Files

The location of startup files varies from system to system. The location and naming scheme of these files is beyond the scope of this book. If you already have a local startup file, it's a pretty good idea to simply add the RunCache program to that file. Note that you should place RunCache in the background on startup, which is normally done by placing an '&' after the command:

/usr/local/bin/RunCache &

The RunCache program attempts to restart Squid if it dies for some reason, and logs basic Squid debug output both to the file "/usr/local/squid/squid.out" and to syslog.