Chapter 6. Browser Configuration

Table of Contents
Browsers
Browser-cache Interaction
Testing the Cache
Cache Auto-config
cgi generated autoconfig files
Future directions
Ready to Go

Browsers

Squid is the server half of a client-server relationship. Though you have configured Squid, your client (the browser) is still configured to talk to the menagerie of servers that make up the Internet.

You have already used the client program included with Squid to test that the cache is working. Browsers are more complicated to configure than the client program, especially since there are so many different types of browser.

This chapter covers the three most common browsers. It also covers the proxy configuration of Unix tools, since you may wish to use these to download pages automatically. Once your browser is configured, the chapter goes on to some of the proxy-oriented features of browsers: many browsers allow you to force your cache server to reload a page, and have other proxy-specific features.

So that you can skip sections in this chapter that you don't need to read, browsers are configured in the following order: Netscape Communicator, Microsoft Internet Explorer, Opera and finally Unix Clients.

You can configure most browsers in more than one way. The first method is the simplest for a sysadmin, the second is simplest for the user. Since this book is written for system administrators, we term the first basic configuration, the second advanced configuration.

Basic Configuration

In this mode, each browser is configured independently of the others. If you need to change something about the server (the port that it accepts requests on, for example), each browser will have to be reconfigured manually: you will have to physically walk to it and change the setup. To avoid caching intranet sites, you will have to add exclusions for each intranet site on every browser.

Advanced Configuration

In this mode, you will configure a so-called rule server. Clients connect to this server on startup, and download information on which proxy server to talk to, and which URLs to retrieve from which proxy server. Exclusion of intranet sites is handled centrally, so one change will update all clients. If your organization is large, or is growing, you should use the auto-config option.

Though this method is called auto-config, it's not completely automatic, since the user still has to enter a URL indicating the location of the list of rules. Advanced configuration has some advantages:

  • Changes to the proxy server are easy, since you only change the rule server.

  • A proxy server can be chosen based on destination machine name, destination port and more. Since this list is maintained centrally, changes also only have to be made once.

  • Browser configuration is easy: instead of adding complicated lists of IP addresses, a user simply has to type in a URL.

  • Since it's easy to configure, users are more likely to use the cache.

When you write your list of rules (also called a proxy auto-config script), you will still need to supply the client with the same information as with the basic configuration; the difference is that this information is maintained centrally. Even if you decide to use only autoconfig on your network, you should probably work through the basic configuration first.
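
As a rough illustration of what such a list of rules can look like, the sketch below uses the shell to write a small proxy auto-config file. The FindProxyForURL function and the isPlainHostName, dnsDomainIs and isInNet helpers belong to the standard auto-config script format. Everything else is an assumption chosen to match the examples in this chapter: the file name proxy.pac, the host cache.domain.example, port 3128, the domain localdomain.example and the 192.168.0.0 network. The fallback to a direct connection is also just one possible policy.

```shell
#!/bin/sh
# Sketch: generate a proxy auto-config (PAC) file for the rule server.
# All values below are illustrative assumptions, not required names.
CACHE_HOST=cache.domain.example   # proxy-specific host name (assumed)
CACHE_PORT=3128                   # the http_port chosen for Squid (assumed)
LOCAL_DOMAIN=localdomain.example  # intranet domain to access directly (assumed)
PAC_FILE=${PAC_FILE:-proxy.pac}   # output file name (assumed)

# The here-document is unquoted, so the shell fills in the variables above.
cat > "$PAC_FILE" <<EOF
function FindProxyForURL(url, host) {
    // Intranet hosts are fetched directly, not through the cache.
    if (isPlainHostName(host) ||
        dnsDomainIs(host, ".$LOCAL_DOMAIN") ||
        isInNet(host, "192.168.0.0", "255.255.255.0")) {
        return "DIRECT";
    }
    // Everything else goes to the cache; fall back to a direct
    // connection if the cache cannot be reached.
    return "PROXY $CACHE_HOST:$CACHE_PORT; DIRECT";
}
EOF
```

The generated file would then be placed on a web server, and its URL is what users type into their browsers. Note that the same host name, port and exclusions appear here as in the basic configuration; they are simply kept in one place.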

Basic Configuration

To configure any browser, you need at least two pieces of information:

  • The proxy server's host name

  • The port that the proxy server is accepting requests on

Host name

It's very important to use a proxy-specific host name. If you decide to move the cache to another machine at a later stage you will find that it's much easier to change DNS settings than to change the configuration of every browser on your network.

If your operating system supports IP aliases you should organize a dedicated IP address for the cache server, and use the tcp_incoming_address squid.conf option to make Squid accept incoming HTTP requests only on that IP address. The related tcp_outgoing_address option sets the address Squid uses for its outgoing connections.

There isn't really a naming convention for caches, but people generally use one of the following: cache, proxy, www-proxy, www-cache, or even the name of the product they are using: squid, netapp, netscape. Some people also include the location of the cache in the name, and configure users in each region to talk to their local cache. More and more people are simply using cache, and it's the suggested name. If you wish to use regional names, you can use something along the lines of region.cache.domain.example.

Your choice of port has already been discussed. Have a look at HTTP:port in the index for more information.
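
Before touching any browser, it can be worth confirming that the host name and port really do reach the cache. The sketch below assumes that the curl command-line tool is installed (it is not part of Squid) and that the cache is cache.domain.example on port 3128. The names proxy_url and check_cache are hypothetical helpers, not existing commands.

```shell
#!/bin/sh
# Sketch: build the proxy URL from the two pieces of information above
# and request a page's headers through the cache.

# proxy_url HOST PORT -> prints http://HOST:PORT/
proxy_url() {
    printf 'http://%s:%s/\n' "$1" "$2"
}

# check_cache HOST PORT URL
# Uses curl's -x option to send the request via the proxy, and -I to
# fetch only the response headers. Returns curl's exit status.
check_cache() {
    curl -s -I -x "$(proxy_url "$1" "$2")" "$3"
}
```

Running check_cache cache.domain.example 3128 http://www.example.com/ should print the response headers if the cache is reachable. A non-zero exit status suggests that the host name or port is wrong.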

Netscape Communicator 4.5

(? Screen shots here ?)

Select the Edit menu
Select Preferences
Maximize Advanced
Select Proxies
Choose Manual proxy configuration
Click the View... button

For each of FTP Proxy, Gopher Proxy, HTTP Proxy, Security Proxy, enter the hostname of your cache on the left, and the chosen http_port on the right. Squid can function as a WAIS proxy when it has a WAIS relay (see the tags wais_relay_host and wais_relay_port in chapter 10 for more information).

If you have an intranet server, you can enter the host name in the box titled "No Proxy for". If you wish to add more than one server, simply use a comma to separate the entries.

Since you are going to be accessing a large cache server, disk space allocated to the browser's own cache is disk space that could be used for something else. It's still worth leaving some disk space allocated to the browser's cache, especially if the cache server is across a serial line. Modem users, for example, should keep their cache settings as they are.

Select the Edit menu
Select Preferences
Maximize Advanced
Select Cache
Change the text in the Disk Cache box to 1000

Internet Explorer 4.0

Select the View menu option
Select Internet Options
Click on the Connection tab
Select Access the Internet using a proxy server
Type in your hostname in the Address: field, and your chosen port in the Port: field.

Internet Explorer can attempt to connect directly to the destination server if the URL you are going to is in the local domain (? I presume ?). You should turn this option on, so that local accesses are not cached, and do not pass through the cache server. If you have more than one domain, you will have to specifically change options so that all your domains are ignored, using the Advanced button.

In the Advanced menu, you can configure per-protocol cache server/port pairs, or you can type in only the first proxy/port pair, and select Use the same proxy for all protocols. Although Squid doesn't normally work with SOCKS, SOCKS is rarely used, so you can probably use the same proxy for all protocols.

The main advantage of using the Advanced menu is the ability to specify which domains are to be connected to directly, rather than through the proxy server. If all your local sites' hostnames begin with intranet, you can simply put that into the box titled Do not use proxy for addresses beginning with. You can add more than one exception by using a semicolon (;) between entries.

You will probably wish to exclude all local sites too. Since the exception list allows you to use a * character for what is known as a wildcard match, you can add *.localdomain.example, and all hosts in your domain will be accessed directly. Many people access local sites by IP address, rather than by name. Since the exception list matches against the URL (??) these will still pass through the cache, and you will need to add an IP address range to the list of hosts to exclude: 192.168.0.* should do nicely.
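
To see why a site accessed by IP address is not caught by a name-based exception, it can help to model the exception list as a set of wildcard patterns. The sketch below uses the shell's own case patterns as a stand-in. It is an assumption, not a confirmed fact, that Internet Explorer's matching behaves the same way, and route_for is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: model an exception list with shell wildcard patterns.
# The patterns are the examples used in this section.

# route_for HOST -> prints "direct" if HOST matches an exception,
# otherwise "proxy".
route_for() {
    case "$1" in
        intranet*)              echo direct ;;  # names beginning with intranet
        *.localdomain.example)  echo direct ;;  # any host in the local domain
        192.168.0.*)            echo direct ;;  # local sites accessed by IP
        *)                      echo proxy  ;;  # everything else uses the cache
    esac
}

route_for www.localdomain.example   # prints: direct
route_for 192.168.0.17              # prints: direct
route_for www.example.com           # prints: proxy
```

Removing the 192.168.0.* pattern makes the second call print proxy, which is the situation described above: the name-based patterns never see a host name to match against.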

To reduce the local browser cache space (as discussed in the Netscape section above):

View
Options
General
In the Temporary Internet files section, click the Settings button.
Move the slider all the way to the left.

Since Squid-2.0 and above handle HTTP/1.1 correctly, you should also configure Internet Explorer to use HTTP/1.1 when communicating with the proxy server:

View
Internet Options
Advanced tab
Scroll down until you see HTTP 1.1 Settings
Tick Use HTTP 1.1 through proxy server

(? I believe that opera is the third most common browser ?) (? I don't have a machine with it on... since I run Linux?)

Unix clients

Most Unix client programs use environment variables to decide how they are to access the Internet. If you run lynx (a terminal-based browser) on any of your machines, or use the recursive web-spider wget, you can simply set an environment variable and these programs will use the correct proxy server.

Each protocol has a different environment variable, so that you can configure your client to use a different proxy for each protocol. Each variable is simply the protocol name with the text _proxy tagged onto the end, so some of the most common protocols end up as follows:

  • http_proxy

  • ftp_proxy

  • gopher_proxy

Since many people prefer a shell other than bash, we make an exception to our rule that "all examples are based on sh" here.

sh. The Bourne Shell (or Bash, the freeware alternative)

http_proxy=http://cache.domain.example:3128/
export http_proxy
OR
ftp_proxy=http://cache.domain.example:3128/
export ftp_proxy
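
Since the same cache usually serves every protocol, the Bourne shell settings above can be wrapped in a small function, for example in a login script. This is a sketch only: use_cache is a hypothetical name, and the host and port are the example values used in this chapter.

```shell
#!/bin/sh
# Sketch: point the common *_proxy variables at one cache.

# use_cache HOST PORT
use_cache() {
    cache_url="http://$1:$2/"
    # One variable per protocol, all pointing at the same cache.
    http_proxy=$cache_url
    ftp_proxy=$cache_url
    gopher_proxy=$cache_url
    export http_proxy ftp_proxy gopher_proxy
}

use_cache cache.domain.example 3128
echo "$http_proxy"   # prints: http://cache.domain.example:3128/
```

Programs such as lynx and wget that are started from this shell afterwards inherit the exported variables.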

tcsh. The C Shell

setenv http_proxy http://cache.domain.example:3128/
OR
setenv ftp_proxy http://cache.domain.example:3128/

(? ksh, others ?)