Acl-operator lines

Acl-operators are the other half of the acl system. For each connection the appropriate acl-operators are checked in the order in which they appear in the file. You have met the http_access and icp_access operators before, but they aren't the only Squid acl-operators. All acl-operator lines have the same format; although the format below mentions http_access specifically, the same layout applies to all the other acl-operators.

http_access allow|deny [!]aclname [& [!]aclname2 ... ]

Let's work through the fields from left to right. The first word is http_access, the actual acl-operator.

The allow and deny words come next. If you want to deny access to a specific class of users, you can change the customary allow to deny in the acl-operator line. We have already seen where a deny line is useful: the final deny of all IP ranges in previous examples.
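
Since the layout is shared, the same two fields appear on every other acl-operator. A minimal sketch using icp_access, with an illustrative myNet range that is an assumption rather than a recommendation:

acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0
# same layout as http_access: operator, allow|deny, then acl names
icp_access allow myNet
icp_access deny all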

Let's say that you wanted to deny Internet access to a specific list of IP addresses during the day. Since each acl can have only one type, you cannot create a single acl that matches both an IP address and a time of day. By combining more than one acl per acl-operator line, though, you get the same effect. Consider the following acls:

acl dialup src 10.0.0.0/255.255.255.0
acl work time 08:00-17:00

If you could create an acl-operator line that was matched only when both the dialup and work acls were true, clients in the range could only connect during the right times. This is where the aclname2 in the above acl-operator definition comes in. When you specify more than one acl on an acl-operator line, all of them have to match for the line to match: Squid ANDs the results of the individual acl checks together to decide whether the line is true or false. (The & in the format above is only notation; in squid.conf the acl names are simply separated by spaces.)

You could thus deny the dialup range cache access during working hours with the following acl rules:

Example 7-17. Using more than one acl on an http_access line

acl myNet src 168.209.2.0/255.255.255.0
acl dialup src 10.0.0.0/255.255.255.0
acl work_hours time 08:00-17:00
# 'all' must be defined before the final deny line can use it
acl all src 0.0.0.0/0.0.0.0
# For a connection from the dialup range, the results of the dialup
# and work_hours acls are ANDed together:
# during work hours:
#	1 AND 1 = TRUE, so the http_access line matches and
#			the client is denied
# after work hours:
#	1 AND 0 = FALSE, so the line does not match: the next
#			http_access line is checked.
http_access deny dialup work_hours
# If it's not during work hours, the above line will fail, and the
# next http_access line will be checked. You want to allow dialup
# users explicit access here, otherwise they are not caught by the
# myNet acl, and are denied by the final deny line.
http_access allow dialup
http_access allow myNet
http_access deny all
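
Because the acls on one line are ANDed and the lines themselves are checked in order, an exception to a rule is written as an earlier line of its own. The following is only a sketch that reuses the acls from the example above; the lunch acl and its times are hypothetical additions:

# hypothetical acl: the lunch hour
acl lunch time 12:00-13:00
# checked first: dialup AND lunch matches, so access is allowed
http_access allow dialup lunch
# for the rest of the working day the original deny still applies
http_access deny dialup work_hours
http_access allow dialup
http_access allow myNet
http_access deny all

Swapping the first two http_access lines would defeat the exception, since the deny line would then match before the lunch line was ever checked.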

You can also invert an acl's result by putting an exclamation mark (the traditional NOT operator from many programming languages) before the appropriate acl. In the following example I have reduced Example 6-4 to a single http_access line, relying on Squid's implicit inversion of the last rule to allow the local network while all other clients are denied.

Example 7-18. Specifying more than one acl per http_access line

acl myNet src 10.0.0.0/255.255.0.0
acl all src 0.0.0.0/0.0.0.0
# A request from an outside network:
#		1 AND (NOT 0) = True, so the request is denied
# A request from an internal network:
#		1 AND (NOT 1) = False. Because the last definition
#		is inverted (see earlier discussions in this chapter
#		for more detail), the local network is allowed: the
#		'deny' is inverted.
http_access deny all !myNet
# There is an invisible "http_access allow all" here because of the
# way Squid inverts the last http_access rule.

Since the above example is quite complicated, let's cover it in more detail.

In the above example an IP from the outside world will match the 'all' acl, but not the 'myNet' acl; the IP will thus match the http_access line. Consider the boolean logic for a request coming in from the outside world, where the IP is not defined in the myNet acl.

Deny http access if ((true) & (!false))

For an IP in the 10.0.0.0 range the myNet value is true, so the boolean representation is as follows:

Deny http access if ((true) & (!true))

A 10.0.0.0 range IP will thus not match the only http_access line in the squid config file. Remember that when no line matches, Squid defaults to the inverse of the last line in the file; since that line is a deny, accesses from the myNet IP range will be allowed.
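
For comparison, the same policy can be written without relying on the implicit inversion. This sketch uses the acls from the example above, and many administrators find the explicit form easier to read:

acl myNet src 10.0.0.0/255.255.0.0
acl all src 0.0.0.0/0.0.0.0
# local clients match the first line and are allowed
http_access allow myNet
# everything else falls through to the explicit deny
http_access deny all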

The other Acl-operators

You have encountered only the http_access and icp_access acl-operators so far. Other acl-operators are:

The no_cache acl-operator

The no_cache acl-operator is used to ensure freshness of objects in the cache by keeping matching objects out of it. The default Squid config file includes an example no_cache line that stops the results of cgi programs from being cached. If you want to ensure that cgi pages are not cached, you must un-comment the following lines in squid.conf:

acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY

The first line uses a regular expression match to find URLs that have cgi-bin or ? in the path (since we are using the urlpath_regex acl type, a site with a name like cgi-bin.oreilly.com will not be matched). The no_cache acl-operator is then used to keep matching objects out of the cache.
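
The operator is not tied to urlpath_regex; other acl types can be used in the same way. As a sketch, assuming a hypothetical intranet domain (intranet.example is a placeholder, not something found in squid.conf) whose pages should never be stored:

# placeholder domain: hosts under intranet.example
acl localsite dstdomain .intranet.example
no_cache deny localsite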

The ident_lookup_access acl-operator

Earlier we discussed using the ident protocol to control cache access. To reduce network overhead, Squid does an ident lookup only when it needs to. If you are using ident to do access control, Squid will do an ident lookup for every request, and you don't have to worry about this acl-operator.
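
As a reminder of what access control by ident looks like, here is a minimal sketch. The user names are hypothetical, and the other acls are the ones used elsewhere in this chapter:

acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0
# hypothetical user names, as reported by the client's ident server
acl staff ident oskar alice
# Squid has to do an ident lookup to evaluate the staff acl
http_access allow myNet staff
http_access deny all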

Many administrators would like to log the ident value for connections without actually using it for access control. Squid used to have a simple on/off switch for ident lookups, but this incurred extra overhead for the cases where the ident lookup wasn't useful (where, for example, the connection is from a desktop PC).

Let's consider some examples. Assume that you have one Unix server (at IP address 10.0.0.3), and all remaining IPs in the 10.0.0.0/255.255.255.0 range are desktop PCs. You don't want to log the ident value from PCs, but you do want to record it when the connection is from the Unix machine. Here is an example acl set that does this:

Example 7-19. Logging ident values from specific machines

acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0
# not used for access control, just to differentiate ident lookups:
acl Unixmachine src 10.0.0.3/255.255.255.255
http_access allow myNet
http_access deny all
# do an ident lookup when the request is from Unixmachine
ident_lookup_access allow Unixmachine
# but don't log ident values for anything else
ident_lookup_access deny all

If a system cracker is attempting to attack your cache, it can be useful to have their ident value logged. The following example tells Squid not to do ident lookups for machines that are allowed access; if a request comes from a disallowed IP range, though, an ident lookup is done and the result is written to the log.

Example 7-20. Doing ident lookups for unknown machines

acl myNet src 10.0.0.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0
http_access allow myNet
http_access deny all
# If the request is from a local machine, don't do an ident query
ident_lookup_access deny myNet
# If the request is from another network, do an ident query
ident_lookup_access allow all

The miss_access acl-operator

The ICP protocol is used by many caches to find out if objects are in another cache's on-disk store. If you are peering with other organisations' caches, you may wish them to treat you as a sibling, so that they only get data that you already have stored on disk. If an unscrupulous cache-admin were to change their cache_peer line to read parent instead of sibling, they could get you to retrieve objects on their behalf.

To stop this from happening, you can create an acl that contains the peering caches, and use the miss_access acl-operator to ensure that only hits are served to these caches. In response to all other requests, an access-denied message is sent (so if a sibling complains that they almost always get error messages, it's likely that they think that you should be their parent, and you think that they should be treating you as a sibling.)

When looking at the following example it is important to realise that http_access lines are checked before any miss_access lines. If the request is denied by the http_access lines, an error page is returned and the connection closed, so the miss_access lines are never checked. This means that the last miss_access line in the example doesn't allow random IP ranges to access your cache; it only allows requests that have already passed the http_access test. This is simpler than having one miss_access line for each http_access line in the file, and it reduces CPU usage too, since only two acls are checked rather than one for every http_access line.

Example 7-21. Allowing a subnet range to only get data we already have (hits)

acl myFirstNet src 10.0.0.0/255.255.255.0
acl mySecondNet src 10.1.0.0/255.255.255.0
acl myThirdNet src 10.2.0.0/255.255.255.0
acl othercompany src 10.11.12.13/255.255.255.255
acl all src 0.0.0.0/0.0.0.0
http_access allow myFirstNet
http_access allow mySecondNet
http_access allow myThirdNet
http_access allow othercompany
http_access deny all
# If the request is for a miss, and it's from othercompany, deny it
miss_access deny othercompany
miss_access allow all
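
For context, it is the peering cache's own squid.conf that decides how it treats you. The following sketches the line the other company would be expected to use, assuming your cache is reachable as cache.mycompany.example (a placeholder hostname) on HTTP port 3128 and ICP port 3130:

# on othercompany's cache: treat us as a sibling (hits only)
cache_peer cache.mycompany.example sibling 3128 3130
# changing 'sibling' to 'parent' would make their cache send misses
# to us, which the miss_access deny line above refuses:
# cache_peer cache.mycompany.example parent 3128 3130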

The always_direct and never_direct acl-operators

These operators help you make controlled decisions about which servers to connect to directly, and which to connect through a parent cache/proxy. I previously discussed this set of options briefly in Chapter Three, during the Basic Installation phase.

These tags are covered in detail in the following chapter, in the Peer Selection section.
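
In the meantime, a rough sketch of how the two operators are typically combined. It assumes that a parent cache has already been set up with a cache_peer line, and the mycompany.example domain is a placeholder:

acl all src 0.0.0.0/0.0.0.0
# placeholder domain for servers on the local network
acl localServers dstdomain .mycompany.example
# fetch objects from local servers directly, never via a parent
always_direct allow localServers
# send everything else through the parent cache
never_direct allow all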

The broken_posts acl-operator

Some servers handle POST data incorrectly, requiring an extra Carriage-Return (CR) and Line-Feed (LF) after a POST request. Since obeying the HTTP specification would make Squid incompatible with these servers, there is an option to be non-compliant when talking to a specific set of servers. This option should be used very rarely. The url_regex acl type should be used for specifying the broken server.

Example 7-22. Using the broken_posts acl-operator

acl broken_server url_regex http://broken-server.domain.example/
broken_posts allow broken_server