Applies to openSUSE Leap 42.1

26 The Proxy Server Squid

Abstract

Squid is a widely-used proxy cache for Linux and Unix platforms. This means that it stores requested Internet objects, such as data on a Web or FTP server, on a machine that is closer to the requesting workstation than the server. It may be set up in multiple hierarchies to assure optimal response times and low bandwidth usage, even in modes that are transparent for the end user. Additional software like squidGuard may be used to filter Web contents.

Squid acts as a proxy cache. It redirects object requests from clients (in this case, from Web browsers) to the server. When the requested objects arrive from the server, it delivers the objects to the client and keeps a copy of them in the hard disk cache. One of the advantages of caching is that several clients requesting the same object can be served from the hard disk cache. This enables clients to receive the data much faster than from the Internet. This procedure also reduces the network traffic.

Along with the actual caching, Squid offers a wide range of features such as distributing the load over intercommunicating hierarchies of proxy servers, defining strict access control lists for all clients accessing the proxy, allowing or denying access to specific Web pages with the help of other applications, and generating statistics about frequently-visited Web pages for the assessment of the users' surfing habits. Squid is not a generic proxy. It normally proxies only HTTP connections. It supports the protocols FTP, Gopher, SSL, and WAIS, but it does not support other Internet protocols, such as Real Audio, news, or video conferencing. Because Squid only supports the UDP protocol to provide communication between different caches, many other multimedia programs are not supported.

26.1 Some Facts about Proxy Caches

As a proxy cache, Squid can be used in several ways. When combined with a firewall, it can help with security. Multiple proxies can be used together. It can also determine what types of objects should be cached and for how long.

26.1.1 Squid and Security

It is possible to use Squid together with a firewall to secure internal networks from the outside using a proxy cache. The firewall denies all clients access to external services except Squid. All Web connections must be established by the proxy. With this configuration, Squid completely controls Web access.

If the firewall configuration includes a DMZ, the proxy should operate within this zone. Section 26.5, “Configuring a Transparent Proxy” describes how to implement a transparent proxy. This simplifies the configuration of the clients, because in this case they do not need any information about the proxy.

26.1.2 Multiple Caches

Several instances of Squid can be configured to exchange objects between them. This reduces the total system load and increases the chances of finding an object already existing in the local network. It is also possible to configure cache hierarchies, so a cache can forward object requests to sibling caches or to a parent cache—causing it to get objects from another cache in the local network or directly from the source.

Choosing the appropriate topology for the cache hierarchy is very important, because it is not desirable to increase the overall traffic on the network. For a very large network, it would make sense to configure a proxy server for every subnet and connect them to a parent proxy, which in turn is connected to the proxy cache of the ISP.

All this communication is handled by ICP (Internet cache protocol) running on top of the UDP protocol. Data transfers between caches are handled using HTTP (hypertext transmission protocol) based on TCP.

To find the most appropriate server from which to get the objects, one cache sends an ICP request to all sibling proxies. These answer the requests via ICP responses with a HIT code if the object was detected or a MISS if it was not. If multiple HIT responses were found, the proxy server decides from which server to download, depending on factors such as which cache sent the fastest answer or which one is closer. If no satisfactory responses are received, the request is sent to the parent cache.

Tip
Tip: Avoiding Duplication of Objects

To avoid duplication of objects in different caches in the network, other ICP protocols are used, such as CARP (cache array routing protocol) or HTCP (hypertext cache protocol). The more objects maintained in the network, the greater the possibility of finding the desired one.

26.1.3 Caching Internet Objects

Not all objects available in the network are static. There are a lot of dynamically generated CGI pages, visitor counters, and encrypted SSL content documents. Objects like this are not cached because they change each time they are accessed.

The question remains as to how long all the other objects stored in the cache should stay there. To determine this, all objects in the cache are assigned one of various possible states. Web and proxy servers find out the status of an object by adding headers to these objects, such as Last modified or Expires and the corresponding date. Other headers specifying that objects must not be cached are used as well.

Objects in the cache are normally replaced, because of a lack of free hard disk space, using algorithms such as LRU (last recently used). This means that the proxy expunges the objects that have not been requested for the longest time.

26.2 System Requirements

The most important thing is to determine the maximum network load the system must bear. Therefore, it is important to pay more attention to the load peaks, because these might be more than four times the day's average. When in doubt, it would be better to overestimate the system's requirements, because having Squid working close to the limit of its capabilities could lead to a severe loss in the quality of the service. The following sections point to the system factors in order of significance.

26.2.1 Hard Disks

Speed plays an important role in the caching process, so this factor deserves special attention. For hard disks, this parameter is described as random seek time, measured in milliseconds. Because the data blocks that Squid reads from or writes to the hard disk tend to be rather small, the seek time of the hard disk is more important than its data throughput. For the purposes of a proxy, hard disks with high rotation speeds are probably the better choice, because they allow the read-write head to be positioned in the required spot more quickly. One possibility to speed up the system is to use a number of disks concurrently or to employ striping RAID arrays.

26.2.2 Size of the Disk Cache

In a small cache, the probability of a HIT (finding the requested object already located there) is small, because the cache is easily filled and the less requested objects are replaced by newer ones. If, for example, one GB is available for the cache and the users only surf ten MB per day, it would take more than one hundred days to fill the cache.

The easiest way to determine the needed cache size is to consider the maximum transfer rate of the connection. With a 1 Mbit/s connection, the maximum transfer rate is 125 KB/s. If all this traffic ends up in the cache, in one hour it would add up to 450 MB and, assuming that all this traffic is generated in only eight working hours, it would reach 3.6 GB in one day. Because the connection is normally not used to its upper volume limit, it can be assumed that the total data volume handled by the cache is approximately 2 GB. This is why 2 GB of disk space is required in the example for Squid to keep one day's worth of browsed data cached.

26.2.3 RAM

The amount of memory (RAM) required by Squid directly correlates to the number of objects in the cache. Squid also stores cache object references and frequently requested objects in the main memory to speed up retrieval of this data. Random access memory is much faster than a hard disk.

In addition to that, there is other data that Squid needs to keep in memory, such as a table with all the IP addresses handled, an exact domain name cache, the most frequently requested objects, access control lists, buffers, and more.

It is very important to have sufficient memory for the Squid process, because system performance is dramatically reduced if it must be swapped to disk. The cachemgr.cgi tool can be used for the cache memory management. This tool is introduced in Section 26.6, “cachemgr.cgi”.

26.2.4 CPU

Squid is not a program that requires intensive CPU usage. The load of the processor is only increased while the contents of the cache are loaded or checked. Using a multiprocessor machine does not increase the performance of the system. To increase efficiency, it is better to buy faster disks or add more memory.

26.3 Starting Squid

If not already installed, install the squid package. squid does not belong to the default openSUSE® Leap installation scope.

Squid is already preconfigured in SUSE® Linux Enterprise Server, you can start it right after the installation. To ensure a smooth start-up, the network should be configured in a way that at least one name server and the Internet can be reached. Problems can arise if a dial-up connection is used with a dynamic DNS configuration. In this case, at least the name server should be entered, because Squid does not start if it does not detect a DNS server in /etc/resolv.conf.

26.3.1 Commands for Starting and Stopping Squid

To start Squid, enter systemctl start squid at the command line as root. In the initial start-up, the directory structure of the cache must first be defined in /var/cache/squid. This is done automatically by the start script and can take a few seconds or even minutes. If done appears to the right in green, Squid has been successfully loaded. To test the functionality of Squid on the local system, enter localhost as the proxy and 3128 as the port in the browser.

To allow users from the local system and other systems to access Squid and the Internet, change the entry in the configuration files /etc/squid/squid.conf from http_access deny all to http_access allow all. However, in doing so, consider that Squid is made completely accessible to anyone by this action. Therefore, define ACLs that control access to the proxy. More information about this is available in Section 26.4.2, “Options for Access Controls”.

After modifying the configuration file /etc/squid/squid.conf, Squid must reload the configuration file. Do this with systemctl reload squid. Alternatively, completely restart Squid with systemctl restart squid.

The command systemctl status squid can be used to check if the proxy is running. The command systemctl stop squid causes Squid to shut down. This can take a while, because Squid waits up to half a minute (shutdown_lifetime option in /etc/squid/squid.conf) before dropping the connections to the clients and writing its data to the disk.

Warning
Warning: Terminating Squid

Terminating Squid with kill or killall can damage the cache. To be able to restart Squid, a damaged cache must be deleted.

If Squid dies after a short period of time even though it was started successfully, check whether there is a faulty name server entry or whether the /etc/resolv.conf file is missing. Squid logs the cause of a start-up failure in the file /var/log/squid/cache.log. If Squid should be loaded automatically when the system boots, enable the service with systemctl enable squid.

An uninstall of Squid does not remove the cache hierarchy or the log files. To remove these, delete the /var/cache/squid directory manually.

26.3.2 Local DNS Server

Setting up a local DNS server makes sense even if it does not manage its own domain. It then simply acts as a caching-only name server and is also able to resolve DNS requests via the root name servers without requiring any special configuration (see Section 19.4, “Starting the BIND Name Server”). How this can be done depends on whether you chose dynamic DNS during the configuration of the Internet connection.

Dynamic DNS

Normally, with dynamic DNS, the DNS server is set by the provider during the establishment of the Internet connection and the local /etc/resolv.conf file is adjusted automatically. This behavior is controlled in the /etc/sysconfig/network/config file with the NETCONFIG_DNS_POLICY sysconfig variable. Set NETCONFIG_DNS_POLICY to "" with the YaST sysconfig editor. Then enter the local DNS server in the /etc/resolv.conf file with the IP address 127.0.0.1 for localhost. This way Squid can always find the local name server when it starts.

To make the provider's name server accessible, enter it in the configuration file /etc/named.conf under forwarders along with its IP address. With dynamic DNS, this can be achieved automatically during connection establishment by setting the sysconfig variable NETCONFIG_DNS_POLICY to auto.

Static DNS

With static DNS, no automatic DNS adjustments take place while establishing a connection, so there is no need to change any sysconfig variables. You must, however, enter the local DNS server in the file /etc/resolv.conf as described above. Additionally, the providers static name server must be entered manually in the /etc/named.conf file under forwarders along with its IP address.

Tip
Tip: DNS and Firewall

If you have a firewall running, make sure DNS requests can pass it.

26.4 The /etc/squid/squid.conf Configuration File

All Squid proxy server settings are made in the /etc/squid/squid.conf file. To start Squid for the first time, no changes are necessary in this file, but external clients are initially denied access. The proxy is available for localhost. The default port is 3128. The preinstalled configuration file /etc/squid/squid.conf provides detailed information about the options and many examples. Nearly all entries begin with # (the lines are commented) and the relevant specifications can be found at the end of the line. The given values almost always correlate with the default values, so removing the comment signs without changing any of the parameters usually has little effect. If possible, leave the sample as it is and insert the options along with the modified parameters in the line below. This way, the default values may easily be recovered and compared with the changes.

Tip
Tip: Adapting the Configuration File after an Update

If you have updated from an earlier Squid version, it is recommended to edit the new /etc/squid/squid.conf and only apply the changes made in the previous file. If you try to use the old squid.conf, you risk that the configuration no longer works, because options are sometimes modified and new changes added.

26.4.1 General Configuration Options (Selection)

http_port 3128

This is the port on which Squid listens for client requests. The default port is 3128, but 8080 is also common. If desired, specify several port numbers separated by blank spaces.

cache_peer hostnametypeproxy-porticp-port

Here, enter a parent proxy, for example, if you want to use the proxy of your ISP. As hostname, enter the name or IP address of the proxy to use and, as type, enter parent. For proxy-port, enter the port number that is also given by the operator of the parent for use in the browser (usually 8080). Set the icp-port to 7 or 0 if the ICP port of the parent is not known and its use is irrelevant to the provider. In addition, default and no-query may be specified after the port numbers to prohibit the use of the ICP protocol. Squid then behaves like a normal browser as far as the provider's proxy is concerned.

cache_mem 8 MB

This entry defines the amount of memory Squid can use for very popular replies. The default is 8 MB. This does not specify the memory usage of Squid and may be exceeded.

cache_dir ufs /var/cache/squid/ 100 16 256

The entry cache_dir defines the directory where all the objects are stored on disk. The numbers at the end indicate the maximum disk space in MB to use and the number of directories in the first and second level. The ufs parameter should be left alone. The default is 100 MB occupied disk space in the /var/cache/squid directory and creation of 16 subdirectories inside it, each containing 256 more subdirectories. When specifying the disk space to use, leave sufficient reserve disk space. Values from a minimum of 50% to a maximum of 80% of the available disk space make the most sense here. The last two numbers for the directories should only be increased with caution, because too many directories can also lead to performance problems. If you have several disks that share the cache, enter several cache_dir lines.

cache_access_log /var/log/squid/access.log, cache_log /var/log/squid/cache.log, cache_store_log /var/log/squid/store.log

These three entries specify the paths where Squid log files all its actions. Normally, nothing is changed here. If Squid is experiencing a heavy usage burden, it might make sense to distribute the cache and the log files over several disks.

emulate_httpd_log off

If the entry is set to on, obtain readable log files. Some evaluation programs cannot interpret this, however.

client_netmask 255.255.255.255

With this entry, mask IP addresses of clients in the log files. The last digit of the IP address is set to zero if you enter 255.255.255.0 here. You may protect the privacy of your clients this way.

ftp_user Squid@

With this, set the password Squid should use for the anonymous FTP login. It can make sense to specify a valid e-mail address here, because some FTP servers check these for validity.

cache_mgr webmaster

An e-mail address to which Squid sends a message if it unexpectedly crashes. The default is webmaster.

logfile_rotate 0

If you run squid -k rotate, Squid can rotate secured log files. The files are numbered in this process and, after reaching the specified value, the oldest file is overwritten. The default value is 0 because archiving and deleting log files in openSUSE Leap is carried out by a cron job set in the configuration file /etc/logrotate/squid.

append_domain <domain>

With append_domain, specify which domain to append automatically when none is given. Usually, your own domain is entered here, so entering www in the browser accesses your own Web server.

forwarded_for on

If you set the entry to off, Squid removes the IP address and the system name of the client from HTTP requests. Otherwise it adds a line to the header like

X-Forwarded-For: 192.168.0.1
negative_ttl 5 minutes; negative_dns_ttl 5 minutes

Normally, you do not need to change these values. If you have a dial-up connection, however, the Internet may, at times, not be accessible. Squid makes a note of the failed requests then refuses to issue new ones, although the Internet connection has been reestablished. In a case such as this, change the minutes to seconds. Then, after clicking Reload in the browser, the dial-up process should be reengaged after a few seconds.

never_direct allow acl_name

To prevent Squid from taking requests directly from the Internet, use the above command to force connection to another proxy. This must have previously been entered in cache_peer. If all is specified as the acl_name, force all requests to be forwarded directly to the parent. This might be necessary, for example, if you are using a provider that strictly stipulates the use of its proxies or denies its firewall direct Internet access.

26.4.2 Options for Access Controls

Squid provides a detailed system for controlling the access to the proxy. By implementing ACLs, it can be configured easily and comprehensively. This involves lists with rules that are processed sequentially. ACLs must be defined before they can be used. Some default ACLs, such as all and localhost, already exist. However, the mere definition of an ACL does not mean that it is actually applied. This only happens with http_access rules.

acl <acl_name> <type> <data>

An ACL requires at least three specifications to define it. The name <acl_name> can be chosen arbitrarily. For <type>, select from a variety of different options, which can be found in the ACCESS CONTROLS section in the /etc/squid/squid.conf file. The specification for <data> depends on the individual ACL type and can also be read from a file, for example, via host names, IP addresses, or URLs. The following are some simple examples:

acl mysurfers srcdomain .my-domain.com
acl teachers src 192.168.1.0/255.255.255.0
acl students src 192.168.7.0-192.168.9.0/255.255.255.0
acl lunch time MTWHF 12:00-15:00
http_access allow <acl_name>

http_access defines who is allowed to use the proxy and who can access what on the Internet. For this, ACLs must be given. localhost and all have already been defined above, which can deny or allow access via deny or allow. A list containing any number of http_access entries can be created, processed from top to bottom, and, depending on which occurs first, access is allowed or denied to the respective URL. The last entry should always be http_access deny all. In the following example, the localhost has free access to everything while all other hosts are denied access completely.

http_access allow localhost
http_access deny all

In another example using these rules, the group teachers always has access to the Internet. The group students only gets access Monday to Friday during lunch time.

http_access deny localhost
http_access allow teachers
http_access allow students lunch time
http_access deny all

The list with the http_access entries should only be entered, for the sake of readability, at the designated position in the /etc/squid/squid.conf file. That is, between the text

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR
# CLIENTS

and the last

http_access deny all
redirect_program /usr/bin/squidGuard

With this option, specify a redirector such as squidGuard, which allows the blocking of unwanted URLs. Internet access can be individually controlled for various user groups with the help of proxy authentication and the appropriate ACLs. squidGuard is a separate package that can be installed and configured.

auth_param basic program /usr/sbin/pam_auth

If users must be authenticated on the proxy, set a corresponding program, such as pam_auth. When accessing pam_auth for the first time, the user sees a login window in which to enter the user name and password. In addition, an ACL is still required, so only clients with a valid login can use the Internet:

acl password proxy_auth REQUIRED

http_access allow password
http_access deny all

The REQUIRED after proxy_auth can be replaced with a list of permitted user names or with the path to such a list.

ident_lookup_access allow <acl_name>

With this, have an ident request run for all ACL-defined clients to find each user's identity. If you apply all to the <acl_name>, this is valid for all clients. Also, an ident daemon must be running on all clients. For Linux, install the pidentd package for this purpose. For Microsoft Windows, free software is available for download from the Internet. To ensure that only clients with a successful ident lookup are permitted, define a corresponding ACL here:

acl identhosts ident REQUIRED

http_access allow identhosts
http_access deny all

Here, too, replace REQUIRED with a list of permitted user names. Using ident can slow down the access time quite a bit, because ident lookups are repeated for each request.

26.5 Configuring a Transparent Proxy

The usual way of working with proxy servers is the following: the Web browser sends requests to a certain port in the proxy server and the proxy provides these required objects, whether they are in its cache or not. When working in a network, several situations may arise:

  • For security reasons, it is recommended that all clients use a proxy to surf the Internet.

  • All clients must use a proxy, regardless of whether they are aware of it.

  • The proxy in a network is moved, but the existing clients need to retain their old configuration.

In all these cases, a transparent proxy may be used. The principle is very easy: the proxy intercepts and answers the requests of the Web browser, so the Web browser receives the requested pages without knowing from where they are coming. As the name indicates, the entire process is done transparently.

26.5.1 Configuration Options in /etc/squid/squid.conf

To inform squid that it should act as a transparent proxy, use the option transparent at the tag http_port in the main configuration file /etc/squid/squid.conf. After restarting squid, the only other thing that must be done is to reconfigure the firewall to redirect the HTTP port to the port given in http_port. In the following squid configuration line, this would be the port 3128.

http_port 3128 transparent

26.5.2 Firewall Configuration with SuSEFirewall2

Now redirect all incoming requests via the firewall with help of a port forwarding rule to the Squid port. To do this, use the enclosed tool SuSEFirewall2, described in Book “Security Guide”, Chapter 15 “Masquerading and Firewalls”, Section 15.4.1 “Configuring the Firewall with YaST”. Its configuration file can be found in /etc/sysconfig/SuSEfirewall2. The configuration file consists of well-documented entries. To set a transparent proxy, you must configure several firewall options:

  • Device pointing to the Internet: FW_DEV_EXT="eth1"

  • Device pointing to the network: FW_DEV_INT="eth0"

Define ports and services (see /etc/services) on the firewall that are accessed from untrusted (external) networks such as the Internet. In this example, only Web services are offered to the outside:

FW_SERVICES_EXT_TCP="www"

Define ports or services (see /etc/services) on the firewall that are accessed from the secure (internal) network, both via TCP and UDP:

FW_SERVICES_INT_TCP="domain www 3128"
FW_SERVICES_INT_UDP="domain"

This allows accessing Web services and Squid (whose default port is 3128). The service domain stands for DNS (domain name service). This service is commonly used. Otherwise, simply take it out of the above entries and set the following option to no:

FW_SERVICE_DNS="yes"

The most important option is option number 15:

Example 26.1: Firewall Configuration: Option 15
# 15.)
# Which accesses to services should be redirected to a local port on
# the firewall machine?
#
# This option can be used to force all internal users to surf via
# your squid proxy, or transparently redirect incoming webtraffic to
# a secure webserver.
#
# Format: 
# list of <source network>[,<destination network>,<protocol>[,dport[:lport]]
# Where protocol is either tcp or udp. dport is the original
# destination port and lport the port on the local machine to
# redirect the traffic to
#
# An exclamation mark in front of source or destination network
# means everything EXCEPT the specified network
#
# Example: "10.0.0.0/8,0/0,tcp,80,3128 0/0,172.20.1.1,tcp,80,8080"

The comments above show the syntax to follow. First, enter the IP address and the netmask of the internal networks accessing the proxy firewall. Second, enter the IP address and the netmask to which these clients send their requests. In the case of Web browsers, specify the networks 0/0, a wild card that means to everywhere. After that, enter the original port to which these requests are sent and, finally, the port to which all these requests are redirected. Because Squid supports protocols other than HTTP, redirect requests from other ports to the proxy, such as FTP (port 21), HTTPS, or SSL (port 443). In this example, Web services (port 80) are redirected to the proxy port (port 3128). If there are more networks or services to add, they must be separated by a blank space in the respective entry.

FW_REDIRECT="192.168.0.0/16,0/0,tcp,80,3128"

To start the firewall and the new configuration with it, change an entry in the /etc/sysconfig/SuSEfirewall2 file. The entry START_FW must be set to "yes".

Start Squid as shown in Section 26.3, “Starting Squid”. To verify that everything is working properly, check the Squid log files in /var/log/squid/access.log. To verify that all ports are correctly configured, perform a port scan on the machine from any computer outside your network. Only the Web services (port 80) should be open. To scan the ports with nmap, the command syntax is nmap -O IP_address.

26.6 cachemgr.cgi

The cache manager (cachemgr.cgi) is a CGI utility for displaying statistics about the memory usage of a running Squid process. It is also a more convenient way to manage the cache and view statistics without logging the server.

26.6.1 Setup

First, a running Web server on your system is required. Configure Apache as described in Chapter 24, The Apache HTTP Server. To check if Apache is already running, as root enter the command systemctl status apache2. Otherwise, enter systemctl start apache2 to start Apache with the openSUSE Leap default settings. The last step to set it up is to copy the file cachemgr.cgi to the Apache directory cgi-bin. For 32-bit, this works as follows:

cp /usr/lib/squid/cachemgr.cgi /srv/www/cgi-bin/

In a 64-bit environment, the file cachemgr.cgi is located below /usr/lib64/squid/ and the command to copy it to the Apache directory is the following:

cp /usr/lib64/squid/cachemgr.cgi /srv/www/cgi-bin/

26.6.2 Cache Manager ACLs in /etc/squid/squid.conf

There are some default settings in the original file required for the cache manager. First, two ACLs are defined, then http_access options use these ACLs to grant access from the CGI script to Squid. The first ACL is the most important, because the cache manager tries to communicate with Squid over the cache_object protocol.

acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255

The following rules give Apache the access rights to Squid:

http_access allow manager localhost
http_access deny manager

These rules assume that the Web server and Squid are running on the same machine. If the communication between the cache manager and Squid originates at the Web server on another computer, include an extra ACL as in Example 26.2, “Access Rules”.

Example 26.2: Access Rules
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl webserver src 192.168.1.7/255.255.255.255 # webserver IP

Then add the rules in Example 26.3, “Access Rules” to permit access from the Web server.

Example 26.3: Access Rules
http_access allow manager localhost
http_access allow manager webserver
http_access deny manager

Configure a password for the manager for access to more options, like closing the cache remotely or viewing more information about the cache. For this, configure the entry cachemgr_passwd with a password for the manager and the list of options to view. This list appears as a part of the entry comments in /etc/squid/squid.conf.

Restart Squid every time the configuration file is changed. Do this easily with systemctl reload squid.

26.6.3 Viewing the Statistics

Go to the corresponding Web site—http://webserver.example.org/cgi-bin/cachemgr.cgi. Press continue and browse through the different statistics.

26.7 squidGuard

This section is not intended to explain an extensive configuration of squidGuard, only to introduce it and give some advice for using it. For more in-depth configuration issues, refer to the squidGuard Web site at http://www.squidguard.org.

squidGuard is a free (GPL), flexible, and fast filter, redirector, and access controller plug-in for Squid. It lets you define multiple access rules with different restrictions for different user groups on a Squid cache. squidGuard uses Squid's standard redirector interface. squidGuard can do the following:

  • Limit Web access for some users to a list of accepted or well-known Web servers or URLs.

  • Block access to some listed or blacklisted Web servers or URLs for some users.

  • Block access to URLs matching a list of regular expressions or words for some users.

  • Redirect blocked URLs to an intelligent CGI-based information page.

  • Redirect unregistered users to a registration form.

  • Redirect banners to an empty GIF.

  • Use different access rules based on time of day, day of the week, date, etc.

  • Use different rules for different user groups.

squidGuard and Squid cannot be used to:

  • Edit, filter, or censor text inside documents.

  • Edit, filter, or censor HTML-embedded script languages, such as JavaScript or VBscript.

Before it can be used, install squidGuard. Provide a minimal configuration file as /etc/squidguard.conf. Find configuration examples in http://www.squidguard.org/Doc/examples.html. Experiment later with more complicated configuration settings.

Next, create a dummy access denied page or a more or less complex CGI page to redirect Squid if the client requests a blacklisted Web site. Using Apache is strongly recommended.

Now, configure Squid to use squidGuard. Use the following entry in the /etc/squid/squid.conf file:

redirect_program /usr/bin/squidGuard

Another option called redirect_children configures the number of redirect (in this case squidGuard) processes running on the machine. The more processes you set, the more RAM is required. Try low numbers first, for example 4.

redirect_children 4

Last, have Squid load the new configuration by running systemctl reload squid. Now, test your settings with a browser.

26.8 Cache Report Generation with Calamaris

Calamaris is a Perl script used to generate reports of cache activity in ASCII or HTML format. It works with native Squid access log files. The Calamaris home page is located at http://Calamaris.Cord.de/. This tool does not belong to the openSUSE Leap default installation scope—to use it, install the calamaris package.

Log in as root then enter

cat access1.log [access2.log access3.log] | calamaris options > reportfile

When using more than one log file, make sure they are chronologically ordered, with older files listed first. This can be achieved by either listing the files one after the other as in the example above, or by using access{1..3}.log.

calamaris takes the following options:

-a

output all available reports

-w

output as HTML report

-l

include a message or logo in report header

More information about the various options can be found in the program's manual page with man calamaris.

A typical example is:

cat access.log.{10..1} access.log | calamaris -a -w \ 
> /usr/local/httpd/htdocs/Squid/squidreport.html

This puts the report in the directory of the Web server. Apache is required to view the reports.

26.9 For More Information

Visit the home page of Squid at http://www.squid-cache.org/. Here, find the Squid User Guide and a very extensive collection of FAQs on Squid.

Following the installation, a small HOWTO about transparent proxies is available in howtoenh as /usr/share/doc/howto/en/txt/TransparentProxy.gz. In addition, mailing lists are available for Squid at http://www.squid-cache.org/Support/mailing-lists.html.

Print this page