In ``ordinary'' proxying, the client specifies the hostname and port number of a proxy in his web browsing software. The browser then makes requests to the proxy, and the proxy forwards them to the origin servers. This is all fine and good, but sometimes one of several situations arises. Either
This is where transparent proxying comes in. A web request can be intercepted by the proxy, transparently. That is, as far as the client software knows, it is talking to the origin server itself, when it is really talking to the proxy server.
Cisco routers support transparent proxying. But (surprisingly enough) Linux can act as a router, and can perform transparent proxying by redirecting TCP connections to local ports. However, we also need to make our web proxy aware of the effect of the redirection, so that it can make connections to the proper origin servers. There are two general ways this works:
The first is when your web proxy is not transparent proxy aware. You can use a nifty little daemon called transproxy that sits in front of your web proxy and takes care of all the messy details for you. transproxy was written by John Saunders, and is available from ftp://ftp.nlc.net.au/pub/linux/www/ or your local metalab mirror. transproxy will not be discussed further in this document.
A cleaner solution is to get a web proxy that is aware of transparent proxying itself. The one we are going to focus on here is squid. Squid is an Open Source caching proxy server for Unix systems. It is available from www.squid-cache.org.
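To make ``aware of transparent proxying'' a little more concrete, here is a minimal Python sketch of the problem such a proxy has to solve. A browser configured for an ordinary proxy puts the full URL in its request line, while an intercepted browser sends only a path, because it believes it is talking to the origin server. One common way to recover the origin in that case is to read the Host header. The function name resolve_origin and the example hostnames are hypothetical, and this is not squid's or transproxy's actual code, just an illustration under those assumptions.

```python
from urllib.parse import urlsplit


def resolve_origin(raw_request: bytes, default_port: int = 80):
    """Return (host, port, absolute_url) for one HTTP request head.

    Hypothetical helper for illustration only; it is not part of squid
    or transproxy. It assumes plain hostnames or IPv4 addresses.
    """
    # Keep only the request line and headers, dropping any body.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    if target.lower().startswith("http://"):
        # Ordinary proxying: the browser sent the full URL itself.
        parts = urlsplit(target)
        return parts.hostname, parts.port or default_port, target

    # Intercepted request: only a path was sent, so fall back to the
    # Host header to learn which origin server the client wanted.
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            hostport = value.strip()
            host, port_sep, port = hostport.partition(":")
            port_number = int(port) if port_sep and port else default_port
            return host, port_number, "http://" + hostport + target

    raise ValueError("no Host header; the origin server cannot be determined")


if __name__ == "__main__":
    intercepted = b"GET /index.html HTTP/1.0\r\nHost: www.example.com\r\n\r\n"
    print(resolve_origin(intercepted))
```

Note the failure case at the end: a redirected request that carries no Host header gives a sketch like this nothing to work with, which is one reason the proxy has to be set up specifically for transparent operation rather than left in its ordinary configuration.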
This document will focus on squid version 2.3 and linux kernel version 2.2, the most current stable releases as of this writing (March 2000). It should also work with squids as early as 2.0 and the later 2.1 linux kernels. Should you need information about earlier releases, you may find some earlier documents at www.unxsoft.com.
If you want to use linux 2.3, you will have to use a thing called netfilter instead of ipchains. However, it is assumed that if you are running a development kernel, you can figure out netfilter on your own from the provided documentation. If not, you really shouldn't be running a development kernel (trust me on this). Once linux 2.4 is released, this document will be updated to cover netfilter.
Note that this document focuses only on HTTP proxying. I get many emails asking about transparent FTP proxying. While it may not be theoretically impossible to proxy FTP transparently, it is MUCH harder than HTTP, and I do not know of any currently available tools that can do it. If you can figure it out, I suggest you write your own HOWTO...