In the last article we finished setting up squid. Almost...
When you try the presented setup, you will be surprised and upset to discover port 81 showing up in the URLs of the static objects (such as HTML pages). We did not want users to see port 81 and use it instead of port 80, because requests to it bypass the squid server and all the hard work we went through would be wasted.
The solution is to run both squid and httpd_docs on the same port. This can
be accomplished by binding each one to a specific interface. Modify the httpd.conf in the httpd_docs configuration directory:
  Port 80
  BindAddress 127.0.0.1
  Listen 127.0.0.1:80
Modify the squid.conf:
  http_port 80
  tcp_incoming_address 123.123.123.3
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80
Here 123.123.123.3 should be replaced with the IP address of your main server. Now restart squid and
httpd_docs in either order, and voila, the port number is gone.
You must also have the following entry in /etc/hosts (chances are that it is already there):
127.0.0.1 localhost.localdomain localhost
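The reason this setup works is that a listening socket is identified by the pair of address and port, not by the port alone. The following is a minimal sketch of that idea, not part of the squid or apache setup itself. The function name `can_bind_same_port` is made up for illustration, and it uses an arbitrary free port instead of port 80 so that it needs no root privileges. The second call in the usage part assumes that 127.0.0.2 is configured on the host, which is usual on Linux but not guaranteed elsewhere.

```python
import socket


def can_bind_same_port(addr_a: str, addr_b: str) -> bool:
    """Return True if two TCP sockets can listen on the same port,
    one bound to addr_a and the other to addr_b (hypothetical helper)."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port; the real setup uses port 80.
        first.bind((addr_a, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Try to take the very same port on the second address.
            second.bind((addr_b, port))
            second.listen(1)
        except OSError:
            return False
        return True
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    # Same address twice: the second bind is refused.
    print(can_bind_same_port("127.0.0.1", "127.0.0.1"))
    # Two different addresses: expected to succeed if both are configured.
    print(can_bind_same_port("127.0.0.1", "127.0.0.2"))
```

This mirrors what the two configuration files do: httpd_docs takes port 80 on 127.0.0.1, and squid takes port 80 on the public address.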
Now, if your scripts generate HTML that includes fully qualified self-references using port 8080 or another port, fix them to generate links that point to port 80 (which means not specifying the port at all). If you do not, users will bypass squid as if it were not there at all, by making direct requests to the mod_perl server's port.
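As a rough illustration of what such a fix might look like, here is a small sketch of a link builder that leaves the port out whenever it is the default one. It is written in Python only to keep the example short and checkable; the same logic applies in a mod_perl script. The function name `self_url` and the host name are invented for this example.

```python
def self_url(host: str, path: str, port: int = 80, scheme: str = "http") -> str:
    """Build a fully qualified self-reference, omitting the default port.

    Hypothetical helper: the name and signature are illustrative only.
    """
    default_ports = {"http": 80, "https": 443}
    if not path.startswith("/"):
        path = "/" + path
    if default_ports.get(scheme) == port:
        # Default port: do not mention it, so the link goes through squid.
        return f"{scheme}://{host}{path}"
    return f"{scheme}://{host}:{port}{path}"


if __name__ == "__main__":
    print(self_url("www.example.com", "/perl/test.pl"))
    print(self_url("www.example.com", "/perl/test.pl", port=8080))
```

Generating every self-reference through one such helper means the port decision lives in a single place.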
The only question left is what to do about users who bookmarked your services and still have port 8080 in the URL. Do not worry about it. The most important thing is for your scripts to return full URLs, so if a user arrives through a link with port 8080 in it, let it be. Just make sure that all subsequent links to your server are generated correctly. Over time users will change their bookmarks. You can send them an email if you have their addresses, or leave a note on your pages asking users to update their bookmarks. You could have avoided this problem by not publishing the non-80 port in the first place.
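If some links are stored somewhere (in templates or a database, say) with the old port already inside, one possible approach is to strip the port before emitting them. The sketch below assumes that the back-end ports are 81 and 8080, as in this setup. The function name `strip_backend_port` is hypothetical, and the code is simplified: it drops any user:password part and does not handle IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_backend_port(url: str, backend_ports=(81, 8080)) -> str:
    """Remove a back-end port from a URL so the link goes through port 80.

    Hypothetical helper; backend_ports is an assumption about your setup.
    """
    parts = urlsplit(url)
    if parts.port in backend_ports:
        # Keep only the host name; the default port is then implied.
        parts = parts._replace(netloc=parts.hostname)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(strip_backend_port("http://www.example.com:8080/perl/test.pl?a=1"))
```

URLs that carry no port, or a port that is not in the list, are returned unchanged.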
To save you some keystrokes, here is the whole modified squid.conf:
  http_port 80
  tcp_incoming_address 123.123.123.3
  tcp_outgoing_address 127.0.0.1
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80

  icp_port 0

  hierarchy_stoplist /cgi-bin /perl
  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY

  # debug_options ALL,1 28,9

  redirect_program /usr/lib/squid/redirect.pl
  redirect_children 10
  redirect_rewrites_host_header off

  request_size 1000 KB

  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT

  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all

  cache_effective_user squid
  cache_effective_group squid

  cache_mem 20 MB

  memory_pools on

  cachemgr_passwd disable shutdown
Note that all directives should start at the beginning of the line.
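If you copy a listing that has been indented for display, a few lines of script can strip the leading whitespace for you. This is only a convenience sketch; the function name `dedent_directives` is made up, and you should still review the result before handing it to squid.

```python
def dedent_directives(config_text: str) -> str:
    """Strip leading whitespace so every directive starts at column 0."""
    lines = (line.lstrip() for line in config_text.splitlines())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pasted = "  http_port 80\n  icp_port 0\n"
    print(dedent_directives(pasted), end="")
```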
When I was first told about squid, I thought: ``Hey, now I can drop the
httpd_docs server and have only the squid and httpd_perl servers''. Since
all my static objects would be cached by squid, I would not need the light
httpd_docs server. But that was a wrong assumption. Why? Because you still
have the overhead of loading the objects into squid the first time, and if
your site has many of them, not all of them will be cached (unless you have
devoted a huge chunk of memory to squid), so the heavy mod_perl servers
will still have the overhead of serving static objects. How would one
measure the overhead? The difference between the two servers is memory
consumption; everything else (e.g. I/O) should be equal. So you have to
estimate the time needed to fetch each static object for the first time
during a peak period, and from that the number of additional servers you
need for serving the static objects. This will allow you to calculate the
additional memory requirements. I can imagine that this amount could be
significant in some installations.
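To make this estimate concrete, here is a back-of-the-envelope sketch. It assumes that the number of servers busy with static objects is roughly the rate of cache misses multiplied by the time one fetch takes, and that the extra cost per server is the difference between a heavy and a light process. All numbers and the function name `extra_memory_mb` are invented for illustration; measure your own values.

```python
import math


def extra_memory_mb(misses_per_sec, seconds_per_fetch, heavy_mb, light_mb):
    """Estimate the extra servers and memory needed when heavy mod_perl
    servers, rather than light ones, serve static objects on cache misses.

    Returns (extra_servers, extra_megabytes). All inputs are assumptions.
    """
    # Servers kept busy at the same time = arrival rate * service time.
    extra_servers = math.ceil(misses_per_sec * seconds_per_fetch)
    # Each of them costs the heavy size instead of the light size.
    return extra_servers, extra_servers * (heavy_mb - light_mb)


if __name__ == "__main__":
    # Hypothetical peak: 20 misses/sec, 0.25 sec per fetch,
    # 10 MB per mod_perl process, 1 MB per plain apache process.
    servers, megabytes = extra_memory_mb(20, 0.25, 10, 1)
    print(servers, megabytes)
```

With these made-up figures the estimate is 5 extra servers and 45 MB of extra memory.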
So I have decided to accept even more administration overhead and stick with the squid, httpd_docs and httpd_perl scenario, where I can optimize and fine-tune everything. Of course this may not be your case. If you feel that the scenario from the previous section is too complicated for you, make it simpler. Have only one server with mod_perl built in and let squid do most of the job that the plain light apache used to do. As I explained in the previous paragraph, you should pick this lighter setup only if you can make squid cache most of your static objects. If it cannot, your mod_perl server will do the work we do not want it to do.
If you are still with me, install apache with mod_perl and squid. Then use
a configuration similar to the one from the previous section, but now
without httpd_docs. We do not need the redirector anymore either, and we
specify httpd_accel_host as the name of the server and not as virtual.
There is no need to bind two servers to the same port, because we do not
redirect, and there are neither BindAddress nor Listen directives in the
httpd.conf anymore.
The modified configuration (see the explanations in the previous section):
  httpd_accel_host put.your.hostname.here
  httpd_accel_port 8080
  http_port 80

  icp_port 0

  hierarchy_stoplist /cgi-bin /perl
  acl QUERY urlpath_regex /cgi-bin /perl
  no_cache deny QUERY

  # debug_options ALL,1 28,9

  # redirect_program /usr/lib/squid/redirect.pl
  # redirect_children 10
  # redirect_rewrites_host_header off

  request_size 1000 KB

  acl all src 0.0.0.0/0.0.0.0
  acl manager proto cache_object
  acl localhost src 127.0.0.1/255.255.255.255
  acl myserver src 127.0.0.1/255.255.255.255
  acl SSL_ports port 443 563
  acl Safe_ports port 80 81 8080 443 563
  acl CONNECT method CONNECT

  http_access allow manager localhost
  http_access allow manager myserver
  http_access deny manager
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  # http_access allow all

  cache_effective_user squid
  cache_effective_group squid

  cache_mem 20 MB

  memory_pools on

  cachemgr_passwd disable shutdown
That's all!
Next month I'll start the ``mod_perl coding guidelines'' series of articles.