mod_perl Strategy and Implementation - Part 2

By: Stas Bekman


Summary

This month we will look at a more advanced mod_perl setup, which deploys two Apache servers: one plain and one mod_perl-enabled. This setup allows much better tuning of the system and uses significantly less memory than the one I presented in the previous article. First we will go through the pros and cons of this approach, and then I'll show the installation and configuration processes in detail.


One Plain and One mod_perl-enabled Apache Servers Approach

As I mentioned in the previous articles, when running scripts under mod_perl you will notice that the httpd processes consume a huge amount of memory, from 5M to 25M or even more. That is the price you pay for the enormous speed improvements under mod_perl. (Memory shared between the processes keeps the real cost per process smaller.)

Using these large processes to serve static objects like images and HTML documents is overkill. A better approach is to run two servers: a very light, plain Apache server to serve static objects and a heavier mod_perl-enabled Apache server to serve requests for dynamic (generated) objects (aka CGI).

From here on, I will refer to these two servers as httpd_docs (vanilla apache) and httpd_perl (mod_perl enabled apache).

The advantages:

An important note: When a user browses static pages and the base URL in the Location window points to the static server, for example http://www.nowhere.com/index.html, all relative URLs (e.g. <A HREF="/main/download.html">) are served by the light plain Apache server. But this is not the case with dynamically generated pages. When the base URL in the Location window points to the dynamic server (e.g. http://www.nowhere.com:8080/perl/index.pl), all relative URLs in the dynamically generated HTML will be served by the heavy mod_perl processes. You must use fully qualified URLs and not relative ones! http://www.nowhere.com/icons/arrow.gif is a full URL, while /icons/arrow.gif is a relative one.

Using <BASE HREF="http://www.nowhere.com/"> in the generated HTML is another way to handle this problem. The httpd_perl server could also rewrite such requests back to httpd_docs, but that is much slower and each request still needs the attention of a heavy server. None of this is an issue if you hide the internal port implementation, so that the client sees only one server running on port 80. (I'll cover this issue in one of the upcoming articles.)
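
To illustrate the point about fully qualified URLs, here is a minimal shell sketch, not part of the original setup. The qualify_urls helper is hypothetical: it reads HTML on stdin and prefixes every HREF="/..." or SRC="/..." value with the static server's address. It only handles upper-case attribute names written exactly as in the example above.

```shell
#!/bin/sh
# Address of the static (httpd_docs) server, taken from the example above.
BASE_URL="http://www.nowhere.com"

# qualify_urls (hypothetical helper): read HTML on stdin and write it to
# stdout with HREF="/..." and SRC="/..." turned into fully qualified URLs.
# URLs that already start with http:// are left alone, because the
# pattern requires a slash right after the opening quote.
qualify_urls() {
    sed -e "s|\(HREF=\"\)/|\1${BASE_URL}/|g" \
        -e "s|\(SRC=\"\)/|\1${BASE_URL}/|g"
}

echo '<IMG SRC="/icons/arrow.gif">' | qualify_urls
```

In a real application you would more likely build the full URLs inside the script that generates the page; the filter only shows what the generated HTML should end up looking like.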

The disadvantages:


Installation and Configuration processes

Since we are going to run two Apache servers, we will need two different sets of configuration, log and other files, and a special directory layout to hold them. While some of the directories can be shared between the two servers (assuming that both are built from the same source distribution), others should be kept separate. As before, I will refer to the two servers as httpd_docs (vanilla Apache) and httpd_perl (Apache/mod_perl).

For this illustration, we will use /usr/local as our root directory. The Apache installation directories will be stored under this root (/usr/local/bin, /usr/local/etc and so on).

First let's prepare the sources. We will assume that all the sources go into the /usr/src directory. It is better to use two separate copies of the Apache sources, since you will probably want to tune each Apache version separately and to make modifications and recompile as time goes by. Having two independent source trees will prove helpful, unless you use DSO, which is covered later in this section.

Make two subdirectories:

  % mkdir /usr/src/httpd_docs
  % mkdir /usr/src/httpd_perl

Put the Apache sources into the /usr/src/httpd_docs directory:

  % cd /usr/src/httpd_docs
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -

If you have GNU tar:

  % tar xvzf /tmp/apache_x.x.x.tar.gz

Replace /tmp with the path to the downloaded file and x.x.x with the version of the server you have.

  % cd /usr/src/httpd_docs
  
  % ls -l
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 apache_x.x.x/

Now we will prepare the httpd_perl server sources:

  % cd /usr/src/httpd_perl
  % gzip -dc /tmp/apache_x.x.x.tar.gz | tar xvf -
  % gzip -dc /tmp/mod_perl-x.xx.tar.gz | tar xvf -
  
  % ls -l
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 apache_x.x.x/
  drwxr-xr-x  8 stas  stas 2048 Apr 29 17:38 mod_perl-x.xx/

Time to decide on the desired directory structure layout (where the apache files go):

  ROOT = /usr/local

The two servers can share the following directories (so we will not duplicate data):

  /usr/local/bin/
  /usr/local/lib/
  /usr/local/include/
  /usr/local/man/
  /usr/local/share/

Important: we assume that both servers are built from the same Apache source version.

Servers store their specific files either in httpd_docs or httpd_perl sub-directories:

  /usr/local/etc/httpd_docs/
                 httpd_perl/
  
  /usr/local/sbin/httpd_docs/
                  httpd_perl/
  
  /usr/local/var/httpd_docs/logs/
                            proxy/
                            run/
                 httpd_perl/logs/
                            proxy/
                            run/
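
The per-server part of this layout can be previewed with a short shell sketch. The installation step should create these directories itself, so this is only an illustration of the tree; the make_layout function name is my own, and ROOT defaults to a scratch directory so that nothing under /usr/local is touched by accident.

```shell
#!/bin/sh
# make_layout ROOT: create the server-specific directories described
# above (etc, sbin and var/{logs,proxy,run}) for both servers.
make_layout() {
    root=$1
    for server in httpd_docs httpd_perl; do
        mkdir -p "$root/etc/$server" "$root/sbin/$server" || return 1
        for sub in logs proxy run; do
            mkdir -p "$root/var/$server/$sub" || return 1
        done
    done
}

# Use a scratch directory unless ROOT is set explicitly.
ROOT=${ROOT:-$(mktemp -d)}
make_layout "$ROOT" && find "$ROOT" -type d | sort
```

Running it with ROOT=/usr/local would create the real tree, which normally requires root permissions.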

After both servers have been compiled and installed, you will need to configure them. To make things clear before we proceed into details: /usr/local/etc/httpd_docs/httpd.conf should be configured as a plain Apache, with the Port directive set to 80, for example. /usr/local/etc/httpd_perl/httpd.conf should be configured for the mod_perl server, and of course its Port should be different from the one the httpd_docs server listens to (e.g. 8080). The port numbers issue will be discussed later.

The next step is to configure and compile the sources. Below are the procedures to compile both servers, taking into account the directory layout I have just suggested.


Configuration and Compilation of the Sources

Let's proceed with installation. I will use x.x.x instead of real version numbers so this article will never become obsolete :).


Building the httpd_docs Server

Sources Configuration:

  % cd /usr/src/httpd_docs/apache_x.x.x
  % make clean
  % env CC=gcc \
  ./configure --prefix=/usr/local \
    --sbindir=/usr/local/sbin/httpd_docs \
    --sysconfdir=/usr/local/etc/httpd_docs \
    --localstatedir=/usr/local/var/httpd_docs \
    --runtimedir=/usr/local/var/httpd_docs/run \
    --logfiledir=/usr/local/var/httpd_docs/logs \
    --proxycachedir=/usr/local/var/httpd_docs/proxy

If you need some other modules, like mod_rewrite and mod_include (SSI), add them here as well:

    --enable-module=include --enable-module=rewrite

Note: gcc compiles an httpd that is 100K+ smaller than the one built by cc on AIX. Remove the line env CC=gcc if you want to use the default compiler. If you do want to use gcc and you are a (ba)?sh user, you will not need the env command; t?csh users will have to keep it in.

Note: add --show-layout (--layout in older releases) to see the resulting directory layout without actually running the configuration process.

Sources Compilation:

  % make
  % make install

Rename httpd to httpd_docs:

  % mv /usr/local/sbin/httpd_docs/httpd \
  /usr/local/sbin/httpd_docs/httpd_docs

Now update the apachectl utility to point to the renamed httpd, either with your favorite text editor or by using perl:

  % perl -p -i -e 's|httpd_docs/httpd|httpd_docs/httpd_docs|' \
  /usr/local/sbin/httpd_docs/apachectl
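
The rename and the apachectl update can also be wrapped into one small shell function, which is handy because the same two steps are needed for each server. This is only a sketch: rename_httpd is a hypothetical name, it applies the same substitution as the perl one-liner above, and it refuses to run a second time so that the path is not rewritten twice.

```shell
#!/bin/sh
# rename_httpd SBIN_ROOT SERVER
# Rename SBIN_ROOT/SERVER/httpd to SBIN_ROOT/SERVER/SERVER and update
# the apachectl script in the same directory to point to the new name.
rename_httpd() {
    dir=$1/$2
    # Stop if the binary was already renamed (or never installed).
    [ -f "$dir/httpd" ] || { echo "no $dir/httpd to rename" >&2; return 1; }
    mv "$dir/httpd" "$dir/$2" || return 1
    # Go through a temporary file, since in-place editing with sed is
    # not portable. Writing back with cat keeps apachectl's file mode.
    sed "s|$2/httpd|$2/$2|" "$dir/apachectl" > "$dir/apachectl.new" &&
        cat "$dir/apachectl.new" > "$dir/apachectl" &&
        rm -f "$dir/apachectl.new"
}
```

With the layout used in this article it would be called as rename_httpd /usr/local/sbin httpd_docs.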


Building the httpd_perl (mod_perl enabled) Server

Before you start to configure the mod_perl sources, you should be aware that there are a few Perl modules that have to be installed before building mod_perl. You will be alerted if any required modules are missing when you run the perl Makefile.PL command line below. If you discover that some are missing, pick them from your nearest CPAN repository (if you do not know what that is, visit http://www.perl.com/CPAN ) or run the CPAN interactive shell via the command line perl -MCPAN -e shell.

Make sure the sources are clean:

  % cd /usr/src/httpd_perl/apache_x.x.x
  % make clean
  % cd /usr/src/httpd_perl/mod_perl-x.xx
  % make clean

It is important to make clean, since some of the versions are not binary compatible (e.g. Apache 1.3.3 vs 1.3.4), so any ``third-party'' C modules need to be recompiled against the latest header files.

Here I did not find a way to compile with gcc: my perl was compiled with cc, so mod_perl has to be compiled with the same compiler.

  % cd /usr/src/httpd_perl/mod_perl-x.xx

  % /usr/local/bin/perl Makefile.PL \
  APACHE_PREFIX=/usr/local/ \
  APACHE_SRC=../apache_x.x.x/src \
  DO_HTTPD=1 \
  USE_APACI=1 \
  PERL_MARK_WHERE=1 \
  PERL_STACKED_HANDLERS=1 \
  ALL_HOOKS=1 \
  APACI_ARGS=--sbindir=/usr/local/sbin/httpd_perl, \
         --sysconfdir=/usr/local/etc/httpd_perl, \
         --localstatedir=/usr/local/var/httpd_perl, \
         --runtimedir=/usr/local/var/httpd_perl/run, \
         --logfiledir=/usr/local/var/httpd_perl/logs, \
         --proxycachedir=/usr/local/var/httpd_perl/proxy

Notice that all the APACI_ARGS (above) must be passed as one long line if you work with t?csh. With (ba)?sh it works correctly the way it is shown above (breaking the long lines with '\'). With t?csh it does not work, since t?csh passes the APACI_ARGS arguments to ./configure with the newlines untouched but with the original '\' stripped, thus breaking the configuration process.

As with httpd_docs you might need other modules like mod_rewrite, so add them here:

         --enable-module=rewrite

Note: PERL_STACKED_HANDLERS=1 is needed for Apache::DBI
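
For t?csh users who need the APACI_ARGS value as one long line, the following shell sketch assembles it from the server name. The apaci_args function is a hypothetical helper, not something mod_perl provides. It prints the same comma-separated options that are shown above, on a single line, and assumes /usr/local as the root unless a second argument is given.

```shell
#!/bin/sh
# apaci_args SERVER [ROOT]
# Print the APACI_ARGS value for SERVER as one line, without a trailing
# newline, using the directory layout chosen in this article.
apaci_args() {
    server=$1
    root=${2:-/usr/local}
    # printf repeats the '%s' format for every argument, so the pieces
    # are simply concatenated.
    printf '%s' \
        "--sbindir=$root/sbin/$server, " \
        "--sysconfdir=$root/etc/$server, " \
        "--localstatedir=$root/var/$server, " \
        "--runtimedir=$root/var/$server/run, " \
        "--logfiledir=$root/var/$server/logs, " \
        "--proxycachedir=$root/var/$server/proxy"
}

apaci_args httpd_perl; echo
```

The printed line can be pasted in place of the multi-line value in the perl Makefile.PL command.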

Now, build, test and install the httpd_perl.

  % make && make test && make install

Note: apache puts a stripped version of httpd at /usr/local/sbin/httpd_perl/httpd. The original version which includes debugging symbols (if you need to run a debugger on this executable) is located at /usr/src/httpd_perl/apache_x.x.x/src/httpd.

Note: You may have noticed that we did not run make install in the apache's source directory. When USE_APACI is enabled, APACHE_PREFIX will specify the --prefix option for apache's configure utility, specifying the installation path for apache. When this option is used, mod_perl's make install will also make install on the apache side, installing the httpd binary, support tools, along with the configuration, log and document trees.

If make test fails, look into t/logs and see what is in there.

While running perl Makefile.PL ..., mod_perl might warn you about a missing libgdbm. Users have reported that it is actually crucial, and that you must have it in order to complete the mod_perl build successfully.

Now rename the httpd to httpd_perl:

  % mv /usr/local/sbin/httpd_perl/httpd \
  /usr/local/sbin/httpd_perl/httpd_perl

Update the apachectl utility to point to the renamed httpd:

  % perl -p -i -e 's|httpd_perl/httpd|httpd_perl/httpd_perl|' \
  /usr/local/sbin/httpd_perl/apachectl


Configuration of the servers

Now that we have completed the building process, the last stage before running the servers is to configure them.


Basic httpd_docs Server's Configuration

Configuring the httpd_docs server is a very easy task. Open /usr/local/etc/httpd_docs/httpd.conf in your favorite editor (starting from version 1.3.4 of Apache there is only one file to edit) and configure it as you always do. Make sure you configure the log files and other paths according to the directory layout we decided to use.

Start the server with:

  /usr/local/sbin/httpd_docs/apachectl start


Basic httpd_perl Server's Configuration

Here we will make a basic configuration of the httpd_perl server. We edit the /usr/local/etc/httpd_perl/httpd.conf file. As with the httpd_docs server configuration, make sure that ErrorLog and the other file location directives are set to point to the right places, according to the chosen directory layout.

The first thing to do is to set the Port directive. It should be different from 80, since we cannot bind two servers to the same port number on the same machine. Here we will use 8080. Some developers use port 81, but you can bind to it only if you have root permissions. If you are running on a multiuser machine, there is a chance that someone already uses the port you picked, or will start using it in the future, which might cause a collision. If you are the only user on your machine, you can pick basically any unused port number.

Choosing a port number is a controversial topic, since many organizations use firewalls, which may block some of the ports or allow only known ones. From my experience the most used port numbers are 80, 81, 8000 and 8080. Personally, I prefer port 8080. Of course, with the two-server scenario you can hide the nonstandard port number from firewalls and users, by using either mod_proxy's ProxyPass or a proxy server like squid.
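
The root permission remark comes from the usual Unix rule that ports below 1024 are reserved for root. A small shell sketch of that rule, applied to the port numbers mentioned above (is_privileged_port is a hypothetical helper name):

```shell
#!/bin/sh
# is_privileged_port PORT: succeed if binding to PORT traditionally
# requires root permissions, i.e. if PORT is in the range 1..1023.
is_privileged_port() {
    [ "$1" -gt 0 ] && [ "$1" -lt 1024 ]
}

for port in 80 81 8000 8080; do
    if is_privileged_port "$port"; then
        echo "$port: root permissions needed"
    else
        echo "$port: any user can bind to it"
    fi
done
```

This says nothing about whether a port is already taken by someone else; checking that depends on the tools available on your system.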

Now we proceed to the mod_perl-specific directives. It is a good idea to add them all at the end of httpd.conf, since you are going to fiddle with them a lot at the beginning.

First, you need to specify the location where all mod_perl scripts will be located.

Add the following configuration directive:

  # mod_perl scripts will be called from URLs starting with /perl/
  Alias /perl/ /usr/local/myproject/perl/

From now on, all requests starting with /perl will be executed under mod_perl and will be mapped to the files in /usr/local/myproject/perl/.

Now we should configure the /perl location.

  PerlModule Apache::Registry

  <Location /perl>
    #AllowOverride None
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>

This configuration causes all scripts that are called with a /perl path prefix to be executed under the Apache::Registry module and as CGI, hence the ExecCGI option. If you omit this option, the script will be printed to the user's browser as plain text or will possibly trigger a 'Save-As' window. The Apache::Registry module lets you run almost unaltered CGI/perl scripts under mod_perl. The PerlModule directive is the equivalent of perl's require(). We load the Apache::Registry module before we use it in the PerlHandler directive in the Location configuration.

PerlSendHeader On tells the server to send an HTTP header to the browser on every script invocation. You will want to turn this off for nph (non-parsed-headers) scripts.
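
To try this configuration out, you need at least one script in the aliased directory. The shell sketch below writes a minimal one. The file name test.pl and the text it prints are my own choices, and PERL_DIR defaults to a scratch directory; set it to /usr/local/myproject/perl (the Alias target above) to use it with the real server.

```shell
#!/bin/sh
# Directory to put the test script into. Defaults to a scratch
# directory so that running this sketch changes nothing on the system.
PERL_DIR=${PERL_DIR:-$(mktemp -d)}

# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > "$PERL_DIR/test.pl" <<'EOF'
#!/usr/local/bin/perl -w
use strict;
# With PerlSendHeader On it is enough to print the Content-type line,
# followed by an empty line, before the body.
print "Content-type: text/plain\n\n";
# mod_perl sets the MOD_PERL environment variable.
print "Running under: ", ($ENV{MOD_PERL} || "plain CGI"), "\n";
EOF

# Make the script executable, as for a regular CGI script.
chmod 755 "$PERL_DIR/test.pl"
echo "wrote $PERL_DIR/test.pl"
```

With the Alias shown above and port 8080, the script would then be requested as http://www.nowhere.com:8080/perl/test.pl once the server is running.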

This is only a very basic configuration; in future articles I'll show more advanced configuration techniques.

Now start the server with:

  /usr/local/sbin/httpd_perl/apachectl start
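
Once both servers are started, a quick way to see which one answers on which port is to look at the Server header of the response. The sketch below assumes that curl is installed; any tool that can issue a HEAD request will do. Both function names are hypothetical.

```shell
#!/bin/sh
# server_header: read HTTP response headers on stdin and print the
# value of the Server header, if there is one.
server_header() {
    tr -d '\r' | sed -n 's/^[Ss]erver: *//p'
}

# check_server URL: send a HEAD request with curl (assumed to be
# installed) and print the Server header of the reply.
check_server() {
    curl -s -I "$1" | server_header
}
```

For example, check_server http://www.nowhere.com/ should report a plain Apache, while check_server http://www.nowhere.com:8080/ would normally mention mod_perl as well.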


Next month

Next month I'll talk about proxy servers and show how they improve mod_perl's performance.