The Apache/Perl integration project brings together the full power of the Perl programming language and the Apache HTTP server. With mod_perl it is possible to write Apache modules entirely in Perl. This lets you easily do things that are difficult or impossible in regular CGI programs, such as running subrequests. In addition, the persistent interpreter embedded in the server saves the overhead of starting an external Perl interpreter for every request, that is, the penalty of Perl start-up time. Another important feature is code caching: modules and scripts are loaded and compiled only once, and for the rest of the server's life they are served from the cache. The server therefore spends its time only running already loaded and compiled code, which is very fast.
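The compile-once idea can be illustrated in plain Perl. The following is only a sketch and not mod_perl's actual implementation: the names run_cached, %compiled and $compile_count are made up for this example, and a long-lived process is simulated by a simple loop over three "requests".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustration only: this is NOT how mod_perl is implemented.
# It mimics the idea of compiling a script once and reusing the
# compiled code for every later request in the same process.

my %compiled;             # script name => compiled code reference
my $compile_count = 0;    # how many times we really had to compile

sub run_cached {
    my ($name, $source, @args) = @_;

    # Compile the source into an anonymous sub only on first use.
    unless ($compiled{$name}) {
        $compiled{$name} = eval "sub { $source }"
            or die "cannot compile $name: $@";
        $compile_count++;
    }

    # Every call, including the first, runs the cached code.
    return $compiled{$name}->(@args);
}

# The "script" source, kept as a string so it is compiled at run time.
my $source = 'my ($who) = @_; return "Hello, $who!";';

# Three simulated requests served by the same process.
my @responses = map { run_cached('hello', $source, $_) } qw(Alice Bob Carol);

print "$_\n" for @responses;
print "compiled $compile_count time(s) for ", scalar(@responses), " requests\n";
```

A plain CGI setup would pay the start-up and compilation cost on each of the three requests; here it is paid once and the later requests only run code.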
The primary advantages of mod_perl are power and speed. You have full
access to the inner workings of the web server and can intervene at any
stage of request processing. This allows for customized processing of (to
name just a few of the phases) URI->filename translation,
authentication, response generation and logging. There is very little
run-time overhead. In particular, it is not necessary to start a separate
process, as is often done with web-server extensions. The most widespread
such extension mechanism, the Common Gateway Interface (CGI), can be
replaced entirely with Perl code that handles the response generation phase
of request processing. mod_perl includes two general-purpose modules for
this: Apache::Registry, which can transparently run existing Perl CGI scripts, and
Apache::PerlRun, which does a similar job but allows you to run ``dirtier'' (to some
extent) scripts.
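To give a feeling for what Perl code handling the response generation phase looks like, here is a minimal sketch of a mod_perl 1.x content handler. The module name My::Hello and the /hello location are invented for this example. The OK constant is defined by hand (Apache's OK status is 0) only so that the file also loads outside the server; under mod_perl you would normally import it from Apache::Constants.

```perl
package My::Hello;    # hypothetical module name
use strict;
use warnings;

# Apache's OK status is 0. Inside the server you would normally write
# "use Apache::Constants qw(OK);" instead of defining it by hand.
use constant OK => 0;

# Assumed httpd.conf fragment that maps a URL to this handler
# (the location and the module name are made up for this example):
#
#   <Location /hello>
#       SetHandler  perl-script
#       PerlHandler My::Hello
#   </Location>

sub handler {
    my $r = shift;    # Apache request object supplied by mod_perl

    $r->content_type('text/plain');    # set the Content-Type header
    $r->send_http_header;              # send the response headers
    $r->print("Hello from a mod_perl handler\n");

    return OK;    # tell Apache the request was handled
}

1;
```

No separate process is started for a request to /hello: Apache calls the already compiled handler subroutine directly.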
You can configure your httpd server and handlers in Perl (using
PerlSetVar and <Perl> sections). You can even define your own configuration directives.
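As a small illustration of PerlSetVar, the sketch below shows a handler that reads a value from the server configuration through the request object's dir_config() method. The module name My::Greeting, the /greet location and the variable name Greeting are assumptions made for this example, and OK is again defined by hand (its value in Apache is 0) so that the file loads outside the server too.

```perl
package My::Greeting;    # hypothetical module name
use strict;
use warnings;

# Apache's OK status is 0. Inside the server you would normally write
# "use Apache::Constants qw(OK);" instead of defining it by hand.
use constant OK => 0;

# Assumed httpd.conf fragment (the location, the module name and the
# variable name "Greeting" are made up for this example):
#
#   <Location /greet>
#       SetHandler  perl-script
#       PerlHandler My::Greeting
#       PerlSetVar  Greeting "Good day"
#   </Location>

sub handler {
    my $r = shift;    # Apache request object supplied by mod_perl

    # dir_config() looks up a value set with PerlSetVar;
    # it returns undef when the variable is not configured.
    my $greeting = $r->dir_config('Greeting');
    $greeting = 'Hello' unless defined $greeting;

    $r->content_type('text/plain');
    $r->send_http_header;
    $r->print("$greeting from mod_perl\n");

    return OK;
}

1;
```

Changing the greeting then only requires editing httpd.conf and restarting the server; the Perl code stays untouched.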
Many people ask ``How much of a performance improvement does mod_perl give?'' It all depends on what you are doing with mod_perl, and possibly on whom you ask. Developers report speed boosts from 200% to 2000%. The best way to measure is to try it and see for yourself! (See http://perl.apache.org/tidbits.html and http://perl.apache.org/stories/ for the facts.)
I have prepared a list of some very busy mod_perl driven sites. A thousand words are no substitute for a single touch. Visit the sites and feel the difference. They will persuade you that mod_perl rules!
Internet Movie Database (Ltd) - http://www.moviedatabase.com/ - serves around 1.25 million page views per day. All database lookups are handled inside Apache via mod_perl. Each request also goes through several mod_perl handlers and is then reformatted on the fly with mod_perl SSI to embed advertising banners and give different views of the site depending on the hostname used.
Medimatch, Medical Professionals - http://www.medimatch.com/ - The entire site was developed with mod_perl, using persistent database connections for speed.
Singles Heaven - http://singlesheaven.com is a Match Maker site with almost 14,000 members and growing. The site is driven by
mod_perl, DBI, Apache::DBI (which provides persistent DB connections) and MySQL. The speed is
enormous, and chatting with mod_perl is a pleasure. Every page is
generated by about 10 SQL queries, because it does many dynamic checks
every time, such as checking for new emails, watching the users whom the
member has registered in their watchdog, and many more. You do not notice that these queries
are happening; the speed is that of a ``Hello World'' script.
Enter as an anonymous user to try it.
Slashdot.org - News for Nerds. Stuff that Matters. - http://slashdot.org . The title says everything. Slashdot runs on a Red Hat Linux box. It is pure Perl and MySQL, and the web server is mod_perl enabled, of course! It serves from 400,000 to 650,000 pages a day! News flash, May 14, 1999: Slashdot serves its one hundred millionth page!
ValueClick - http://www.valueclick.com/ serves more than 18 million requests per day (on ~20 machines), where every response is dynamic, with all sorts of calculation, storing, logging, counting, you name it. All of their ``application'' is done with Perl (and mod_perl for the work they do inside the Apache processes).
According to Netcraft ( http://netcraft.com ), as of May 1999, 3 million hosts were running the free Apache web server, which is more than 57% of all hosts checked in the survey! Here is the graph of ``Server Share in Internet Web Sites - Now'' - http://www.netcraft.com/survey/ .
What about mod_perl? http://perl.apache.org/netcraft/ reports 156458 hostnames and 36976 unique IP addresses of sites running mod_perl. The real number is actually bigger, because when hosts were scanned for running web servers, only well-known ports were checked (80, 81, 8080 and a few others). A server running on an unusual port would not be counted unless its owner had manually added it to Netcraft's database.
[Graph: mod_perl growth]
For the latest numbers see http://perl.apache.org/netcraft/ .
mod_perl's home is http://perl.apache.org . From the site you can download the latest mod_perl software and various documentation, read new announcements, find free and commercial third-party products, and more. There is also a Perl/Apache mailing list; subscription details are on the same site.
In future issues of this column we will talk about mod_perl and other Perl/Apache related things: Apache administration, writing and understanding Apache handlers, writing code that keeps database connections persistent, setting up software watchdogs, smart Perl speedups, and good vs. bad programming styles, which directly affect script performance. In short, anything that can be handy to Apache/mod_perl developers. Suggestions for future columns are welcome.
The next column will discuss mod_perl implementation strategies. We will try to work out which strategy is best for almost any given situation.