Apache HTTP Server

From Pulsed Media Wiki

The Apache HTTP Server, commonly referred to as Apache, is free and open-source, cross-platform web server software. Developed and maintained by the Apache Software Foundation, it played a key role in the early growth of the World Wide Web and remains one of the most widely used web servers.

Apache serves web content (like HTML files, images, videos) to users who request it via their web browsers. It can host one or more virtual hosts (websites) on a single server. It is highly configurable through its plain text configuration files and a module system that extends its functionality.

Overview

Apache is known for its flexibility, power, and widespread support. Its architecture is process- and thread-based: a Multi-Processing Module (MPM), chosen by the administrator, determines how incoming connections are handled:

  • `prefork` MPM: Maintains a pool of single-threaded child processes, each handling one connection at a time. This is stable and works with non-thread-safe libraries, but can consume significant memory with many connections.
  • `worker` MPM: Uses multiple processes, each with multiple threads, to handle connections, offering better scalability than `prefork` for high traffic.
  • `event` MPM: The default in recent versions. It builds on `worker` and hands idle keep-alive connections to a dedicated listener thread, so worker threads stay free to serve active requests.
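To see which of the MPMs above a given installation uses, the server's build and runtime settings can be queried. The following is a minimal sketch, assuming a Debian-style install where the control script is named `apache2ctl` (elsewhere it may be `apachectl` or `httpd`). The function name `show_mpm` is illustrative only.

```shell
#!/bin/sh
# Print the active MPM, or a notice if Apache's control script is missing.
# Assumption: the control script is called apache2ctl (Debian/Ubuntu naming).
show_mpm() {
    if command -v apache2ctl >/dev/null 2>&1; then
        # "apache2ctl -V" prints build settings, including a "Server MPM:" line.
        apache2ctl -V 2>/dev/null | grep -i 'Server MPM' \
            || echo "could not determine MPM"
    else
        echo "apache2ctl not found; is Apache installed?"
    fi
}

show_mpm
```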

Apache is configured using plain text files. On Debian-based systems the main file is `apache2.conf` (many other distributions use `httpd.conf`), and virtual hosts are defined in separate files stored in `sites-available` and activated through `sites-enabled`. Apache also supports decentralized configuration through `.htaccess` files placed within content directories, which allow per-directory overrides of certain settings.
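As an illustration of a per-directory override, the sketch below writes a small `.htaccess` file into a scratch directory rather than a live document root. The two directives are only examples. Apache honors them only if the enclosing `<Directory>` block permits overrides through `AllowOverride`, which Debian's default configuration typically does not.

```shell
#!/bin/sh
# Write an example .htaccess into a throwaway directory (not a real docroot).
docroot=$(mktemp -d)

cat > "$docroot/.htaccess" <<'EOF'
# Turn off automatic directory listings for this directory tree.
Options -Indexes
# Serve a custom page for "not found" errors (the path is a placeholder).
ErrorDocument 404 /404.html
EOF

cat "$docroot/.htaccess"
```

On a real server the file would be placed inside the site's content directory, for example under `/var/www/html/`.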

History

The Apache HTTP Server project was started in 1995 based on the NCSA HTTPd server. It quickly gained popularity due to its open-source nature, robustness, and performance. It became the dominant web server on the Internet in the late 1990s and early 2000s and has maintained significant market share ever since, despite the rise of competitors like Nginx.

Features

Key features of the Apache HTTP Server include:

  • Module System: Provides a wide range of modules (e.g., `mod_ssl` for TLS/SSL, `mod_rewrite` for URL rewriting, modules for various programming languages) that can be dynamically loaded to add functionality.
  • Configuration Files: Highly configurable through directives in `apache2.conf` and site-specific configuration files.
  • Virtual Hosts: Supports hosting multiple websites on a single server using name-based or IP-based virtual hosts.
  • .htaccess Files: Allows decentralized, per-directory configuration overrides. When overrides are enabled, Apache checks each directory on the request path for such a file on every request, which can reduce performance.
  • Handling Static Content: Efficiently serves static files like HTML, CSS, images, etc.
  • Handling Dynamic Content: Supports dynamic content generation through interfaces like CGI (Common Gateway Interface), WSGI (Web Server Gateway Interface), and modules like `mod_php` or proxying to application servers.
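To make the CGI interface mentioned above concrete, here is a minimal sketch of a CGI program written as a shell script. It assumes that `mod_cgi` (or `mod_cgid`) is enabled and that the script is saved as an executable file in a directory configured for CGI execution, for example through a `ScriptAlias` directive. The name `hello_cgi` is illustrative.

```shell
#!/bin/sh
# Minimal CGI response: header line(s), one blank line, then the body.
# hello_cgi is an illustrative name, not an Apache API.
hello_cgi() {
    printf 'Content-Type: text/plain\n\n'
    # The server passes request data in environment variables such as QUERY_STRING.
    printf 'Hello from CGI. Query string: %s\n' "${QUERY_STRING:-}"
}

hello_cgi
```

Apache runs such a program once per request and relays its standard output to the client, which is simple but slower than in-process modules or a persistent application server.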

Setup on Linux (CLI Tutorial)

This tutorial covers the basic installation and management of the Apache HTTP Server on a Debian-based Linux distribution (like Ubuntu) using the command-line interface.

Prerequisites: a Debian-based system (such as Ubuntu) and a user account with `sudo` privileges.

Steps:

  1. Update the package index:
 sudo apt update  

This command updates the list of available software packages.

  2. Install Apache2:

Install the Apache HTTP Server package.

 sudo apt install apache2 -y  

The service will typically start automatically after installation.

  3. Check the service status:

Verify that Apache is running.

 sudo systemctl status apache2  

You should see output indicating the service is `active (running)`.

  4. Access the default web page:

Open a web browser and navigate to your server's IP address or hostname. You should see the default "Apache2 Default Page" for Debian/Ubuntu, which is served from `/var/www/html/`.
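On a server without a graphical browser, the same check can be made from the command line. This is a minimal sketch, assuming `curl` is installed. The function name `check_default_page` and the default URL are illustrative choices.

```shell
#!/bin/sh
# Report the HTTP status code returned by a URL (default: the local server).
check_default_page() {
    url="${1:-http://localhost/}"
    if ! command -v curl >/dev/null 2>&1; then
        echo "curl not installed"
        return 0
    fi
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    # curl reports 000 when no HTTP response was received.
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url") || code="000"
    echo "HTTP status for $url: $code"
}

check_default_page
```

A status of 200 indicates that the default page is being served.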

  5. Managing the Apache2 Service:
  • Stop Apache:
 sudo systemctl stop apache2  
  • Start Apache:
 sudo systemctl start apache2  
  • Restart Apache:
 sudo systemctl restart apache2  
  • Reload Configuration (graceful restart):
 sudo systemctl reload apache2  

Use reload after making configuration changes to apply them without dropping active connections.

  6. Basic Configuration Files and Directories:
  • `/etc/apache2/`: The main configuration directory.
  • `/etc/apache2/apache2.conf`: The main global configuration file.
  • `/etc/apache2/ports.conf`: Specifies the ports Apache listens on.
  • `/etc/apache2/sites-available/`: Contains configuration files for virtual hosts (websites) that are available on the server.
  • `/etc/apache2/sites-enabled/`: Contains symbolic links to the virtual host configurations in `sites-available` that are currently active.
  • `/var/www/html/`: The default document root (location of web files) for the default website.
  7. Enabling/Disabling Virtual Host Sites:

You can activate a virtual host configuration file from `sites-available` by creating a symbolic link in `sites-enabled` using the `a2ensite` command.

  • Enable a site (e.g., a file named `mywebsite.conf` in `sites-available`):
 sudo a2ensite mywebsite.conf  
  • Disable a site (e.g., the default site):
 sudo a2dissite 000-default.conf  
  • After running `a2ensite` or `a2dissite`, you must reload or restart Apache for the changes to take effect.
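The symbolic-link mechanism behind `a2ensite` and `a2dissite` can be reproduced in a scratch directory to see what the commands do. The sketch below does not touch `/etc/apache2`. The virtual host contents (`example.com`, `/var/www/mywebsite`, the log file names) are placeholder values, not settings from this tutorial.

```shell
#!/bin/sh
# Build a throwaway copy of the Debian layout to show what a2ensite does.
root=$(mktemp -d)
mkdir -p "$root/sites-available" "$root/sites-enabled"

# A minimal name-based virtual host; all values are placeholders.
cat > "$root/sites-available/mywebsite.conf" <<'EOF'
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/mywebsite
    ErrorLog ${APACHE_LOG_DIR}/mywebsite_error.log
    CustomLog ${APACHE_LOG_DIR}/mywebsite_access.log combined
</VirtualHost>
EOF

# Enabling a site amounts to creating a link like this one in sites-enabled.
ln -s ../sites-available/mywebsite.conf "$root/sites-enabled/mywebsite.conf"
ls -l "$root/sites-enabled"

# Disabling removes the link again; the file in sites-available stays.
rm "$root/sites-enabled/mywebsite.conf"
```

Because only the link is removed, a disabled site can be re-enabled later without rewriting its configuration.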
  8. Checking Configuration Syntax:

Always test your Apache configuration files for syntax errors before reloading or restarting the service.

 sudo apache2ctl configtest  

Or:

 sudo apachectl configtest  

Look for `Syntax OK` in the output.
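The syntax check and the reload can be chained so that a broken configuration is never applied. This is a minimal sketch. The function name `safe_reload` and its optional argument are illustrative, and the service name `apache2` follows the Debian naming used in this tutorial.

```shell
#!/bin/sh
# Reload Apache only if the configuration passes the syntax check.
# The function name and its optional argument are illustrative, not part of Apache.
safe_reload() {
    ctl="${1:-apache2ctl}"   # control script; apachectl on some systems
    if ! command -v "$ctl" >/dev/null 2>&1; then
        echo "$ctl not found" >&2
        return 127
    fi
    if sudo "$ctl" configtest; then
        sudo systemctl reload apache2
    else
        echo "Configuration test failed; not reloading." >&2
        return 1
    fi
}

# Usage: safe_reload
```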

  9. Enabling/Disabling Modules:

Apache's functionality can be extended by enabling/disabling modules using `a2enmod` and `a2dismod`.

  • Enable a module (e.g., the rewrite module):
 sudo a2enmod rewrite  
  • Disable a module:
 sudo a2dismod status  
  • After running `a2enmod` or `a2dismod`, you must reload or restart Apache.
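After reloading, the list of loaded modules can be inspected to confirm the change. The sketch below assumes the Debian control script name `apache2ctl`, whose `-M` option lists loaded modules with names such as `rewrite_module`. The helper name `module_loaded` is illustrative.

```shell
#!/bin/sh
# Return 0 if the named module is loaded, 1 if not, 2 if apache2ctl is missing.
module_loaded() {
    mod="$1"
    if ! command -v apache2ctl >/dev/null 2>&1; then
        echo "apache2ctl not found" >&2
        return 2
    fi
    # "apache2ctl -M" prints one loaded module per line, e.g. "rewrite_module (shared)".
    apache2ctl -M 2>/dev/null | grep -q "${mod}_module"
}

if module_loaded rewrite; then
    echo "mod_rewrite is loaded"
else
    echo "mod_rewrite is not loaded (or apache2ctl is unavailable)"
fi
```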

Apache vs. Nginx

Apache and Nginx are two of the most popular web servers globally, but they differ fundamentally in their architecture and configuration approach, which influences their performance characteristics and use cases.

| Feature | Apache HTTP Server | Nginx |
|---|---|---|
| Architecture | Primarily process/thread-based (MPMs like prefork, worker, event). Can be less efficient with large numbers of concurrent connections compared to event-driven models. | Event-driven, asynchronous architecture. Highly efficient at handling many concurrent connections with low memory consumption. |
| Configuration | Complex, uses a single main configuration file (`apache2.conf` / `httpd.conf`) and often many included files for virtual hosts and modules. Supports decentralized `.htaccess` files. | Simpler, uses a single main configuration file (`nginx.conf`) with includes. Virtual hosts (server blocks) and other configurations are typically centralized. Does NOT support `.htaccess`. |
| Handling Static Content | Handles static files well. | Generally considered more performant for serving static files, especially under high load, due to its efficient architecture. |
| Handling Dynamic Content | Can process dynamic content internally using modules (e.g., `mod_php`) or externally via CGI, FastCGI, etc. `.htaccess` allows per-directory dynamic config. | Primarily designed to pass dynamic requests to external processors (like PHP-FPM, Gunicorn, etc.) and serve the result. More commonly used as a reverse proxy for dynamic applications. |
| Flexibility (.htaccess) | High. `.htaccess` files allow users without root access to control certain web server behaviors in their directories. | Low. No equivalent to `.htaccess`. All configuration must be in the main configuration files, requiring root access or administrator involvement. |
| Performance under High Load | Can consume more resources (RAM, CPU) per connection, potentially impacting performance when handling thousands of simultaneous connections, depending on the MPM used. | Designed to handle the C10k problem (10,000 concurrent connections) efficiently with minimal resource overhead. Excellent for high-traffic static serving and reverse proxying. |
| Use Cases | General-purpose web hosting, shared hosting (due to `.htaccess`), running applications with specific Apache modules, environments where decentralized configuration is beneficial. | High-traffic websites, static content serving, reverse proxying for application servers (like Node.js, Python apps), load balancing, API gateways. |

In many modern web setups, Apache and Nginx are used together, with Nginx acting as a reverse proxy handling static files and forwarding dynamic requests to Apache or an application server behind it.
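A combined setup of this kind might look like the following sketch, which writes an example Nginx server block to a scratch file. It assumes that Apache has been moved off port 80 to local port 8080 (for example with `Listen 8080` in `ports.conf`). The host name, paths, and port are placeholders rather than values from this article.

```shell
#!/bin/sh
# Write an example Nginx reverse-proxy configuration to a throwaway file.
conf=$(mktemp)

cat > "$conf" <<'EOF'
server {
    listen 80;
    server_name example.com;

    # Static files are served directly by Nginx.
    location /static/ {
        root /var/www/mywebsite;
    }

    # Everything else is forwarded to Apache on a local port (assumed 8080).
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF

cat "$conf"
```

The `proxy_set_header` lines pass the original host and client address through to Apache, which otherwise would only see requests arriving from 127.0.0.1.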

See also