When I deploy a large
server for web traffic, I need to ensure that HTTP requests are
serviced with a minimum response time. Many companies impose
a Service Level Agreement (SLA) requiring that 90% of web page
requests be served with sub-second response time.
It's considered rude
to make your customers wait, especially on the web.
Today's web users have a low
tolerance for slow page loads, and many will simply click
the "back" button after waiting more than a second.
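As a minimal sketch of how the 90% sub-second rule above could be checked, the Python below computes a nearest-rank percentile over a list of measured response times. The function names and the sample timings are my own illustrations, not taken from any real log.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_sla(response_times_s, pct=90, limit_s=1.0):
    """True when the pct-th percentile response time is under limit_s seconds."""
    return percentile(response_times_s, pct) < limit_s

# Hypothetical response times (seconds) for ten page requests.
times = [0.21, 0.35, 0.18, 0.42, 0.95, 0.30, 0.27, 1.80, 0.51, 0.33]
print(percentile(times, 90), meets_sla(times))
```

One slow request out of ten does not break the SLA here, because the rule only constrains the fastest 90% of requests.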
Measuring web server response time
The most important
factor to look at is your high water mark of HTTP requests per
second and your end-to-end response time during this high-usage
period. This response time is the total elapsed time
between receipt of the request and the end of sending the
requested HTML. Some use the "Time To First Byte" (TTFB)
metric to measure webserver response time. Also see
Webserver Traffic
Monitoring Tips.
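Both numbers can be approximated from the client side with a few lines of Python. This is only a sketch: the function name is hypothetical, the timings include network and connection setup time (so they are larger than the server-side elapsed time described above), and "first byte" is approximated as the moment the status line and headers have been read.

```python
import http.client
import time
from urllib.parse import urlsplit

def time_request(url, timeout=10.0):
    """Return (ttfb_seconds, total_seconds) for one GET of url, as seen by the client."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    start = time.perf_counter()
    try:
        conn.request("GET", path)
        # getresponse() returns once the status line and headers have arrived,
        # which is used here as an approximation of time to first byte.
        response = conn.getresponse()
        ttfb = time.perf_counter() - start
        response.read()  # drain the full body
        total = time.perf_counter() - start
    finally:
        conn.close()
    return ttfb, total
```

Calling time_request on your home page during the peak period, and repeating it many times, gives a set of samples rather than a single number, which is what an SLA percentage needs.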
For a real measure of end-to-end
response time, you can use
open gateway proxy web servers and see how fast your
pages load from anywhere in the world. To use a proxy
server, enter its IP address in the proxy settings of
Internet Explorer or Firefox and time your HTTP web page
requests.
The best configuration for web servers
So, how do you know
when you have an appropriate web server configuration?
When a web server without a web cache is experiencing stress,
the cause is most likely disk enqueue requests. The
average disk latency is 15-20 milliseconds and it does not take
much traffic to cause disk enqueues.
- Disk placement
counts - If you have all of your images and HTML on
adjacent cylinders you may experience latency from the
movement of the read-write heads. Disk I/O latency
goes up, and the disk will shake like an out-of-balance
washing machine as competing HTTP requests try to get their
images and HTML. You can use the Linux iostat utility
to monitor your disk latency. If you don't have a
web cache, consider using super-fast
solid-state disks. It's also important to use
direct I/O on your web server.
- Web Servers are
not CPU intensive - Servicing an HTTP request does not
involve high CPU consumption. You can monitor your web
server CPU consumption with the Linux top and vmstat
utilities, and you have CPU enqueues when the runqueue
exceeds the number of processors on the webserver. However,
you can utilize idle CPU resources by installing
HTTP compression to improve webserver HTML delivery
speed, trading off unused CPU cycles for faster web response
time.
- RAM matters
most - Most webservers have some pages that are far more
popular than others, and caching the most popular images
will dramatically improve total response time. Without
a web cache, RAM usage should be minimal, but you should
monitor for real RAM
page-in conditions using vmstat, where page-ins
correlate with scan rate. Remember, page-ins occur as
a normal part of program loading, and you must separate
normal page-ins from those caused by server swapping.
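The disk and CPU checks above can also be scripted. The sketch below assumes a Linux server: it takes two snapshots of /proc/diskstats text and computes the average milliseconds per completed I/O (similar in spirit to the await figure from iostat), and it reads procs_running from /proc/stat text, which is roughly what vmstat shows in its r column. The function names and the sample snapshot values are illustrative assumptions.

```python
import os

def parse_diskstats(text):
    """Map device name -> (completed I/Os, milliseconds spent on them)
    from the text of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip short or malformed lines
        reads, read_ms = int(f[3]), int(f[6])
        writes, write_ms = int(f[7]), int(f[10])
        stats[f[2]] = (reads + writes, read_ms + write_ms)
    return stats

def average_wait_ms(before, after, device):
    """Average milliseconds per completed I/O between two snapshots."""
    ios = after[device][0] - before[device][0]
    ms = after[device][1] - before[device][1]
    return ms / ios if ios else 0.0

def runnable_tasks(proc_stat_text):
    """Value of the procs_running line in the text of /proc/stat."""
    for line in proc_stat_text.splitlines():
        if line.startswith("procs_running"):
            return int(line.split()[1])
    raise ValueError("procs_running not found")

def cpu_is_queued(proc_stat_text, cpus=None):
    """True when more tasks are runnable than there are processors."""
    cpus = cpus or os.cpu_count() or 1
    return runnable_tasks(proc_stat_text) > cpus

# Made-up snapshots taken a few seconds apart (not real measurements).
snap1 = parse_diskstats("8 0 sda 1000 0 8000 4000 500 0 4000 6000 0 5000 10000")
snap2 = parse_diskstats("8 0 sda 1600 0 12800 13000 900 0 7200 12000 0 9000 25000")
print(average_wait_ms(snap1, snap2, "sda"))  # 15.0 ms per I/O in this sample
print(cpu_is_queued("cpu 1 2 3 4\nprocs_running 6\nprocs_blocked 0", cpus=4))
```

On a live server the text would come from reading /proc/diskstats and /proc/stat at two points in time; the parsing is kept separate from the file reads so it can be checked against saved samples.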
Caching and the 80-20 rule
Disk speed is measured
in milliseconds (thousandths of a second) and RAM is measured in
nanoseconds (billionths of a second). In theory, RAM
should be 10,000 times faster than disk, but even with RAM
management latency overhead, caching frequently-used components
can result in a 500x improvement in webserver response time.
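To see how much of that improvement survives a less-than-perfect cache, here is a small worked calculation. It assumes the 15 millisecond disk latency and the 500x figure quoted in this article, and uses a simple weighted average of hit and miss cost; the hit ratios are arbitrary examples.

```python
def effective_latency_ms(hit_ratio, ram_ms, disk_ms):
    """Weighted average fetch time when hit_ratio of requests come from RAM."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ram_ms + (1.0 - hit_ratio) * disk_ms

disk_ms = 15.0          # low end of the 15-20 ms disk latency quoted earlier
ram_ms = disk_ms / 500  # assumes the 500x improvement claimed above

for hit_ratio in (0.0, 0.8, 0.95):
    print(hit_ratio, round(effective_latency_ms(hit_ratio, ram_ms, disk_ms), 3))
```

Under these assumptions an 80% hit ratio brings the average fetch from 15 ms down to about 3 ms, and almost all of what remains is the cost of the misses, which is why the cache hit ratio matters more than raw RAM speed.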
Almost all web sites
follow an 80-20 rule, where 80% of the traffic goes to the most
popular 20% of web pages. You can examine your referrer
statistics to see the top 20% most popular pages and cache them
in RAM. Once sized and configured, your web cache should
keep your most frequently requested images and HTML cached for
outbound traffic. This article on
web page caching notes that a RAM cache can make a huge
impact on overall response time:
"Formulating a
caching policy for a web site - determining which resources
should not get cached, which resources should, and for how
long - is a vital step in improving site performance."
You can have a
front-end caching reverse proxy server serve requests
without forcing your server to go to disk. Open source
products such as Squid
do a great job, but your web server must have dedicated RAM
resources.
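Finding the most popular 20% of pages from your own statistics can be sketched in a few lines of Python. The function name and the page counts are hypothetical; in practice the list of requested paths would come from your access log.

```python
from collections import Counter

def top_pages(requests, fraction=0.2):
    """Return (pages, traffic_share) for the most requested `fraction`
    of distinct pages in a list of requested paths."""
    if not requests:
        raise ValueError("need at least one request")
    counts = Counter(requests)
    keep = max(1, round(len(counts) * fraction))
    chosen = counts.most_common(keep)
    share = sum(n for _, n in chosen) / len(requests)
    return [page for page, _ in chosen], share

# Hypothetical hit counts for a ten-page site (100 requests in total).
hits = {"/": 55, "/products": 25, "/about": 5, "/faq": 4, "/contact": 3,
        "/blog": 3, "/jobs": 2, "/press": 1, "/terms": 1, "/privacy": 1}
log = [page for page, n in hits.items() for _ in range(n)]

pages, share = top_pages(log)
print(pages, share)
```

If the share printed for your own site is well below 0.8, the traffic is flatter than the 80-20 rule suggests and the cache will need to hold more pages to be effective.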
Choosing your web server hardware configuration
Depending on your
tolerance for request timeouts and slow response during peak usage
times, you can choose webserver hardware to minimize disk
latency (caching) while ensuring that you have enough CPU and
RAM for the OS. Most companies choose Intel-based web
server hardware on a server with high-channel disks (to support
high volume disk reads) and expandable RAM to cache the most
frequently requested pages and images.
Make sure that you choose a server
that has expandable RAM capacity, at least 4 gig with expansion
to 16 gig, depending on the size of your working set of
frequently-referenced pages and images.
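A rough way to size that working set is to add up the bytes of the most requested objects until they account for most of the traffic. The sketch below assumes you can produce a list of (hits, size in bytes) pairs from your logs and file system; the function name, the 80% coverage target, and the sample figures are illustrative only.

```python
def working_set_bytes(objects, coverage=0.8):
    """Sum the sizes of the most requested objects until they cover
    `coverage` of all hits. `objects` is an iterable of (hits, size_bytes)."""
    ranked = sorted(objects, key=lambda o: o[0], reverse=True)
    total_hits = sum(h for h, _ in ranked)
    covered = 0
    size = 0
    for hit_count, size_bytes in ranked:
        if covered >= coverage * total_hits:
            break  # the objects chosen so far already cover the target
        covered += hit_count
        size += size_bytes
    return size

# Hypothetical (hits, size_bytes) pairs for five objects.
sample = [(5600, 40_000), (2500, 120_000), (1000, 300_000),
          (500, 80_000), (400, 2_000_000)]
print(working_set_bytes(sample))
```

The result is only a lower bound on cache RAM, since the cache software has its own per-object overhead and the OS needs memory too.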
If you have less than a million
page views a month, you can buy
powerful 64-bit PCs for webserver hardware, install Red Hat
Linux advanced server with Squid and have a 1.5 gig web cache
for super-fast web page delivery.
Dell also offers higher volume webservers at reasonable
prices.
For larger webserver traffic
volumes, many web
hosting companies offer larger servers. I see here
there's a Dual Xeon EM64T 3.2GHz with 2GB RAM and two 200GB SATA
drives in a RAID 1 configuration (their biggest machine), which
is better than the box you're on now because of the RAIDed fast
drives, more GHz, and 64 bit processors. All for only $275.00 a
month! The only setback is that the max bandwidth on that is 1TB
per month.
If you want something even cheaper than that, I also see a Dual Xeon
2.8GHz EM64T with 2GB RAM and two 80GB SATA RAID-1 drives for
only $210 a month.
For cheaper servers, we go non-RAID. There's a dual core Athlon
X2 3800 with 1GB RAM and an 80GB hard drive for only $160 a
month. You would be able to upgrade the RAM for only a little
more, but at that point I would go with the beefier server that has
RAID on it for faster I/O.