litheon

Discussion on all things college and technology.

Increasing web server performance

So over the past week or so the site has been dealing with quite a few performance problems, especially around average CPU utilization. Throughout the process I found quite a bit of documentation on increasing performance in IIS. So, in no particular order, here are a few of the changes that produced major increases in performance and decreases in CPU utilization. With the steps listed below we've been able to hammer out some of the most frustrating performance problems and optimize the site enough that we no longer need to babysit the server during high load, especially when we reach the front page of Digg.

1. Code optimization

  • Code optimization is probably the most important aspect of ensuring top-notch performance, and also the hardest to give guidance about, as the code for every application varies and the way it executes on each individual server can vary just as much. The most important aspects of code optimization are reducing memory usage whenever possible, reducing the number of CPU-intensive processes, and caching the results of frequently run database queries.

2. Caching

  • Cache anything and everything that can be cached. When many parts of a page are generated dynamically, it can be very difficult to cache data effectively while still showing visitors fresh content, especially with certain application-specific caching frameworks such as WP-Cache. In ASP.NET the Cache and Application objects make it immensely simple to cache data without even needing to check the freshness of the content: you specify how the object should expire in the cache, and once it expires your code re-creates the entry using fresh data. PHP can get the same functionality through Memcache.
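
The expire-and-re-create pattern described above can be sketched in a few lines. This is a minimal illustration in Python, not the ASP.NET Cache API or Memcache; the class name, the `get_or_create` method, and the `load_front_page` function are all made up for the example.

```python
import time


class ExpiringCache:
    """Tiny in-process cache whose entries expire after a fixed lifetime."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock so expiry is easy to test
        self._entries = {}       # key -> (expires_at, value)

    def get_or_create(self, key, ttl_seconds, factory):
        """Return the cached value, rebuilding it with factory() once it is stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no expensive work needed
        value = factory()        # missing or expired: re-create from fresh data
        self._entries[key] = (now + ttl_seconds, value)
        return value


def load_front_page():
    # Hypothetical stand-in for an expensive database query.
    return ["story 1", "story 2"]


cache = ExpiringCache()
stories = cache.get_or_create("front_page", 60, load_front_page)
```

The calling code never checks freshness itself. It just asks for the key and supplies a way to rebuild it, which is what keeps this style of caching simple to use.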

3. HTTP headers

  • Another way to extend the reach of caching is to make sure your web server sends headers telling the browser it doesn't need to keep re-fetching the same static files. In IIS this proves slightly difficult, as you cannot specify which types of files get the Expires header. Without that header the browser asks the server whether the file has been modified, which doesn't add as much load as requesting the file outright but is still a request. To work around this in IIS, set the expiration on the folder containing the files (the majority of static files you're loading are probably in some sort of theme folder). Apache is far more customizable: mod_expires lets you specify a default and then specify the expiration based on MIME type.
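
To make the "default plus per-MIME-type expiration" idea concrete, here is a rough Python sketch of the headers such a setup ends up sending. It is not mod_expires or IIS configuration; the lifetimes and the `caching_headers` function are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical lifetimes per MIME type; tune these for your own site.
LIFETIMES = {
    "image/png": timedelta(days=30),
    "image/jpeg": timedelta(days=30),
    "text/css": timedelta(days=7),
    "application/javascript": timedelta(days=7),
}
DEFAULT_LIFETIME = timedelta(hours=1)  # used when the MIME type is not listed


def caching_headers(mime_type, now=None):
    """Build Expires and Cache-Control headers for a static file."""
    now = now or datetime.now(timezone.utc)
    lifetime = LIFETIMES.get(mime_type, DEFAULT_LIFETIME)
    return {
        "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


print(caching_headers("text/css"))
```

Until the expiration passes, a browser that honors these headers can reuse its local copy without contacting the server at all.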

4. Multi-Threading

  • Quite a few modern web servers can also create multiple execution threads and worker processes dynamically based on load. Apache accomplishes this via its Multi-Processing Modules, which give you control over the number of worker processes and the number of threads created per worker process, while IIS lets you specify the maximum number of worker processes that can be created. One thing to note about IIS worker processes is that each one has its memory isolated from the others, so items added to the Cache in one worker process are only accessible from pages that worker process handles.

5. Dynamic load management

  • Another good way to protect performance is to disable certain functionality during times of high load. Probably the easiest way to gauge the load is some form of "Who is online" functionality, if the software powering your website supports it. Candidates for disabling include anything heavy in database queries, and anything that generates significantly more markup and as such requires more server-side processing. Disabling functionality like this also lets you stop caching whatever data it relied on, effectively freeing memory to cache other, more important elements of the page.
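
One way to picture this is a table of optional features, each with the visitor count at which it gets switched off. The feature names and thresholds below are invented for the sketch, and it assumes your software can tell you how many users are currently online.

```python
# Hypothetical optional features and the online-user count at which each is shed.
FEATURE_LIMITS = {
    "related_posts": 200,        # heavy in database queries, so it goes first
    "search_suggestions": 500,
    "comment_avatars": 1000,     # extra markup, cheaper than the others
}


def enabled_features(users_online):
    """Return the optional features that should stay switched on at this load."""
    return {name for name, limit in FEATURE_LIMITS.items() if users_online < limit}


# Page code would then check membership before doing the expensive work.
if "related_posts" in enabled_features(users_online=350):
    pass  # build the related posts block here
```

Keeping the thresholds in one place makes it easy to adjust them after watching how the server behaves under real load.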

6. HTTP Keep-alives

  • Using HTTP Keep-Alives is another way of maintaining adequate performance: the client uses a single connection to the server for all of the elements on the page, and for any subsequent pages it navigates to. This reduces overhead because only that one connection has to be set up, as opposed to a separate connection for every element requested from the server.
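
The effect is easy to observe with a small local experiment. The sketch below uses only Python's standard library: it starts a throwaway HTTP/1.1 server on localhost, requests several paths over one client connection, and counts how many distinct TCP connections the server saw. The handler class, function name, and paths are all made up for the demonstration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 keeps connections open by default
    client_ports = set()            # one client source port per TCP connection

    def do_GET(self):
        CountingHandler.client_ports.add(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # needed for reuse
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """Request every path on one connection; return (statuses, connections seen)."""
    CountingHandler.client_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        statuses = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()          # drain the body so the socket can be reused
            statuses.append(response.status)
        return statuses, len(CountingHandler.client_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_over_one_connection(["/", "/style.css", "/logo.png"]))
```

With keep-alives disabled, each of those requests would pay for its own TCP handshake, and that is the overhead this setting avoids.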

7. Gzip compression

  • Gzip compression can reduce the amount of time a client is actively communicating with the server, and as such reduces the overhead of the underlying TCP/IP connection. The less time each client spends communicating with the server, the more time is available for other clients, who would otherwise be placed into a hold queue. It should be noted, however, that compressing anything carries a certain amount of CPU overhead of its own, so it's worth your time to balance CPU utilization against the time required for the transfer to complete. Unfortunately, compression is generally configured at the server level on both Apache and IIS (Apache also allows it at the virtual host level), so if you're not using a VPS or a dedicated server it would be worth contacting your host to ask about enabling it if it isn't already. Apache uses mod_deflate to provide gzip compression, while IIS lets you edit the properties of "Web Sites" and enable HTTP compression on the "Service" tab.
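
A quick way to get a feel for the size-versus-CPU trade-off is to compress a sample page at a few gzip levels and time each one. This is only a rough sketch using Python's `gzip` module; the sample markup is made up, and real pages will compress differently.

```python
import gzip
import time


def compare_levels(payload, levels=(1, 6, 9)):
    """Return (level, compressed_size, seconds) for each gzip level tried."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = gzip.compress(payload, compresslevel=level)
        elapsed = time.perf_counter() - start
        results.append((level, len(compressed), elapsed))
    return results


# Made-up repetitive markup standing in for a typical HTML page.
page = b"<div class='post'><p>Hello, world</p></div>\n" * 2000

for level, size, seconds in compare_levels(page):
    print(f"level {level}: {len(page)} -> {size} bytes in {seconds * 1000:.2f} ms")
```

Higher levels usually squeeze out a little more size at the cost of more CPU time, so it is worth measuring with your own content before picking a setting.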

8. Database optimization

  • Optimizing your databases is another huge step in ensuring good performance. Most of the popular content management software out there already does a great deal to optimize its database schema, but if you are developing your own application or adding functionality that requires your own database interface, make sure you always create an index or primary key on any table that holds a large quantity of data, especially if it is accessed relatively frequently. If you're using MySQL, the storage engine you choose also plays a large part in the performance of the database server, and as a result can positively or negatively affect the speed of your website.
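
To see what an index changes, you can ask the database how it plans to run a query before and after creating one. The sketch below uses Python's built-in `sqlite3` module as a stand-in, since it needs no server; MySQL and SQL Server have their own plan tools, and the table and index names here are invented for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the readable detail


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(10_000)],
)

lookup = "SELECT title FROM posts WHERE author_id = ?"
plan_before = query_plan(conn, lookup, (42,))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
plan_after = query_plan(conn, lookup, (42,))    # the index can now be used

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the database has to read every row to find one author's posts. With it, the lookup goes straight to the matching rows, and that difference grows with the size of the table.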

Overall, performance really doesn't have a single solution, and it can be the single most frustrating aspect of developing and running a website. Even with the most optimized code and databases, and the most carefully tweaked settings, it can feel like an insurmountable task to find the right combination of configuration options, hacks, plugins, and frameworks to achieve the best performance possible. However, with the steps listed here (and the steps on some of the websites listed below), getting at least adequate performance out of your website is a goal that is well within reach.

Links:

 

Comments

uRBAN jAMAican said:

Albeit a mouthful to read, I went through it all and got some great insight to things I would have never conjured up myself. Thanks for posting.

# October 29, 2007 6:16 PM

xRiOT said:

Great article Litheon. Very helpful.

Do you have any suggestions on tuning the machine.config for best performance? The website I am working on at the moment is set up in an environment with two load balanced web servers, one SQL 2005 server (with a warm failover) and one supporting application server which each of the webservers talk to for their session state.

# October 30, 2007 11:40 PM

litheon said:

I've never really toyed around with machine.config, although I'm assuming you can put <cache> entries in there. If that's the case make sure you bump the amount of memory the application can use for caching up there, especially if the SQL server is on a separate box. At any given moment on mn.com there are about 3,000 cache entries that total up to about 200 MB.

I've also got our SQL server configured to take up just under 2 GB of RAM. With that configuration and a fair amount of database optimization the CPU on that box never exceeds 20% utilization, ever.

Also, if anyone has anything to add regarding performance please feel free to chime in. I'm always interested in ways to tweak settings so they're just right.

# October 31, 2007 8:31 AM

Gavin said:

Some good advice there. Just to note, the Optimizing IIS performance link does not work. I'm currently working on an online portal that is growing quickly and am trying to anticipate future problems before I hit them. I just wondered what sort of hardware you're running for MS SQL? I'm currently running SQL 2005 Enterprise on a 3 GHz Xeon with 2 GB, and am not sure how much longer this will hold up. I've looked at clustering, but if I've understood right SQL 2005 only clusters for failover and not load balancing, so I don't think there would be too much to gain.

# February 20, 2008 11:44 AM

litheon said:

While anticipating future growth is a good idea, it sounds like you're way too far ahead of yourself. If you are seeing performance problems you'll definitely want to think about getting more RAM for your box before setting up another one and using load balancing. Make sure you have the SQL server configured to use an adequate amount of RAM, as that will greatly increase the performance of the server.

# February 20, 2008 11:44 AM