Increasing web server performance
Over the past week or so the site has been dealing with quite a few performance problems, mostly around average CPU utilization. Along the way I found quite a bit of documentation on increasing performance in IIS. So, in no particular order, here are the changes that gave us the biggest gains in performance and drops in CPU utilization. With the steps listed below we've been able to hammer out some of the most frustrating performance problems and optimize the site enough that we no longer need to babysit the server during high load, especially when we reach the front page of Digg.
1. Code optimization
-
Code optimization is probably the most important part of ensuring top-notch performance, and also the hardest to give guidance on, since the code for every application varies and the way it executes on each individual server can vary just as much. The most important aspects of code optimization are reducing memory usage wherever possible, reducing the number of CPU-intensive operations, and caching the results of frequently-run database queries.
2. Caching
-
Cache anything and everything that can be cached. When many parts of a page are generated dynamically it can be very difficult to cache data effectively while still showing visitors fresh content, especially with certain application-specific caching frameworks such as WP-Cache. In ASP.NET the Cache and Application objects make caching immensely simple. With the Cache object you don't even need to worry about checking the freshness of the content, because you can specify how each entry should expire; once it does, your code re-creates the entry using fresh data. PHP can get the same functionality through the use of Memcache.
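To make the expire-then-recreate idea concrete, here is a minimal sketch in Python rather than ASP.NET or PHP. The `ExpiringCache` class and the `load_recent_posts` function are made up for illustration and are not part of any framework mentioned above; a real site would use the framework's own cache.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringCache:
    """Tiny in-process cache whose entries expire after a number of seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_create(self, key: str, ttl_seconds: float,
                      factory: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the expensive work
        value = factory()  # missing or expired: rebuild from fresh data
        self._entries[key] = (now + ttl_seconds, value)
        return value


query_count = 0


def load_recent_posts():
    # Hypothetical stand-in for an expensive database query.
    global query_count
    query_count += 1
    return ["post-%d" % n for n in range(3)]


cache = ExpiringCache()
first = cache.get_or_create("recent_posts", 60, load_recent_posts)
second = cache.get_or_create("recent_posts", 60, load_recent_posts)
```

The second call is served from memory, so `load_recent_posts` only runs once until the 60 seconds are up.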
3. HTTP headers
-
Another way to extend the reach of caching is ensuring your web server sends headers that tell the browser it doesn't need to keep re-requesting the same static files. In IIS this is slightly difficult, as you cannot specify which types of files get the Expires header. Without that header the browser still asks the server whether the file was modified, which doesn't add as much load as requesting the file outright but is still an extra request. To work around this in IIS you can set the expiration on the folder containing the files (the majority of the static files you load probably live in some sort of theme folder anyway). Apache is far more customizable here thanks to mod_expires, which allows you to specify a default and then specify the expiration based on the MIME type.
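The "default plus per-MIME-type override" idea is easy to picture in code. The sketch below is plain Python, not mod_expires configuration, and the lifetimes and the `expiry_headers` helper are assumptions chosen for illustration.

```python
import mimetypes
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical lifetimes; tune them to how often each kind of file changes.
DEFAULT_LIFETIME = timedelta(hours=1)
LIFETIME_BY_TYPE = {
    "image/png": timedelta(days=30),
    "image/jpeg": timedelta(days=30),
    "text/css": timedelta(days=7),
}


def expiry_headers(path, now=None):
    """Build caching headers for a static file based on its MIME type."""
    now = now or datetime.now(timezone.utc)
    mime_type, _ = mimetypes.guess_type(path)
    # Fall back to the default when the type is unknown or not listed.
    lifetime = LIFETIME_BY_TYPE.get(mime_type, DEFAULT_LIFETIME)
    return {
        "Cache-Control": "max-age=%d" % int(lifetime.total_seconds()),
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }
```

Calling `expiry_headers("theme/logo.png")` gives a 30 day lifetime, while a file of a type not in the table falls back to the one hour default.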
4. Multi-threading
-
Quite a few modern web servers can also create multiple execution threads and worker processes dynamically based on load. Apache accomplishes this via the Multi-Processing Modules, which give you control over the number of worker processes and the number of threads created per worker process, while IIS allows you to specify the maximum number of worker processes that can be created. One thing to note about IIS worker processes is that each one has its memory isolated from the others, so items added to the Cache in one worker process are only accessible from pages that worker process handles.
5. Dynamic load management
-
Another good way to protect performance is disabling certain functionality during times of high load. Probably the easiest way to detect high load is some form of "Who is online" functionality, if the software powering your website supports it. Candidates for disabling include anything heavy in database queries and anything that generates significantly more markup, and therefore more processing on the server side. Disabling functionality like this also lets you stop caching whatever data it relied on, effectively freeing memory to cache other, more important, elements of the page.
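As a rough sketch of what this could look like, the snippet below switches optional features off as the visitor count climbs. The feature names and thresholds are hypothetical, and the visitor count is assumed to come from whatever "Who is online" figure your software exposes.

```python
# Hypothetical optional features, each with the visitor count above which
# it gets switched off. The most expensive features have the lowest limits.
DISABLE_ABOVE = {
    "related_posts": 200,
    "comment_previews": 500,
    "tag_cloud": 1000,
}


def enabled_features(visitors_online):
    """Return the set of optional features to render at the current load."""
    return {
        name
        for name, limit in DISABLE_ABOVE.items()
        if visitors_online <= limit
    }
```

A page template would then check something like `"related_posts" in enabled_features(current_visitors)` before running the queries for that block.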
6. HTTP Keep-alives
-
HTTP Keep-alives are another way of maintaining adequate performance: the client uses a single connection to the server for all of the elements on the page, and for subsequent pages it may navigate to. This reduces overhead because only that one connection has to be set up, as opposed to a separate connection for every element requested from the server.
7. Gzip compression
-
Gzip compression can reduce the amount of time a client is actively communicating with the server, and as such reduces the overhead caused by the underlying TCP/IP protocol. The less time one client spends communicating with the server, the more time is available for other clients to communicate without being placed into a hold queue. It should be noted, however, that compressing anything has a certain amount of CPU overhead of its own, so it is worth your time to balance CPU utilization against the time required for the communication to complete. Unfortunately compression has to be configured at the server level on both Apache and IIS (Apache also allows it at the virtual host level). So if you're not using a VPS or a dedicated server, it is worth contacting your host to ask about enabling it if it isn't already. Apache uses mod_deflate to facilitate gzip compression, while IIS allows you to edit the properties of "Web Sites" and enable HTTP compression on the "Service" tab.
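If you want a feel for the size versus CPU trade-off before touching the server configuration, a quick experiment like the following can help. It is only a sketch using Python's standard gzip module, not mod_deflate or IIS, and the sample page body is made up.

```python
import gzip
import time


def measure(payload, level):
    """Compress payload at the given gzip level; return (size, seconds)."""
    start = time.perf_counter()
    compressed = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


# Hypothetical page body; repetitive markup like this compresses well.
page = b"<li class='post'><a href='/post'>Example post title</a></li>\n" * 500

# Level 1 is the fastest, level 9 spends the most CPU time for the
# smallest output.
results = {level: measure(page, level) for level in (1, 6, 9)}
```

Printing `results` next to `len(page)` shows how many bytes each level saves and roughly how long it took on your machine, which is the balance described above.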
8. Database optimization
-
Optimizing your databases is another huge step in ensuring good performance. Most of the popular content management software out there already does a great deal to optimize its database schema, but if you are developing your own application or adding functionality that requires its own tables, make sure you always create an index or primary key on any table that holds a large quantity of data, especially if it is accessed relatively frequently. If you're using MySQL, the storage engine you choose also plays a large part in the performance of the database server, and as a result can positively or negatively affect the speed of your website.
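To see what an index changes, you can ask the database how it plans to run a query before and after creating one. The sketch below uses SQLite from Python's standard library as a stand-in for MySQL, purely because it needs no server; the `comments` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (post_id, body) VALUES (?, ?)",
    [(n % 100, "comment %d" % n) for n in range(10000)],
)


def query_plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each row.
    return " ".join(row[-1] for row in rows)


lookup = "SELECT body FROM comments WHERE post_id = 42"
plan_before = query_plan(lookup)  # no index yet: a scan of the whole table
conn.execute("CREATE INDEX idx_comments_post_id ON comments (post_id)")
plan_after = query_plan(lookup)   # the plan now names the new index
```

MySQL has its own `EXPLAIN` statement that serves the same purpose, though its output looks different.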
Overall, performance really doesn't have a single solution and can be the single most frustrating aspect of developing and running a website. Even with the most optimized code and databases and the most carefully tweaked settings, finding the right combination of configuration options, hacks, plugins, and frameworks to achieve the best performance possible can feel like an insurmountable task. However, with the steps listed here (and the steps on some of the websites listed below), at least adequate performance is a goal that is well within reach.
Links: