One of my colleagues, Jim Tallant, encountered a problem with Win2k Server, running in an Application Center environment, where very high CPU utilization threatened to bring down the server farm. All servers in the farm showed CPU utilization of 60 to 70%, whereas they normally run in the 15 to 20% range. We determined that this high CPU consumption was caused by excessive time spent in the .NET garbage collection routine. The “% Time in GC” counter on these servers was in the range of 40 to 60%.
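The check we did by eye across the farm can be sketched in a few lines. This is only an illustration: it assumes the “% Time in GC” samples have already been collected per server, the helper name `flag_gc_bound_servers` and the server names are hypothetical, and the 40% threshold is simply the low end of the range we observed on affected servers, not an official guideline.

```python
from statistics import mean

# Assumed threshold: the affected servers showed 40 to 60% time in GC.
GC_TIME_ALERT_PERCENT = 40.0


def flag_gc_bound_servers(samples_by_server, threshold=GC_TIME_ALERT_PERCENT):
    """Return the servers whose average '% Time in GC' meets or exceeds the threshold."""
    return sorted(
        name
        for name, samples in samples_by_server.items()
        if samples and mean(samples) >= threshold
    )


# Made-up sample values, for illustration only.
samples = {
    "web01": [45.0, 52.0, 58.0],
    "web02": [3.0, 5.0, 4.0],
    "web03": [41.0, 40.0, 47.0],
}

print(flag_gc_bound_servers(samples))  # ['web01', 'web03']
```

Averaging over several samples, rather than reacting to a single reading, avoids flagging a server for one momentary spike.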
Strangely, misbehaving servers tend to “self-heal” after a few days, possibly because increasing memory demands force the garbage collector into some alternate mode that executes effectively and recovers the memory. A review of servers that “healed” shows an increase in the counters “private bytes - aspnet_wp.exe” and “.Net Large Object Block” in the minutes before the servers healed themselves, after which overall CPU utilization dropped back to 15 to 18%.
Ultimately, we determined that there is a bug in the memory GC routine on Win2k Server that affects servers with more than 2GB of RAM. A routine in the GC could only handle values up to 2GB, so it reported that the server had a very small amount of RAM available compared to the memory actually installed (it may have returned zero or a negative number). We tested a fix that Microsoft is due to publish in the next few weeks and confirmed that it makes the servers behave correctly. Also, since we knew that the problem was tied to servers with more than 2GB of RAM, we tested changing the server configuration to use only 2GB of RAM. This also eliminated the problem, at a very low cost to performance.
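To see how a routine could come back with zero or a negative number, here is a minimal sketch of one plausible mechanism: a byte count stored in a signed 32-bit integer. This is an assumption for illustration, since we do not know the actual code inside the GC, and `as_signed_32` is a hypothetical helper rather than anything from the .NET runtime.

```python
import ctypes

GIB = 1024 ** 3  # bytes in one gibibyte


def as_signed_32(byte_count: int) -> int:
    """Reinterpret a byte count as a signed 32-bit integer (hypothetical helper)."""
    # ctypes does no overflow checking, so out-of-range values wrap around.
    return ctypes.c_int32(byte_count).value


for installed_gib in (1, 2, 3, 4):
    actual = installed_gib * GIB
    reported = as_signed_32(actual)
    print(f"{installed_gib} GiB installed -> reported {reported} bytes")
```

Under this assumption, 1 GiB is reported correctly, 2 GiB and 3 GiB come back negative, and exactly 4 GiB comes back as zero, which is consistent with a collector that believes it is nearly out of memory and collects far too often.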
In our implementation, we made the configuration change as a temporary fix while we wait for an official patch from Microsoft, which is working on a patch intended for Win2K servers. The patch we tested was installed by hand after extracting it from a Win2K3 install. Running on 2GB of memory raises CPU usage by around 10% in relative terms at our present load (for example, from 15% to about 17%). The patch requires .Net Framework 1.1 Service Pack 1 to be installed first.
In order to understand how to make good use of the garbage collector and what performance problems you might run into when running in a garbage-collected environment, it’s important to understand the basics of how garbage collectors work and how those inner workings affect running programs.