Web Trenches

Resolve Stability Problems and SPEED UP ColdFusion 10



I've been in a long struggle with the default install of ColdFusion 10. The main issue has been stability when running multiple IIS websites off one instance of ColdFusion. I don't know why Adobe doesn't build this right into the installer or the connector, but they did post a blog article about it. Here are the changes you need to make to your configuration to stop the Tomcat-IIS connector from crashing consistently.

  1. Change (or add) the following lines in your workers.properties file.  That file is located at <cf-installation-path>\config\wsconfig\ in the folder number associated with your website.  If you chose ALL websites when installing ColdFusion, then the number is most likely 1.  

    worker.cfusion.max_reuse_connections=250
    worker.cfusion.connection_pool_size=500
    worker.cfusion.connection_pool_timeout=60

    Two important points here. First, if you created a separate instance of CF, then "cfusion" in the lines above should be replaced by your instance name. Second, you should follow this formula for the first two numbers: max_reuse_connections = connection_pool_size / number of IIS sites. So, in the example above I have it set for two sites (250 = 500 / 2 sites). If you had 20 sites on an instance named "academics", you could do something like this (50 = 1000 / 20 sites)…

    worker.academics.max_reuse_connections=50
    worker.academics.connection_pool_size=1000
    worker.academics.connection_pool_timeout=60  

    Adobe's blog post explains more about what these numbers mean. If you did not set ColdFusion to connect to ALL IIS sites when installing, be sure to set these lines for EVERY site that is connected to your instance. Each one should have its own numbered folder in the wsconfig folder.
     

  2. Edit the server.xml file in <cf-installation-path>\cfusion\runtime\conf (or <cf-installation-path>\<instance-name>\runtime\conf if you created a separate instance).

    There is a line that corresponds to the settings above.  
    <Connector port="8013" protocol="AJP/1.3" redirectPort="8446" tomcatAuthentication="false" maxThreads="500" connectionTimeout="60000" />

    The last two attributes, maxThreads and connectionTimeout, are the items to add. These should correspond directly to the connection_pool_size and connection_pool_timeout settings you entered in the workers.properties file. maxThreads is your connection_pool_size, and connectionTimeout is your connection_pool_timeout times 1000 (it's in milliseconds here, so 60 seconds becomes 60000).

After you have completed these two steps, restart your ColdFusion instance and you should be all set.
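
The arithmetic in the two steps above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb from this post, not an Adobe tool; the function name connector_settings and its parameters are made up for the example.

```python
def connector_settings(pool_size, site_count, timeout_seconds=60):
    """Derive matching workers.properties and server.xml values.

    Hypothetical helper: it only encodes the formula described in this post.
    """
    if site_count < 1:
        raise ValueError("site_count must be at least 1")
    return {
        # workers.properties values
        "max_reuse_connections": pool_size // site_count,
        "connection_pool_size": pool_size,
        "connection_pool_timeout": timeout_seconds,
        # server.xml <Connector> attributes
        "maxThreads": pool_size,
        "connectionTimeout": timeout_seconds * 1000,  # milliseconds
    }


two_sites = connector_settings(pool_size=500, site_count=2)
twenty_sites = connector_settings(pool_size=1000, site_count=20)
print(two_sites["max_reuse_connections"], twenty_sites["max_reuse_connections"])
```

For the two-site and twenty-site examples in this post, this gives max_reuse_connections of 250 and 50, with maxThreads matching the pool size in each case.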

Please note that any time you re-run the ColdFusion connector configuration, your workers.properties file will be reset and you will have to re-add these settings.
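
Because the settings get wiped, it may be worth scripting the re-add. Below is a minimal sketch in Python, assuming the file consists of plain key=value lines like the examples above. The helper name apply_worker_settings is hypothetical, and reading, backing up, and writing the actual file is left to you.

```python
def apply_worker_settings(text, instance, settings):
    """Return workers.properties text with the given settings set.

    Hypothetical helper: existing worker.<instance>.<name> lines are
    updated in place, and any missing ones are appended at the end.
    """
    prefix = f"worker.{instance}."
    remaining = dict(settings)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        name = key[len(prefix):] if key.startswith(prefix) else None
        if name in remaining:
            # Overwrite the existing value for this setting.
            out.append(f"{key}={remaining.pop(name)}")
        else:
            out.append(line)
    # Append settings that were not present in the file.
    for name, value in remaining.items():
        out.append(f"{prefix}{name}={value}")
    return "\n".join(out) + "\n"


original = (
    "worker.list=cfusion\n"
    "worker.cfusion.type=ajp13\n"
    "worker.cfusion.connection_pool_size=100\n"
)
wanted = {
    "max_reuse_connections": 250,
    "connection_pool_size": 500,
    "connection_pool_timeout": 60,
}
updated = apply_worker_settings(original, "cfusion", wanted)
print(updated)
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to run after every connector reset.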

BONUS!

I noticed that the sites I changed these settings on ran MUCH faster than the sites that were running on their own instance.  So, I decided to try adding these settings to single-site instances as well.  Surprisingly, this drastically sped up ColdFusion's performance.  So, for a dedicated server using one IIS website and one instance of ColdFusion I added the following lines.

worker.academics.max_reuse_connections=250
worker.academics.connection_pool_size=250
worker.academics.connection_pool_timeout=60

… and corresponding server.xml file setting…
<Connector port="8013" protocol="AJP/1.3" redirectPort="8446" tomcatAuthentication="false" maxThreads="250" connectionTimeout="60000" />

I saw an instant and significant performance improvement. I am not entirely sure why, but it probably has something to do with Tomcat cycling connections every 60 seconds instead of holding them open.

19 Replies to “Resolve Stability Problems and SPEED UP ColdFusion 10”

  1. Whilst you mention IIS in your first paragraph, and given that wsconfig doesn’t offer an “All Sites” option when Apache is selected, do you know if this advice also applies to multi-site/multi-instance ColdFusion setups that use the Apache web server?

  2. The connectors for Apache are different, so I don’t believe this applies. The article on the Adobe CF blog does not mention Apache, and I didn’t see anyone on the Adobe forums complaining about this problem with Apache.

  3. Thanks for your thoughts Michael. I agree but I wanted your take on it as you’ve obviously looked into the issue very deeply. I’ve cross-posted to the Adobe blog in the hope of getting a definitive answer from Kiran.

  4. I did this and everything seemed good for weeks. Then, today, I started getting, “Service Temporary Unavailable! The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later. Jakarta/ISAPI/isapi_redirector/1.2.32 ()”

    After attempting to restart CF, restart the server, and delete and re-add the connector, none of which worked, I stumbled upon:

    https://bugbase.adobe.com/index.cfm?event=bug&id=3318104

    …restarted “World Wide Web Publishing Service” (i.e. IIS) and it was fixed. Scary and bad.

    Do the suggestions from this post try to stave off this phenomenon? Or is this unrelated?

  5. @John – this does not sound like the same issue to me. If you run out of threads, recycling ColdFusion would cycle tomcat and release the threads. I’ve never run into the issue you describe, where an IIS restart is required. I would examine the Windows event viewer to look for crashes in IIS.

  6. Mike,

    Do you know if this problem could cause random sessions to be lost? Since we migrated to CF10, we’ve been seeing scenarios where our admins will be logged in, browse to a few pages, and then get kicked out. There was never any issue with the login, and we’ve gone back and confirmed that it’s working fine. There’s no real pattern to it and it’s difficult to replicate.

    We’re running the same setup as you, only 25 websites. Does it sound like a Tomcat thread issue?

    Also, we’ve made these changes on our dev and test environments and have noticed that some of our slow running pages are running fine and consistent now.

  7. I don’t think that is related, but I have seen that behavior. It usually follows a CF error, particularly one that is Java-related. I think it somehow drops the session due to the error. I’d check your coldfusion-out and application logs for errors around the same time the session gets dropped.

  8. Sir, you’re a genius – I was so sick of having to manually terminate the worker threads – After implementing your fix we hardly ever have to restart anything – Thanks so much!

  9. THANK YOU!! I was having this same issue but on OSX 10.9 on Apache 2.4 and CF11. This resolved ALL of my reached max clients issues. The only adjustment I made from your code was I also put a / after “60000” or it gives an error message.

  10. I have two mirror servers, and I have 3 ColdFusion-based websites hosted on an IIS server. I was getting the “Service Temporary Unavailable!” error quite frequently. Restarting the IIS server fixed it temporarily, but the error kept coming back. I have now stumbled upon your fix and made the changes on the primary server. My question to you is: do I need to implement them on the mirror server as well?
    Thanks

  11. Radhika – It’s difficult to answer this question without knowing how your mirror configuration replicates files back and forth. If the workers.properties and server.xml files are not replicated automatically, then yes, you would need to do this in both locations.

  12. Hi Michael,

    Regarding the issue you described above (“I don’t think that is related, but I have seen that behavior. It usually follows a CF error, particularly one that is Java-related. I think it somehow drops the session due to the error. I’d check your coldfusion-out and application logs for errors around the same time the session gets dropped.”), we are facing the same issue here but can’t find a solution. Do you have any suggestions?

  13. @SHI CAI – In my experience, this happens when CF consumes too much RAM. For example, it happened to me when I had a bug which caused CF to attempt to instantiate tens of thousands of objects.

  14. @JOHN BLISS – Thanks for your explanation. May I ask what bug you were facing? This error has only occurred since I migrated the application from CF8 to CF10.

  15. @SHI CAI – Bug was specific to my application. Look for cases where your application is attempting to instantiate a LOT of objects, build an array with millions of rows, etc…

  16. We have set up 3 instances of CF for our 3 different websites, but we are still seeing lots of issues. After a couple of hours, users see that services are not available.

    worker.list=cfusion
    worker.cfusion.type=ajp13
    worker.cfusion.host=localhost
    worker.cfusion.port=8014
    worker.cfusion.max_reuse_connections=5000
    worker.cfusion.connection_pool_size=15000
    worker.cfusion.connection_pool_timeout=60

    worker.list=FR

    worker.FR.type=ajp13
    worker.FR.host=localhost
    worker.FR.port=8012
    worker.FR.max_reuse_connections=5000
    worker.FR.connection_pool_size=15000
    worker.FR.connection_pool_timeout=60

    worker.list=NR

    worker.NR.type=ajp13
    worker.NR.host=localhost
    worker.NR.port=8013
    worker.NR.max_reuse_connections=5000
    worker.NR.connection_pool_size=15000
    worker.NR.connection_pool_timeout=60

  17. @SNAIK – Three things jump out at me.

    First, dividing the total connections up applies only when you are running multiple sites on ONE instance. You are running three separate instances, so this would not apply to you. You would want your max reuse and your max pool size to be the same number.

    Second, shouldn’t your connection_pool_size and your maxThreads be the same number?

    Third, these connection counts seem excessively high. You are allowing 15000 (or more) connections to be held for up to 60 seconds each. I don’t know anything about your server load or capabilities, but it’s unlikely that your server can handle that many pooled connections.

    Try this…
    worker.list=cfusion
    worker.cfusion.type=ajp13
    worker.cfusion.host=localhost
    worker.cfusion.port=8014
    worker.cfusion.max_reuse_connections=1000
    worker.cfusion.connection_pool_size=1000
    worker.cfusion.connection_pool_timeout=60

    Use the same numbers for all three instances.
