ICEfaces
  1. ICEfaces
  2. ICE-2807

Memory Leak in Concurrent Dom View applications

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.2, 1.7Beta1
    • Fix Version/s: 1.7.1
    • Component/s: Framework
    • Labels:
      None
    • Environment:
      ICEfaces

      Description

      RIM has reported memory leaks in their application. Their testing consists of using JMeter scripts to load the application by essentially doing the following:

      - Create a new View
      - interact with the application
      - Dispose of the View, hopefully releasing resources
      - Repeat.

      They are testing against version 1.6.2, which has a great deal of relevance, but version 1.7 Beta 1 and 1.6.2 have some similarities too.


      I've tried testing timezone3 with concurrent DOM views, using JMeter and have a few observations.

      1) There does appear to be a memory leak as long as the session sticks around in 1.6.2. In their application, and they see people using multiple views for extended periods of time.

      2) The timezone demonstration applications have not kept up with best practices and have not been updated in the Concurrent DOM Views world. They flat out do the wrong thing.

      3) There's a potentially disastrous problem in our JSF usage or implementation in that several "application" scoped beans can be created the first time the application starts up, and this is particularly noticeable when using JMeter.

      4) Threads and their ThreadLocal variables. There appear to be some paths through the servlet hierarchy and perhaps the RenderManager structure that allow threads to keep their ThreadLocal references after the threads have passed. While these aren't necessarily leaks in the normal sense in that the references should be replaced later with other references, it looks like some threads can get sidelined for periods of time holding references that look like leaks. See 2 and 3.

      5) Race conditions in rendering with multiple Views open?

      1) The 1.6.2 code keeps a bunch of DOMContext objects keyed by view name in the Session record. If there is a rendering exception in the TimezoneBean, it disposes of itself somewhat permanently, and removes itself from the renderGroup only, but does not clean up the Session. It does not remove the DOMContext bundle from the Session, and these DOMContext objects contain references to all the heavies. All 1.6.2 example applications MUST call BridgeFacesContext.dispose() in addition to all other timer related activities.

      Fine, you say. How often do we get a renderingException callback in Timezone3? Well, actually, when there are several views open, it happens a lot in conjunction with the RejectionHandler being executed as shown in the following stack dump, and this happens several times in succession.

      27-Feb-2008 4:53:23 PM com.icesoft.faces.async.render.RejectionHandler rejectedExecution
      WARNING: render task was rejected: com.icesoft.faces.async.render.RunnableRender@1c836e8
      27-Feb-2008 4:53:28 PM com.icesoft.faces.webapp.parser.Parser parse
      SEVERE: Failed to execute JSP lifecycle.
      javax.servlet.jsp.JspException
      at com.sun.faces.taglib.jsf_core.ViewTag.doEndTag(ViewTag.java:235)
      at com.icesoft.faces.webapp.parser.Parser.executeJspLifecycle(Parser.java:203)
      at com.icesoft.faces.webapp.parser.Parser.parse(Parser.java:134)
      at com.icesoft.faces.application.D2DViewHandler.renderResponse(D2DViewHandler.java:543)
      at com.icesoft.faces.application.D2DViewHandler.renderView(D2DViewHandler.java:153)
      at com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:107)
      at com.sun.faces.lifecycle.LifecycleImpl.phase(LifecycleImpl.java:268)
      at com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java:137)
      at com.icesoft.faces.webapp.http.core.PageServer$1.respond(PageServer.java:26)
      at com.icesoft.faces.webapp.http.servlet.ServletRequestResponse.respondWith(ServletRequestResponse.java:150)



      This appears to be the major memory leak. This isn't as big a deal in the Head, because the entire DOMContext bundle is not used at all in the same way. There's no view specific information kept in the session record.


      #2) Clean up TimezoneBean. This gets really confusing. The TimezoneBean is in JSF request scope, except ICEfaces co-opts this, and the TimezoneBeans are essentially session scope beans since they don't normally go away until the session expires, unless the browser or JMeter sends dispose-views. At this stage, the TimezoneBean should never ever call requestStop and clock.dispose(). If it does, the clock stops for all other concurrent views, and the IntervalRenderer supporting this group of Views is also disposed. Note that if any of these renderThreads hold a ThreadLocal to a PersistentFacesState object, it will be held as long as the IntervalRenderer is stopped. This isn't a memory leak because it is possible to flush out all the references, it's just a bit of a sidetrack. Stopping the clock though is bad. If you could determine that you were the last bean in the RenderGroup, maybe you could stop it, but it's best to just leave it running at the moment.

       protected boolean performCleanup() {
              state = null;
              try {
                  if (clock != null) {
      // clock.requestStop();
                      clock.remove(this);
      // clock.dispose();
                      clock = null;
                  }

                  return true;
              } catch (Exception failedCleanup) {
                  System.out.println("Yup exception. " + failedCleanup);
                  failedCleanup.printStackTrace();

              }
              return false;
          }


      #3) Here's a log trace of Timezone 3 using the Head and JMeter with 10 clients starting over 10 seconds. It's not too bad, we only get 4 RenderManager instances. In 1.6.2 I'd get 7 or 8 because it takes forever to instantiate the RenderManager. So that's a bad race condition. Ted says that this is likely our meddling with things, but I don't know any details. Over time, 3 of these RenderManager instances and the associated RenderHubs are GC'd. BUT Not the RenderThreads. They persist for the lifetime of the application. So one RenderHub wins, and its' threads are the ones that are used, but in this case, the other (3 * 15 = 45 ) Threads persist until the server is shut down. If they ever executed a RunnableRender, then they have a ThreadLocal reference that is never released.

      class: javax/servlet/Servlet.class
      29-Feb-2008 3:20:42 PM com.icesoft.faces.application.D2DViewHandler <clinit>
      INFO:
      ICEsoft Technologies, Inc.
      ICEfaces x.x.x
      Build number: x
      Revision: x

      29-Feb-2008 3:20:43 PM org.apache.coyote.http11.Http11BaseProtocol start
      INFO: Starting Coyote HTTP/1.1 on http-8080
      29-Feb-2008 3:20:44 PM org.apache.jk.common.ChannelSocket init
      INFO: JK: ajp13 listening on /0.0.0.0:8009
      29-Feb-2008 3:20:44 PM org.apache.jk.server.JkMain start
      INFO: Jk running ID=0 time=0/31 config=null
      29-Feb-2008 3:20:44 PM org.apache.catalina.storeconfig.StoreLoader load
      INFO: Find registry server-registry.xml at classpath resource
      29-Feb-2008 3:20:44 PM org.apache.catalina.startup.Catalina start
      INFO: Server startup in 5047 ms
      29-Feb-2008 3:20:55 PM com.icesoft.faces.async.render.RenderHub createScheduledService
      INFO: scheduled render service created:
        core pool size : 5
      29-Feb-2008 3:20:55 PM com.icesoft.faces.async.render.RenderHub createScheduledService
      INFO: scheduled render service created:
        core pool size : 5
      29-Feb-2008 3:20:55 PM com.icesoft.faces.async.render.RenderHub createScheduledService
      INFO: scheduled render service created:
        core pool size : 5
      29-Feb-2008 3:20:55 PM com.icesoft.faces.async.render.RenderHub createScheduledService
      INFO: scheduled render service created:
        core pool size : 5
      29-Feb-2008 3:20:56 PM com.icesoft.faces.async.render.RenderHub createCoreService
      INFO: core render service created:
        core pool size : 10
        max pool size : 15
        keep alive time: 300000
      29-Feb-2008 3:20:56 PM com.icesoft.faces.async.render.RenderHub createCoreService
      INFO: core render service created:
        core pool size : 10
        max pool size : 15
        keep alive time: 300000
      29-Feb-2008 3:20:56 PM com.icesoft.faces.async.render.RenderHub createCoreService
      INFO: core render service created:
        core pool size : 10
        max pool size : 15
        keep alive time: 300000
      29-Feb-2008 3:20:56 PM com.icesoft.faces.async.render.RenderHub createCoreService
      INFO: core render service created:
        core pool size : 10
        max pool size : 15
        keep alive time: 300000



      #4) Every time I see an accumulation of objects, the culprit is instances of PersistentFacesState being kept around in ThreadLocal references. JPRofiler doesn't let me find which threads are holding them, and all the ThreadLocals should be reused, over time. I have seen a large number of really odd ThreadLocal references though. From Catalina.shutdownHook objects to logFlushers, each with a ThreadLocal reference to PersistentFacesState. These too, I think go away over time, and it's just the nature of the JProfiler snapshot that catches them, I think.

      We've put in disposing of the ThreadLocal references in RunnableRender after the lifecycle is executed and it has reduced the ThreadLocal reference to PFState from 15 to 1, but always, it seems, 1 remains. I'm not sure where that's being held. I think too, it is possible for HTTP threads to obtain ThreadLocal references and for the threads to skip their clearing code if rendering exceptions occur (which we've seen in 1). THere's no guarantee how often HTTP threads are going to be used, so it's possible for a Thread to sit idle for 'some time' during which the PFState object is held in memory.

      Again, all of the references in 4 'should' be flushed over time and aren't necessarily pure evil. Just a distraction.

      Ted mentioned asserting that the ThreadLocals on entry and exit of the MainServlet service methods and the RunnableRender methods are null, and catching them this way. This will have to be done with introspection though, as all the methods in the ThreadLocalMap are private.

        Activity

        Hide
        Ken Fyten added a comment -

        Last commit to 1.6 branch is incomplete (uses a newer version of a file from 1.7, should use the 1.6 branch version)

        Date of build: 2008-02-29 16:41:56
        /home/builder/cruise-builds/checkout/icefaces-continuous/icefaces-1.6/icefaces/core/build.xml:178: Compile failed; see the compiler error output for details.

        /home/builder/cruise-builds/checkout/icefaces-continuous/icefaces-1.6/icefaces/core/src/com/icesoft/faces/async/render/RunnableRender.java:142: cannot find symbol
        symbol : method executeAndRender()
        location: class com.icesoft.faces.webapp.xmlhttp.PersistentFacesState
        state.executeAndRender();

        Show
        Ken Fyten added a comment - Last commit to 1.6 branch is incomplete (uses a newer version of a file from 1.7, should use the 1.6 branch version) Date of build: 2008-02-29 16:41:56 /home/builder/cruise-builds/checkout/icefaces-continuous/icefaces-1.6/icefaces/core/build.xml:178: Compile failed; see the compiler error output for details. /home/builder/cruise-builds/checkout/icefaces-continuous/icefaces-1.6/icefaces/core/src/com/icesoft/faces/async/render/RunnableRender.java:142: cannot find symbol symbol : method executeAndRender() location: class com.icesoft.faces.webapp.xmlhttp.PersistentFacesState state.executeAndRender();
        Hide
        Deryk Sinotte added a comment -

        I've rolled back the change to the RunnableRender class and re-assigned to Greg. He can check in the complete change when he gets back.

        Show
        Deryk Sinotte added a comment - I've rolled back the change to the RunnableRender class and re-assigned to Greg. He can check in the complete change when he gets back.
        Hide
        Greg Dick added a comment -

        All the bits about multiple RenderManager instances should now be looked after in http://jira.icefaces.org//browse/ICE-2841

        Show
        Greg Dick added a comment - All the bits about multiple RenderManager instances should now be looked after in http://jira.icefaces.org//browse/ICE-2841
        Hide
        Greg Dick added a comment -

        This shot shows the number of objects allocated after 90 minutes. It shows a growth, but
        the growth is not anywhere near the number of views created.

        Show
        Greg Dick added a comment - This shot shows the number of objects allocated after 90 minutes. It shows a growth, but the growth is not anywhere near the number of views created.
        Hide
        Greg Dick added a comment -

        Here is the heap size over a 90 minute period showing very little in the way of leaking.

        Show
        Greg Dick added a comment - Here is the heap size over a 90 minute period showing very little in the way of leaking.
        Hide
        Greg Dick added a comment -

        Looking at the framework as it sits today, it looks pretty good. I'm running Timezone 3 application in Jmeter with 10 clients. I create a view do one request in the window, and dispose of the views. In the attached screen shots, I see some object count growth over 90 minutes, but it's no where near the leaking an object of any kind per cycle, since the above cycle takes about 10 seconds and the server has gone through 6800 rendering cycles in this timeframe.

        Show
        Greg Dick added a comment - Looking at the framework as it sits today, it looks pretty good. I'm running Timezone 3 application in Jmeter with 10 clients. I create a view do one request in the window, and dispose of the views. In the attached screen shots, I see some object count growth over 90 minutes, but it's no where near the leaking an object of any kind per cycle, since the above cycle takes about 10 seconds and the server has gone through 6800 rendering cycles in this timeframe.
        Hide
        Greg Dick added a comment -

        RIM has some more problems accumulating View objects. Debugging the problem with the YourKit profiler shows that it is ThreadLocal references to the BridgeFacesContext that are accumulating. So I've checked in a change to the release() method to zero out the ThreadLocal reference there, and it works.

        There are some exit paths in PersistentFacesState execute() and render() methods that also likely require thread locals to be cleared as well. Some clearing of ThreadLocals done inside try-catch blocks but not in a finally clause, and clearing only in render(), which may not be called if responseComplete is true.

        Show
        Greg Dick added a comment - RIM has some more problems accumulating View objects. Debugging the problem with the YourKit profiler shows that it is ThreadLocal references to the BridgeFacesContext that are accumulating. So I've checked in a change to the release() method to zero out the ThreadLocal reference there, and it works. There are some exit paths in PersistentFacesState execute() and render() methods that also likely require thread locals to be cleared as well. Some clearing of ThreadLocals done inside try-catch blocks but not in a finally clause, and clearing only in render(), which may not be called if responseComplete is true.
        Hide
        Greg Dick added a comment -

        Checked in new version of RunnableRender for the head and the 1.7.1 branch. It wasn't clearing up ThreadLocal references to PersistentFacesState in all code paths.

        Show
        Greg Dick added a comment - Checked in new version of RunnableRender for the head and the 1.7.1 branch. It wasn't clearing up ThreadLocal references to PersistentFacesState in all code paths.
        Hide
        Greg Dick added a comment -

        I've created a new utility class ThreadLocalUtility, with a static method to check that the ThreadLocal instances in the framework are null. This can be used to verify cleanup is still taking place.

        The places to call these methods are from someplace outbound in the Servlet path, and someplace outbound in the Server push path. Outbound in this case meaning after the lifecycle is complete and responses are on their way to the client. I've added a call to this utility method to the RunnableRender class in the run() method and to turn it on, you need to set TRACE level logging on for the com.icesoft.faces.async.render package.

        There is no call to this method currently from the server path, and that's because it's easy to do the same thing from a filter that we can add in a test application. This approach isn't available for the RunnableRender however.

        This utility method currently shows that the BridgeFacesContext and PersistentFacesState thread local variables are clear at these outbound locations. Any new threadLocal references should be added to the test, and any alternate thread outbound paths should also be vetted. FileUploadServlet?

        Show
        Greg Dick added a comment - I've created a new utility class ThreadLocalUtility, with a static method to check that the ThreadLocal instances in the framework are null. This can be used to verify cleanup is still taking place. The places to call these methods are from someplace outbound in the Servlet path, and someplace outbound in the Server push path. Outbound in this case meaning after the lifecycle is complete and responses are on their way to the client. I've added a call to this utility method to the RunnableRender class in the run() method and to turn it on, you need to set TRACE level logging on for the com.icesoft.faces.async.render package. There is no call to this method currently from the server path, and that's because it's easy to do the same thing from a filter that we can add in a test application. This approach isn't available for the RunnableRender however. This utility method currently shows that the BridgeFacesContext and PersistentFacesState thread local variables are clear at these outbound locations. Any new threadLocal references should be added to the test, and any alternate thread outbound paths should also be vetted. FileUploadServlet?

          People

          • Assignee:
            Unassigned
            Reporter:
            Greg Dick
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: