ICEpush
  1. ICEpush
  2. PUSH-364

Stuck non-container thread using Tomcat 7/8 ARP

    Details

    • Assignee Priority:
      P3
    • Support Case References:
    • Workaround Description:
      Hide
      suggested workarounds- but as of yet unconfirmed--use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
      57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
      57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
      OR
      disable using tomcat ARP with the following context-param:-
      <context-param>
      <param-name>org.icepush.useAsyncContext</param-name>
      <param-value>false</param-value>
      </context-param>
      Show
      suggested workarounds- but as of yet unconfirmed--use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:- 57779 < http://bz.apache.org/bugzilla/show_bug.cgi?id=57779 > : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt) 57621 < http://bz.apache.org/bugzilla/show_bug.cgi?id=57621 > : When an async request completes, ensure that any remaining request body data is swallowed. (markt) OR disable using tomcat ARP with the following context-param:- <context-param> <param-name>org.icepush.useAsyncContext</param-name> <param-value>false</param-value> </context-param>

      Description

       "Notification queue consumer." daemon prio=3 tid=0x00000001014f6800 nid=0x17 in Object.wait() [0xffffffff713fe000]
      java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      at java.lang.Object.wait(Object.java:503)
      at org.apache.coyote.AsyncStateMachine.pauseNonContainerThread(AsyncStateMachine.java:382)
      - eliminated <0xfffffffe5f76aae0> (a org.apache.coyote.AsyncStateMachine)
      at org.apache.coyote.AsyncStateMachine.asyncComplete(AsyncStateMachine.java:240)
      - locked <0xfffffffe5f76aae0> (a org.apache.coyote.AsyncStateMachine)
      at org.apache.coyote.http11.Http11Processor.actionInternal(Http11Processor.java:356)
      at org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:906)
      at org.apache.coyote.Request.action(Request.java:344)
      at org.apache.catalina.core.AsyncContextImpl.complete(AsyncContextImpl.java:93)
      at org.icepush.servlet.AsyncAdaptingServlet$AsyncRequestResponse.respondWith(AsyncAdaptingServlet.java:74)at org.icepush.SequenceTaggingServer$TaggingRequest.respondWith(SequenceTaggingServer.java:67)at org.icepush.BlockingConnectionServer.respondIfPendingRequest(BlockingConnectionServer.java:177)
      at org.icepush.BlockingConnectionServer.respondIfNotificationsAvailable(BlockingConnectionServer.java:149)
      - locked <0xfffffffe69880ba0> (a org.icepush.BlockingConnectionServer)
      at org.icepush.BlockingConnectionServer.sendNotifications(BlockingConnectionServer.java:126)
      at org.icepush.BlockingConnectionServer.receive(BlockingConnectionServer.java:293)
      at org.icepush.LocalNotificationBroadcaster.broadcast(LocalNotificationBroadcaster.java:39)
      at org.icepush.LocalPushGroupManager$Notification.run(LocalPushGroupManager.java:394)
      at org.icepush.LocalPushGroupManager$QueueConsumerTask.run(LocalPushGroupManager.java:432)
      at java.util.TimerThread.mainLoop(Timer.java:555)
      at java.util.TimerThread.run(Timer.java:505)

      The attempt is being made for an incoming request to establish a new blocking connection but it just piles up waiting for the lock.

      What’s unclear is why the “Notification queue consumer.” thread is having trouble completing and/or releasing the lock.
      On our end, we should be looking into the current locking strategy being used by BlockConnectionServer's respondIfBackOffRequested() and respondIfNotificationsAvailable() and possible coming up with a different locking strategy that avoid any future pile-ups of Threads in State BLOCKED.

        Activity

        Judy Guglielmin created issue -
        Judy Guglielmin made changes -
        Field Original Value New Value
        Workaround Description use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
        57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
        57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
        Judy Guglielmin made changes -
        Workaround Description use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
        57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
        57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
        use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
        57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
        57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
        OR
        disable using tomcat ARP with the following context-param:-
        <context-param>
        <param-name>org.icepush.useAsyncContext</param-name>
        <param-value>false</param-value>
        </context-param>
        Hide
        Judy Guglielmin added a comment - - edited

        It looks like org.apache.coyote.AsyncStateMachine.pauseNonContainerThread method is pausing the ICEpush (timer) thread responsible for emptying (and dispatching) the push notifications. (from Mircea's email July 20th) - further pointing to the tomcat implementation.

        Show
        Judy Guglielmin added a comment - - edited It looks like org.apache.coyote.AsyncStateMachine.pauseNonContainerThread method is pausing the ICEpush (timer) thread responsible for emptying (and dispatching) the push notifications. (from Mircea's email July 20th) - further pointing to the tomcat implementation.
        Judy Guglielmin made changes -
        Assignee Jack Van Ooststroom [ jack.van.ooststroom ]
        Judy Guglielmin made changes -
        Workaround Description use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
        57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
        57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
        OR
        disable using tomcat ARP with the following context-param:-
        <context-param>
        <param-name>org.icepush.useAsyncContext</param-name>
        <param-value>false</param-value>
        </context-param>
        suggested workarounds- but as of yet unconfirmed--use newer tomcat versions 7.0.62 or newer or 8.0.22 or newer that include the following 2 apache-tomcat fixes:-
        57779 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57779> : When an I/O error occurs on a non-container thread only dispatch to a container thread to handle the error if using Servlet 3+ asynchronous processing. This avoids potential deadlocks if an application is performing I/O on a non-container thread without using the Servlet 3+ asynchronous API. (markt)
        57621 <http://bz.apache.org/bugzilla/show_bug.cgi?id=57621> : When an async request completes, ensure that any remaining request body data is swallowed. (markt)
        OR
        disable using tomcat ARP with the following context-param:-
        <context-param>
        <param-name>org.icepush.useAsyncContext</param-name>
        <param-value>false</param-value>
        </context-param>
        Ken Fyten made changes -
        Fix Version/s 4.1 [ 11570 ]
        Fix Version/s EE-3.3.0.GA_P04 [ 12271 ]
        Assignee Priority P3 [ 10012 ]
        Hide
        Jack Van Ooststroom added a comment - - edited

        On the BlockConnectionServer's respondIfBackOffRequested() and respondIfNotificationsAvailable() locking front, I had a closer look at our code. Originally we theorized that having them use different locks might prevent the thread pile-up happening behind the stuck thread. However, it seems only to possibly delay it and not prevent it.

        As both methods use the synchronized keyword and thus use the BlockingConnectionServer instance as the lock it seemed reasonable to introduce separate locks for both looking at the implementations of these methods. However, when handling an incoming listen.icepush request the code dealing with this uses both mentioned methods:

                        if (!respondIfBackOffRequested()) {
                            if (!sendNotifications(pushGroupManager.getPendingNotifications())) {
                                if (resend) {
                                    resendLastNotifications();
                                } else {
                                    respondIfNotificationsAvailable();
                                }
                            }
                        }
        

        Generally a back-off is not requested and a resend is not necessary ensuring it eventually hits the respondIfNotificationsAvailable() method leading to the eventual thread pile-up.

        Show
        Jack Van Ooststroom added a comment - - edited On the BlockConnectionServer's respondIfBackOffRequested() and respondIfNotificationsAvailable() locking front, I had a closer look at our code. Originally we theorized that having them use different locks might prevent the thread pile-up happening behind the stuck thread. However, it seems only to possibly delay it and not prevent it. As both methods use the synchronized keyword and thus use the BlockingConnectionServer instance as the lock it seemed reasonable to introduce separate locks for both looking at the implementations of these methods. However, when handling an incoming listen.icepush request the code dealing with this uses both mentioned methods: if (!respondIfBackOffRequested()) { if (!sendNotifications(pushGroupManager.getPendingNotifications())) { if (resend) { resendLastNotifications(); } else { respondIfNotificationsAvailable(); } } } Generally a back-off is not requested and a resend is not necessary ensuring it eventually hits the respondIfNotificationsAvailable() method leading to the eventual thread pile-up.
        Ken Fyten made changes -
        Fix Version/s EE-4.1.0.GA [ 12172 ]
        Fix Version/s 4.1 [ 11570 ]
        Hide
        Leon Steffens added a comment -

        I can confirm we are hitting the same issue on both Tomcat 7.0.67 and Tomcat 8.0.30, using EE-3.0.0.GA_P03, so it's not fixed in the latest versions of Tomcat.

        Disabling the async context works but obviously falls back to the threadblocking server. I'll see if I can isolate the part of our test that consistently fails with this error and submit it.

        Show
        Leon Steffens added a comment - I can confirm we are hitting the same issue on both Tomcat 7.0.67 and Tomcat 8.0.30, using EE-3.0.0.GA_P03, so it's not fixed in the latest versions of Tomcat. Disabling the async context works but obviously falls back to the threadblocking server. I'll see if I can isolate the part of our test that consistently fails with this error and submit it.
        Hide
        Arran Mccullough added a comment -

        Attached test case that reproduces this behavior.

        Steps:

        • Click on the "Test AsyncFlood" link between 10 and 30 tries. When the issue happens it won't show any further push updates no matter how many clicks of the link.
        • I was able to reproduce it with a Tomcat 8.0.23 instance.
        • I uploaded the thread dump before and after the issue happened.
        Show
        Arran Mccullough added a comment - Attached test case that reproduces this behavior. Steps: Click on the "Test AsyncFlood" link between 10 and 30 tries. When the issue happens it won't show any further push updates no matter how many clicks of the link. I was able to reproduce it with a Tomcat 8.0.23 instance. I uploaded the thread dump before and after the issue happened.
        Arran Mccullough made changes -
        Attachment Case13690Example.war [ 22054 ]
        Attachment Case13690Example.zip [ 22055 ]
        Attachment after-thread-dump.txt [ 22056 ]
        Attachment before-thread-dump.txt [ 22057 ]
        Hide
        Jack Van Ooststroom added a comment -

        Detailed Problem Description

        When hitting this issue, the thread dump will show a number of threads of interest as shown below:

        A single non-container thread
        "Notification queue consumer." daemon prio=10 tid=0x0a895400 nid=0x43ef in Object.wait() [0xa119e000]
           java.lang.Thread.State: WAITING (on object monitor)
                at java.lang.Object.wait(Native Method)
                - waiting on <0xa8723010> (a org.apache.coyote.AsyncStateMachine)
                at java.lang.Object.wait(Object.java:503)
                at org.apache.coyote.AsyncStateMachine.pauseNonContainerThread(AsyncStateMachine.java:422)
                - locked <0xa8723010> (a org.apache.coyote.AsyncStateMachine)
                at org.apache.coyote.AsyncStateMachine.asyncComplete(AsyncStateMachine.java:256)
                - locked <0xa8723010> (a org.apache.coyote.AsyncStateMachine)
                at org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:849)
                at org.apache.coyote.Request.action(Request.java:378)
                at org.apache.catalina.core.AsyncContextImpl.complete(AsyncContextImpl.java:86)
                at org.icepush.servlet.AsyncAdaptingServlet$AsyncRequestResponse.respondWith(AsyncAdaptingServlet.java:109)
                at com.icesoft.icepush.AdjustHeartbeatServer$HeartbeatRequest.respondWith(AdjustHeartbeatServer.java:67)
                at org.icepush.SequenceTaggingServer$TaggingRequest.respondWith(SequenceTaggingServer.java:67)
                at org.icepush.BlockingConnectionServer.respondIfPendingRequest(BlockingConnectionServer.java:200)
                at org.icepush.BlockingConnectionServer.respondIfNotificationsAvailable(BlockingConnectionServer.java:163)
                - locked <0xa201e860> (a org.icepush.BlockingConnectionServer)
                at org.icepush.BlockingConnectionServer.sendNotifications(BlockingConnectionServer.java:138)
                at org.icepush.BlockingConnectionServer.receive(BlockingConnectionServer.java:338)
                at org.icepush.LocalNotificationBroadcaster.broadcast(LocalNotificationBroadcaster.java:39)
                at org.icepush.LocalPushGroupManager$Notification.run(LocalPushGroupManager.java:502)
                at org.icepush.LocalPushGroupManager$QueueConsumerTask.run(LocalPushGroupManager.java:611)
                at java.util.TimerThread.mainLoop(Timer.java:555)
                at java.util.TimerThread.run(Timer.java:505)
        

        and

        Multiple container threads
        "http-nio-8080-exec-3" daemon prio=10 tid=0x0a299400 nid=0x43f2 waiting for monitor entry [0xa10ab000]
           java.lang.Thread.State: BLOCKED (on object monitor)
                at org.icepush.BlockingConnectionServer.respondIfBackOffRequested(BlockingConnectionServer.java:148)
                - waiting to lock <0xa201e860> (a org.icepush.BlockingConnectionServer)
                at org.icepush.BlockingConnectionServer.access$1500(BlockingConnectionServer.java:39)
                at org.icepush.BlockingConnectionServer$RunningServer.service(BlockingConnectionServer.java:392)
                at org.icepush.BlockingConnectionServer.service(BlockingConnectionServer.java:99)
                at org.icepush.SequenceTaggingServer.service(SequenceTaggingServer.java:54)
                at com.icesoft.icepush.AdjustHeartbeatServer.service(AdjustHeartbeatServer.java:54)
                at org.icepush.ConfigurationServer.service(ConfigurationServer.java:109)
                at org.icepush.servlet.AsyncAdaptingServlet.service(AsyncAdaptingServlet.java:52)
                at org.icepush.servlet.EnvironmentAdaptingServlet.service(EnvironmentAdaptingServlet.java:56)
                at org.icepush.servlet.PathDispatcher.service(PathDispatcher.java:39)
                at org.icepush.servlet.BrowserDispatcher$BrowserEntry.service(BrowserDispatcher.java:101)
                at org.icepush.servlet.BrowserDispatcher.service(BrowserDispatcher.java:45)
                at org.icepush.servlet.PathDispatcher.service(PathDispatcher.java:39)
                at org.icepush.servlet.MainServlet.service(MainServlet.java:129)
                at org.icefaces.impl.push.servlet.ICEpushResourceHandler$ICEpushResourceHandlerImpl.handleResourceRequest(ICEpushResourceHandler.java:235)
                at org.icefaces.impl.push.servlet.ICEpushResourceHandler.handleResourceRequest(ICEpushResourceHandler.java:128)
                at org.icefaces.impl.application.SessionTimeoutMonitor.handleSessionAwareResourceRequest(SessionTimeoutMonitor.java:63)
                at org.icefaces.impl.application.SessionAwareResourceHandlerWrapper.handleResourceRequest(SessionAwareResourceHandlerWrapper.java:40)
                at javax.faces.application.ResourceHandlerWrapper.handleResourceRequest(ResourceHandlerWrapper.java:125)
                at org.icefaces.impl.application.AuxUploadResourceHandler.handleResourceRequest(AuxUploadResourceHandler.java:82)
                at javax.faces.webapp.FacesServlet.service(FacesServlet.java:591)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
                at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
                at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
                at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
                at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
                at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
                at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:617)
                at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
                at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:518)
                at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1091)
                at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:668)
                at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1521)
                at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1478)
                - locked <0xa1e71ca0> (a org.apache.tomcat.util.net.NioChannel)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
                at java.lang.Thread.run(Thread.java:724)
        

        Originally, we suspected that looking at the thread dump the container threads were piling up behind the non-container thread due to Apache Tomcat's AsyncStateMachine invoking Object's wait() on the non-container thread. However, this is only partially true. Having a better understanding of the AsyncStateMachine's inner works and state changes tells us the following.

        There are to pieces of logic involved here:

        1. When an incoming listen.icepush request has been received by the BlockingConnectionServer, before obtaining a lock on the BlockingConnectionServer object this request is put on the pendingRequest queue. This always happens on a container thread.
        2. When a request for a push has been received by the LocalPushGroupManager, the non-container "Notification queue consumer" thread picks it up and if a request is on the pendingRequest queue it will poll it and then obtain a lock on the BlockingConnectionServer object.

        Once the lock on the BlockingConnectionServer has been obtained by either thread it will try to send a response to the listen.icepush request. In this edge case the non-container thread obtains the lock on the BlockingConnectionServer object before the container thread handling the same request has a chance, and thus the latter is in the BLOCKED state waiting on the lock on the BlockingConnectionServer object. Once the non-container thread has written the response it invokes the AsyncContext.complete() method eventually resulting in the AsyncStateMachine to invoke the Object.wait() method on this thread. As the container thread handling the same request does not finish due to the BLOCKED state the AsyncStateMachine's state will not change from STARTING to STARTED. Any other following incoming listen.icepush requests will end up trying to obtain the lock on the BlockingConnectionServer object and end up in the BLOCKED state as well. Seemingly resulting in the container threads piling up behind the non-container thread. The thread dumps of these BLOCKED container threads look similar though. However, the container thread handling the same request differs in its situation than the container threads handling different requests.

        The idea of a solution is simple: the request should only be put on the pendingRequest queue by the container thread after trying to handle the request first and only if the container thread was unable to send a response. However, the logic around the pendingRequest queue is quite fundamental to the current workings of the BlockingConnectionServer. We need to be careful not to break any related logic.

        Finally, to be clear this issue is specific to ICEpush in combination with Tomcat ARP. The original logic works fine with Tomcat non-ARP and other application servers as far as we know.

        Show
        Jack Van Ooststroom added a comment - Detailed Problem Description When hitting this issue, the thread dump will show a number of threads of interest as shown below: A single non-container thread "Notification queue consumer." daemon prio=10 tid=0x0a895400 nid=0x43ef in Object .wait() [0xa119e000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0xa8723010> (a org.apache.coyote.AsyncStateMachine) at java.lang. Object .wait( Object .java:503) at org.apache.coyote.AsyncStateMachine.pauseNonContainerThread(AsyncStateMachine.java:422) - locked <0xa8723010> (a org.apache.coyote.AsyncStateMachine) at org.apache.coyote.AsyncStateMachine.asyncComplete(AsyncStateMachine.java:256) - locked <0xa8723010> (a org.apache.coyote.AsyncStateMachine) at org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:849) at org.apache.coyote.Request.action(Request.java:378) at org.apache.catalina.core.AsyncContextImpl.complete(AsyncContextImpl.java:86) at org.icepush.servlet.AsyncAdaptingServlet$AsyncRequestResponse.respondWith(AsyncAdaptingServlet.java:109) at com.icesoft.icepush.AdjustHeartbeatServer$HeartbeatRequest.respondWith(AdjustHeartbeatServer.java:67) at org.icepush.SequenceTaggingServer$TaggingRequest.respondWith(SequenceTaggingServer.java:67) at org.icepush.BlockingConnectionServer.respondIfPendingRequest(BlockingConnectionServer.java:200) at org.icepush.BlockingConnectionServer.respondIfNotificationsAvailable(BlockingConnectionServer.java:163) - locked <0xa201e860> (a org.icepush.BlockingConnectionServer) at org.icepush.BlockingConnectionServer.sendNotifications(BlockingConnectionServer.java:138) at org.icepush.BlockingConnectionServer.receive(BlockingConnectionServer.java:338) at org.icepush.LocalNotificationBroadcaster.broadcast(LocalNotificationBroadcaster.java:39) at org.icepush.LocalPushGroupManager$Notification.run(LocalPushGroupManager.java:502) at org.icepush.LocalPushGroupManager$QueueConsumerTask.run(LocalPushGroupManager.java:611) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) and Multiple container threads "http-nio-8080-exec-3" daemon prio=10 tid=0x0a299400 nid=0x43f2 waiting for monitor entry [0xa10ab000] java.lang. Thread .State: BLOCKED (on object monitor) at org.icepush.BlockingConnectionServer.respondIfBackOffRequested(BlockingConnectionServer.java:148) - waiting to lock <0xa201e860> (a org.icepush.BlockingConnectionServer) at org.icepush.BlockingConnectionServer.access$1500(BlockingConnectionServer.java:39) at org.icepush.BlockingConnectionServer$RunningServer.service(BlockingConnectionServer.java:392) at org.icepush.BlockingConnectionServer.service(BlockingConnectionServer.java:99) at org.icepush.SequenceTaggingServer.service(SequenceTaggingServer.java:54) at com.icesoft.icepush.AdjustHeartbeatServer.service(AdjustHeartbeatServer.java:54) at org.icepush.ConfigurationServer.service(ConfigurationServer.java:109) at org.icepush.servlet.AsyncAdaptingServlet.service(AsyncAdaptingServlet.java:52) at org.icepush.servlet.EnvironmentAdaptingServlet.service(EnvironmentAdaptingServlet.java:56) at org.icepush.servlet.PathDispatcher.service(PathDispatcher.java:39) at org.icepush.servlet.BrowserDispatcher$BrowserEntry.service(BrowserDispatcher.java:101) at org.icepush.servlet.BrowserDispatcher.service(BrowserDispatcher.java:45) at org.icepush.servlet.PathDispatcher.service(PathDispatcher.java:39) at org.icepush.servlet.MainServlet.service(MainServlet.java:129) at org.icefaces.impl.push.servlet.ICEpushResourceHandler$ICEpushResourceHandlerImpl.handleResourceRequest(ICEpushResourceHandler.java:235) at org.icefaces.impl.push.servlet.ICEpushResourceHandler.handleResourceRequest(ICEpushResourceHandler.java:128) at org.icefaces.impl.application.SessionTimeoutMonitor.handleSessionAwareResourceRequest(SessionTimeoutMonitor.java:63) at org.icefaces.impl.application.SessionAwareResourceHandlerWrapper.handleResourceRequest(SessionAwareResourceHandlerWrapper.java:40) at javax.faces.application.ResourceHandlerWrapper.handleResourceRequest(ResourceHandlerWrapper.java:125) at org.icefaces.impl.application.AuxUploadResourceHandler.handleResourceRequest(AuxUploadResourceHandler.java:82) at javax.faces.webapp.FacesServlet.service(FacesServlet.java:591) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79) at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:617) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:518) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1091) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:668) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1521) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1478) - locked <0xa1e71ca0> (a org.apache.tomcat.util.net.NioChannel) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang. Thread .run( Thread .java:724) Originally, we suspected that looking at the thread dump the container threads were piling up behind the non-container thread due to Apache Tomcat's AsyncStateMachine invoking Object 's wait() on the non-container thread. However, this is only partially true. Having a better understanding of the AsyncStateMachine 's inner works and state changes tells us the following. There are to pieces of logic involved here: When an incoming listen.icepush request has been received by the BlockingConnectionServer , before obtaining a lock on the BlockingConnectionServer object this request is put on the pendingRequest queue. This always happens on a container thread. When a request for a push has been received by the LocalPushGroupManager , the non-container "Notification queue consumer" thread picks it up and if a request is on the pendingRequest queue it will poll it and then obtain a lock on the BlockingConnectionServer object. Once the lock on the BlockingConnectionServer has been obtained by either thread it will try to send a response to the listen.icepush request. In this edge case the non-container thread obtains the lock on the BlockingConnectionServer object before the container thread handling the same request has a chance, and thus the latter is in the BLOCKED state waiting on the lock on the BlockingConnectionServer object. Once the non-container thread has written the response it invokes the AsyncContext.complete() method eventually resulting in the AsyncStateMachine to invoke the Object.wait() method on this thread. As the container thread handling the same request does not finish due to the BLOCKED state the AsyncStateMachine 's state will not change from STARTING to STARTED . Any other following incoming listen.icepush requests will end up trying to obtain the lock on the BlockingConnectionServer object and end up in the BLOCKED state as well. Seemingly resulting in the container threads piling up behind the non-container thread. The thread dumps of these BLOCKED container threads look similar though. However, the container thread handling the same request differs in its situation than the container threads handling different requests. The idea of a solution is simple: the request should only be put on the pendingRequest queue by the container thread after trying to handle the request first and only if the container thread was unable to send a response. However, the logic around the pendingRequest queue is quite fundamental to the current workings of the BlockingConnectionServer . We need to be careful not to break any related logic. Finally, to be clear this issue is specific to ICEpush in combination with Tomcat ARP. The original logic works fine with Tomcat non-ARP and other application servers as far as we know.
        Jack Van Ooststroom made changes -
        Summary Manage issue with older tomcat 7 & 8 ARP with blocking connection Stuck non-container thread using Tomcat 7/8 ARP
        Hide
        Jack Van Ooststroom added a comment - - edited

        ICEfaces 3

        Sending core/src/main/java/org/icepush/BlockingConnectionServer.java
        Sending core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java
        Transmitting file data ..
        Committed revision 47624.

        Show
        Jack Van Ooststroom added a comment - - edited ICEfaces 3 Sending core/src/main/java/org/icepush/BlockingConnectionServer.java Sending core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java Transmitting file data .. Committed revision 47624.
        Hide
        Jack Van Ooststroom added a comment -

        ICEfaces 4

        Sending core/src/main/java/org/icepush/BlockingConnectionServer.java
        Sending core/src/main/java/org/icepush/Browser.java
        Sending core/src/main/java/org/icepush/ConfirmationTimeout.java
        Sending core/src/main/java/org/icepush/LocalNotificationBroadcaster.java
        Sending core/src/main/java/org/icepush/LocalPushGroupManager.java
        Sending core/src/main/java/org/icepush/NoopPushGroupManager.java
        Sending core/src/main/java/org/icepush/NotificationBroadcaster.java
        Sending core/src/main/java/org/icepush/PushGroupManager.java
        Sending core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java
        Transmitting file data .........
        Committed revision 47948.

        Sending eps/src/main/java/com/icesoft/push/DynamicPushGroupManager.java
        Sending eps/src/main/java/com/icesoft/push/RemotePushGroupManager.java
        Sending eps/src/main/java/com/icesoft/push/messaging/MessagePayload.java
        Transmitting file data ...
        Committed revision 44450.

        Show
        Jack Van Ooststroom added a comment - ICEfaces 4 Sending core/src/main/java/org/icepush/BlockingConnectionServer.java Sending core/src/main/java/org/icepush/Browser.java Sending core/src/main/java/org/icepush/ConfirmationTimeout.java Sending core/src/main/java/org/icepush/LocalNotificationBroadcaster.java Sending core/src/main/java/org/icepush/LocalPushGroupManager.java Sending core/src/main/java/org/icepush/NoopPushGroupManager.java Sending core/src/main/java/org/icepush/NotificationBroadcaster.java Sending core/src/main/java/org/icepush/PushGroupManager.java Sending core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java Transmitting file data ......... Committed revision 47948. Sending eps/src/main/java/com/icesoft/push/DynamicPushGroupManager.java Sending eps/src/main/java/com/icesoft/push/RemotePushGroupManager.java Sending eps/src/main/java/com/icesoft/push/messaging/MessagePayload.java Transmitting file data ... Committed revision 44450.
        Repository Revision Date User Message
        ICEsoft Public SVN Repository #47948 Thu Feb 18 08:08:05 MST 2016 jack.van.ooststroom Fixed JIRA PUSH-364 : Stuck non-container thread using Tomcat 7/8 ARP
        Files Changed
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/Browser.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/PushGroupManager.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/LocalPushGroupManager.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/LocalNotificationBroadcaster.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/NoopPushGroupManager.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/NotificationBroadcaster.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/BlockingConnectionServer.java
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/ConfirmationTimeout.java
        Hide
        Jack Van Ooststroom added a comment -

        ICEfaces 4

        Sending icepush-ee/core/src/main/java/org/icepush/BlockingConnectionServer.java
        Transmitting file data .
        Committed revision 47949.

        Show
        Jack Van Ooststroom added a comment - ICEfaces 4 Sending icepush-ee/core/src/main/java/org/icepush/BlockingConnectionServer.java Transmitting file data . Committed revision 47949.
        Hide
        Carmen Cristurean added a comment - - edited

        Emporium app fails on ICEfaces4 trunk r.47948/ Jenkins ICEfaces4 trunk Build # 1819/ all browsers/ server: Tomcat 7.0.42/8.0.21; not an issue on previous Jenkins build: clocks do not work, they tick just once on page load, then stop. Also, the automated bids are not seen to occur. This is not an issue on Jenkins ICEfaces 4.1.1 Build #5.

        This failure can no longer be reproduced on ICEfaces4 trunk Build# 1820 (r. 47949); all tests pass (FF41).

        Show
        Carmen Cristurean added a comment - - edited Emporium app fails on ICEfaces4 trunk r.47948/ Jenkins ICEfaces4 trunk Build # 1819/ all browsers/ server: Tomcat 7.0.42/8.0.21; not an issue on previous Jenkins build: clocks do not work, they tick just once on page load, then stop. Also, the automated bids are not seen to occur. This is not an issue on Jenkins ICEfaces 4.1.1 Build #5. This failure can no longer be reproduced on ICEfaces4 trunk Build# 1820 (r. 47949); all tests pass (FF41).
        Repository Revision Date User Message
        ICEsoft Public SVN Repository #47949 Thu Feb 18 09:43:47 MST 2016 jack.van.ooststroom Fixed JIRA PUSH-364 : Stuck non-container thread using Tomcat 7/8 ARP
        Files Changed
        Commit graph MODIFY /icepush/trunk/icepush/core/src/main/java/org/icepush/BlockingConnectionServer.java
        Hide
        Jack Van Ooststroom added a comment -

        The BlockingConnectionServer and associated classes have been re-factored to include the following:

        1. An incoming listen.icepush request handled by the BlockingConnectionServer is only added to the pendingPushRequestQueue if the handling container thread is unable to send a response in the end.
        2. The BlockingConnectionServer itself is not used as a lock anymore in order to operate on the backOff or the notifiedPushIDQueue. Instead 2 separate locks have been introduced: one for the backOff and one for the notifiedPushIDQueue access.

        Marking this one as FIXED.

        Show
        Jack Van Ooststroom added a comment - The BlockingConnectionServer and associated classes have been re-factored to include the following: An incoming listen.icepush request handled by the BlockingConnectionServer is only added to the pendingPushRequestQueue if the handling container thread is unable to send a response in the end. The BlockingConnectionServer itself is not used as a lock anymore in order to operate on the backOff or the notifiedPushIDQueue . Instead 2 separate locks have been introduced: one for the backOff and one for the notifiedPushIDQueue access. Marking this one as FIXED.
        Jack Van Ooststroom made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Ken Fyten made changes -
        Fix Version/s EE-4.1.0.BETA [ 13073 ]
        Ken Fyten made changes -
        Issue Type Improvement [ 4 ] Bug [ 1 ]
        Ken Fyten made changes -
        Fix Version/s 4.2.BETA [ 13092 ]
        Fix Version/s 4.2 [ 12871 ]
        Ken Fyten made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Repository Revision Date User Message
        ICEsoft Public SVN Repository #51331 Wed Mar 15 10:04:23 MDT 2017 jack.van.ooststroom Backported PUSH-364
        Files Changed
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/BlockingConnectionServer.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/PushGroupManager.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/ConfirmationTimeout.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/NoopPushGroupManager.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/NotificationBroadcaster.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/LocalPushGroupManager.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/servlet/AsyncAdaptingServlet.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/LocalNotificationBroadcaster.java
        Commit graph MODIFY /icepush/branches/icepush-core-4.x-maintenance/icepush/core/src/main/java/org/icepush/Browser.java

          People

          • Assignee:
            Jack Van Ooststroom
            Reporter:
            Judy Guglielmin
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: