ICEfaces
ICE-10754

Long running process in one browser instance stops push updates in all others

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: EE-1.8.2.GA_P07
    • Fix Version/s: EE-1.8.2.GA_P09
    • Component/s: Bridge, Framework
    • Labels:
      None
    • Environment:
      All, ICEfaces 1.8.x

      Description

      An app is set to display clock ticks via an IntervalRenderer. If a long running process is executed in one browser instance, all other browser instances stop receiving clock tick updates. The other instances are functionally fine but just don't get the updates.
      Attachments

      1. AuctionBean.java
        12 kB
        Arran Mccullough
      2. auctionMonitor.jspx
        20 kB
        Arran Mccullough

        Activity

        Arran Mccullough added a comment -

        This issue is reproducible with a modified Auction Monitor demo.

        Steps:
        1) Load the demo in one browser, Firefox for example.
        2) Load the demo in a different browser, IE for example.
        3) Click the 'Test Long Process' button in Firefox. This triggers a Thread.sleep() call of 10 minutes.
        4) The clock ticks are no longer seen in the IE instance. Clicking on the bid buttons does update the clock, but no push updates are seen.

        Note: Steps 2 and 3 can be switched but the outcome remains the same.

        Deryk Sinotte added a comment -

        Able to reproduce. One of the first things I noticed, though, was that the clocks in the second browser do not stop right away. They keep going for roughly 9-10 seconds before coming to a halt, so the lockout is not immediate.

        It just so happens that the default pool of render threads is set to 10. I reconfigured the AuctionMonitor by adding the following context parameters:

            <context-param>
                <param-name>com.icesoft.faces.async.render.corePoolSize</param-name>
                <param-value>30</param-value>
            </context-param>
            
            <context-param>
                <param-name>com.icesoft.faces.async.render.maxPoolSize</param-name>
                <param-value>50</param-value>
            </context-param>
        

        As you might guess, the clocks continued to tick for another 30 seconds before stopping. So the problem is related to the use of render threads in our thread pool: each render thread runs only once before being "frozen" and unable to be used again.

        Deryk Sinotte added a comment -

        So the problem ends up being fairly straightforward, and the system is mostly behaving by design.

        Any thread that attempts to work with a user's current View must acquire a lifecycle lock for that View, to prevent two threads from modifying that View at the same time.

        • Client requests are handled by HTTP request threads provided by Tomcat.
        • Push-related renders are handled by our own set of server-side threads (named "Render Thread - #"). We start with a pool of them that defaults to 10 but is configurable as noted in the previous comment.

        In both cases, a user request or a push request, the behaviour is the same. Before the JSF lifecycle is run (execute, then render), the lock for that particular View must be acquired. If something else holds the lock, the thread waits until the lock is released.
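        The locking behaviour described above can be sketched with a plain ReentrantLock (a standalone illustration, not ICEfaces code; ViewLifecycle and runLifecycle are made-up names):

```java
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the per-View lifecycle lock: HTTP request threads and
// Render Threads alike must take this lock before running execute + render.
public class ViewLifecycle {
    private final ReentrantLock lifecycleLock = new ReentrantLock();

    public void runLifecycle(Runnable executeAndRender) {
        lifecycleLock.lock();        // blocks if another thread owns this View
        try {
            executeAndRender.run();  // the JSF execute and render phases
        } finally {
            lifecycleLock.unlock();  // release so any queued thread can proceed
        }
    }

    public static void main(String[] args) {
        ViewLifecycle view = new ViewLifecycle();
        view.runLifecycle(() -> System.out.println("pushed clock tick"));
    }
}
```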

        Here's what's happening in this case:

        • Assume we have two users, A and B, just sitting and watching the clocks tick.
        • Each time the interval renderer runs to push out a clock tick, a render thread is used for each view. So Render Thread - 1 is being used to push out to User A and Render Thread - 2 is being used to push out to User B. Render Threads 3-10 are idle, as there are only the two users currently active. With nothing else going on, each thread can easily acquire the lock for the associated user's View and run the lifecycle. Once done, the lock is released. This whole process repeats every second, although different Render Threads may be used each time.
        • Suppose one or both users instigate a "normal" interaction with the app - something that doesn't take too long. The HTTP thread associated with each request grabs the lock for the relevant View and runs the lifecycle. If the clock happens to tick while this is happening, they just wait on the View lock until the request is done and the lock released. The assumption is that the client request won't take too long and the Render Thread will eventually be released back to the pool.
        • Now suppose User A initiates a request that takes a long time (5 minutes) and User B does nothing but sit there and enjoy the ticking clocks. User A's HTTP request has grabbed the lock for his View and is sitting on it. Each time the clock ticks, two Render Threads are used. The one for User A has to wait on the lock, while the one for User B works fine. However, we've now lost one Render Thread out of the pool. When the next clock tick occurs, the same thing happens, and now there are two threads waiting for User A's View lock. Clocks will continue to tick for User B, but Render Threads continue to pile up behind User A's lock until the thread pool runs out of threads.
        • Once the long-running task completes, all the waiting Render Threads do their work and go back to the pool.

        This is why changing the size of the thread pool and/or the interval for the clock ticks, etc. all have a deterministic effect on the behaviour. More Render Threads means it takes longer to run out. A larger clock interval means the Render Threads don't run as often and therefore take longer to run out.
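        The starvation sequence above can be reproduced in miniature with a fixed pool and a held lock (a standalone sketch, not ICEfaces code; the pool size and timings are arbitrary):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class PoolExhaustionDemo {
    public static void main(String[] args) throws Exception {
        ReentrantLock viewLockA = new ReentrantLock();                // User A's View lock
        ExecutorService renderPool = Executors.newFixedThreadPool(3); // tiny render pool

        viewLockA.lock(); // a long-running request is holding A's lock

        // Three "clock ticks": each submits a render for User A that blocks on the lock.
        for (int tick = 0; tick < 3; tick++) {
            renderPool.submit(() -> {
                viewLockA.lock();   // waits as long as the request holds the lock
                try { /* run the lifecycle */ } finally { viewLockA.unlock(); }
            });
        }
        Thread.sleep(200); // give the three tasks time to start and block

        // The pool is now exhausted, so a render for User B can't even start.
        Future<String> userBRender = renderPool.submit(() -> "tick for B");
        try {
            userBRender.get(300, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("User B's render is starved: pool exhausted");
        }

        viewLockA.unlock();                    // the long-running request completes
        System.out.println(userBRender.get()); // the queued renders now drain
        renderPool.shutdown();
    }
}
```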

        Possible solutions:

        • Design the app to either avoid the long-running process or have it so the long-running processes can occur without having to hold on to the View lock. Locking down the View (and the UI in general) for a protracted period of time probably isn't an optimal/user-friendly solution. In the past, depending on the task involved, I think we've offered alternatives to how the action can be implemented. Of course that may or may not apply here.
        • The simplest and easiest option might be this: if the app knows it's going to run a fairly long process while it has the lock, remove the current View from the render group before the process starts and add it back in after the process has finished. Attempting clock updates for the user with the long-running process isn't necessary anyway and just wastes resources. I modified our AuctionMonitor test to try this out. I moved the long-running action into the ClockBean to make it easier to add/remove and then created a second method:
            public void longRunInGroup(ActionEvent event) {
                System.out.println("sleeping for 1 min");
                try {
                    Thread.sleep(60000);
                } catch (InterruptedException ex) {
                    ex.printStackTrace();
                }
                System.out.println("added " + this);
            }
        
            public void longRunOutOfGroup(ActionEvent event) {
                System.out.println("sleeping for 1 min but removing from clock render while it happens");
                clock.remove(this);
                try {
                    Thread.sleep(60000);
                } catch (InterruptedException ex) {
                    ex.printStackTrace();
                }
                clock.add(this);
            }
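        One refinement to the sample above: if the long-running work can throw, re-adding the view in a finally block guarantees it rejoins the render group. A minimal standalone sketch (a plain Set stands in for the clock render group here; in the real app it's an IntervalRenderer):

```java
import java.util.HashSet;
import java.util.Set;

public class RenderGroupGuard {
    // Stand-in for the clock render group; the real one is an IntervalRenderer.
    static Set<Object> clock = new HashSet<>();

    static void longRunOutOfGroup(Object view, Runnable longTask) {
        clock.remove(view);      // stop pushing ticks to this view
        try {
            longTask.run();      // the long-running work
        } finally {
            clock.add(view);     // always rejoin, even if the task throws
        }
    }

    public static void main(String[] args) {
        Object view = new Object();
        clock.add(view);
        try {
            longRunOutOfGroup(view, () -> { throw new RuntimeException("boom"); });
        } catch (RuntimeException expected) {
            // the task failed, but the view was still re-added
        }
        System.out.println(clock.contains(view));
    }
}
```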
        
        • Make adjustments to the RenderManager so that Render Threads can be configured to timeout gracefully. For example, we currently have this:
            public void acquireLifecycleLock() {
                if (!lifecycleLock.isHeldByCurrentThread()) {
                    lifecycleLock.lock();
                }
            }
        
        

        Perhaps providing a configurable timeout:

            public void acquireLifecycleLock() {
                if (!lifecycleLock.isHeldByCurrentThread()) {
                    try {
                        if (!lifecycleLock.tryLock(10, TimeUnit.SECONDS)) {
                            //Timed out waiting for the lock - do something?
                        }
                    } catch (InterruptedException ie) {
                        //Interrupted while waiting - do something?
                    }
                }
            }

        (Note that tryLock() returns false on timeout rather than throwing, so the timed-out case has to be handled via the return value; the InterruptedException only covers interruption while waiting.)
        

        This may be somewhat tricky, as the View lock isn't currently that smart about the types of threads that acquire it (HTTP request vs Render Thread). We could have separate logic for HTTP vs Render Threads via different methods or a boolean parameter, but there may be other complexities involved as well - for example, what happens when a Render Thread times out waiting for a lock?

        Ken Fyten added a comment -

        Design the app to either avoid the long-running process or have it so the long-running processes can occur without having to hold on to the View lock. Locking down the View (and the UI in general) for a protracted period of time probably isn't an optimal/user-friendly solution.

        Agreed. Perhaps we should add a Warning log message if a view lock is detected to be held longer than some "reasonably long" threshold (15 secs?) that would indicate that an asynchronous update technique should be adopted, etc.
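        A sketch of what that warning could look like, layered on the existing acquireLifecycleLock() (the threshold is made configurable via the constructor; everything other than lifecycleLock is an illustrative name, and the 1-second threshold in main is just for demonstration):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.logging.Logger;

public class LockWaitWarning {
    private static final Logger log = Logger.getLogger("lifecycle");
    private final ReentrantLock lifecycleLock = new ReentrantLock();
    private final long warnAfterSeconds;

    public LockWaitWarning(long warnAfterSeconds) {
        this.warnAfterSeconds = warnAfterSeconds;
    }

    // Warn once if acquiring the View lock takes longer than the threshold,
    // then fall back to the original blocking behaviour.
    public void acquireLifecycleLock() throws InterruptedException {
        if (lifecycleLock.isHeldByCurrentThread()) {
            return;
        }
        if (!lifecycleLock.tryLock(warnAfterSeconds, TimeUnit.SECONDS)) {
            log.warning("View lock held > " + warnAfterSeconds
                    + "s; consider an asynchronous update technique");
            lifecycleLock.lock();
        }
    }

    public static void main(String[] args) throws Exception {
        LockWaitWarning view = new LockWaitWarning(1);
        Thread slowRequest = new Thread(() -> {
            view.lifecycleLock.lock();      // simulate a long-running request
            try {
                Thread.sleep(2000);
            } catch (InterruptedException ignored) {
            } finally {
                view.lifecycleLock.unlock();
            }
        });
        slowRequest.start();
        Thread.sleep(100);                  // let the request grab the lock
        view.acquireLifecycleLock();        // warns after ~1s, then keeps waiting
        System.out.println("lock acquired after warning");
        view.lifecycleLock.unlock();
    }
}
```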

        Ken Fyten added a comment -

        Upon review, we've decided that it doesn't make much sense to spend the time adding specialized "Warning" logs for this scenario, as we don't expect new projects to be using ICEfaces 1.8.x.


          People

          • Assignee:
            Deryk Sinotte
            Reporter:
            Arran Mccullough
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved: