[PUSH-238] Server RuntimeException can create Push storm - ICEsoft JIRA Issue Tracker

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.3
Fix Version/s: EE-3.3.0.GA, 4.0.BETA, 4.0
Component/s: Push Library
Labels:
None
Environment:
ICEpush

Assignee Priority:
P2

Description

If a RuntimeException occurs in the BlockingConnectionServer$RunningServer.service() method, the response sent to the bridge results in a Push storm - a tight, never-ending loop of requests to re-establish the blocking connection which results another exeption, etc.

Activity

Ascending order - Click to sort in descending order

Hide

Permalink

Deryk Sinotte added a comment - 17/May/13 10:50 AM

The problem causing the Push storm for me was code I had added (see ~~PUSH-221~~) to allow registering notification URIs from the server-side. Under certain circumstances, the code would throw an NPE when an old push id was included with the request. I improved the null check so managed to avoid causing the Push storm in that manner but this exception was being caught and handled in the BlockingConnectionServer$RunningServer.service() method. There is a large try/catch that does this:

try {
    ...
    //Doing a bunch of stuff with push ids and notification URIs
    //when all of a sudden
    
    NullPointerException occurs
    
    ...
} catch (RuntimeException e) {
    log.log(Level.FINE,"Request does not contain pushIDs.",e);
    respondIfPendingRequest(ErrorNoopResponse);
}

So the NPE was being caught and a <noop> response being returned with the header:

X-Connection-reason: request does not contain Push IDs

The bridge receives this and right away tries to re-establish the blocking connection. However, the NPE would happen again and -> Push storm.

In the past (as per ~~PUSH-135~~), to prevent these types Push storms we had instituted a mechanism to cap the number of retries when an empty response is returned. It doesn't appear that this logic is handling the situation. In connection.async.js we have:

var nonEmptyResponse = notEmpty(contentAsText(response));

if (reconnect) {
    if (nonEmptyResponse) {
....

Perhaps because it's not really empty. So I believe we need to:

Review the BlockingConnectionServer$RunningServer.service() to see if we are doing the right thing. Any RuntimeException in this block of code will result in the same behaviour. We should ensure that the try/catch around the whole block is what we want and that when a RuntimeException occurs, this response we are sending back is appropriate.

Look at improving the logging as "request does not contain pushIDs" was a not quite what was happening here. The request did contain a push id but it was not active/valid one.

Depending on what we do with the service() method, we may need to tweak the bridge so that when it sees this type of response, it doesn't just keep trying repeatedly.

The easiest way to reproduce this is synthetically. Just have it throw an NPE somewhere in the middle of that service() method based on some criteria. For example, I wedged this in towards the bottom of the try block:

System.out.println("BlockingConnectionServer$RunningServer.service: sequenceNumber = " + sequenceNumber);
if(sequenceNumber > 2){
    throw new NullPointerException("synthetic NPE to create Push storm");
}

After a few listen.icepush requests, a storm should be triggered.

Show

Deryk Sinotte added a comment - 17/May/13 10:50 AM The problem causing the Push storm for me was code I had added (see PUSH-221 ) to allow registering notification URIs from the server-side. Under certain circumstances, the code would throw an NPE when an old push id was included with the request. I improved the null check so managed to avoid causing the Push storm in that manner but this exception was being caught and handled in the BlockingConnectionServer$RunningServer.service() method. There is a large try/catch that does this: try { ... //Doing a bunch of stuff with push ids and notification URIs //when all of a sudden NullPointerException occurs ... } catch (RuntimeException e) { log.log(Level.FINE, "Request does not contain pushIDs." ,e); respondIfPendingRequest(ErrorNoopResponse); } So the NPE was being caught and a <noop> response being returned with the header: X-Connection-reason: request does not contain Push IDs The bridge receives this and right away tries to re-establish the blocking connection. However, the NPE would happen again and -> Push storm. In the past (as per PUSH-135 ), to prevent these types Push storms we had instituted a mechanism to cap the number of retries when an empty response is returned. It doesn't appear that this logic is handling the situation. In connection.async.js we have: var nonEmptyResponse = notEmpty(contentAsText(response)); if (reconnect) { if (nonEmptyResponse) { .... Perhaps because it's not really empty. So I believe we need to: Review the BlockingConnectionServer$RunningServer.service() to see if we are doing the right thing. Any RuntimeException in this block of code will result in the same behaviour. We should ensure that the try/catch around the whole block is what we want and that when a RuntimeException occurs, this response we are sending back is appropriate. Look at improving the logging as "request does not contain pushIDs" was a not quite what was happening here. The request did contain a push id but it was not active/valid one. Depending on what we do with the service() method, we may need to tweak the bridge so that when it sees this type of response, it doesn't just keep trying repeatedly. The easiest way to reproduce this is synthetically. Just have it throw an NPE somewhere in the middle of that service() method based on some criteria. For example, I wedged this in towards the bottom of the try block: System .out.println( "BlockingConnectionServer$RunningServer.service: sequenceNumber = " + sequenceNumber); if (sequenceNumber > 2){ throw new NullPointerException( "synthetic NPE to create Push storm" ); } After a few listen.icepush requests, a storm should be triggered.

Hide

Permalink

Mircea Toma added a comment - 21/May/13 11:32 AM

Modified BlockingConnectionServer to respond with a custom server-error message in case an exception is captured during blocking request processing. The server error response sent to the bridge will trigger a re-try executed by the bridge (a number of times, depending on the configuration).
The message sent contains the error message, exception class, and stack trace.
Also on the server the exception is logged as a warning.

Show

Mircea Toma added a comment - 21/May/13 11:32 AM Modified BlockingConnectionServer to respond with a custom server-error message in case an exception is captured during blocking request processing. The server error response sent to the bridge will trigger a re-try executed by the bridge (a number of times, depending on the configuration). The message sent contains the error message, exception class, and stack trace. Also on the server the exception is logged as a warning.

Hide

Permalink

Ted Goddard added a comment - 27/Jun/13 3:47 PM

Recently observed in mobileshowcase, but may be fixed by changes to ResourceRegistry.

Show

Ted Goddard added a comment - 27/Jun/13 3:47 PM Recently observed in mobileshowcase, but may be fixed by changes to ResourceRegistry.

Server RuntimeException can create Push storm

Details

Description

Activity

People

Dates