ICEpush
  1. ICEpush
  2. PUSH-238

Server RuntimeException can create Push storm

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.3
    • Fix Version/s: EE-3.3.0.GA, 4.0.BETA, 4.0
    • Component/s: Push Library
    • Labels:
      None
    • Environment:
      ICEpush
    • Assignee Priority:
      P2

      Description

      If a RuntimeException occurs in the BlockingConnectionServer$RunningServer.service() method, the response sent to the bridge results in a Push storm - a tight, never-ending loop of requests to re-establish the blocking connection which results another exeption, etc.

        Activity

        Hide
        Deryk Sinotte added a comment -

        The problem causing the Push storm for me was code I had added (see PUSH-221) to allow registering notification URIs from the server-side. Under certain circumstances, the code would throw an NPE when an old push id was included with the request. I improved the null check so managed to avoid causing the Push storm in that manner but this exception was being caught and handled in the BlockingConnectionServer$RunningServer.service() method. There is a large try/catch that does this:

        try {
            ...
            //Doing a bunch of stuff with push ids and notification URIs
            //when all of a sudden
            
            NullPointerException occurs
            
            ...
        } catch (RuntimeException e) {
            log.log(Level.FINE,"Request does not contain pushIDs.",e);
            respondIfPendingRequest(ErrorNoopResponse);
        }

        So the NPE was being caught and a <noop> response being returned with the header:

        X-Connection-reason: request does not contain Push IDs

        The bridge receives this and right away tries to re-establish the blocking connection. However, the NPE would happen again and -> Push storm.

        In the past (as per PUSH-135), to prevent these types Push storms we had instituted a mechanism to cap the number of retries when an empty response is returned. It doesn't appear that this logic is handling the situation. In connection.async.js we have:

        var nonEmptyResponse = notEmpty(contentAsText(response));
        
        if (reconnect) {
            if (nonEmptyResponse) {
        ....

        Perhaps because it's not really empty. So I believe we need to:

        • Review the BlockingConnectionServer$RunningServer.service() to see if we are doing the right thing. Any RuntimeException in this block of code will result in the same behaviour. We should ensure that the try/catch around the whole block is what we want and that when a RuntimeException occurs, this response we are sending back is appropriate.
        • Look at improving the logging as "request does not contain pushIDs" was a not quite what was happening here. The request did contain a push id but it was not active/valid one.
        • Depending on what we do with the service() method, we may need to tweak the bridge so that when it sees this type of response, it doesn't just keep trying repeatedly.

        The easiest way to reproduce this is synthetically. Just have it throw an NPE somewhere in the middle of that service() method based on some criteria. For example, I wedged this in towards the bottom of the try block:

        System.out.println("BlockingConnectionServer$RunningServer.service: sequenceNumber = " + sequenceNumber);
        if(sequenceNumber > 2){
            throw new NullPointerException("synthetic NPE to create Push storm");
        }

        After a few listen.icepush requests, a storm should be triggered.

        Show
        Deryk Sinotte added a comment - The problem causing the Push storm for me was code I had added (see PUSH-221 ) to allow registering notification URIs from the server-side. Under certain circumstances, the code would throw an NPE when an old push id was included with the request. I improved the null check so managed to avoid causing the Push storm in that manner but this exception was being caught and handled in the BlockingConnectionServer$RunningServer.service() method. There is a large try/catch that does this: try { ... //Doing a bunch of stuff with push ids and notification URIs //when all of a sudden NullPointerException occurs ... } catch (RuntimeException e) { log.log(Level.FINE, "Request does not contain pushIDs." ,e); respondIfPendingRequest(ErrorNoopResponse); } So the NPE was being caught and a <noop> response being returned with the header: X-Connection-reason: request does not contain Push IDs The bridge receives this and right away tries to re-establish the blocking connection. However, the NPE would happen again and -> Push storm. In the past (as per PUSH-135 ), to prevent these types Push storms we had instituted a mechanism to cap the number of retries when an empty response is returned. It doesn't appear that this logic is handling the situation. In connection.async.js we have: var nonEmptyResponse = notEmpty(contentAsText(response)); if (reconnect) { if (nonEmptyResponse) { .... Perhaps because it's not really empty. So I believe we need to: Review the BlockingConnectionServer$RunningServer.service() to see if we are doing the right thing. Any RuntimeException in this block of code will result in the same behaviour. We should ensure that the try/catch around the whole block is what we want and that when a RuntimeException occurs, this response we are sending back is appropriate. Look at improving the logging as "request does not contain pushIDs" was a not quite what was happening here. The request did contain a push id but it was not active/valid one. Depending on what we do with the service() method, we may need to tweak the bridge so that when it sees this type of response, it doesn't just keep trying repeatedly. The easiest way to reproduce this is synthetically. Just have it throw an NPE somewhere in the middle of that service() method based on some criteria. For example, I wedged this in towards the bottom of the try block: System .out.println( "BlockingConnectionServer$RunningServer.service: sequenceNumber = " + sequenceNumber); if (sequenceNumber > 2){ throw new NullPointerException( "synthetic NPE to create Push storm" ); } After a few listen.icepush requests, a storm should be triggered.
        Hide
        Mircea Toma added a comment -

        Modified BlockingConnectionServer to respond with a custom server-error message in case an exception is captured during blocking request processing. The server error response sent to the bridge will trigger a re-try executed by the bridge (a number of times, depending on the configuration).
        The message sent contains the error message, exception class, and stack trace.
        Also on the server the exception is logged as a warning.

        Show
        Mircea Toma added a comment - Modified BlockingConnectionServer to respond with a custom server-error message in case an exception is captured during blocking request processing. The server error response sent to the bridge will trigger a re-try executed by the bridge (a number of times, depending on the configuration). The message sent contains the error message, exception class, and stack trace. Also on the server the exception is logged as a warning.
        Hide
        Ted Goddard added a comment -

        Recently observed in mobileshowcase, but may be fixed by changes to ResourceRegistry.

        Show
        Ted Goddard added a comment - Recently observed in mobileshowcase, but may be fixed by changes to ResourceRegistry.

          People

          • Assignee:
            Mircea Toma
            Reporter:
            Deryk Sinotte
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: