ICEpush
  1. ICEpush
  2. PUSH-23

Auction updates stop after a variable period of time

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-Alpha2
    • Fix Version/s: 2.0-Alpha3
    • Component/s: Push Server, Samples
    • Labels:
      None
    • Environment:
      ICEpush ICEfaces 2.0 Auction

      Description

      Testing with the Auction application has shown that updates to the client will occasionally stop. The amount of time it takes is variable (from minutes to hours) and it can happen in almost any of the different browsers.

        Issue Links

          Activity

          Hide
          Deryk Sinotte added a comment -

          A number of debug logging statements were added to try and narrow down when and why the updates stopped. Here's a subset of that logging that shows what happens in the case of one browser where the clocks stopped ticking after a few minutes:

          //Push group is created
          Jan 20, 2010 11:30:45 AM org.icepush.PushContext$Group <init>
          FINEST: '3B938F0B693880AC6FDA99BBBCEB9160' push group created.

          //Push ID is added to the group
          Jan 20, 2010 11:30:45 AM org.icepush.PushContext addGroupMember
          FINEST: Added pushId '1264015845649' to group '3B938F0B693880AC6FDA99BBBCEB9160'.

          //The final time that the lastAccess time is updated for this id. The lastAccess time should be updated every 3 seconds or so provided //there are incoming requests from the client containing the required id.
          Jan 20, 2010 11:32:40 AM org.icepush.PushContext$Group touchIfMatching
          FINEST: lastAccess updated for 1264015845649

          //The last time a successful notification is sent for this push id.
          Jan 20, 2010 11:34:41 AM org.icepush.BlockingConnectionServer$4 writeTo
          FINEST: Sending notifications for [1264015845649].

          //Then the group is discard due to expiration, exactly 2 minutes after final lastAccess update.
          Jan 20, 2010 11:34:41 AM org.icepush.PushContext$Group discardIfExpired
          FINEST: '3B938F0B693880AC6FDA99BBBCEB9160' push group expired.

          //Continue to get these for that same group every 1 second. However, nothing is sent since the group has expired.
          Jan 20, 2010 11:35:59 AM org.icepush.PushContext push
          FINEST: Push notification triggered for '3B938F0B693880AC6FDA99BBBCEB9160' pushId.

          So it appears that the mechanism used to update the lastAccess time is not working. As more browser instances are added, it seems the more time elapses between successful updates. When the period gets longer than 2 minutes, the group is expired/discarded and the updates are no longer sent. The client, however, appears to be still be sending the appropriate requests so it would seem that there is something on the server side that isn't quite behaving.

          Show
          Deryk Sinotte added a comment - A number of debug logging statements were added to try and narrow down when and why the updates stopped. Here's a subset of that logging that shows what happens in the case of one browser where the clocks stopped ticking after a few minutes: //Push group is created Jan 20, 2010 11:30:45 AM org.icepush.PushContext$Group <init> FINEST: '3B938F0B693880AC6FDA99BBBCEB9160' push group created. //Push ID is added to the group Jan 20, 2010 11:30:45 AM org.icepush.PushContext addGroupMember FINEST: Added pushId '1264015845649' to group '3B938F0B693880AC6FDA99BBBCEB9160'. //The final time that the lastAccess time is updated for this id. The lastAccess time should be updated every 3 seconds or so provided //there are incoming requests from the client containing the required id. Jan 20, 2010 11:32:40 AM org.icepush.PushContext$Group touchIfMatching FINEST: lastAccess updated for 1264015845649 //The last time a successful notification is sent for this push id. Jan 20, 2010 11:34:41 AM org.icepush.BlockingConnectionServer$4 writeTo FINEST: Sending notifications for [1264015845649] . //Then the group is discard due to expiration, exactly 2 minutes after final lastAccess update. Jan 20, 2010 11:34:41 AM org.icepush.PushContext$Group discardIfExpired FINEST: '3B938F0B693880AC6FDA99BBBCEB9160' push group expired. //Continue to get these for that same group every 1 second. However, nothing is sent since the group has expired. Jan 20, 2010 11:35:59 AM org.icepush.PushContext push FINEST: Push notification triggered for '3B938F0B693880AC6FDA99BBBCEB9160' pushId. So it appears that the mechanism used to update the lastAccess time is not working. As more browser instances are added, it seems the more time elapses between successful updates. When the period gets longer than 2 minutes, the group is expired/discarded and the updates are no longer sent. The client, however, appears to be still be sending the appropriate requests so it would seem that there is something on the server side that isn't quite behaving.
          Hide
          Deryk Sinotte added a comment -

          In the BlockingConnectionServer, the participating push ids are captured and logged. These appear to occur every 1 second so the client looks like it's doing its thing correctly:

                  public void service(final Request request) throws Exception {
                         resetTimeout();
                         respondIfPendingRequest(CloseResponse);
                         participatingPushIDs = Arrays.asList(request.getParameterAsStrings("ice.pushid"));
                         if (log.isLoggable(Level.FINEST))

          {                    log.finest("Participating pushIds: " + participatingPushIDs + ".");                }

                         pendingRequest.put(request);
                         respondIfNotificationsAvailable();
                         inboundNotifier.notifyObservers(participatingPushIDs);
                     }

          This is the code that should be run when necessary to update the lastAccess time on the appropriate Group:

          public void update(Observable observable, Object o) {
          long now = System.currentTimeMillis();
          //avoid to scan/touch the groups on each notification
          if (lastScan + GroupScanningTimeResolution < now) {
          lastScan = now;
          List pushIDs = (List) o;
          Iterator i = new ArrayList(groups.values()).iterator();
          while (i.hasNext())

          { Group group = (Group) i.next(); group.touchIfMatching(pushIDs); group.discardIfExpired(); }

          }
          }

          Show
          Deryk Sinotte added a comment - In the BlockingConnectionServer, the participating push ids are captured and logged. These appear to occur every 1 second so the client looks like it's doing its thing correctly:         public void service(final Request request) throws Exception {                resetTimeout();                respondIfPendingRequest(CloseResponse);                participatingPushIDs = Arrays.asList(request.getParameterAsStrings("ice.pushid"));                if (log.isLoggable(Level.FINEST)) {                    log.finest("Participating pushIds: " + participatingPushIDs + ".");                }                pendingRequest.put(request);                respondIfNotificationsAvailable();                inboundNotifier.notifyObservers(participatingPushIDs);            } This is the code that should be run when necessary to update the lastAccess time on the appropriate Group: public void update(Observable observable, Object o) { long now = System.currentTimeMillis(); //avoid to scan/touch the groups on each notification if (lastScan + GroupScanningTimeResolution < now) { lastScan = now; List pushIDs = (List) o; Iterator i = new ArrayList(groups.values()).iterator(); while (i.hasNext()) { Group group = (Group) i.next(); group.touchIfMatching(pushIDs); group.discardIfExpired(); } } }
          Hide
          Deryk Sinotte added a comment -

          To recreate this issue on a local machine, we used variations of the following shell script technique to run 9 separate Firefox instances as more browsers appears to make the problem occur more quickly:

          for index in 1 2 3 4 5 6 7 8 9; do firefox -no-remote -P $index http://localhost:8080/auction & done

          What I noticed doing this is that the lastAccess time for the first browser seems to be reliably updated every 3 seconds. However, as more browsers are pointed at the server, the lastAccess time is less and less frequently. The last (9th) browser instance in my test was the first to fail and the lastAccess time was only ever updated twice. The failure occurred exactly 2 minutes after the lastAccess time was updated. This is more evidence that the problem is likely on the server side as the client continues to send requests and get notifications and updates right up until the group is expired.

          Show
          Deryk Sinotte added a comment - To recreate this issue on a local machine, we used variations of the following shell script technique to run 9 separate Firefox instances as more browsers appears to make the problem occur more quickly: for index in 1 2 3 4 5 6 7 8 9; do firefox -no-remote -P $index http://localhost:8080/auction & done What I noticed doing this is that the lastAccess time for the first browser seems to be reliably updated every 3 seconds. However, as more browsers are pointed at the server, the lastAccess time is less and less frequently. The last (9th) browser instance in my test was the first to fail and the lastAccess time was only ever updated twice. The failure occurred exactly 2 minutes after the lastAccess time was updated. This is more evidence that the problem is likely on the server side as the client continues to send requests and get notifications and updates right up until the group is expired.
          Hide
          Ted Goddard added a comment -

          To determine if the browser is actually sending the desired requests, I added an alert to snowplow/src/main/javascript/connection.async.js here:

          } else

          { info(logger, 'blocking connection stopped at server\'s request...'); +alert("Server X-Connection: close"); }

          and server-side logging to src/main/java/org/icepush/BlockingConnectionServer.java here:

          response.setHeader("X-Connection", "close");
          + response.setHeader("X-icepush", "closed due to previous blocking connection");
          response.setHeader("Content-Length", 0);

          +System.out.println("close current connection ");

          and

          public void service(final Request request) throws Exception {
          resetTimeout();
          +System.out.println(this + " " + pendingRequest + " pendingRequest for " + request.getHeader("Cookie"));

          that logs as follows when the problem occurs:

          org.icepush.BlockingConnectionServer$3@5bdcd49d [] pendingRequest for JSESSIONID=043B4D5EECBF6ACFFA917FC451A40990; ice.pushids=1264032693662; ice.connection.lease=1264032696040; ice.connection.running=9e20e%3Aacquired; ice.notified.pushids=

          org.icepush.BlockingConnectionServer$3@5bdcd49d [org.icepush.servlet.ThreadBlockingAdaptingServlet$ThreadBlockingRequestResponse@7f0abe88] pendingRequest for JSESSIONID=2DD1B04BE3E0E178FA0657CFA9294E92; ice.pushids=1264032693745; ice.connection.lease=1264032696117; ice.connection.running=9d826%3Aacquired; ice.notified.pushids=

          close current connection

          Two different JSESSIONID values are seen with the same BlockingConnectionServer. Either, the session is being restarted, or the BlockingConnectionServer objects are being incorrectly shared across multiple sessions. Note that ice.push.browser is not set, so perhaps all empty string identified browsers are sharing a single BlockingConnectionServer.

          Show
          Ted Goddard added a comment - To determine if the browser is actually sending the desired requests, I added an alert to snowplow/src/main/javascript/connection.async.js here: } else { info(logger, 'blocking connection stopped at server\'s request...'); +alert("Server X-Connection: close"); } and server-side logging to src/main/java/org/icepush/BlockingConnectionServer.java here: response.setHeader("X-Connection", "close"); + response.setHeader("X-icepush", "closed due to previous blocking connection"); response.setHeader("Content-Length", 0); +System.out.println("close current connection "); and public void service(final Request request) throws Exception { resetTimeout(); +System.out.println(this + " " + pendingRequest + " pendingRequest for " + request.getHeader("Cookie")); that logs as follows when the problem occurs: org.icepush.BlockingConnectionServer$3@5bdcd49d [] pendingRequest for JSESSIONID=043B4D5EECBF6ACFFA917FC451A40990; ice.pushids=1264032693662; ice.connection.lease=1264032696040; ice.connection.running=9e20e%3Aacquired; ice.notified.pushids= org.icepush.BlockingConnectionServer$3@5bdcd49d [org.icepush.servlet.ThreadBlockingAdaptingServlet$ThreadBlockingRequestResponse@7f0abe88] pendingRequest for JSESSIONID=2DD1B04BE3E0E178FA0657CFA9294E92; ice.pushids=1264032693745; ice.connection.lease=1264032696117; ice.connection.running=9d826%3Aacquired; ice.notified.pushids= close current connection Two different JSESSIONID values are seen with the same BlockingConnectionServer. Either, the session is being restarted, or the BlockingConnectionServer objects are being incorrectly shared across multiple sessions. Note that ice.push.browser is not set, so perhaps all empty string identified browsers are sharing a single BlockingConnectionServer.
          Hide
          Ted Goddard added a comment -

          If the error does not occur quickly, close all browsers and launch another set of 10.

          Logging in BrowserDispatcher.java shows multiple:

          lookupServer found org.icepush.servlet.BrowserDispatcher$BrowserEntry@27a4f43e for browserID null
          lookupServer found org.icepush.servlet.BrowserDispatcher$BrowserEntry@27a4f43e for browserID null

          So there is a very good possibility that multiple browsers will share the same "previousRequest" and abort their connection.

          Assigning to Mircea for comment on whether the browserID failure could case the reported problem.

          Show
          Ted Goddard added a comment - If the error does not occur quickly, close all browsers and launch another set of 10. Logging in BrowserDispatcher.java shows multiple: lookupServer found org.icepush.servlet.BrowserDispatcher$BrowserEntry@27a4f43e for browserID null lookupServer found org.icepush.servlet.BrowserDispatcher$BrowserEntry@27a4f43e for browserID null So there is a very good possibility that multiple browsers will share the same "previousRequest" and abort their connection. Assigning to Mircea for comment on whether the browserID failure could case the reported problem.
          Hide
          Ted Goddard added a comment -

          Adding

          response.addCookie(new Cookie(BrowserIDCookieName, browserID));

          to the else path in createPushId appears to fix the problem when multiple browsers are launched (the cookie must be applied to the response no matter what stale value is found in the ThreadLocal currentBrowserID.

          The ThreadLocal should be removed unless there is a very good reason for it's use.

          Show
          Ted Goddard added a comment - Adding response.addCookie(new Cookie(BrowserIDCookieName, browserID)); to the else path in createPushId appears to fix the problem when multiple browsers are launched (the cookie must be applied to the response no matter what stale value is found in the ThreadLocal currentBrowserID. The ThreadLocal should be removed unless there is a very good reason for it's use.
          Hide
          Ted Goddard added a comment -

          Fix is checked in using request attributes in place of the thread local.

          Show
          Ted Goddard added a comment - Fix is checked in using request attributes in place of the thread local.
          Hide
          Ted Goddard added a comment -

          Request attribute fix resolves push failure for multiple browsers. Still investigating push failures after extended periods.

          Show
          Ted Goddard added a comment - Request attribute fix resolves push failure for multiple browsers. Still investigating push failures after extended periods.
          Hide
          Mircea Toma added a comment -

          Make sure that all inbound pushIds are accumulated from all the participating browsers before a touch & discard group scanning is executed. The previous algorithm was using randomly only the pushIds of one of the browsers when running the touch & discard group scanning.

          Show
          Mircea Toma added a comment - Make sure that all inbound pushIds are accumulated from all the participating browsers before a touch & discard group scanning is executed. The previous algorithm was using randomly only the pushIds of one of the browsers when running the touch & discard group scanning.

            People

            • Assignee:
              Mircea Toma
              Reporter:
              Deryk Sinotte
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: