ICEfaces-EE / IPCK-21

EPS creates second view on failover

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.2.GA
    • Fix Version/s: None
    • Component/s: Enterprise Push Server
    • Labels:
      None
    • Environment:
      ICEfaces + EPS in WebSphere (but should occur in all app server environments)

      Description

      During failover, the ICEfaces framework is designed to request a full page reload if an xmlhttp request arrives for a view that hasn't been created on the new primary node. The reload creates the object structure necessary to process requests. This works as intended.

      From our discussion:

      I believe this is EPS doing it. It's hard for EPS to know when fail-over has occurred. The Core can rely on a view structure and such, EPS cannot. To know when a fail-over occurred we introduced the EPSID. Whenever a blocking request is first handled by a particular EPS instance, that instance sets its EPSID as a Cookie. There are three scenarios:

         1. The blocking request to a particular EPS instance does not have an EPSID
            This is the first blocking request to any EPS instance as it doesn't echo back any EPSID. The EPS instance adds its EPSID to the response and handling of the response continues normally.
         2. The blocking request to a particular EPS instance does have the EPSID of that particular EPS instance
            This is not the first request to this EPS instance as it does echo back the EPSID of this EPS instance. The EPS instance doesn't have to do anything special and handling of the response continues normally.
         3. The blocking request to a particular EPS instance does echo back an EPSID of a different EPS instance
            This is the first request to this EPS instance as it does echo back the EPSID of a different EPS instance. Fail-over might have occurred! The EPS instance adds its EPSID to the response and responds with a ReloadResponse because of the possible fail-over scenario.
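      The three scenarios above can be sketched as a single decision function. This is only an illustration of the described logic, not the actual EPS code: the class name EpsidCheck, the Outcome enum, and the classify method are hypothetical, and the Cookie value is modelled as a plain String that is null when the request carries no EPSID.

```java
/** Hypothetical sketch of the EPSID check; names are illustrative, not the EPS API. */
final class EpsidCheck {

    /** What the EPS instance does with the blocking request. */
    enum Outcome {
        SET_COOKIE,            // scenario 1: add own EPSID, continue normally
        CONTINUE,              // scenario 2: nothing special to do
        SET_COOKIE_AND_RELOAD  // scenario 3: add own EPSID, send a ReloadResponse
    }

    /**
     * @param ownEpsid    EPSID of the instance handling the request
     * @param echoedEpsid EPSID echoed back by the browser, or null if none
     */
    static Outcome classify(String ownEpsid, String echoedEpsid) {
        if (echoedEpsid == null) {
            // First blocking request to any EPS instance.
            return Outcome.SET_COOKIE;
        }
        if (echoedEpsid.equals(ownEpsid)) {
            // Not the first request to this instance.
            return Outcome.CONTINUE;
        }
        // EPSID of a different instance: fail-over might have occurred.
        return Outcome.SET_COOKIE_AND_RELOAD;
    }
}
```

      Note that the third branch cannot distinguish a genuine fail-over from a stale EPSID, which is the weakness discussed next.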

      I'll explain why I put "might" in italics. EPSIDs are generated upon start-up of each EPS instance and thus can differ from one run to the next. Let's say we have a two-node cluster. Node 1 happens to have generated EPSID id1, and Node 2 has EPSID id2. A browser has accessed the cluster and currently sends EPSID id2 with its blocking requests. After a restart of the cluster, Node 1 now has EPSID id3 and Node 2 has EPSID id4. The Cookies aren't cleared in the browser, which hits the cluster again. It hits Node 2 with a blocking request, and that request carries EPSID id2. Node 2 now has EPSID id4 associated with it, and as the EPSIDs differ, a ReloadResponse is sent back because Node 2 thinks a fail-over occurred.
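      The restart case can be reproduced with a small simulation. This is a sketch under assumptions: the EpsNode class is hypothetical, and generating the EPSID with a random UUID is a stand-in for however EPS actually generates it at start-up.

```java
import java.util.UUID;

/** Hypothetical model of one EPS node whose EPSID changes on every start-up. */
final class EpsNode {
    // Assumption: a random UUID stands in for the real EPSID generation.
    private String epsid = UUID.randomUUID().toString();

    /** Simulates a restart: a fresh EPSID is generated. */
    void restart() {
        epsid = UUID.randomUUID().toString();
    }

    String epsid() {
        return epsid;
    }

    /** True when an echoed EPSID differs from this node's current EPSID. */
    boolean wouldSendReload(String echoedEpsid) {
        return echoedEpsid != null && !echoedEpsid.equals(epsid);
    }
}
```

      If the browser keeps the Cookie it received before the restart, the same node that issued it treats it as foreign afterwards:

```java
EpsNode node2 = new EpsNode();
String cookie = node2.epsid();   // browser stores the EPSID (id2 in the example)
node2.restart();                 // node now has a new EPSID (id4 in the example)
boolean reload = node2.wouldSendReload(cookie); // true, although no fail-over occurred
```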

      There's also a chance of a double reload. If fail-over is detected on a non-blocking request, that request gets a ReloadResponse from the Core and the reload occurs. The blocking request then comes in, still carrying the other EPSID, and another ReloadResponse is sent back.

      I agree this is not elegant. There are opportunities to improve this, but we didn't have much time back then:

          * The various EPS instances could announce their EPSIDs so that each instance knows which EPSIDs are valid. This should help avoid the additional reload when a cluster has been restarted and the Cookies haven't been cleared in the browser.
          * Additionally, when the Core has sent a ReloadResponse, it could indicate that to the EPS instance on that node. This should help avoid the double reload scenario.
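      A rough sketch of how the two improvements could combine is shown below. Nothing like this exists in EPS as far as this issue states: the FailoverGuard class, its method names, and the use of a session id to correlate the Core's reload with the next blocking request are all assumptions made for illustration. It also assumes one reading of the first proposal, namely that an EPSID no current instance has announced is a leftover from a previous run and needs no reload.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-instance guard combining both proposed improvements. */
final class FailoverGuard {
    private final String ownEpsid;
    // EPSIDs announced by the instances of the currently running cluster.
    private final Set<String> announcedEpsids = ConcurrentHashMap.newKeySet();
    // Sessions for which the Core on this node already sent a ReloadResponse.
    private final Set<String> reloadedByCore = ConcurrentHashMap.newKeySet();

    FailoverGuard(String ownEpsid) {
        this.ownEpsid = ownEpsid;
        announcedEpsids.add(ownEpsid);
    }

    /** Improvement 1: a peer instance announces its EPSID. */
    void onAnnouncement(String peerEpsid) {
        announcedEpsids.add(peerEpsid);
    }

    /** Improvement 2: the Core reports that it sent a ReloadResponse. */
    void onCoreReload(String sessionId) {
        reloadedByCore.add(sessionId);
    }

    /** Decides whether a blocking request should get a ReloadResponse. */
    boolean shouldReload(String sessionId, String echoedEpsid) {
        if (echoedEpsid == null || echoedEpsid.equals(ownEpsid)) {
            return false; // scenarios 1 and 2: no fail-over suspected
        }
        if (reloadedByCore.remove(sessionId)) {
            return false; // the Core already reloaded; avoid the double reload
        }
        // Reload only if the EPSID belongs to a known instance of this run.
        return announcedEpsids.contains(echoedEpsid);
    }
}
```

      The marker set by onCoreReload is consumed by the first blocking request that follows, so a later, genuine fail-over for the same session would still trigger a reload.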


      The problem with the double reload is that an additional full component tree, for whatever page the user was on when failover occurred, is created and maintained until session expiry. Naturally, it's not possible to know in advance which page this will be or how large the extra memory consumption will be.


        Activity

        Greg Dick added a comment -

        It might be possible for the EPS product to send a <dispose-views/> request prior to it sending a reload in a way that duplicates the behaviour of the client. It could extract the viewNumber from the response with the mismatched/missing EPSID. At least in this way, the View resources would be released in the core and any server push resources created at an application level could be cleaned up.

        Greg Dick added a comment -

        The code in the bridge is written to NOT send a dispose-views in this case. This is a generic reload request, and the view ids are not known at the bridge, hence the reload without dispose-views, which requires a view id.

        Greg Dick added a comment -

        This duplicate view creation can also be seen on initial page load just after a server restart. In this case, the client still includes the old EPSID in the request, and EPS has no persistent memory of the valid ids of the servers it was just serving, hence it asks for a full page reload. This can be avoided by clearing cookies in the client prior to loading the application.

        Deryk Sinotte added a comment -

        Assigning to Jack for comments.

        This is an older 1.8 issue that I'm not sure we were able to replicate and/or fix.


          People

          • Assignee:
            Jack Van Ooststroom
            Reporter:
            Greg Dick
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated: