No ‘Sticky Sessions’ so shop carts, banking etc drop out


This topic contains 13 replies, has 0 voices, and was last updated by  AussieWISP 6 years, 4 months ago.

  • #43092

    AussieWISP
    Member

    Our system works very well (3 x Zeroshells), but we need sticky sessions: shopping carts often suddenly go empty when a round robin occurs, forum logins have trouble, and some other IP-sensitive sites have issues. Besides IP-specific-to-WAN configs under NB, is there a way?

    #51904

    atheling
    Member

    Do you have my NB and QoS patch installed? The later versions of that patch dealt with the issue of “stickiness” of connections. Basically, the last set of changes set up a different method for routing the pings that are used to determine whether the WAN links are good. The original method resulted in the Linux routing cache being flushed every few seconds, which caused the behaviour you are reporting.

    If you have installed the latest version of the patch, there are still some things that can lead to the routing cache being cleared. Unfortunately they are arcane and badly documented kernel routing tuning parameters… I’ll cross my fingers and hope that you either don’t have my patch installed or don’t have the latest version installed before we go into that area.

    http://dl.dropbox.com/u/19663978/ZS_nb_quo_b14_b.tar.gz

    (Patch is for b14, which is what I am still running, but I believe it should apply okay to b15.)

    #51905

    DrmCa
    Participant

    After the NB patch was installed the issues started happening – specifically with one site using AJAX chat which must be validating the IP of the user logging in.
    When round robin occurs, chat boots me out.

    #51906

    atheling
    Member
    DrmCa wrote:
    After the NB patch was installed the issues started happening – specifically with one site using AJAX chat which must be validating the IP of the user logging in.
    When round robin occurs, chat boots me out.

    It has been a while since I went through all this, and Linux routing seems to have wheels within wheels within wheels, so my recollection could be off.

    That said, connections are “sticky” (subsequent TCP connections use the same gateway as previous ones with the same source and destination) because, for new connections, Linux looks in the routing cache before going through the IP rules. Those rules select a “routing policy database” table, which specifies the default route with a round robin setup in the case of multiple WAN interfaces. (For existing connections we have a bunch of logic in iptables that uses tags to direct packets to the same interface they started on.)
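The mechanism described above can be sketched roughly as follows. This is not the rule set from the patch; it is a generic connection-mark setup of the kind commonly used for multi-WAN routing. The interface names, gateway addresses, mark values and table numbers are all made-up assumptions. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Illustrative only: NOT the rules from the ZeroShell patch.
# Interfaces, gateways, marks and table numbers are hypothetical.

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

setup_sticky_marks() {
    local lan=eth0
    local wan1=eth1 gw1=192.0.2.1
    local wan2=eth2 gw2=198.51.100.1

    # One routing table per WAN link, selected by firewall mark.
    run ip route add default via "$gw1" dev "$wan1" table 101
    run ip route add default via "$gw2" dev "$wan2" table 102
    run ip rule add fwmark 1 table 101
    run ip rule add fwmark 2 table 102

    # Unmarked (new) traffic falls through to a round-robin default route.
    run ip route replace default scope global \
        nexthop via "$gw1" dev "$wan1" weight 1 \
        nexthop via "$gw2" dev "$wan2" weight 1

    # Packets arriving from the LAN inherit the mark saved on their
    # connection, so established connections match the fwmark rules above.
    run iptables -t mangle -A PREROUTING -i "$lan" -j CONNMARK --restore-mark

    # When a new connection leaves through a WAN link, remember which one.
    run iptables -t mangle -A POSTROUTING -o "$wan1" \
        -m conntrack --ctstate NEW -j CONNMARK --set-mark 1
    run iptables -t mangle -A POSTROUTING -o "$wan2" \
        -m conntrack --ctstate NEW -j CONNMARK --set-mark 2
}

setup_sticky_marks
```

Note that this only keeps packets of an existing connection on one link. Stickiness across separate connections from the same client, as described above, depends on the routing cache.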

    So it sounds like your routing cache is being cleared. The way I checked this when creating/debugging the patch was to use the “ip route show cache” command at the bash prompt along with wc and/or grep to see when the cache was being cleared.

    For example:

    Code:
    # ip route show cache | wc -l
    234
    #

    When the number goes down, or if it stays very low, then something is resetting the cache. In the unmodified version of ZS and in early versions of my patch, this happened because an “ip rule set” operation was being performed to set up the routing for the pings that detect WAN link failures, and that operation has the side effect of flushing the cache.

    Anyway, log into the command line and use the above command to monitor the cache. See whether it is being cleared and whether the session issues you are experiencing are correlated in time with the cache flushes. That will tell us if the problem is with the cache or elsewhere.
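To avoid re-running the command by hand, the check can be wrapped in a small loop. The following is a minimal sketch, not part of the patch: the function names are hypothetical and the "dropped below half" threshold is an arbitrary assumption. It also assumes a kernel that still has the IPv4 routing cache. On Linux 3.6 and later the cache was removed and `ip route show cache` prints nothing.

```shell
#!/bin/bash
# Sketch: log the routing cache size and flag sudden drops.

# count_cache_entries: number of lines printed by "ip route show cache".
count_cache_entries() {
    ip route show cache 2>/dev/null | wc -l | tr -d ' '
}

# is_flush PREV CUR: succeeds when CUR fell below half of PREV.
# The one-half threshold is an arbitrary assumption.
is_flush() {
    local prev=$1 cur=$2
    [ "$prev" -gt 0 ] && [ $((cur * 2)) -lt "$prev" ]
}

# monitor_cache [INTERVAL]: print a timestamped count every INTERVAL seconds.
monitor_cache() {
    local interval=${1:-5} prev=0 cur
    while true; do
        cur=$(count_cache_entries)
        if is_flush "$prev" "$cur"; then
            echo "$(date '+%H:%M:%S') $cur  <-- dropped from $prev"
        else
            echo "$(date '+%H:%M:%S') $cur"
        fi
        prev=$cur
        sleep "$interval"
    done
}

# Usage: source this file, then run "monitor_cache 5" and stop it with Ctrl-C.
```

Comparing the timestamps of the flagged lines with the moments a session drops shows whether the two are correlated.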

    #51907

    DrmCa
    Participant

    root@router root> ip route show cache | wc -l
    62
    root@router root> ip route show cache | wc -l
    80
    root@router root> ip route show cache | wc -l
    84
    root@router root> ip route show cache | wc -l
    114
    root@router root> ip route show cache | wc -l
    134
    root@router root> ip route show cache | wc -l
    152
    root@router root> ip route show cache | wc -l
    156
    root@router root> ip route show cache | wc -l
    172
    root@router root> ip route show cache | wc -l
    190
    root@router root> ip route show cache | wc -l
    10


    it booted me out here
    root@router root> ip route show cache | wc -l
    40
    root@router root> ip route show cache | wc -l
    60
    root@router root> ip route show cache | wc -l
    98
    root@router root> ip route show cache | wc -l
    134

    Disabling the ICMP failover checking fixed the issue with AJAX chat site but broke download speed. I’d rather have fast d/l than chat room.

    #51908

    atheling
    Member
    DrmCa wrote:
    Disabling the ICMP failover checking fixed the issue with AJAX chat site but broke download speed. I’d rather have fast d/l than chat room.

    So it does appear that your stickiness issue is related to the flushing of the routing cache, and that the flushing is caused by the logic that sets up the ICMP pings for WAN health checks.

    Do you have my latest patch installed from http://dl.dropbox.com/u/19663978/ZS_nb_qos_b14_b.tar.gz?

    That patch was supposed to fix that problem.

    But, of course, it will “break” the download speed as all connections associated with the download would be over the same WAN interface.

    #51909

    DrmCa
    Participant

    Only one patch is installed: ZS_nb_quo_b14_b.tar.gz

    #51910

    atheling
    Member
    DrmCa wrote:
    Only one patch is installed: ZS_nb_quo_b14_b.tar.gz

    That’s the same one. I put it up with “QoS” mistyped as “quo”; the link in my previous response is to an identical file.

    Unfortunately, that now means I have to re-investigate why the routing cache is getting flushed… 😥

    #51911

    DrmCa
    Participant

    Could it be because I missed the check box to activate the Cron command?
    So it did not run.

    #51912

    DWJames
    Member

    hi,
    same issue here, but I’m running B16.
    Will this patch work for me or is there some way I can manually make the changes required?

    ip route show cache | wc -l
    shows that the routes are being regularly cleared and we are running the icmp failover monitoring so I guess that’s it.

    Thanks,
    James

    #51913

    atheling
    Member

    @dwjames wrote:

    hi,
    same issue here, but I’m running B16.
    Will this patch work for me or is there some way I can manually make the changes required?

    ip route show cache | wc -l
    shows that the routes are being regularly cleared and we are running the icmp failover monitoring so I guess that’s it.

    Thanks,
    James

    Don’t know if the patch will work for B16 or not…. I’ve been maintaining the patch for well over a year now hoping that it would be incorporated in Fulvio’s releases.

    I am still running b14 as the release notes for b15 and b16 indicate no changes that I particularly needed or wanted. I did check the changes for b15 and it seems the patch for b14 should work on it. But I have not yet downloaded b16 and checked to see if the patch would work on it.

    #51914

    DWJames
    Member

    ok, thanks.
    If the patch needs to go into the pre boot script, does that mean that it is applied each time the zeroshell boots and that it doesn’t rewrite any standard code?
    So this way I can try it and if it doesn’t work for me I can just remove the pre boot code and revert to standard?

    Do you have some more information on what this patch does aside from the sticky routes?
    Also, how does it deal with a sticky route if there is a line failure?

    thanks,
    James

    #51915

    atheling
    Member

    @dwjames wrote:

    ok, thanks.
    If the patch needs to go into the pre boot script, does that mean that it is applied each time the zeroshell boots and that it doesn’t rewrite any standard code?
    So this way I can try it and if it doesn’t work for me I can just remove the pre boot code and revert to standard?

    Correct. Removing the snippet from the pre-boot script and rebooting will remove the patch.

    @dwjames wrote:

    Do you have some more information on what this patch does aside from the sticky routes?

    Route stickiness was the last thing that I fixed in the patch. 🙂

    1. The primary reason I started on the patch was to get net balancing and QoS to co-exist. They both use fwmarks and in the non-patched version they are used in such a way that they conflict with one another.

    2. Then I addressed the return paths for connections originating on the Internet. This is needed if you are running an externally accessible server on your LAN and wish it to be available over any of the WAN links. Think mail server with your DNS having multiple MX records, one for each of your external IP addresses.

    My two links have very different speeds so with normal balancing on them I seldom ran into the stickiness issue. But others did so the later versions of the patch addressed that.
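One common way to let two subsystems share the fwmark (the conflict in item 1) is to give each its own bit range and always match with a mask. Whether the patch does exactly this is not stated here, so the layout below is purely an assumption for illustration.

```shell
#!/bin/bash
# Illustrative arithmetic only; the bit layout is an assumption,
# not necessarily what the patch uses.

NB_MASK=0x0f    # low 4 bits: which WAN link (net balancer)
QOS_MASK=0xf0   # next 4 bits: which QoS class

# combine_marks LINK CLASS: pack both values into one fwmark.
combine_marks() {
    local link=$1 class=$2
    printf '0x%02x\n' $(( (link & NB_MASK) | ((class << 4) & QOS_MASK) ))
}

# nb_part MARK / qos_part MARK: recover each field with its mask.
nb_part()  { echo $(( $1 & NB_MASK )); }
qos_part() { echo $(( ($1 & QOS_MASK) >> 4 )); }

combine_marks 2 3   # prints 0x32
nb_part 0x32        # prints 2
qos_part 0x32       # prints 3
```

With such a layout, each side matches only its own bits, for example `ip rule add fwmark 0x02/0x0f table 102` for routing and `iptables -m mark --mark 0x30/0xf0` for classification, so setting one field does not disturb the other.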

    @dwjames wrote:

    Also, how does it deal with a sticky route if there is a line failure?

    thanks,
    James

    If/when there is a failure on a WAN link the routing tables are changed and that has the side effect of clearing the routing cache. So stickiness is reset when a WAN link fails or recovers.

    #51916

    alexemil
    Member

    Is it necessary to install the NB patch? If I don’t install it, is there any alternative?

    #51917

    atheling
    Member

    @alexemil wrote:

    Is it necessary to install the NB patch? If I don’t install it, is there any alternative?

    Latest version(s) of Zeroshell have the net balancing code included so the old patches should no longer be needed.
