August 28, 2007 at 5:19 am #40749
Hi all, I have a small problem with bonding (Fault Tolerance and Load Balancing). Some background first:
I have two ZS boxes set up and connected together by two VPN links. I have bonded the two VPNs at each end, and it all seems to work OK until I unplug one of the links: the traffic continues for about 2 or 3 seconds, then starts to fail. When pinging, it drops about 3 out of 4 packets. When I plug the link back in, it recovers in about 10 seconds.
Can anyone shed some light on what I should look for?
Darren
June 18, 2008 at 8:12 am #45797
Ever resolve this problem, or does anyone have any input? I am getting the exact same results right now!
June 19, 2008 at 2:10 am #45798
I have confirmed with darrenf that he is still having this issue. I have replicated this issue as well in my test environments.
From what I can see, when I configure the bond and check the “info” for the bond, it says round robin and never mentions anything about failover. I am not sure whether that is just the naming or whether that's the issue.
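One way to look past the GUI wording, as a rough sketch only: on Linux the bonding driver reports its state as text under /proc/net/bonding/, and that text lists the bonding mode and an "MII Status" line per slave. The snippet below parses text in that format. The interface names VPN00 and VPN01, the bond name in the usage comment, and the sample text itself are hypothetical placeholders, not output captured from these boxes, and it assumes shell access to the Zeroshell box.

```python
# Illustrative sample in the format used by /proc/net/bonding/<bond>.
# The slave names and values are made up for this sketch.
SAMPLE_STATUS = """\
Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: VPN00
MII Status: up

Slave Interface: VPN01
MII Status: up
"""


def summarize_bond(status_text):
    """Return the bonding mode and the MII status reported for each slave."""
    summary = {"mode": None, "slaves": {}}
    current_slave = None
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "Slave Interface":
            current_slave = value
            summary["slaves"][current_slave] = None
        elif key == "MII Status" and current_slave is not None:
            # Only per-slave status lines; the bond-wide one comes
            # before the first "Slave Interface" line and is skipped.
            summary["slaves"][current_slave] = value
    return summary


# Hypothetical usage on the box itself (bond name is an assumption):
#   with open("/proc/net/bonding/BOND00") as f:
#       print(summarize_bond(f.read()))
```

If a slave still shows "up" after its underlying link has been pulled, that would suggest the bond is not detecting the failure and is still sending traffic over the dead VPN. That is only a guess to check, not a confirmed cause.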
I have reconfigured the devices in my test lab a few times now.
Both devices have a similar config.
ETH00 = local lan
ETH01 = x.x.x.x
ETH02 = y.y.y.y
ETH01 is making a confirmed VPN tunnel to ETH01 on the other device.
ETH02 is making a confirmed VPN tunnel to ETH02 on the other device.
The VPNs are not configured with IPs, and routes are not configured at this stage.
Create a bond between the two VPN interfaces on each zeroshell box.
Give the bond an IP address on a different subnet from anything used so far.
Configure routes on each box to pass traffic via this bond interface.
All traffic is passing and running correctly.
Disconnect the cable or drop the link on one of the VPN interfaces in the bond at either end, and that's when things go funny.
I have a constant ping running from a device behind one zeroshell box to the other zeroshell box. The pings drop when I disconnect one of the links, and after that I may get 1 in 5 pings through. I have walked away for ten minutes and come back in case I was being impatient, but the loss remains.
I performed some other tests, such as telnetting to open ports on the other router to see if I get a connection, and browsing to the web interface of the other zeroshell box. It does connect, but the packet loss is huge: I struggle to get the login boxes to load correctly, etc. Pings are still dropping.
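As a sanity check on the numbers, here is a small simulation sketch. It assumes, and this is only an assumption, that neither box notices the dead link, so both keep handing packets to the two slaves in strict round-robin order, and that the ping is the only traffic on the bond. Under those assumptions every second echo request is lost, and every second reply to the surviving requests is lost as well.

```python
def simulate_pings(count, dead_link=1, links=2):
    """Count echo replies that make it back out of `count` pings.

    Both ends send in strict round-robin order and, by assumption,
    neither end stops using the dead link.
    """
    tx_local = 0   # round-robin counter on the box sending requests
    tx_remote = 0  # round-robin counter on the box sending replies
    answered = 0
    for _ in range(count):
        # The local box picks the next slave in turn for the request.
        request_link = tx_local % links
        tx_local += 1
        if request_link == dead_link:
            continue  # request lost, the remote box never sees it
        # The remote box picks its own next slave for the reply.
        reply_link = tx_remote % links
        tx_remote += 1
        if reply_link == dead_link:
            continue  # reply lost on the way back
        answered += 1
    return answered


if __name__ == "__main__":
    total = 100
    ok = simulate_pings(total)
    print(f"{ok} of {total} pings answered")
```

With two links and one dead, this model answers 1 ping in 4, which is in the same range as the 3 in 4 and 4 in 5 loss reported in this thread. Real traffic is not this regular, so the match is suggestive rather than proof that missing failure detection is the cause.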
No other configuration has been performed on my test routers: no QoS or firewall rules. Both are ALIX boxes running a fresh install of the image.
If you or anyone else can shed any light on what might be happening here, that would be great, as I would love to deploy this option on my WAN links.