Reply To: Date of a new release


#50022

atheling
Member

Since Jose Menendez mentions this thread in his excellent “Using ZeroShell as a Net Balancer, QoS server and Captive Portal” located at http://www.zeroshell.net/eng/documentation/ I thought I should update with the latest patch and some thoughts on the current issues.

First the patch. This is my “patch 7” that I sent to Fulvio on 19Apr2010. It changes the way pings are routed so that the kernel IP routing cache is no longer flushed every few seconds. This means there is now a little “stickiness” for multiple connections to a particular IP address. As before, this patch is to be applied to the Beta12 version of ZeroShell’s scripts.


Index: failoverd
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/failoverd,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 failoverd
--- failoverd 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ failoverd 19 Apr 2010 23:13:28 -0000
@@ -7,40 +7,39 @@
[ -z "$TC" ] && TC=1
TO=$((Timeout*TC))
echo Timeout $TO
for R in `seq 1 $ProbesDOWN` ; do
for I in ${SUCCESS[$GW]} $IPS ; do
IPP=${IP[$I]}
- ip ru add tos 0x04 to ${IP[$I]} ta 1$GW
- if ping -Q 0x04 -c 1 -w $TO ${IP[$I]} ; then
- ip ru del tos 0x04 2>/dev/null
+ iptables -t mangle --flush NB_FO_PRE
+ iptables -t mangle -A NB_FO_PRE -p icmp --icmp-type echo-request -d $IPP -j MARK --set-mark 1$GW
+ if ping -c 1 -w $TO ${IP[$I]} ; then
SUCCESS[$GW]=$I
return 0
fi
- ip ru del tos 0x04 2>/dev/null
done
done
return 1
}
function pingUP {
GW=$1
echo PROBEUP
for I in ${SUCCESS[$GW]} $IPS ; do
IPP=${IP[$I]}
- ip ru add tos 0x04 to ${IP[$I]} ta 1$GW
R=0
ERROR=""
+ iptables -t mangle --flush NB_FO_PRE
+ iptables -t mangle -A NB_FO_PRE -p icmp --icmp-type echo-request -d $IPP -j MARK --set-mark 1$GW
while [ $R -lt $ProbesUP -a -z "$ERROR" ] ; do
if ! ping -Q 0x04 -c 1 -w $Timeout ${IP[$I]} ; then
ERROR=yes
else
R=$((R+1))
fi

done
- ip ru del tos 0x04 2>/dev/null
if [ -z "$ERROR" ] ; then
SUCCESS[$GW]=$I
return 0
fi
done
return 1
Index: fw_initrules
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/fw_initrules,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 fw_initrules
--- fw_initrules 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ fw_initrules 1 Dec 2009 03:51:40 -0000
@@ -2,13 +2,13 @@
. /etc/kerbynet.conf
CHAIN="$1"
[ -z "$CHAIN" ] && exit 1
CONFIG="$REGISTER/system/net/FW/"
if [ "$CHAIN" == QoS ] ; then
TABLE="-t mangle"
- CH=FORWARD
+ CH=QoS
else
if [ "$CHAIN" == NetBalancer ] ; then
TABLE="-t mangle"
CH=NetBalancer
else
TABLE=""
@@ -23,12 +23,16 @@
iptables -A INPUT -j SYS_INPUT
iptables -A INPUT -p tcp --dport 80 -j SYS_HTTPS
iptables -A INPUT -p tcp --dport 443 -j SYS_HTTPS
iptables -A INPUT -p tcp --dport 22 -j SYS_SSH
fi
[ "$CHAIN" == OUTPUT ] && iptables -A OUTPUT -j SYS_OUTPUT
+ # If we are doing the QoS chain, then clear any marks left over from
+ # NetBalancing/failover routing. The QoS chain is applied after
+ # routing so there is no conflict.
+ [ "$CHAIN" == "QoS" ] && iptables $TABLE -A $CH -j MARK --set-mark 0x0
if [ -d $CONFIG/Chains/$CHAIN/Rules ] ; then
cd $CONFIG/Chains/$CHAIN/Rules
RULES=`ls`
for RULE in $RULES ; do
ENABLED="`cat $RULE/Enabled 2>/dev/null`"
if [ "$ENABLED" == yes ] ; then
Index: fw_makerule
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/fw_makerule,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 fw_makerule
--- fw_makerule 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ fw_makerule 1 Dec 2009 03:32:42 -0000
@@ -4,13 +4,13 @@
RULE="$2"
OPT="$3"
[ -z "$CHAIN" -a -z "$RULE" ] && exit 1
CONFIG="$REGISTER/system/net/FW"
if [ "$CHAIN" = QoS ] ; then
TABLE="-t mangle"
- CH=FORWARD
+ CH=QoS
else
if [ "$CHAIN" = NetBalancer ] ; then
TABLE="-t mangle"
CH=NetBalancer
else
TABLE=""
@@ -411,13 +411,13 @@
iptables $TABLE $IPT $TGT
if [ "$CHAIN" == QoS ] ; then
TGTDSCP=`cat $REGISTER/system/net/QoS/Class/$TARGET/DSCP 2>/dev/null`
if [ -n "$TGTDSCP" ] ; then
iptables $TABLE $IPT -j DSCP --set-dscp $TGTDSCP
fi
- iptables -t mangle -A FORWARD -m mark ! --mark 0 -j ACCEPT
+ iptables -t mangle -A QoS -m mark ! --mark 0 -j ACCEPT
fi
if [ "$CHAIN" == NetBalancer ] ; then
[ "$TARGET" != Auto ] && iptables -t mangle -A NetBalancer -m mark ! --mark 0 -j ACCEPT
fi
fi
fi
Index: fw_start
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/fw_start,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 fw_start
--- fw_start 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ fw_start 19 Apr 2010 22:51:13 -0000
@@ -10,12 +10,21 @@
iptables -t mangle -F NetBalancer 2>/dev/null
iptables -t mangle -X NetBalancer 2>/dev/null
iptables -t mangle -N NetBalancer 2>/dev/null
iptables -t mangle -F OpenVPN 2>/dev/null
iptables -t mangle -X OpenVPN 2>/dev/null
iptables -t mangle -N OpenVPN 2>/dev/null
+iptables -t mangle -F QoS 2>/dev/null
+iptables -t mangle -X QoS 2>/dev/null
+iptables -t mangle -N QoS 2>/dev/null
+iptables -t mangle -F NB_CT_PRE 2>/dev/null
+iptables -t mangle -X NB_CT_PRE 2>/dev/null
+iptables -t mangle -N NB_CT_PRE 2>/dev/null
+iptables -t mangle -F NB_FO_PRE 2>/dev/null
+iptables -t mangle -X NB_FO_PRE 2>/dev/null
+iptables -t mangle -N NB_FO_PRE 2>/dev/null
[ "$CPGW" == yes ] && iptables -N CapPort
$SCRIPTS/fw_https_chain
$SCRIPTS/fw_ssh_chain
$SCRIPTS/fw_sys_chain
CHAINS=`ls`
for C in $CHAINS ; do
Index: fw_viewchain
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/fw_viewchain,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 fw_viewchain
--- fw_viewchain 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ fw_viewchain 30 Nov 2009 19:30:43 -0000
@@ -1,7 +1,7 @@
#!/bin/sh
. /etc/kerbynet.conf
CHAIN="$1"
[ -z "$CHAIN" ] && exit 1
-[ "$CHAIN" == QoS ] && CHAIN="FORWARD -t mangle"
+[ "$CHAIN" == QoS ] && CHAIN="QoS -t mangle"
[ "$CHAIN" == NetBalancer ] && CHAIN="NetBalancer -t mangle"
iptables -n -v -L $CHAIN
Index: nb_fw
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/nb_fw,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 nb_fw
--- nb_fw 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ nb_fw 19 Apr 2010 22:54:18 -0000
@@ -1,24 +1,35 @@
#!/bin/sh
. /etc/kerbynet.conf
iptables -t mangle -D PREROUTING -j CONNMARK --restore-mark 2>/dev/null
-iptables -t mangle -D PREROUTING -j NetBalancer 2>/dev/null
-iptables -t mangle -D INPUT -j NetBalancer 2>/dev/null
-iptables -t mangle -D OUTPUT -j NetBalancer 2>/dev/null
+iptables -t mangle -D PREROUTING -m state --state NEW -j NB_CT_PRE 2>/dev/null
+iptables -t mangle -D PREROUTING -m state --state NEW -j NetBalancer 2>/dev/null
+iptables -t mangle -D INPUT -m state --state NEW -j NB_CT_POST 2>/dev/null
+iptables -t mangle -D OUTPUT -j CONNMARK --restore-mark 2>/dev/null
+iptables -t mangle -D OUTPUT -m state --state NEW -j NB_FO_PRE 2>/dev/null
+iptables -t mangle -D OUTPUT -m state --state NEW -j NetBalancer 2>/dev/null
iptables -t mangle -D OUTPUT -j OpenVPN 2>/dev/null
iptables -t mangle -D POSTROUTING -m state --state NEW -j NB_CT_POST 2>/dev/null
iptables -t mangle -D POSTROUTING -j NB_STAT 2>/dev/null
+# Need QoS to be done in mangle POSTROUTING. Note that if NetBalance
+# is enabled then we will insert those rules/chains first. So any
+# routing marks will be handled before we blow them away with QoS
+# marks.
+iptables -t mangle -D POSTROUTING -j QoS 2>/dev/null
+iptables -t mangle -I POSTROUTING 1 -j QoS 2>/dev/null
if [ "`cat $REGISTER/system/net/nb/Enabled 2>/dev/null`" = yes ] ; then
iptables -t mangle -I PREROUTING 1 -j CONNMARK --restore-mark
- iptables -t mangle -I PREROUTING 2 -j NetBalancer
+ iptables -t mangle -I PREROUTING 2 -m state --state NEW -j NB_CT_PRE 2>/dev/null
+ iptables -t mangle -I PREROUTING 3 -m state --state NEW -j NetBalancer
+ iptables -t mangle -I INPUT 1 -m state --state NEW -j NB_CT_POST 2>/dev/null
+ iptables -t mangle -I OUTPUT 1 -j CONNMARK --restore-mark
+ iptables -t mangle -I OUTPUT 2 -m state --state NEW -j NB_FO_PRE
+ iptables -t mangle -I OUTPUT 3 -m state --state NEW -j NetBalancer
+ iptables -t mangle -I OUTPUT 4 -j OpenVPN
iptables -t mangle -I POSTROUTING 1 -m state --state NEW -j NB_CT_POST 2>/dev/null
iptables -t mangle -I POSTROUTING 2 -j NB_STAT 2>/dev/null
- iptables -t mangle -I INPUT 1 -j NetBalancer
- iptables -t mangle -I OUTPUT 1 -j NetBalancer
- iptables -t mangle -I OUTPUT 2 -j OpenVPN
fi
$SCRIPTS/nb_vpn 2> /dev/null
$SCRIPTS/nb_setautomarking 2>/dev/null
-
-
-
+echo 300 > /proc/sys/net/ipv4/route/gc_min_interval
+echo 360 > /proc/sys/net/ipv4/route/gc_timeout

Index: nb_setautomarking
===================================================================
RCS file: /users/tfitch/cvsroot/Zeroshell/Zeroshell/kerbynet.cgi/scripts/nb_setautomarking,v
retrieving revision 1.1.1.1
diff -u -6 -w -r1.1.1.1 nb_setautomarking
--- nb_setautomarking 26 Nov 2009 22:13:35 -0000 1.1.1.1
+++ nb_setautomarking 16 Apr 2010 18:43:14 -0000
@@ -1,29 +1,69 @@
#!/bin/sh
. /etc/kerbynet.conf
CONFIG=$REGISTER/system/net/nb/Gateways
cd $CONFIG
function set_gwmark {
xGW="$1"
- INTERFACE=`cat $xGW/Interface 2>/dev/null`
+ INTF=`cat $xGW/Interface 2>/dev/null`
IP=`cat $xGW/IP 2>/dev/null`
+ # Set up the pre-routing chain for new connections from this Gateway. We want
+ # to mark all traffic originating from this gateway to be routed back out to the
+ # same gateway.
+
+ # If we have found the device, then mark all traffic coming in on it to use
+ # it for outbound responses
+ if [ "$INTF" != "" ] ; then
+ if ! iptables -t mangle -L NB_CT_PRE -n | grep -q -w `echo 1$xGW |awk '{printf ("0x%x",$0)}'` ; then
+ [ "`cat $xGW/Enabled 2>/dev/null`" = yes ] && iptables -t mangle -I NB_CT_PRE 1 -i $INTF -j MARK --set-mark 1$xGW
+ else
+ [ "`cat $xGW/Enabled 2>/dev/null`" != yes ] && iptables -t mangle -D NB_CT_PRE -i $INTF -j MARK --set-mark 1$xGW
+ fi
+ else
+ # If this Gateway has no interface device defined for it, see if we can get
+ # one based on the next hop IP address. There might be more than one gateway
+ # on the interface, so we need to qualify the match based on the packet's
+ # destination address which should be the interface's IP address.
+ if [ "$INTF" == "" ] ; then
+ if [ "$IP" != "" ] ; then
+ INTF=`ip route list table 1$xGW | grep ^default | grep -o "dev \w*" | awk 'BEGIN {FS=" "}{print $2}'`
+ LOCAL_IP=`ip route list table 1$xGW | grep $INTF | grep -o "src .*" | awk 'BEGIN {FS=" "}{print $2}'`
+ if [ "$INTF" != "" ] ; then
+ if ! iptables -t mangle -L NB_CT_PRE -n | grep -q -w `echo 1$xGW |awk '{printf ("0x%x",$0)}'` ; then
+ [ "`cat $xGW/Enabled 2>/dev/null`" = yes ] && iptables -t mangle -I NB_CT_PRE 1 -i $INTF -d $LOCAL_IP -j MARK --set-mark 1$xGW
+ else
+ [ "`cat $xGW/Enabled 2>/dev/null`" != yes ] && iptables -t mangle -D NB_CT_PRE -i $INTF -d $LOCAL_IP -j MARK --set-mark 1$xGW
+ fi
+ fi
+ fi
+ fi
+ fi
+
+ # In the post-routing phase, we want to get the routing realm used for new
+ # connections and save it in the connection. The first step here is to get the mark
+ # and put it on the packet. Our caller will emit the code to save the marks to
+ # the connection.
if ! iptables -t mangle -L NB_CT_POST -n | grep -q -w `echo 1$xGW |awk '{printf ("0x%x",$0)}'` ; then
[ "`cat $xGW/Enabled 2>/dev/null`" = yes ] && iptables -t mangle -I NB_CT_POST 1 -m realm --realm 1$xGW -j MARK --set-mark 1$xGW
else
[ "`cat $xGW/Enabled 2>/dev/null`" != yes ] && iptables -t mangle -D NB_CT_POST -m realm --realm 1$xGW -j MARK --set-mark 1$xGW
fi
+
+ # Make the entry in the statistics chain so we can track how much traffic went
+ # over each gateway
if ! iptables -t mangle -L NB_STAT -n | grep -q -w `echo 1$xGW |awk '{printf ("0x%x",$0)}'` ; then
[ "`cat $xGW/Enabled 2>/dev/null`" = yes ] && iptables -t mangle -I NB_STAT 1 -m mark --mark 1$xGW
else
[ "`cat $xGW/Enabled 2>/dev/null`" != yes ] && iptables -t mangle -D NB_STAT -m mark --mark 1$xGW
fi
}
GW="$1"
if [ -z "$GW" ] ; then
GW=`ls -d ?? 2>/dev/null`
iptables -t mangle -F NB_CT_POST
+ iptables -t mangle -F NB_CT_PRE
iptables -t mangle -F NB_STAT
for G in $GW ; do
set_gwmark $G
done
iptables -t mangle -D NB_CT_POST -j CONNMARK --save-mark 2> /dev/null
iptables -t mangle -A NB_CT_POST -j CONNMARK --save-mark

Next, the current known issues. Established TCP connections are tracked using the connection tracking logic in iptables, so all traffic for a given connection will use the correct WAN gateway. However, subsequent TCP connections may use a different WAN gateway. Some web sites, including all HTTPS sites, require that a series of TCP connections all come from the same IP address. Those sites will have problems with ZeroShell’s NetBalancer.
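As an aside, a small illustration (my reading of the patch, not anything authoritative) of the mark-numbering convention used throughout: gateway directory “01” gets routing mark “101”, i.e. the literal string 1$GW read as a decimal number. Since iptables -n -L prints marks in hex, the scripts convert with awk before grepping for an existing rule:

```shell
#!/bin/sh
# Gateway "01" -> mark string "101" -> hex 0x65 as shown by iptables -n -L.
GW=01
MARK=1$GW                                        # "101", a decimal string
HEX=`echo $MARK | awk '{printf ("0x%x",$0)}'`    # what iptables -n -L shows
echo $HEX
```

This is why the nb_setautomarking checks grep for the hex form of 1$xGW rather than the decimal string.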

To force a succession of TCP connections to use the same gateway we rely on the IP route cache. The first packet of a new connection triggers a decision on which gateway and next hop to use, and that information is saved in the IP route cache. When a new connection to the same IP address is encountered, the kernel routing logic looks in the route cache for the gateway and next hop to use. So the selection of gateway for a given IP address becomes “sticky”: all new connections to that address will use the same gateway.

This is good and bad. Good because it means HTTPS sites, and HTTP sites that check the client IP address against the session, will work. Bad because it means that load balancing is not effective.

This stickiness is maintained until the entry is removed from the route cache. That happens when the cache becomes full and older entries are removed to make room for new ones. It also happens periodically when all routes are flushed. It can also happen when a failover occurs.

So, barring failures, the stickiness is controlled by the route cache parameters that can be viewed or written in the /proc/sys/net/ipv4/route/ directory. The best explanation I have found of these parameters and what they do is located at:

http://lkml.indiana.edu/hypermail/linux/kernel/9901.3/1121.html
and
http://mailman.ds9a.nl/pipermail/lartc/2002q4/005296.html

Basically, the default settings include a value of 600 in /proc/sys/net/ipv4/route/secret_interval, which flushes all entries out of the route cache every 10 minutes.

I have not yet determined a set of values for these settings that works well for me, and based on your traffic patterns you will need different values than I do. I think the goal would be to have an entry age out of the route cache after a reasonable amount of time has passed since its last use, maybe 10 or 20 minutes. Since all entries are removed at the secret_interval, it would seem that secret_interval should be large, probably around a day (86400 seconds). Unfortunately there does not appear to be a way to specify a time of day for the flush, so we may end up flushing the cache in the middle of a user’s session.
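As a concrete starting point, here is the kind of thing I mean, hypothetical values rather than a tested recommendation. The parameter names come from the links above; verify they exist on your kernel before writing to them (the sketch skips any it cannot write):

```shell
#!/bin/sh
# Untested starting guesses: flush the whole cache only once a day, and let
# idle entries become eligible for removal after about 20 minutes.
for V in secret_interval=86400 gc_timeout=1200 ; do
    P=/proc/sys/net/ipv4/route/${V%%=*}
    if [ -w "$P" ] ; then
        echo ${V##*=} > $P
    else
        echo "skipping $P (not writable here)"
    fi
done
```

You would run this at boot, for example from a ZeroShell startup script, so the values survive a reboot.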

Next, we want some way to age out older entries that have not been used in a while. The kernel will only do this if the cache is getting full, and the default cache size is quite large, many thousands of entries. Based on the number of different IP addresses accessed per second, which will vary with every installation, a cache size should be picked so that the cache becomes full in the 10 or 20 minutes desired for removing old entries. Here I think it comes down to trial and error by each ZeroShell administrator to determine what works for their installation.
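The back-of-the-envelope arithmetic for that sizing looks like this. This is my own reasoning rather than anything from the patch, and the traffic figure is a placeholder you would need to measure for your own site:

```shell
#!/bin/sh
# If the site sees roughly N new destination IPs per second, and we want
# entries to live about T seconds before the cache fills and eviction
# begins, the cache needs to hold about N*T entries.
NEW_DESTS_PER_SEC=5          # assumption: measure this for your own site
DESIRED_LIFETIME=1200        # ~20 minutes, per the discussion above
SIZE=$((NEW_DESTS_PER_SEC*DESIRED_LIFETIME))
echo $SIZE                   # candidate for /proc/sys/net/ipv4/route/max_size
```

With those guesses the cache would be sized at a few thousand entries, which is in the same ballpark as the kernel default, so on a busy link you may find the default already turns over fast enough.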

If anyone has better insight into the mechanics of the Linux route cache and how to tune it for NetBalancing, please let me know.