November 24, 2010 at 6:18 am #42743
I’ve been running ZeroShell at a location that currently serves ~20 people. We have a 4 Mb optical fiber line running to our location. Our primary use of the internet is to access remote desktops in the United States (I should mention that we’re in China).
About our setup:
All users run into a switch, which then connects to the Zeroshell box (2.8 GHz Pentium 4, 1 GB RAM) at ETH01; ETH02 is the WAN-facing adapter. ETH01 and ETH02 are bridged on BRIDGE00. Zeroshell runs from a standard 4 GB Samsung flash drive. No Captive Portal/HTTProxy/Bandwidthd. Zeroshell is running as the network’s DHCP server and connects directly to the modem. It is the only router in the link.
Zeroshell is an outstanding piece of software, but unfortunately over the last two months our network has been hit with crippling latency at random periods. As our primary focus is to connect to RDP servers in the US, I’ve set up QoS to give preference to any connections coming to or from our various RDP IPs in the US. I’ve set the maximum bandwidth at 3.7 Mb for QoS, and only applied the rules to the LAN-facing ETH01 (this differs from the walk-through, but the consensus here appears to be that since we can’t QoS upstream, QoSing the WAN-facing adapter is moot).
Without the QoS, our 20+ individuals utilizing only 4Mb wreak havoc on the network. However this setup with zeroshell gives me exactly what I am looking for: rock-solid low latencies to our US-based servers while non-prioritized traffic (primarily HTTP) jumps all over the place.
So great, everything is working perfectly. However, for no apparent reason, at certain times of the day Zeroshell will start acting “funny”. Pings from a host to the router will go from the expected <1 ms to 100, 200, 300+ ms latency. My previously stable latency to US servers jumps from ~280 ms to 1000+ ms. All of this happens for maybe 10-15 minutes, and then it ends. Back to smooth sailing and low latency. Maybe once a day, maybe once a week; there appears to be no rhyme or reason. Users are apparently not doing anything particularly intensive during these spikes.
When the spikes happen, I watch iptraf/top and see the following: we’re not utilizing our maximum bandwidth (maybe 2-2.5 Mbps, not 3.7 up or down), CPU usage is minimal (~2-3%), and RAM usage is about 200 MB with 800 MB free. Nothing out of the ordinary. There doesn’t appear to be a rapid influx of packets (~800/sec) according to iptraf. Actually, everything looks 100% “normal” to me, although I am far from an expert in these matters.
Even odder, pinging our US servers from a host machine results in 1000+ ms latency during these slowdowns. However, if I SSH into Zeroshell and ping the same server, I surprisingly get the expected rock-solid latency of 280 ms. Could it have something to do with iptables? I’m thinking that pings from the router go through the OUTPUT chain and are not getting mangled in the FORWARD chain, so could there be something going on there? If I disable QoS during this period, latency spikes all around (pinging the US from both the host and the router). I have a feeling, though, that this is because once I turn off QoS we suddenly have far too many users clogging our small pipe with no restrictions.
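To pin down exactly when the spikes start and stop, the host-side pings could be logged with timestamps and lined up against what the router shows at the same moments. Below is a minimal sketch, assuming Python 3 on a Linux LAN host with the iputils `ping` (the `-c` and `-W` flags differ on other systems). The function names, the threshold, and the `192.0.2.10` address in the usage comment are placeholders for illustration, not anything from our actual setup.

```python
import re
import subprocess
import time
from typing import Optional

# Matches the round-trip field printed for each reply, e.g. "time=281 ms".
# The summary line ("..., time 0ms") has no "=" or "<" and is ignored.
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtt_ms(ping_output: str) -> Optional[float]:
    """Return the first round-trip time (ms) found in ping output, or None."""
    match = RTT_PATTERN.search(ping_output)
    return float(match.group(1)) if match else None


def is_spike(rtt_ms: Optional[float], threshold_ms: float) -> bool:
    """Treat a missing reply (None) or an RTT above the threshold as a spike."""
    return rtt_ms is None or rtt_ms > threshold_ms


def log_latency(target: str, threshold_ms: float, interval_s: float = 1.0) -> None:
    """Ping `target` until interrupted, printing a timestamped line per spike."""
    while True:
        # One echo request, waiting at most 2 seconds for the reply.
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", target],
            capture_output=True,
            text=True,
        )
        rtt = parse_rtt_ms(result.stdout)
        if is_spike(rtt, threshold_ms):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            shown = "no reply" if rtt is None else f"{rtt:.0f} ms"
            print(f"{stamp} {target} {shown}", flush=True)
        time.sleep(interval_s)


# Example call (placeholder address, stop with Ctrl+C):
# log_latency("192.0.2.10", threshold_ms=500)
```

Running one copy against the router's LAN address and another against a US server would show whether both paths degrade at the same instant.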
Does anybody have any ideas/thoughts as to why this might be happening? As I speak we just came off of a 20 minute latency spike.
I sincerely appreciate any assistance you guys might have,
Kyle B
November 25, 2010 at 11:34 am #51357
Look for errors in dmesg when your router is stalling.
Post it here if you need help.
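If the dmesg output is long, a rough filter can help narrow down what to post. This is only a sketch: it assumes the output was saved to a text file (the `dmesg.txt` name is a placeholder) and copied to a machine with Python 3, and the keyword list is a guess at what tends to matter, not a complete set.

```python
from typing import Iterable, List, Sequence

# Assumed keywords, chosen for illustration; extend them for your hardware.
KEYWORDS = ("error", "fail", "timeout", "reset", "dropped", "link is down")


def suspicious_lines(lines: Iterable[str],
                     keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that contain any keyword, ignoring case."""
    return [
        line for line in lines
        if any(keyword in line.lower() for keyword in keywords)
    ]


def load_lines(path: str) -> List[str]:
    """Read a saved dmesg dump, tolerating odd bytes in the file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()


# Example call (placeholder file name):
# for line in suspicious_lines(load_lines("dmesg.txt")):
#     print(line)
```

Anything it misses is still in the full dump, so keep that around as well.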