Using Tomato's QOS System

Tutorials » Using Tomato's QOS System

Background

The author has been involved in setting up WiFi in several large residential blocks, where it was important that the result not only worked but was simple to maintain by reception staff. Tomato’s QOS system was used to ensure that trolls lurking in their caves downloading files did not bring the whole thing to a grinding halt, as was the case before I was given the job. What was achieved has surprised many people here, including myself.

Ever sat in an internet shop, a hotel room or lobby, a local hotspot, and wondered why you can't access your email? Unknown to you, the guy in the next room or at the next table is hogging the internet bandwidth to download the Lord Of The Rings Special Extended Edition in 1080p HDTV format. You're screwed - because the hotspot router does not have an effective QOS system. In fact, I haven't come across a shop or an apartment block locally that has any QOS system in use at all. Most residents are not particularly happy with the service they [usually] pay for.

If you are a single user, then you probably don't need QOS at all. Just reducing conntrack timeouts may perform miracles for you.

Router “QOS”

A "QOS" (Quality Of Service) system running on a SOHO router is best viewed as a firmware strategy used to give priority to those applications which are important. Without it, anarchy rules, and the downloader will usually wreck the internet access for everybody else.

Many simple routers and unmanaged switches just forward traffic without looking at it and without doing anything special to it. Some switches and routers have several priority queues for network traffic (e.g. Tomato has 10 - which are Highest, High, Medium, Low, Lowest, A, B, C, D, E). These provide a basic kind of "QoS" by giving priority treatment to certain types of network traffic.

However, anyone searching the web for "QOS" will find that in engineering circles, QOS means something quite different to our simple little router's so-called "QOS". There are methods which tag each packet with a code that can be read by hardware along the traffic route, from your PC to the guy at the other end of the link, to tell that hardware how quickly to send the traffic - what PRIORITY it has (assuming the hardware is configured to obey the codes). The idea being that all routers across the internet would recognize these tags and give priority to the marked traffic as needed. You can, for example, purchase little adapters which mark packets they send, such as the popular Linksys PAP2. These plug between an analog phone and an ethernet jack, allowing use of the phone for VOIP.

Traffic marked by these adapters will therefore [supposedly] give priority to your VOIP traffic as it traverses the internet. VoIP calls via SIP in fact consist of SIP traffic that set initially up the call, and RTP traffic that actually carries the voice. Some devices can mark these two types of packets differently - so you could prioritise them differently if you had the hardware to do so.

Sounds good, doesn’t it? There’s just one little problem – it doesn’t work. For it to work all (or at least most) routers and switches across the internet have to take some notice of these tags – but sadly, they don’t. Even if they did, any ISP (or even user) could mark all of its traffic as high priority and then the whole thing is useless anyway. In fact, Windows 2000 is said to have done this in the past, and this is quite probably the best example of why it has not been implemented!

The simple “QOS System” as now used in the vast majority of SOHO routers does not mark traffic in this way and launch it on the internet in the hope that some benevolent genie will treat it nicely. We have to devise some other way to stop the pipe clogging up. So the aim of this article will be to show you how this can be done.

Since all that we can do, is therefore to process or condition traffic going OUT of our router, some myths have sprung up and arguments about “outgoing” and “incoming” QOS abound. I would remind the reader that this is not “true” QOS and that you must view it as an overall strategy. Don’t think of it as “outgoing” or “incoming” QOS, or you will become confused very quickly.

There are those who believe that we can only control what we send out from our router (the uplink) and cannot control our incoming traffic (downlink) at all. Sadly, there are a lot of such people especially in the various forums, disseminating misinformation and gloom, often with abuse thrown in for good measure when they can't get their own way. So I would ask you to please ignore those who insist that incoming data cannot be controlled at all and that QOS is therefore useless.

By looking at the overall picture of what is going on in an environment where many different connections are made simultaneously, we can manipulate the things we do have control of to have an effect on things which would at first sight appear to be outside of our control. The way we control incoming traffic is by manipulating what our router sends, in order to influence our incoming traffic. This can be more of an art than a science!

QOS in operation – is it effective?

I can best illustrate how effective Tomato’s QOS can be, by showing an example. A typical condominium block, with 250 rooms, about a hundred-odd users, all sharing an ADSL internet connection, can all happily use the internet without being aware that they are actually sharing a common line. Ping times drop from 250-450mS or worse without QOS to 35-55mS with some spikes when QOS is running. Since we have no control over what residents do with their machines, we have to ensure that the network runs well with anything that may be in use. This includes P2P, Mail, Webcams, IPTV, Messenger, Skype and VOIP, File transfers, YouTube, - you name it – we have it on our networks. Don't take my word for it - look on the Linksysinfo forum and you will find quite a few hotel operators and community ISP's using Tomato QOS.

Actually, for most residents, the most important thing is that WWW browsing is speedy and efficient. Anything else is seen as less important. Of course the fanatical games players see it another way, but I have to cater for the majority first. VOIP isn’t seen as a top priority in our blocks, for obvious reasons, but it can and does work very well. So I leave it to you. Does router 'QOS" work? I think you can see that it does. How well it actually works for you, will mostly depend on how much effort you put into understanding how to use it.

A word here. Often, when people read this thread, they complain that their brain hurts - that it's too difficult. Well, anything worthwhile is worth learning, isn't it? Or are you one of those people who always expects someone else to do everything for them? If you are just too lazy to read a couple of pages and try to understand them, then you shouldn't expect your router's QOS to work properly. Go watch TV.

There have been a small but steady stream of whiners wanting simple explanations and simple setup. My answer - you can find hundreds of simple explanations using Google. You can see how much thought has gone into them by looking at all of the figures neatly lined up - 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% in the setting boxes. Or 100% - 99% etc. Sometimes even everything set 1-100% in rate and limit, and no incoming limits. This clearly shows the author has not the slightest understanding of what he is doing. But yes, it's nice and simple. Go figure ….

To those who do genuinely want to learn and to do things for themselves, welcome, thanks for visiting this page, and good luck with your endeavors!

[1] Understanding what router QOS systems are and how they work

Let's begin by making some things a little clearer for newcomers to Tomato.

"Incoming" versus "Outgoing" QOS

Unfortunately many posts on the subject of QOS confuse people, especially newcomers, into misunderstanding what the router's QOS is, what it is NOT, what it is used for, and what it can really achieve if understood and used properly. Let's get this straight. There isn’t a “QOS for Uploads” and a “QOS for Downloads”.

This ongoing battle seems to arise from the fact that the QOS system operates on outgoing traffic. Therefore, many people do not understand how it can manipulate the situation to control INCOMING traffic. So they confuse everyone by swamping the forums with comments like "QOS doesn't work" and "the Incoming QOS is rubbish" - etc.

QOS would be of no interest whatsoever to most of us if unless it helped us with our incoming data flow. It really doesn't help to look at it as either "incoming" or "outgoing" QOS. Those people who keep insisting that because QOS only works on outgoing traffic (uploads) then it can’t work, are missing the whole point. I must stress this, because there are hundreds of people making stupid statements like this in the forums and unfortunately, too many people believe what they are saying.

So HOW does the router's QOS work, how does it make any difference to incoming traffic - if it only acts on the outgoing data? Well, it's actually very simple. [We will confine ourselves to the TCP protocol for the purpose of this discussion].

Take this analogy. Suppose there are a thousand people out there who will send you letters or parcels in the mail if you give them your address and request it (by ordering some goods, for example). Until you make your request, they don't know you and will not send you anything. But send them your address and a request for 10 letters and 10 parcels and they will send you 10 letters and 10 parcels. Ask for that number to be reduced or increased, or ask (pay!) for only letters and no parcels, and they will do so. If you get too much mail, you stop sending the requests or acknowledgements until it has slowed down to a manageable level. Unsolicited mail can be dealt with by ignoring it or by delaying receipt (payment) and the sender will send less and give up after a while. In other words, you stop more goods arriving at your house by simply not ordering more goods!

If you have letters arriving from several different sources, you stop or delay sending new orders to the ones you don't feel are important.

That's it! Do you understand the concept? You’ll see that it’s not an exact science. There are no “guarantees” that the remote sender will do exactly what you wish, but the chances are very good that you will be able to influence what he does.

The amount of mail you receive is usually directly proportional to the requests you send. If you send one request and get 10 deliveries - that is a 1:10 ratio. You've controlled the large amount of deliveries you receive with only the one order which you sent. Sending 1,000 requests at a 1:10 ratio would likely result in 10,000 letters received - more than your postman can deliver. So based on your experience, you can figure out the ratio of packets you are likely to receive from a particular request, and then LIMIT the number of your requests so that your postman can carry the incoming mail. But if you don't limit what you ask for, then the situation quickly gets out of control.

If despite your best efforts, too many packets arrive, then you can refuse to accept them. When those packets aren’t delivered, the guy sending them will slow down or stop.

It's not a perfect analogy, sure, but router QOS works in a similar way. You have to limit the requests and receipts that you send - and the incoming data reduces according to the ratio you determine by experience. If that still isn’t enough, we can refuse to accept them in an attempt to influence the remote sender to slow down.

The problem is you can have no absolute control what arrives at your PC - because your router does not know - and can never know - how many packets are in transit to you at any given time, in what order, and from what server. The only thing your router can do is remember what you SEND, see what comes back, and then respond to it. And the QOS system attempts to influence your incoming data stream indirectly by changing the data that you SEND in much the same way that you can control incoming mail simply by reducing your demand for it.

Now let us take the case where we are dealing with more than one “supplier” at once. If we decide that one supplier is more important than another, or you need a new fuel tank before you get a wheelnut for your motorbike, we can choose to process his orders first, and delay the others, by giving him a priority. There may be hundreds of “suppliers” sending you packets, and you can prioritize them as you wish by placing them into priority “classes” and processing them in order of their priority.

That is the whole purpose of the router-based QOS systems, and that is why it they have been developed, not merely to control uploads! However, you can't just check a magic box marked "limit all my P2P when I am busy with something more important" - you have to give clear instructions to the router in how to accomplish your aim. To do this it is necessary to understand how to control your incoming data by manipulating your outgoing requests, class priorities, and receipts for received packets. Added to this we also have the ability to limit or “shape” traffic by using bandwidth limits on both outgoing and incoming traffic.

Finally, we have to also consider UDP packets (rather less easy to control) and how to effectively control applications that use primarily UDP (VOIP, Multimedia etc).

Depending on your requirements that may take hours or months to get QOS working satisfactorily, my aim is to help you to do so.

[2] Setting Your Limits and defining rules for different applications

A look at QOS rate/limit settings with special reference to P2P Traffic - and why QOS often fails to work properly..

The router QOS system attempts to ensure that all important traffic is sent to the ISP first, and then tries to control or "shape" other traffic so that the higher priority incoming data is not delayed.

Packets from your PC will be “inspected” and compared with the router’s QOS classification rules to decide what priority they should have, and then assigned a place in the outgoing queue waiting to be sent to your ISP. Other mechanisms may also be used to manage the traffic so that the returning data from the remote server is delivered before that which is less important.

But someone has to define a set of QOS rules for a particular environment. That's YOU!

If you are a standalone user with one PC then you probably don't need QOS at all. If you are a P2P user and wish to download at absolute maximum speed, you will usually find QOS counter-productive. Where QOS is of the greatest benefit is when there are many connections and many users on a network, and one or more of them is preventing the others from working.

The worst problem faced by all of us in multi-user environments is P2P traffic, which can often take all available bandwidth. Hence, most discussions of QOS operation refer to P2P when giving examples of traffic control. We normally give P2P a low priority because most people want to browse online websites - and the P2P traffic slows their web browsing down.

The faster your ADSL line, the better your system will work, the more P2P you can allow on your network, and the better your VOIP and games will work. This is because of two things - firstly, obviously the overall speed improves. Secondly and more important, it is more difficult for P2P applications to actually generate enough traffic to fill the pipe. Overall, everything becomes less critical.

If you have a small network of 2 or 3 PC's then you may benefit from QOS, but it doesn't have to be too complicated. But if you have a larger network, similar to mine, which are large apartment blocks with about 250-400 rooms and maybe around 600-1200 residents, then QOS is absolutely essential. Without it, nobody will be able to do anything. Just a single P2P user will often ruin it for everyone else. However, the rules for correct QOS operation work just same for large or small networks - but you must decide for yourself how complex you want your rules to be, what applications running on your PC's you need to address. Inevitably, unnecessary rules will have an effect on throughput.

In a large block like mine, you have to try to cover everything, so your rules need a lot of thought. What we do is of the utmost importance if we want things to work properly, because if we screw up, everyone is dead in the water. Unfortunately, that means a very steep learning curve. It's also important to keep an open mind, and to understand that if a set of rules don't work, there is a reason. That reason is usually that you have overlooked and failed to address a particular set of circumstances.

The QOS in our router can only operate on outgoing data, but by “cause and effect” – this has a significant influence on the incoming data stream. After all, the incoming data to our router is what our QOS is *really* trying to control. QOS works by assigning a priority to certain classes of data at the expense of others, and also by controlling traffic by limits and other means - so as to enable prioritized traffic to actually get that priority.

Since UDP operates in a connectionless state, the main methods used by our router to control traffic involve manipulation of TCP packets. UDP, used for VOIP, IPTV applications, can't be controlled as such, but it can be helped by the reduction of TCP and other traffic congestion on the same link. In fact, some kinds of UDP traffic can be a huge drain on resources - and we will often need to prevent it from swamping our router. Sometimes that may mean just not allowing some kinds of UDP traffic.

We would usually like to allow WWW browsing to work quickly, and get our email, but aren’t too bothered about the speed of P2P – for example. In the event of huge amounts of traffic occurring which is too much for our bandwidth limitations, we also have to control the maximum amount of data which we attempt to send or receive over those links. This is called “capping”, “bandwidth limiting” or “traffic management”. This is also managed by the QOS system in our router and is a *part* of QOS.

So, once again a reminder - we must not refer to "incoming" or "outgoing" QOS. All of these mechanisms are PART of the "QOS" system on the router.

Time to really get down to business…

Let us have a look to see why many people fail to get QOS to work properly or at all, especially in the presence of large amounts of P2P. The original default rules in Tomato are almost useless - if better than nothing. So let's improve on them.

Firstly, let’s start by making the statement that “slow” web sessions are usually due to “bottlenecks” – your data is stuck in a queue somewhere. Let’s first assume that the route from your ISP to the remote web server is fast and stable. That leaves us with our router - which is something that we have some control over.

We are left with two points commonly responsible for bottlenecks.

1) Data sent by your PC’s, having been processed by QOS, is queued in the router waiting to be sent over the relatively slow “outgoing” uplink to your ISP. Let’s assume a 500kbps uplink.

2) Data coming from the remote web server, in response to your PC’s requests, is queued at the ISP waiting to be sent to your router. Let’s assume a 2Mbps downlink.

Bottleneck No. 1

Our PC's can usually send data to the router much faster than the router can pass it on to the ISP. This is the cause of the first "bottleneck". However, we can just leave the normal TCP/IP mechanisms in the PC to back off and sort out the problem of data being sent to the router too quickly, and it will take care of itself. But there is now another function associated with the sending of data by your router - to the ISP, which is the key to QOS operation.

Let me try to explain:

The incoming/outgoing data is queued in sections of the memory in the routers - these are known as “buffers”. A “buffer” is a place where data is stored temporarily while waiting to be processed. It is important not to let these “buffers” become full. If they are full, they are unable to receive more data, which is therefore lost. The lost data therefore has to be resent, resulting in a delay.

The transmit buffer in your own router contains data waiting to be sent to your ISP. This is an extremely important function. There must be room to “insert” packets at the front of the queue, so that it can be sent first - in order for QOS priorities to work properly. If there's no room to insert the data in the buffer, then QOS cannot work.

If your PC('s) can be slowed down so that they send data to the router at a slower rate then your router can send it to the ISP, we ensure that there will always be some free space in the buffer. This is the reason I recommend you to set the “Max Outbound" bandwidth in QOS-BASIC to approximately 85%, or even less, of the maximum “real” (measured) uplink speed.

I must stress that it is an absolute necessity that you set the outgoing limit at about 85% of the minimum bandwidth that you EVER observe on the line. THIS IS NOT NEGOTIABLE! You must measure the speed at different times throughout the day and night with an online speed test utility, with QOS turned off, and no other traffic - to determine the lowest speed obtained for that line. You then set 85% of this figure as your maximum permitted outgoing bandwidth useage. Just because this seems low to you, don't be tempted to set a higher figure. If you do, then the QOS system will not work correctly. To achieve best results for VOIP you can set a figure lower than this - 66% for example.

When this maximum outgoing bandwidth limit is reached - packets from the PC's are dropped by the router, causing the PC's on your network to slow down by backing off, and to resend the data after a wait period. Note that this is actually "traffic shaping" between your PC('s) and the router. This takes care of itself and is only mentioned in passing. You don't have to do anything.

Now, let’s consider QOS in operation. Imagine some unimportant data that you wish to send to your ISP, presently stored in the router's transmit buffer. As it is being sent, you might start up a new WWW session which you would prefer took priority. What we need to do is to insert this new data at the head of the queue so that it will be sent first. When you set a “priority” for a particular class, you are instructing the router that packets in certain class groups need to be sent before other classes, and the router will then arrange the packets in the correct order to be sent, with the highest priority data at the front of the queue, and the lowest at the back. This is quite independent of any limits, or traffic shaping, that the QOS system may ALSO do.

Now, we are going to assume that we have defined a WWW class of HIGH with no limits. Let’s imagine the router has just been switched on, and we then open a WWW or HTTP session. A packet (or packets) is sent to the remote server requesting a connection - this is quite a small amount of data. The server responds by sending us an acknowledgment, and the session begins by our requesting the server to send us pages and/or images/files. The server sends quite large amounts of data to the us, but we respond with quite a small stream of “ACK” packets acknowledging receipt. There is an approximate ratio between the received data and our sent traffic consisting mostly of receipts for that data [ACKS], and requests for resends.

Bottleneck No. 2 - The BIG ONE

This relationship between the data we send and the date we receive varies with the applications and protocols in use, but is usually of the order of at least 1:10 or 1:20, but it can rise to around 1:50 especially with P2P connections. So an unlimited outgoing data rate of 500kbps *could* result in an incoming data stream of anything from 5 to 25Mbps - which would of course be far too much for our downlink of 2Mbps. Our data would therefore be queued at the ISP waiting to be sent to our router. Most of it will never be received – it will be “dropped” by the ISP’s router. All other traffic will also be stuck in the same queue, and our response time is awful. This is bottleneck no. 2 in the above list.

How do you prevent this bottleneck? Well, firstly, you have to restrict the amount of data that you SEND to the remote server so that it will NOT send too much data back for your router to process. You have absolutely no control over anything else - you cannot do anything except play around with what you SEND to the remote server. And what you SEND determines what, and how much, traffic will RETURN. Understanding how to use the former to control the latter is the key to successful QOS operation. And how to do that, you can only learn from experience.

Let's go back for a moment to the analogy in the introduction:

Suppose there are a thousand people out there who will send you letters or parcels in the mail if you give them your address and request it. Until you request it, they don't know you and will not send you anything. Send them your address and a request for 10 letters and 10 parcels and they will send you 10 letters and 10 parcels. Ask for that number to be reduced or increased, or ask for only letters and no parcels, and they will do so. If you get too much mail, you stop sending the requests or acknowledgements until it has slowed down to a manageable level. Unsolicited mail can be dealt with by ignoring it or delaying receipt and the sender will send less and give up after a while.

The amount of mail you receive is usually directly proportional to the requests you send. If you send one request and get 10 letters, that is a 1:10 ratio. You've controlled the large amount of letters you receive with only the one letter which you sent. Sending 1,000 requests at a 1:10 ratio would result in 10,000 letters received - more than your postman can deliver. So based on your experience, you can figure out the ratio of letters you are likely to receive from a particular request, and then LIMIT the number of your requests so that your postman can carry the incoming mail. But if you don't limit what you ask for, then the situation quickly gets out of control.

So, we have to understand how the amount of incoming data is influenced by what we send. Experience tells us that for some applications aproximately a 1:10 ratio of sent to received data is normal, while for others it can be less than 1:50 or even more (esp. P2P).

To examine the effect of this "ratio" between sent and received TCP data in more detail we’ll use P2P – the real PITA for most routers and the application that we most often have trouble with. We will define a class of "D" for P2P with a rate of 10% (50kbps) and a limit of 50% (250k) and start off the P2P client with a load of popular movies, Linux distros, or whatever is needed. Now we look at the result. The link starts sending at 50kbps and quickly increases to 250kbps outgoing data (which is mostly acknowledgements for incoming traffic). Because of our 1:20 or more ratio between send and receive, we get perhaps 5Mbps or more INCOMING data from the P2P seeders in response. That is far too fast for our miserable little downlink of 2Mbps, and is queued at the ISP’s router waiting for our own router to accept it. The downlink has become saturated. Any other traffic is also stuck in this queue. When most of these packets fail to be delivered, after a preset period of time they are discarded by the ISP’s router and are lost.

As it does not receive any acknowledgement of receipt from our PC for the missing packets, the originating server “backs off” in time and resends the lost data after a short delay. It keeps doing this, increasing the delay exponentially each time, until the data rate is slowed down enough that the link congestion is relieved and packets are no longer dropped. It may take a long time to do this, but in theory, at least, eventually the link will stabilize.

By looking at the “realtime or 24 hour” graphs in Tomato, it is easy to see when your downlink is being saturated. The graph will “flat top” at maximum bandwidth, with very few and small peaks and troughs noticeable in the graph. You must never let it reach the maximum bandwidth figure, or your attempts at QOS will not work.

Right - let’s see what we can do about this !

There are some different mechanisms available for us to use which will have the effect of slowing down an incoming data stream. At first I will concentrate on the most important one, which would produce the best speed and response for other classes despite having several online P2P clients.

Reducing outgoing traffic for a class.

We drop the P2P class rate down to 1% (5k) and the limit to 10% (50k) - and watch what happens. The incoming data from the remote server(s) now also drops to maybe 500kbps - 1Mbps (cause and effect). This is OK and fits within our available 2Mbps bandwidth downlink, while a simultaneous WWW session is still quite fast and responsive. However, this is a simplistic view, because the “1:20 ratio” is not *always* applicable, and high-bandwidth seeders may actually send you more data than expected, nevertheless it will still probably be within the 2 Mbps link speed. However, if you try to do better than this and increase the outgoing limit to 20%, it MIGHT still be OK – or it more probably might NOT, depending on the material being sent to you, the number of seeders, the number of connections open at any given time, and many other factors which all have an effect on the link.

At more than 20% the simultaneous WWW session may start to slow down and is generally unresponsive as the incoming downlink starts to saturate. You must find this critical limit yourself and stick below it. You really do need to err on the low side to be absolutely certain that the downlink does NOT become saturated, or the QOS will break. I will discuss the pros and cons of increasing this setting to enable us to download more P2P later. We will show then how to use incoming traffic limits to allow this. But for the moment, stay with me.

TO RECAP - It is quite likely that setting your outgoing P2P traffic limit to more than 15-20% will begin to saturate your downlink with P2P, causing QOS to be ineffective. You have to decide on a compromise setting that allows higher P2P activity while still allowing a reasonably quick response to priority traffic like HTTP. [Shortly, we will see how to combine two methods to achieve this].

Still, let’s set it to 20% (100k UP) and be optimistic - phew – everything’s still OK. But we’ve hit a snag already – especially with P2P applications.

Consider what happens, for example, when your P2P application needs to UPLOAD a lot of files in order to gain “credits”. Your PC uploads lot of data, perhaps quickly filling your “upload” allocation of 100k. BUT this class is shared with the receipts (ACKS) you are sending out in response to incoming files. These packets no longer have exclusive access to the router's buffers, and since they have no special priority in the queue, may be delayed. Now your downloads will also slow down and can no longer reach the normal speed - they may even drop down to almost nothing. At this point you might think there is something wrong with QOS. But QOS is actually working correctly, and it is your understanding of how P2P operates and your application of the rules that is in question.

Your uploads have dominated the connection because you didn't anticipate what might happenitalic text. You allowed uploading seeds to dominate your connection, when what you really wanted to do was to allow downloads. So remember that when you deal with P2P, and decide what is your aim. Seeding isn’t usually very practical with most of our ADSL lines, downloads are what people usually want.

Limiting the incoming TCP data rate of a class

A better solution can be achieved by ALSO using the “incoming” traffic limit in Tomato P2P class to set a limit on incoming P2P data. So how does this work? The connection tracking section of the router firmware keeps a record of all outgoing P2P TCP packets and then attempts to keep a tally on any incoming TCP packets associated with it. It can therefore add them all up and then calculate the speed of the incoming P2P, which can then be limited. So we could, for example, set an incoming limit on our connection of something under 2 Mbps. If this is exceeded, the router will drop packetsitalic text, forcing the sender to back off and resend the data – once again allowing the link to stabilize. Tomato's QOS / Limiter is actually just using the normal method of TCP congestion control to shape traffic of the individual classes. [To better understand how the normal built-in backoff strategies of the TCP/IP protocols operate, you must use Google and read up primers on TCP/IP operation.]

This is, of course, the reason why a maximum incoming limit is sometimes recommended to be initially set in QOS/BASIC for rather less than the maximum “real” speed normally achievable from your ISP. It is an attempt to slow down the link before it becomes saturated. That is why it is often recommended to set to something LOWER than the maximum, usually 85% or so. If it is allowed to saturate, then it's too late - your QOS isn't working.

This is a good time to mention something about the maximum setting in Tomato's incoming limit settings.

Please note that the "Maximum" figure that we set in the incoming category is NOT in itself a limit. There is no overall limit in Tomato. This figure is just used to calculate the percentages of the individual classes. So we can at present only set a limit on each CLASS. However, you will quickly realize that the sum of these classes can now add up to more than the bandwidth that we have available! In short - Tomato's QOS incoming bandwidth limiter is fundamentally flawed.

Because of this, if you run a busy network, you've probably noticed that in practice it is actually unable to keep the incoming data pegged low. Heavy traffic on a couple of classes may well exceed the total bandwidth available. Actually, in order to always work consistently, the sum of the limits should add up to less than 100% of the bandwidth we have available. But if we do that - we end up with quite low throughput on some of our classes - they can't use all of the bandwidth. Tomato's QOS is unfinished !

Now, these figures we are bandying about are not cast in stone. While a link is busily "stabilizing itself", new connections are constantly being opened by WWW, Mail, Messenger, and especially other P2P seeders, while other connections may close unpredictably, and that upsets the whole thing. The goalposts are constantly moving! You will see from this that P2P in particular is very difficult to accurately control. Over a period, the average should approximate the limit figures. Best latency is achieved with a combination of 1) and 2). Juggling them to accomplish what you want is an art.

If you want to see your QOS working quickly and with good latency, set the incoming total of limits low at around 66% of your ISP's maximum speed.

These graphs of the latency of a 1.5Mbps ADSL line under differing loads, and the result of limiting inbound traffic, show clearly that this figure of 66% is something you ignore at your peril!

Now let's add some additional information onto the first graph. You can see that ping response begins to be affected from 1Mbps pwards, even at 1.2Mbps it has become quite bad! At 1.3mbps is it severely affected.

(Graphs thanks to Jared Valentine).

It is important not to rely 100% on the incoming limit especially while you set up QOS. Set it only when all else has been adjusted and you can see if your outgoing settings are causing congestion. If you try to set up your QOS with incoming limits set, it will actually make it rather difficult for you to see what is happening as a result of your settings, because the limit will kick in and mask what is going on. Initially, it is useful to set the incoming overall limit to 999999 so that it is in effect switched off, this will make things easier for you while examining your graphs and adjusting your QOS parameters. But once your QOS rules are in place it ALWAYS pays to impose an incoming limit for many applications as well as an overall limit.

Incidentally, there is a big difference in the class limits between 100% and NONE. 100% = 100% of your overall limit, NONE means ignore the overall limit.

To recap - For best throughput and reasonable response times and speeds, set incoming class limits quite high if you wish. You can set NONE=no limit at all for an important priority class such as WWW browsing. For best latency, set incoming limits lower. I found 50% maximum limits to be extremely responsive, 66% good, 80% still fairly reasonable but ping times beginning to suffer under load, and things dropped off noticeably after that. As a compromise, I use 80% for my maximum incoming limits, and most residents appear to be happy with the result.

You sacrifice bandwidth for response/latency.

In order for WWW to be snappy when using a restriction on other traffic, I usually set my WWW class limit to "NONE" so that it will attempt to use ALL available bandwidth for the fastest response.

Limiting numbers of TCP and UDP connections

If your router crashes or becomes unstable due to P2P applications opening large numbers of connections, try to limit the number of ports that a user can open.

Here is a collection of useful scripts: Put one or more of the following in the "Administration/Scripts/Firewall" box. Check that function before adding another rule. You may list the iptables rules by telnet to the router and issuing the command "iptables -L" ["-vnL" for verbose output] or "iptables -t nat -vnL". If you are running a recent tomato mod, you can also do this from the "system" command line entry box, which is much more convenient. [Another useful command: iptables -t mangle -vnL]

Now an explanation. Example Linux firewalls normally use the INPUT and FORWARD chains. The FORWARD chain defines the limit on what is sent to the WAN (the internet). This therefore places a limit on the connections to the outside from each client on your network. The INPUT chain limits what comes in from the internet to each client. Without this limit, the router can still be overloaded by incoming P2P etc.

Placing limits into either of these chains, which is usually recommended, does work, but in the event of a "real" DOS attack or SMTP mail trojan, the router often instantly reboots without so much as a single entry in the logs.

Following much investigation and discussion with phuque99 on Linksysinfo.org forum, following his suggestion the scripts were instead placed in the PREROUTING chains where they are processed first. BINGO! The router seems to stay up and running.

This is what I now recommend:

#Limit TCP connections per user
iptables -t nat -I PREROUTING -p tcp —syn -m iprange —src-range 192.168.1.50-192.168.1.250 -m connlimit —connlimit-above 150 -j DROP

#Limit all *other* connections per user including UDP
iptables -t nat -I PREROUTING -p ! tcp -m iprange —src-range 192.168.1.50-192.168.1.250 -m connlimit —connlimit-above 100 -j DROP

#Limit outgoing SMTP simultaneous connections
iptables -t nat -I PREROUTING -p tcp —dport 25 -m connlimit —connlimit-above 5 -j DROP

The next script is to prevent a machine with a virus from opening thousands of connections too quickly and taking up our bandwidth. I don't like this much, because it can prevent a lot of things working properly. Use with caution and adjust the figures to suit your setup.

iptables -t nat -I PREROUTING -p udp -m limit —limit 20/s —limit-burst 30 -j ACCEPT

NOTE If you test the above scripts with a limit of say 5 connections in the line, you will often see that it doesn't appear to be working, you will have many more connections than your limit, maybe 30-100, that you can't explain. Some of these may be old connections that have not yet timed out, and waiting for a while will fix it. Be aware that often these may be TEREDO or other connections associated with IPv6 (windows Vista, and 7) which is enabled by default. You should disable it on your PC by command line:

netsh
interface
teredo
set state disabled

Conntrack Timeout Settings

If your router becomes unstable, perhaps freezing or rebooting, apparently randomly, then it may have been asked to open too many connections, filling the connection tracking table and running the router low on memory. Often this can happen because poorly behaved applications (usually P2P clients) can attempt to open thousands of connections, mostly UDP, in a short space of time, just a few seconds. The router often does not record these "connection storms" in the logs, because it runs out of memory and crashes before it has time to do so.

Obviously, there is a flaw in the firmware, which most definitely should never allow this situation to happen. Until such time as we can correct this situation, we must resort to some means of damage prevention and control. Setting the timeout value of TCP and especially UDP connections is necessary.

Setting the number of allowed connections high (say 8192) makes the situation worse. In fact this number is almost never required. Most connections shown in the conntrack page will actually be old connections waiting to be timed out. Leaving the limit low, say 2000 to 3000 connections, gives the router more breathing space to act before it crashes.

The following settings have been found to help limit the connection storm problem somewhat, without too many side effects.

TCP
None 100
Established 1200
Syn Sent 20
Syn Received 20
FIN Wait 20
Time Wait 20
Close 20
Close Wait 20
Last Ack 20
Listen 120

UDP
Unreplied 10 (25 is often necessary for some VOIP applications to work, otherwise reduce it to 10)
Assured 10 (Some VOIP users may find it necessary to increase this towards 300 to avoid connection problems. Use the smallest number that is reliable).

Generic
10 for both

ADDIT .. NOVEMBER 2009

Teddy Bear is now compiling Tomato under a newer version (2.6) of the Linux Kernel. The "NONE" and "LISTEN" settings have been eliminated. There are two new settings, "GENERIC" and ICMP. The ICMP is self explanatory, the "generic" timeout which is used for all TCP/UDP connections that don't have their own timeout setting.
|
|