Filename: 182-creditbucket.txt
Title: Credit Bucket
Author: Florian Tschorsch and Björn Scheuermann
Created: 22 Jun 2011
Status: Obsolete
Note: Obsolete because we no longer have a once-per-second bucket refill.
Overview:
The following proposal targets the reduction of queuing times in onion
routers. In particular, we focus on the token bucket algorithm in Tor and
point out that current usage unnecessarily locks cells for long time spans.
We propose a non-intrusive change in Tor's design which overcomes the
deficiencies.
Motivation and Background:
Cell statistics from the Tor network [1] reveal that cells reside in
individual onion routers' cell queues for up to several seconds. These
queuing times increase the end-to-end delay very significantly and are
apparently the largest contributor to overall cell latency in Tor.
In Tor there exist multiple token buckets on different logical levels. They
all work independently. They are used to limit the up- and downstream of an
onion router. All token buckets are refilled every second with a constant
amount of tokens that depends on the configured bandwidth limits. For
example, the so-called RelayedTokenBucket limits relay traffic only. All
read data of incoming connections are bound to a dedicated read token
bucket. An analogous mechanism exists for written data leaving the onion
router. We were able to identify the specific usage and implementation of
the token bucket algorithm as one cause for very high (and unnecessary)
queuing times in an onion router.
We observe that the token buckets in Tor are (surprisingly at a first
glance) allowed to take on negative fill levels. This is justified by the
TLS connections between onion routers where whole TLS records need to be
processed. The token bucket on the incoming side (i.e., the one which
determines at which rate it is allowed to read from incoming TCP
connections) in particular often runs into non-negligible negative fill
levels. As a consequence of this behavior, sometimes slightly more data is
read than it would be admissible upon strict interpretation of the token
bucket concept.
However, the token bucket for limiting the outgoing rate does not take on
negative fill levels equally often. Consequently, it regularly happens
that somewhat more data are read on the incoming side than the outgoing
token bucket allows to be written during the same cycle, even if their
configured data rates are the same. The respective cells will thus not be
allowed to leave the onion router immediately. They will thus necessarily
be queued for at least as long as it takes until the token bucket on the
outgoing side is refilled again. The refill interval currently is, as
mentioned before, one second -- so, these cells are delayed for a very
substantial time. In summary, one could say that the two buckets, on the
incoming and outgoing side, work like a double door system and frequently
lock cells for a full token bucket refill interval length.
General Design:
In order to overcome the described problem, we propose the following
changes related to the token bucket algorithm.
We observe that the token bucket on the outgoing connections with its
current design is contra productive in the sense of queuing times. We
therefore propose modifications to the token bucket algorithm that will
eliminate the "double door effect" discussed above.
Let us start from Tor's current approach: Thus, we have a regular token
bucket on the reading side with a certain rate and a certain burst size.
Let x denote the current amount of tokens in the bucket. On the outgoing
side we need something appropriate that monitors and constrains the
outgoing rate, but at the same time avoids holding back cells (cf. double
door effects) whenever possible.
Here we propose something that adopts the role of a token bucket, but
realizes this functionality in a slightly different way. We call it a
"credit bucket". Like a token bucket, the credit bucket also has a current
fill level, denoted by y. However, the credit bucket is refilled in a
different way.
To understand how it works, let us look at the possible operations:
As said, x is the fill level of a regular token bucket on the incoming
side and thus gets incremented periodically according to the configured
rate. No changes here.
If x<=0, we are obviously not allowed to read. If x>0, we are allowed to
read up to x bytes of incoming data. If k bytes are read (k<=x), then we
update x and y as follows:
x = x - k (1)
y = y + k (2)
(1) is the standard token bucket operation on the incoming side. Whenever
data is admitted in, though, an additional operation is performed: (2)
allocates the same number of bytes on the outgoing side, which will later
on allow the same number of bytes to leave the onion router without any
delays.
If y + x > -M, we are allowed to write up to y + x + M bytes on the
outgoing side, where M is a positive constant. M specifies a burst size for
the outgoing side. M should be higher than the number of tokens that get
refilled during a refill interval, we would suggest to have M in the order
of a few seconds "worth" of data. Now if k bytes are written on the
outgoing side, we proceed as follows:
If k <= y then y = y - k
In this case we use "saved" credits, previously allocated on the incoming
side when incoming data has been processed.
If k > y then y = 0 and x = x - (k-y)
We generated additional traffic in the onion router, so that more data is
to be sent than has been read (the credit is not sufficient). We therefore
"steal" tokens from the token buffer on the incoming side to compensate for
the additionally generated data. This will result in correspondingly less
data being read on the incoming side subsequently. As a result of such an
operation, the token bucket fill level x on the incoming side may become
negative (but it can never fall below -M).
If y + x <= -M then outgoing data will be held back. This may lead to
double-door effects, but only in extreme cases where the outgoing traffic
largely exceeds the incoming traffic, so that the outgoing bursts size M is
exceeded.
Aside from short-term bursts of configurable size (as with every token
bucket), this procedure guarantees that the configured rate may never be
exceeded (on the application layer, that is; as with the current
implementation, an attacker may easily cause the onion router to
arbitrarily exceed the limits on the lower layers). Over time, we never
send more data than the configured rate: every sent byte needs a
corresponding token on the incoming side; this token must either have been
consumed by an incoming byte before (it then became a "credit"), or it is
"stolen" from the incoming bucket to compensate for data generated within
the onion router.
Specific Design Changes:
In the following we briefly point out the specific changes that need to be
done in Tor's source code. By doing so one can see how non intrusive our
modifications are.
First we need to address the bucket increment and decrement operations.
According to the described logic above, this should be done in the methods
connection_bucket_refill and connection_buckets_decrement respectively. In
particular allocating, saving and "stealing" of tokens need to be
considered here.
Second the rate limiting, i.e. the amount we are allowed to write
(connection_bucket_write_limit) needs to be adapted in lines of the credit
bucket logic. Meaning in order to avoid the here identified unnecessary
queuing of cells, we need to consider the new burst parameter M. Here we
also need to take non rate limited connections such as from the localhost
into account. The rate limiting on the reading side remains the same.
At last we need to find good values/ ratios for the parameter M such that
the trade off between avoiding "double door effects" and maintaining
strict rate limits work as expected. As future work and after insights
about the performance gain of the here described proposal we need to find a
way to implement this both using bufferevent rate limiting with libevent
2.3.x and Tor's rate limiting code.
Conclusion:
This proposal can be implemented with moderate effort and requires changes
only at the points where currently the token bucket operations are
performed.
We feel that this is not the be-all and end-all solution, because it again
introduces a feedback loop between the incoming and the outgoing side. We
therefore still hope that we will be able to come to a both simpler and
more effective design in the future. However, we believe that what we
proposed here is a good compromise between avoiding double-door effects to
the furthest possible extent, strictly enforcing an application-layer data
rate, and keeping the extent of changes to the code small.
Feedback is highly appreciated.
References:
[1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009.
[2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011