| .TH CAKE 8 "19 July 2018" "iproute2" "Linux" |
| .SH NAME |
| CAKE \- Common Applications Kept Enhanced (CAKE) |
| .SH SYNOPSIS |
| .B tc qdisc ... cake |
| .br |
| [ |
| .BR bandwidth |
| RATE | |
| .BR unlimited* |
| | |
| .BR autorate-ingress |
| ] |
| .br |
| [ |
| .BR rtt |
| TIME | |
| .BR datacentre |
| | |
| .BR lan |
| | |
| .BR metro |
| | |
| .BR regional |
| | |
| .BR internet* |
| | |
| .BR oceanic |
| | |
| .BR satellite |
| | |
| .BR interplanetary |
| ] |
| .br |
| [ |
| .BR besteffort |
| | |
| .BR diffserv8 |
| | |
| .BR diffserv4 |
| | |
| .BR diffserv3* |
| ] |
| .br |
| [ |
| .BR flowblind |
| | |
| .BR srchost |
| | |
| .BR dsthost |
| | |
| .BR hosts |
| | |
| .BR flows |
| | |
| .BR dual-srchost |
| | |
| .BR dual-dsthost |
| | |
| .BR triple-isolate* |
| ] |
| .br |
| [ |
| .BR nat |
| | |
| .BR nonat* |
| ] |
| .br |
| [ |
| .BR wash |
| | |
| .BR nowash* |
| ] |
| .br |
| [ |
| .BR split-gso* |
| | |
| .BR no-split-gso |
| ] |
| .br |
| [ |
| .BR ack-filter |
| | |
| .BR ack-filter-aggressive |
| | |
| .BR no-ack-filter* |
| ] |
| .br |
| [ |
| .BR memlimit |
| LIMIT ] |
| .br |
| [ |
| .BR fwmark |
| MASK ] |
| .br |
| [ |
| .BR ptm |
| | |
| .BR atm |
| | |
| .BR noatm* |
| ] |
| .br |
| [ |
| .BR overhead |
| N | |
| .BR conservative |
| | |
| .BR raw* |
| ] |
| .br |
| [ |
| .BR mpu |
| N ] |
| .br |
| [ |
| .BR ingress |
| | |
| .BR egress* |
| ] |
| .br |
| (* marks defaults) |
| |
| |
| .SH DESCRIPTION |
| CAKE (Common Applications Kept Enhanced) is a shaping-capable queue discipline |
| which uses both AQM and FQ. It combines COBALT, which is an AQM algorithm |
| combining Codel and BLUE, a shaper which operates in deficit mode, and a variant |
| of DRR++ for flow isolation. 8-way set-associative hashing is used to virtually |
| eliminate hash collisions. Priority queuing is available through a simplified |
| diffserv implementation. Overhead compensation for various encapsulation |
| schemes is tightly integrated. |
| |
| All settings are optional; the default settings are chosen to be sensible in |
| most common deployments. Most people will only need to set the |
| .B bandwidth |
| parameter to get useful results, but reading the |
| .B Overhead Compensation |
| and |
| .B Round Trip Time |
| sections is strongly encouraged. |
| |
| .SH SHAPER PARAMETERS |
| CAKE uses a deficit-mode shaper, which does not exhibit the initial burst |
| typical of token-bucket shapers. It will automatically burst precisely as much |
| as required to maintain the configured throughput. As such, it is very |
| straightforward to configure. |
| .PP |
| .B unlimited |
| (default) |
| .br |
| No limit on the bandwidth. |
| .PP |
| .B bandwidth |
| RATE |
| .br |
| Set the shaper bandwidth. See |
| .BR tc(8) |
| or examples below for details of the RATE value. |
| .PP |
| .B autorate-ingress |
| .br |
| Automatic capacity estimation based on traffic arriving at this qdisc. |
| This is most likely to be useful with cellular links, which tend to change |
| quality randomly. A |
| .B bandwidth |
| parameter can be used in conjunction to specify an initial estimate. The shaper |
| will periodically be set to a bandwidth slightly below the estimated rate. This |
| estimator cannot estimate the bandwidth of links downstream of itself. |
| |
| .SH OVERHEAD COMPENSATION PARAMETERS |
| The size of each packet on the wire may differ from that seen by Linux. The |
| following parameters allow CAKE to compensate for this difference by internally |
| considering each packet to be bigger than Linux informs it. To assist users who |
| are not expert network engineers, keywords have been provided to represent a |
| number of common link technologies. |
| |
| .SS Manual Overhead Specification |
| .B overhead |
| BYTES |
| .br |
| Adds BYTES to the size of each packet. BYTES may be negative; values |
| between -64 and 256 (inclusive) are accepted. |
| .PP |
| .B mpu |
| BYTES |
| .br |
| Rounds each packet (including overhead) up to a minimum length |
| BYTES. BYTES may not be negative; values between 0 and 256 (inclusive) |
| are accepted. |
| .PP |
| .B atm |
| .br |
| Compensates for ATM cell framing, which is normally found on ADSL links. |
| This is performed after the |
| .B overhead |
| parameter above. ATM uses fixed 53-byte cells, each of which can carry 48 bytes |
| payload. |
| .PP |
| .B ptm |
| .br |
| Compensates for PTM encoding, which is normally found on VDSL2 links and |
| uses a 64b/65b encoding scheme. It is even more efficient to simply |
| derate the specified shaper bandwidth by a factor of 64/65 or 0.984. See |
| ITU G.992.3 Annex N and IEEE 802.3 Section 61.3 for details. |
| .PP |
| .B noatm |
| .br |
| Disables ATM and PTM compensation. |
| |
| .SS Failsafe Overhead Keywords |
| These two keywords are provided for quick-and-dirty setup. Use them if you |
| can't be bothered to read the rest of this section. |
| .PP |
| .B raw |
| (default) |
| .br |
| Turns off all overhead compensation in CAKE. The packet size reported |
| by Linux will be used directly. |
| .PP |
| Other overhead keywords may be added after "raw". The effect of this is |
| to make the overhead compensation operate relative to the reported packet size, |
| not the underlying IP packet size. |
| .PP |
| .B conservative |
| .br |
| Compensates for more overhead than is likely to occur on any |
| widely-deployed link technology. |
| .br |
| Equivalent to |
| .B overhead 48 atm. |
| |
| .SS ADSL Overhead Keywords |
| Most ADSL modems have a way to check which framing scheme is in use. Often this |
| is also specified in the settings document provided by the ISP. The keywords in |
| this section are intended to correspond with these sources of information. All |
| of them implicitly set the |
| .B atm |
| flag. |
| .PP |
| .B pppoa-vcmux |
| .br |
| Equivalent to |
| .B overhead 10 atm |
| .PP |
| .B pppoa-llc |
| .br |
| Equivalent to |
| .B overhead 14 atm |
| .PP |
| .B pppoe-vcmux |
| .br |
| Equivalent to |
| .B overhead 32 atm |
| .PP |
| .B pppoe-llcsnap |
| .br |
| Equivalent to |
| .B overhead 40 atm |
| .PP |
| .B bridged-vcmux |
| .br |
| Equivalent to |
| .B overhead 24 atm |
| .PP |
| .B bridged-llcsnap |
| .br |
| Equivalent to |
| .B overhead 32 atm |
| .PP |
| .B ipoa-vcmux |
| .br |
| Equivalent to |
| .B overhead 8 atm |
| .PP |
| .B ipoa-llcsnap |
| .br |
| Equivalent to |
| .B overhead 16 atm |
| .PP |
| See also the Ethernet Correction Factors section below. |
| |
| .SS VDSL2 Overhead Keywords |
| ATM was dropped from VDSL2 in favour of PTM, which is a much more |
| straightforward framing scheme. Some ISPs retained PPPoE for compatibility with |
| their existing back-end systems. |
| .PP |
| .B pppoe-ptm |
| .br |
| Equivalent to |
| .B overhead 30 ptm |
| |
| .br |
| PPPoE: 2B PPP + 6B PPPoE + |
| .br |
| ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + |
| .br |
| PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) |
| .br |
| .PP |
| .B bridged-ptm |
| .br |
| Equivalent to |
| .B overhead 22 ptm |
| .br |
| ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + |
| .br |
| PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) |
| .br |
| .PP |
| See also the Ethernet Correction Factors section below. |
| |
| .SS DOCSIS Cable Overhead Keyword |
| DOCSIS is the universal standard for providing Internet service over cable-TV |
| infrastructure. |
| |
| In this case, the actual on-wire overhead is less important than the packet size |
| the head-end equipment uses for shaping and metering. This is specified to be |
| an Ethernet frame including the CRC (aka FCS). |
| .PP |
| .B docsis |
| .br |
| Equivalent to |
| .B overhead 18 mpu 64 noatm |
| |
| .SS Ethernet Overhead Keywords |
| .PP |
| .B ethernet |
| .br |
| Accounts for Ethernet's preamble, inter-frame gap, and Frame Check |
| Sequence. Use this keyword when the bottleneck being shaped for is an |
| actual Ethernet cable. |
| .br |
| Equivalent to |
| .B overhead 38 mpu 84 noatm |
| .PP |
| .B ether-vlan |
| .br |
| Adds 4 bytes to the overhead compensation, accounting for an IEEE 802.1Q |
| VLAN header appended to the Ethernet frame header. NB: Some ISPs use one or |
| even two of these within PPPoE; this keyword may be repeated as necessary to |
| express this. |
| |
| .SH ROUND TRIP TIME PARAMETERS |
| Active Queue Management (AQM) consists of embedding congestion signals in the |
| packet flow, which receivers use to instruct senders to slow down when the queue |
| is persistently occupied. CAKE uses ECN signalling when available, and packet |
| drops otherwise, according to a combination of the Codel and BLUE AQM algorithms |
| called COBALT. |
| |
| Very short latencies require a very rapid AQM response to adequately control |
| latency. However, such a rapid response tends to impair throughput when the |
| actual RTT is relatively long. CAKE allows specifying the RTT it assumes for |
| tuning various parameters. Actual RTTs within an order of magnitude of this |
| will generally work well for both throughput and latency management. |
| |
| At the 'lan' setting and below, the time constants are similar in magnitude to |
| the jitter in the Linux kernel itself, so congestion might be signalled |
| prematurely. The flows will then become sparse and total throughput reduced, |
| leaving little or no back-pressure for the fairness logic to work against. Use |
| the "metro" setting for local lans unless you have a custom kernel. |
| .PP |
| .B rtt |
| TIME |
| .br |
| Manually specify an RTT. |
| .PP |
| .B datacentre |
| .br |
| For extremely high-performance 10GigE+ networks only. Equivalent to |
| .B rtt 100us. |
| .PP |
| .B lan |
| .br |
| For pure Ethernet (not Wi-Fi) networks, at home or in the office. Don't |
| use this when shaping for an Internet access link. Equivalent to |
| .B rtt 1ms. |
| .PP |
| .B metro |
| .br |
| For traffic mostly within a single city. Equivalent to |
| .B rtt 10ms. |
| .PP |
| .B regional |
| .br |
| For traffic mostly within a European-sized country. Equivalent to |
| .B rtt 30ms. |
| .PP |
| .B internet |
| (default) |
| .br |
| This is suitable for most Internet traffic. Equivalent to |
| .B rtt 100ms. |
| .PP |
| .B oceanic |
| .br |
| For Internet traffic with generally above-average latency, such as that |
| suffered by Australasian residents. Equivalent to |
| .B rtt 300ms. |
| .PP |
| .B satellite |
| .br |
| For traffic via geostationary satellites. Equivalent to |
| .B rtt 1000ms. |
| .PP |
| .B interplanetary |
| .br |
| So named because Jupiter is about 1 light-hour from Earth. Use this to |
| (almost) completely disable AQM actions. Equivalent to |
| .B rtt 3600s. |
| |
| .SH FLOW ISOLATION PARAMETERS |
| With flow isolation enabled, CAKE places packets from different flows into |
| different queues, each of which carries its own AQM state. Packets from each |
| queue are then delivered fairly, according to a DRR++ algorithm which minimises |
| latency for "sparse" flows. CAKE uses a set-associative hashing algorithm to |
| minimise flow collisions. |
| |
| These keywords specify whether fairness based on source address, destination |
| address, individual flows, or any combination of those is desired. |
| .PP |
| .B flowblind |
| .br |
| Disables flow isolation; all traffic passes through a single queue for |
| each tin. |
| .PP |
| .B srchost |
| .br |
| Flows are defined only by source address. Could be useful on the egress |
| path of an ISP backhaul. |
| .PP |
| .B dsthost |
| .br |
| Flows are defined only by destination address. Could be useful on the |
| ingress path of an ISP backhaul. |
| .PP |
| .B hosts |
| .br |
| Flows are defined by source-destination host pairs. This is host |
| isolation, rather than flow isolation. |
| .PP |
| .B flows |
| .br |
| Flows are defined by the entire 5-tuple of source address, destination |
| address, transport protocol, source port and destination port. This is the type |
| of flow isolation performed by SFQ and fq_codel. |
| .PP |
| .B dual-srchost |
| .br |
| Flows are defined by the 5-tuple, and fairness is applied first over |
| source addresses, then over individual flows. Good for use on egress traffic |
| from a LAN to the internet, where it'll prevent any one LAN host from |
| monopolising the uplink, regardless of the number of flows they use. |
| .PP |
| .B dual-dsthost |
| .br |
| Flows are defined by the 5-tuple, and fairness is applied first over |
| destination addresses, then over individual flows. Good for use on ingress |
| traffic to a LAN from the internet, where it'll prevent any one LAN host from |
| monopolising the downlink, regardless of the number of flows they use. |
| .PP |
| .B triple-isolate |
| (default) |
| .br |
| Flows are defined by the 5-tuple, and fairness is applied over source |
| *and* destination addresses intelligently (ie. not merely by host-pairs), and |
| also over individual flows. Use this if you're not certain whether to use |
| dual-srchost or dual-dsthost; it'll do both jobs at once, preventing any one |
| host on *either* side of the link from monopolising it with a large number of |
| flows. |
| .PP |
| .B nat |
| .br |
| Instructs Cake to perform a NAT lookup before applying flow-isolation |
| rules, to determine the true addresses and port numbers of the packet, to |
| improve fairness between hosts "inside" the NAT. This has no practical effect |
| in "flowblind" or "flows" modes, or if NAT is performed on a different host. |
| .PP |
| .B nonat |
| (default) |
| .br |
| Cake will not perform a NAT lookup. Flow isolation will be performed |
| using the addresses and port numbers directly visible to the interface Cake is |
| attached to. |
| |
| .SH PRIORITY QUEUE PARAMETERS |
| CAKE can divide traffic into "tins" based on the Diffserv field. Each tin has |
| its own independent set of flow-isolation queues, and is serviced based on a WRR |
| algorithm. To avoid perverse Diffserv marking incentives, tin weights have a |
| "priority sharing" value when bandwidth used by that tin is below a threshold, |
| and a lower "bandwidth sharing" value when above. Bandwidth is compared against |
| the threshold using the same algorithm as the deficit-mode shaper. |
| |
| Detailed customisation of tin parameters is not provided. The following presets |
| perform all necessary tuning, relative to the current shaper bandwidth and RTT |
| settings. |
| .PP |
| .B besteffort |
| .br |
| Disables priority queuing by placing all traffic in one tin. |
| .PP |
| .B precedence |
| .br |
| Enables legacy interpretation of TOS "Precedence" field. Use of this |
| preset on the modern Internet is firmly discouraged. |
| .PP |
| .B diffserv4 |
| .br |
| Provides a general-purpose Diffserv implementation with four tins: |
| .br |
| Bulk (CS1), 6.25% threshold, generally low priority. |
| .br |
| Best Effort (general), 100% threshold. |
| .br |
| Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold. |
| .br |
| Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold. |
| .PP |
| .B diffserv3 |
| (default) |
| .br |
| Provides a simple, general-purpose Diffserv implementation with three tins: |
| .br |
| Bulk (CS1), 6.25% threshold, generally low priority. |
| .br |
| Best Effort (general), 100% threshold. |
| .br |
| Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel interval. |
| |
| .PP |
| .B fwmark |
| MASK |
| .br |
| This options turns on fwmark-based overriding of CAKE's tin selection. |
| If set, the option specifies a bitmask that will be applied to the fwmark |
| associated with each packet. If the result of this masking is non-zero, the |
| result will be right-shifted by the number of least-significant unset bits in |
| the mask value, and the result will be used as a the tin number for that packet. |
| This can be used to set policies in a firewall script that will override CAKE's |
| built-in tin selection. |
| |
| .SH OTHER PARAMETERS |
| .B memlimit |
| LIMIT |
| .br |
| Limit the memory consumed by Cake to LIMIT bytes. Note that this does |
| not translate directly to queue size (so do not size this based on bandwidth |
| delay product considerations, but rather on worst case acceptable memory |
| consumption), as there is some overhead in the data structures containing the |
| packets, especially for small packets. |
| |
| By default, the limit is calculated based on the bandwidth and RTT |
| settings. |
| |
| .PP |
| .B wash |
| |
| .br |
| Traffic entering your diffserv domain is frequently mis-marked in |
| transit from the perspective of your network, and traffic exiting yours may be |
| mis-marked from the perspective of the transiting provider. |
| |
| Apply the wash option to clear all extra diffserv (but not ECN bits), after |
| priority queuing has taken place. |
| |
| If you are shaping inbound, and cannot trust the diffserv markings (as is the |
| case for Comcast Cable, among others), it is best to use a single queue |
| "besteffort" mode with wash. |
| |
| .PP |
| .B split-gso |
| |
| .br |
| This option controls whether CAKE will split General Segmentation |
| Offload (GSO) super-packets into their on-the-wire components and |
| dequeue them individually. |
| |
| .br |
| Super-packets are created by the networking stack to improve efficiency. |
| However, because they are larger they take longer to dequeue, which |
| translates to higher latency for competing flows, especially at lower |
| bandwidths. CAKE defaults to splitting GSO packets to achieve the lowest |
| possible latency. At link speeds higher than 10 Gbps, setting the |
| no-split-gso parameter can increase the maximum achievable throughput by |
| retaining the full GSO packets. |
| |
| .SH OVERRIDING CLASSIFICATION WITH TC FILTERS |
| |
| CAKE supports overriding of its internal classification of packets through the |
| tc filter mechanism. Packets can be assigned to different priority tins by |
| setting the |
| .B priority |
| field on the skb, and the flow hashing can be overridden by setting the |
| .B classid |
| parameter. |
| |
| .PP |
| .B Tin override |
| |
| .br |
| To assign a priority tin, the major number of the priority field needs |
| to match the qdisc handle of the cake instance; if it does, the minor number |
| will be interpreted as the tin index. For example, to classify all ICMP packets |
| as 'bulk', the following filter can be used: |
| |
| .br |
| # tc qdisc replace dev eth0 handle 1: root cake diffserv3 |
| # tc filter add dev eth0 parent 1: protocol ip prio 1 \\ |
| u32 match icmp type 0 0 action skbedit priority 1:1 |
| |
| .PP |
| .B Flow hash override |
| |
| .br |
| To override flow hashing, the classid can be set. CAKE will interpret |
| the major number of the classid as the host hash used in host isolation mode, |
| and the minor number as the flow hash used for flow-based queueing. One or both |
| of those can be set, and will be used if the relevant flow isolation parameter |
| is set (i.e., the major number will be ignored if CAKE is not configured in |
| hosts mode, and the minor number will be ignored if CAKE is not configured in |
| flows mode). |
| |
| .br |
| This example will assign all ICMP packets to the first queue: |
| |
| .br |
| # tc qdisc replace dev eth0 handle 1: root cake |
| # tc filter add dev eth0 parent 1: protocol ip prio 1 \\ |
| u32 match icmp type 0 0 classid 0:1 |
| |
| .br |
| If only one of the host and flow overrides is set, CAKE will compute the other |
| hash from the packet as normal. Note, however, that the host isolation mode |
| works by assigning a host ID to the flow queue; so if overriding both host and |
| flow, the same flow cannot have more than one host assigned. In addition, it is |
| not possible to assign different source and destination host IDs through the |
| override mechanism; if a host ID is assigned, it will be used as both source and |
| destination host. |
| |
| |
| |
| .SH EXAMPLES |
| # tc qdisc delete root dev eth0 |
| .br |
| # tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet |
| .br |
| # tc -s qdisc show dev eth0 |
| .br |
| qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84 |
| Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) |
| backlog 0b 0p requeues 0 |
| memory used: 0b of 5000000b |
| capacity estimate: 100Mbit |
| min/max network layer size: 65535 / 0 |
| min/max overhead-adjusted size: 65535 / 0 |
| average network hdr offset: 0 |
| |
| Bulk Best Effort Voice |
| thresh 6250Kbit 100Mbit 25Mbit |
| target 5.0ms 5.0ms 5.0ms |
| interval 100.0ms 100.0ms 100.0ms |
| pk_delay 0us 0us 0us |
| av_delay 0us 0us 0us |
| sp_delay 0us 0us 0us |
| pkts 0 0 0 |
| bytes 0 0 0 |
| way_inds 0 0 0 |
| way_miss 0 0 0 |
| way_cols 0 0 0 |
| drops 0 0 0 |
| marks 0 0 0 |
| ack_drop 0 0 0 |
| sp_flows 0 0 0 |
| bk_flows 0 0 0 |
| un_flows 0 0 0 |
| max_len 0 0 0 |
| quantum 300 1514 762 |
| |
| After some use: |
| .br |
| # tc -s qdisc show dev eth0 |
| |
| qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84 |
| Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0) |
| backlog 33308b 22p requeues 0 |
| memory used: 292352b of 5000000b |
| capacity estimate: 100Mbit |
| min/max network layer size: 28 / 1500 |
| min/max overhead-adjusted size: 84 / 1538 |
| average network hdr offset: 14 |
| |
| Bulk Best Effort Voice |
| thresh 6250Kbit 100Mbit 25Mbit |
| target 5.0ms 5.0ms 5.0ms |
| interval 100.0ms 100.0ms 100.0ms |
| pk_delay 8.7ms 6.9ms 5.0ms |
| av_delay 4.9ms 5.3ms 3.8ms |
| sp_delay 727us 1.4ms 511us |
| pkts 2590 21271 8137 |
| bytes 3081804 30302659 11426206 |
| way_inds 0 46 0 |
| way_miss 3 17 4 |
| way_cols 0 0 0 |
| drops 20 15 10 |
| marks 0 0 0 |
| ack_drop 0 0 0 |
| sp_flows 2 4 1 |
| bk_flows 1 2 1 |
| un_flows 0 0 0 |
| max_len 1514 1514 1514 |
| quantum 300 1514 762 |
| |
| .SH SEE ALSO |
| .BR tc (8), |
| .BR tc-codel (8), |
| .BR tc-fq_codel (8), |
| .BR tc-htb (8) |
| |
| .SH AUTHORS |
| Cake's principal author is Jonathan Morton, with contributions from |
| Tony Ambardar, Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, |
| Sebastian Moeller, Ryan Mounce, Dean Scarff, Nils Andreas Svee, and Dave Täht. |
| |
| This manual page was written by Loganaden Velvindron. Please report corrections |
| to the Linux Networking mailing list <netdev@vger.kernel.org>. |