Wireshark & Packetdrill | TCP Three-Way Handshake: The Win Field, Continued

admin · 2024-02-09 00:54:58



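This post again uses a packetdrill TCP three-way handshake script to test where the Win field comes from. This time the script simulates the client side, whereas the earlier post "TCP Three-Way Handshake: The Win Field" simulated the server side.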


# cat tcp_3hs_007.pkt
// TCP basics: three-way handshake
0  socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <...>
+0.01 < S. 0:0(0) ack 1 win 10000 <mss 1000>
+0 > . 1:1(0) ack 1


Because > marks a packet that the protocol stack is expected to send, the kernel builds and sends the SYN packet automatically.

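+0 > S 0:0(0) <...>
// +0: time offset of this line relative to the previous line.
// >: a packet the protocol stack is expected to send.
// 0:0(0): start sequence number:end sequence number(payload length).
// <>: TCP options; ... leaves them to the stack, so whatever options it sends are accepted.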

Since the script simulates the client side and the SYN is built automatically, none of its fields need to be specified by hand.

1. Run the script (it exits once execution completes)
# packetdrill tcp_3hs_007.pkt
#

2. Capture the packets
# tcpdump -i any -nn port 8080
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
21:52:17.517087 tun0  Out IP > Flags [S], seq 2970850519, win 64240, options [mss 1460,sackOK,TS val 4210775127 ecr 0,nop,wscale 8], length 0
21:52:17.617197 ?     In  IP > Flags [S.], seq 0, ack 2970850520, win 10000, options [mss 1000], length 0
21:52:17.617210 ?     Out IP > Flags [.], ack 1, win 64240, length 0
21:52:17.617278 ?     Out IP > Flags [F.], seq 1, ack 1, win 64240, length 0
21:52:17.617286 ?     In  IP > Flags [R.], seq 1, ack 1, win 10000, length 0
^C
5 packets captured
7 packets received by filter
0 packets dropped by kernel
#

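The capture shows a Win value of 64240 in the SYN. To see how the kernel arrives at that value, recall how the SYN Win is determined, as described in "TCP Three-Way Handshake: The Win Field".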

The functions involved in building the client SYN Win are outlined below. tcp_connect_init initializes the TCP connection and, as part of that, calls tcp_select_initial_window to initialize the window.

static void tcp_connect_init(struct sock *sk)
{
...
  tcp_select_initial_window(sk, tcp_full_space(sk),
          tp->advmss - (tp->rx_opt.ts_recent_stamp ? tp->tcp_header_len - sizeof(struct tcphdr) : 0),
          &tp->rcv_wnd,
          &tp->window_clamp,
          sock_net(sk)->ipv4.sysctl_tcp_window_scaling,
          &rcv_wscale,
          rcv_wnd);
...

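Moving on to tcp_select_initial_window: __space comes from tcp_full_space(sk), which is normally half of the tcp_rmem default. space is then rounded down to a whole multiple of the MSS and finally capped at U16_MAX. With the default of 131072 and an MSS of 1460, that is 131072 / 2 = 65536, rounded down to 44 * 1460 = 64240, which is below 65535, so the window normally comes out as 64240.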

/* Determine a window scaling and initial window to offer.
 * Based on the assumption that the given amount of space
 * will be offered. Store the results in the tp structure.
 * NOTE: for smooth operation initial space offering should
 * be a multiple of mss if possible. We assume here that mss >= 1.
 * This MUST be enforced by all callers.
 */
void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
             __u32 *rcv_wnd, __u32 *window_clamp,
             int wscale_ok, __u8 *rcv_wscale,
             __u32 init_rcv_wnd)
{
  /* Make sure the space is not negative. */
  unsigned int space = (__space < 0 ? 0 : __space);

  /* If no clamp set the clamp to the max possible scaled window */
  /* If no clamp is set, use 65535 << 14 = 1073725440, the theoretical maximum a scaled TCP window can reach. */
  /* space is then limited to the smaller of space and *window_clamp. */
  if (*window_clamp == 0)
    (*window_clamp) = (U16_MAX << TCP_MAX_WSCALE);
  space = min(*window_clamp, space);

  /* Quantize space offering to a multiple of mss if possible. */
  /* Make space a whole multiple of mss. */
  if (space > mss)
    space = rounddown(space, mss);

  /* NOTE: offering an initial window larger than 32767
   * will break some buggy TCP stacks. If the admin tells us
   * it is likely we could be speaking with such a buggy stack
   * we will truncate our initial window offering to 32K-1
   * unless the remote has sent us a window scaling option,
   * which we interpret as a sign the remote TCP is not
   * misinterpreting the window field as a signed quantity.
   */
  /* Set the receive window rcv_wnd depending on whether ipv4.sysctl_tcp_workaround_signed_windows is enabled. */
  if (sock_net(sk)->ipv4.sysctl_tcp_workaround_signed_windows)
    (*rcv_wnd) = min(space, MAX_TCP_WINDOW);
  else
    (*rcv_wnd) = min_t(u32, space, U16_MAX);

  /* If init_rcv_wnd is given, limit rcv_wnd to min(rcv_wnd, init_rcv_wnd * mss). */
  if (init_rcv_wnd)
    *rcv_wnd = min(*rcv_wnd, init_rcv_wnd * mss);

  /* Compute the receive window scale rcv_wscale. */
  *rcv_wscale = 0;
  if (wscale_ok) {
    /* Set window scaling on max possible window */
    space = max_t(u32, space, sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);
    space = max_t(u32, space, sysctl_rmem_max);
    space = min_t(u32, space, *window_clamp);
    *rcv_wscale = clamp_t(int, ilog2(space) - 15,
              0, TCP_MAX_WSCALE);
  }
  /* Set the clamp no higher than max representable value */
  /* Limit window_clamp to the largest value representable with the computed scale factor rcv_wscale. */
  (*window_clamp) = min_t(__u32, U16_MAX << (*rcv_wscale), *window_clamp);
}
EXPORT_SYMBOL(tcp_select_initial_window);
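The rcv_wnd part of the listing above can be tried out in user space. The following is a minimal sketch, not kernel code: the function name sketch_initial_rcv_wnd and the SKETCH_ constants are made up for illustration, and the sysctl is passed in as a plain flag. It only models the clamp, the rounding to a multiple of the MSS, the 16-bit cap and the init_rcv_wnd limit.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_U16_MAX        65535u  /* largest value the 16-bit Win field can carry */
#define SKETCH_MAX_TCP_WINDOW 32767u  /* cap used when the signed-window workaround is on */
#define SKETCH_TCP_MAX_WSCALE 14      /* largest window scale shift */

/* Hypothetical user-space model of how tcp_select_initial_window()
 * derives rcv_wnd. All names here are illustrative assumptions. */
uint32_t sketch_initial_rcv_wnd(int full_space, uint32_t mss,
                                uint32_t window_clamp,
                                int workaround_signed_windows,
                                uint32_t init_rcv_wnd)
{
    uint32_t space, cap, rcv_wnd;

    assert(mss >= 1);  /* the kernel comment requires callers to guarantee this */

    /* Never work with a negative amount of space. */
    space = full_space < 0 ? 0u : (uint32_t)full_space;

    /* No clamp set: fall back to the largest scaled window. */
    if (window_clamp == 0)
        window_clamp = SKETCH_U16_MAX << SKETCH_TCP_MAX_WSCALE;
    if (space > window_clamp)
        space = window_clamp;

    /* Round down to a whole number of MSS-sized segments. */
    if (space > mss)
        space -= space % mss;

    /* Cap at 32767 or 65535 depending on the workaround flag. */
    cap = workaround_signed_windows ? SKETCH_MAX_TCP_WINDOW : SKETCH_U16_MAX;
    rcv_wnd = space < cap ? space : cap;

    /* An explicit initial window (in segments) can only shrink the result. */
    if (init_rcv_wnd && init_rcv_wnd * mss < rcv_wnd)
        rcv_wnd = init_rcv_wnd * mss;

    return rcv_wnd;
}
```

Calling sketch_initial_rcv_wnd(65536, 1460, 0, 0, 0) gives 64240, the Win value seen in the SYN of the capture, since 65536 rounds down to 44 * 1460.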


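The < S. line, i.e. the SYN/ACK, is an injected packet, so its fields, including ack, win and mss, have to be specified by hand. This was explained before and is not repeated here; win is mandatory, while mss may be omitted.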

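Next, recall how the Win in the SYN/ACK is determined, again from "TCP Three-Way Handshake: The Win Field". On the server side, building the SYN/ACK Win goes through tcp_v4_conn_request -> tcp_conn_request -> tcp_openreq_init_rwin. The tcp_openreq_init_rwin function is shown below; its main job is to pick the arguments that tcp_select_initial_window needs and then call it to initialize the receive-window state.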

void tcp_openreq_init_rwin(struct request_sock *req,
         const struct sock *sk_listener,
         const struct dst_entry *dst)
{
  struct inet_request_sock *ireq = inet_rsk(req);
  const struct tcp_sock *tp = tcp_sk(sk_listener);
  /* Call tcp_full_space to get the total receive buffer space of the listening socket and store it in full_space. */
  int full_space = tcp_full_space(sk_listener);
  u32 window_clamp;
  __u8 rcv_wscale;
  u32 rcv_wnd;
  int mss;

  /* Compute mss from the advertised mss of the destination route and the limit of the listening socket. */
  mss = tcp_mss_clamp(tp, dst_metric_advmss(dst));
  /* Read the window_clamp value of the listening socket. */
  window_clamp = READ_ONCE(tp->window_clamp);
  /* Set this up on the first call only */
  /* Use window_clamp if it is set, otherwise the window metric of the destination route, as the window limit of the request socket. */
  req->rsk_window_clamp = window_clamp ? : dst_metric(dst, RTAX_WINDOW);

  /* limit the window selection if the user enforce a smaller rx buffer */
  /* If the user has locked a smaller receive buffer size, window selection must stay within that buffer size. */
  if (sk_listener->sk_userlocks & SOCK_RCVBUF_LOCK &&
      (req->rsk_window_clamp > full_space || req->rsk_window_clamp == 0))
    req->rsk_window_clamp = full_space;

  /* Window settings provided through bpf. */
  rcv_wnd = tcp_rwnd_init_bpf((struct sock *)req);
  if (rcv_wnd == 0)
    rcv_wnd = dst_metric(dst, RTAX_INITRWND);
  else if (full_space < rcv_wnd * mss)
    full_space = rcv_wnd * mss;

  /* tcp_full_space because it is guaranteed to be the first packet */
  tcp_select_initial_window(sk_listener, full_space,
    mss - (ireq->tstamp_ok ? TCPOLEN_TSTAMP_ALIGNED : 0),
    &req->rsk_rcv_wnd,
    &req->rsk_window_clamp,
    ireq->wscale_ok,
    &rcv_wscale,
    rcv_wnd);
  ireq->rcv_wscale = rcv_wscale;
}
EXPORT_SYMBOL(tcp_openreq_init_rwin);
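The parameter selection in tcp_openreq_init_rwin boils down to two small decisions: which window clamp to use, and whether a BPF-supplied initial window enlarges full_space. Here is a rough user-space sketch of just those two decisions. The function names and the flattened parameters (listener_clamp, route_window, rcvbuf_locked, bpf_rcv_wnd) are hypothetical stand-ins for the kernel fields, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of the rsk_window_clamp selection:
 * prefer the listener's clamp, else the route's window metric,
 * and fall back to full_space when the user locked the receive
 * buffer and the clamp is unset or larger than that buffer. */
uint32_t sketch_req_window_clamp(uint32_t listener_clamp,
                                 uint32_t route_window,
                                 int rcvbuf_locked,
                                 int full_space)
{
    uint32_t clamp = listener_clamp ? listener_clamp : route_window;

    if (rcvbuf_locked && (clamp > (uint32_t)full_space || clamp == 0))
        clamp = (uint32_t)full_space;
    return clamp;
}

/* Hypothetical model of the full_space adjustment: a non-zero
 * initial window from BPF (in segments) raises full_space to
 * bpf_rcv_wnd * mss when that product is larger. */
int sketch_req_full_space(int full_space, uint32_t bpf_rcv_wnd, int mss)
{
    if (bpf_rcv_wnd != 0 && full_space < (int)bpf_rcv_wnd * mss)
        full_space = (int)bpf_rcv_wnd * mss;
    return full_space;
}
```

For example, with no clamp on the listener or the route and an unlocked receive buffer, sketch_req_window_clamp returns 0, and it is then tcp_select_initial_window that substitutes the maximum scaled window.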


To test the Win value in the SYN, first change the tcp_rmem default to 65536, which makes full_space half of that, i.e. 32768.

tcp_rmem default value: 131072
# sysctl -a | grep tcp_rmem
net.ipv4.tcp_rmem = 4096        131072  6291456
#

tcp_rmem modified value: 65536
# sysctl -q net.ipv4.tcp_rmem="4096 65536 6291456"
# sysctl -a | grep tcp_rmem
net.ipv4.tcp_rmem = 4096        65536   6291456
#
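Why full_space is half of the tcp_rmem default is not shown in the article. As a hedged sketch: in kernels that still have the net.ipv4.tcp_adv_win_scale sysctl, tcp_full_space(sk) is derived from the socket's receive buffer by a helper along the following lines, and with the usual setting of 1 the result is half the buffer. The function name below is made up, and newer kernels may compute this differently.

```c
/* Hypothetical user-space model of the buffer-to-window conversion
 * behind tcp_full_space(), assuming the tcp_adv_win_scale based
 * formula of older kernels. */
int sketch_win_from_space(int space, int adv_win_scale)
{
    /* scale <= 0: keep space / 2^(-scale) for the window;
     * scale  > 0: reserve space / 2^scale as overhead. */
    return adv_win_scale <= 0 ? space >> (-adv_win_scale)
                              : space - (space >> adv_win_scale);
}
```

With adv_win_scale set to 1, a receive buffer of 131072 yields 65536 and a buffer of 65536 yields 32768, matching the full_space values used in this test.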

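Running the script with packetdrill again, the tcpdump capture shows win 32120 in the SYN: space is rounded down to a whole multiple of the MSS, 22 * 1460 = 32120, and after the comparison with U16_MAX it is still 32120.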

# packetdrill tcp_3hs_007.pkt
#
# tcpdump -i any -nn port 8080
20:27:51.528905 tun0  Out IP > Flags [S], seq 1698096268, win 32120, options [mss 1460,sackOK,TS val 776046799 ecr 0,nop,wscale 7], length 0
20:27:51.629058 ?     In  IP > Flags [S.], seq 0, ack 1698096269, win 10000, options [mss 1000], length 0
20:27:51.629086 ?     Out IP > Flags [.], ack 1, win 32120, length 0
20:27:51.629194 ?     Out IP > Flags [F.], seq 1, ack 1, win 32120, length 0
20:27:51.629208 ?     In  IP > Flags [R.], seq 1, ack 1, win 10000, length 0
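The SYN in this capture also carries wscale 7. In the tcp_select_initial_window listing, the scale is not derived from the window just computed but from the largest buffer the socket could grow to. The sketch below models only that branch. The function names are illustrative, and the value 212992 used for net.core.rmem_max in the usage note is an assumed typical default, not something taken from the article.

```c
#include <stdint.h>

#define SKETCH_TCP_MAX_WSCALE 14

/* Integer log2: position of the highest set bit, -1 for 0. */
static int sketch_ilog2(uint32_t v)
{
    int n = -1;

    while (v) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical model of the wscale_ok branch: grow space to the
 * larger of tcp_rmem[2] and rmem_max, bound it by the clamp, then
 * pick the shift that lets a 16-bit field cover that space. */
int sketch_rcv_wscale(uint32_t space, uint32_t tcp_rmem_max,
                      uint32_t rmem_max, uint32_t window_clamp)
{
    int shift;

    if (space < tcp_rmem_max)
        space = tcp_rmem_max;
    if (space < rmem_max)
        space = rmem_max;
    if (space > window_clamp)
        space = window_clamp;

    shift = sketch_ilog2(space) - 15;
    if (shift < 0)
        shift = 0;
    if (shift > SKETCH_TCP_MAX_WSCALE)
        shift = SKETCH_TCP_MAX_WSCALE;
    return shift;
}
```

With tcp_rmem[2] = 6291456 as in the sysctl output, an assumed rmem_max of 212992 and the default clamp of 65535 << 14, sketch_rcv_wscale(32120, 6291456, 212992, 65535u << 14) returns 7, because 6291456 lies between 2^22 and 2^23 and 22 - 15 = 7.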


Continuing with the SYN Win test: first restore the tcp_rmem default to 131072, then change init_rcv_wnd to influence rcv_wnd, which takes the smaller of its current value and init_rcv_wnd * mss.

With initrwnd set to 8 in the packetdrill .pkt file, the tcpdump capture taken after running the script shows win 11680 in the SYN: init_rcv_wnd * mss is 8 * 1460 = 11680, so rcv_wnd becomes 11680.

# cat tcp_3hs_win_005.pkt
`ip route change dev tun0 initrwnd 8`
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <...>
+.1 < S. 0:0(0) ack 1 win 10000 <mss 1000>
+0 > . 1:1(0) ack 1
#
# packetdrill tcp_3hs_win_005.pkt
#
# tcpdump -i any -nn port 8080
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
20:58:00.716238 tun0  Out IP > Flags [S], seq 4007123868, win 11680, options [mss 1460,sackOK,TS val 4260117824 ecr 0,nop,wscale 7], length 0
20:58:00.816389 ?     In  IP > Flags [S.], seq 0, ack 4007123869, win 10000, options [mss 1000], length 0
20:58:00.816417 ?     Out IP > Flags [.], ack 1, win 11680, length 0
20:58:00.816516 ?     Out IP > Flags [F.], seq 1, ack 1, win 11680, length 0
20:58:00.816529 ?     In  IP > Flags [R.], seq 1, ack 1, win 10000, length 0

Originally published on the WeChat official account Echo Reply: Wireshark & Packetdrill | TCP Three-Way Handshake: The Win Field, Continued

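  • Published on 2024-02-09 00:54:58
  • Please keep the link to this article when reposting (CN-SEC中文网: thanks to the original author for the hard work):
                   Wireshark & Packetdrill | TCP Three-Way Handshake: The Win Field, Continued https://cn-sec.com/archives/2416848.html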