  1. May 23, 2024
    • nfc: nci: Fix handling of zero-length payload packets in nci_rx_work() · 6671e352
      Ryosuke Yasuoka authored
      
      When nci_rx_work() receives a zero-length payload packet, it should not
      discard the packet and exit the loop. Instead, it should continue
      processing subsequent packets.
      
      Fixes: d24b0353 ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet")
      Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20240521153444.535399-1-ryasuoka@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: relax socket state check at accept time. · 26afda78
      Paolo Abeni authored
      
      Christoph reported the following splat:
      
      WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
      Modules linked in:
      CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759
      Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80
      RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293
      RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64
      R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000
      R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800
      FS:  000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786
       do_accept+0x435/0x620 net/socket.c:1929
       __sys_accept4_file net/socket.c:1969 [inline]
       __sys_accept4+0x9b/0x110 net/socket.c:1999
       __do_sys_accept net/socket.c:2016 [inline]
       __se_sys_accept net/socket.c:2013 [inline]
       __x64_sys_accept+0x7d/0x90 net/socket.c:2013
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x4315f9
      Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b
      RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
      RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300
      R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055
       </TASK>
      
      The reproducer invokes shutdown() before entering the listener status.
      After commit 94062790 ("tcp: defer shutdown(SEND_SHUTDOWN) for
      TCP_SYN_RECV sockets"), the above causes the child to reach the accept
      syscall in FIN_WAIT1 status.
      
      Eric noted we can relax the existing assertion in __inet_accept()
      
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Fixes: 94062790 ("tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/23ab880a44d8cfd967e84de8b93dbf48848e3d8c.1716299669.git.pabeni@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tcp: remove 64 KByte limit for initial tp->rcv_wnd value · 378979e9
      Jason Xing authored
      
      Recently, we upgraded some servers to the latest kernel and noticed
      that user-side metrics were worse than before. The cause is the limit
      on tp->rcv_wnd.
      
      In 2018, commit a337531b ("tcp: up initial rmem to 128KB and SYN rwin
      to around 64KB") limited the initial value of tp->rcv_wnd to 65535. Most
      CDN teams do not benefit from that change: they cannot have a large
      window to receive big packets, which slows transfers down, especially
      over long RTTs. A small rcv_wnd means a slower transfer, to some extent,
      and that is a side effect that hurts latency-sensitive users.
      
      To avoid future confusion: this change doesn't affect the initial
      receive window on the wire in a SYN or SYN+ACK packet, which stays within
      65535 bytes as required by RFC 7323, because of the limit in
      __tcp_transmit_skb():
      
          th->window      = htons(min(tp->rcv_wnd, 65535U));
      
      In short, __tcp_transmit_skb() already ensures that the constraint is
      respected, no matter how large tp->rcv_wnd is. The change doesn't violate
      the RFC.
      
      Here is an example with and without the patch:
      Before:
      client   --- SYN: rwindow=65535 ---> server
      client   <--- SYN+ACK: rwindow=65535 ----  server
      client   --- ACK: rwindow=65536 ---> server
      Note: for the last ACK, the calculation is 512 << 7.
      
      After:
      client   --- SYN: rwindow=65535 ---> server
      client   <--- SYN+ACK: rwindow=65535 ----  server
      client   --- ACK: rwindow=175232 ---> server
      Note: I use the following command to make it work:
      ip route change default via [ip] dev eth0 metric 100 initrwnd 120
      For the last ACK, the calculation is 1369 << 7.
      
      With this patch applied, a user who tweaks this knob gets a larger
      rcv_wnd, which helps transfer data more rapidly and saves some RTTs.
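
      The window arithmetic in the example above can be checked with a small
      Python sketch. This is a toy model, not kernel code: the helper names are
      made up for illustration, and the window scale of 7 is taken from the
      "<< 7" in the notes above.

```python
# Toy model of the window arithmetic in this commit message; not kernel code.
MAX_UNSCALED_WINDOW = 65535  # largest value the 16-bit TCP window field can carry


def syn_window_field(rcv_wnd: int) -> int:
    """Mirror min(tp->rcv_wnd, 65535U): the value placed in a SYN or SYN+ACK."""
    return min(rcv_wnd, MAX_UNSCALED_WINDOW)


def scaled_window(window_field: int, wscale: int) -> int:
    """Effective window advertised by a later ACK: field value << window scale."""
    return window_field << wscale


print(syn_window_field(175232))  # clamped on the wire for SYN / SYN+ACK
print(scaled_window(512, 7))     # last ACK before the patch
print(scaled_window(1369, 7))    # last ACK after the patch, with initrwnd 120
```

      The clamp applies only to the unscaled field of the handshake packets, which
      is why a larger tp->rcv_wnd can show up in the first scaled ACK.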
      
      Fixes: a337531b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
      Signed-off-by: Jason Xing <kernelxing@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Link: https://lore.kernel.org/r/20240521134220.12510-1-kerneljasonxing@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tls: fix missing memory barrier in tls_init · 91e61dd7
      Dae R. Jeong authored
      
      In tls_init(), a write memory barrier is missing, and store-store
      reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.
      
      CPU0                               CPU1
      -----                              -----
      // In tls_init()
      // In tls_ctx_create()
      ctx = kzalloc()
      ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)
      
      // In update_sk_prot()
      WRITE_ONCE(sk->sk_prot, tls_prots)     -(2)
      
                                         // In sock_common_setsockopt()
                                         READ_ONCE(sk->sk_prot)->setsockopt()
      
                                         // In tls_{setsockopt,getsockopt}()
                                         ctx->sk_proto->setsockopt()    -(3)
      
      In the above scenario, when (1) and (2) are reordered, (3) can observe
      the NULL value of ctx->sk_proto, causing NULL dereference.
      
      To fix it, we rely on rcu_assign_pointer(), which implies release
      barrier semantics. By moving rcu_assign_pointer() after ctx->sk_proto is
      initialized, we ensure that ctx->sk_proto is visible when
      sk->sk_prot changes.
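
      The race can be illustrated with a toy Python model that enumerates where
      a reader may run relative to the writer's two stores. It only models the
      visibility order of stores (1) and (2) from the diagram; the step names
      and the function are hypothetical and nothing here exercises real memory
      ordering.

```python
def reader_outcomes(store_order):
    """Return what a reader observes at each point of a given store order.

    store_order is a sequence of "init_ctx" (store 1) and "publish" (store 2).
    The reader may run before, between, or after the writer's stores.
    """
    outcomes = []
    for reader_at in range(len(store_order) + 1):
        ctx_sk_proto = None   # kzalloc() zeroes the context
        published = False     # sk->sk_prot still points at the old proto
        for step in store_order[:reader_at]:
            if step == "init_ctx":
                ctx_sk_proto = "base_proto"
            elif step == "publish":
                published = True
        if published:
            # Reader reached tls_setsockopt() and dereferences ctx->sk_proto (3).
            outcomes.append("ok" if ctx_sk_proto is not None else "NULL deref")
        else:
            outcomes.append("old proto used")
    return outcomes


program_order = ("init_ctx", "publish")   # what a release store guarantees
reordered = ("publish", "init_ctx")       # store-store reordering without a barrier

print(reader_outcomes(program_order))
print(reader_outcomes(reordered))
```

      In this model only the reordered sequence has a window in which the new
      sk_prot is visible while ctx->sk_proto is still NULL.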
      
      Fixes: d5bee737 ("net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE")
      Signed-off-by: Yewon Choi <woni9911@gmail.com>
      Signed-off-by: Dae R. Jeong <threeearcat@gmail.com>
      Link: https://lore.kernel.org/netdev/ZU4OJG56g2V9z_H7@dragonet/T/
      Link: https://lore.kernel.org/r/Zkx4vjSFp0mfpjQ2@libra05
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tracing/treewide: Remove second parameter of __assign_str() · 2c92ca84
      Steven Rostedt (Google) authored
      With the rework of how __string() handles dynamic strings, where it
      saves off the source string in a field in the helper structure[1], the
      value to assign to the trace event field is already stored in the helper
      structure and does not need to be passed in again.
      
      This means that with:
      
        __string(field, mystring)
      
      which used to be assigned with __assign_str(field, mystring), the second
      parameter is no longer needed and goes unused. With this, __assign_str()
      will now only get a single parameter.
      
      There are over 700 users of __assign_str(), and because coccinelle does
      not handle the TRACE_EVENT() macro, I ended up using the following sed script:
      
        git grep -l __assign_str | while read a ; do
            sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
            mv /tmp/test-file $a;
        done
      
      I then searched for __assign_str() that did not end with ';' as those
      were multi line assignments that the sed script above would fail to catch.
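
      For readers who want to try the substitution on sample lines, here is a
      rough Python translation of the sed expression above. The function name
      is made up for illustration, and the second call shows the multi-line
      case that still needs manual attention.

```python
import re

# Python translation of: s/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/
ASSIGN_STR = re.compile(r"(__assign_str\([^,]*[^ ,]) *,[^;]*")


def drop_second_arg(line: str) -> str:
    """Rewrite the first __assign_str(field, src) on a line to __assign_str(field)."""
    return ASSIGN_STR.sub(r"\1)", line, count=1)


# Single-line use: the second argument is dropped, the ';' is kept.
print(drop_second_arg("__assign_str(name, dev->name);"))

# First line of a multi-line use: the call is closed here, but the
# continuation line holding the old second argument is left behind.
print(drop_second_arg("__assign_str(name,"))
```

      Like the sed command, this works one line at a time, which is why calls
      split across lines have to be found and fixed separately.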
      
      Note, the same updates will need to be done for:
      
        __assign_str_len()
        __assign_rel_str()
        __assign_rel_str_len()
      
      I tested this with both an allmodconfig and an allyesconfig (build only for both).
      
      [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
      
      Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
      
      
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Julia Lawall <Julia.Lawall@inria.fr>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Acked-by: Jani Nikula <jani.nikula@intel.com>
      Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
      Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
      Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
      Acked-by: Takashi Iwai <tiwai@suse.de>
      Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
      Tested-by: Guenter Roeck <linux@roeck-us.net>
  2. May 22, 2024
  3. May 21, 2024
    • openvswitch: Set the skbuff pkt_type for proper pmtud support. · 30a92c9e
      Aaron Conole authored
      
      Open vSwitch is originally intended to switch at layer 2, only dealing with
      Ethernet frames.  With the introduction of l3 tunnels support, it crossed
      into the realm of needing to care a bit about some routing details when
      making forwarding decisions.  If an oversized packet would need to be
      fragmented during this forwarding decision, there is a chance for pmtu
      to get involved and generate a routing exception.  This is gated by the
      skbuff->pkt_type field.
      
      When a flow is already loaded into the openvswitch module this field is
      set up and transitioned properly as a packet moves from one port to
      another.  In the case that a packet execute is invoked after a flow is
      newly installed this field is not properly initialized.  This causes the
      pmtud mechanism to omit sending the required exception messages across
      the tunnel boundary and a second attempt needs to be made to make sure
      that the routing exception is properly setup.  To fix this, we set the
      outgoing packet's pkt_type to PACKET_OUTGOING, since it can only get
      to the openvswitch module via a port device or packet command.
      
      Even for bridge ports as users, the pkt_type needs to be reset when
      doing the transmit as the packet is truly outgoing and routing needs
      to get involved post packet transformations, in the case of
      VXLAN/GENEVE/udp-tunnel packets.  In general, the pkt_type on output
      gets ignored, since we go straight to the driver, but in the case of
      tunnel ports they go through IP routing layer.
      
      This issue is periodically encountered in complex setups, such as large
      openshift deployments, where multiple sets of tunnel traversal occur.
      A way to recreate this is with the ovn-heater project, which can set up
      a networking environment that mimics such large deployments.  We need
      larger environments for this because we need to ensure that flow
      misses occur.  In these environments, without this patch, we can see:
      
        ./ovn_cluster.sh start
        podman exec ovn-chassis-1 ip r a 170.168.0.5/32 dev eth1 mtu 1200
        podman exec ovn-chassis-1 ip netns exec sw01p1 ip r flush cache
        podman exec ovn-chassis-1 ip netns exec sw01p1 \
               ping 21.0.0.3 -M do -s 1300 -c2
        PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
        From 21.0.0.3 icmp_seq=2 Frag needed and DF set (mtu = 1142)
      
        --- 21.0.0.3 ping statistics ---
        ...
      
      Using tcpdump, we can also see the expected ICMP FRAG_NEEDED message is not
      sent into the server.
      
      With this patch, setting the pkt_type, we see the following:
      
        podman exec ovn-chassis-1 ip netns exec sw01p1 \
               ping 21.0.0.3 -M do -s 1300 -c2
        PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
        From 21.0.0.3 icmp_seq=1 Frag needed and DF set (mtu = 1222)
        ping: local error: message too long, mtu=1222
      
        --- 21.0.0.3 ping statistics ---
        ...
      
      In this case, the first ping request receives the FRAG_NEEDED message and
      a local routing exception is created.
      
      Tested-by: Jaime Caamano <jcaamano@redhat.com>
      Reported-at: https://issues.redhat.com/browse/FDP-164
      Fixes: 58264848 ("openvswitch: Add vxlan tunneling support.")
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Acked-by: Eelco Chaudron <echaudro@redhat.com>
      Link: https://lore.kernel.org/r/20240516200941.16152-1-aconole@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Fix garbage collection of embryos carrying OOB with SCM_RIGHTS · 041933a1
      Michal Luczaj authored
      
      GC attempts to explicitly drop oob_skb's reference before purging the hit
      list.
      
      The problem is with embryos: kfree_skb(u->oob_skb) is never called on an
      embryo socket.
      
      The python script below [0] sends a listener's fd to its embryo as OOB
      data.  While GC does collect the embryo's queue, it fails to drop the OOB
      skb's refcount.  The skb which was in embryo's receive queue stays as
      unix_sk(sk)->oob_skb and keeps the listener's refcount [1].
      
      Tell GC to dispose embryo's oob_skb.
      
      [0]:
      from array import array
      from socket import *
      
      addr = '\x00unix-oob'
      lis = socket(AF_UNIX, SOCK_STREAM)
      lis.bind(addr)
      lis.listen(1)
      
      s = socket(AF_UNIX, SOCK_STREAM)
      s.connect(addr)
      scm = (SOL_SOCKET, SCM_RIGHTS, array('i', [lis.fileno()]))
      s.sendmsg([b'x'], [scm], MSG_OOB)
      lis.close()
      
      [1]
      $ grep unix-oob /proc/net/unix
      $ ./unix-oob.py
      $ grep unix-oob /proc/net/unix
      0000000000000000: 00000002 00000000 00000000 0001 02     0 @unix-oob
      0000000000000000: 00000002 00000000 00010000 0001 01  6072 @unix-oob
      
      Fixes: 4090fa37 ("af_unix: Replace garbage collection algorithm.")
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tcp: Fix shift-out-of-bounds in dctcp_update_alpha(). · 3ebc46ca
      Kuniyuki Iwashima authored
      
      In dctcp_update_alpha(), we use a module parameter dctcp_shift_g
      as follows:
      
        alpha -= min_not_zero(alpha, alpha >> dctcp_shift_g);
        ...
        delivered_ce <<= (10 - dctcp_shift_g);
      
      It seems syzkaller started fuzzing module parameters and triggered
      shift-out-of-bounds [0] by setting dctcp_shift_g to 100:
      
        memcpy((void*)0x20000080,
               "/sys/module/tcp_dctcp/parameters/dctcp_shift_g\000", 47);
        res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x20000080ul,
                      /*flags=*/2ul, /*mode=*/0ul);
        memcpy((void*)0x20000000, "100\000", 4);
        syscall(__NR_write, /*fd=*/r[0], /*val=*/0x20000000ul, /*len=*/4ul);
      
      Let's limit the max value of dctcp_shift_g by param_set_uint_minmax().
      
      With this patch:
      
        # echo 10 > /sys/module/tcp_dctcp/parameters/dctcp_shift_g
        # cat /sys/module/tcp_dctcp/parameters/dctcp_shift_g
        10
        # echo 11 > /sys/module/tcp_dctcp/parameters/dctcp_shift_g
        -bash: echo: write error: Invalid argument
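
      The effect of the range check can be sketched in Python. This is a model
      of the behaviour shown in the transcript above, not the kernel's
      param_set_uint_minmax(); the function names are hypothetical, the upper
      bound of 10 comes from the transcript, and the lower bound of 0 is an
      assumption.

```python
DCTCP_SHIFT_G_MIN = 0    # assumed lower bound; the source only shows the upper one
DCTCP_SHIFT_G_MAX = 10   # 10 is accepted and 11 is rejected in the transcript


def set_dctcp_shift_g(raw: str) -> int:
    """Parse and range-check a value, as a min/max parameter setter would."""
    value = int(raw.strip(), 10)
    if not DCTCP_SHIFT_G_MIN <= value <= DCTCP_SHIFT_G_MAX:
        raise ValueError("Invalid argument")  # stands in for -EINVAL
    return value


def scale_delivered_ce(delivered_ce: int, shift_g: int) -> int:
    """Second shift from the commit message: delivered_ce <<= (10 - shift_g)."""
    return delivered_ce << (10 - shift_g)


shift_g = set_dctcp_shift_g("10\n")
print(shift_g, scale_delivered_ce(3, shift_g))
```

      With the value limited to this range, the exponent 10 - dctcp_shift_g
      can no longer go negative, and the right shift stays below the 32-bit
      width.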
      
      [0]:
      UBSAN: shift-out-of-bounds in net/ipv4/tcp_dctcp.c:143:12
      shift exponent 100 is too large for 32-bit type 'u32' (aka 'unsigned int')
      CPU: 0 PID: 8083 Comm: syz-executor345 Not tainted 6.9.0-05151-g1b294a1f3561 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x201/0x300 lib/dump_stack.c:114
       ubsan_epilogue lib/ubsan.c:231 [inline]
       __ubsan_handle_shift_out_of_bounds+0x346/0x3a0 lib/ubsan.c:468
       dctcp_update_alpha+0x540/0x570 net/ipv4/tcp_dctcp.c:143
       tcp_in_ack_event net/ipv4/tcp_input.c:3802 [inline]
       tcp_ack+0x17b1/0x3bc0 net/ipv4/tcp_input.c:3948
       tcp_rcv_state_process+0x57a/0x2290 net/ipv4/tcp_input.c:6711
       tcp_v4_do_rcv+0x764/0xc40 net/ipv4/tcp_ipv4.c:1937
       sk_backlog_rcv include/net/sock.h:1106 [inline]
       __release_sock+0x20f/0x350 net/core/sock.c:2983
       release_sock+0x61/0x1f0 net/core/sock.c:3549
       mptcp_subflow_shutdown+0x3d0/0x620 net/mptcp/protocol.c:2907
       mptcp_check_send_data_fin+0x225/0x410 net/mptcp/protocol.c:2976
       __mptcp_close+0x238/0xad0 net/mptcp/protocol.c:3072
       mptcp_close+0x2a/0x1a0 net/mptcp/protocol.c:3127
       inet_release+0x190/0x1f0 net/ipv4/af_inet.c:437
       __sock_release net/socket.c:659 [inline]
       sock_close+0xc0/0x240 net/socket.c:1421
       __fput+0x41b/0x890 fs/file_table.c:422
       task_work_run+0x23b/0x300 kernel/task_work.c:180
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0x9c8/0x2540 kernel/exit.c:878
       do_group_exit+0x201/0x2b0 kernel/exit.c:1027
       __do_sys_exit_group kernel/exit.c:1038 [inline]
       __se_sys_exit_group kernel/exit.c:1036 [inline]
       __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1036
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xe4/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x67/0x6f
      RIP: 0033:0x7f6c2b5005b6
      Code: Unable to access opcode bytes at 0x7f6c2b50058c.
      RSP: 002b:00007ffe883eb948 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
      RAX: ffffffffffffffda RBX: 00007f6c2b5862f0 RCX: 00007f6c2b5005b6
      RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
      RBP: 0000000000000001 R08: 00000000000000e7 R09: ffffffffffffffc0
      R10: 0000000000000006 R11: 0000000000000246 R12: 00007f6c2b5862f0
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
       </TASK>
      
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Reported-by: Yue Sun <samsun1006219@gmail.com>
      Reported-by: xingwei lee <xrivendell7@gmail.com>
      Closes: https://lore.kernel.org/netdev/CAEkJfYNJM=cw-8x7_Vmj1J6uYVCWMbbvD=EFmDPVBGpTsqOxEA@mail.gmail.com/
      Fixes: e3118e83 ("net: tcp: add DCTCP congestion control algorithm")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240517091626.32772-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv6: sr: fix memleak in seg6_hmac_init_algo · efb9f4f1
      Hangbin Liu authored
      
      seg6_hmac_init_algo returns without cleaning up the previous allocations
      if one fails, so it's going to leak all that memory and the crypto tfms.
      
      Update seg6_hmac_exit to only free the memory when allocated, so we can
      reuse the code directly.
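
      The cleanup pattern described here, an exit routine that frees only what
      was allocated so the init path can call it on failure, might look like the
      following Python sketch. Every name in it (Algo, hmac_init, hmac_exit,
      flaky_allocate) is hypothetical and stands in for the kernel structures
      only loosely.

```python
class Algo:
    """Hypothetical stand-in for one HMAC algorithm slot."""

    def __init__(self, name):
        self.name = name
        self.tfm = None  # None means "not allocated"


def hmac_exit(algos, released):
    """Free only what was actually allocated, so it is safe after a partial init."""
    for algo in algos:
        if algo.tfm is not None:
            released.append(algo.tfm)
            algo.tfm = None


def hmac_init(algos, allocate, released):
    """Allocate every slot; on failure, reuse hmac_exit() to undo earlier work."""
    for algo in algos:
        try:
            algo.tfm = allocate(algo.name)
        except MemoryError:
            hmac_exit(algos, released)
            raise


def flaky_allocate(name):
    """Test allocator that fails on the second algorithm."""
    if name == "second":
        raise MemoryError(name)
    return "tfm-" + name


algos = [Algo("first"), Algo("second")]
released = []
try:
    hmac_init(algos, flaky_allocate, released)
except MemoryError:
    pass
print(released)
```

      Because the exit routine skips slots that were never filled, the same
      code serves both the normal teardown and the error path.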
      
      Fixes: bf355b8d ("ipv6: sr: add core files for SR HMAC support")
      Reported-by: Sabrina Dubroca <sd@queasysnail.net>
      Closes: https://lore.kernel.org/netdev/Zj3bh-gE7eT6V6aH@hog/
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Link: https://lore.kernel.org/r/20240517005435.2600277-1-liuhangbin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Update unix_sk(sk)->oob_skb under sk_receive_queue lock. · 9841991a
      Kuniyuki Iwashima authored
      Billy Jheng Bing-Jhong reported a race between __unix_gc() and
      queue_oob().
      
      __unix_gc() tries to garbage-collect close()d inflight sockets,
      and if the socket has MSG_OOB in unix_sk(sk)->oob_skb, GC
      will drop the reference and set it to NULL locklessly.
      
      However, the peer socket can still send a MSG_OOB message, and
      queue_oob() can update unix_sk(sk)->oob_skb concurrently, leading
      to a NULL pointer dereference. [0]
      
      To fix the issue, let's update unix_sk(sk)->oob_skb under the
      sk_receive_queue's lock and take it everywhere we touch oob_skb.
      
      Note that we defer kfree_skb() in manage_oob() to silence lockdep
      false-positive (See [1]).
      
      [0]:
      BUG: kernel NULL pointer dereference, address: 0000000000000008
       PF: supervisor write access in kernel mode
       PF: error_code(0x0002) - not-present page
      PGD 8000000009f5e067 P4D 8000000009f5e067 PUD 9f5d067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP PTI
      CPU: 3 PID: 50 Comm: kworker/3:1 Not tainted 6.9.0-rc5-00191-gd091e579b864 #110
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: events delayed_fput
      RIP: 0010:skb_dequeue (./include/linux/skbuff.h:2386 ./include/linux/skbuff.h:2402 net/core/skbuff.c:3847)
      Code: 39 e3 74 3e 8b 43 10 48 89 ef 83 e8 01 89 43 10 49 8b 44 24 08 49 c7 44 24 08 00 00 00 00 49 8b 14 24 49 c7 04 24 00 00 00 00 <48> 89 42 08 48 89 10 e8 e7 c5 42 00 4c 89 e0 5b 5d 41 5c c3 cc cc
      RSP: 0018:ffffc900001bfd48 EFLAGS: 00000002
      RAX: 0000000000000000 RBX: ffff8880088f5ae8 RCX: 00000000361289f9
      RDX: 0000000000000000 RSI: 0000000000000206 RDI: ffff8880088f5b00
      RBP: ffff8880088f5b00 R08: 0000000000080000 R09: 0000000000000001
      R10: 0000000000000003 R11: 0000000000000001 R12: ffff8880056b6a00
      R13: ffff8880088f5280 R14: 0000000000000001 R15: ffff8880088f5a80
      FS:  0000000000000000(0000) GS:ffff88807dd80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 0000000006314000 CR4: 00000000007506f0
      PKRU: 55555554
      Call Trace:
       <TASK>
       unix_release_sock (net/unix/af_unix.c:654)
       unix_release (net/unix/af_unix.c:1050)
       __sock_release (net/socket.c:660)
       sock_close (net/socket.c:1423)
       __fput (fs/file_table.c:423)
       delayed_fput (fs/file_table.c:444 (discriminator 3))
       process_one_work (kernel/workqueue.c:3259)
       worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
       kthread (kernel/kthread.c:388)
       ret_from_fork (arch/x86/kernel/process.c:153)
       ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
       </TASK>
      Modules linked in:
      CR2: 0000000000000008
      
      Link: https://lore.kernel.org/netdev/a00d3993-c461-43f2-be6d-07259c98509a@rbox.co/ [1]
      Fixes: 1279f9d9 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")
      Reported-by: Billy Jheng Bing-Jhong <billy@starlabs.sg>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240516134835.8332-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  4. May 20, 2024
    • rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL · 4836da21
      Dan Aloni authored
      
      In an IB device bonding scenario, when bringing down one or all of
      the ports, we saw xprtrdma enter a non-recoverable state in which it
      is not even possible to complete the disconnect and shut down the
      mount, requiring a reboot. After debugging, we saw that the transport
      connect never ended after receiving the
      RDMA_CM_EVENT_DEVICE_REMOVAL callback.
      
      The DEVICE_REMOVAL callback is issued irrespective of whether the CM_ID
      is connected, and ESTABLISHED may not have happened, so each of these
      states needs to be handled accordingly.
      
      Fixes: 2acc5cae ('xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed')
      Cc: Sagi Grimberg <sagi.grimberg@vastdata.com>
      Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    • sunrpc: fix NFSACL RPC retry on soft mount · 0dc9f430
      Dan Aloni authored
      Until 1b63a751 ('SUNRPC: Refactor rpc_clone_client()'), back in 2012,
      `cl_timeout` was copied in so that all mount parameters propagated to
      NFSACL clients. Since that change, however, if the following mount
      options are given:
      
          soft,timeo=50,retrans=16,vers=3
      
      The resultant NFSACL client receives:
      
          cl_softrtry: 1
          cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0
      
      These values lead to NFSACL operations not being retried under the
      condition of transient network outages with soft mount. Instead, getacl
      call fails after 60 seconds with EIO.
      
      The simple fix is to pass the existing client's `cl_timeout` as the new
      client timeout.
      
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Benjamin Coddington <bcodding@redhat.com>
      Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/
      Fixes: 1b63a751 ('SUNRPC: Refactor rpc_clone_client()')
      Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    • SUNRPC: fix handling expired GSS context · 9b62ef6d
      Olga Kornievskaia authored
      
      In the case where we have received a successful reply to an RPC request,
      but while processing the reply the client in rpc_decode_header() finds
      an expired context, the code ends up propagating the error to the caller
      instead of getting a new context and retrying the request.
      
      In more detail: rpc_decode_header() calls rpcauth_checkverf(), which
      calls into GSS and at some point calls gss_validate(). That function
      checks whether the current context's lifetime has expired, and fails
      if it has. The reason for the failure gets 'scrubbed' and translated
      to EACCES, so back in rpc_decode_header() we just go to "out_verifier",
      which for that error is converted to "out_garbage" (i.e. it is treated
      as a garbled reply), and the next action is call_encode. That doesn't
      re-encode or re-send (and no upcall happens, because the fact that the
      context expired is not known), so it fails again in the same decoding
      process. After retrying 3 times, the error is propagated back to the
      caller (e.g. nfs4_write_done_cb() in the case of a failing write).
      
      To fix this, look at the case where the server decides that the
      context has expired and replies with an RPC auth error. In that case,
      rpc_decode_header() goes to "out_msg_denied", where we return
      EKEYREJECTED, which call_decode() routes to "call_reserve"; that
      triggers an upcall and a retry of the operation.
      
      The proposed fix is, when rpc_decode_header() fails, to check
      whether the credentials were marked invalid, use that as a proxy for
      deciding that the context has expired, and then treat it the same way
      as receiving an auth error.
      
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    • nfc: nci: Fix uninit-value in nci_rx_work · e4a87abf
      Ryosuke Yasuoka authored
      
      syzbot reported the following uninit-value access issue [1]
      
      nci_rx_work() parses received packets from ndev->rx_q. It should
      validate the header size, payload size and total packet size before
      processing a packet. If an invalid packet is detected, it should be
      silently discarded.
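
      A rough Python sketch of such a size check follows. It assumes a 3-byte
      header whose third byte carries the payload length; that layout and both
      function names are assumptions made for illustration, not taken from this
      commit.

```python
NCI_HDR_SIZE = 3  # assumption: 3-byte header, payload length in byte 2


def nci_packet_is_valid(packet: bytes) -> bool:
    """Check header size first, then that the buffer holds the declared payload."""
    if len(packet) < NCI_HDR_SIZE:
        return False
    payload_len = packet[2]
    return len(packet) >= NCI_HDR_SIZE + payload_len


def process_rx_queue(queue):
    """Handle valid packets and silently drop the rest."""
    handled = []
    for packet in queue:
        if not nci_packet_is_valid(packet):
            continue  # discard and move on to the next packet
        handled.append(packet)
    return handled


queue = [
    bytes([0x40, 0x00, 0x02, 0xAA, 0xBB]),  # header + 2 payload bytes: valid
    bytes([0x40, 0x00]),                    # shorter than the header: dropped
    bytes([0x40, 0x00, 0x05, 0xAA]),        # declares 5 bytes, carries 1: dropped
    bytes([0x40, 0x00, 0x00]),              # zero-length payload: valid in this sketch
]
print(len(process_rx_queue(queue)))
```

      Checking the header length before reading the length byte is what keeps
      the parser from touching bytes that were never received.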
      
      Fixes: d24b0353 ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet")
      Reported-and-tested-by: <syzbot+d7b4dc6cd50410152534@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=d7b4dc6cd50410152534 [1]
      Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: sr: fix missing sk_buff release in seg6_input_core · 5447f970
      Andrea Mayer authored
      
      The seg6_input() function is responsible for adding the SRH into a
      packet, delegating the operation to seg6_input_core(). This function
      uses skb_cow_head() to ensure that there is sufficient headroom in
      the sk_buff to accommodate the link-layer header.
      In the event that skb_cow_head() fails, seg6_input_core() catches
      the error but does not release the sk_buff, which results in a
      memory leak.
      
      This issue was introduced in commit af3b5158 ("ipv6: sr: fix BUG due
      to headroom too small after SRH push") and persists even after commit
      7a3f5b0d ("netfilter: add netfilter hooks to SRv6 data plane"),
      where the entire seg6_input() code was refactored to deal with netfilter
      hooks.
      
      The proposed patch addresses the identified memory leak by requiring the
      seg6_input_core() function to release the sk_buff in the event that
      skb_cow_head() fails.
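
The pattern behind this fix (a function that owns a buffer must release it on its error path too) can be sketched in user-space C. Everything here is hypothetical: `struct pkt`, `pkt_ensure_headroom()` and `pkt_input_core()` merely stand in for the sk_buff, skb_cow_head() and seg6_input_core(), and the failure condition is invented for the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff: a buffer with some headroom. */
struct pkt {
	size_t headroom;
};

static int live_pkts;	/* counts buffers not yet released */

static struct pkt *pkt_alloc(size_t headroom)
{
	struct pkt *p = malloc(sizeof(*p));

	if (p) {
		p->headroom = headroom;
		live_pkts++;
	}
	return p;
}

static void pkt_free(struct pkt *p)
{
	if (p) {
		free(p);
		live_pkts--;
	}
}

/* Stand-in for skb_cow_head(): fails when the headroom is too small. */
static int pkt_ensure_headroom(struct pkt *p, size_t needed)
{
	return p->headroom >= needed ? 0 : -ENOMEM;
}

/* The function consumes the packet on every path, so the error branch
 * must release it before returning. */
static int pkt_input_core(struct pkt *p, size_t needed)
{
	int err = pkt_ensure_headroom(p, needed);

	if (err) {
		pkt_free(p);	/* without this the buffer would leak */
		return err;
	}

	/* Normal processing would hand the packet on; this sketch simply
	 * releases it to stay self-contained. */
	pkt_free(p);
	return 0;
}
```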
      
      Fixes: af3b5158 ("ipv6: sr: fix BUG due to headroom too small after SRH push")
      Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5447f970
  5. May 17, 2024
    • Tom Parkin's avatar
      l2tp: fix ICMP error handling for UDP-encap sockets · 6e828dc6
      Tom Parkin authored
      
      Since commit a36e185e
      ("udp: Handle ICMP errors for tunnels with same destination port on both endpoints")
      UDP's handling of ICMP errors has allowed for UDP-encap tunnels to
      determine socket associations in scenarios where the UDP hash lookup
      could not.
      
      Subsequently, commit d26796ae
      ("udp: check udp sock encap_type in __udp_lib_err")
      subtly tweaked the approach such that UDP ICMP error handling would be
      skipped for any UDP socket which has encapsulation enabled.
      
      In the case of L2TP tunnel sockets using UDP-encap, this latter
      modification effectively broke ICMP error reporting for the L2TP
      control plane.
      
      To a degree this isn't catastrophic inasmuch as the L2TP control
      protocol defines a reliable transport on top of the underlying packet
      switching network which will eventually detect errors and time out.
      
      However, paying attention to the ICMP error reporting allows for more
      timely detection of errors in L2TP userspace, and aids in debugging
      connectivity issues.
      
      Reinstate ICMP error handling for UDP encap L2TP tunnels:
      
       * implement struct udp_tunnel_sock_cfg .encap_err_rcv in order to allow
         the L2TP code to handle ICMP errors;
      
       * only implement error-handling for tunnels which have a managed
         socket: unmanaged tunnels using a kernel socket have no userspace to
         report errors back to;
      
       * flag the error on the socket, which allows for userspace to get an
         error such as -ECONNREFUSED back from sendmsg/recvmsg;
      
       * pass the error into ip[v6]_icmp_error() which allows for userspace to
         get extended error information via MSG_ERRQUEUE.
      
      Fixes: d26796ae ("udp: check udp sock encap_type in __udp_lib_err")
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Link: https://lore.kernel.org/r/20240513172248.623261-1-tparkin@katalix.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6e828dc6
    • Eric Dumazet's avatar
      af_packet: do not call packet_read_pending() from tpacket_destruct_skb() · 581073f6
      Eric Dumazet authored
      
      trafgen performance considerably sank on hosts with many cores
      after the blamed commit.
      
      packet_read_pending() is very expensive, and calling it
      in the af_packet fast path defeats Daniel's intent in commit
      b0138408 ("packet: use percpu mmap tx frame pending refcount")
      
      tpacket_destruct_skb() makes room for one packet, so we can immediately
      wake up a producer; there is no need to completely drain the tx ring.
      
      Fixes: 89ed5b51 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20240515163358.4105915-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      581073f6
    • Eric Dumazet's avatar
      netrom: fix possible dead-lock in nr_rt_ioctl() · e03e7f20
      Eric Dumazet authored
      
      syzbot loves netrom, and found a possible deadlock in nr_rt_ioctl [1]
      
      Make sure we always acquire nr_node_list_lock before nr_node_lock(nr_node)
      
      [1]
      WARNING: possible circular locking dependency detected
      6.9.0-rc7-syzkaller-02147-g654de42f3fc6 #0 Not tainted
      ------------------------------------------------------
      syz-executor350/5129 is trying to acquire lock:
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_node_lock include/net/netrom.h:152 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:464 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
      
      but task is already holding lock:
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (nr_node_list_lock){+...}-{2:2}:
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_remove_node net/netrom/nr_route.c:299 [inline]
              nr_del_node+0x4b4/0x820 net/netrom/nr_route.c:355
              nr_rt_ioctl+0xa95/0x1090 net/netrom/nr_route.c:683
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      -> #0 (&nr_node->node_lock){+...}-{2:2}:
              check_prev_add kernel/locking/lockdep.c:3134 [inline]
              check_prevs_add kernel/locking/lockdep.c:3253 [inline]
              validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
              __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_node_lock include/net/netrom.h:152 [inline]
              nr_dec_obs net/netrom/nr_route.c:464 [inline]
              nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(nr_node_list_lock);
                                     lock(&nr_node->node_lock);
                                     lock(nr_node_list_lock);
        lock(&nr_node->node_lock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor350/5129:
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      stack backtrace:
      CPU: 0 PID: 5129 Comm: syz-executor350 Not tainted 6.9.0-rc7-syzkaller-02147-g654de42f3fc6 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
        check_prev_add kernel/locking/lockdep.c:3134 [inline]
        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
        _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
        spin_lock_bh include/linux/spinlock.h:356 [inline]
        nr_node_lock include/net/netrom.h:152 [inline]
        nr_dec_obs net/netrom/nr_route.c:464 [inline]
        nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
        sock_do_ioctl+0x158/0x460 net/socket.c:1222
        sock_ioctl+0x629/0x8e0 net/socket.c:1341
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:904 [inline]
        __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
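
The rule stated at the top of this message (always take the list-wide lock before the per-node lock) can be modelled in user space with pthread mutexes. This is a hypothetical sketch: `node_list_lock`, `struct node`, `node_dec_obs()` and `node_reset()` are illustrative names, not the netrom code.

```c
#include <pthread.h>

/* Hypothetical model: one list-wide lock and one lock per node. */
static pthread_mutex_t node_list_lock = PTHREAD_MUTEX_INITIALIZER;

struct node {
	pthread_mutex_t lock;
	int obsolescence;
};

/* Every path takes node_list_lock first and the node lock second, and
 * releases them in reverse order, so no cycle can form between them. */
static void node_dec_obs(struct node *n)
{
	pthread_mutex_lock(&node_list_lock);
	pthread_mutex_lock(&n->lock);
	if (n->obsolescence > 0)
		n->obsolescence--;
	pthread_mutex_unlock(&n->lock);
	pthread_mutex_unlock(&node_list_lock);
}

static void node_reset(struct node *n, int value)
{
	pthread_mutex_lock(&node_list_lock);	/* same order as above */
	pthread_mutex_lock(&n->lock);
	n->obsolescence = value;
	pthread_mutex_unlock(&n->lock);
	pthread_mutex_unlock(&node_list_lock);
}
```

The deadlock in the report arises when one path takes the two locks in the opposite order; with a single agreed order, two CPUs can at worst wait for each other on the first lock.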
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240515142934.3708038-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e03e7f20
    • xu xin's avatar
      net/ipv6: Fix route deleting failure when metric equals 0 · bb487272
      xu xin authored
      
      Problem
      =======
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      we noticed that the logic for assigning the default value of fc_metric
      changed in the ioctl path. When a user adds a route with a non-zero
      metric via ioctl(fd, SIOCADDRT, rt), deleting that route by passing a
      metric of 0 to ioctl(fd, SIOCDELRT, rt) may fail, whereas iproute can
      still delete it.
      
      For reference, when deleting a route via netlink with the iproute tools
      and a metric of 0, as in the following command:
      
      	ip -6 route del fe80::/64 via fe81::5054:ff:fe11:3451 dev eth0 metric 0
      
      the user can still succeed in deleting the route entry with the smallest
      metric.
      
      Root Reason
      ===========
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      when ioctl() passes in SIOCDELRT with a zero metric, rtmsg_to_fib6_config()
      sets cfg->fc_metric to the default value (1024) in the kernel, and
      ip6_route_del(), at line 4074 of net/ipv6/route.c, checks
      
      	if (cfg->fc_metric && cfg->fc_metric != rt->fib6_metric)
      		continue;
      
      The condition is true because cfg->fc_metric != rt->fib6_metric, so the
      rest of the procedure (deleting the route) is skipped. Before that
      commit, cfg->fc_metric was still zero at this point, so the condition
      was false and the deletion went ahead.
      
      Solution
      ========
      To keep the behaviour consistent between netlink and ioctl(), we
      should allow deleting a route with a metric value of 0. So we only
      apply the default fc_metric when adding a route.
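
The interaction between the default metric and the delete-time check can be reproduced in a few lines of user-space C. `metric_matches()` mirrors the condition quoted above; `request_metric()` and `DEFAULT_METRIC` are hypothetical names used only to model "apply the default on add, not on delete".

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_METRIC 1024u	/* default value quoted in the text above */

/* Mirrors the quoted check: a zero metric in the request acts as a
 * wildcard, any other value must match the route's metric exactly. */
static bool metric_matches(uint32_t cfg_metric, uint32_t rt_metric)
{
	return !(cfg_metric && cfg_metric != rt_metric);
}

/* Hypothetical helper: substitute the default only when adding. */
static uint32_t request_metric(uint32_t user_metric, bool is_add)
{
	if (is_add && user_metric == 0)
		return DEFAULT_METRIC;
	return user_metric;
}
```

With the default applied on delete as well, a request carrying metric 0 would be turned into 1024 and would no longer match a route installed with, say, metric 256; leaving it at 0 keeps the wildcard behaviour.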
      
      CC: stable@vger.kernel.org # 5.4+
      Fixes: 67f69513 ("ipv6: Move setting default metric for routes")
      Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: xu xin <xu.xin16@zte.com.cn>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240514201102055dD2Ba45qKbLlUMxu_DTHP@zte.com.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bb487272
  6. May 16, 2024
  7. May 15, 2024
    • Nikolay Aleksandrov's avatar
      net: bridge: mst: fix vlan use-after-free · 3a7c1661
      Nikolay Aleksandrov authored
      
      syzbot reported a suspicious rcu usage[1] in bridge's mst code. While
      fixing it I noticed that nothing prevents a vlan from being freed while
      the list is walked from the same path (br forward delay timer). Fix the
      rcu usage and also make sure we are not accessing freed memory by making
      br_mst_vlan_set_state use rcu read lock.
      
      [1]
       WARNING: suspicious RCU usage
       6.9.0-rc6-syzkaller #0 Not tainted
       -----------------------------
       net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage!
       ...
       stack backtrace:
       CPU: 1 PID: 8017 Comm: syz-executor.1 Not tainted 6.9.0-rc6-syzkaller #0
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
        nbp_vlan_group net/bridge/br_private.h:1599 [inline]
        br_mst_set_state+0x1ea/0x650 net/bridge/br_mst.c:105
        br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47
        br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88
        call_timer_fn+0x18e/0x650 kernel/time/timer.c:1793
        expire_timers kernel/time/timer.c:1844 [inline]
        __run_timers kernel/time/timer.c:2418 [inline]
        __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2429
        run_timer_base kernel/time/timer.c:2438 [inline]
        run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2448
        __do_softirq+0x2c6/0x980 kernel/softirq.c:554
        invoke_softirq kernel/softirq.c:428 [inline]
        __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
        irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
        instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
        sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
        </IRQ>
        <TASK>
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
       RIP: 0010:lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5758
       Code: 2b 00 74 08 4c 89 f7 e8 ba d1 84 00 f6 44 24 61 02 0f 85 85 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
       RSP: 0018:ffffc90013657100 EFLAGS: 00000206
       RAX: 0000000000000001 RBX: 1ffff920026cae2c RCX: 0000000000000001
       RDX: dffffc0000000000 RSI: ffffffff8bcaca00 RDI: ffffffff8c1eaa60
       RBP: ffffc90013657260 R08: ffffffff92efe507 R09: 1ffffffff25dfca0
       R10: dffffc0000000000 R11: fffffbfff25dfca1 R12: 1ffff920026cae28
       R13: dffffc0000000000 R14: ffffc90013657160 R15: 0000000000000246
      
      Fixes: ec7328b5 ("net: bridge: mst: Multiple Spanning Tree (MST) mode")
      Reported-by: <syzbot+fa04eb8a56fd923fc5d8@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=fa04eb8a56fd923fc5d8
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a7c1661
    • Nikolay Aleksandrov's avatar
      net: bridge: xmit: make sure we have at least eth header len bytes · 8bd67ebb
      Nikolay Aleksandrov authored
      
      syzbot triggered an uninit value[1] error in bridge device's xmit path
      by sending a short (less than ETH_HLEN bytes) skb. To fix it check if
      we can actually pull that amount instead of assuming.
      
      Tested with dropwatch:
       drop at: br_dev_xmit+0xb93/0x12d0 [bridge] (0xffffffffc06739b3)
       origin: software
       timestamp: Mon May 13 11:31:53 2024 778214037 nsec
       protocol: 0x88a8
       length: 2
       original length: 2
       drop reason: PKT_TOO_SMALL
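
The idea of checking the length before touching the Ethernet header can be sketched in user-space C as follows. `frame_ethertype()` and `ETH_HDR_LEN` are hypothetical names; the only assumption is the standard 14-byte Ethernet header (two 6-byte addresses followed by a 2-byte EtherType).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14	/* destination MAC + source MAC + EtherType */

/* Hypothetical helper: reads the EtherType only after confirming that
 * the frame is long enough to contain a complete Ethernet header. */
static bool frame_ethertype(const uint8_t *frame, size_t len, uint16_t *proto)
{
	if (frame == NULL || len < ETH_HDR_LEN)
		return false;	/* too short: caller should drop the frame */

	*proto = (uint16_t)((frame[12] << 8) | frame[13]);
	return true;
}
```

A 2-byte frame like the one in the dropwatch output above fails the length check and is dropped instead of being parsed.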
      
      [1]
      BUG: KMSAN: uninit-value in br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
       br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
       __netdev_start_xmit include/linux/netdevice.h:4903 [inline]
       netdev_start_xmit include/linux/netdevice.h:4917 [inline]
       xmit_one net/core/dev.c:3531 [inline]
       dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
       __dev_queue_xmit+0x34db/0x5350 net/core/dev.c:4341
       dev_queue_xmit include/linux/netdevice.h:3091 [inline]
       __bpf_tx_skb net/core/filter.c:2136 [inline]
       __bpf_redirect_common net/core/filter.c:2180 [inline]
       __bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
       ____bpf_clone_redirect net/core/filter.c:2460 [inline]
       bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
       ___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
       __bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
       bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
       __bpf_prog_run include/linux/filter.h:657 [inline]
       bpf_prog_run include/linux/filter.h:664 [inline]
       bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
       bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
       bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4269
       __sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5678
       __do_sys_bpf kernel/bpf/syscall.c:5767 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5765 [inline]
       __x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5765
       x64_sys_call+0x96b/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:322
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: <syzbot+a63a1f6a062033cf0f40@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=a63a1f6a062033cf0f40
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8bd67ebb
  8. May 14, 2024
    • Heiko Carstens's avatar
      s390/iucv: Unexport iucv_root · effb8357
      Heiko Carstens authored
      
      There is no user of iucv_root outside of the core IUCV code left.
      Therefore remove the EXPORT_SYMBOL.
      
      Acked-by: Alexandra Winter <wintera@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240506194454.1160315-7-hca@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      effb8357
    • Heiko Carstens's avatar
      s390/iucv: Provide iucv_alloc_device() / iucv_release_device() · 4452e8ef
      Heiko Carstens authored
      
      Provide iucv_alloc_device() and iucv_release_device() helper functions,
      which can be used to deduplicate more or less identical IUCV device
      allocation and release code in four different drivers.
      
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Alexandra Winter <wintera@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240506194454.1160315-2-hca@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      4452e8ef
    • Luiz Augusto von Dentz's avatar
      Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1 · e77f43d5
      Luiz Augusto von Dentz authored
      
      If hdev->le_num_of_adv_sets is set to 1 it means that only handle 0x00
      can be used, but since the MGMT interface instances start from 1
      (instance 0 means all instances in case of MGMT_OP_REMOVE_ADVERTISING)
      the code needs to map the instance to handle otherwise users will not be
      able to advertise as instance 1 would attempt to use handle 0x01.
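
The instance-to-handle mapping described above might be sketched like this. `adv_instance_to_handle()` is a hypothetical helper, and the fallback of using the instance number as the handle when more than one set is available is an assumption made for the illustration.

```c
#include <stdint.h>

/* Hypothetical mapping from a MGMT advertising instance (numbered from 1)
 * to a controller advertising handle. With a single advertising set the
 * only usable handle is 0x00, so instance 1 is mapped onto it; otherwise
 * this sketch assumes the handle simply equals the instance number. */
static uint8_t adv_instance_to_handle(uint8_t instance, uint8_t num_adv_sets)
{
	if (num_adv_sets == 1 && instance == 1)
		return 0x00;
	return instance;
}
```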
      
      Fixes: 1d0fac2c ("Bluetooth: Use controller sets when available")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      e77f43d5
    • Luiz Augusto von Dentz's avatar
      Bluetooth: HCI: Remove HCI_AMP support · 84a4bb65
      Luiz Augusto von Dentz authored
      
      Since BT_HS has been removed, HCI_AMP controllers no longer have any
      use, so remove them along with the capability of creating AMP controllers.
      
      Since we no longer need to differentiate between AMP and Primary
      controllers, as only HCI_PRIMARY is left, this also removes
      hdev->dev_type altogether.
      
      Fixes: e7b02296 ("Bluetooth: Remove BT_HS")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      84a4bb65
    • Sungwoo Kim's avatar
      Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init() · a5b862c6
      Sungwoo Kim authored
      
      l2cap_le_flowctl_init() can cause both div-by-zero and an integer
      overflow since hdev->le_mtu may not fall in the valid range.
      
      Move MTU from hci_dev to hci_conn to validate MTU and stop the connection
      process earlier if MTU is invalid.
      Also, add a missing validation in read_buffer_size() and make it return
      an error value if the validation fails.
      Now hci_conn_add() returns ERR_PTR() as it can fail due to either a
      kzalloc failure or an invalid MTU value.
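
The "validate the MTU first, then divide" approach can be illustrated with a small user-space sketch. `le_initial_credits()`, `MIN_LE_MTU` and `L2CAP_HDR_LEN` are hypothetical names, and the minimum of 27 octets is an assumption for the example rather than a value taken from this commit.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: 27 octets (0x001b) is used here as the smallest acceptable
 * LE ACL MTU; the real bounds are defined by the kernel headers. */
#define MIN_LE_MTU 0x001b
#define L2CAP_HDR_LEN 4	/* basic L2CAP header: length + channel ID */

/* Hypothetical helper: derives the initial receive credits from the
 * channel MTU (imtu) and the link MTU, refusing a link MTU that would
 * make the divisor zero or wrap around. */
static int le_initial_credits(uint16_t imtu, uint16_t link_mtu,
			      uint16_t *credits)
{
	uint16_t mps;

	if (link_mtu < MIN_LE_MTU)
		return -EINVAL;	/* reject before any arithmetic */

	mps = (uint16_t)(link_mtu - L2CAP_HDR_LEN);
	if (mps > imtu)
		mps = imtu;
	if (mps == 0)
		return -EINVAL;	/* an imtu of zero would still divide by zero */

	*credits = (uint16_t)(imtu / mps + 1);
	return 0;
}
```

With a link MTU of 0, the unchecked subtraction would wrap and the later division could end up dividing by zero; rejecting the value up front avoids both problems.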
      
      divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
      CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G        W          6.9.0-rc5+ #20
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Workqueue: hci0 hci_rx_work
      RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547
      Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c
      89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d
      b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42
      RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246
      RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f
      RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa
      R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084
      R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000
      FS:  0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline]
       l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline]
       l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline]
       l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809
       l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506
       hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline]
       hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176
       process_one_work kernel/workqueue.c:3254 [inline]
       process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335
       worker_thread+0x926/0xe70 kernel/workqueue.c:3416
       kthread+0x2e3/0x380 kernel/kthread.c:388
       ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      
      Fixes: 6ed58ec5 ("Bluetooth: Use LE buffers for LE traffic")
      Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
      Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      a5b862c6
    • Gustavo A. R. Silva's avatar
      Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning · ea9e148c
      Gustavo A. R. Silva authored
      Prepare for the coming implementation by GCC and Clang of the
      __counted_by attribute. Flexible array members annotated with
      __counted_by can have their accesses bounds-checked at run-time
      via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE
      (for strcpy/memcpy-family functions).
      
      Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are
      getting ready to enable it globally.
      
      So, use the `DEFINE_FLEX()` helper for an on-stack definition of
      a flexible structure where the size of the flexible-array member
      is known at compile-time, and refactor the rest of the code,
      accordingly.
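
As a rough user-space analogue of what an on-stack flexible structure looks like, the sketch below reserves storage for the header plus a compile-time number of elements through a union. It is not the kernel's `DEFINE_FLEX()` macro; `struct cis_cmd`, `struct cis_entry`, `MAX_CIS` and `fill_cis()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct cis_entry {
	uint16_t handle;
};

/* Hypothetical command layout with a trailing flexible array member. */
struct cis_cmd {
	uint8_t num_cis;
	struct cis_entry cis[];
};

#define MAX_CIS 4	/* compile-time element count chosen for the example */

/* A union reserves enough correctly aligned stack space for the header
 * plus MAX_CIS elements, so the flexible structure is never embedded in
 * the middle of another structure. */
union cis_cmd_storage {
	struct cis_cmd cmd;
	unsigned char bytes[sizeof(struct cis_cmd) +
			    MAX_CIS * sizeof(struct cis_entry)];
};

static unsigned int fill_cis(void)
{
	union cis_cmd_storage storage = { .bytes = { 0 } };
	struct cis_cmd *cmd = &storage.cmd;
	unsigned int sum = 0;

	cmd->num_cis = MAX_CIS;		/* the count that bounds cis[] */
	for (uint8_t i = 0; i < cmd->num_cis; i++)
		cmd->cis[i].handle = (uint16_t)(0x0100 + i);
	for (uint8_t i = 0; i < cmd->num_cis; i++)
		sum += cmd->cis[i].handle;
	return sum;
}
```

Setting the counter before indexing the array matches the discipline that a `__counted_by` annotation expects, since bounds checks are made against the counter's current value.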
      
      With these changes, fix the following warning:
      net/bluetooth/hci_conn.c:669:41: warning: structure containing a
      flexible array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      
      Link: https://github.com/KSPP/linux/issues/202
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      ea9e148c
    • Mahesh Talewad's avatar
      LE Create Connection command timeout increased to 20 secs · 21d74b6b
      Mahesh Talewad authored
      
      On our DUT, the host issues an LE Create Connection Cancel command
      after 4 seconds if no Connection Complete event arrives for the
      LE Create Connection command.
      As per Core Spec v5.3 section 7.8.5, the advertising interval range is:
      
      Advertising_Interval_Min
      Default : 0x0800(1.28s)
      Time Range: 20ms to 10.24s
      
      Advertising_Interval_Max
      Default : 0x0800(1.28s)
      Time Range: 20ms to 10.24s
      
      If the remote device is using adv interval of > 4 sec, it is
      difficult to make a connection with the current timeout value.
      Also, with the default interval of 1.28 sec, we will get only
      3 chances to capture the adv packets with the 4 sec window.
      Hence we want to increase this timeout to 20sec.
      
      Signed-off-by: Mahesh Talewad <mahesh.talewad@nxp.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      21d74b6b
    • Sebastian Urban's avatar
      Bluetooth: compute LE flow credits based on recvbuf space · ce60b923
      Sebastian Urban authored
      
      Previously LE flow credits were returned to the
      sender even if the socket's receive buffer was
      full. This meant that no back-pressure
      was applied to the sender, thus it continued to
      send data, resulting in data loss without any
      error being reported. Furthermore, the amount
      of credits was essentially fixed to a small
      amount, leading to reduced performance.
      
      This is fixed by computing the number of returned
      LE flow credits based on the estimated available
      space in the receive buffer of an L2CAP socket.
      Consequently, if the receive buffer is full, no
      credits are returned until the buffer is read and
      thus cleared by user-space.
      
      Since the computation of available receive buffer
      space can only be performed approximately (due to
      sk_buff overhead) and the receive buffer size may
      be changed by user-space after flow credits have
      been sent, superfluous received data is temporarily
      stored within l2cap_pinfo. This is necessary
      because Bluetooth LE provides no retransmission
      mechanism once the data has been acked by the
      physical layer.
      
      If receive buffer space estimation is not possible
      at the moment, we fall back to providing credits
      for one full packet as before. This is currently
      the case during connection setup, when MPS is not
      yet available.
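
The credit computation described in this message can be approximated by the following user-space sketch. `le_rx_credits()` and its parameters are hypothetical; in particular, a negative `rx_avail` is used here to model "estimate not available" and the fallback value is passed in by the caller.

```c
#include <stdint.h>

/* Hypothetical helper: number of credits to return to the sender.
 * rx_avail is the estimated free receive-buffer space in bytes (negative
 * when unknown), mps the maximum PDU payload size (0 when not yet known),
 * and full_packet_credits the fallback covering one full packet. */
static uint16_t le_rx_credits(long rx_avail, uint16_t mps,
			      uint16_t full_packet_credits)
{
	long credits;

	if (rx_avail < 0 || mps == 0)
		return full_packet_credits;	/* estimate not possible */

	credits = rx_avail / mps;	/* whole PDUs that still fit */
	return credits > UINT16_MAX ? UINT16_MAX : (uint16_t)credits;
}
```

When the buffer is full, `rx_avail` is 0 and no credits are returned, which is what applies back-pressure to the sender until user-space reads from the socket.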
      
      Fixes: b1c325c2 ("Bluetooth: Implement returning of LE L2CAP credits")
      Signed-off-by: Sebastian Urban <surban@surban.net>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      ce60b923
    • Gustavo A. R. Silva's avatar
      Bluetooth: hci_sync: Use cmd->num_cis instead of magic number · 73b2652c
      Gustavo A. R. Silva authored
      At the moment of the check, `cmd->num_cis` holds the value 0x1f,
      which is the maximum number of elements in the `cmd->cis[]` array
      as declared.
      
      So, avoid using 0x1f directly, and instead use `cmd->num_cis`, similarly
      to this other patch[1].
      
      Link: https://lore.kernel.org/linux-hardening/ZivaHUQyDDK9fXEk@neat/ [1]
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      73b2652c
    • Gustavo A. R. Silva's avatar
      Bluetooth: hci_conn: Use struct_size() in hci_le_big_create_sync() · d6bb8782
      Gustavo A. R. Silva authored
      Use struct_size() instead of the open-coded version, similarly to
      this other patch[1].
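
For readers unfamiliar with the helper, the sketch below shows a user-space analogue of the size computation it replaces. `flex_size()` and `struct bis_cmd` are hypothetical; the saturating behaviour follows the kernel helper, which yields SIZE_MAX on overflow.

```c
#include <stddef.h>
#include <stdint.h>

struct bis_cmd {
	uint8_t num_bis;
	uint8_t bis[];	/* flexible array member */
};

/* Hypothetical user-space analogue of the kernel's struct_size():
 * header size plus count elements, saturating to SIZE_MAX on overflow
 * so that a later allocation fails instead of being undersized. */
static size_t flex_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}
```

The open-coded form `sizeof(*cmd) + count * sizeof(cmd->bis[0])` gives the same result for small counts but silently wraps around for very large ones.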
      
      Link: https://lore.kernel.org/linux-hardening/ZiwwPmCvU25YzWek@neat/ [1]
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      d6bb8782
    • Gustavo A. R. Silva's avatar
      Bluetooth: hci_conn: Use __counted_by() to avoid -Wfamnae warning · c90748b8
      Gustavo A. R. Silva authored
      Prepare for the coming implementation by GCC and Clang of the
      __counted_by attribute. Flexible array members annotated with
      __counted_by can have their accesses bounds-checked at run-time
      via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE
      (for strcpy/memcpy-family functions).
      
      Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are
      getting ready to enable it globally.
      
      So, use the `DEFINE_FLEX()` helper for an on-stack definition of
      a flexible structure where the size of the flexible-array member
      is known at compile-time, and refactor the rest of the code,
      accordingly.
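
As a rough illustration of the idea, the following userspace sketch defines an on-stack flexible structure whose element count is a compile-time constant. `DEMO_COUNTED_BY()`, `DEMO_DEFINE_FLEX()` and `struct demo_cmd` are hypothetical names; the macro only approximates what a `DEFINE_FLEX()`-style helper does and is not the kernel's implementation. The attribute is applied only when the compiler reports support for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Use the counted_by attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define DEMO_COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef DEMO_COUNTED_BY
# define DEMO_COUNTED_BY(m)
#endif

/* Hypothetical flexible structure: num_items bounds items[]. */
struct demo_cmd {
	uint8_t num_items;
	uint16_t items[] DEMO_COUNTED_BY(num_items);
};

/*
 * Reserve on-stack storage for the header plus `count` elements,
 * zero it, expose it through a pointer called `name`, and set the
 * counter before any element is touched.
 */
#define DEMO_DEFINE_FLEX(type, name, member, counter, count)		\
	union {								\
		unsigned char bytes[sizeof(type) +			\
			(count) * sizeof(((type *)0)->member[0])];	\
		type obj;						\
	} name##_u;							\
	type *name = (type *)&name##_u;					\
	memset(&name##_u, 0, sizeof(name##_u));				\
	name->counter = (count)

/* Fill four on-stack elements with 1..4 and return their sum. */
static unsigned int demo_fill_and_sum(void)
{
	unsigned int i, sum = 0;

	DEMO_DEFINE_FLEX(struct demo_cmd, cmd, items, num_items, 4);

	for (i = 0; i < cmd->num_items; i++)
		cmd->items[i] = (uint16_t)(i + 1);
	for (i = 0; i < cmd->num_items; i++)
		sum += cmd->items[i];
	return sum;
}
```

Because the flexible structure is reached through a pointer to a union that ends with it, no structure ends up with a flexible-array member in the middle, which is the situation -Wflex-array-member-not-at-end complains about.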
      
      With these changes, fix the following warning:
      net/bluetooth/hci_conn.c:2116:50: warning: structure containing a flexible
      array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      
      Link: https://github.com/KSPP/linux/issues/202
      
      
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      c90748b8
    • Gustavo A. R. Silva's avatar
      Bluetooth: hci_conn, hci_sync: Use __counted_by() to avoid -Wfamnae warnings · c4585edf
      Gustavo A. R. Silva authored
      Prepare for the coming implementation by GCC and Clang of the
      __counted_by attribute. Flexible array members annotated with
      __counted_by can have their accesses bounds-checked at run-time
      via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE
      (for strcpy/memcpy-family functions).
      
      Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are
      getting ready to enable it globally.
      
      So, use the `DEFINE_FLEX()` helper for multiple on-stack definitions
      of a flexible structure where the size of the flexible-array member
      is known at compile-time, and refactor the rest of the code,
      accordingly.
      
      Notice that, due to the use of `__counted_by()` in `struct
      hci_cp_le_create_cis`, the for loop in function `hci_cs_le_create_cis()`
      had to be modified. Since the index `i`, through which `cp->cis[i]` is
      accessed, must fall in the interval [0, cp->num_cis), `cp->num_cis`
      cannot be decremented all the way down to zero while accessing
      `cp->cis[]`:
      
      net/bluetooth/hci_event.c:4310:
      4310    for (i = 0; cp->num_cis; cp->num_cis--, i++) {
                      ...
      4314            handle = __le16_to_cpu(cp->cis[i].cis_handle);
      
      otherwise, only half (one iteration before `cp->num_cis == i`) or half
      plus one (one iteration before `cp->num_cis < i`) of the items in the
      array will be accessed before running into an out-of-bounds issue. So,
      in order to avoid this, set `cp->num_cis` to zero just after the for
      loop.
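
The arithmetic behind "half" and "half plus one" can be checked with a small userspace model. The function names below are hypothetical and the explicit `i >= num_cis` test stands in for the run-time bounds check that a `__counted_by()` annotation would enable; this is a sketch of the reasoning, not the kernel code.

```c
#include <assert.h>

/*
 * Old loop shape: the bound shrinks while the index grows. Returns how
 * many elements are accessed before the index leaves [0, num_cis).
 */
static unsigned int demo_old_loop_accesses(unsigned int num_cis)
{
	unsigned int i, in_bounds = 0;

	for (i = 0; num_cis; num_cis--, i++) {
		if (i >= num_cis)	/* a bounds check would trip here */
			break;
		in_bounds++;
	}
	return in_bounds;
}

/*
 * Adjusted loop shape: the bound is left untouched during iteration
 * and the counter is cleared only after the loop has finished.
 */
static unsigned int demo_fixed_loop_accesses(unsigned int *num_cis)
{
	unsigned int i, in_bounds = 0;

	for (i = 0; i < *num_cis; i++)
		in_bounds++;
	*num_cis = 0;
	return in_bounds;
}
```

For example, `demo_old_loop_accesses(4)` returns 2 (half) and `demo_old_loop_accesses(5)` returns 3 (half plus one), while the adjusted loop visits every element and still leaves the counter at zero afterwards.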
      
      Also, make use of `aux_num_cis` variable to update `cmd->num_cis` after
      a `list_for_each_entry_rcu()` loop.
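
A minimal sketch of that pattern, assuming the counter is kept at the array capacity while entries are collected and is only narrowed to the real count afterwards. `struct demo_create_cis`, `DEMO_MAX_CIS` and `demo_collect()` are hypothetical, a fixed-size array stands in for the flexible one, and a plain array of handles stands in for the RCU list walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CIS 0x1f

/* Hypothetical command layout; a fixed array replaces the flexible one. */
struct demo_create_cis {
	uint8_t num_cis;
	uint16_t cis[DEMO_MAX_CIS];
};

/*
 * Collect up to DEMO_MAX_CIS handles. The capacity check reads
 * cmd->num_cis rather than repeating the magic number, and an
 * auxiliary counter tracks how many slots were actually filled.
 */
static size_t demo_collect(struct demo_create_cis *cmd,
			   const uint16_t *handles, size_t n)
{
	size_t aux_num_cis = 0;
	size_t i;

	cmd->num_cis = DEMO_MAX_CIS;	/* bound stays at capacity while filling */

	for (i = 0; i < n; i++) {
		if (aux_num_cis >= cmd->num_cis)
			break;
		cmd->cis[aux_num_cis++] = handles[i];
	}

	cmd->num_cis = (uint8_t)aux_num_cis;	/* publish the real count */
	return aux_num_cis;
}
```

Keeping the bound at capacity during the loop means every write to `cis[]` stays within the range a bounds checker would accept, and the final assignment makes the counter match the number of valid entries.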
      
      With these changes, fix the following warnings:
      net/bluetooth/hci_sync.c:1239:56: warning: structure containing a flexible
      array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/hci_sync.c:1415:51: warning: structure containing a flexible
      array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/hci_sync.c:1731:51: warning: structure containing a flexible
      array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/hci_sync.c:6497:45: warning: structure containing a flexible
      array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      
      Link: https://github.com/KSPP/linux/issues/202
      
      
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      c4585edf
    • Zijun Hu's avatar
      Bluetooth: Remove 3 repeated macro definitions · 94c603c2
      Zijun Hu authored
      
      Macros HCI_REQ_DONE, HCI_REQ_PEND and HCI_REQ_CANCELED are each
      defined twice in hci_request.h, so remove the duplicate definitions.
      
      Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      94c603c2
    • Zijun Hu's avatar
      Bluetooth: hci_conn: Remove a redundant check for HFP offload · d68d8a7a
      Zijun Hu authored
      
      Remove a redundant check !hdev->get_codec_config_data.
      
      Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      d68d8a7a
    • Gustavo A. R. Silva's avatar
      Bluetooth: L2CAP: Avoid -Wflex-array-member-not-at-end warnings · 1c08108f
      Gustavo A. R. Silva authored
      -Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
      ready to enable it globally.
      
      There are currently a couple of objects (`req` and `rsp`), in a couple
      of structures, that contain flexible structures (`struct l2cap_ecred_conn_req`
      and `struct l2cap_ecred_conn_rsp`), for example:
      
      struct l2cap_ecred_rsp_data {
              struct {
                      struct l2cap_ecred_conn_rsp rsp;
                      __le16 scid[L2CAP_ECRED_MAX_CID];
              } __packed pdu;
              int count;
      };
      
      in the struct above, `struct l2cap_ecred_conn_rsp` is a flexible
      structure:
      
      struct l2cap_ecred_conn_rsp {
              __le16 mtu;
              __le16 mps;
              __le16 credits;
              __le16 result;
              __le16 dcid[];
      };
      
      So, in order to avoid ending up with a flexible-array member in the
      middle of another structure, we use the `struct_group_tagged()` (and
      `__struct_group()` when the flexible structure is `__packed`) helper
      to separate the flexible array from the rest of the members in the
      flexible structure:
      
      struct l2cap_ecred_conn_rsp {
              struct_group_tagged(l2cap_ecred_conn_rsp_hdr, hdr,
      
      	... the rest of members
      
              );
              __le16 dcid[];
      };
      
      With the change described above, we now declare objects of the type of
      the tagged struct, in this example `struct l2cap_ecred_conn_rsp_hdr`,
      without embedding flexible arrays in the middle of other structures:
      
      struct l2cap_ecred_rsp_data {
              struct {
                      struct l2cap_ecred_conn_rsp_hdr rsp;
                      __le16 scid[L2CAP_ECRED_MAX_CID];
              } __packed pdu;
              int count;
      };
      
      Also, when the flexible-array member needs to be accessed, we use
      `container_of()` to retrieve a pointer to the flexible structure.
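
The header/flexible-array split and the `container_of()` step can be sketched in userspace as follows. The `demo_` names are hypothetical, the anonymous union is a hand-written approximation of what `struct_group_tagged()` produces, and the sketch assumes a layout without padding so that `scid[]` sits exactly where `dcid[]` would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CID 5

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Flexible structure: the fixed members are also reachable as a tagged header. */
struct demo_conn_rsp {
	union {
		struct {
			uint16_t mtu;
			uint16_t mps;
			uint16_t credits;
			uint16_t result;
		};
		struct demo_conn_rsp_hdr {
			uint16_t mtu;
			uint16_t mps;
			uint16_t credits;
			uint16_t result;
		} hdr;
	};
	uint16_t dcid[];
};

/* Only the header type is embedded, so no flexible array sits mid-structure. */
struct demo_rsp_data {
	struct {
		struct demo_conn_rsp_hdr rsp;
		uint16_t scid[DEMO_MAX_CID];
	} pdu;
	int count;
};

_Static_assert(offsetof(struct demo_conn_rsp, dcid) ==
	       sizeof(struct demo_conn_rsp_hdr),
	       "dcid[] must start right after the header");

/* Recover the flexible structure from the embedded header, then read dcid[0]. */
static uint16_t demo_first_dcid(struct demo_rsp_data *data)
{
	struct demo_conn_rsp *rsp =
		demo_container_of(&data->pdu.rsp, struct demo_conn_rsp, hdr);

	return rsp->dcid[0];
}
```

Here a value stored in `data.pdu.scid[0]` is what `demo_first_dcid()` returns, since the fixed-size array provides the storage that the flexible array indexes into.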
      
      We also use the `DEFINE_RAW_FLEX()` helper for a couple of on-stack
      definitions of a flexible structure where the size of the flexible-array
      member is known at compile-time.
      
      So, with these changes, fix the following warnings:
      net/bluetooth/l2cap_core.c:1260:45: warning: structure containing a
      flexible array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/l2cap_core.c:3740:45: warning: structure containing a
      flexible array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/l2cap_core.c:4999:45: warning: structure containing a
      flexible array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      net/bluetooth/l2cap_core.c:7116:47: warning: structure containing a
      flexible array member is not at the end of another structure
      [-Wflex-array-member-not-at-end]
      
      Link: https://github.com/KSPP/linux/issues/202
      
      
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      1c08108f
    • Iulia Tanasescu's avatar
      Bluetooth: ISO: Handle PA sync when no BIGInfo reports are generated · d356c924
      Iulia Tanasescu authored
      
      In case of a Broadcast Source that has PA enabled but no active BIG,
      a Broadcast Sink needs to establish PA sync and parse BASE from PA
      reports.
      
      This commit moves the allocation of a PA sync hcon from the BIGInfo
      advertising report event to the PA sync established event. After the
      first complete PA report, the hcon is notified to the ISO layer. A
      child socket is allocated and enqueued in the parent's accept queue.
      
      BIGInfo reports also need to be processed, to extract the encryption
      field and inform userspace. After the first BIGInfo report is received,
      the PA sync hcon is notified again to the ISO layer. Since a socket will
      be found this time, the socket state will transition to BT_CONNECTED and
      the userspace will be woken up using sk_state_change.
      
      Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      d356c924