Skip to content
Snippets Groups Projects
  1. Nov 14, 2023
    • Dan Carpenter's avatar
      netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval() · c301f098
      Dan Carpenter authored
      
      The problem is in nft_byteorder_eval() where we are iterating through a
      loop and writing to dst[0], dst[1], dst[2] and so on...  On each
      iteration we are writing 8 bytes.  But dst[] is an array of u32 so each
      element only has space for 4 bytes.  That means that every iteration
      overwrites part of the previous element.
      
      I spotted this bug while reviewing commit caf3ef74 ("netfilter:
      nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
      issue.  I think that the reason we have not detected this bug in testing
      is that most of time we only write one element.
      
      Fixes: ce1e7989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      c301f098
    • Linkui Xiao's avatar
      netfilter: nf_conntrack_bridge: initialize err to 0 · a44af08e
      Linkui Xiao authored
      
      K2CI reported a problem:
      
      	consume_skb(skb);
      	return err;
      [nf_br_ip_fragment() error]  uninitialized symbol 'err'.
      
      err is not initialized, because returning 0 is expected, initialize err
      to 0.
      
      Fixes: 3c171f49 ("netfilter: bridge: add connection tracking system")
      Reported-by: default avatark2ci <kernel-bot@kylinos.cn>
      Signed-off-by: default avatarLinkui Xiao <xiaolinkui@kylinos.cn>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      a44af08e
    • Yang Li's avatar
      netfilter: nft_set_rbtree: Remove unused variable nft_net · 67059b61
      Yang Li authored
      
      The code that uses nft_net has been removed, and the nft_pernet function
      is merely obtaining a reference to shared data through the net pointer.
      The content of the net pointer is not modified or changed, so both of
      them should be removed.
      
      silence the warning:
      net/netfilter/nft_set_rbtree.c:627:26: warning: variable ‘nft_net’ set but not used
      
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7103
      
      
      Signed-off-by: default avatarYang Li <yang.lee@linux.alibaba.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      67059b61
    • Eric Dumazet's avatar
      af_unix: fix use-after-free in unix_stream_read_actor() · 4b7b4926
      Eric Dumazet authored
      
      syzbot reported the following crash [1]
      
      After releasing unix socket lock, u->oob_skb can be changed
      by another thread. We must temporarily increase skb refcount
      to make sure this other thread will not free the skb under us.
      
      [1]
      
      BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
      
      CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a627b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      unix_stream_recv_urg net/unix/af_unix.c:2587 [inline]
      unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666
      unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903
      sock_recvmsg_nosec net/socket.c:1044 [inline]
      sock_recvmsg+0xe2/0x170 net/socket.c:1066
      ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803
      ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845
      __sys_recvmsg+0x114/0x1e0 net/socket.c:2875
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7fc67492c559
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559
      RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004
      RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340
      R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388
      </TASK>
      
      Allocated by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
      ____kasan_slab_free mm/kasan/common.c:236 [inline]
      ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
      kasan_slab_free include/linux/kasan.h:164 [inline]
      slab_free_hook mm/slub.c:1800 [inline]
      slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
      slab_free mm/slub.c:3809 [inline]
      kmem_cache_free+0xf8/0x340 mm/slub.c:3831
      kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015
      __kfree_skb net/core/skbuff.c:1073 [inline]
      consume_skb net/core/skbuff.c:1288 [inline]
      consume_skb+0xdf/0x170 net/core/skbuff.c:1282
      queue_oob net/unix/af_unix.c:2178 [inline]
      unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88801f3b9c80
      which belongs to the cache skbuff_head_cache of size 240
      The buggy address is located 68 bytes inside of
      freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
      
      The buggy address belongs to the physical page:
      page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9
      flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431
      set_page_owner include/linux/page_owner.h:31 [inline]
      post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537
      prep_new_page mm/page_alloc.c:1544 [inline]
      get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312
      __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568
      alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
      alloc_slab_page mm/slub.c:1870 [inline]
      allocate_slab+0x251/0x380 mm/slub.c:2017
      new_slab mm/slub.c:2070 [inline]
      ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223
      __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
      __slab_alloc_node mm/slub.c:3375 [inline]
      slab_alloc_node mm/slub.c:3468 [inline]
      kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      page last free stack trace:
      reset_page_owner include/linux/page_owner.h:24 [inline]
      free_pages_prepare mm/page_alloc.c:1137 [inline]
      free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347
      free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487
      __unfreeze_partials+0x21d/0x240 mm/slub.c:2655
      qlink_free mm/kasan/quarantine.c:168 [inline]
      qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
      kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
      __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      slab_alloc mm/slub.c:3486 [inline]
      __kmem_cache_alloc_lru mm/slub.c:3493 [inline]
      kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502
      vm_area_dup+0x21/0x2f0 kernel/fork.c:500
      __split_vma+0x17d/0x1070 mm/mmap.c:2365
      split_vma mm/mmap.c:2437 [inline]
      vma_modify+0x25d/0x450 mm/mmap.c:2472
      vma_modify_flags include/linux/mm.h:3271 [inline]
      mprotect_fixup+0x228/0xc80 mm/mprotect.c:635
      do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809
      __do_sys_mprotect mm/mprotect.c:830 [inline]
      __se_sys_mprotect mm/mprotect.c:827 [inline]
      __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Memory state around the buggy address:
      ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ^
      ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
      ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      
      Fixes: 876c14ad ("af_unix: fix holding spinlock in oob handling")
      Reported-and-tested-by: default avatar <syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Rao Shoaib <rao.shoaib@oracle.com>
      Reviewed-by: default avatarRao shoaib <rao.shoaib@oracle.com>
      Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      4b7b4926
  2. Nov 13, 2023
    • Shigeru Yoshida's avatar
      tipc: Fix kernel-infoleak due to uninitialized TLV value · fb317eb2
      Shigeru Yoshida authored
      
      KMSAN reported the following kernel-infoleak issue:
      
      =====================================================
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
      BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       copy_to_user_iter lib/iov_iter.c:24 [inline]
       iterate_ubuf include/linux/iov_iter.h:29 [inline]
       iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       iterate_and_advance include/linux/iov_iter.h:271 [inline]
       _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       copy_to_iter include/linux/uio.h:197 [inline]
       simple_copy_to_iter net/core/datagram.c:532 [inline]
       __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420
       skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546
       skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
       netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967
       sock_recvmsg_nosec net/socket.c:1044 [inline]
       sock_recvmsg net/socket.c:1066 [inline]
       __sys_recvfrom+0x476/0x860 net/socket.c:2246
       __do_sys_recvfrom net/socket.c:2264 [inline]
       __se_sys_recvfrom net/socket.c:2260 [inline]
       __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651
       alloc_skb include/linux/skbuff.h:1286 [inline]
       tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline]
       tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170
       tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324
       genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
       genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067
       netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545
       genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076
       netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
       netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       ____sys_sendmsg+0x997/0xd60 net/socket.c:2588
       ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
       __sys_sendmsg net/socket.c:2671 [inline]
       __do_sys_sendmsg net/socket.c:2680 [inline]
       __se_sys_sendmsg net/socket.c:2678 [inline]
       __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Bytes 34-35 of 36 are uninitialized
      Memory access of size 36 starts at ffff88802d464a00
      Data copied to user address 00007ff55033c0a0
      
      CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      =====================================================
      
      tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is
      calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and
      the length of TLV value passed as an argument, and aligns the result to a
      multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
      
      If the size of struct tlv_desc plus the length of TLV value is not aligned,
      the current implementation leaves the remaining bytes uninitialized. This
      is the cause of the above kernel-infoleak issue.
      
      This patch resolves this issue by clearing data up to an aligned size.
      
      Fixes: d0796d1e ("tipc: convert legacy nl bearer dump to nl compat")
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb317eb2
    • Willem de Bruijn's avatar
      net: gso_test: support CONFIG_MAX_SKB_FRAGS up to 45 · e6daf129
      Willem de Bruijn authored
      
      The test allocs a single page to hold all the frag_list skbs. This
      is insufficient on kernels with CONFIG_MAX_SKB_FRAGS=45, due to the
      increased skb_shared_info frags[] array length.
      
              gso_test_func: ASSERTION FAILED at net/core/gso_test.c:210
              Expected alloc_size <= ((1UL) << 12), but
                  alloc_size == 5075 (0x13d3)
                  ((1UL) << 12) == 4096 (0x1000)
      
      Simplify the logic. Just allocate a page for each frag_list skb.
      
      Fixes: 4688ecb1 ("net: expand skb_segment unit test with frag_list coverage")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e6daf129
  3. Nov 10, 2023
    • Stanislav Fomichev's avatar
      net: set SOCK_RCU_FREE before inserting socket into hashtable · 871019b2
      Stanislav Fomichev authored
      
      We've started to see the following kernel traces:
      
       WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
      
       Call Trace:
        <IRQ>
        __bpf_skc_lookup+0x10d/0x120
        bpf_sk_lookup+0x48/0xd0
        bpf_sk_lookup_tcp+0x19/0x20
        bpf_prog_<redacted>+0x37c/0x16a3
        cls_bpf_classify+0x205/0x2e0
        tcf_classify+0x92/0x160
        __netif_receive_skb_core+0xe52/0xf10
        __netif_receive_skb_list_core+0x96/0x2b0
        napi_complete_done+0x7b5/0xb70
        <redacted>_poll+0x94/0xb0
        net_rx_action+0x163/0x1d70
        __do_softirq+0xdc/0x32e
        asm_call_irq_on_stack+0x12/0x20
        </IRQ>
        do_softirq_own_stack+0x36/0x50
        do_softirq+0x44/0x70
      
      __inet_hash can race with lockless (rcu) readers on the other cpus:
      
        __inet_hash
          __sk_nulls_add_node_rcu
          <- (bpf triggers here)
          sock_set_flag(SOCK_RCU_FREE)
      
      Let's move the SOCK_RCU_FREE part up a bit, before we are inserting
      the socket into hashtables. Note, that the race is really harmless;
      the bpf callers are handling this situation (where listener socket
      doesn't have SOCK_RCU_FREE set) correctly, so the only
      annoyance is a WARN_ONCE.
      
      More details from Eric regarding SOCK_RCU_FREE timeline:
      
      Commit 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under
      synflood") added SOCK_RCU_FREE. At that time, the precise location of
      sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling
      __inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested
      at dismantle time.
      
      Commit 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      started checking SOCK_RCU_FREE _after_ the lookup to infer whether
      the refcount has been taken care of.
      
      Fixes: 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      871019b2
  4. Nov 09, 2023
    • Eric Dumazet's avatar
      net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP · f1a3b283
      Eric Dumazet authored
      
      syzbot was able to trigger the following report while providing
      too small TCA_FQ_WEIGHTS attribute [1]
      
      Fix is to use NLA_POLICY_EXACT_LEN() to ensure user space
      provided correct sizes.
      
      Apply the same fix to TCA_FQ_PRIOMAP.
      
      [1]
      BUG: KMSAN: uninit-value in fq_load_weights net/sched/sch_fq.c:960 [inline]
      BUG: KMSAN: uninit-value in fq_change+0x1348/0x2fe0 net/sched/sch_fq.c:1071
      fq_load_weights net/sched/sch_fq.c:960 [inline]
      fq_change+0x1348/0x2fe0 net/sched/sch_fq.c:1071
      fq_init+0x68e/0x780 net/sched/sch_fq.c:1159
      qdisc_create+0x12f3/0x1be0 net/sched/sch_api.c:1326
      tc_modify_qdisc+0x11ef/0x2c20
      rtnetlink_rcv_msg+0x16a6/0x1840 net/core/rtnetlink.c:6558
      netlink_rcv_skb+0x371/0x650 net/netlink/af_netlink.c:2545
      rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6576
      netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
      netlink_unicast+0xf47/0x1250 net/netlink/af_netlink.c:1368
      netlink_sendmsg+0x1238/0x13d0 net/netlink/af_netlink.c:1910
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg net/socket.c:745 [inline]
      ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2588
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2642
      __sys_sendmsg net/socket.c:2671 [inline]
      __do_sys_sendmsg net/socket.c:2680 [inline]
      __se_sys_sendmsg net/socket.c:2678 [inline]
      __x64_sys_sendmsg+0x307/0x490 net/socket.c:2678
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
      slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x5e9/0xb10 mm/slub.c:3523
      kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560
      __alloc_skb+0x318/0x740 net/core/skbuff.c:651
      alloc_skb include/linux/skbuff.h:1286 [inline]
      netlink_alloc_large_skb net/netlink/af_netlink.c:1214 [inline]
      netlink_sendmsg+0xb34/0x13d0 net/netlink/af_netlink.c:1885
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg net/socket.c:745 [inline]
      ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2588
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2642
      __sys_sendmsg net/socket.c:2671 [inline]
      __do_sys_sendmsg net/socket.c:2680 [inline]
      __se_sys_sendmsg net/socket.c:2678 [inline]
      __x64_sys_sendmsg+0x307/0x490 net/socket.c:2678
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      CPU: 1 PID: 5001 Comm: syz-executor300 Not tainted 6.6.0-syzkaller-12401-g8f6f76a6a29f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      
      Fixes: 29f834aa ("net_sched: sch_fq: add 3 bands and WRR scheduling")
      Fixes: 49e7265f ("net_sched: sch_fq: add TCA_FQ_WEIGHTS attribute")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarJamal Hadi <Salim&lt;jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20231107160440.1992526-1-edumazet@google.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f1a3b283
    • Jakub Kicinski's avatar
      net: kcm: fill in MODULE_DESCRIPTION() · 31356547
      Jakub Kicinski authored
      W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
      
      Link: https://lore.kernel.org/r/20231108020305.537293-1-kuba@kernel.org
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      31356547
    • Vlad Buslov's avatar
      net/sched: act_ct: Always fill offloading tuple iifidx · 9bc64bd0
      Vlad Buslov authored
      
      Referenced commit doesn't always set iifidx when offloading the flow to
      hardware. Fix the following cases:
      
      - nf_conn_act_ct_ext_fill() is called before extension is created with
      nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
      unspecified iifidx when connection is offloaded after only single
      original-direction packet has been processed by tc data path. Always fill
      the new nf_conn_act_ct_ext instance after creating it in
      nf_conn_act_ct_ext_add().
      
      - Offloading of unidirectional UDP NEW connections is now supported, but ct
      flow iifidx field is not updated when connection is promoted to
      bidirectional which can result reply-direction iifidx to be zero when
      refreshing the connection. Fill in the extension and update flow iifidx
      before calling flow_offload_refresh().
      
      Fixes: 9795ded7 ("net/sched: act_ct: Fill offloading tuple iifidx")
      Reviewed-by: default avatarPaul Blakey <paulb@nvidia.com>
      Signed-off-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Fixes: 6a9bad00 ("net/sched: act_ct: offload UDP NEW connections")
      Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9bc64bd0
  5. Nov 08, 2023
    • Florian Westphal's avatar
      netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses · 80abbe8a
      Florian Westphal authored
      
      The ipv6 redirect target was derived from the ipv4 one, i.e. its
      identical to a 'dnat' with the first (primary) address assigned to the
      network interface.  The code has been moved around to make it usable
      from nf_tables too, but its still the same as it was back when this
      was added in 2012.
      
      IPv6, however, has different types of addresses, if the 'wrong' address
      comes first the redirection does not work.
      
      In Daniels case, the addresses are:
        inet6 ::ffff:192 ...
        inet6 2a01: ...
      
      ... so the function attempts to redirect to the mapped address.
      
      Add more checks before the address is deemed correct:
      1. If the packets' daddr is scoped, search for a scoped address too
      2. skip tentative addresses
      3. skip mapped addresses
      
      Use the first address that appears to match our needs.
      
      Reported-by: default avatarDaniel Huhardeaux <tech@tootai.net>
      Closes: https://lore.kernel.org/netfilter/71be06b8-6aa0-4cf9-9e0b-e2839b01b22f@tootai.net/
      
      
      Fixes: 115e23ac ("netfilter: ip6tables: add REDIRECT target")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      80abbe8a
    • Maciej Żenczykowski's avatar
      netfilter: xt_recent: fix (increase) ipv6 literal buffer length · 7b308feb
      Maciej Żenczykowski authored
      
      in6_pton() supports 'low-32-bit dot-decimal representation'
      (this is useful with DNS64/NAT64 networks for example):
      
        # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:1.2.3.4 > /proc/self/net/xt_recent/DEFAULT
        # cat /proc/self/net/xt_recent/DEFAULT
        src=aaaa:bbbb:cccc:dddd:eeee:ffff:0102:0304 ttl: 0 last_seen: 9733848829 oldest_pkt: 1 9733848829
      
      but the provided buffer is too short:
      
        # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:255.255.255.255 > /proc/self/net/xt_recent/DEFAULT
        -bash: echo: write error: Invalid argument
      
      Fixes: 079aa88f ("netfilter: xt_recent: IPv6 support")
      Signed-off-by: default avatarMaciej Żenczykowski <zenczykowski@gmail.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      7b308feb
    • Florian Westphal's avatar
      ipvs: add missing module descriptions · 17cd01e4
      Florian Westphal authored
      
      W=1 builds warn on missing MODULE_DESCRIPTION, add them.
      
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      17cd01e4
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: remove catchall element in GC sync path · 93995bf4
      Pablo Neira Ayuso authored
      
      The expired catchall element is not deactivated and removed from GC sync
      path. This path holds mutex so just call nft_setelem_data_deactivate()
      and nft_setelem_catchall_remove() before queueing the GC work.
      
      Fixes: 4a9e12ea ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC")
      Reported-by: default avatarlonial con <kongln9170@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      93995bf4
    • Florian Westphal's avatar
      netfilter: add missing module descriptions · 94090b23
      Florian Westphal authored
      
      W=1 builds warn on missing MODULE_DESCRIPTION, add them.
      
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      94090b23
    • Shigeru Yoshida's avatar
      virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt() · 34c4effa
      Shigeru Yoshida authored
      
      KMSAN reported the following uninit-value access issue:
      
      =====================================================
      BUG: KMSAN: uninit-value in virtio_transport_recv_pkt+0x1dfb/0x26a0 net/vmw_vsock/virtio_transport_common.c:1421
       virtio_transport_recv_pkt+0x1dfb/0x26a0 net/vmw_vsock/virtio_transport_common.c:1421
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      Uninit was stored to memory at:
       virtio_transport_space_update net/vmw_vsock/virtio_transport_common.c:1274 [inline]
       virtio_transport_recv_pkt+0x1ee8/0x26a0 net/vmw_vsock/virtio_transport_common.c:1415
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      Uninit was created at:
       slab_post_alloc_hook+0x105/0xad0 mm/slab.h:767
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5a2/0xaf0 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:559
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:650
       alloc_skb include/linux/skbuff.h:1286 [inline]
       virtio_vsock_alloc_skb include/linux/virtio_vsock.h:66 [inline]
       virtio_transport_alloc_skb+0x90/0x11e0 net/vmw_vsock/virtio_transport_common.c:58
       virtio_transport_reset_no_sock net/vmw_vsock/virtio_transport_common.c:957 [inline]
       virtio_transport_recv_pkt+0x1279/0x26a0 net/vmw_vsock/virtio_transport_common.c:1387
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      CPU: 1 PID: 10664 Comm: kworker/1:5 Not tainted 6.6.0-rc3-00146-g9f3ebbef746f #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      Workqueue: vsock-loopback vsock_loopback_work
      =====================================================
      
      The following simple reproducer can cause the issue described above:
      
      int main(void)
      {
        int sock;
        struct sockaddr_vm addr = {
          .svm_family = AF_VSOCK,
          .svm_cid = VMADDR_CID_ANY,
          .svm_port = 1234,
        };
      
        sock = socket(AF_VSOCK, SOCK_STREAM, 0);
        connect(sock, (struct sockaddr *)&addr, sizeof(addr));
        return 0;
      }
      
      This issue occurs because the `buf_alloc` and `fwd_cnt` fields of the
      `struct virtio_vsock_hdr` are not initialized when a new skb is allocated
      in `virtio_transport_init_hdr()`. This patch resolves the issue by
      initializing these fields during allocation.
      
      Fixes: 71dc9ec9 ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
      Reported-and-tested-by: default avatar <syzbot+0c8ce1da0ac31abbadcd@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=0c8ce1da0ac31abbadcd
      
      
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Link: https://lore.kernel.org/r/20231104150531.257952-1-syoshida@redhat.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      34c4effa
  6. Nov 07, 2023
  7. Nov 06, 2023
    • D. Wythe's avatar
      net/smc: put sk reference if close work was canceled · aa96fbd6
      D. Wythe authored
      
      Note that we always hold a reference to sock when attempting
      to submit close_work. Therefore, if we have successfully
      canceled close_work from pending, we MUST release that reference
      to avoid potential leaks.
      
      Fixes: 42bfba9e ("net/smc: immediate termination for SMCD link groups")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aa96fbd6
    • D. Wythe's avatar
      net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc · c5bf605b
      D. Wythe authored
      
      This patch re-fix the issues mentioned by commit 22a825c5
      ("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()").
      
      Blocking sending message do solve the issues though, but it also
      prevents the peer to receive the final message. Besides, in logic,
      whether the sndbuf_desc is NULL or not have no impact on the processing
      of cdc message sending.
      
      Hence that, this patch allows the cdc message sending but to check the
      sndbuf_desc with care in smc_cdc_tx_handler().
      
      Fixes: 22a825c5 ("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c5bf605b
    • D. Wythe's avatar
      net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT · 5211c972
      D. Wythe authored
      
      Considering scenario:
      
      				smc_cdc_rx_handler
      __smc_release
      				sock_set_flag
      smc_close_active()
      sock_set_flag
      
      __set_bit(DEAD)			__set_bit(DONE)
      
      Dues to __set_bit is not atomic, the DEAD or DONE might be lost.
      if the DEAD flag lost, the state SMC_CLOSED  will be never be reached
      in smc_close_passive_work:
      
      if (sock_flag(sk, SOCK_DEAD) &&
      	smc_close_sent_any_close(conn)) {
      	sk->sk_state = SMC_CLOSED;
      } else {
      	/* just shutdown, but not yet closed locally */
      	sk->sk_state = SMC_APPFINCLOSEWAIT;
      }
      
      Replace sock_set_flags or __set_bit to set_bit will fix this problem.
      Since set_bit is atomic.
      
      Fixes: b38d7324 ("smc: socket closing and linkgroup cleanup")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5211c972
    • Kuniyuki Iwashima's avatar
      tcp: Fix SYN option room calculation for TCP-AO. · 0a8e987d
      Kuniyuki Iwashima authored
      
      When building SYN packet in tcp_syn_options(), MSS, TS, WS, and
      SACKPERM are used without checking the remaining bytes in the
      options area.
      
      To keep that logic as is, we limit the TCP-AO MAC length in
      tcp_ao_parse_crypto().  Currently, the limit is calculated as below.
      
        MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED
                             - TCPOLEN_WSCALE_ALIGNED
                             - TCPOLEN_SACKPERM_ALIGNED
      
      This looks confusing as (1) we pack SACKPERM into the leading
      2-bytes of the aligned 12-bytes of TS and (2) TCPOLEN_MSS_ALIGNED
      is not used.  Fortunately, the calculated limit is not wrong as
      TCPOLEN_SACKPERM_ALIGNED and TCPOLEN_MSS_ALIGNED are the same value.
      
      However, we should use the proper constant in the formula.
      
        MAX_TCP_OPTION_SPACE - TCPOLEN_MSS_ALIGNED
                             - TCPOLEN_TSTAMP_ALIGNED
                             - TCPOLEN_WSCALE_ALIGNED
      
      Fixes: 4954f17d ("net/tcp: Introduce TCP_AO setsockopt()s")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a8e987d
    • Jamal Hadi Salim's avatar
      net, sched: Fix SKB_NOT_DROPPED_YET splat under debug config · 40cb2fdf
      Jamal Hadi Salim authored
      
      Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
      reproducer [2]. Problem seems to be that classifiers clear 'struct
      tcf_result::drop_reason', thereby triggering the warning in
      __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
      
      Fixed by disambiguating a legit error from a verdict with a bogus drop_reason
      
      [1]
      WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
      Modules linked in:
      CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      RIP: 0010:kfree_skb_reason+0x38/0x130
      [...]
      Call Trace:
       <IRQ>
       __netif_receive_skb_core.constprop.0+0x837/0xdb0
       __netif_receive_skb_one_core+0x3c/0x70
       process_backlog+0x95/0x130
       __napi_poll+0x25/0x1b0
       net_rx_action+0x29b/0x310
       __do_softirq+0xc0/0x29b
       do_softirq+0x43/0x60
       </IRQ>
      
      [2]
      
      ip link add name veth0 type veth peer name veth1
      ip link set dev veth0 up
      ip link set dev veth1 up
      tc qdisc add dev veth1 clsact
      tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
      mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
      
      Ido reported:
      
        [...] getting the following splat [1] with CONFIG_DEBUG_NET=y and this
        reproducer [2]. Problem seems to be that classifiers clear 'struct
        tcf_result::drop_reason', thereby triggering the warning in
        __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0). [...]
      
        [1]
        WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
        Modules linked in:
        CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
        RIP: 0010:kfree_skb_reason+0x38/0x130
        [...]
        Call Trace:
         <IRQ>
         __netif_receive_skb_core.constprop.0+0x837/0xdb0
         __netif_receive_skb_one_core+0x3c/0x70
         process_backlog+0x95/0x130
         __napi_poll+0x25/0x1b0
         net_rx_action+0x29b/0x310
         __do_softirq+0xc0/0x29b
         do_softirq+0x43/0x60
         </IRQ>
      
        [2]
        #!/bin/bash
      
        ip link add name veth0 type veth peer name veth1
        ip link set dev veth0 up
        ip link set dev veth1 up
        tc qdisc add dev veth1 clsact
        tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
        mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
      
      What happens is that inside most classifiers the tcf_result is copied over
      from a filter template e.g. *res = f->res which then implicitly overrides
      the prior SKB_DROP_REASON_TC_{INGRESS,EGRESS} default drop code which was
      set via sch_handle_{ingress,egress}() for kfree_skb_reason().
      
      Commit text above copied verbatim from Daniel. The general idea of the patch
      is not very different from what Ido originally posted but instead done at the
      cls_api codepath.
      
      Fixes: 54a59aed ("net, sched: Make tc-related drop reason more flexible")
      Reported-by: default avatarIdo Schimmel <idosch@idosch.org>
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/netdev/ZTjY959R+AFXf3Xy@shredder
      
      
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      40cb2fdf
  8. Nov 03, 2023
  9. Nov 02, 2023
  10. Nov 01, 2023
    • felix's avatar
      SUNRPC: Fix RPC client cleaned up the freed pipefs dentries · bfca5fb4
      felix authored
      
      RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir()
      workqueue,which takes care about pipefs superblock locking.
      In some special scenarios, when kernel frees the pipefs sb of the
      current client and immediately alloctes a new pipefs sb,
      rpc_remove_pipedir function would misjudge the existence of pipefs
      sb which is not the one it used to hold. As a result,
      the rpc_remove_pipedir would clean the released freed pipefs dentries.
      
      To fix this issue, rpc_remove_pipedir should check whether the
      current pipefs sb is consistent with the original pipefs sb.
      
      This error can be catched by KASAN:
      =========================================================
      [  250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200
      [  250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503
      [  250.500549] Workqueue: events rpc_free_client_work
      [  250.501001] Call Trace:
      [  250.502880]  kasan_report+0xb6/0xf0
      [  250.503209]  ? dget_parent+0x195/0x200
      [  250.503561]  dget_parent+0x195/0x200
      [  250.503897]  ? __pfx_rpc_clntdir_depopulate+0x10/0x10
      [  250.504384]  rpc_rmdir_depopulate+0x1b/0x90
      [  250.504781]  rpc_remove_client_dir+0xf5/0x150
      [  250.505195]  rpc_free_client_work+0xe4/0x230
      [  250.505598]  process_one_work+0x8ee/0x13b0
      ...
      [   22.039056] Allocated by task 244:
      [   22.039390]  kasan_save_stack+0x22/0x50
      [   22.039758]  kasan_set_track+0x25/0x30
      [   22.040109]  __kasan_slab_alloc+0x59/0x70
      [   22.040487]  kmem_cache_alloc_lru+0xf0/0x240
      [   22.040889]  __d_alloc+0x31/0x8e0
      [   22.041207]  d_alloc+0x44/0x1f0
      [   22.041514]  __rpc_lookup_create_exclusive+0x11c/0x140
      [   22.041987]  rpc_mkdir_populate.constprop.0+0x5f/0x110
      [   22.042459]  rpc_create_client_dir+0x34/0x150
      [   22.042874]  rpc_setup_pipedir_sb+0x102/0x1c0
      [   22.043284]  rpc_client_register+0x136/0x4e0
      [   22.043689]  rpc_new_client+0x911/0x1020
      [   22.044057]  rpc_create_xprt+0xcb/0x370
      [   22.044417]  rpc_create+0x36b/0x6c0
      ...
      [   22.049524] Freed by task 0:
      [   22.049803]  kasan_save_stack+0x22/0x50
      [   22.050165]  kasan_set_track+0x25/0x30
      [   22.050520]  kasan_save_free_info+0x2b/0x50
      [   22.050921]  __kasan_slab_free+0x10e/0x1a0
      [   22.051306]  kmem_cache_free+0xa5/0x390
      [   22.051667]  rcu_core+0x62c/0x1930
      [   22.051995]  __do_softirq+0x165/0x52a
      [   22.052347]
      [   22.052503] Last potentially related work creation:
      [   22.052952]  kasan_save_stack+0x22/0x50
      [   22.053313]  __kasan_record_aux_stack+0x8e/0xa0
      [   22.053739]  __call_rcu_common.constprop.0+0x6b/0x8b0
      [   22.054209]  dentry_free+0xb2/0x140
      [   22.054540]  __dentry_kill+0x3be/0x540
      [   22.054900]  shrink_dentry_list+0x199/0x510
      [   22.055293]  shrink_dcache_parent+0x190/0x240
      [   22.055703]  do_one_tree+0x11/0x40
      [   22.056028]  shrink_dcache_for_umount+0x61/0x140
      [   22.056461]  generic_shutdown_super+0x70/0x590
      [   22.056879]  kill_anon_super+0x3a/0x60
      [   22.057234]  rpc_kill_sb+0x121/0x200
      
      Fixes: 0157d021 ("SUNRPC: handle RPC client pipefs dentries by network namespace aware routines")
      Signed-off-by: default avatarfelix <fuzhen5@huawei.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
      bfca5fb4
Loading