  1. Nov 23, 2023
    • tls: fix NULL deref on tls_sw_splice_eof() with empty record · 53f2cb49
      Jann Horn authored
      
      syzkaller discovered that if tls_sw_splice_eof() is executed as part of
      sendfile() when the plaintext/ciphertext sk_msg are empty, the send path
      gets confused because the empty ciphertext buffer does not have enough
      space for the encryption overhead. This causes tls_push_record() to go on
      the `split = true` path (which is only supposed to be used when interacting
      with an attached BPF program), and then get further confused and hit the
      tls_merge_open_record() path, which then assumes that there must be at
      least one populated buffer element, leading to a NULL deref.
      
      It is possible to have empty plaintext/ciphertext buffers if we previously
      bailed from tls_sw_sendmsg_locked() via the tls_trim_both_msgs() path.
      tls_sw_push_pending_record() already handles this case correctly; let's do
      the same check in tls_sw_splice_eof().
      
      Fixes: df720d28 ("tls/sw: Use splice_eof() to flush")
      Cc: stable@vger.kernel.org
      Reported-by: <syzbot+40d43509a099ea756317@syzkaller.appspotmail.com>
      Signed-off-by: Jann Horn <jannh@google.com>
      Link: https://lore.kernel.org/r/20231122214447.675768-1-jannh@google.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      53f2cb49
  2. Nov 22, 2023
    • net/smc: avoid data corruption caused by decline · e6d71b43
      D. Wythe authored
      
      We found a data corruption issue during testing of SMC-R on Redis
      applications.
      
      The benchmark has a low probability of reporting a strange error as
      shown below.
      
      "Error: Protocol error, got "\xe2" as reply type byte"
      
      Finally, we found that the retrieved error data was as follows:
      
      0xE2 0xD4 0xC3 0xD9 0x04 0x00 0x2C 0x20 0xA6 0x56 0x00 0x16 0x3E 0x0C
      0xCB 0x04 0x02 0x01 0x00 0x00 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
      0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xE2
      
      It is quite obvious that this is an SMC DECLINE message, which means that
      the application received an SMC protocol message.
      We found that this was caused by the following situation:
      
      client                  server
              ¦  clc proposal
              ------------->
              ¦  clc accept
              <-------------
              ¦  clc confirm
              ------------->
      wait llc confirm
                              send llc confirm
              ¦failed llc confirm
              ¦   x------
      (after 2s)timeout
                              wait llc confirm rsp
      
      wait decline
      
      (after 1s) timeout
                              (after 2s) timeout
              ¦   decline
              -------------->
              ¦   decline
              <--------------
      
      As a result, a decline message was sent in the implementation, and this
      message was read from TCP by the already-fallback connection.
      
      This patch doubles the client timeout to 2x the server value.
      With this simple change, the Decline messages should never cross or
      collide (during Confirm link timeout).
      
      This issue requires an immediate solution, since the protocol updates
      involve a more long-term solution.
      
      Fixes: 0fb0b02b ("net/smc: adapt SMC client code to use the LLC flow")
      Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
      Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6d71b43
  3. Nov 21, 2023
  4. Nov 20, 2023
  5. Nov 19, 2023
  6. Nov 17, 2023
    • rxrpc: Defer the response to a PING ACK until we've parsed it · 1a01319f
      David Howells authored
      
      Defer the generation of a PING RESPONSE ACK in response to a PING ACK until
      we've parsed the PING ACK, so that we pick up any changes to the packet
      queue and can update ackinfo.
      
      This is also applied to an ACK generated in response to an ACK with the
      REQUEST_ACK flag set.
      
      Note that whilst the problem was added in commit 248f219c, it didn't
      really matter at that point because the ACK was proposed in softirq mode
      and generated asynchronously later in process context, taking the latest
      values at the time.  But this fix is only needed since the move to parse
      incoming packets in an I/O thread rather than in softirq and generate the
      ACK at point of proposal (b0346843).
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a01319f
    • rxrpc: Fix RTT determination to use any ACK as a source · 3798680f
      David Howells authored
      
      Fix RTT determination to be able to use any type of ACK as the response
      from which RTT can be calculated provided its ack.serial is non-zero and
      matches the serial number of an outgoing DATA or ACK packet.  This
      shouldn't be limited to REQUESTED-type ACKs as these can have other types
      substituted for them for things like duplicate or out-of-order packets.
      
      Fixes: 4700c4d8 ("rxrpc: Fix loss of RTT samples due to interposed ACK")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3798680f
    • rxrpc: Fix some minor issues with bundle tracing · 0c3bd086
      David Howells authored
      
      Fix some superficial issues with the tracing of rxrpc_bundle structs,
      including:
      
       (1) Set the debug_id when the bundle is allocated rather than when it is
           set up so that the "NEW" trace line displays the correct bundle ID.
      
       (2) Show the refcount when emitting the "FREE" traceline.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c3bd086
  7. Nov 16, 2023
  8. Nov 15, 2023
  9. Nov 14, 2023
    • netfilter: nf_tables: split async and sync catchall in two functions · 8837ba3e
      Pablo Neira Ayuso authored
      
      list_for_each_entry_safe() does not work for the async case, which runs
      under RCU. Therefore, split the GC logic for catchall into two functions,
      one for each of the sync and async GC variants.
      
      The catchall sync GC variant never sees a _DEAD bit set, so this handling
      is removed in that case. Moreover, allocate the GC sync batch via
      GFP_KERNEL.
      
      Fixes: 93995bf4 ("netfilter: nf_tables: remove catchall element in GC sync path")
      Reported-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      8837ba3e
    • netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test · 28628fa9
      Jozsef Kadlecsik authored
      Linkui Xiao reported that there's a race condition when ipset swap and destroy are
      called, which can lead to a crash in add/del/test element operations. Swap then
      destroy are usual operations to replace a set with another one in a production
      system. The issue can in some cases be reproduced with the script:
      
      ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
      ipset add hash_ip1 172.20.0.0/16
      ipset add hash_ip1 192.168.0.0/16
      iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
      while [ 1 ]
      do
      	# ... Ongoing traffic...
              ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
              ipset add hash_ip2 172.20.0.0/16
              ipset swap hash_ip1 hash_ip2
              ipset destroy hash_ip2
              sleep 0.05
      done
      
      In the race case the possible order of the operations is
      
      	CPU0			CPU1
      	ip_set_test
      				ipset swap hash_ip1 hash_ip2
      				ipset destroy hash_ip2
      	hash_net_kadt
      
      Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
      is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
      removed it, hash_net_kadt crashes.
      
      The fix is to force ip_set_swap() to wait for all readers to finish accessing the
      old set pointers by calling synchronize_rcu().
      
      The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.
      
      v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
          ip_set_destroy() unnecessarily when all sets are destroyed.
      v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held
          and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.
          So there's no need to extend the rcu read locked area in ipset itself.
      
      Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
      
      
      Reported-by: Linkui Xiao <xiaolinkui@kylinos.cn>
      Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      28628fa9
    • netfilter: nf_tables: bogus ENOENT when destroying element which does not exist · a7d5a955
      Pablo Neira Ayuso authored
      
      The destroy element command bogusly reports ENOENT when a set element
      does not exist. ENOENT errors are skipped; however, err is still set
      and propagated to userspace.
      
       # nft destroy element ip raw BLACKLIST { 1.2.3.4 }
       Error: Could not process rule: No such file or directory
       destroy element ip raw BLACKLIST { 1.2.3.4 }
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
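
The pattern behind this report can be modelled in user space. The sketch below is not the nf_tables code or the patch itself: demo_del_one(), demo_destroy_elems() and the clear_skipped_err switch are hypothetical names introduced only to show how an error that is skipped inside a loop can still leak out through the final return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-element delete: negative keys stand in for
 * "element is not in the set" and fail with -ENOENT. */
static int demo_del_one(int key)
{
    return key < 0 ? -ENOENT : 0;
}

/* Model of a "destroy" loop that tolerates missing elements.
 * With clear_skipped_err == false, the skipped -ENOENT is still held in
 * err when the loop ends and is returned to the caller. */
static int demo_destroy_elems(const int *keys, size_t n, bool clear_skipped_err)
{
    int err = 0;

    assert(keys != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        err = demo_del_one(keys[i]);
        if (err == -ENOENT) {
            if (clear_skipped_err)
                err = 0;    /* the skipped error must not survive */
            continue;
        }
        if (err < 0)
            break;          /* any other error is a real failure */
    }
    return err;
}
```

Calling demo_destroy_elems() on a list whose last key is missing returns -ENOENT in the uncorrected mode and 0 once the skipped error is cleared.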
      
      Fixes: f80a612d ("netfilter: nf_tables: add support to destroy operation")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      a7d5a955
    • netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval() · c301f098
      Dan Carpenter authored
      
      The problem is in nft_byteorder_eval() where we are iterating through a
      loop and writing to dst[0], dst[1], dst[2] and so on...  On each
      iteration we are writing 8 bytes.  But dst[] is an array of u32 so each
      element only has space for 4 bytes.  That means that every iteration
      overwrites part of the previous element.
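
The overlap described above can be reproduced in user space. This is a minimal sketch, not the kernel function: store_u64_buggy(), store_u64_fixed() and load_u64() are hypothetical helpers, and memcpy() is used only to keep the demo free of aliasing problems. The buffer is an array of 32-bit words that must hold 2 * n words for n 64-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Buggy indexing: value i lands at 32-bit element i, so every 8-byte
 * store overwrites the second half of the previous value. */
static void store_u64_buggy(uint32_t *dst, const uint64_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        memcpy(&dst[i], &src[i], sizeof(uint64_t));
}

/* Corrected indexing: value i lands at 64-bit slot i, which is 32-bit
 * element 2 * i, so consecutive stores no longer overlap. */
static void store_u64_fixed(uint32_t *dst, const uint64_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        memcpy(&dst[2 * i], &src[i], sizeof(uint64_t));
}

/* Read back the 64-bit value stored in slot i. */
static uint64_t load_u64(const uint32_t *dst, size_t i)
{
    uint64_t v;

    memcpy(&v, &dst[2 * i], sizeof(v));
    return v;
}
```

With a single value both variants give the same result, which matches the observation in this message that the bug stays hidden when only one element is written.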
      
      I spotted this bug while reviewing commit caf3ef74 ("netfilter:
      nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
      issue.  I think that the reason we have not detected this bug in testing
      is that most of the time we only write one element.
      
      Fixes: ce1e7989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      c301f098
    • netfilter: nf_conntrack_bridge: initialize err to 0 · a44af08e
      Linkui Xiao authored
      
      K2CI reported a problem:
      
      	consume_skb(skb);
      	return err;
      [nf_br_ip_fragment() error]  uninitialized symbol 'err'.
      
      err is not initialized. Because returning 0 is expected, initialize err
      to 0.
      
      Fixes: 3c171f49 ("netfilter: bridge: add connection tracking system")
      Reported-by: k2ci <kernel-bot@kylinos.cn>
      Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      a44af08e
    • netfilter: nft_set_rbtree: Remove unused variable nft_net · 67059b61
      Yang Li authored
      
      The code that uses nft_net has been removed, and the nft_pernet function
      is merely obtaining a reference to shared data through the net pointer.
      The content of the net pointer is not modified or changed, so both of
      them should be removed.
      
      Silence the warning:
      net/netfilter/nft_set_rbtree.c:627:26: warning: variable ‘nft_net’ set but not used
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7103
      
      
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      67059b61
    • af_unix: fix use-after-free in unix_stream_read_actor() · 4b7b4926
      Eric Dumazet authored
      
      syzbot reported the following crash [1]
      
      After releasing unix socket lock, u->oob_skb can be changed
      by another thread. We must temporarily increase skb refcount
      to make sure this other thread will not free the skb under us.
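
The idea of pinning an object with an extra reference before dropping a lock can be sketched with C11 atomics. This is an illustration under stated assumptions, not the af_unix code: struct demo_buf, demo_get() and demo_put() are hypothetical, and a released flag stands in for actually freeing the buffer so the effect is observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted buffer. */
struct demo_buf {
    atomic_int refcnt;
    bool released;  /* set when the last reference is dropped */
    int payload;
};

/* Take an extra reference; done while the lock is still held. */
static void demo_get(struct demo_buf *b)
{
    atomic_fetch_add(&b->refcnt, 1);
}

/* Drop one reference; the caller that drops the last one releases. */
static void demo_put(struct demo_buf *b)
{
    int old = atomic_fetch_sub(&b->refcnt, 1);

    assert(old > 0);
    if (old == 1)
        b->released = true;
}
```

A reader that calls demo_get() before unlocking can keep using the buffer even if another thread drops the queue's reference in the meantime; the buffer is released only after the reader's own demo_put().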
      
      [1]
      
      BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
      
      CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a627b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      unix_stream_recv_urg net/unix/af_unix.c:2587 [inline]
      unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666
      unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903
      sock_recvmsg_nosec net/socket.c:1044 [inline]
      sock_recvmsg+0xe2/0x170 net/socket.c:1066
      ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803
      ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845
      __sys_recvmsg+0x114/0x1e0 net/socket.c:2875
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7fc67492c559
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559
      RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004
      RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340
      R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388
      </TASK>
      
      Allocated by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
      ____kasan_slab_free mm/kasan/common.c:236 [inline]
      ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
      kasan_slab_free include/linux/kasan.h:164 [inline]
      slab_free_hook mm/slub.c:1800 [inline]
      slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
      slab_free mm/slub.c:3809 [inline]
      kmem_cache_free+0xf8/0x340 mm/slub.c:3831
      kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015
      __kfree_skb net/core/skbuff.c:1073 [inline]
      consume_skb net/core/skbuff.c:1288 [inline]
      consume_skb+0xdf/0x170 net/core/skbuff.c:1282
      queue_oob net/unix/af_unix.c:2178 [inline]
      unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88801f3b9c80
      which belongs to the cache skbuff_head_cache of size 240
      The buggy address is located 68 bytes inside of
      freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
      
      The buggy address belongs to the physical page:
      page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9
      flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431
      set_page_owner include/linux/page_owner.h:31 [inline]
      post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537
      prep_new_page mm/page_alloc.c:1544 [inline]
      get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312
      __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568
      alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
      alloc_slab_page mm/slub.c:1870 [inline]
      allocate_slab+0x251/0x380 mm/slub.c:2017
      new_slab mm/slub.c:2070 [inline]
      ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223
      __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
      __slab_alloc_node mm/slub.c:3375 [inline]
      slab_alloc_node mm/slub.c:3468 [inline]
      kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      page last free stack trace:
      reset_page_owner include/linux/page_owner.h:24 [inline]
      free_pages_prepare mm/page_alloc.c:1137 [inline]
      free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347
      free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487
      __unfreeze_partials+0x21d/0x240 mm/slub.c:2655
      qlink_free mm/kasan/quarantine.c:168 [inline]
      qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
      kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
      __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      slab_alloc mm/slub.c:3486 [inline]
      __kmem_cache_alloc_lru mm/slub.c:3493 [inline]
      kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502
      vm_area_dup+0x21/0x2f0 kernel/fork.c:500
      __split_vma+0x17d/0x1070 mm/mmap.c:2365
      split_vma mm/mmap.c:2437 [inline]
      vma_modify+0x25d/0x450 mm/mmap.c:2472
      vma_modify_flags include/linux/mm.h:3271 [inline]
      mprotect_fixup+0x228/0xc80 mm/mprotect.c:635
      do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809
      __do_sys_mprotect mm/mprotect.c:830 [inline]
      __se_sys_mprotect mm/mprotect.c:827 [inline]
      __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Memory state around the buggy address:
      ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ^
      ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
      ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      
      Fixes: 876c14ad ("af_unix: fix holding spinlock in oob handling")
      Reported-and-tested-by: <syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Rao Shoaib <rao.shoaib@oracle.com>
      Reviewed-by: Rao shoaib <rao.shoaib@oracle.com>
      Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com
      
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      4b7b4926
  10. Nov 13, 2023
    • tipc: Fix kernel-infoleak due to uninitialized TLV value · fb317eb2
      Shigeru Yoshida authored
      
      KMSAN reported the following kernel-infoleak issue:
      
      =====================================================
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
      BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       copy_to_user_iter lib/iov_iter.c:24 [inline]
       iterate_ubuf include/linux/iov_iter.h:29 [inline]
       iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       iterate_and_advance include/linux/iov_iter.h:271 [inline]
       _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       copy_to_iter include/linux/uio.h:197 [inline]
       simple_copy_to_iter net/core/datagram.c:532 [inline]
       __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420
       skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546
       skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
       netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967
       sock_recvmsg_nosec net/socket.c:1044 [inline]
       sock_recvmsg net/socket.c:1066 [inline]
       __sys_recvfrom+0x476/0x860 net/socket.c:2246
       __do_sys_recvfrom net/socket.c:2264 [inline]
       __se_sys_recvfrom net/socket.c:2260 [inline]
       __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651
       alloc_skb include/linux/skbuff.h:1286 [inline]
       tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline]
       tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170
       tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324
       genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
       genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067
       netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545
       genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076
       netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
       netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       ____sys_sendmsg+0x997/0xd60 net/socket.c:2588
       ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
       __sys_sendmsg net/socket.c:2671 [inline]
       __do_sys_sendmsg net/socket.c:2680 [inline]
       __se_sys_sendmsg net/socket.c:2678 [inline]
       __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Bytes 34-35 of 36 are uninitialized
      Memory access of size 36 starts at ffff88802d464a00
      Data copied to user address 00007ff55033c0a0
      
      CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      =====================================================
      
      tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is
      calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and
      the length of TLV value passed as an argument, and aligns the result to a
      multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
      
      If the size of struct tlv_desc plus the length of TLV value is not aligned,
      the current implementation leaves the remaining bytes uninitialized. This
      is the cause of the above kernel-infoleak issue.
      
      This patch resolves this issue by clearing data up to an aligned size.
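
The size arithmetic and the padding problem can be illustrated with a small user-space model. The DEMO_ macros and demo_add_tlv() below are stand-ins written for this sketch; they mirror the behaviour described above (descriptor size plus value length, rounded up to a multiple of 4) but are not the kernel definitions, and byte-order conversion of the header fields is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in descriptor: two 16-bit fields, 4 bytes in total. */
struct demo_tlv_desc {
    uint16_t tlv_len;
    uint16_t tlv_type;
};

#define DEMO_TLV_ALIGNTO ((size_t)4)
#define DEMO_TLV_ALIGN(len) \
    (((len) + DEMO_TLV_ALIGNTO - 1) & ~(DEMO_TLV_ALIGNTO - 1))
#define DEMO_TLV_LENGTH(datalen) (sizeof(struct demo_tlv_desc) + (datalen))
#define DEMO_TLV_SPACE(datalen) DEMO_TLV_ALIGN(DEMO_TLV_LENGTH(datalen))

/* Append one TLV to buf. Returns the aligned number of bytes consumed,
 * or 0 if the TLV does not fit. The whole aligned region is cleared
 * first, so the padding after the value never carries stale bytes. */
static size_t demo_add_tlv(unsigned char *buf, size_t room, uint16_t type,
                           const void *data, uint16_t datalen)
{
    struct demo_tlv_desc desc;
    size_t space = DEMO_TLV_SPACE(datalen);

    assert(buf != NULL);
    if ((size_t)datalen > UINT16_MAX - sizeof(desc) || space > room)
        return 0;

    memset(buf, 0, space);
    desc.tlv_type = type;
    desc.tlv_len = (uint16_t)DEMO_TLV_LENGTH(datalen);
    memcpy(buf, &desc, sizeof(desc));
    if (datalen)
        memcpy(buf + sizeof(desc), data, datalen);
    return space;
}
```

For a 5-byte value the descriptor and value occupy 9 bytes while the aligned space is 12, so 3 padding bytes exist; without the memset() those bytes would keep whatever the buffer held before.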
      
      Fixes: d0796d1e ("tipc: convert legacy nl bearer dump to nl compat")
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fb317eb2
    • net: gso_test: support CONFIG_MAX_SKB_FRAGS up to 45 · e6daf129
      Willem de Bruijn authored
      
      The test allocs a single page to hold all the frag_list skbs. This
      is insufficient on kernels with CONFIG_MAX_SKB_FRAGS=45, due to the
      increased skb_shared_info frags[] array length.
      
              gso_test_func: ASSERTION FAILED at net/core/gso_test.c:210
              Expected alloc_size <= ((1UL) << 12), but
                  alloc_size == 5075 (0x13d3)
                  ((1UL) << 12) == 4096 (0x1000)
      
      Simplify the logic. Just allocate a page for each frag_list skb.
      
      Fixes: 4688ecb1 ("net: expand skb_segment unit test with frag_list coverage")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6daf129
  11. Nov 10, 2023
    • net: set SOCK_RCU_FREE before inserting socket into hashtable · 871019b2
      Stanislav Fomichev authored
      
      We've started to see the following kernel traces:
      
       WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
      
       Call Trace:
        <IRQ>
        __bpf_skc_lookup+0x10d/0x120
        bpf_sk_lookup+0x48/0xd0
        bpf_sk_lookup_tcp+0x19/0x20
        bpf_prog_<redacted>+0x37c/0x16a3
        cls_bpf_classify+0x205/0x2e0
        tcf_classify+0x92/0x160
        __netif_receive_skb_core+0xe52/0xf10
        __netif_receive_skb_list_core+0x96/0x2b0
        napi_complete_done+0x7b5/0xb70
        <redacted>_poll+0x94/0xb0
        net_rx_action+0x163/0x1d70
        __do_softirq+0xdc/0x32e
        asm_call_irq_on_stack+0x12/0x20
        </IRQ>
        do_softirq_own_stack+0x36/0x50
        do_softirq+0x44/0x70
      
      __inet_hash can race with lockless (rcu) readers on the other cpus:
      
        __inet_hash
          __sk_nulls_add_node_rcu
          <- (bpf triggers here)
          sock_set_flag(SOCK_RCU_FREE)
      
      Let's move the SOCK_RCU_FREE part up a bit, before we insert
      the socket into hashtables. Note that the race is really harmless;
      the bpf callers are handling this situation (where listener socket
      doesn't have SOCK_RCU_FREE set) correctly, so the only
      annoyance is a WARN_ONCE.
      
      More details from Eric regarding SOCK_RCU_FREE timeline:
      
      Commit 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under
      synflood") added SOCK_RCU_FREE. At that time, the precise location of
      sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling
      __inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested
      at dismantle time.
      
      Commit 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      started checking SOCK_RCU_FREE _after_ the lookup to infer whether
      the refcount has been taken care of.
      
      Fixes: 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. Nov 09, 2023
    • net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP · f1a3b283
      Eric Dumazet authored
      
      syzbot was able to trigger the following report by providing
      a TCA_FQ_WEIGHTS attribute that is too small [1]

      The fix is to use NLA_POLICY_EXACT_LEN() to ensure that user space
      provided the correct sizes.
      
      Apply the same fix to TCA_FQ_PRIOMAP.
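
A minimal userspace sketch of an exact-length check of the kind described above: an attribute is rejected unless its payload length matches the expected size exactly, so a short payload is never read past its end. `toy_attr`, `toy_load_weights` and `TOY_BANDS` are hypothetical names, and the three-entry `int32_t` layout is an assumption made for illustration; this is not the sch_fq code or the netlink policy machinery.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BANDS 3  /* assumed number of weights, for illustration only */

/* Hypothetical attribute: a payload pointer plus its length in bytes. */
struct toy_attr {
    size_t len;
    const void *data;
};

/*
 * Copy the weights only when the payload has exactly the expected size.
 * Returns 0 on success, -1 if the length does not match.
 */
static int toy_load_weights(const struct toy_attr *attr,
                            int32_t out[TOY_BANDS])
{
    if (attr->len != TOY_BANDS * sizeof(int32_t))
        return -1;  /* too short or too long: reject before reading */

    memcpy(out, attr->data, attr->len);
    return 0;
}
```

Rejecting the attribute up front keeps the consumer from interpreting bytes the sender never provided.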
      
      [1]
      BUG: KMSAN: uninit-value in fq_load_weights net/sched/sch_fq.c:960 [inline]
      BUG: KMSAN: uninit-value in fq_change+0x1348/0x2fe0 net/sched/sch_fq.c:1071
      fq_load_weights net/sched/sch_fq.c:960 [inline]
      fq_change+0x1348/0x2fe0 net/sched/sch_fq.c:1071
      fq_init+0x68e/0x780 net/sched/sch_fq.c:1159
      qdisc_create+0x12f3/0x1be0 net/sched/sch_api.c:1326
      tc_modify_qdisc+0x11ef/0x2c20
      rtnetlink_rcv_msg+0x16a6/0x1840 net/core/rtnetlink.c:6558
      netlink_rcv_skb+0x371/0x650 net/netlink/af_netlink.c:2545
      rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6576
      netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
      netlink_unicast+0xf47/0x1250 net/netlink/af_netlink.c:1368
      netlink_sendmsg+0x1238/0x13d0 net/netlink/af_netlink.c:1910
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg net/socket.c:745 [inline]
      ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2588
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2642
      __sys_sendmsg net/socket.c:2671 [inline]
      __do_sys_sendmsg net/socket.c:2680 [inline]
      __se_sys_sendmsg net/socket.c:2678 [inline]
      __x64_sys_sendmsg+0x307/0x490 net/socket.c:2678
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
      slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x5e9/0xb10 mm/slub.c:3523
      kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560
      __alloc_skb+0x318/0x740 net/core/skbuff.c:651
      alloc_skb include/linux/skbuff.h:1286 [inline]
      netlink_alloc_large_skb net/netlink/af_netlink.c:1214 [inline]
      netlink_sendmsg+0xb34/0x13d0 net/netlink/af_netlink.c:1885
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg net/socket.c:745 [inline]
      ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2588
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2642
      __sys_sendmsg net/socket.c:2671 [inline]
      __do_sys_sendmsg net/socket.c:2680 [inline]
      __se_sys_sendmsg net/socket.c:2678 [inline]
      __x64_sys_sendmsg+0x307/0x490 net/socket.c:2678
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      CPU: 1 PID: 5001 Comm: syz-executor300 Not tainted 6.6.0-syzkaller-12401-g8f6f76a6a29f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      
      Fixes: 29f834aa ("net_sched: sch_fq: add 3 bands and WRR scheduling")
      Fixes: 49e7265f ("net_sched: sch_fq: add TCA_FQ_WEIGHTS attribute")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20231107160440.1992526-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: kcm: fill in MODULE_DESCRIPTION() · 31356547
      Jakub Kicinski authored

      W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
      
      Link: https://lore.kernel.org/r/20231108020305.537293-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: act_ct: Always fill offloading tuple iifidx · 9bc64bd0
      Vlad Buslov authored
      
      The referenced commit doesn't always set iifidx when offloading the flow to
      hardware. Fix the following cases:

      - nf_conn_act_ct_ext_fill() is called before the extension is created with
      nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
      an unspecified iifidx when the connection is offloaded after only a single
      original-direction packet has been processed by the tc data path. Always fill
      the new nf_conn_act_ct_ext instance after creating it in
      nf_conn_act_ct_ext_add().

      - Offloading of unidirectional UDP NEW connections is now supported, but the
      ct flow iifidx field is not updated when the connection is promoted to
      bidirectional, which can result in the reply-direction iifidx being zero when
      refreshing the connection. Fill in the extension and update the flow iifidx
      before calling flow_offload_refresh().
      
      Fixes: 9795ded7 ("net/sched: act_ct: Fill offloading tuple iifidx")
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Fixes: 6a9bad00 ("net/sched: act_ct: offload UDP NEW connections")
      Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  13. Nov 08, 2023
    • netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses · 80abbe8a
      Florian Westphal authored
      
      The ipv6 redirect target was derived from the ipv4 one, i.e. it's
      identical to a 'dnat' with the first (primary) address assigned to the
      network interface.  The code has been moved around to make it usable
      from nf_tables too, but it's still the same as it was back when this
      was added in 2012.

      IPv6, however, has different types of addresses; if the 'wrong' address
      comes first, the redirection does not work.

      In Daniel's case, the addresses are:
        inet6 ::ffff:192 ...
        inet6 2a01: ...

      ... so the function attempts to redirect to the mapped address.

      Add more checks before the address is deemed correct:
      1. If the packet's daddr is scoped, search for a scoped address too
      2. skip tentative addresses
      3. skip mapped addresses
      
      Use the first address that appears to match our needs.
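
The three checks above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the nf_nat_redirect code: `toy_ifaddr` and `toy_pick_redirect_addr` are hypothetical names, "scoped" is simplified to "link-local", and the tentative state is modelled as a plain boolean.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interface address entry. */
struct toy_ifaddr {
    struct in6_addr addr;
    bool tentative;
};

/*
 * Return the first address that passes all checks, or NULL if none does.
 * "Scoped" is approximated as link-local here (an assumption).
 */
static const struct in6_addr *
toy_pick_redirect_addr(const struct toy_ifaddr *list, size_t n,
                       const struct in6_addr *daddr)
{
    bool want_scoped = IN6_IS_ADDR_LINKLOCAL(daddr);

    for (size_t i = 0; i < n; i++) {
        const struct in6_addr *a = &list[i].addr;

        if (want_scoped && !IN6_IS_ADDR_LINKLOCAL(a))
            continue;  /* 1. scoped daddr needs a scoped address */
        if (list[i].tentative)
            continue;  /* 2. skip tentative addresses */
        if (IN6_IS_ADDR_V4MAPPED(a))
            continue;  /* 3. skip v4-mapped addresses */
        return a;
    }
    return NULL;
}
```

With a list that starts with a `::ffff:` mapped address, the sketch skips it and returns the first usable global address instead.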
      
      Reported-by: Daniel Huhardeaux <tech@tootai.net>
      Closes: https://lore.kernel.org/netfilter/71be06b8-6aa0-4cf9-9e0b-e2839b01b22f@tootai.net/
      Fixes: 115e23ac ("netfilter: ip6tables: add REDIRECT target")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: xt_recent: fix (increase) ipv6 literal buffer length · 7b308feb
      Maciej Żenczykowski authored
      
      in6_pton() supports 'low-32-bit dot-decimal representation'
      (this is useful with DNS64/NAT64 networks for example):
      
        # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:1.2.3.4 > /proc/self/net/xt_recent/DEFAULT
        # cat /proc/self/net/xt_recent/DEFAULT
        src=aaaa:bbbb:cccc:dddd:eeee:ffff:0102:0304 ttl: 0 last_seen: 9733848829 oldest_pkt: 1 9733848829
      
      but the provided buffer is too short:
      
        # echo +aaaa:bbbb:cccc:dddd:eeee:ffff:255.255.255.255 > /proc/self/net/xt_recent/DEFAULT
        -bash: echo: write error: Invalid argument
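
A small userspace sketch of the length arithmetic behind this: an IPv6 literal whose last 32 bits are written in dot-decimal form can be longer than the all-hex form, so a buffer sized for the hex form is too short. The `toy_*` names are hypothetical, and `inet_pton()` stands in for the kernel's `in6_pton()`; the kernel buffer itself is not reproduced here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Longest all-hex form: 8 groups of 4 digits plus 7 colons. */
static const char toy_hex_form[] =
    "aaaa:bbbb:cccc:dddd:eeee:ffff:0102:0304";

/* Longest mixed form: 6 hex groups, then a full dotted quad. */
static const char toy_dotted_form[] =
    "aaaa:bbbb:cccc:dddd:eeee:ffff:255.255.255.255";

/* Does the literal, including its terminating NUL, fit in bufsize bytes? */
static int toy_fits(const char *literal, size_t bufsize)
{
    return strlen(literal) + 1 <= bufsize;
}

/* Is the literal accepted as an IPv6 address by the C library? */
static int toy_parses(const char *literal)
{
    struct in6_addr addr;

    return inet_pton(AF_INET6, literal, &addr) == 1;
}
```

Here `toy_fits(toy_dotted_form, sizeof(toy_hex_form))` is false while `toy_fits(toy_dotted_form, INET6_ADDRSTRLEN)` is true, which mirrors why the longer literal in the example above was rejected.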
      
      Fixes: 079aa88f ("netfilter: xt_recent: IPv6 support")
      Signed-off-by: Maciej Żenczykowski <zenczykowski@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • ipvs: add missing module descriptions · 17cd01e4
      Florian Westphal authored
      
      W=1 builds warn on missing MODULE_DESCRIPTION, add them.
      
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: remove catchall element in GC sync path · 93995bf4
      Pablo Neira Ayuso authored
      
      The expired catchall element is not deactivated and removed from the GC sync
      path. This path holds the mutex, so just call nft_setelem_data_deactivate()
      and nft_setelem_catchall_remove() before queueing the GC work.
      
      Fixes: 4a9e12ea ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC")
      Reported-by: lonial con <kongln9170@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: add missing module descriptions · 94090b23
      Florian Westphal authored
      
      W=1 builds warn on missing MODULE_DESCRIPTION, add them.
      
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt() · 34c4effa
      Shigeru Yoshida authored
      
      KMSAN reported the following uninit-value access issue:
      
      =====================================================
      BUG: KMSAN: uninit-value in virtio_transport_recv_pkt+0x1dfb/0x26a0 net/vmw_vsock/virtio_transport_common.c:1421
       virtio_transport_recv_pkt+0x1dfb/0x26a0 net/vmw_vsock/virtio_transport_common.c:1421
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      Uninit was stored to memory at:
       virtio_transport_space_update net/vmw_vsock/virtio_transport_common.c:1274 [inline]
       virtio_transport_recv_pkt+0x1ee8/0x26a0 net/vmw_vsock/virtio_transport_common.c:1415
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      Uninit was created at:
       slab_post_alloc_hook+0x105/0xad0 mm/slab.h:767
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5a2/0xaf0 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:559
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:650
       alloc_skb include/linux/skbuff.h:1286 [inline]
       virtio_vsock_alloc_skb include/linux/virtio_vsock.h:66 [inline]
       virtio_transport_alloc_skb+0x90/0x11e0 net/vmw_vsock/virtio_transport_common.c:58
       virtio_transport_reset_no_sock net/vmw_vsock/virtio_transport_common.c:957 [inline]
       virtio_transport_recv_pkt+0x1279/0x26a0 net/vmw_vsock/virtio_transport_common.c:1387
       vsock_loopback_work+0x3bb/0x5a0 net/vmw_vsock/vsock_loopback.c:120
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0xff6/0x1e60 kernel/workqueue.c:2703
       worker_thread+0xeca/0x14d0 kernel/workqueue.c:2784
       kthread+0x3cc/0x520 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      CPU: 1 PID: 10664 Comm: kworker/1:5 Not tainted 6.6.0-rc3-00146-g9f3ebbef746f #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      Workqueue: vsock-loopback vsock_loopback_work
      =====================================================
      
      The following simple reproducer can cause the issue described above:
      
      #include <sys/socket.h>
      #include <linux/vm_sockets.h>

      int main(void)
      {
        int sock;
        struct sockaddr_vm addr = {
          .svm_family = AF_VSOCK,
          .svm_cid = VMADDR_CID_ANY,
          .svm_port = 1234,
        };
      
        sock = socket(AF_VSOCK, SOCK_STREAM, 0);
        connect(sock, (struct sockaddr *)&addr, sizeof(addr));
        return 0;
      }
      
      This issue occurs because the `buf_alloc` and `fwd_cnt` fields of the
      `struct virtio_vsock_hdr` are not initialized when a new skb is allocated
      in `virtio_transport_init_hdr()`. This patch resolves the issue by
      initializing these fields during allocation.
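
A minimal userspace sketch of the pattern described above: every header field that is later read gets an explicit value when the header is allocated. `toy_vsock_hdr` and `toy_alloc_hdr` are hypothetical and far smaller than the real `struct virtio_vsock_hdr`; `malloc()` stands in for the skb allocation, which likewise returns uninitialized memory.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced header; only the fields needed for the sketch. */
struct toy_vsock_hdr {
    uint32_t len;
    uint32_t buf_alloc;
    uint32_t fwd_cnt;
};

/*
 * Allocate a header and initialize all of its fields.
 * malloc() does not zero memory, so buf_alloc and fwd_cnt would hold
 * indeterminate values if they were not set here.
 */
static struct toy_vsock_hdr *toy_alloc_hdr(uint32_t len)
{
    struct toy_vsock_hdr *hdr = malloc(sizeof(*hdr));

    if (hdr == NULL)
        return NULL;

    hdr->len = len;
    hdr->buf_alloc = 0;
    hdr->fwd_cnt = 0;
    return hdr;
}
```

The caller owns the returned header and releases it with `free()`.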
      
      Fixes: 71dc9ec9 ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
      Reported-and-tested-by: <syzbot+0c8ce1da0ac31abbadcd@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=0c8ce1da0ac31abbadcd
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Link: https://lore.kernel.org/r/20231104150531.257952-1-syoshida@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  14. Nov 07, 2023
  15. Nov 06, 2023