  Jul 27, 2023
    •
      md: refactor idle/frozen_sync_thread() to fix deadlock · 130443d6
      Yu Kuai authored
      Our tests found the following deadlock in raid10:
      
      1) Issue a normal write, and such write failed:
      
        raid10_end_write_request
         set_bit(R10BIO_WriteError, &r10_bio->state)
         one_write_done
          reschedule_retry
      
        // later from md thread
        raid10d
         handle_write_completed
          list_add(&r10_bio->retry_list, &conf->bio_end_io_list)
      
        // later from md thread
        raid10d
         if (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
          list_move(conf->bio_end_io_list.prev, &tmp)
          r10_bio = list_first_entry(&tmp, struct r10bio, retry_list)
          raid_end_bio_io(r10_bio)
      
      Dependency chain 1: normal io is waiting for updating superblock
      
      2) Trigger a recovery:
      
        raid10_sync_request
         raise_barrier
      
      Dependency chain 2: sync thread is waiting for normal io
      
      3) echo idle/frozen to sync_action:
      
        action_store
         mddev_lock
          md_unregister_thread
           kthread_stop
      
      Dependency chain 3: drop 'reconfig_mutex' is waiting for sync thread
      
      4) md thread can't update superblock:
      
        raid10d
         md_check_recovery
          if (mddev_trylock(mddev))
           md_update_sb
      
      Dependency chain 4: update superblock is waiting for 'reconfig_mutex'
      
      Hence a cyclic dependency exists; to fix the problem, we must break
      one of the chains. Dependencies 1 and 2 can't be broken because they
      are foundational to the design. Dependency 4 could perhaps be broken
      if it were guaranteed that no io can be in flight, but that would
      require a new and seemingly complex mechanism. Dependency 3 is a good
      choice: idle/frozen only needs the sync thread to finish, which can be
      done asynchronously (this is already implemented), and
      'reconfig_mutex' is then no longer needed.
      
      This patch switches 'idle' and 'frozen' to wait for the sync thread to
      finish asynchronously, and it also adds a sequence counter that
      records how many times the sync thread has finished, so that 'idle'
      won't keep waiting on a newly started sync thread.
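      The sequence-counter idea can be modelled in userspace. The sketch
      below is only an illustration with assumed names (`sync_seq`,
      `sample_seq()`, `idle_sync_thread()`), not the kernel code: 'idle'
      samples the counter and waits only until it advances past that
      sample, so a sync thread started later can never make it wait again.

      ```c
      #include <assert.h>
      #include <pthread.h>

      /*
       * Minimal userspace model of the sequence counter described above.
       * All names here are illustrative assumptions, not the kernel
       * implementation.
       */
      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t  seq_cond = PTHREAD_COND_INITIALIZER;
      static unsigned int sync_seq;  /* times a sync thread has finished */

      /* Sample the counter before deciding to wait. */
      unsigned int sample_seq(void)
      {
              pthread_mutex_lock(&lock);
              unsigned int v = sync_seq;
              pthread_mutex_unlock(&lock);
              return v;
      }

      /* Called when a sync thread finishes. */
      void sync_thread_done(void)
      {
              pthread_mutex_lock(&lock);
              sync_seq++;
              pthread_cond_broadcast(&seq_cond);
              pthread_mutex_unlock(&lock);
      }

      /*
       * 'echo idle' path: wait only until the counter moves past the
       * value sampled when we started; a sync thread started *after*
       * that point bumps the counter further and cannot block us.
       */
      void idle_sync_thread(unsigned int seen)
      {
              pthread_mutex_lock(&lock);
              while (sync_seq == seen)
                      pthread_cond_wait(&seq_cond, &lock);
              pthread_mutex_unlock(&lock);
      }
      ```

      Because the wait condition is "counter advanced past my sample"
      rather than "no sync thread running", the waiter needs no lock held
      across the wait, which is what lets dependency chain 3 be broken.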
      
      Note that raid456 has a similar deadlock ([1]), and it has been
      verified [2] that this patch fixes that deadlock as well.
      
      [1] https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
      [2] https://lore.kernel.org/linux-raid/e9067438-d713-f5f3-0d3d-9e6b0e9efa0e@huaweicloud.com/
      
      
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230529132037.2124527-5-yukuai1@huaweicloud.com
    •
      md: add a mutex to synchronize idle and frozen in action_store() · 6f56f0c4
      Yu Kuai authored
      
      Currently, for idle and frozen, action_store() holds 'reconfig_mutex'
      and calls md_reap_sync_thread() to stop the sync thread; however,
      this causes a deadlock (explained in the next patch). To fix the
      problem, the following patch releases 'reconfig_mutex' and waits on
      'resync_wait', as md_set_readonly() and do_md_stop() do.
      
      Since action_store() sets/clears 'MD_RECOVERY_FROZEN' unconditionally,
      unexpected problems can arise: for example, 'frozen' has just set
      'MD_RECOVERY_FROZEN' and is still in progress, while 'idle' clears
      'MD_RECOVERY_FROZEN' and a new sync thread is started, which might
      starve the in-progress 'frozen'. A mutex is added to synchronize
      'idle' and 'frozen' in action_store().
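      The locking pattern can be modelled in userspace as below. The names
      (`sync_mutex`, `recovery_frozen`, `action_frozen()`, `action_idle()`)
      are illustrative assumptions; only the serialization mirrors the
      patch.

      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <stdbool.h>

      /* Hypothetical model of the added mutex, not the kernel code. */
      static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
      static bool recovery_frozen;

      /* 'echo frozen': set the flag and stop the sync thread, atomically
       * with respect to 'idle'. */
      void action_frozen(void)
      {
              pthread_mutex_lock(&sync_mutex);
              recovery_frozen = true;
              /* ... stop the sync thread ... */
              pthread_mutex_unlock(&sync_mutex);
      }

      /* 'echo idle': cannot clear the flag while 'frozen' is mid-flight. */
      void action_idle(void)
      {
              pthread_mutex_lock(&sync_mutex);
              recovery_frozen = false;
              /* ... wait for the sync thread to finish ... */
              pthread_mutex_unlock(&sync_mutex);
      }
      ```

      With the mutex, 'idle' can only observe 'frozen' either fully done or
      not started, never half-way through.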
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230529132037.2124527-4-yukuai1@huaweicloud.com
    •
      md: refactor action_store() for 'idle' and 'frozen' · 64e5e09a
      Yu Kuai authored
      
      Prepare to handle 'idle' and 'frozen' differently to fix a deadlock.
      There are no functional changes, except that MD_RECOVERY_RUNNING is
      checked again after 'reconfig_mutex' is held.
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230529132037.2124527-3-yukuai1@huaweicloud.com
    •
      Revert "md: unlock mddev before reap sync_thread in action_store" · a865b96c
      Yu Kuai authored
      
      This reverts commit 9dfbdafd.
      
      That commit introduced a defect: the sync_thread can still be running
      while MD_RECOVERY_RUNNING is cleared, which causes unexpected
      problems, for example:
      
      list_add corruption. prev->next should be next (ffff0001ac1daba0), but was ffff0000ce1a02a0. (prev=ffff0000ce1a02a0).
      Call trace:
       __list_add_valid+0xfc/0x140
       insert_work+0x78/0x1a0
       __queue_work+0x500/0xcf4
       queue_work_on+0xe8/0x12c
       md_check_recovery+0xa34/0xf30
       raid10d+0xb8/0x900 [raid10]
       md_thread+0x16c/0x2cc
       kthread+0x1a4/0x1ec
       ret_from_fork+0x10/0x18
      
      This happens because the work item is requeued while it is still on
      the workqueue:
      
      t1:			t2:
      action_store
       mddev_lock
        if (mddev->sync_thread)
         mddev_unlock
         md_unregister_thread
         // first sync_thread is done
      			md_check_recovery
      			 mddev_trylock
      			 /*
      			  * once MD_RECOVERY_DONE is set, new sync_thread
      			  * can start.
      			  */
      			 set_bit(MD_RECOVERY_RUNNING, &mddev->recovery)
      			 INIT_WORK(&mddev->del_work, md_start_sync)
      			 queue_work(md_misc_wq, &mddev->del_work)
      			  test_and_set_bit(WORK_STRUCT_PENDING_BIT, ...)
      			  // set pending bit
      			  insert_work
      			   list_add_tail
      			 mddev_unlock
         mddev_lock_nointr
         md_reap_sync_thread
         // MD_RECOVERY_RUNNING is cleared
       mddev_unlock
      
      t3:
      
      // before queued work started from t2
      md_check_recovery
       // MD_RECOVERY_RUNNING is not set, a new sync_thread can be started
       INIT_WORK(&mddev->del_work, md_start_sync)
        work->data = 0
        // work pending bit is cleared
       queue_work(md_misc_wq, &mddev->del_work)
        insert_work
         list_add_tail
         // list is corrupted
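      The final list_add_tail() on a node that is already linked is what
      corrupts the list. This can be shown with a minimal userspace copy of
      the kernel's circular doubly linked list (reimplemented here for
      illustration; not the kernel source):

      ```c
      #include <assert.h>
      #include <stddef.h>

      /*
       * Minimal model of the kernel's circular doubly linked list.
       * Queueing a work item a second time while it is still on the
       * workqueue list amounts to list_add_tail() on an already-linked
       * node.
       */
      struct list_head { struct list_head *next, *prev; };

      void init_list_head(struct list_head *h)
      {
              h->next = h;
              h->prev = h;
      }

      void list_add_tail(struct list_head *new, struct list_head *head)
      {
              new->prev = head->prev;
              new->next = head;
              head->prev->next = new;
              head->prev = new;
      }
      ```

      After the second insertion the node's next/prev point back at
      itself, so a traversal from head loops on the work item forever:
      exactly the kind of prev/next inconsistency that the
      __list_add_valid() check in the trace above reports.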
      
      The above commit is reverted to fix the problem; the deadlock it
      tried to fix will be addressed in the following patches.
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230529132037.2124527-2-yukuai1@huaweicloud.com