  1. Nov 18, 2023
  2. Oct 04, 2023
  3. Aug 24, 2023
  4. Jun 26, 2023
  5. Jun 23, 2023
  6. Apr 06, 2023
    • fsdax: force clear dirty mark if CoW · f76b3a32
      Shiyang Ruan authored
      XFS allows CoW on non-shared extents to combat fragmentation[1].  The old
      non-shared extent may have been written through mmap before, in which
      case its dax entry is marked dirty.
      
      This results in a WARNing:
      
      [   28.512349] ------------[ cut here ]------------
      [   28.512622] WARNING: CPU: 2 PID: 5255 at fs/dax.c:390 dax_insert_entry+0x342/0x390
      [   28.513050] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables
      [   28.515462] CPU: 2 PID: 5255 Comm: fsstress Kdump: loaded Not tainted 6.3.0-rc1-00001-g85e1481e19c1-dirty #117
      [   28.515902] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014
      [   28.516307] RIP: 0010:dax_insert_entry+0x342/0x390
      [   28.516536] Code: 30 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 45 20 48 83 c0 01 e9 e2 fe ff ff 48 8b 45 20 48 83 c0 01 e9 cd fe ff ff <0f> 0b e9 53 ff ff ff 48 8b 7c 24 08 31 f6 e8 1b 61 a1 00 eb 8c 48
      [   28.517417] RSP: 0000:ffffc9000845fb18 EFLAGS: 00010086
      [   28.517721] RAX: 0000000000000053 RBX: 0000000000000155 RCX: 000000000018824b
      [   28.518113] RDX: 0000000000000000 RSI: ffffffff827525a6 RDI: 00000000ffffffff
      [   28.518515] RBP: ffffea00062092c0 R08: 0000000000000000 R09: ffffc9000845f9c8
      [   28.518905] R10: 0000000000000003 R11: ffffffff82ddb7e8 R12: 0000000000000155
      [   28.519301] R13: 0000000000000000 R14: 000000000018824b R15: ffff88810cfa76b8
      [   28.519703] FS:  00007f14a0c94740(0000) GS:ffff88817bd00000(0000) knlGS:0000000000000000
      [   28.520148] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   28.520472] CR2: 00007f14a0c8d000 CR3: 000000010321c004 CR4: 0000000000770ee0
      [   28.520863] PKRU: 55555554
      [   28.521043] Call Trace:
      [   28.521219]  <TASK>
      [   28.521368]  dax_fault_iter+0x196/0x390
      [   28.521595]  dax_iomap_pte_fault+0x19b/0x3d0
      [   28.521852]  __xfs_filemap_fault+0x234/0x2b0
      [   28.522116]  __do_fault+0x30/0x130
      [   28.522334]  do_fault+0x193/0x340
      [   28.522586]  __handle_mm_fault+0x2d3/0x690
      [   28.522975]  handle_mm_fault+0xe6/0x2c0
      [   28.523259]  do_user_addr_fault+0x1bc/0x6f0
      [   28.523521]  exc_page_fault+0x60/0x140
      [   28.523763]  asm_exc_page_fault+0x22/0x30
      [   28.524001] RIP: 0033:0x7f14a0b589ca
      [   28.524225] Code: c5 fe 7f 07 c5 fe 7f 47 20 c5 fe 7f 47 40 c5 fe 7f 47 60 c5 f8 77 c3 66 0f 1f 84 00 00 00 00 00 40 0f b6 c6 48 89 d1 48 89 fa <f3> aa 48 89 d0 c5 f8 77 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90
      [   28.525198] RSP: 002b:00007fff1dea1c98 EFLAGS: 00010202
      [   28.525505] RAX: 000000000000001e RBX: 000000000014a000 RCX: 0000000000006046
      [   28.525895] RDX: 00007f14a0c82000 RSI: 000000000000001e RDI: 00007f14a0c8d000
      [   28.526290] RBP: 000000000000006f R08: 0000000000000004 R09: 000000000014a000
      [   28.526681] R10: 0000000000000008 R11: 0000000000000246 R12: 028f5c28f5c28f5c
      [   28.527067] R13: 8f5c28f5c28f5c29 R14: 0000000000011046 R15: 00007f14a0c946c0
      [   28.527449]  </TASK>
      [   28.527600] ---[ end trace 0000000000000000 ]---
      
      
      To be able to delete this entry, clear its dirty mark before
      invalidate_inode_pages2_range().
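
The ordering described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_entry`, `toy_invalidate()` and `toy_cow_leaves_stale_entry()` are hypothetical names, not kernel functions, and the model assumes that invalidation skips an entry that is still marked dirty, which is what the commit message implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dax mapping entry. */
struct toy_entry {
    bool present;
    bool dirty;
};

/*
 * Assumed behavior: invalidation removes only clean entries.
 * Returns true if the slot is empty afterwards.
 */
bool toy_invalidate(struct toy_entry *e)
{
    if (e->present && e->dirty)
        return false;           /* dirty entry is left in place */
    e->present = false;
    return true;
}

/*
 * Models the CoW path on an extent that was dirtied via mmap earlier.
 * Returns true if a stale entry survives, which is the situation a
 * later insert would warn about.
 */
bool toy_cow_leaves_stale_entry(bool clear_dirty_first)
{
    struct toy_entry e = { .present = true, .dirty = true };

    if (clear_dirty_first)
        e.dirty = false;        /* the fix: drop the dirty mark first */
    toy_invalidate(&e);
    return e.present;
}
```

Without clearing the mark the old entry survives invalidation; with it, the slot is empty before the new entry is inserted.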
      
      [1] https://lore.kernel.org/linux-xfs/20230321151339.GA11376@frogsfrogsfrogs/
      
      Link: https://lkml.kernel.org/r/1679653680-2-1-git-send-email-ruansy.fnst@fujitsu.com
      
      
      Fixes: f80e1668 ("fsdax: invalidate pages when CoW")
      Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Darrick J. Wong <djwong@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  7. Mar 28, 2023
  8. Feb 04, 2023
  9. Dec 12, 2022
  10. Jul 26, 2022
    • fsdax: Fix infinite loop in dax_iomap_rw() · 17d9c15c
      Li Jinlin authored
      
      I got an infinite loop and a WARNING report when executing a tail command
      in virtiofs.
      
        WARNING: CPU: 10 PID: 964 at fs/iomap/iter.c:34 iomap_iter+0x3a2/0x3d0
        Modules linked in:
        CPU: 10 PID: 964 Comm: tail Not tainted 5.19.0-rc7
        Call Trace:
        <TASK>
        dax_iomap_rw+0xea/0x620
        ? __this_cpu_preempt_check+0x13/0x20
        fuse_dax_read_iter+0x47/0x80
        fuse_file_read_iter+0xae/0xd0
        new_sync_read+0xfe/0x180
        ? 0xffffffff81000000
        vfs_read+0x14d/0x1a0
        ksys_read+0x6d/0xf0
        __x64_sys_read+0x1a/0x20
        do_syscall_64+0x3b/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The tail command will call read() with a count of 0. In this case,
      iomap_iter() will report this WARNING and always return 1, causing
      the infinite loop in dax_iomap_rw().
      
      Fix this by checking whether count is 0 in dax_iomap_rw().
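
The failure mode can be sketched with a small userspace model. It is an illustration, not the kernel code: `toy_iter`, `toy_iter_next()` and `toy_rw()` are hypothetical names, the 4096-byte mapping size is an arbitrary assumption, and `max_passes` exists only so the model can report a runaway loop instead of hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator state, loosely modeled on an iomap-style loop. */
struct toy_iter {
    size_t len;        /* bytes still requested */
    size_t mapped;     /* length of the current mapping, 0 = none */
    size_t processed;  /* bytes handled in the last pass */
};

/* Returns 1 while the caller should keep looping, 0 when done. */
int toy_iter_next(struct toy_iter *it)
{
    if (it->mapped) {                   /* advance only once a mapping exists */
        assert(it->processed <= it->len);
        it->len -= it->processed;
        if (it->len == 0)
            return 0;
    }
    /* Map up to 4096 bytes; a zero-length request maps zero bytes. */
    it->mapped = it->len < 4096 ? it->len : 4096;
    it->processed = 0;
    return 1;
}

/* Returns the number of loop passes, or -1 if max_passes was reached. */
int toy_rw(size_t count, int check_count, int max_passes)
{
    struct toy_iter it = { .len = count };
    int passes = 0;

    if (check_count && count == 0)      /* the fix: bail out early */
        return 0;

    while (toy_iter_next(&it) > 0) {
        if (passes == max_passes)
            return -1;                  /* would spin forever */
        it.processed = it.mapped;       /* pretend the copy succeeded */
        passes++;
    }
    return passes;
}
```

In this model a zero-length request never produces a mapping, so the "done" test is never reached and the iterator keeps returning 1. The early check on `count` sidesteps the loop entirely.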
      
      Fixes: ca289e0b ("fsdax: switch dax_iomap_rw to use iomap_iter")
      Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Link: https://lore.kernel.org/r/20220725032050.3873372-1-lijinlin3@huawei.com
      
      
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  11. Jul 18, 2022
  12. Jun 30, 2022
  13. May 16, 2022
  14. Apr 29, 2022
    • dax: fix missing writeprotect the pte entry · 06083a09
      Muchun Song authored
      Currently dax_mapping_entry_mkclean() fails to clean and write protect the
      pte entry within a DAX PMD entry during an *sync operation.  This can
      result in data loss in the following sequence:
      
        1) process A mmap write to DAX PMD, dirtying PMD radix tree entry and
           making the pmd entry dirty and writeable.
        2) process B mmap with the @offset (e.g. 4K) and @length (e.g. 4K)
           write to the same file, dirtying PMD radix tree entry (already
           done in 1)) and making the pte entry dirty and writeable.
        3) fsync, flushing out PMD data and cleaning the radix tree entry. We
           currently fail to mark the pte entry as clean and write protected
           since the vma of process B is not covered in dax_entry_mkclean().
        4) process B writes to the pte. These don't cause any page faults since
           the pte entry is dirty and writeable. The radix tree entry remains
           clean.
        5) fsync, which fails to flush the dirty PMD data because the radix tree
           entry was clean.
        6) crash - dirty data that should have been fsync'd as part of 5) could
           still have been in the processor cache, and is lost.
      
      Use pfn_mkclean_range() to clean the pfns and fix this issue.
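
The six-step sequence above can be replayed in a toy state machine. This is a sketch under simplifying assumptions: `toy_file`, `toy_map`, `toy_write()`, `toy_fsync()` and `toy_lost_after_sequence()` are hypothetical names, and a single boolean stands in for "dirty data still in the processor cache".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mapping: only tracks whether writes fault or not. */
struct toy_map {
    bool writable;
};

struct toy_file {
    bool entry_dirty;    /* dirty tag on the radix tree entry */
    bool unflushed;      /* dirty data still in the processor cache */
    struct toy_map pmd;  /* process A's PMD mapping */
    struct toy_map pte;  /* process B's PTE mapping */
};

void toy_write(struct toy_file *f, struct toy_map *m)
{
    if (!m->writable) {          /* write fault: dirty the entry */
        f->entry_dirty = true;
        m->writable = true;
    }
    f->unflushed = true;
}

/* clean_all = false models the bug: only the PMD is write protected. */
void toy_fsync(struct toy_file *f, bool clean_all)
{
    if (!f->entry_dirty)
        return;                  /* entry looks clean: nothing flushed */
    f->unflushed = false;
    f->entry_dirty = false;
    f->pmd.writable = false;
    if (clean_all)
        f->pte.writable = false;
}

/* Returns true if dirty data would be lost by a crash at step 6. */
bool toy_lost_after_sequence(bool clean_all)
{
    struct toy_file f = {0};

    toy_write(&f, &f.pmd);       /* 1) process A writes via the PMD */
    toy_write(&f, &f.pte);       /* 2) process B writes via the PTE */
    toy_fsync(&f, clean_all);    /* 3) first fsync */
    assert(!f.entry_dirty);
    toy_write(&f, &f.pte);       /* 4) process B writes again */
    toy_fsync(&f, clean_all);    /* 5) second fsync */
    return f.unflushed;          /* 6) crash */
}
```

When the PTE stays writable after step 3, the write in step 4 takes no fault, the entry stays clean, and the second fsync flushes nothing. Write protecting every mapping forces that fault and keeps the dirty tag accurate.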
      
      Link: https://lkml.kernel.org/r/20220403053957.10770-6-songmuchun@bytedance.com
      
      
      Fixes: 4b4bb46d ("dax: clear dirty entry tags on cache flush")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • dax: fix cache flush on PMD-mapped pages · e583b5c4
      Muchun Song authored
      flush_cache_page() only removes a PAGE_SIZE-sized range from the cache,
      so for a THP it covers only the head page and not the remaining pages.
      Replace it with flush_cache_range() to fix this issue.  This is just a
      documentation issue with respect to properly documenting the expected
      usage of cache flushing before modifying the pmd.  In practice this is
      not a problem, because DAX is not available on architectures with
      virtually indexed caches per:
      
        commit d92576f1 ("dax: does not work correctly with virtual aliasing caches")
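
The difference in coverage comes down to simple address arithmetic. The sketch below assumes 4 KiB base pages and a 2 MiB PMD (typical for x86-64, but an assumption here); `toy_range`, `toy_page_flush_range()`, `toy_pmd_flush_range()` and `toy_pages_in()` are hypothetical helpers and do not call any kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL                  /* assumed base page size */
#define TOY_PMD_SIZE  (2ULL * 1024 * 1024)     /* assumed PMD mapping size */

struct toy_range {
    uint64_t start;
    uint64_t end;   /* exclusive */
};

/* Range touched by a single-page flush at addr. */
struct toy_range toy_page_flush_range(uint64_t addr)
{
    struct toy_range r;

    r.start = addr & ~(TOY_PAGE_SIZE - 1);
    r.end = r.start + TOY_PAGE_SIZE;
    return r;
}

/* Range that has to be flushed for the whole PMD mapping around addr. */
struct toy_range toy_pmd_flush_range(uint64_t addr)
{
    struct toy_range r;

    r.start = addr & ~(TOY_PMD_SIZE - 1);
    r.end = r.start + TOY_PMD_SIZE;
    return r;
}

/* Number of base pages inside a range. */
uint64_t toy_pages_in(struct toy_range r)
{
    assert(r.end >= r.start);
    return (r.end - r.start) / TOY_PAGE_SIZE;
}
```

Under these assumptions a single-page flush covers 1 of the 512 base pages in a PMD mapping, while a range flush over the aligned PMD covers all of them.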
      
      Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
      
      
      Fixes: f729c8c9 ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  15. Feb 18, 2022
  16. Feb 02, 2022
  17. Dec 18, 2021
  18. Dec 04, 2021