  1. May 05, 2020
  2. Apr 30, 2020
  3. Apr 29, 2020
  4. Apr 28, 2020
  5. Apr 27, 2020
    • drm/dp_mst: Kill the second sideband tx slot, save the world · d308a881
      Lyude Paul authored
      
      While we support using both tx slots for sideband transmissions, it
      appears that DisplayPort devices in the field didn't end up doing a very
      good job of supporting it. From section 5.2.1 of the DP 2.0
      specification:
      
        There are MST Sink/Branch devices in the field that do not handle
        interleaved message transactions.
      
        To facilitate message transaction handling by downstream devices, an
        MST Source device shall generate message transactions in an atomic
        manner (i.e., the MST Source device shall not concurrently interleave
        multiple message transactions). Therefore, an MST Source device shall
        clear the Message_Sequence_No value in the Sideband_MSG_Header to 0.
      
      This might come as a bit of a surprise since the vast majority of hubs
      will support using both tx slots even if they don't support interleaved
      message transactions, and we've also been using both tx slots since MST
      was introduced into the kernel.
      
      However, there is one device we've had trouble getting working
      consistently with MST for so long that we actually assumed it was just
      broken: the infamous Dell P2415Qb. Previously this monitor would appear
      to work sometimes, but in most situations would end up timing out
      LINK_ADDRESS messages almost at random until you power cycled the whole
      display. After reading section 5.2.1 in the DP 2.0 spec, some closer
      investigation into this infamous display revealed it was only ever
      timing out on sideband messages in the second TX slot.
      
      Sure enough, avoiding the second TX slot has suddenly made this monitor
      function perfectly for the first time in five years. And since they
      explicitly mention this in the specification, I doubt this is the only
      monitor out there with this issue. This might even explain the
      seemingly harmless garbage sideband responses we would occasionally see
      with MST hubs!
      
      So - rewrite our sideband TX handlers to only support one TX slot. In
      order to simplify our sideband handling now that we don't support
      transmitting to multiple MSTBs at once, we also move all state tracking
      for down replies from mstbs to the topology manager.
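
      As a rough illustration of the single-slot behavior described above, the
      following Python sketch models a source that queues sideband requests,
      keeps at most one in flight, and always uses sequence number 0. The
      class and method names (SidebandTxQueue, submit, complete_reply) are
      hypothetical and do not correspond to kernel symbols; this is a
      userspace model under those assumptions, not the driver code.

```python
from collections import deque


class SidebandTxQueue:
    """Toy model of a source that never interleaves message transactions."""

    def __init__(self):
        self.pending = deque()   # requests waiting for the single TX slot
        self.in_flight = None    # at most one outstanding transaction
        self.sent = []           # (seqno, request) pairs, in wire order

    def submit(self, request):
        """Queue a request and start it if the slot is free."""
        self.pending.append(request)
        self._kick()

    def complete_reply(self):
        """Called when the down reply for the in-flight request arrives."""
        finished = self.in_flight
        self.in_flight = None
        self._kick()
        return finished

    def _kick(self):
        # Only start a new transaction when nothing is outstanding, so
        # transactions are atomic from the sink's point of view.
        if self.in_flight is None and self.pending:
            self.in_flight = self.pending.popleft()
            # Message_Sequence_No is always 0 with a single slot.
            self.sent.append((0, self.in_flight))


queue = SidebandTxQueue()
queue.submit("LINK_ADDRESS")
queue.submit("ENUM_PATH_RESOURCES")
first_in_flight = queue.in_flight      # the second request is still queued
queue.complete_reply()
second_in_flight = queue.in_flight
```

      With only one transaction outstanding, a single reply tracker on the
      manager is enough in this model, which mirrors the motivation for moving
      down-reply state out of the individual branch devices.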
      
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: ad7f8a1f ("drm/helper: add Displayport multi-stream helper (v0.6)")
      Cc: Sean Paul <sean@poorly.run>
      Cc: "Lin, Wayne" <Wayne.Lin@amd.com>
      Cc: <stable@vger.kernel.org> # v3.17+
      Reviewed-by: Sean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200424181308.770749-1-lyude@redhat.com
  6. Apr 23, 2020
    • Revert "drm/dp_mst: Remove single tx msg restriction." · 973a5909
      Lyude Paul authored
      
      This reverts commit 6bb0942e.
      
      Unfortunately it would appear that the rumors we've heard of sideband
      message interleaving not being very well supported are true. On the
      Lenovo ThinkPad Thunderbolt 3 dock that I have, interleaved messages
      appear to just get dropped:
      
        [drm:drm_dp_mst_wait_tx_reply [drm_kms_helper]] timedout msg send
        00000000571ddfd0 2 1
        [dp_mst] txmsg cur_offset=2 cur_len=2 seqno=1 state=SENT path_msg=1 dst=00
        [dp_mst] 	type=ENUM_PATH_RESOURCES contents:
        [dp_mst] 		port=2
      
      DP descriptor for this hub:
        OUI 90-cc-24 dev-ID SYNA3  HW-rev 1.0 SW-rev 3.12 quirks 0x0008
      
      It would also seem that this is a somewhat well-known issue in the
      field. From section 5.4.2 of the DisplayPort 2.0 specification:
      
        There are MST Sink/Branch devices in the field that do not handle
        interleaved message transactions.
      
        To facilitate message transaction handling by downstream devices, an
        MST Source device shall generate message transactions in an atomic
        manner (i.e., the MST Source device shall not concurrently interleave
        multiple message transactions). Therefore, an MST Source device shall
        clear the Message_Sequence_No value in the Sideband_MSG_Header to 0.
      
        MST Source devices that support field policy updates by way of
        software should update the policy to forego the generation of
        interleaved message transactions.
      
      This is a bit disappointing, as features like HDCP require that we send
      a sideband request every ~2 seconds for each active stream. However,
      there isn't really anything in the specification that allows us to
      accurately probe for interleaved messages.
      
      If it ends up being that we -really- need this in the future, we might
      be able to whitelist hubs where interleaving is known to work, or maybe
      try some sort of heuristics. But for now, let's just play it safe and
      not use it.
      
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: 6bb0942e ("drm/dp_mst: Remove single tx msg restriction.")
      Cc: Wayne Lin <Wayne.Lin@amd.com>
      Cc: Sean Paul <seanpaul@chromium.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200423164225.680178-1-lyude@redhat.com
      Reviewed-by: Sean Paul <sean@poorly.run>
  7. Apr 14, 2020
  8. Apr 10, 2020
    • change email address for Pali Rohár · 149ed3d4
      Pali Rohár authored
      
      For security reasons I stopped using my gmail account, and my kernel.org
      address is now an up-to-date alias for my personal address.

      People periodically send me emails at the address they find in the
      source code of drivers, so this change updates it to one where people
      can actually reach me.
      
      [ Added .mailmap entry as per Joe Perches - Linus ]
      Signed-off-by: Pali Rohár <pali@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Joe Perches <joe@perches.com>
      Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: add pgprot_t to mhp_params · bfeb022f
      Logan Gunthorpe authored
      
      devm_memremap_pages() is currently used by the PCI P2PDMA code to create
      struct page mappings for IO memory.  At present, these mappings are
      created with PAGE_KERNEL which implies setting the PAT bits to be WB.
      However, on x86, an MTRR register will typically override this and force
      the cache type to be UC-.  If firmware doesn't set this register, the
      mapping is effectively WB and will typically result in a machine check
      exception when it's accessed.
      
      Other arches are not currently likely to function correctly seeing they
      don't have any MTRR registers to fall back on.
      
      To solve this, provide a way to specify the pgprot value explicitly to
      arch_add_memory().
      
      Of the arches that support MEMORY_HOTPLUG, x86_64 and arm64 need a
      simple change to pass the pgprot_t down to their respective functions
      which set up the page tables.  For x86_32, set the page tables
      explicitly using _set_memory_prot() (seeing they are already mapped).
      
      For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
      should be fine, for now, seeing these architectures don't support
      ZONE_DEVICE.
      
      A check in __add_pages() is also added to ensure the pgprot parameter
      was set for all arches.
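
      The behavior described above can be modelled roughly as follows. This
      Python sketch is only an illustration: MhpParams, add_pages_check() and
      the string stand-ins for pgprot values (including UNCACHED) are
      assumptions made for the example and are not the kernel's definitions.

```python
from dataclasses import dataclass
from typing import Optional

EINVAL = 22

# Stand-ins for pgprot values; real kernels use arch-specific bit patterns.
PAGE_KERNEL = "PAGE_KERNEL"
UNCACHED = "UNCACHED"   # hypothetical uncached protection for the example


@dataclass
class MhpParams:
    altmap: Optional[object] = None
    pgprot: Optional[str] = None    # every caller is expected to set this


def add_pages_check(params):
    """Model of the sanity check: refuse parameters with no pgprot set."""
    return 0 if params.pgprot is not None else -EINVAL


def arch_add_memory(arch, params):
    """Arches that cannot change their page tables accept only PAGE_KERNEL."""
    err = add_pages_check(params)
    if err:
        return err
    if arch in ("ia64", "s390", "sh") and params.pgprot != PAGE_KERNEL:
        return -EINVAL
    return 0
```

      In this model a P2PDMA-style caller would pass an uncached protection on
      x86_64 or arm64, while the same request on s390 is rejected rather than
      silently mapped with the wrong caching mode.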
      
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Eric Badger <ebadger@gigaio.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: rename mhp_restrictions to mhp_params · f5637d3b
      Logan Gunthorpe authored
      
      The mhp_restrictions struct really doesn't specify anything resembling a
      restriction anymore so rename it to be mhp_params as it is a list of
      extended parameters.
      
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Eric Badger <ebadger@gigaio.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: drop the flags field from struct mhp_restrictions · 96c6b598
      Logan Gunthorpe authored
      
      Patch series "Allow setting caching mode in arch_add_memory() for
      P2PDMA", v4.
      
      Currently, the page tables created using memremap_pages() are always
      created with the PAGE_KERNEL caching mode.  However, the P2PDMA code is
      creating pages for PCI BAR memory which should never be accessed through
      the cache and instead use either WC or UC.  This still works in most
      cases, on x86, because the MTRR registers typically override the caching
      settings in the page tables for all of the IO memory to be UC-.
      However, this tends not to work so well on other arches or some rare x86
      machines that have firmware which does not set up the MTRR registers in
      this way.
      
      Instead of this, this series proposes a change to arch_add_memory() to
      take the pgprot required by the mapping which allows us to explicitly
      set pagetable entries for P2PDMA memory to UC.
      
      This change is pretty routine for most of the arches: x86_64, arm64 and
      powerpc simply need to thread the pgprot through to where the page
      tables are set up.  x86_32 unfortunately sets up the page tables at boot
      so must use _set_memory_prot() to change their caching mode.  ia64, s390
      and sh don't appear to have an easy way to change the page tables so,
      for now at least, we just return -EINVAL on such mappings and thus they
      will not support P2PDMA memory until the work for this is done.  This
      should be fine as they don't yet support ZONE_DEVICE.
      
      This patch (of 7):
      
      This variable is not used anywhere and should therefore be removed from
      the structure.
      
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Eric Badger <ebadger@gigaio.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/special: create generic fallbacks for pte_special() and pte_mkspecial() · 78e7c5af
      Anshuman Khandual authored

      Currently there are many platforms that don't enable ARCH_HAS_PTE_SPECIAL
      but are required to define quite similar fallback stubs for special page
      table entry helpers such as pte_special() and pte_mkspecial(), as they
      get built into generic MM without a config check.  This creates two
      generic fallback stub definitions for these helpers, eliminating much
      code duplication.

      The mips platform has a special case where pte_special() and
      pte_mkspecial() visibility is wider than what ARCH_HAS_PTE_SPECIAL
      enablement requires.  This restricts the visibility of those symbols in
      order to avoid redefinitions, which are now exposed through the new
      generic stubs, and the subsequent build failure.  The arm platform
      set_pte_at() definition needs to be moved into a C file just to prevent
      a build failure.
      
      [anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas]
        Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com
      
      
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Guo Ren <guoren@kernel.org>			[csky]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Acked-by: Stafford Horne <shorne@gmail.com>		[openrisc]
      Acked-by: Helge Deller <deller@gmx.de>			[parisc]
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Sam Creasey <sammy@sammy.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vma: introduce VM_ACCESS_FLAGS · 6cb4d9a2
      Anshuman Khandual authored
      
      There are many places where all basic VMA access flags (read, write,
      exec) are initialized or checked against as a group.  One such example
      is during page fault.  The existing vma_is_accessible() wrapper already
      creates the notion of VMA accessibility as a group of access permissions.

      Hence let's just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which
      will not only reduce code duplication but also extend the VMA
      accessibility concept in general.
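
      To make the grouping concrete, here is a small Python sketch of the
      combined mask. The three bit values match those in include/linux/mm.h;
      the helper strip_access() is a hypothetical name added for the example,
      and the function bodies are a userspace model rather than kernel code.

```python
# Bit values as defined in include/linux/mm.h.
VM_READ = 0x1
VM_WRITE = 0x2
VM_EXEC = 0x4

VM_ACCESS_FLAGS = VM_READ | VM_WRITE | VM_EXEC


def vma_is_accessible(vm_flags):
    """True if the VMA permits at least one of read, write or exec."""
    return bool(vm_flags & VM_ACCESS_FLAGS)


def strip_access(vm_flags):
    """Clear all three access bits in one operation (hypothetical helper)."""
    return vm_flags & ~VM_ACCESS_FLAGS
```

      Callers that previously spelled out all three flags can test or clear
      them through the single mask, which is where the reduction in
      duplication comes from.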
      
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rob Springer <rspringer@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS · c62da0c3
      Anshuman Khandual authored
      
      There are many platforms with the exact same value for VM_DATA_DEFAULT_FLAGS.
      This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
      existing VM_STACK_DEFAULT_FLAGS.  While here, also define some more
      macros with standard VMA access flag combinations that are used
      frequently across many platforms.  Apart from simplification, this
      reduces code duplication as well.
      
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Chris Zankel <chris@zankel.net>
      Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory.c: add vm_insert_pages() · 8cd3984d
      Arjun Roy authored

      Add the ability to insert multiple pages at once to a user VM with fewer
      PTE spinlock operations.

      The intention of this patch-set is to reduce atomic ops for tcp zerocopy
      receives, which normally hit the same spinlock multiple times
      consecutively.
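
      The saving can be estimated with a simple count of lock acquisitions.
      The Python sketch below is a model under stated assumptions: 512 entries
      per PTE table (4 KiB pages on x86_64), one lock per PTE table, and an
      illustrative batch_size parameter. The function names are hypothetical
      and this is not the kernel implementation.

```python
PTRS_PER_PTE = 512   # entries per PTE table with 4 KiB pages on x86_64


def locks_one_at_a_time(num_pages):
    """Inserting pages individually takes the PTE lock once per page."""
    return num_pages


def locks_batched(start_index, num_pages, batch_size=8):
    """Count lock acquisitions when pages are inserted in batches.

    A batch never crosses a PTE-table boundary, because each table is
    protected by its own lock. batch_size is an illustrative parameter.
    """
    locks = 0
    index = start_index
    remaining = num_pages
    while remaining > 0:
        room_in_table = PTRS_PER_PTE - (index % PTRS_PER_PTE)
        batch = min(remaining, room_in_table, batch_size)
        locks += 1              # one lock/unlock pair covers the whole batch
        index += batch
        remaining -= batch
    return locks
```

      For 16 pages starting at the beginning of a table, this model takes the
      lock twice instead of 16 times; a range that straddles two tables needs
      at least one acquisition per table.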
      
      [akpm@linux-foundation.org: pte_alloc() no longer takes the `addr' argument]
      [arjunroy@google.com: add missing page_count() check to vm_insert_pages()]
        Link: http://lkml.kernel.org/r/20200214005929.104481-1-arjunroy.kdev@gmail.com
      [arjunroy@google.com: vm_insert_pages() checks if pte_index defined]
        Link: http://lkml.kernel.org/r/20200228054714.204424-2-arjunroy.kdev@gmail.com
      
      
      Signed-off-by: Arjun Roy <arjunroy@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200128025958.43490-2-arjunroy.kdev@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: hugetlb: optionally allocate gigantic hugepages using cma · cf11e85f
      Roman Gushchin authored
      
      Commit 944d9fec ("hugetlb: add support for gigantic page allocation
      at runtime") has added the run-time allocation of gigantic pages.
      
      However, it actually works only at early stages of system boot, when
      the majority of memory is free.  After some time the memory gets
      fragmented by non-movable pages, so the chances of finding a contiguous
      1GB block get close to zero.  Even dropping caches manually doesn't
      help a lot.

      At large scale, rebooting servers in order to allocate gigantic hugepages
      is quite expensive and complex.  At the same time, keeping some constant
      percentage of memory in reserved hugepages even if the workload isn't
      using it is a big waste: not all workloads can benefit from using 1 GB
      pages.
      
      The following solution can solve the problem:
      1) At boot time a dedicated cma area* is reserved. The size is passed
         as a kernel argument.
      2) Run-time allocations of gigantic hugepages are performed using the
         cma allocator and the dedicated cma area.
      
      In this case gigantic hugepages can be allocated successfully with a
      high probability, however the memory isn't completely wasted if nobody
      is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
      etc.
      
      * On a multi-node machine a per-node cma area is allocated on each node.
        Subsequent gigantic hugetlb allocations use the first available
        numa node if the mask isn't specified by a user.
      
      Usage:
      1) configure the kernel to allocate a cma area for hugetlb allocations:
         pass hugetlb_cma=10G as a kernel argument
      
      2) allocate hugetlb pages as usual, e.g.
         echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
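
      The numbers in the usage example line up as follows: the sysfs directory
      name hugepages-1048576kB denotes 1 GiB pages, so a 10G area can hold at
      most ten of them. The Python sketch below just does that arithmetic; the
      helper names and the restricted K/M/G suffix parsing are assumptions for
      the example, and per-node splitting of the area is not modelled.

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse a size such as '10G' into bytes (K/M/G suffixes only)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(number) * UNITS.get(suffix, 1)


GIGANTIC_PAGE_KB = 1048576                  # from the sysfs directory name
GIGANTIC_PAGE_BYTES = GIGANTIC_PAGE_KB * 1024


def max_gigantic_pages(hugetlb_cma):
    """Upper bound on 1 GiB pages that fit in the reserved cma area."""
    return parse_size(hugetlb_cma) // GIGANTIC_PAGE_BYTES
```

      Writing a larger value to nr_hugepages than this bound would, in this
      simplified view, have to be satisfied from outside the cma area or fail.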
      
      If the option isn't enabled or the allocation of the cma area failed,
      the current behavior of the system is preserved.
      
      x86 and arm64 are covered by this patch; other architectures can be
      trivially added later.
      
      The patch contains clean-ups and fixes proposed and implemented by Aslan
      Bakirov and Randy Dunlap.  It also contains ideas and suggestions
      proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!
      
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Cc: Aslan Bakirov <aslan@fb.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: cma: NUMA node interface · 8676af1f
      Aslan Bakirov authored
      
      I've noticed that there is no interface exposed by CMA which would let
      me declare contiguous memory on a particular NUMA node.

      This patchset adds the ability to try to allocate contiguous memory on a
      specific node.  It will fall back to other nodes if the specified one
      doesn't work.

      Implement a new method for declaring contiguous memory on a particular
      node and keep cma_declare_contiguous() as a wrapper.
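
      The "node-aware function plus thin wrapper" shape can be sketched in
      Python as below. This is a toy model: the function names, the
      free_by_node dictionary and the fallback order are assumptions made for
      illustration, with NUMA_NO_NODE set to -1 as in the kernel.

```python
NUMA_NO_NODE = -1


def declare_contiguous_nid(size, nid, free_by_node):
    """Reserve `size` bytes, preferring node `nid`.

    free_by_node maps node id -> free bytes and is updated in place.
    Returns the node that satisfied the request, or None on failure.
    """
    candidates = sorted(free_by_node)
    if nid != NUMA_NO_NODE and nid in free_by_node:
        # Try the requested node first, then fall back to the others.
        candidates = [nid] + [n for n in candidates if n != nid]
    for node in candidates:
        if free_by_node[node] >= size:
            free_by_node[node] -= size
            return node
    return None


def declare_contiguous(size, free_by_node):
    """Node-agnostic wrapper kept for existing callers."""
    return declare_contiguous_nid(size, NUMA_NO_NODE, free_by_node)
```

      Keeping the old entry point as a wrapper means existing callers need no
      changes, while new callers can state a node preference.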
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Aslan Bakirov <aslan@fb.com>
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Link: http://lkml.kernel.org/r/20200407163840.92263-2-guro@fb.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • docs: mm: slab.h: fix a broken cross-reference · 2370ae4b
      Mauro Carvalho Chehab authored
      
      There is a typo at the cross-reference link, causing this warning:
      
        include/linux/slab.h:11: WARNING: undefined label: memory-allocation (if the link has no caption the label must precede a section header)
      
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Link: http://lkml.kernel.org/r/0aeac24235d356ebd935d11e147dcc6edbb6465c.1586359676.git.mchehab+huawei@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • printk: queue wake_up_klogd irq_work only if per-CPU areas are ready · ab6f762f
      Sergey Senozhatsky authored

      printk_deferred(), similarly to printk_safe/printk_nmi, does not
      immediately attempt to print a new message on the consoles, avoiding
      calls into non-reentrant kernel paths, e.g. scheduler or timekeeping,
      which potentially can deadlock the system.
      
      Those printk() flavors, instead, rely on per-CPU flush irq_work to print
      messages from safer contexts.  For the same reasons (recursive scheduler or
      timekeeping calls) printk() uses per-CPU irq_work in order to wake up
      user space syslog/kmsg readers.
      
      However, only printk_safe/printk_nmi do make sure that per-CPU areas
      have been initialised and that it's safe to modify per-CPU irq_work.
      This means that, for instance, should printk_deferred() be invoked "too
      early", that is before per-CPU areas are initialised, printk_deferred()
      will perform illegal per-CPU access.
      
      Lech Perczak [0] reports that after commit 1b710b1b ("char/random:
      silence a lockdep splat with printk()") user-space syslog/kmsg readers
      are not able to read new kernel messages.
      
      The reason is printk_deferred() being called too early (as was pointed
      out by Petr and John).
      
      Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU
      areas are initialized.
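
      The guard described above amounts to "store the message, but skip the
      wakeup until per-CPU data exists". A minimal Python model of that idea
      follows; the class, attribute and method names are illustrative
      assumptions and should not be read as the kernel's implementation.

```python
class PrintkModel:
    """Toy model: defer klogd wakeups until per-CPU areas exist."""

    def __init__(self):
        self.percpu_data_ready = False
        self.log = []              # messages are always stored
        self.queued_irq_work = 0   # wakeups queued on per-CPU irq_work

    def set_percpu_data_ready(self):
        self.percpu_data_ready = True

    def wake_up_klogd(self):
        # Touching per-CPU irq_work before setup would be an illegal access.
        if not self.percpu_data_ready:
            return
        self.queued_irq_work += 1

    def printk_deferred(self, message):
        self.log.append(message)   # storing the message is always safe
        self.wake_up_klogd()
```

      An early message is still recorded in this model; only the wakeup of
      readers is skipped until the per-CPU areas are marked ready.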
      
      Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/

      Reported-by: Lech Perczak <l.perczak@camlintechnologies.com>
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Jann Horn <jannh@google.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: John Ogness <john.ogness@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. Apr 09, 2020
    • proc: Use a dedicated lock in struct pid · 63f818f4
      Eric W. Biederman authored
      
      syzbot wrote:
      > ========================================================
      > WARNING: possible irq lock inversion dependency detected
      > 5.6.0-syzkaller #0 Not tainted
      > --------------------------------------------------------
      > swapper/1/0 just changed the state of lock:
      > ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigurg+0x9f/0x320 fs/fcntl.c:840
      > but this lock took another, SOFTIRQ-unsafe lock in the past:
      >  (&pid->wait_pidfd){+.+.}-{2:2}
      >
      >
      > and interrupts could create inverse lock ordering between them.
      >
      >
      > other info that might help us debug this:
      >  Possible interrupt unsafe locking scenario:
      >
      >        CPU0                    CPU1
      >        ----                    ----
      >   lock(&pid->wait_pidfd);
      >                                local_irq_disable();
      >                                lock(tasklist_lock);
      >                                lock(&pid->wait_pidfd);
      >   <Interrupt>
      >     lock(tasklist_lock);
      >
      >  *** DEADLOCK ***
      >
      > 4 locks held by swapper/1/0:
      
      The problem is that because wait_pidfd.lock is taken under the tasklist
      lock, it must always be taken with irqs disabled, as tasklist_lock can be
      taken from interrupt context and if wait_pidfd.lock was already taken this
      would create a lock order inversion.
      
      Oleg suggested just disabling irqs where I have added extra calls to
      wait_pidfd.lock.  That should be safe and I think the code will eventually
      do that.  It was rightly pointed out by Christian that sharing the
      wait_pidfd.lock was a premature optimization.
      
      It is also true that my pre-merge window testing was insufficient.  So
      remove the premature optimization and give struct pid a dedicated lock of
      its own for struct pid things.  I have verified that lockdep sees all 3
      paths where we take the new pid->lock and that lockdep does not complain.
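
      The idea of a dedicated lock can be sketched in userspace as follows.
      This is only an illustration under stated assumptions, not the kernel
      change itself: every demo_* name is hypothetical, a pthread mutex stands
      in for a spinlock, and userspace mutexes cannot model the irq-disabling
      rules discussed above. The point shown is simply that the pid's own
      bookkeeping and the wait queue no longer share one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait queue that carries its own internal lock. */
struct demo_waitqueue {
    pthread_mutex_t lock;
    int waiters;
};

/* Hypothetical stand-in for struct pid (not the kernel layout). */
struct demo_pid {
    pthread_mutex_t lock;             /* dedicated lock for pid bookkeeping */
    int inode_count;                  /* stand-in for the list of inodes */
    struct demo_waitqueue wait_pidfd; /* keeps its own, separate lock */
};

void demo_pid_init(struct demo_pid *pid)
{
    assert(pid != NULL);
    pthread_mutex_init(&pid->lock, NULL);
    pthread_mutex_init(&pid->wait_pidfd.lock, NULL);
    pid->inode_count = 0;
    pid->wait_pidfd.waiters = 0;
}

/* Bookkeeping paths take only the dedicated lock. */
void demo_pid_add_inode(struct demo_pid *pid)
{
    pthread_mutex_lock(&pid->lock);
    pid->inode_count++;
    pthread_mutex_unlock(&pid->lock);
}

/* Empties the bookkeeping and returns how many entries were dropped. */
int demo_pid_flush_inodes(struct demo_pid *pid)
{
    int flushed;

    pthread_mutex_lock(&pid->lock);
    flushed = pid->inode_count;
    pid->inode_count = 0;
    pthread_mutex_unlock(&pid->lock);
    return flushed;
}

/* The wake-up path takes only the wait queue's lock, never pid->lock. */
void demo_pid_wake_all(struct demo_pid *pid)
{
    pthread_mutex_lock(&pid->wait_pidfd.lock);
    pid->wait_pidfd.waiters = 0;
    pthread_mutex_unlock(&pid->wait_pidfd.lock);
}
```

      Because the two locks are never nested in this sketch, the locking
      rules of one cannot constrain the other, which is the property the
      shared lock gave up.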
      
      It is my current daydream that one day pid->lock can be used to guard the
      task lists as well and then the tasklist_lock won't need to be held to
      deliver signals.  That will require taking pid->lock with irqs disabled.
      
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      Link: https://lore.kernel.org/lkml/00000000000011d66805a25cd73f@google.com/
      
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Reported-by: <syzbot+343f75cdeea091340956@syzkaller.appspotmail.com>
      Reported-by: <syzbot+832aabf700bc3ec920b9@syzkaller.appspotmail.com>
      Reported-by: <syzbot+f675f964019f884dbd0f@syzkaller.appspotmail.com>
      Reported-by: <syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com>
      Fixes: 7bc3e6e5 ("proc: Use a list of inodes to flush from proc")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      63f818f4
    • Marek Szyprowski's avatar
      drm/bridge: analogix_dp: Split bind() into probe() and real bind() · 152cce00
      Marek Szyprowski authored
      
      The analogix_dp driver acquires all its resources in the ->bind()
      callback, which goes somewhat against the component driver based
      approach, where the driver initialization is split into a probe(), where
      all resources are gathered, and a bind(), where all objects are created
      and a compound driver is initialized.
      
      Extract all the resource related operations to analogix_dp_probe() and
      analogix_dp_remove(), then call them before/after registration of the
      device components from the main Exynos DP and Rockchip DP drivers. Also
      move the plat_data initialization to the probe() to make it available for
      the analogix_dp_probe() function.
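
      The split described above can be sketched as a small userspace model.
      This is a simplified illustration, not the driver code: all demo_* names
      are hypothetical, the PHY is reduced to a flag, and registering the
      component is collapsed into an immediate bind. The value 517 is the
      kernel's EPROBE_DEFER, which matches the -517 in the log below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel value of EPROBE_DEFER; the log reports it as -517. */
#define DEMO_EPROBE_DEFER 517

/* Hypothetical device state for the sketch. */
struct demo_dp {
    bool phy_ready;     /* has the (simulated) PHY driver been probed? */
    bool has_resources; /* set by probe once everything is gathered */
    int bind_calls;     /* how many times bind ran */
};

/* probe(): gather resources; defer if a dependency is not there yet. */
int demo_dp_probe(struct demo_dp *dp)
{
    assert(dp != NULL);
    if (!dp->phy_ready)
        return -DEMO_EPROBE_DEFER;
    dp->has_resources = true;
    return 0;
}

/* bind(): only create objects; resources are already in hand. */
int demo_dp_bind(struct demo_dp *dp)
{
    assert(dp != NULL && dp->has_resources);
    dp->bind_calls++;
    return 0;
}

/*
 * Simplified registration: the component becomes bindable only after a
 * successful probe, so a deferred probe never reaches bind at all.
 */
int demo_dp_register(struct demo_dp *dp)
{
    int ret = demo_dp_probe(dp);

    if (ret)
        return ret;
    return demo_dp_bind(dp);
}
```

      In this model a missing PHY defers the probe step alone, and bind runs
      once, after the PHY has appeared.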
      
      This fixes the multiple calls to the bind() of the DRM compound driver
      when the DP PHY driver is not yet loaded/probed:
      
      [drm] Exynos DRM: using 14400000.fimd device for DMA mapping operations
      exynos-drm exynos-drm: bound 14400000.fimd (ops fimd_component_ops [exynosdrm])
      exynos-drm exynos-drm: bound 14450000.mixer (ops mixer_component_ops [exynosdrm])
      exynos-dp 145b0000.dp-controller: no DP phy configured
      exynos-drm exynos-drm: failed to bind 145b0000.dp-controller (ops exynos_dp_ops [exynosdrm]): -517
      exynos-drm exynos-drm: master bind failed: -517
      ...
      [drm] Exynos DRM: using 14400000.fimd device for DMA mapping operations
      exynos-drm exynos-drm: bound 14400000.fimd (ops hdmi_enable [exynosdrm])
      exynos-drm exynos-drm: bound 14450000.mixer (ops hdmi_enable [exynosdrm])
      exynos-drm exynos-drm: bound 145b0000.dp-controller (ops hdmi_enable [exynosdrm])
      exynos-drm exynos-drm: bound 14530000.hdmi (ops hdmi_enable [exynosdrm])
      [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
      Console: switching to colour frame buffer device 170x48
      exynos-drm exynos-drm: fb0: exynosdrmfb frame buffer device
      [drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1
      ...
      
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Andy Yan <andy.yan@rock-chips.com>
      Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
      Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200310103427.26048-1-m.szyprowski@samsung.com
      
      
      (cherry picked from commit 83a19677)
      Signed-off-by: Maxime Ripard <maxime@cerno.tech>
    • Chris Wilson's avatar
      drm/legacy: Fix type for drm_local_map.offset · b2ecb89c
      Chris Wilson authored
      
      drm_local_map.offset is used to hold not only resource_size_t values but
      also dma_addr_t values, and the two types may differ in size.
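
      Why a size mismatch matters can be shown with a small sketch. It assumes
      a hypothetical configuration in which the resource type is 32 bits wide
      and the DMA address type is 64 bits wide; the demo_* typedefs and
      structs are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths for illustration only; real widths depend on the config. */
typedef uint32_t demo_resource_size_t;
typedef uint64_t demo_dma_addr_t;

static_assert(sizeof(demo_resource_size_t) < sizeof(demo_dma_addr_t),
              "the sketch assumes the DMA address type is the wider one");

/* A map whose offset field uses the narrower type. */
struct demo_map_narrow {
    demo_resource_size_t offset;
};

/* A map whose offset field is wide enough for either type. */
struct demo_map_wide {
    demo_dma_addr_t offset;
};

/* Returns 1 if a DMA address survives a store into the narrow field. */
int demo_roundtrips_narrow(demo_dma_addr_t addr)
{
    struct demo_map_narrow m;

    m.offset = (demo_resource_size_t)addr; /* upper bits are dropped */
    return (demo_dma_addr_t)m.offset == addr;
}

/* Returns 1 if a DMA address survives a store into the wide field. */
int demo_roundtrips_wide(demo_dma_addr_t addr)
{
    struct demo_map_wide m;

    m.offset = addr;
    return m.offset == addr;
}
```

      Under these assumptions, an address above 4 GiB is silently truncated
      by the narrow field, while the wide field preserves it.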
      
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Fixes: 8e4ff9b5 ("drm: Remove the dma_alloc_coherent wrapper for internal usage")
      Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200402215926.30714-1-chris@chris-wilson.co.uk
      b2ecb89c
  10. Apr 08, 2020
  11. Apr 07, 2020