  1. Aug 16, 2017
  2. Aug 15, 2017
  3. Aug 11, 2017
  4. Aug 10, 2017
    • mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem · 99baac21
      Minchan Kim authored
      Nadav reported that parallel MADV_DONTNEED calls on the same range
      have a stale TLB problem; Mel fixed it [1] and found the same problem
      with MADV_FREE [2].
      
      Quote from Mel Gorman:
       "The race in question is CPU 0 running madv_free and updating some PTEs
        while CPU 1 is also running madv_free and looking at the same PTEs.
        CPU 1 may have writable TLB entries for a page but fail the pte_dirty
        check (because CPU 0 has updated it already) and potentially fail to
        flush.
      
        Hence, when madv_free on CPU 1 returns, there are still potentially
        writable TLB entries and the underlying PTE is still present so that a
        subsequent write does not necessarily propagate the dirty bit to the
        underlying PTE any more. Reclaim at some unknown time at the future
        may then see that the PTE is still clean and discard the page even
        though a write has happened in the meantime. I think this is possible
        but I could have missed some protection in madv_free that prevents it
        happening."
      
      This patch aims to solve both problems at once and also prepares for
      another problem involving KSM, MADV_FREE and soft-dirty [3].
      
      The TLB batch API (tlb_[gather|finish]_mmu) uses
      [inc|dec]_tlb_flush_pending and mmu_tlb_flush_pending so that, when
      tlb_finish_mmu is called, we can detect that parallel threads are in
      progress.  In that case, flush the TLB forcefully to prevent the user
      from accessing memory via a stale TLB entry, even though the batch
      failed to gather any page table entries.
      
      I confirmed that this patch works with the test program [4] Nadav
      gave, so this patch supersedes "mm: Always flush VMA ranges affected
      by zap_page_range v2" in current mmotm.
      
      NOTE:
      
      This patch modifies the arch-specific TLB gathering interface (x86,
      ia64, s390, sh, um).  Most architectures seem straightforward, but
      s390 needs care because tlb_flush_mmu works only if
      mm->context.flush_mm is set to non-zero, which happens only when a pte
      entry is really cleared by ptep_get_and_clear and friends.  However,
      in the case addressed here the pte entries are never changed, yet a
      flush is still needed to prevent memory access through a stale TLB
      entry.
      
      [1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
      [2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
      [3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
      [4] https://patchwork.kernel.org/patch/9861621/
      
      [minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
        Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
      Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
      
      
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Reported-by: Nadav Amit <namit@vmware.com>
      Reported-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      99baac21
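The pending-count idea described in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `mm_model` and `gather_model` structures and all `model_*` functions are hypothetical stand-ins invented for illustration, not the kernel's structures or helpers, and the "flush" is just a counter. It shows one thing: a batch that gathered nothing still flushes if another batch is in flight on the same mm when it finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-mm state. */
struct mm_model {
    int tlb_flush_pending;  /* number of TLB batches currently in flight */
    int flushes_issued;     /* counts simulated TLB flushes */
};

/* Hypothetical stand-in for one TLB batch. */
struct gather_model {
    struct mm_model *mm;
    bool gathered_ptes;     /* did this batch itself change any PTE? */
};

/* Start a batch: models incrementing the pending count. */
static void model_tlb_gather(struct gather_model *tlb, struct mm_model *mm)
{
    tlb->mm = mm;
    tlb->gathered_ptes = false;
    mm->tlb_flush_pending++;
}

/* True when more than one batch is in flight on this mm. */
static bool model_flush_nested(const struct mm_model *mm)
{
    return mm->tlb_flush_pending > 1;
}

/* Finish a batch: flush if we gathered PTEs or a parallel batch exists. */
static void model_tlb_finish(struct gather_model *tlb)
{
    struct mm_model *mm = tlb->mm;
    bool force;

    assert(mm->tlb_flush_pending > 0);
    force = model_flush_nested(mm);
    if (tlb->gathered_ptes || force)
        mm->flushes_issued++;       /* simulated TLB flush */
    mm->tlb_flush_pending--;        /* models decrementing the pending count */
}
```

In this model, a batch that runs alone and gathers nothing issues no flush, while the same batch finishing during a parallel batch is forced to flush, which is the behaviour the message describes for the racing madvise callers.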
    • mm: refactor TLB gathering API · 56236a59
      Minchan Kim authored
      This is a preparatory patch for solving race problems caused by TLB
      batching.  To that end, we will increase/decrease the TLB flush
      pending count of mm_struct whenever tlb_[gather|finish]_mmu is called.

      To keep that change simple, this patch first separates the
      architecture-specific part, renames it to
      arch_tlb_[gather|finish]_mmu, and has the generic part just call it.
      
      It shouldn't change any behavior.
      
      Link: http://lkml.kernel.org/r/20170802000818.4760-5-namit@vmware.com
      
      
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      56236a59
  5. Aug 09, 2017
  6. Aug 08, 2017
    • MIPS: Set ISA bit in entry-y for microMIPS kernels · 5fc9484f
      Paul Burton authored
      
      When building a kernel for the microMIPS ISA, ensure that the ISA bit
      (i.e. bit 0) in the entry address is set. Otherwise we may include an
      entry address in images which bootloaders will jump to as MIPS32 code.
      
      I originally tried using "objdump -f" to obtain the entry address, which
      works for microMIPS but it always outputs a 32 bit address for a 32 bit
      ELF whilst nm will sign extend to 64 bit. That matters for systems where
      we might want to run a MIPS32 kernel on a MIPS64 CPU & load it with a
      MIPS64 bootloader, which would then jump to a non-canonical
      (non-sign-extended) address.
      
      This works in all cases as it only changes the behaviour for microMIPS
      kernels, but isn't the prettiest solution. A possible alternative would
      be to write a custom tool to just extract, sign extend & print the entry
      point of an ELF executable. I'm open to feedback if that would be
      preferred.
      
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/16950/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      5fc9484f
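The two address adjustments the message above discusses, sign-extending a 32-bit entry address to 64 bits and setting bit 0 for microMIPS, can be sketched as plain arithmetic. This is an illustration only: the function names are hypothetical, the sample address in the usage note is made up, and the actual patch works in the build system rather than in C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extend a 32-bit address to 64 bits, i.e. the canonical form the
 * message says a MIPS64 bootloader needs. The uint32_t -> int32_t
 * conversion relies on the usual two's-complement behaviour.
 */
static uint64_t sign_extend_32(uint32_t addr)
{
    return (uint64_t)(int64_t)(int32_t)addr;
}

/*
 * Hypothetical helper: compute the entry address to record in an image.
 * For microMIPS kernels the ISA bit (bit 0) is set so that a bootloader
 * jumps to the entry point as microMIPS code rather than MIPS32 code.
 */
static uint64_t entry_address(uint32_t raw_entry, bool micromips)
{
    uint64_t entry = sign_extend_32(raw_entry);

    if (micromips)
        entry |= 1;     /* set the ISA bit */
    return entry;
}
```

For an illustrative raw entry of 0x80100000, this yields 0xffffffff80100000 for a MIPS32 kernel and 0xffffffff80100001 for a microMIPS kernel; only the microMIPS case differs from the unmodified address, matching the message's note that behaviour changes for microMIPS kernels alone.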
    • MIPS: Prevent building MT support for microMIPS kernels · 527f1028
      Paul Burton authored
      
      We don't currently support the MT ASE for microMIPS kernels, and there
      are no CPUs currently in existence that use both. They can however both
      be enabled in Kconfig, resulting in build failures such as:
      
        AS      arch/mips/kernel/cps-vec.o
      arch/mips/kernel/cps-vec.S: Assembler messages:
      arch/mips/kernel/cps-vec.S:242: Warning: the 32-bit microMIPS architecture does not support the `mt' extension
      arch/mips/kernel/cps-vec.S:276: Error: unrecognized opcode `mttc0 $13,$2,2'
      arch/mips/kernel/cps-vec.S:282: Error: unrecognized opcode `mttc0 $8,$1,2'
      arch/mips/kernel/cps-vec.S:285: Error: unrecognized opcode `mttc0 $0,$2,1'
      ...
      
      Fix this by preventing MT from being enabled when targeting microMIPS.
      
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/16951/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      527f1028
    • powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails · 785a12af
      Gautham R. Shenoy authored
      
      Currently, we use the opal call opal_slw_set_reg() to inform the
      Sleep-Winkle Engine (SLW) to restore the contents of some of the
      Hypervisor state on wakeup from deep idle states that lose full
      hypervisor context (characterized by the flag
      OPAL_PM_LOSE_FULL_CONTEXT).
      
      However, the current code has a bug in that if opal_slw_set_reg()
      fails, we don't disable the use of these deep states (winkle on
      POWER8, stop4 onwards on POWER9).
      
      This patch fixes this bug by ensuring that if programming the
      sleep-winkle engine to restore the hypervisor states in
      pnv_save_sprs_for_deep_states() fails, then we exclude such states by
      clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from
      supported_cpuidle_states. As a result POWER8 will be prevented from
      using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to
      the default stop state when available.
      
      Further, we ensure in the initialization of the cpuidle-powernv driver
      to only include those states whose flags are present in
      supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT
      states when they have been disabled due to stop-api failure.
      
      Fixes: 1e1601b3 ("powerpc/powernv/idle: Restore SPRs for deep idle
      states via stop API.")
      
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      785a12af
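The flag handling described in the message above (clear the lose-full-context bit from the supported set on failure, then register only states whose flags are all still supported) might look roughly like the following. This is a sketch under assumptions: the `MODEL_*` bit values, the `idle_state_model` structure and both functions are hypothetical and do not reproduce the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define MODEL_PM_STOP_ENABLED       0x1u
#define MODEL_PM_LOSE_FULL_CONTEXT  0x2u

/* Hypothetical description of one idle state. */
struct idle_state_model {
    const char *name;
    uint32_t flags;
};

/*
 * If programming the restore engine failed (non-zero rc), drop the
 * lose-full-context capability from the supported set.
 */
static uint32_t disable_deep_states(uint32_t supported, int stop_api_rc)
{
    if (stop_api_rc != 0)
        supported &= ~MODEL_PM_LOSE_FULL_CONTEXT;
    return supported;
}

/*
 * Keep only states whose flags are all present in 'supported'.
 * 'out' must have room for 'n' pointers; returns the number kept.
 */
static size_t select_states(const struct idle_state_model *in, size_t n,
                            uint32_t supported,
                            const struct idle_state_model **out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if ((in[i].flags & supported) != in[i].flags)
            continue;   /* needs a capability that was disabled */
        out[kept++] = &in[i];
    }
    return kept;
}
```

With one shallow state and one deep state that carries the lose-full-context bit, a failed setup leaves only the shallow state selected, while a successful setup keeps both.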
  7. Aug 07, 2017
  8. Aug 04, 2017