Thursday

[Bug 1792195] Re: Signal 7 error when running GPFS tracing in cluster

** Changed in: linux (Ubuntu Cosmic)
Status: In Progress => Fix Committed

--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/1792195

Title:
Signal 7 error when running GPFS tracing in cluster

Status in The Ubuntu-power-systems project:
In Progress
Status in linux package in Ubuntu:
In Progress
Status in linux source package in Bionic:
Fix Committed
Status in linux source package in Cosmic:
Fix Committed

Bug description:
== SRU Justification ==
IBM is requesting these commits in bionic and cosmic. These commits
also rely on commit 7acf50e4efa6, which was SRU'd in bug 1792102.

Description of bug:
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel
driver using vmalloc) and then writes trace records to it from user-space
threads in parallel. When the SIGBUS occurred, the accessed virtual address
was inside the mapped range, so the access did not overflow the buffer.

The root cause is that for PTEs created by a driver at mmap time (i.e., that
aren't created dynamically at fault time), it's not legitimate for
ptep_set_access_flags() to make them invalid, even temporarily. A concurrent
access while they are invalid cannot be serviced by the page fault handler
and will cause a SIGBUS.

== Fixes ==
bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")
f08d08f3db55 ("powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition")

== Regression Potential ==
Low. Limited to powerpc.

== Test Case ==
A test kernel was built with these patches and tested by IBM.
IBM states the test kernel resolved the bug.

-- Problem Description --
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel driver using vmalloc) and then writes trace records to it from user-space threads in parallel. When the SIGBUS occurred, the accessed virtual address was inside the mapped range, so the access did not overflow the buffer.

We worked with Benjamin Herrenschmidt on the GPFS tracing kernel driver
code, and he suggested a workaround in the driver code that bypasses
the problem. It works.

The workaround code change is as follows:

- rc = remap_pfn_range(vma, start, pfn, PAGE_SIZE, PAGE_SHARED);
+ rc = remap_pfn_range(vma, start, pfn, PAGE_SIZE, __pgprot(pgprot_val(PAGE_SHARED) | _PAGE_DIRTY));

As Benjamin mentioned, this is a Linux kernel bug and this is just a
workaround. He will give the details about the kernel bug and why this
workaround works....

The root cause is that for PTEs created by a driver at mmap time (i.e.,
that aren't created dynamically at fault time), it's not legitimate for
ptep_set_access_flags() to make them invalid, even temporarily. A
concurrent access while they are invalid cannot be serviced by the
page fault handler and will cause a SIGBUS.

Thankfully such PTEs shouldn't normally be the subject of a RO->RW
privilege escalation.

What happens is that the GPFS driver creates the PTEs using
remap_pfn_range(...,PAGE_SHARED).

PAGE_SHARED has _PAGE_ACCESSED (R) but not _PAGE_DIRTY (C) set.

Thus on the first write, we try to set C and, while doing so, hit the
Nest MMU workaround in ptep_set_access_flags(), which temporarily
invalidates the PTE and causes the problem described earlier.

The proposed patch will ensure we only do the Nest MMU hack when
changing _PAGE_RW and not for normal R/C updates.
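
The effect of that change can be illustrated with a small user-space model. This is only a sketch of the decision described above, not the kernel code: the PTE_* bit values and both function names are hypothetical, chosen for illustration, and the "before" behaviour is modelled simply as "any flag change takes the invalidate step".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration; not the real powerpc values. */
#define PTE_VALID    (UINT64_C(1) << 0)
#define PTE_RW       (UINT64_C(1) << 1)  /* write permission */
#define PTE_ACCESSED (UINT64_C(1) << 2)  /* R: referenced */
#define PTE_DIRTY    (UINT64_C(1) << 3)  /* C: changed */

/* Modelled pre-fix behaviour: every flag update goes through the
 * "mark invalid, then rewrite" step of the Nest MMU workaround. */
static bool needs_invalidate_before_fix(uint64_t old_pte, uint64_t new_pte)
{
    assert(old_pte & PTE_VALID);
    return old_pte != new_pte;
}

/* Modelled post-fix behaviour: the invalidate step is taken only when
 * the write-permission bit changes, not for plain R/C updates. */
static bool needs_invalidate_after_fix(uint64_t old_pte, uint64_t new_pte)
{
    assert(old_pte & PTE_VALID);
    uint64_t change = old_pte ^ new_pte;
    return (change & PTE_RW) != 0;
}
```

In this model, a first write to a writable but clean mapping only adds the C bit, so the post-fix check no longer makes the PTE invalid, while a genuine RO->RW upgrade still takes the workaround path.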

The workaround tested by the GPFS team consists of adding _PAGE_DIRTY
to the mapping created by remap_pfn_range() to avoid the RC update
fault completely.
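
Why pre-setting the dirty bit avoids the fault can be shown with a similar user-space sketch. Again, the PTE_* values, the MODEL_* protection constants, and the helper name are hypothetical stand-ins for illustration; they only mirror the statement that PAGE_SHARED carries R but not C.

```c
#include <stdint.h>

/* Hypothetical bit positions for illustration; not the real powerpc values. */
#define PTE_VALID    (UINT64_C(1) << 0)
#define PTE_RW       (UINT64_C(1) << 1)  /* write permission */
#define PTE_ACCESSED (UINT64_C(1) << 2)  /* R: referenced */
#define PTE_DIRTY    (UINT64_C(1) << 3)  /* C: changed */

/* Stand-in for PAGE_SHARED: writable and referenced, but not dirty. */
#define MODEL_PAGE_SHARED       (PTE_VALID | PTE_RW | PTE_ACCESSED)
/* Stand-in for the workaround protection: PAGE_SHARED plus the dirty bit. */
#define MODEL_PAGE_SHARED_DIRTY (MODEL_PAGE_SHARED | PTE_DIRTY)

/* Returns the R/C bits that a write access would still have to set.
 * A non-zero result means the first write triggers an RC update. */
static uint64_t rc_bits_missing_for_write(uint64_t pte)
{
    const uint64_t wanted = PTE_ACCESSED | PTE_DIRTY;
    return wanted & ~pte;
}
```

With the plain shared protection the helper reports the dirty bit as missing, so the first write needs an RC update; with the workaround protection nothing is missing and the update path is never entered.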

This is fixed by these:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bd0dbb73e01306a1060e56f81e5fe287be936477

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f08d08f3db55452d31ba4a37c702da6245876b96

Since DD1 support is still in (i.e.,
2bf1071a8d50928a4ae366bb3108833166c2b70c is not applied), the second
commit doesn't apply cleanly. Did you want that attached?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1792195/+subscriptions
