By Kenton Varda - 25 Oct 2016
Last week, a Linux kernel bug, CVE-2016-5195, was described as “the most serious Linux local privilege escalation ever”. The bug – which potentially allowed any code running on a Linux machine to escalate its privileges to root – was already being actively exploited in the wild before it was fixed, and had existed in the kernel for many years.
Since Sandstorm allows any user of a server to upload their own apps, you might wonder if this bug could allow a Sandstorm user to compromise the server.
We’re happy to report that the answer appears to be “no”. As is often the case with Linux kernel bugs, our sandbox blocked the exploit.
Of course, we still recommend updating your kernel in case the bug can be exploited in ways that have not been discovered yet.
The bug in question was a race condition in the handling of memory pages mapped copy-on-write. A process can ask that a read-only file be mapped into its memory space in such a way that it is allowed to modify the mapped memory. When the process writes to the memory, the kernel makes a private copy of the affected page, so that the process only modifies its copy, not the original. Meanwhile, a process can request later on that the modifications it made be discarded, returning the page to its original state. In certain circumstances, by both writing to a page and requesting this discard at the same time (in separate threads), the process could end up writing to the original pages that are shared with other processes on the system, instead of its own private copy. Hence, the process could modify any file on the system. By modifying, say, the sudo
utility, it could give itself a backdoor which allows it to gain root privileges trivially.
However, not just any old write worked here. In order to trigger the race condition, the process had to write in a way that calls the kernel’s get_user_pages()
function with the force
parameter set to 1
. The force
parameter says: “If this page is mapped copy-on-write, then let me write to it (making a private copy) even if the page’s protection mode is read-only.” As it turns out, it is possible for a memory mapping to be both read-only and copy-on-write, and in fact this is the mode that is usually used when mapping in a program’s main binary and shared libraries. Normally, no copy is ever performed, because the writes that would trigger them are not allowed. However, there is a special case where this combination of flags matters: If you are running a program in a debugger, and you ask the debugger to insert a breakpoint, it does so by overwriting the instruction at the given address with a break instruction. That is, it modifies the mapped executable. The force
flag actually exists for exactly this purpose: so that the debugger can inject breakpoints into the program being executed by the process being debugged (without affecting any other processes that happen to be running the same program).
Because the force
flag is only useful in very specific circumstances, only certain code paths can trigger the vulnerability. Kernel security engineer and Sandstorm contributor Andrew Lutomirski tells us the only entry points appear to be:
ptrace()
system call’s PTRACE_POKEDATA
operation, which is explicitly meant to be used by debuggers, often for the purpose of setting breakpoints./proc/<pid>/mem
. It’s unclear why this code uses force
– possibly it was a mistake.As it turns out, none of these code paths can be exploited by Sandstorm apps:
ptrace()
./proc
inside app sandboxes./dev
contains only null
, zero
, and urandom
.So, as far as we can tell, Sandstorm has never been vulnerable to this bug.
Even if Sandstorm were vulnerable, the exploit would have far reduced impact inside Sandstorm than in a typical Linux environment, because:
NO_NEW_PRIVS
prctl()
flag inside the sandbox.When running on Sandstorm, a user’s data in an app like Etherpad is containerized separately from another user’s data. In fact, we go one step further and containerize each document separately. In the case that Sandstorm had not mitigated the bug outright, it appears the impact of the bug would be that an app could break Sandstorm’s per-document isolation and read/write documents from any number of users, so long as those users all use the same version of the same app on the same server. The app still would not have been able to interfere with other apps. This is the status quo in a typical Linux environment: in most non-Sandstorm environments, an app keeps all users’ data in a single database without per-user isolation. Overall, this is much less significant than a privilege escalation to root. Thankfully, our seccomp mitigation prevented this.
This is not the first Linux security bug mitigated by Sandstorm. In fact, we’ve kept a long list. Moreover, in addition to mitigating Linux kernel problem, Sandstorm mitigates most vulnerabilities in the apps that run on top of it. Check out the whole list of mitigated vulnerabilities that we’ve compiled: Sandstorm Security Non-Events
Want to try out Sandstorm as a user? Try the online demo »