Security
The isabelle server extensibility presents a few challenges for securely running a public instance:
How hard is it to escape the isabelle server with (ML code|HOL|...)?
Not hard, in fact, there's even facilities provided by Isabelle to make, e.g., arbitrary File IO easier for ML code.
As such the server needs to be treated as a user facility for executing arbitrary code. There are a few steps we can take here to make this safer-ish:
- Careful parsing and sanitization of all i/o
- The isabelle manager will reject all malformed messages from both client and server
- monitoring/ recording events. We need to be able to trace down if, e.g., a user tries to DOS servers by injecting malicious ML code.
- isolation with various executors to manage service instances
- Automatic service management via isabelle manager
- automatic restart
- automatic handling of unresponsive servers
- possible to get isolated server instances for important-ish users
What are vectors we want/ can cover? How can we ensure CIA?
- DOS -- Availability
- Nobody likes a system that constantly crashes
- In case something goes wrong recovery is important-ish and should be fast
- The best case here would be a transparent crash handling where a crashed server would automatically get replaced
- System compromise -- Integrity/ Confidentiality
- A public service, even if only public to students, will be attacked.
- We don't want to have someone altering results or even worse abusing the shared resources.
- (Noisy Neighbors) -- Availability
- We need to somehow account for users that overuse shared resources.
This can likely be done by rate limiting
use_theories
calls. - Our intended audience is beginners that want to learn how to create proofs and not how to create the next AFP submission.
- We need to somehow account for users that overuse shared resources.
This can likely be done by rate limiting
Isolation & Confinement
We need to isolate and confine all isabelle servers because they effectively are designed to run arbitrary code from the user. This is a great feature that grants users a lot of freedom. However, from a security perspective it's a challenge that the following measures try to make manageable.
Service isolation
All executors utilise variants of namespace-based isolation of server instances. This is means server instances and associated ressource live in their own filesystem, devices, etc. like a container.
Network Isolation
All executors utilise network namespaces with verying management schemes. The specific structure and isolation guarantees are an implementation detail of the executor however all executors guarantee:
- no inter-instance networking: server
a
cannot communicate with serverb
- no host-instance networking: server
a
cannot communicate with host serviceb
Communcation is always restricted to Host <> ISM <> Instances, e.g.:
flowchart TD H[Host] <-->|Message| I[ISM] I <-->|Message| S1[Server 1] I <-->|Message| S2[Server 2] I <-->|Message| S3[Server 3]
FS Isolation
All executors put service units into a chroot-like namespace with only the required linux devices. This is realized using mount namespaces. Any additional directories, for, e.g., theory files, are mounted on demand.
The specific path schemes are an implementation detail of the executor.
The rewritten paths and path mangement is done via the Mount
and Unmount
methods of the executor API.
Mount flow
The first time an eligble dir, that exists on the host FS, is referenced a call to Mount
will be made (gateway/executor-api.go
).
-
The client checks, if this dir has been previously mounted by this client
- if yes, return path in server instance for
dir
mount - If no, continue
- if yes, return path in server instance for
-
The executor's
Mount
method is called withdir
The executor checks if dir has been previously mounted by any connected client to this server instance- if yes, increase reference counter on
dir
mount and return path in server instance fordir
mount - otherwise, continue:
- The
source path
is hashed1, the result is the so calledalias
. This hashing step is done to avoid path escaping/ handling for nested paths. A convenient side effect is also that we remove the metadata leak from the paths used by the backend. - otherwise, create mount with reference counter =
1
and return path in server instance fordir
mount
- The
- if yes, increase reference counter on