Security

The isabelle server extensibility presents a few challenges for securely running a public instance:

How hard is it to escape the isabelle server with (ML code|HOL|...)?

Not hard, in fact, there's even facilities provided by Isabelle to make, e.g., arbitrary File IO easier for ML code.

As such the server needs to be treated as a user facility for executing arbitrary code. There are a few steps we can take here to make this safer-ish:

  • Careful parsing and sanitization of all i/o
    • The isabelle manager will reject all malformed messages from both client and server
  • monitoring/ recording events. We need to be able to trace down if, e.g., a user tries to DOS servers by injecting malicious ML code.
  • isolation with various executors to manage service instances
  • Automatic service management via isabelle manager
    • automatic restart
    • automatic handling of unresponsive servers
    • possible to get isolated server instances for important-ish users

What are vectors we want/ can cover? How can we ensure CIA?

  • DOS -- Availability
    • Nobody likes a system that constantly crashes
    • In case something goes wrong recovery is important-ish and should be fast
    • The best case here would be a transparent crash handling where a crashed server would automatically get replaced
  • System compromise -- Integrity/ Confidentiality
    • A public service, even if only public to students, will be attacked.
    • We don't want to have someone altering results or even worse abusing the shared resources.
  • (Noisy Neighbors) -- Availability
    • We need to somehow account for users that overuse shared resources. This can likely be done by rate limiting use_theories calls.
    • Our intended audience is beginners that want to learn how to create proofs and not how to create the next AFP submission.

Isolation & Confinement

We need to isolate and confine all isabelle servers because they effectively are designed to run arbitrary code from the user. This is a great feature that grants users a lot of freedom. However, from a security perspective it's a challenge that the following measures try to make manageable.

Service isolation

All executors utilise variants of namespace-based isolation of server instances. This is means server instances and associated ressource live in their own filesystem, devices, etc. like a container.

Network Isolation

All executors utilise network namespaces with verying management schemes. The specific structure and isolation guarantees are an implementation detail of the executor however all executors guarantee:

  1. no inter-instance networking: server a cannot communicate with server b
  2. no host-instance networking: server a cannot communicate with host service b

Communcation is always restricted to Host <> ISM <> Instances, e.g.:

flowchart TD
    H[Host] <-->|Message| I[ISM]
    I <-->|Message| S1[Server 1]
    I <-->|Message| S2[Server 2]
    I <-->|Message| S3[Server 3]

FS Isolation

All executors put service units into a chroot-like namespace with only the required linux devices. This is realized using mount namespaces. Any additional directories, for, e.g., theory files, are mounted on demand.

The specific path schemes are an implementation detail of the executor. The rewritten paths and path mangement is done via the Mount and Unmount methods of the executor API.

Mount flow

The first time an eligble dir, that exists on the host FS, is referenced a call to Mount will be made (gateway/executor-api.go).

  1. The client checks, if this dir has been previously mounted by this client

    1. if yes, return path in server instance for dir mount
    2. If no, continue
  2. The executor's Mount method is called with dir The executor checks if dir has been previously mounted by any connected client to this server instance

    1. if yes, increase reference counter on dir mount and return path in server instance for dir mount
    2. otherwise, continue:
      1. The source path is hashed1, the result is the so called alias. This hashing step is done to avoid path escaping/ handling for nested paths. A convenient side effect is also that we remove the metadata leak from the paths used by the backend.
      2. otherwise, create mount with reference counter = 1 and return path in server instance for dir mount