gram-signal 6b73e48025 Optimize writes by using a byte double-buffer to avoid write lock contention.		2024-03-06 10:32:34 -07:00
attestation	Remove consideration of PCR6 on AzureSNP.	2024-02-22 08:11:52 -07:00
boringssl@c5a99415cc	Update submodules/dependencies.	2023-11-15 10:56:53 -07:00
client	Move AttestationData into its own file.	2024-01-12 15:56:55 -07:00
context	Try to push out all current logs when we crash.	2024-01-24 10:03:16 -07:00
core	Cancel open e2e transactions when we ResetPeer.	2024-03-06 10:15:27 -07:00
db	Fix use-after-free of `row` after `erase` call in DB3.	2024-01-25 12:35:23 -07:00
ecalls	Squashed history.	2023-05-05 16:25:12 -06:00
env	Optimize writes by using a byte double-buffer to avoid write lock contention.	2024-03-06 10:32:34 -07:00
googletest@3026483ae5	Squashed history.	2023-05-05 16:25:12 -06:00
groupclock	Run integration test against Nitro simulated environment.	2023-05-17 09:17:03 -06:00
gtest	Squashed history.	2023-05-05 16:25:12 -06:00
hmac	Build Azure-specific confidential computing evidence/attestation into `{attestation,env}/azuresnp`	2024-02-15 15:35:27 -07:00
initmain	Get AMD SEV-SNP attestation report.	2023-06-14 15:23:45 -06:00
libsodium@fb4533b0a9	Update submodules/dependencies.	2023-11-15 10:56:53 -07:00
merkle	Clean up SIP-hash C++ library to avoid "extern C" includes.	2023-11-15 10:57:28 -07:00
metrics	Optimize writes by using a byte double-buffer to avoid write lock contention.	2024-03-06 10:32:34 -07:00
minimums	Add host request to update Minimums.	2024-01-05 16:50:34 -07:00
noise	Add context into Env API to enable monitoring.	2023-06-21 08:58:08 -06:00
noise-c@9ab8de7db4	Rekey peer-to-peer cipherstates every 256 packets.	2024-03-06 10:15:08 -07:00
noisewrap	Add context into Env API to enable monitoring.	2023-06-21 08:58:08 -06:00
peerid	Clean up SIP-hash C++ library to avoid "extern C" includes.	2023-11-15 10:57:28 -07:00
peers	Rekey peer-to-peer cipherstates every 256 packets.	2024-03-06 10:15:08 -07:00
proto	Introduce enforced monotonically increasing minimums to SVR.	2023-12-13 09:49:38 -07:00
protobuf@dab4d24d44	Squashed history.	2023-05-05 16:25:12 -06:00
protobuf-lite	Squashed history.	2023-05-05 16:25:12 -06:00
queue	Try to push out all current logs when we crash.	2024-01-24 10:03:16 -07:00
raft	Add host request to update Minimums.	2024-01-05 16:50:34 -07:00
rapidjson@6089180ecb	Build Azure-specific confidential computing evidence/attestation into `{attestation,env}/azuresnp`	2024-02-15 15:35:27 -07:00
releases	New SGX release.	2024-01-25 20:01:43 -07:00
sender	Try to push out all current logs when we crash.	2024-01-24 10:03:16 -07:00
sev-guest@62317d7de4	AMD SEV-SNP attestation verification.	2023-06-22 13:03:09 -06:00
sevtypes	AMD SEV-SNP attestation verification.	2023-06-22 13:03:09 -06:00
sip	Try to push out all current logs when we crash.	2024-01-24 10:03:16 -07:00
SipHash@68c8a7cb77	Update submodules/dependencies.	2023-11-15 10:56:53 -07:00
socketmain	Optimize writes by using a byte double-buffer to avoid write lock contention.	2024-03-06 10:32:34 -07:00
socketwrap	Optimize writes by using a byte double-buffer to avoid write lock contention.	2024-03-06 10:32:34 -07:00
svr2	Squashed history.	2023-05-05 16:25:12 -06:00
testhost	Nitro attestation	2023-05-24 09:44:47 -06:00
timeout	A few more counters targeting CPU measurement of sub-parts of the TimerTick call.	2023-08-02 09:05:19 -06:00
tinycbor@65a4147021	Update submodules/dependencies.	2023-11-15 10:56:53 -07:00
util	Optimize writes by using a byte double-buffer to avoid write lock contention.	2024-03-06 10:32:34 -07:00
.gitignore	Squashed history.	2023-05-05 16:25:12 -06:00
find_header.sh	Squashed history.	2023-05-05 16:25:12 -06:00
Makefile	Build an image that chains trust from bootloader to userspace	2024-03-06 10:07:53 -07:00
Makefile.base	Build Azure-specific confidential computing evidence/attestation into `{attestation,env}/azuresnp`	2024-02-15 15:35:27 -07:00
Makefile.HOST	Squashed history.	2023-05-05 16:25:12 -06:00
Makefile.SGX	Squashed history.	2023-05-05 16:25:12 -06:00
Makefile.subdir	Squashed history.	2023-05-05 16:25:12 -06:00
Makefile.TEST	Squashed history.	2023-05-05 16:25:12 -06:00
Makefile.X86	Get AMD SEV-SNP attestation report.	2023-06-14 15:23:45 -06:00
README.md	Documentation updates	2023-05-17 09:16:55 -06:00
svr2_small.conf	Squashed history.	2023-05-05 16:25:12 -06:00
svr2_test.conf	Squashed history.	2023-05-05 16:25:12 -06:00
svr2.conf	Squashed history.	2023-05-05 16:25:12 -06:00
test_deps.sh	Valgrind automatically in github.	2024-01-26 10:39:36 -07:00

README.md

SVR2 Enclave Code

SVR2 uses C++ as its language for building an in-enclave binary. The details of the build process and the host-enclave interaction depend on the platform. Since SVR2 is deployed on Intel SGX, this document will describe SGX-specific implementation details.

For SGX, the enclave binary is built with the OpenEnclave (hereafter 'OE') SDK. The binary, enclave.bin, is then signed via OE's oesign. The signature itself doesn't matter to us, because we trust only the unique ID (SGX "mrenclave") of the resulting signed config, not the signature. However, the oesign process does one important thing: it binds a config (either svr2_test.conf or svr2.conf) to the resulting object. Once this process is complete, the resulting enclave.signed file is ready to be loaded into a DCAP-based SGX enclave.

Host/enclave communication

After initialization, all host/enclave communication happens through a single ocall/ecall combination, defined in ../shared/svr2.edl:

svr2_input_message: Enclave receives a message (a serialized HostToEnclaveMessage protobuf) from the host.
svr2_output_message: Enclave sends a message (a serialized EnclaveToHostMessage protobuf) to the host.

Certain messages are 'transactions': messages carrying a Request that expects a specific Reply. Note that when a request is passed in via a message, the associated response may not be part of the list of messages returned by that call. For example, the host may pass in a transaction request in one HostToEnclaveMessage but not get back the reply until a later EnclaveToHostMessage. Transactions carry transaction IDs, which disambiguate requests and their associated responses. Hosts may send transactions to enclaves, and enclaves to hosts. Each direction maintains its own keyspace for transaction IDs (so HostToEnclave transaction 1 and EnclaveToHost transaction 1 are distinct), and each sender is responsible for making sure that the transaction requests it passes are uniquely identified.
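The bookkeeping this implies for a sender can be sketched as follows. This is a minimal illustration, not the SVR2 implementation: the class PendingTransactions and its methods are hypothetical names, and requests are modeled as plain strings rather than protobufs. Each direction would own one instance, which is what keeps the two ID keyspaces separate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper (not part of the SVR2 source): tracks the transactions
// that one side has sent and is still waiting on. The host and the enclave
// would each own a separate instance, so their ID keyspaces never collide.
class PendingTransactions {
 public:
  // Records a new outgoing request and returns its transaction ID.
  // IDs are unique only within this instance (that is, within one direction).
  uint64_t Start(std::string request) {
    uint64_t id = next_id_++;
    pending_.emplace(id, std::move(request));
    return id;
  }

  // Matches an incoming reply to its request by ID. On success, stores the
  // original request in *request and returns true. Returns false if the ID
  // is unknown or its reply was already handled.
  bool Finish(uint64_t id, std::string* request) {
    auto it = pending_.find(id);
    if (it == pending_.end()) return false;
    assert(request != nullptr);
    *request = std::move(it->second);
    pending_.erase(it);
    return true;
  }

  // Number of requests still waiting for a reply.
  size_t Outstanding() const { return pending_.size(); }

 private:
  uint64_t next_id_ = 1;
  std::unordered_map<uint64_t, std::string> pending_;
};
```

Because replies are looked up by ID rather than by arrival position, a reply that shows up several messages later still finds its request.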

Code Layout

Code is broken into a set of modules, where each module is a one-level-deep subdirectory within the top-level enclave directory. Each module is independently compiled, then all modules are combined in a final linking step to form the resulting binary. Modules are listed as LIBRARIES within Makefile, and must form a DAG of dependencies. Within the LIBRARIES list, higher libraries may depend on lower libraries, but not vice versa.
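The ordering rule on LIBRARIES can be expressed as a small check. The sketch below is illustrative only: RespectsLibraryOrder is a hypothetical function, and the dependency map is an assumed input (the real build derives dependencies from the Makefiles, not from a structure like this). It shows why the rule yields a DAG: if every module depends only on modules listed below it, no cycle can exist.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical check (not part of the SVR2 build): returns true if every
// module depends only on modules that appear later (lower) in `libraries`.
// `deps` maps a module name to the names of the modules it uses.
bool RespectsLibraryOrder(
    const std::vector<std::string>& libraries,
    const std::map<std::string, std::vector<std::string>>& deps) {
  std::set<std::string> lower;  // Modules already visited, walking bottom-up.
  for (size_t i = libraries.size(); i-- > 0;) {
    const std::string& lib = libraries[i];
    auto it = deps.find(lib);
    if (it != deps.end()) {
      for (const std::string& dep : it->second) {
        // A dependency that is not strictly lower in the list is either an
        // upward dependency, a self-dependency, or an unlisted module.
        if (lower.count(dep) == 0) return false;
      }
    }
    lower.insert(lib);
  }
  return true;
}
```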

Code roughly follows the Google C++ Style Guide.

Concurrency in SVR2 Enclave

With SVR2, we aim to use a single replica group to serve all traffic, which raises scalability concerns. We can add new replicas to the replica group, but with a strong consensus model that relies on agreement among a quorum (in our case, a simple majority) of voting replicas, additional replicas have the potential to add load rather than shed it.

To handle this, SVR2 is built to use the resources of non-leader and non-voting replicas as much as possible. While we're unable to shed or reduce load on RAM with added members (each replica needs to store the entire database), we can shed load in the form of CPU and network resources.

Utilizing multiple cores

Even without considering excess replicas, we aim to utilize the resources of each replica to the fullest extent. To do this, the SVR2 enclave binary is built as a true multi-threaded process, with targeted locking of code subsections allowing parallel processing as much as possible.

One of the most CPU-intensive tasks that SVR2 performs is encryption and decryption. This takes place when replicas communicate with each other ("peer communication") and when they accept and service connections from clients ("client communication"). When establishing these secure connections, the initial handshake is the more CPU-intensive part, followed by less intensive block-cipher encryption/decryption. Peer communication uses long-lived sessions that amortize handshake cost over a long period of time, while client communication requires a new handshake, a small amount of communication, and a subsequent closing of the connection.

For both peer and client communication, we aim to be highly parallel on a single machine: handshaking and block-cipher encryption/decryption are done under client- and peer-level locks rather than global ones. This approach places some requirements on the host side, though, because in both cases reordering messages breaks the block-cipher assumptions of the clients/peers. Internally, SVR2 maintains the correct order of the messages it outputs to peers and clients: if message A to a peer or client happens before message B, then svr2_output_message(A) will be called and allowed to complete before svr2_output_message(B) is called. The host must take care to respect this ordering when it forwards messages externally, and must apply the same pattern to messages received from external hosts: if A is received before B in either a peer or client stream, then svr2_input_message(A) should be called and allowed to complete before svr2_input_message(B) is called.
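One way a host could enforce this per-stream ordering is a small reorder buffer in front of the enclave call. The sketch below is an assumption-laden illustration, not the host's actual code: OrderedStream is a hypothetical class, the `deliver` callback stands in for a call to svr2_input_message (whose real signature is not shown here), and it assumes the host tags each message with its arrival position in the stream.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical host-side helper (not part of SVR2): one instance per peer or
// client stream. Messages are handed to `deliver` strictly in arrival order,
// and each call completes before the next one starts.
class OrderedStream {
 public:
  // `deliver` stands in for forwarding a message via svr2_input_message.
  explicit OrderedStream(std::function<void(const std::string&)> deliver)
      : deliver_(std::move(deliver)) {
    assert(deliver_ != nullptr);
  }

  // `seq` is the message's arrival position in this stream (0, 1, 2, ...).
  // A message submitted early is held until every earlier one is delivered.
  void Submit(uint64_t seq, std::string msg) {
    std::lock_guard<std::mutex> lock(mu_);
    if (seq < next_) return;  // Already delivered; ignore the duplicate.
    held_.emplace(seq, std::move(msg));
    // Deliver the contiguous run starting at next_. The per-stream lock is
    // held across deliver_, so calls for this stream never overlap.
    for (auto it = held_.find(next_); it != held_.end();
         it = held_.find(next_)) {
      deliver_(it->second);
      held_.erase(it);
      ++next_;
    }
  }

 private:
  std::function<void(const std::string&)> deliver_;
  std::mutex mu_;
  uint64_t next_ = 0;  // Next arrival position to deliver.
  std::map<uint64_t, std::string> held_;
};
```

The lock is per stream, so different peers and clients can still be processed in parallel; only messages within one stream are serialized.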

Some global locks are still required, in particular around Raft and its associated logs/database. However, these locked sections are kept to a minimum, with as much work as possible done before or after the locks are acquired.
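The general shape of "do the work outside the lock" looks like the sketch below. It is a generic illustration, not SVR2's Raft or database code: SharedLog, Encode, and the length-prefix encoding are all hypothetical stand-ins for whatever expensive preparation precedes a shared-state update.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical shared log guarded by a single global lock (illustration
// only; SVR2's real structures are not shown here).
class SharedLog {
 public:
  // Appends a record and returns its index in the log.
  size_t Append(const std::string& payload) {
    // Before the lock: preparation that touches only local data.
    std::string encoded = Encode(payload);
    size_t index;
    {
      // Under the lock: only the mutation of shared state.
      std::lock_guard<std::mutex> lock(mu_);
      entries_.push_back(std::move(encoded));
      index = entries_.size() - 1;
    }
    // After the lock: any follow-up work (for example, notifying waiters)
    // would go here, using only the local `index`.
    return index;
  }

  // Returns a copy of the stored record at `index`.
  std::string Get(size_t index) const {
    std::lock_guard<std::mutex> lock(mu_);
    assert(index < entries_.size());
    return entries_[index];
  }

 private:
  // Stand-in for expensive serialization: a simple length prefix.
  static std::string Encode(const std::string& payload) {
    return std::to_string(payload.size()) + ":" + payload;
  }

  mutable std::mutex mu_;
  std::vector<std::string> entries_;
};
```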

Utilizing multiple machines

The primary means to scale SVR2 is the addition of replicas. However, as mentioned, this has the potential to hinder scaling, especially if the leader alone is allowed to perform CPU-intensive tasks like servicing client requests. For this reason, SVR2 is built to allow any replica to service requests from any client.

When a client connects to a non-leader SVR2 replica, that replica performs the client handshake and receives/decrypts the client's request entirely on its own. Once it has done so, it forwards the request to the current leader as an enclave-to-enclave transaction, receiving in response either a failure or a log location (an (index, term) pair) associated with the write. Failures are immediately returned to the client. A success, though, creates a watch-point in the non-leader replica's Raft log at index. The replica waits until index is a committed part of its own log (via normal Raft AppendEntries mechanisms), then checks the term of the committed entry. If that matches the term returned from its write request, then by definition the entry at index contains the client's request, and when the replica applies that request to its local database, it can safely return the response to the client over its still-open channel.
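The watch-point step can be sketched as a map from log index to waiting callbacks. This is a simplified illustration under stated assumptions, not the SVR2 implementation: LogLocation, CommitWatcher, Watch, and OnCommit are hypothetical names, and the callback only reports whether the committed term matched. What the replica does on a mismatch is not described above, so the sketch leaves that to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// The (index, term) pair the leader returns for an accepted write.
struct LogLocation {
  uint64_t index;
  uint64_t term;
};

// Hypothetical helper (not part of the SVR2 source): holds watch-points on a
// replica's own log and fires them as entries commit.
class CommitWatcher {
 public:
  // Called with true if the committed entry's term equals the watched term,
  // meaning the entry at that index is the forwarded client request.
  using Callback = std::function<void(bool term_matches)>;

  // Registers a watch-point at the location returned by the leader.
  void Watch(LogLocation location, Callback callback) {
    assert(callback != nullptr);
    watches_.emplace(location.index,
                     Entry{location.term, std::move(callback)});
  }

  // Called when this replica's log commits the entry at `index`.
  void OnCommit(uint64_t index, uint64_t committed_term) {
    auto range = watches_.equal_range(index);
    std::vector<Entry> ready;
    for (auto it = range.first; it != range.second; ++it) {
      ready.push_back(std::move(it->second));
    }
    // Remove the watch-points before running callbacks, so a callback may
    // safely register new ones.
    watches_.erase(range.first, range.second);
    for (Entry& entry : ready) {
      entry.callback(entry.term == committed_term);
    }
  }

  // Number of watch-points still waiting for a commit.
  size_t Pending() const { return watches_.size(); }

 private:
  struct Entry {
    uint64_t term;
    Callback callback;
  };
  std::multimap<uint64_t, Entry> watches_;
};
```

In this sketch, a callback invoked with true is the point at which the replica would apply the request locally and send the response back over the client's open channel.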

By this mechanism, load (especially client handshake and communication load) can be shared across all replicas. Crucially, this includes non-voting replicas, which can be added with minimal increase to the load on the voting replicas. As non-voting replicas still receive Raft logs and their commitments, they can happily service client requests.