# End-to-end Testing

There are 5 processes, so you'll need 5 terminal windows or screen / tmux
sessions. Run the processes in this order:

1. QUIC relay server: `RUST_LOG=ptth_quic_relay_server=debug cargo run --bin ptth_quic_relay_server`
2. Server-side proxy: `RUST_LOG=ptth_quic_end_server=debug cargo run --bin ptth_quic_end_server`
3. Client-side proxy: `RUST_LOG=ptth_quic_client=debug cargo run --bin ptth_quic_client`
4. TCP end server: `nc -l -p 30382`
5. TCP end client: `nc 127.0.0.1 30381`

The netcat processes from steps 4 and 5 should now be connected to each other.

# Testing PTTH itself

The end-to-end testing above is the happy path. Try these sadder cases:

- Swap Steps 2 and 3
- After Step 2, restart the server proxy P4
- After Step 3, restart the client proxy P2
- After Step 5, close P1. P2 and P4 should stay up
- After Step 2, restart the relay server P3
- After Step 3, restart the relay server P3

# Network protocol

(This section is out of date. It describes an older version of the protocol.)

For the prototype, all control messages are fixed 4-byte messages at the
start of bidirectional streams.

Unused bytes are always "0".

Messages are sent in request-response pairs, like HTTP.

Requests look like this:

1. Command type
2. Extra data, or unused, depending on command type
3. Unused
4. Unused

Responses look like this:

1. Status code (Always "20" for now, meaning "OK")
2. Command type of the request that we're responding to
3. Unused
4. Unused

The command types are:

| Type | Sender | Receiver    | Meaning                                            |
| ---- | ------ | ----------- | -------------------------------------------------- |
| 2    | P2     | P3          | Client proxy wants to connect to the relay         |
| 4    | P4     | P3          | Server proxy wants to connect to the relay         |
| 10   | P2     | P3          | Client wants relay to connect it to a server       |
| 11   | P3     | P4          | Relay tells server that a client wants to connect  |
| 12   | P2     | P4 (via P3) | Client wants a port forwarded from the server      |
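
With the caveat that this section describes an older version, here is a minimal
Rust sketch of the 4-byte layout described above. The enum and function names
and the `extra` parameter are hypothetical; only the numeric command types and
the "20" status code come from this section.

```rust
// Hypothetical sketch of the fixed 4-byte control messages described above.
// Only the numeric command types and the status value 20 come from the text;
// everything else is illustrative.

/// Command types from the table above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Command {
    ClientProxyConnect = 2,      // P2 -> P3
    ServerProxyConnect = 4,      // P4 -> P3
    ClientWantsServer = 10,      // P2 -> P3
    RelayTellsServer = 11,       // P3 -> P4
    ClientWantsPortForward = 12, // P2 -> P4 (via P3)
}

/// Requests are [command type, extra data, unused, unused].
fn encode_request(command: Command, extra: u8) -> [u8; 4] {
    [command as u8, extra, 0, 0]
}

/// Responses are [status, echoed command type, unused, unused].
/// The status is always 20 ("OK") for now.
fn encode_response(command: Command) -> [u8; 4] {
    [20, command as u8, 0, 0]
}

/// A response is accepted if the status is "OK" and it echoes the command
/// type of the request we sent.
fn response_is_ok(buf: [u8; 4], sent: Command) -> bool {
    buf[0] == 20 && buf[1] == sent as u8
}
```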

# Performance

Running all 5 processes on my desktop and measuring with `pv`, I can hit
10 MiB/s for bulk transfers with debug builds, and 100 MiB/s with release builds.

In production, P2, P3, and P4 would be running on separate systems, and the
bottleneck would likely be the Internet connection between P2 and P3 or between
P3 and P4.

# Plan

This is a TCP port forwarding system, like SSH has, but better.

We'll name the processes in order from clientest to serverest:

1. TCP end client (e.g. VNC viewer, SSH client)
2. Client-side proxy (Part of PTTH, must run as a desktop app on end user systems)
3. Relay server (Part of ptth_relay)
4. Server-side proxy (Part of ptth_server)
5. TCP end server (e.g. VNC server, SSH server)

At the highest level, creating a connection means:

1. Processes 2, 3, and 4 are running and connected by QUIC.
2. Process 5 is running and listening on its TCP port.
   (e.g. 22 for SSH)
3. Process 2 sets a port mapping to the server.
   (e.g. 2200 on the client is mapped to 22 on the server)
4. Process 1 connects to the client-side port.
   (e.g. ssh to 2200 on localhost)
5. Processes 2, 3, and 4 forward that connection to Process 5.
   Any of the other processes can reject the connection for various reasons.
6. The connection is established, and Processes 2, 3, and 4 relay bytes
   for Processes 1 and 5.

What identifies a connection?

1. The client ID, because a connection is never shared between 2 clients.
2. The ephemeral port used by P1, because 2 SSH clients may connect to the same
   forwarded port in P2 at once.
3. The server ID, because a connection is never shared between 2 servers.
4. The listen port of P5, for the sake of matching TCP theory, and because
   this data is needed. I'm not sure if it's technically part of the identifier.

Because clients can't see each other, each client will store these data
about each of its connections:

1. The ephemeral port used by P1
2. The server ID
3. The listen port for P5

Because servers can't see each other, each server stores these data:

1. P1's port
2. The client ID
3. The listen port for P5

The relay server must store all 4 data for each connection.
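
As a concrete (and hypothetical) Rust sketch of that bookkeeping, with all type
and field names invented for illustration:

```rust
// Hypothetical sketch of the per-connection records described above.
// Field names are illustrative; the four data themselves come from the text.

/// What the relay server (P3) must store for each connection: all 4 data.
struct RelayConnection {
    client_id: String, // which client proxy (P2)
    p1_port: u16,      // the ephemeral port used by P1
    server_id: String, // which server proxy (P4)
    p5_port: u16,      // the listen port of P5
}

/// What a client proxy (P2) stores; it already knows its own client ID.
struct ClientConnection {
    p1_port: u16,      // the ephemeral port used by P1
    server_id: String, // the server ID
    p5_port: u16,      // the listen port for P5
}

/// What a server proxy (P4) stores; it already knows its own server ID.
struct ServerConnection {
    p1_port: u16,      // P1's port
    client_id: String, // the client ID
    p5_port: u16,      // the listen port for P5
}
```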

Each client and each server has exactly one QUIC connection to the relay
server.

In P2 and P4, each PTTH connection has a TCP stream and a QUIC bidirectional
(bi) stream.

In P3, each PTTH connection has 2 QUIC bi streams.

Assume that some kind of framing exists.

Once a port mapping is established (within P2 only), a connection proceeds
like this:

1. P1 connects to P2
2. P2 starts a QUIC bi stream to P3. This has a tip message of
   (P1 port, P4 ID, P5 port).
3. If P2 isn't authorized to make the connection, P3 closes it
4. If P4 isn't connected to P3, the connection fails
5. P3 starts a QUIC bi stream to P4. This has a tip message of
   (P1 port, P2 ID, P5 port).
6. If P4 doesn't approve, the connection closes
7. If P4 can't connect to P5, the connection closes
8. P4 connects to P5
9. P4 returns a tip message to P3, equivalent to "200 OK"
10. P3 returns the tip message to P2
11. The connection goes into relaying mode
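
As a rough illustration of Steps 1, 2, and 9 through 11 from P2's side, here is
a hypothetical Rust sketch. It assumes the `quinn` and `tokio` crates and
invents `TipMessage` and its framing; the real code may be structured quite
differently.

```rust
// Hypothetical sketch of P2 handling one connection from P1, assuming
// quinn + tokio. TipMessage and encode_tip are placeholders for whatever
// framing is actually used.

use tokio::net::TcpStream;

/// The tip message from Step 2: (P1 port, P4 ID, P5 port).
struct TipMessage {
    p1_port: u16,
    p4_id: String,
    p5_port: u16,
}

async fn handle_p1_connection(
    relay: quinn::Connection, // P2's single QUIC connection to P3
    tcp: TcpStream,           // the TCP connection P1 just made (Step 1)
    tip: TipMessage,          // which server and port we want
) -> Result<(), Box<dyn std::error::Error>> {
    // Step 2: open a QUIC bi stream to P3 and send the tip message.
    let (mut quic_send, mut quic_recv) = relay.open_bi().await?;
    quic_send.write_all(&encode_tip(&tip)).await?;

    // Steps 9-10: wait for the "200 OK"-equivalent tip message to come back.
    // Placeholder: read whatever fixed-size "OK" message the protocol defines.
    let mut ok = [0u8; 4];
    quic_recv.read_exact(&mut ok).await?;

    // Step 11: relaying mode. Copy bytes both ways until either side closes.
    let (mut tcp_read, mut tcp_write) = tcp.into_split();
    tokio::try_join!(
        tokio::io::copy(&mut tcp_read, &mut quic_send),
        tokio::io::copy(&mut quic_recv, &mut tcp_write),
    )?;
    Ok(())
}

fn encode_tip(tip: &TipMessage) -> Vec<u8> {
    // Placeholder framing; "Assume that some kind of framing exists."
    let mut buf = Vec::new();
    buf.extend_from_slice(&tip.p1_port.to_be_bytes());
    buf.extend_from_slice(&tip.p5_port.to_be_bytes());
    buf.extend_from_slice(tip.p4_id.as_bytes());
    buf
}
```

Rejection (Steps 3, 4, 6, and 7) shows up here only as the error paths of the
awaits on the QUIC stream.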

Here are the big things I'm thinking:

- It's kinda like Tor, in that there are 3 middle nodes and onion layers
- It's kinda like HTTP 1, except after the server responds, you turn the
  connection into an arbitrary TCP connection. Maybe that's what the CONNECT
  verb does?
- I will probably need a control channel some day, but not today

So I don't need a general control stream, which luckily means I don't need
to worry about synchronizing between streams. The tip messages are enough
to get things going.

Would the protocol be cleaner if the tip messages were explicitly layered?
I think so. As long as we aren't end-to-end encrypted (AND WE ARE NOT), this
gives P3, the relay server, a chance to be clever if we ever need it.

So reviewing the above steps:

In Step 2 the tip message says, "Connect me to P4 with ID 'bogus' and forward
this message to them".

Between Steps 2 and 5, P3 re-wraps the inner message. P3 says to P4,
"P2 with ID 'bogus' wants to connect, and forwards this message".

P4 receives that inner tip message in Step 5, and it says "Connect my port 2200
to your port 22."

There's no need for the actual port numbers to be used; they could be tokenized.
But for a prototype, I think it's fine.

Since all parties trust P3, this makes it clearer that P2 and P4 may not
directly trust each other. P3 vouches for the identity of each. The response
from P4 is symmetrical with the request from P2.
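
Here is a hypothetical Rust sketch of that layering, purely to illustrate the
re-wrapping described above (all names are invented):

```rust
// Hypothetical sketch of explicitly layered tip messages. The outer message
// is addressed to P3, which re-wraps the inner part and forwards it to P4.

/// What P2 sends P3 in Step 2:
/// "Connect me to P4 with ID 'bogus' and forward this message to them."
struct ClientToRelay {
    p4_id: String,      // which server proxy we want
    inner: PortForward, // forwarded to P4 unchanged
}

/// What P3 sends P4 in Step 5, after re-wrapping:
/// "P2 with ID 'bogus' wants to connect, and forwards this message."
struct RelayToServer {
    p2_id: String,      // vouched for by P3
    inner: PortForward, // the forwarded inner message
}

/// The inner message only P4 really needs:
/// "Connect my port 2200 to your port 22."
struct PortForward {
    p1_port: u16, // client-side port, e.g. 2200
    p5_port: u16, // server-side port, e.g. 22
}
```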