📝 docs: plan remaining tasks on scraper API
parent 4c52d88be0
commit 78bffc74c3
70
README.md
@@ -4,48 +4,19 @@ An HTTP server that can run behind a firewall by connecting out to a relay.

 ```
 Outside the tunnel
-+--------+ +------------+ +-------------+
-| Client | ------> | PTTH relay | <----- | PTTH server |
-+--------+ +------------+ +-------------+
++--------+ +------------+ +-------------+
+| Client | >>> | PTTH relay | <<< | PTTH server |
++--------+ +------------+ +-------------+

 Inside the tunnel
-+--------+ -------------- +-------------+
-| Client | ----------------------------> | Server |
-+--------+ -------------- +-------------+
++--------+ -------------- +-------------+
+| Client | >>> >>> >>> | Server |
++--------+ -------------- +-------------+
 ```

 The server can run behind a firewall, because it is actually a special HTTP
 client.

-## Glossary
-
-(sorted alphabetically)
-
-- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
-Noted in the code with the cookie "7ZSFUKGV".
-- **Client** - Any client that connects to ptth_relay in order to reach a
-destination server. Admins must terminate TLS between
-ptth_relay and all clients.
-- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface
-that ptth_relay serves directly or relays from ptth_server.
-This interface has no auth by default. Admins must provide their own auth
-in front of ptth_relay. OAuth2 is recommended.
-- **ptth_file_server** - A standalone file server. It uses the same code
-as ptth_server, so production environments don't need it.
-- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
-that can accept incoming HTTP connections.
-- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
-a firewall. It will connect out to the relay and accept incoming connections
-through the PTTH tunnel.
-- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
-the destination servers using machine-friendly auth.
-- **Tripcode** - The base64 hash of a server's private API key. When adding
-a new server, the tripcode must be copied to ptth_relay.toml on the relay
-server.
-- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
-ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
-connections to ptth_server through the tunnel.
-
 ## Configuration

 ptth_server:
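The Tripcode entry in the glossary above describes a base64 hash of a server's private API key. A minimal sketch of that idea follows. It assumes SHA-256 as a stand-in hash, because the README does not name the algorithm, and the function names and example key are hypothetical rather than taken from the PTTH code.

```python
import base64
import hashlib
import hmac


def tripcode(api_key: str) -> str:
    """Return the base64-encoded hash of an API key.

    SHA-256 is an assumption made for this sketch; the README does not
    say which hash function PTTH uses.
    """
    digest = hashlib.sha256(api_key.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def key_matches(presented_key: str, stored_tripcode: str) -> bool:
    """Compare a presented key against a stored tripcode in constant time."""
    return hmac.compare_digest(tripcode(presented_key), stored_tripcode)


if __name__ == "__main__":
    # Hypothetical key; a real deployment would generate a random one.
    stored = tripcode("example-private-key")
    print(key_matches("example-private-key", stored))
    print(key_matches("wrong-key", stored))
```

Under this scheme only the tripcode would be copied to ptth_relay.toml, so the relay never needs to hold the private key itself.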
@@ -109,6 +80,35 @@ proxy_request_buffering off;
 proxy_buffering off;
 ```

+## Glossary
+
+(sorted alphabetically)
+
+- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
+Noted in the code with the cookie "7ZSFUKGV".
+- **Client** - Any client that connects to ptth_relay in order to reach a
+destination server. Admins must terminate TLS between
+ptth_relay and all clients.
+- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface
+that ptth_relay serves directly or relays from ptth_server.
+This interface has no auth by default. Admins must provide their own auth
+in front of ptth_relay. OAuth2 is recommended.
+- **ptth_file_server** - A standalone file server. It uses the same code
+as ptth_server, so production environments don't need it.
+- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
+that can accept incoming HTTP connections.
+- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
+a firewall. It will connect out to the relay and accept incoming connections
+through the PTTH tunnel.
+- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
+the destination servers using machine-friendly auth.
+- **Tripcode** - The base64 hash of a server's private API key. When adding
+a new server, the tripcode must be copied to ptth_relay.toml on the relay
+server.
+- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
+ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
+connections to ptth_server through the tunnel.
+
 ## Comparison with normal HTTP

 Normal HTTP:
@@ -38,6 +38,7 @@ stronger is ready.
 - (X) (POC) Test with curl
 - (X) Clean up scraper endpoint
 - (X) Add (almost) end-to-end tests for scraper endpoint
+- ( ) Add real scraper endpoints
 - ( ) Manually create SQLite DB for scraper keys, add 1 hash
 - ( ) Impl DB reads
 - ( ) Remove scraper key from config file
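The unchecked items above move scraper keys out of the config file and into SQLite. One way that could look is sketched below. The table name, column names, and the SHA-256 plus base64 hashing are all assumptions made for illustration, and an in-memory database stands in for a real file.

```python
import base64
import hashlib
import sqlite3


def hash_key(api_key: str) -> str:
    # Assumed scheme: base64 of SHA-256. The source does not name the hash.
    digest = hashlib.sha256(api_key.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def create_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per scraper key, storing only the hash.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scraper_keys ("
        " hash TEXT PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )


def add_key(conn: sqlite3.Connection, name: str, api_key: str) -> None:
    # Store the hash, never the plain key.
    conn.execute(
        "INSERT INTO scraper_keys (hash, name) VALUES (?, ?)",
        (hash_key(api_key), name),
    )
    conn.commit()


def is_valid_key(conn: sqlite3.Connection, api_key: str) -> bool:
    # A key is valid if its hash is present in the table.
    row = conn.execute(
        "SELECT 1 FROM scraper_keys WHERE hash = ?",
        (hash_key(api_key),),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_db(conn)
    add_key(conn, "example scraper", "example-scraper-key")
    print(is_valid_key(conn, "example-scraper-key"))
```

Keeping only hashes in the table means a leaked database file would not directly expose usable keys.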
@@ -66,6 +67,16 @@ Design the DB so that the servers can share it one day.
 Design the API so that new types of auth / keys can be added one day, and
 the old ones deprecated.

+Endpoints needed:
+
+- Query server list
+- Query directory in server
+- GET file with byte range (identical to frontend file API)
+
+These will all be JSON for now since Python, Rust, C++, C#, etc. can handle it.
+For compatibility with wget spidering, I _might_ do XML or HTML that's
+machine-readable. We'll see.
+
 ## Open questions

 **Who generates the API key? The scraper client, or the PTTH relay server?**
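The three endpoints planned in the hunk above could be exercised from a scraper client roughly as follows. This sketch only builds the requests and does not send them. The relay address, the URL paths, and the `X-ApiKey` header name are hypothetical, since the plan does not fix them yet; the `Range` header is standard HTTP.

```python
import urllib.request
from urllib.parse import quote

RELAY = "https://relay.example.com"  # hypothetical relay address
API_PREFIX = "/scraper/v1"           # hypothetical path prefix


def _request(path, api_key, extra_headers=None):
    """Build (but do not send) an authenticated GET request."""
    headers = {"X-ApiKey": api_key}  # hypothetical auth header name
    headers.update(extra_headers or {})
    return urllib.request.Request(
        RELAY + API_PREFIX + path, headers=headers, method="GET"
    )


def server_list(api_key):
    # "Query server list"
    return _request("/server_list", api_key)


def directory(api_key, server, path):
    # "Query directory in server"
    return _request(f"/server/{quote(server)}/files/{quote(path)}", api_key)


def file_range(api_key, server, path, start, end):
    # "GET file with byte range"; HTTP byte ranges are inclusive at both ends.
    return _request(
        f"/server/{quote(server)}/files/{quote(path)}",
        api_key,
        {"Range": f"bytes={start}-{end}"},
    )


if __name__ == "__main__":
    req = file_range("example-key", "my server", "docs/a.txt", 0, 1023)
    print(req.full_url)
    print(req.get_header("Range"))
```

Sending the requests with `urllib.request.urlopen` and decoding the JSON replies is left out because the response schema is not defined yet.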