📝 docs: plan remaining tasks on scraper API

2020-12-13 05:04:04 +00:00 · 78bffc74c3
parent 4c52d88be0
commit 78bffc74c3
2 changed files with 46 additions and 35 deletions
--- a/README.md
+++ b/README.md
@@ -4,48 +4,19 @@ An HTTP server that can run behind a firewall by connecting out to a relay.

 ```
 Outside the tunnel
-+--------+         +------------+        +-------------+
-| Client | ------> | PTTH relay | <----- | PTTH server |
-+--------+         +------------+        +-------------+
++--------+         +------------+         +-------------+
+| Client |   >>>   | PTTH relay |   <<<   | PTTH server |
++--------+         +------------+         +-------------+

 Inside the tunnel
-+--------+         --------------        +-------------+
-| Client | ----------------------------> |   Server    |
-+--------+         --------------        +-------------+
++--------+         --------------         +-------------+
+| Client |   >>>        >>>         >>>   |   Server    |
++--------+         --------------         +-------------+
 ```

 The server can run behind a firewall, because it is actually a special HTTP
 client.

-## Glossary
-
-(sorted alphabetically)
-
-- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
-Noted in the code with the cookie "7ZSFUKGV".
-- **Client** - Any client that connects to ptth_relay in order to reach a
-destination server. Admins must terminate TLS between 
-ptth_relay and all clients.
-- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface 
-that ptth_relay serves directly or relays from ptth_server. 
-This interface has no auth by default. Admins must provide their own auth 
-in front of ptth_relay. OAuth2 is recommended.
-- **ptth_file_server** - A standalone file server. It uses the same code
-as ptth_server, so production environments don't need it.
-- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
-that can accept incoming HTTP connections.
-- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
-a firewall. It will connect out to the relay and accept incoming connections
-through the PTTH tunnel.
-- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
-the destination servers using machine-friendly auth.
-- **Tripcode** - The base64 hash of a server's private API key. When adding
-a new server, the tripcode must be copied to ptth_relay.toml on the relay
-server.
-- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
-ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
-connections to ptth_server through the tunnel.
-
 ## Configuration

 ptth_server:
@@ -109,6 +80,35 @@ proxy_request_buffering off;
 proxy_buffering off;
 ```

+## Glossary
+
+(sorted alphabetically)
+
+- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
+Noted in the code with the cookie "7ZSFUKGV".
+- **Client** - Any client that connects to ptth_relay in order to reach a
+destination server. Admins must terminate TLS between 
+ptth_relay and all clients.
+- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface 
+that ptth_relay serves directly or relays from ptth_server. 
+This interface has no auth by default. Admins must provide their own auth 
+in front of ptth_relay. OAuth2 is recommended.
+- **ptth_file_server** - A standalone file server. It uses the same code
+as ptth_server, so production environments don't need it.
+- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
+that can accept incoming HTTP connections.
+- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
+a firewall. It will connect out to the relay and accept incoming connections
+through the PTTH tunnel.
+- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
+the destination servers using machine-friendly auth.
+- **Tripcode** - The base64 hash of a server's private API key. When adding
+a new server, the tripcode must be copied to ptth_relay.toml on the relay
+server.
+- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
+ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
+connections to ptth_server through the tunnel.
+
 ## Comparison with normal HTTP

 Normal HTTP:
--- a/issues/2020-12Dec/auth-route-YNQAQKJS.md
+++ b/issues/2020-12Dec/auth-route-YNQAQKJS.md
@@ -38,6 +38,7 @@ stronger is ready.
 - (X) (POC) Test with curl
 - (X) Clean up scraper endpoint
 - (X) Add (almost) end-to-end tests for scraper endpoint
+- ( ) Add real scraper endpoints
 - ( ) Manually create SQLite DB for scraper keys, add 1 hash
 - ( ) Impl DB reads
 - ( ) Remove scraper key from config file
@@ -66,6 +67,16 @@ Design the DB so that the servers can share it one day.
 Design the API so that new types of auth / keys can be added one day, and
 the old ones deprecated.

+Endpoints needed:
+
+- Query server list
+- Query directory in server
+- GET file with byte range (identical to frontend file API)
+
+These will all be JSON for now since Python, Rust, C++, C#, etc. can handle it.
+For compatibility with wget spidering, I _might_ do XML or HTML that's
+machine-readable. We'll see.
+
 ## Open questions

 **Who generates the API key? The scraper client, or the PTTH relay server?**