📝 docs: plan remaining tasks on scraper API
parent 4c52d88be0
commit 78bffc74c3
70 README.md
|
@@ -4,48 +4,19 @@ An HTTP server that can run behind a firewall by connecting out to a relay.
|
||||||
|
|
||||||
```
|
```
|
||||||
Outside the tunnel
|
Outside the tunnel
|
||||||
+--------+ +------------+ +-------------+
|
+--------+ +------------+ +-------------+
|
||||||
| Client | ------> | PTTH relay | <----- | PTTH server |
|
| Client | >>> | PTTH relay | <<< | PTTH server |
|
||||||
+--------+ +------------+ +-------------+
|
+--------+ +------------+ +-------------+
|
||||||
|
|
||||||
Inside the tunnel
|
Inside the tunnel
|
||||||
+--------+ -------------- +-------------+
|
+--------+ -------------- +-------------+
|
||||||
| Client | ----------------------------> | Server |
|
| Client | >>> >>> >>> | Server |
|
||||||
+--------+ -------------- +-------------+
|
+--------+ -------------- +-------------+
|
||||||
```
|
```
|
||||||
|
|
||||||
The server can run behind a firewall, because it is actually a special HTTP
|
The server can run behind a firewall, because it is actually a special HTTP
|
||||||
client.
|
client.
|
||||||
|
|
||||||
## Glossary
|
|
||||||
|
|
||||||
(sorted alphabetically)
|
|
||||||
|
|
||||||
- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
|
|
||||||
Noted in the code with the cookie "7ZSFUKGV".
|
|
||||||
- **Client** - Any client that connects to ptth_relay in order to reach a
|
|
||||||
destination server. Admins must terminate TLS between
|
|
||||||
ptth_relay and all clients.
|
|
||||||
- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface
|
|
||||||
that ptth_relay serves directly or relays from ptth_server.
|
|
||||||
This interface has no auth by default. Admins must provide their own auth
|
|
||||||
in front of ptth_relay. OAuth2 is recommended.
|
|
||||||
- **ptth_file_server** - A standalone file server. It uses the same code
|
|
||||||
as ptth_server, so production environments don't need it.
|
|
||||||
- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
|
|
||||||
that can accept incoming HTTP connections.
|
|
||||||
- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
|
|
||||||
a firewall. It will connect out to the relay and accept incoming connections
|
|
||||||
through the PTTH tunnel.
|
|
||||||
- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
|
|
||||||
the destination servers using machine-friendly auth.
|
|
||||||
- **Tripcode** - The base64 hash of a server's private API key. When adding
|
|
||||||
a new server, the tripcode must be copied to ptth_relay.toml on the relay
|
|
||||||
server.
|
|
||||||
- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
|
|
||||||
ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
|
|
||||||
connections to ptth_server through the tunnel.
|
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
ptth_server:
|
ptth_server:
|
||||||
|
@@ -109,6 +80,35 @@ proxy_request_buffering off;
|
||||||
proxy_buffering off;
|
proxy_buffering off;
|
||||||
```
|
```
|
||||||
|
|
||||||
## Glossary

(sorted alphabetically)

- **Backend API** - The HTTP API that ptth_server uses to establish the tunnel.
Noted in the code with the cookie "7ZSFUKGV".
- **Client** - Any client that connects to ptth_relay in order to reach a
destination server. Admins must terminate TLS between
ptth_relay and all clients.
- **Frontend** - The human-friendly, browser-friendly HTTP+HTML interface
that ptth_relay serves directly or relays from ptth_server.
This interface has no auth by default. Admins must provide their own auth
in front of ptth_relay. OAuth2 is recommended.
- **ptth_file_server** - A standalone file server. It uses the same code
as ptth_server, so production environments don't need it.
- **ptth_relay** or **Relay server** - The ptth_relay app. This must run on a server
that can accept incoming HTTP connections.
- **ptth_server** or **Destination server** - The ptth_server app. This should run behind
a firewall. It will connect out to the relay and accept incoming connections
through the PTTH tunnel.
- **Scraper API** - An optional HTTP API for scraper clients to access ptth_relay and
the destination servers using machine-friendly auth.
- **Tripcode** - The base64 hash of a server's private API key. When adding
a new server, the tripcode must be copied to ptth_relay.toml on the relay
server.
- **Tunnel** - The reverse HTTP tunnel between ptth_relay and ptth_server.
ptth_server connects out to ptth_relay, then ptth_relay forwards incoming
connections to ptth_server through the tunnel.
|
||||||
|
|
||||||
## Comparison with normal HTTP
|
## Comparison with normal HTTP
|
||||||
|
|
||||||
Normal HTTP:
|
Normal HTTP:
|
||||||
|
|
|
@@ -38,6 +38,7 @@ stronger is ready.
|
||||||
- (X) (POC) Test with curl
|
- (X) (POC) Test with curl
|
||||||
- (X) Clean up scraper endpoint
|
- (X) Clean up scraper endpoint
|
||||||
- (X) Add (almost) end-to-end tests for scraper endpoint
|
- (X) Add (almost) end-to-end tests for scraper endpoint
|
||||||
|
- ( ) Add real scraper endpoints
|
||||||
- ( ) Manually create SQLite DB for scraper keys, add 1 hash
|
- ( ) Manually create SQLite DB for scraper keys, add 1 hash
|
||||||
- ( ) Impl DB reads
|
- ( ) Impl DB reads
|
||||||
- ( ) Remove scraper key from config file
|
- ( ) Remove scraper key from config file
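
The scraper-key tasks listed here (create the SQLite DB by hand, add one hash, implement DB reads) could be prototyped roughly as follows. This is only a sketch: the table name `scraper_keys`, its columns, and the choice of base64-encoded SHA-256 are assumptions made for illustration, since the plan does not specify a schema or a hash function.

```python
import base64
import hashlib
import sqlite3


def hash_key(api_key: str) -> str:
    # Assumption: SHA-256 encoded as base64. The plan does not name the hash.
    digest = hashlib.sha256(api_key.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def create_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per scraper key, storing only the hash.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scraper_keys ("
        " hash TEXT PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.commit()


def add_key(conn: sqlite3.Connection, name: str, api_key: str) -> None:
    # Only the hash is written, so the plaintext key never lands in the DB.
    conn.execute(
        "INSERT INTO scraper_keys (hash, name) VALUES (?, ?)",
        (hash_key(api_key), name),
    )
    conn.commit()


def key_is_valid(conn: sqlite3.Connection, api_key: str) -> bool:
    # The "DB read": hash the presented key and look for a matching row.
    row = conn.execute(
        "SELECT 1 FROM scraper_keys WHERE hash = ?",
        (hash_key(api_key),),
    ).fetchone()
    return row is not None
```

Storing only the hash means the config-file key can later be removed without the relay ever needing the plaintext key on disk.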
|
||||||
|
@@ -66,6 +67,16 @@ Design the DB so that the servers can share it one day.
|
||||||
Design the API so that new types of auth / keys can be added one day, and
|
Design the API so that new types of auth / keys can be added one day, and
|
||||||
the old ones deprecated.
|
the old ones deprecated.
|
||||||
|
|
||||||
Endpoints needed:

- Query server list
- Query directory in server
- GET file with byte range (identical to frontend file API)

These will all be JSON for now since Python, Rust, C++, C#, etc. can handle it.
For compatibility with wget spidering, I _might_ do XML or HTML that's
machine-readable. We'll see.
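
To make the three endpoints concrete, here is a rough client-side sketch in Python. None of the specifics come from the plan: the base URL, the paths, the `X-ApiKey` header name, and the JSON shape `{"servers": [{"name": ...}]}` are all hypothetical placeholders. The only standard piece is the `Range` header, whose byte offsets are inclusive at both ends.

```python
import json
from urllib.parse import quote

# Hypothetical values: the plan names the endpoints but not their URLs or auth header.
API_ROOT = "https://relay.example.com/scraper/v1"
API_KEY_HEADER = "X-ApiKey"


def server_list_request(api_key: str) -> tuple[str, dict[str, str]]:
    """Build the (url, headers) pair for 'Query server list'."""
    return f"{API_ROOT}/server_list", {API_KEY_HEADER: api_key}


def dir_request(api_key: str, server: str, path: str) -> tuple[str, dict[str, str]]:
    """Build the (url, headers) pair for 'Query directory in server'."""
    url = f"{API_ROOT}/server/{quote(server)}/dir/{quote(path)}"
    return url, {API_KEY_HEADER: api_key}


def file_range_request(
    api_key: str, server: str, path: str, start: int, end: int
) -> tuple[str, dict[str, str]]:
    """Build the (url, headers) pair for 'GET file with byte range'."""
    url = f"{API_ROOT}/server/{quote(server)}/files/{quote(path)}"
    # Standard HTTP Range header: both offsets are inclusive.
    headers = {API_KEY_HEADER: api_key, "Range": f"bytes={start}-{end}"}
    return url, headers


def parse_server_list(body: bytes) -> list[str]:
    """Extract server names from the assumed JSON response shape."""
    doc = json.loads(body)
    return [server["name"] for server in doc["servers"]]
```

The returned pairs can be passed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.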
|
||||||
|
|
||||||
## Open questions
|
## Open questions
|
||||||
|
|
||||||
**Who generates the API key? The scraper client, or the PTTH relay server?**
|
**Who generates the API key? The scraper client, or the PTTH relay server?**
|
||||||
|
|