curl/docs/internals/PEERS.md
Stefan Eissing 73daec6620
lib: transfer origin and proxy handling
Add `data->state.origin` as the origin the transfer is sending the
current request to/gets the response from. Use it for request specific
properties like authentication, hsts and cookie handling, etc.

Unless talking to a forwarding HTTP proxy (i.e. not tunneling),
`data->state.origin` and `conn->origin` are the same.

With a forwarding HTTP proxy in play, `conn->origin` is set to
`conn->http_proxy.peer` and `conn->bits.origin_is_proxy` (a new bit) is
set.

Remove the connection bits, now replaced with:

* `conn->bits.socksproxy` -> `conn->socks_proxy.peer`
* `conn->bits.httpproxy` -> `conn->http_proxy.peer`
* `conn->bits.proxy` -> `(conn->socks_proxy.peer || conn->http_proxy.peer)`
* `conn->bits.tunnel_proxy` -> `(conn->http_proxy.peer && !conn->bits.origin_is_proxy)`
* `(conn->bits.httpproxy && !conn->bits.tunnel_proxy)` -> `conn->bits.origin_is_proxy`

Rename `noproxy.[ch]` to `proxy.[ch]`. Move the connection proxy setup
code from `url.c` to `proxy.c`.

Remove `data->info.conn_remote_port` as no one uses it.

Add test_40_02b for a SOCKS connection to a forwarding HTTPS proxy.

Update internal documentation about peers and creds.

Closes #21967
2026-06-12 23:52:00 +02:00


<!--
Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
-->
# curl peers
A `peer` in curl internals is represented by a `struct Curl_peer`. It has the following members:
* `scheme`: a `struct Curl_scheme` of the URL schemes known to curl
* `user_hostname`: the hostname as supplied by the user/application
* `hostname`: a *normalized* version of `user_hostname`
* `port`: the network port
* `ipv6`: if `hostname` is an IPv6 address
* `unix_socket`: if `hostname` is a path to a `unix domain socket`
* `user_ipv6zone`: user supplied IPv6 zone name or `NULL`
* `ipv6scope_id`: IPv6 address scope or 0
* `abstract`: (if `unix_socket`) if the socket is abstract

A peer, in short, is a communication endpoint.
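
As a rough mental model, the members listed above could be laid out as in the following sketch. The field types, the `demo_` names and the helper function are assumptions made for illustration; the actual definition lives in curl's source and may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct Curl_scheme`: a name and a default port. */
struct demo_scheme {
  const char *name;
  unsigned short default_port;
};

/* Illustrative layout following the member list above. Types are assumed. */
struct demo_peer {
  const struct demo_scheme *scheme;
  const char *user_hostname;  /* as supplied by the user/application */
  const char *hostname;       /* normalized version of user_hostname */
  unsigned short port;        /* the network port */
  bool ipv6;                  /* hostname is an IPv6 address */
  bool unix_socket;           /* hostname is a path to a unix domain socket */
  const char *user_ipv6zone;  /* user supplied IPv6 zone name or NULL */
  unsigned int ipv6scope_id;  /* IPv6 address scope or 0 */
  bool abstract;              /* only meaningful if unix_socket is set */
};

/* Hypothetical helper: `abstract` only counts for unix domain socket peers. */
static bool demo_peer_is_abstract_uds(const struct demo_peer *p)
{
  return p->unix_socket && p->abstract;
}
```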
## peers and transfers
The peer a transfer, i.e. an easy handle, works against is determined at the
start of each request. It is kept in `data->state.origin`. For the first
request done in a `curl_easy_perform()` or equivalent, this origin is
linked to `data->state.initial_origin`. This allows checking whether properties
of `data->set.*` should apply to a request or not.

`data->state.origin` is relevant for cookie processing, signing requests
and other request/response based processing.
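
To illustrate the kind of check this enables, the sketch below compares the origin of the current request with the initial one before applying a user-supplied setting. All names (`demo_peer`, `demo_transfer`, `demo_setting_applies`) are hypothetical, and treating peers as equal when scheme, hostname and port match is an assumption; curl's real logic may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_peer {
  const char *scheme;    /* e.g. "https" */
  const char *hostname;  /* normalized hostname */
  unsigned short port;
};

struct demo_transfer {
  const struct demo_peer *origin;          /* peer of the current request */
  const struct demo_peer *initial_origin;  /* peer of the first request */
};

/* Assumption: peers are equal if scheme, hostname and port all match. */
static bool demo_peer_same(const struct demo_peer *a,
                           const struct demo_peer *b)
{
  if(a == b)
    return true;
  if(!a || !b)
    return false;
  return !strcmp(a->scheme, b->scheme) &&
         !strcmp(a->hostname, b->hostname) &&
         a->port == b->port;
}

/* A setting supplied for the initial origin only applies while the
   transfer still talks to that same origin. */
static bool demo_setting_applies(const struct demo_transfer *t)
{
  return demo_peer_same(t->origin, t->initial_origin);
}
```

A request that ends up at a different host, for example after a redirect, would fail this check and the setting would be withheld.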
## peers and connections
A network connection always goes *somewhere*. That *somewhere* is called
the `origin` of the connection (i.e. the source of responses/downloads).
It is kept in `conn->origin` and is always present in a connection.

The `origin` is the *logical* endpoint a connection talks to. In most
configurations it is the same as `data->state.origin` (see proxies below).

For most connections, the `origin` is connected to *directly*. The
connection can be directed to another peer, however.
### `connect-to`
With the command line option `--connect-to` or the `libcurl` option
`CURLOPT_CONNECT_TO`, a connection can be told to make the network connection
to another endpoint *while keeping the `origin` unchanged*.

This other endpoint is also a peer and is available as `conn->via_peer`.
This may be a peer for a different hostname and port or it may be a
`unix domain socket`.
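
As a simplified model of the above: a connect-to rule leaves the origin untouched and only yields a different peer to open the network connection to. The `demo_` types and the exact-match rule are assumptions for illustration, not curl's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_peer {
  const char *hostname;
  unsigned short port;
  bool unix_socket;  /* hostname holds a socket path */
};

/* Hypothetical rule: connections for `match` are made to `target`. */
struct demo_connect_to {
  struct demo_peer match;
  struct demo_peer target;
};

struct demo_conn {
  const struct demo_peer *origin;    /* logical endpoint, never changed here */
  const struct demo_peer *via_peer;  /* NULL when connecting directly */
};

/* Set conn->via_peer if the rule matches the origin; origin stays as-is. */
static void demo_apply_connect_to(struct demo_conn *conn,
                                  const struct demo_connect_to *rule)
{
  conn->via_peer = NULL;
  if(!strcmp(rule->match.hostname, conn->origin->hostname) &&
     rule->match.port == conn->origin->port)
    conn->via_peer = &rule->target;
}
```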
### proxies
When a connection uses a proxy, the endpoint for contacting the proxy server
is also represented as a peer and is kept at `conn->socks_proxy.peer` and/or
`conn->http_proxy.peer`. `SOCKS` proxies always come first, so a connection
might connect as:
```
1. curl -------------------------------------------> conn->origin
2. curl -------------------------------------------> conn->via_peer (acting as conn->origin)
3. curl --> socks_proxy.peer ----------------------> conn->via_peer/origin
4. curl -----------------------> http_proxy.peer --> conn->via_peer/origin
5. curl --> socks_proxy.peer --> http_proxy.peer --> conn->via_peer/origin
```
A `conn->(socks|http)_proxy.peer` is only ever present when the proxy
is in use and `NULL` otherwise.

SOCKS proxies are always used for tunneling, either to the origin or
the HTTP proxy. They operate in a connection filter.

HTTP proxies can operate in two modes: tunneling or forwarding. When tunneling,
they also operate in a connection filter. In forwarding mode however, they
become the `origin` the connection talks to.

Therefore, connections that talk to a forwarding HTTP proxy have `conn->origin`
set to `conn->http_proxy.peer` and `conn->bits.origin_is_proxy` is set.

The connection filter `SETUP`, which assembles the filters for a connection,
figures out which peer to pass to which filter in order to make it all work.
The individual filters get passed a specific peer and do not need to be
concerned with the whole chain.

For example, in the scenarios listed above the IP connection goes to
`origin` (1), `via_peer` (2), `socks_proxy.peer` (3+5) or `http_proxy.peer` (4),
and that is the peer that gets passed to the `DNS` and `HAPPY-EYEBALLS` filters.
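
The selection just described can be sketched as a small function that returns the peer the IP connection is made to. The `demo_` names are hypothetical and the real `SETUP` filter does considerably more; this only mirrors the five scenarios listed earlier.

```c
#include <stddef.h>

struct demo_peer {
  const char *hostname;
  unsigned short port;
};

struct demo_conn {
  const struct demo_peer *origin;       /* always present */
  const struct demo_peer *via_peer;     /* NULL without connect-to */
  const struct demo_peer *socks_proxy;  /* NULL when no SOCKS proxy is used */
  const struct demo_peer *http_proxy;   /* NULL when no HTTP proxy is used */
};

/* Peer that name resolution and the IP connect work against. */
static const struct demo_peer *demo_first_hop(const struct demo_conn *c)
{
  if(c->socks_proxy)  /* scenarios 3 and 5: SOCKS always comes first */
    return c->socks_proxy;
  if(c->http_proxy)   /* scenario 4 */
    return c->http_proxy;
  if(c->via_peer)     /* scenario 2 */
    return c->via_peer;
  return c->origin;   /* scenario 1 */
}
```

With a forwarding HTTP proxy, `origin` and `http_proxy` would refer to the same peer, so the result is unchanged in that mode.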
### TLS
TLS filters' task is to verify the peer they talk to (unless that is
switched off). They either talk to the `conn->origin` or the
`conn->http_proxy.peer` (`SOCKS` does not have TLS). The `conn->via_peer` is
irrelevant. A `via_peer` endpoint needs to present a certificate matching
`conn->origin` or the connect must fail.
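
A minimal sketch of that rule, using hypothetical `demo_` names: the peer a TLS filter verifies is either the origin or the HTTP proxy, and `via_peer` is never consulted.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_peer {
  const char *hostname;
};

struct demo_conn {
  const struct demo_peer *origin;      /* always present */
  const struct demo_peer *via_peer;    /* NULL without connect-to */
  const struct demo_peer *http_proxy;  /* NULL when no HTTP proxy is used */
};

/* Peer whose identity the certificate must match. `for_proxy` selects the
   TLS filter that secures the connection to the HTTP proxy. */
static const struct demo_peer *demo_tls_verify_peer(const struct demo_conn *c,
                                                    bool for_proxy)
{
  return for_proxy ? c->http_proxy : c->origin;
}
```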
### `unix domain socket`s
Peers that represent a `unix domain socket` may be used in two places:

1. `via_peer`: curl can connect to an `origin` server via `unix domain socket`s.
   This disables any proxy settings a transfer might carry.
2. `socks_proxy.peer`: a `SOCKS` proxy may be contacted over a
   `unix domain socket`.

Contacting an HTTP proxy over `unix domain socket`s is not supported.
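
These constraints can be expressed as a small check. This is a sketch with hypothetical `demo_` names, assuming the rules apply in the order shown; it is not how curl validates its configuration.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_peer {
  const char *hostname;  /* a socket path when unix_socket is set */
  bool unix_socket;
};

struct demo_conn {
  const struct demo_peer *via_peer;
  const struct demo_peer *socks_proxy;
  const struct demo_peer *http_proxy;
};

/* Returns false for the unsupported combination. Drops all proxies when
   the origin is reached via a unix domain socket. */
static bool demo_check_uds(struct demo_conn *c)
{
  if(c->via_peer && c->via_peer->unix_socket) {
    c->socks_proxy = NULL;
    c->http_proxy = NULL;
    return true;
  }
  if(c->http_proxy && c->http_proxy->unix_socket)
    return false;  /* HTTP proxy over a unix domain socket */
  return true;     /* a SOCKS proxy over a unix domain socket is fine */
}
```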
## peers and credentials
There have been several vulnerabilities caused by credentials leaking into
requests where they should not appear. In future work we plan to tie
credentials to `peers` and use them only when their `peer` still matches the
current connection use.
## peers internals
A `struct Curl_peer` is allocated with space for the `user_hostname`.
Only when the user supplied value needs conversions (removing `[]` or
IDN encoding) is `hostname` an extra allocation. This keeps the number
of allocations the same as before.

A `Curl_peer` is not expected to be modified after it has been created.
However, each `Curl_peer` has a reference counter. Code needs to use
`Curl_peer_link()` and `Curl_peer_unlink()` to keep/release references.
This makes it safe and cheap to keep references to peers in connections
and filters.
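
The allocation and reference counting scheme can be sketched as follows. The `demo_peer_*` functions are hypothetical stand-ins and do not reproduce the signatures of `Curl_peer_link()` or `Curl_peer_unlink()`. The sketch also skips hostname conversion, so `hostname` always points at the inline `user_hostname`, and the counter is not thread-safe.

```c
#include <stdlib.h>
#include <string.h>

struct demo_peer {
  unsigned int refcount;
  unsigned short port;
  const char *hostname;  /* normalized name; here it aliases user_hostname */
  char user_hostname[];  /* stored inline, same allocation as the struct */
};

/* One allocation holds the struct and the hostname string. */
static struct demo_peer *demo_peer_create(const char *host,
                                          unsigned short port)
{
  size_t len = strlen(host);
  struct demo_peer *p = malloc(sizeof(*p) + len + 1);
  if(!p)
    return NULL;
  memcpy(p->user_hostname, host, len + 1);
  p->hostname = p->user_hostname;
  p->port = port;
  p->refcount = 1;
  return p;
}

/* Take another reference to the same, unmodified peer. */
static struct demo_peer *demo_peer_link(struct demo_peer *p)
{
  if(p)
    p->refcount++;
  return p;
}

/* Drop a reference, clear the caller's pointer, free on the last one. */
static void demo_peer_unlink(struct demo_peer **pp)
{
  struct demo_peer *p = *pp;
  *pp = NULL;
  if(p && !--p->refcount)
    free(p);
}
```

Because a peer is never modified after creation, a connection and its filters can share one instance and each simply holds its own reference.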