ws-docs: extend WebSocket documentation

Closes #16118
Calvin Ruocco 2024-12-12 15:36:08 +01:00 committed by Daniel Stenberg
parent bfec1d7165
commit dc3252bedd
4 changed files with 206 additions and 114 deletions

@ -49,6 +49,7 @@ struct curl_ws_frame {
int flags;
curl_off_t offset;
curl_off_t bytesleft;
size_t len;
};
~~~
@ -63,38 +64,65 @@ See the list below.
## `offset`
When this frame is a continuation of fragment data already delivered, this is
the offset into the final fragment where this piece belongs.
When this chunk is a continuation of frame data already delivered, this is
the offset into the final frame data where this piece belongs.
## `bytesleft`
If this is not a complete fragment, the *bytesleft* field informs about how
many additional bytes are expected to arrive before this fragment is complete.
## `len`
The length of the current data chunk.
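
The relationship between the three size fields can be illustrated with a small sketch. It assumes that the total payload size of a frame equals *offset* + *len* + *bytesleft* of any of its chunks. `struct chunk_meta`, `frame_total()` and `store_chunk()` are hypothetical names used only for this illustration and are not part of libcurl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the size fields of struct curl_ws_frame;
   plain long long is used here instead of curl_off_t. */
struct chunk_meta {
  long long offset;    /* position of this chunk within the frame payload */
  long long bytesleft; /* payload bytes of this frame still to come */
  size_t len;          /* number of bytes in this chunk */
};

/* Total payload size of the frame this chunk belongs to. */
static long long frame_total(const struct chunk_meta *m)
{
  return m->offset + (long long)m->len + m->bytesleft;
}

/* Copy one chunk to its place in a frame buffer.
   Returns 1 when the frame is complete, 0 when more chunks are pending
   and -1 when the frame does not fit into the buffer. */
static int store_chunk(char *frame, size_t framesize,
                       const struct chunk_meta *m, const char *data)
{
  assert(m->offset >= 0 && m->bytesleft >= 0);
  if((unsigned long long)frame_total(m) > framesize)
    return -1;
  memcpy(frame + m->offset, data, m->len);
  return m->bytesleft == 0;
}
```

A real application would fill these values from the struct returned by curl_ws_meta(3) or curl_ws_recv(3) instead of the stand-in struct.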
# FLAGS
The *message type* flags (CURLWS_TEXT/BINARY/CLOSE/PING/PONG) are mutually
exclusive.
## CURLWS_TEXT
The buffer contains text data. Note that this makes a difference to WebSocket
This is a message with text data. Note that this makes a difference to WebSocket
but libcurl itself does not make any verification of the content or
precautions that you actually receive valid UTF-8 content.
## CURLWS_BINARY
This is binary data.
## CURLWS_CONT
This is not the final fragment of the message, it implies that there is
another fragment coming as part of the same message.
This is a message with binary data.
## CURLWS_CLOSE
This transfer is now closed.
This is a close message. No more data follows.
It may contain a 2-byte unsigned integer in network byte order that indicates
the close reason and may additionally contain up to 123 bytes of further
textual payload for a total of at most 125 bytes. libcurl does not verify that
the textual description is valid UTF-8.
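
As a minimal sketch, the payload of a close message could be decoded as follows. `parse_close()` is a hypothetical helper and not a libcurl function. Rejecting a payload of exactly one byte is an assumption based on RFC 6455, which requires the status code to occupy two bytes whenever a payload is present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode the payload of a close message.
   Returns 0 on success and -1 if the payload length is invalid.
   An empty payload yields code 0 (no status present) and an empty reason.
   The reason is copied as-is, without any UTF-8 validation. */
static int parse_close(const unsigned char *payload, size_t len,
                       unsigned int *code, char reason[124])
{
  assert(code != NULL && reason != NULL);
  if(len == 1 || len > 125)
    return -1;
  *code = 0;
  reason[0] = '\0';
  if(len >= 2) {
    /* 2-byte unsigned integer in network byte order (big endian) */
    *code = ((unsigned int)payload[0] << 8) | payload[1];
    /* at most 123 bytes of text follow, plus the terminating zero */
    memcpy(reason, payload + 2, len - 2);
    reason[len - 2] = '\0';
  }
  return 0;
}
```

For example, the five payload bytes `0x03 0xE8 'b' 'y' 'e'` decode to the status code 1000 and the reason "bye".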
## CURLWS_PING
This is an incoming ping message, that expects a pong response.
This is a ping message. It may contain up to 125 bytes of payload text.
libcurl does not verify that the payload is valid UTF-8.
Upon receiving a ping message, libcurl automatically responds with a pong
message unless the **CURLWS_RAW_MODE** bit of CURLOPT_WS_OPTIONS(3) is set.
## CURLWS_PONG
This is a pong message. It may contain up to 125 bytes of payload text.
libcurl does not verify that the payload is valid UTF-8.
## CURLWS_CONT
Can only occur in conjunction with CURLWS_TEXT or CURLWS_BINARY.
This is not the final fragment of the message, it implies that there is
another fragment coming as part of the same message. The application must
reassemble the fragments to receive the complete message.
Only a single fragmented message can be transmitted at a time, but it may
be interrupted by CURLWS_CLOSE, CURLWS_PING or CURLWS_PONG frames.
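
One possible way to reassemble a fragmented message is sketched below. The `DEMO_*` bits are hypothetical stand-ins for the corresponding CURLWS_* flags, and `struct message` and `add_frame()` are illustrative names only. The sketch assumes that each call passes one complete frame and that control frames are handled elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the CURLWS_* bits; a real program would use
   the constants from <curl/curl.h> instead. */
#define DEMO_TEXT   (1 << 0)
#define DEMO_BINARY (1 << 1)
#define DEMO_CONT   (1 << 2)
#define DEMO_CLOSE  (1 << 3)
#define DEMO_PING   (1 << 4)
#define DEMO_PONG   (1 << 6)

struct message {
  char data[256]; /* reassembled payload */
  size_t len;     /* number of bytes collected so far */
};

/* Feed one complete frame into the message.
   Returns 1 when the message is complete, 0 when more fragments are
   expected or the frame was a control frame, and -1 when the message does
   not fit into the buffer. */
static int add_frame(struct message *msg, int flags,
                     const char *payload, size_t len)
{
  assert(msg != NULL);
  if(flags & (DEMO_CLOSE | DEMO_PING | DEMO_PONG))
    return 0; /* control frame: not part of the fragmented message */
  if(len > sizeof(msg->data) - msg->len)
    return -1;
  memcpy(msg->data + msg->len, payload, len);
  msg->len += len;
  return (flags & DEMO_CONT) ? 0 : 1;
}
```

A ping that arrives between two text fragments leaves the collected data untouched, so the message completes once the fragment without the continuation bit is added.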
# %PROTOCOLS%

@ -30,20 +30,32 @@ CURLcode curl_ws_recv(CURL *curl, void *buffer, size_t buflen,
# DESCRIPTION
Retrieves as much as possible of a received WebSocket data fragment into the
**buffer**, but not more than **buflen** bytes. *recv* is set to the
Retrieves as much as possible of a received WebSocket frame into the
*buffer*, but not more than *buflen* bytes. *recv* is set to the
number of bytes actually stored.
If there is more fragment data to deliver than what fits in the provided
*buffer*, libcurl returns a full buffer and the application needs to call this
function again to continue draining the buffer.
If the function call is successful, the *meta* pointer gets set to point to a
*const struct curl_ws_frame* that contains information about the received
data. That struct must not be freed and its contents must not be relied upon
anymore once another WebSocket function is called. See the curl_ws_meta(3) for
anymore once another WebSocket function is called. See curl_ws_meta(3) for more
details on that struct.
The application must check `meta->bytesleft` to determine whether the complete
frame has been received. If more payload is pending, the application must call
this function again with an updated *buffer* and *buflen* to resume receiving.
This may for example happen when the data does not fit into the provided buffer
or when not all frame data has been delivered over the network yet.
If the application wants to read the metadata without consuming any payload,
it may call this function with a *buflen* of zero. Setting *buffer* to a NULL
pointer is permitted in this case. Note that frames without payload are consumed
by this action.
If the received message consists of multiple fragments, the *CURLWS_CONT* bit
is set in all frames except the final one. The application is responsible for
reassembling fragmented messages. See curl_ws_meta(3) for more details on
*CURLWS_CONT*.
# %PROTOCOLS%
# EXAMPLE
@ -51,15 +63,40 @@ details on that struct.
~~~c
int main(void)
{
size_t rlen;
const struct curl_ws_frame *meta;
char buffer[256];
size_t offset = 0;
CURLcode res = CURLE_OK;
CURL *curl = curl_easy_init();
if(curl) {
CURLcode res = curl_ws_recv(curl, buffer, sizeof(buffer), &rlen, &meta);
if(res)
printf("error: %s\n", curl_easy_strerror(res));
curl_easy_setopt(curl, CURLOPT_URL, "wss://example.com/");
curl_easy_setopt(curl, CURLOPT_CONNECT_ONLY, 2L);
/* start HTTPS connection and upgrade to WSS, then return control */
curl_easy_perform(curl);
/* Note: This example neglects fragmented messages. (CURLWS_CONT bit)
A real application must handle them appropriately. */
while(!res) {
size_t recv;
const struct curl_ws_frame *meta;
res = curl_ws_recv(curl, buffer + offset, sizeof(buffer) - offset, &recv,
&meta);
offset += recv;
if(res == CURLE_OK) {
if(meta->bytesleft == 0)
break; /* finished receiving */
if(meta->bytesleft > sizeof(buffer) - offset)
res = CURLE_TOO_LARGE;
}
if(res == CURLE_AGAIN)
/* in real application: wait for socket here, e.g. using select() */
res = CURLE_OK;
}
curl_easy_cleanup(curl);
return (int)res;
}
~~~
@ -79,3 +116,8 @@ Returns **CURLE_GOT_NOTHING** if the associated connection is closed.
Instead of blocking, the function returns **CURLE_AGAIN**. The correct
behavior is then to wait for the socket to signal readability before calling
this function again.
Any other non-zero return value indicates an error. See the libcurl-errors(3)
man page for the full list with descriptions.
Returns **CURLE_GOT_NOTHING** if the associated connection is closed.

@ -31,25 +31,30 @@ CURLcode curl_ws_send(CURL *curl, const void *buffer, size_t buflen,
# DESCRIPTION
Send the specific message fragment over an established WebSocket
connection. The *buffer* holds the data to send and it is *buflen*
number of payload bytes in that memory area.
Send the specific message chunk over an established WebSocket
connection. *buffer* must point to a valid memory location containing
(at least) *buflen* bytes of payload memory.
*sent* is returned as the number of payload bytes actually sent.
*sent* is set to the number of payload bytes actually sent. If the return value
is **CURLE_OK** but *sent* is less than the given *buflen*, libcurl was unable
to consume the complete payload in a single call. In this case the application
must call this function again until all payload is processed. *buffer* and
*buflen* must be updated on every following invocation to only point to the
remaining piece of the payload.
To send a (huge) fragment using multiple calls with partial content per
invoke, set the *CURLWS_OFFSET* bit and the *fragsize* argument as the
total expected size for the first part, then set the *CURLWS_OFFSET* with
a zero *fragsize* for the following parts.
*fragsize* should always be set to zero unless a (huge) frame shall be sent
using multiple calls with partial content per call explicitly. In that
case you must set the *CURLWS_OFFSET* bit and set the *fragsize* as documented
in the section on *CURLWS_OFFSET* below.
If not sending a partial fragment or if this is raw mode, *fragsize*
should be set to zero.
*flags* must contain at least one flag indicating the type of the message.
To send a fragmented message consisting of multiple frames, additionally set
the *CURLWS_CONT* bit in all frames except the final one.
If **CURLWS_RAW_MODE** is enabled in CURLOPT_WS_OPTIONS(3), the
**flags** argument should be set to 0.
For more details on the supported flags see below and in curl_ws_meta(3).
To send a message consisting of multiple frames, set the *CURLWS_CONT* bit
in all frames except the final one.
If *CURLWS_RAW_MODE* is enabled in CURLOPT_WS_OPTIONS(3), the
*flags* argument should be set to 0.
Warning: while it is possible to invoke this function from a callback,
such a call is blocking in this situation, e.g. only returns after all data
@ -57,39 +62,15 @@ has been sent or an error is encountered.
# FLAGS
## CURLWS_TEXT
The buffer contains text data. Note that this makes a difference to WebSocket
but libcurl itself does not make any verification of the content or
precautions that you actually send valid UTF-8 content.
## CURLWS_BINARY
This is binary data.
## CURLWS_CONT
This is not the final fragment of the message, which implies that there is
another fragment coming as part of the same message where this bit is not set.
## CURLWS_CLOSE
Close this transfer.
## CURLWS_PING
This is a ping.
## CURLWS_PONG
This is a pong.
Supports all flags documented in curl_ws_meta(3) and additionally the following
flags.
## CURLWS_OFFSET
The provided data is only a partial fragment and there is more coming in a
The provided data is only a partial frame and there is more coming in a
following call to *curl_ws_send()*. When sending only a piece of the
fragment like this, the *fragsize* must be provided with the total
expected fragment size in the first call and it needs to be zero in subsequent
frame like this, the *fragsize* must be provided with the total
expected frame size in the first call and must be zero in all subsequent
calls.
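
The calling pattern described above can be sketched without a network connection. In this illustration `send_fn` is a hypothetical function pointer type that mirrors the shape of curl_ws_send(3) with a generic context pointer instead of the easy handle, `record_send()` is a mock that only records its arguments, and the `DEMO_*` bits stand in for CURLWS_BINARY and CURLWS_OFFSET. None of these names are part of libcurl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CURLWS_BINARY and CURLWS_OFFSET. */
#define DEMO_BINARY (1 << 1)
#define DEMO_OFFSET (1 << 5)

/* Hypothetical type shaped like curl_ws_send(3). */
typedef int (*send_fn)(void *ctx, const void *buffer, size_t buflen,
                       size_t *sent, long long fragsize, unsigned int flags);

/* Send one frame of 'total' bytes in pieces of at most 'piece' bytes.
   The first call announces the total frame size, later calls pass zero. */
static int send_frame_in_pieces(send_fn sendfn, void *ctx, const char *data,
                                size_t total, size_t piece)
{
  size_t done = 0;
  int first = 1;
  assert(piece > 0);
  while(done < total) {
    size_t n = (total - done < piece) ? total - done : piece;
    size_t sent = 0;
    long long fragsize = first ? (long long)total : 0;
    int rc = sendfn(ctx, data + done, n, &sent, fragsize,
                    DEMO_BINARY | DEMO_OFFSET);
    if(rc)
      return rc;
    first = 0;
    done += sent;
  }
  return 0;
}

/* Mock sender: records the fragsize of each call and accepts all bytes. */
struct recorder {
  long long fragsize[8];
  size_t calls;
  size_t bytes;
};

static int record_send(void *ctx, const void *buffer, size_t buflen,
                       size_t *sent, long long fragsize, unsigned int flags)
{
  struct recorder *r = ctx;
  (void)buffer;
  (void)flags;
  if(r->calls < 8)
    r->fragsize[r->calls] = fragsize;
  r->calls++;
  r->bytes += buflen;
  *sent = buflen;
  return 0;
}
```

Sending ten bytes in pieces of four results in three calls, where only the first one carries a non-zero *fragsize*.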
# %PROTOCOLS%
@ -99,18 +80,34 @@ calls.
~~~c
#include <string.h> /* for strlen */
const char *send_payload = "magic";
int main(void)
{
size_t sent;
CURLcode res;
const char *buffer = "PAYLOAD";
size_t offset = 0;
CURLcode res = CURLE_OK;
CURL *curl = curl_easy_init();
curl_easy_setopt(curl, CURLOPT_URL, "wss://example.com/");
curl_easy_setopt(curl, CURLOPT_CONNECT_ONLY, 2L);
/* start HTTPS connection and upgrade to WSS, then return control */
curl_easy_perform(curl);
res = curl_ws_send(curl, send_payload, strlen(send_payload), &sent, 0,
CURLWS_PING);
while(!res) {
size_t sent;
res = curl_ws_send(curl, buffer + offset, strlen(buffer) - offset, &sent,
0, CURLWS_TEXT);
offset += sent;
if(res == CURLE_OK) {
if(offset == strlen(buffer))
break; /* finished sending */
}
if(res == CURLE_AGAIN)
/* in real application: wait for socket here, e.g. using select() */
res = CURLE_OK;
}
curl_easy_cleanup(curl);
return (int)res;
}
@ -126,3 +123,10 @@ CURLE_OK (0) means everything was OK, non-zero means an error occurred, see
libcurl-errors(3). If CURLOPT_ERRORBUFFER(3) was set with curl_easy_setopt(3)
there can be an error message stored in the error buffer when non-zero is
returned.
Instead of blocking, the function returns **CURLE_AGAIN**. The correct
behavior is then to wait for the socket to signal readability before calling
this function again.
Any other non-zero return value indicates an error. See the libcurl-errors(3)
man page for the full list with descriptions.

@ -47,44 +47,48 @@ WebSocket" request header field. When the upgrade is accepted by the server,
it responds with a 101 Switching and then the client can speak WebSocket with
the server. The communication can happen in both directions at the same time.
# EXTENSIONS
The WebSocket protocol allows the client to request and negotiate *extensions*
that can add additional features and restrictions to the protocol.
libcurl does not support the use of extensions and always sets up a connection
without them.
# MESSAGES
WebSocket communication is message based. That means that both ends send and
receive entire messages, not streams like TCP. A WebSocket message is sent
over the wire in one or more frames. Each frame in a message can have a size
up to 2^63 bytes.
over the wire in one or more frames. A message which is split into several
frames is referred to as a *fragmented* message and the individual frames are
called *fragments*. Each frame (or fragment) in a message can have a size of
up to 2^63 bytes and declares the frame size in the header. The total size of
a message that is fragmented into multiple frames is not limited by the
protocol and the number of fragments is not known until the final fragment is
received.
libcurl delivers WebSocket data as frame fragments. It might send a whole
frame, but it might also deliver them in pieces depending on size and network
patterns. It makes sure to provide the API user about the exact specifics
about the fragment: type, offset, size and how much data there is pending to
arrive for the same frame.
Transmission of a frame must not be interrupted by any other data transfers and
transmission of the different fragments of a message must not be interrupted by
other user data frames. Control frames - PING, PONG and CLOSE - may be
transmitted in between any other two frames, even in between two fragments of
the same user data message. The control frames themselves on the other hand
must never be fragmented and are limited to a size of 125 bytes.
A message has an unknown size until the last frame header for the message has
been received since only frames have set sizes.
# Raw mode
libcurl can be told to speak WebSocket in "raw mode" by setting the
**CURLWS_RAW_MODE** bit to the CURLOPT_WS_OPTIONS(3) option.
Raw WebSocket means that libcurl passes on the data from the network without
parsing it leaving that entirely to the application. This mode assumes that
the user of this knows WebSocket and can parse and figure out the data all by
itself.
This mode is intended for applications that already have a WebSocket
parser/engine that want to switch over to use libcurl for enabling WebSocket,
and keep parts of the existing software architecture.
libcurl delivers WebSocket data as chunks of frames. It might deliver a whole
frame as a single chunk, but it might also deliver it in several pieces
depending on size and network patterns. See the documentation of the
individual API functions for further information.
# PING
WebSocket is designed to allow long-lived sessions and in order to keep the
connections alive, both ends can send PING messages for the other end to
respond with a PONG.
respond with a PONG. Both ends may also send unsolicited PONG messages as a
unidirectional heartbeat.
libcurl automatically responds to server PING messages with a PONG. It does
not send any PING messages automatically.
libcurl automatically responds to server PING messages with a PONG that echoes
the payload of the PING message. libcurl sends neither PING messages nor
unsolicited PONG messages automatically.
# MODELS
@ -92,26 +96,40 @@ Because of the many different ways WebSocket can be used, which is much more
flexible than limited to plain downloads or uploads, libcurl offers two
different API models to use it:
1. Using a write callback with CURLOPT_WRITEFUNCTION(3) much like other
1. CURLOPT_WRITEFUNCTION model:
Using a write callback with CURLOPT_WRITEFUNCTION(3) much like other
downloads for when the traffic is download oriented.
2. Using CURLOPT_CONNECT_ONLY(3) and use the WebSocket recv/send
functions.
2. CURLOPT_CONNECT_ONLY model:
Using curl_ws_recv(3) and curl_ws_send(3) functions.
# Callback model
## CURLOPT_WRITEFUNCTION MODEL
When a write callback is set and a WebSocket transfer is performed, the
callback is called to deliver all WebSocket data that arrives.
CURLOPT_CONNECT_ONLY(3) must be unset or **0L** for this model to take effect.
The callback can then call curl_ws_meta(3) to learn about the details of
the incoming data fragment.
curl_easy_perform(3) establishes and sets up the WebSocket communication and
then blocks for the whole duration of the connection. libcurl calls the
callback configured in CURLOPT_WRITEFUNCTION(3), whenever an incoming chunk
of WebSocket data is received. The callback is handed a pointer to the payload
data as an argument and can call curl_ws_meta(3) to get relevant metadata.
# CONNECT_ONLY model
## CURLOPT_CONNECT_ONLY MODEL
By setting CURLOPT_CONNECT_ONLY(3) to **2L**, the transfer only
establishes and setups the WebSocket communication and then returns control
back to the application.
CURLOPT_CONNECT_ONLY(3) must be **2L** for this model to take effect.
Once such a setup has been successfully performed, the application can proceed
and use curl_ws_recv(3) and curl_ws_send(3) freely to exchange
WebSocket messages with the server.
curl_easy_perform(3) only establishes and sets up the WebSocket communication
and then returns control back to the application. The application can then use
curl_ws_recv(3) and curl_ws_send(3) to exchange WebSocket messages with the
server.
# RAW MODE
libcurl can be told to speak WebSocket in "raw mode" by setting the
**CURLWS_RAW_MODE** bit of the CURLOPT_WS_OPTIONS(3) option.
Raw WebSocket means that libcurl passes on the data from the network without
parsing it, leaving that entirely to the application.
This mode is intended for applications that already have a WebSocket
parser/engine and want to switch over to use libcurl for enabling WebSocket,
and keep parts of the existing software architecture.
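
To give an idea of what an application-side parser has to do in raw mode, the following sketch decodes a frame header according to the layout in RFC 6455. `struct ws_header` and `parse_ws_header()` are hypothetical names for this illustration. The sketch does not unmask payload data and does not validate opcodes or reserved bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ws_header {
  int fin;               /* 1 if this is the final fragment of a message */
  int opcode;            /* 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, ... */
  int masked;            /* 1 if a masking key is present */
  uint64_t payload_len;  /* payload size declared by the header */
  unsigned char mask[4]; /* masking key, all zero if not masked */
  size_t header_len;     /* number of header bytes before the payload */
};

/* Returns 0 on success, 1 if more bytes are needed to complete the header
   and -1 if the header is malformed. */
static int parse_ws_header(const unsigned char *buf, size_t len,
                           struct ws_header *h)
{
  size_t pos = 2;
  size_t extra;
  size_t i;
  assert(buf != NULL && h != NULL);
  if(len < 2)
    return 1;
  h->fin = (buf[0] & 0x80) != 0;
  h->opcode = buf[0] & 0x0F;
  h->masked = (buf[1] & 0x80) != 0;
  h->payload_len = buf[1] & 0x7F;
  /* 126 and 127 announce a 16-bit or 64-bit extended length field */
  extra = h->payload_len == 126 ? 2 : h->payload_len == 127 ? 8 : 0;
  if(len < pos + extra + (h->masked ? 4 : 0))
    return 1;
  if(extra) {
    h->payload_len = 0;
    for(i = 0; i < extra; i++)
      h->payload_len = (h->payload_len << 8) | buf[pos + i];
    pos += extra;
  }
  if(h->payload_len >> 63)
    return -1; /* the most significant bit must be zero */
  for(i = 0; i < 4; i++)
    h->mask[i] = h->masked ? buf[pos + i] : 0;
  h->header_len = pos + (h->masked ? 4 : 0);
  return 0;
}
```

For the bytes `0x81 0x05` followed by "Hello", this reports a final, unmasked text frame with a payload of five bytes and a header of two bytes.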