Add infrastructure to colled and dispatch notifications for transfers
and the multi handle in general. Applications can register a callback
and en-/disable notification type the are interested in.
Without a callback installed, notifications are not collected. Same when
a notification type has not been enabled.
Memory allocation failures on adding notifications lead to a general
multi failure state and result in CURLM_OUT_OF_MEMORY returned from
curl_multi_perform() and curl_multi_socket*() invocations.
Closes#18432
Adds `curl_off_t curl_multi_get_offt(CURLM *multi_handle, CURLMinfo_offt
info)` to the multi interface with enums:
* CURLMINFO_XFERS_CURRENT: current number of transfers
* CURLMINFO_XFERS_RUNNING: number of running transfers
* CURLMINFO_XFERS_PENDING: number of pending transfers
* CURLMINFO_XFERS_DONE: number of finished transfers to read
* CURLMINFO_XFERS_ADDED: total number of transfers added, ever
Add documentation for functions and info enums.
Add use in the curl command line tool to replace two static
variables counting the same "from the outside".
refs #17870Closes#17992
`getsock()` calls operated on a global limit that could
not be configure beyond 16 sockets. This is no longer adequate
with the new happy eyeballing strategy.
Instead, do the following:
- make `struct easy_pollset` dynamic. Starting with
a minimal room for two sockets, the very common case,
allow it to grow on demand.
- replace all protocol handler getsock() calls with pollsets
and a CURLcode to return failures
- add CURLcode return for all connection filter `adjust_pollset()`
callbacks, since they too can now fail.
- use appropriately in multi.c and multi_ev.c
- fix unit2600 to trigger pollset growth
Closes#18164
Add a bitset `dirty` to the multi handle. The presence of a transfer int
he "dirty" set means: this transfer has something to do ASAP.
"dirty" is set by multiplexing protocols like HTTP/2 and 3 when
encountering response data for another transfer than the current one.
"dirty" is set by protocols that want to be called.
Implementation:
* just an additional `uint_bset` in the multi handle
* `Curl_multi_mark_dirty()` to add a transfer to the dirty set.
* `multi_runsingle()` clears the dirty bit of the transfer at
start. Without new dirty marks, this empties the set after
al dirty transfers have been run.
* `multi_timeout()` immediately gives the current time and
timeout_ms == 0 when dirty transfers are present.
* multi_event: marks all transfers tracked for a socket as dirty.
Then marks all expired transfers as dirty. Then it runs
all dirty transfers.
With this mechanism:
* Most uses of `EXPIRE_RUN_NOW` are replaced by `Curl_multi_mark_dirty()`
* `Curl_multi_mark_dirty()` is cheaper than querying if a transfer is
already dirty or set for timeout. There is no need to check, just do it.
* `data->state.select_bits` is eliminated. We need no longer to
simulate a poll event to make a transfer run.
Closes#17662
Change multi's book keeping of transfers to no longer use lists, but a
special table and bitsets for unsigned int values.
`multi-xfers` is the `uint_tbl` where `multi_add_handle()` inserts a new
transfer which assigns it a unique identifier `mid`. Use bitsets to keep
track of transfers that are in state "process" or "pending" or
"msgsent".
Use sparse bitsets to replace `conn->easyq` and event handlings tracking
of transfers per socket. Instead of pointers, keep the mids involved.
Provide base data structures and document them in docs/internal:
* `uint_tbl`: a table of transfers with `mid` as lookup key,
handing out a mid for adds between 0 - capacity.
* `uint_bset`: a bitset keeping unsigned ints from 0 - capacity.
* `uint_spbset`: a sparse bitset for keeping a small number of
unsigned int values
* `uint_hash`: for associating `mid`s with a pointer.
This makes the `mid` the recommended way to refer to transfers inside
the same multi without risk of running into a UAF.
Modifying table and bitsets is safe while iterating over them. Overall
memory requirements are lower as with the double linked list apprach.
Closes#16761
Slight refactoring around dnscache, e.g. hostcache
- eliminate `data->state.hostcache`. Always look up
relevant dnscache at share/multi.
- unify naming to "dnscache", replacing "hostcache"
- use `struct Curl_dnscache`, even though it just
contains a `Curl_hash` for now.
- add `Curl_dnscache_destroy()` for cleanup in
share/multi.
Closes#16941
Further testing with timeouts in event based processing revealed that
our current shutdown handling in the connection pool was not clear
enough. Graceful shutdowns can only happen inside a multi handle and it
was confusing to track in the code which situation actually applies. It
seems better to split the shutdown handling off and have that code
always be part of a multi handle.
Add `cshutdn.[ch]` with its own struct to maintain connections being
shut down. A `cshutdn` always belongs to a multi handle and uses that
for socket/timeout monitoring.
The `cpool`, which can be part of a multi or share, either passes
connections to a `cshutdn` or terminates them with a one-time, best
effort.
Add an `admin` easy handle to each multi and share. This is used to
perform all maintenance operations where no "real" easy handle is
available. This solves the problem that the multi admin handle requires
some additional initialisation (e.g. timeout list).
The share needs its admin handle as it is often cleaned up when no other
transfer or multi handle exists any more. But we need a `data` in almost
every call.
Fix file:// handling of errors when adding a new connection to the pool.
Changes in `curl` itself:
- for parallel transfers, do not set a connection pool in the share,
rely on the multi's connection pool instead. While not a requirement
for the new `cshutdn` to work, this is
a) helpful in testing to trigger graceful shutdowns
b) a broader code coverage of libcurl via the curl tool
- on test_event with uv, cleanup the multi handle before returning from
parallel_event(). The uv struct is on the stack, cleanup of the multi
later will crash when it tries to register sockets. This is a "eat
your own dogfood" related fix.
Closes#16508
Rework the event based handling of transfers and connections to
be "localized" into a single source file with clearer dependencies.
- add multi_ev.c and multi_ev.h
- add docs/internal/MULTI-EV.md to explain the overall workings
- only do event handling book keeping when the socket callback
is set
- add handling for "connection only" event tracking, when internal
easy handles are used that are not really tied to a connection.
Used in connection pool.
- remove transfer member "last_poll" and connections "shutdown_poll"
and keep all that internal to multi_ev.c
- add CURL_TRC_M() for tracing of "multi" related things, including
event handling and connection pool operations. Add new trace
feature "multi" for trace config.
multi traces will show exactly what is going on in regard to
event handling.
- multi: trace transfers "mstate" in every CURL_TRC_M() call
- make internal trace buffer 2048 bytes and end the silliness
with +n here -m there. Adjust test 1652 expectations of resulting
length and input edge cases.
- add trace feature "lib-ids" to perfix libcurl traces with transfer
and connection ids. Useful for debugging libcurl applications.
Closes#16308
The TLS session cache is now held by the multi handle unless it is
shared, so that all easy handles within a multi handle get the benefit
of sharing the same, larger, cache.
The multi handle session cache size is set to 25, unless it is the
internal one used for the easy interface - which still uses only 3.
Closes#15982
Count connections to a host against a possibly configured destination
limit. Trigger multi `connchange` when a connection has been shutdown,
so pending transfers can try to get a connection once again.
Reported-by: baranyaib90 on github
Fixes#15857Closes#15879
This is a better match for what they do and the general "cpool"
var/function prefix works well.
The pool now handles very long hostnames correctly.
The following changes have been made:
* 'struct connectdata', e.g. connections, keep new members
named `destination` and ' destination_len' that fully specifies
interface+port+hostname of where the connection is going to.
This is used in the pool for "bundling" of connections with
the same destination. There is no limit on the length any more.
* Locking: all locks are done inside conncache.c when calling
into the pool and released on return. This eliminates hazards
of the callers keeping track.
* 'struct connectbundle' is now internal to the pool. It is no
longer referenced by a connection.
* 'bundle->multiuse' no longer exists. HTTP/2 and 3 and TLS filters
no longer need to set it. Instead, the multi checks on leaving
MSTATE_CONNECT or MSTATE_CONNECTING if the connection is now
multiplexed and new, e.g. not conn->bits.reuse. In that case
the processing of pending handles is triggered.
* The pool's init is provided with a callback to invoke on all
connections being discarded. This allows the cleanups in
`Curl_disconnect` to run, wherever it is decided to retire
a connection.
* Several pool operations can now be fully done with one call.
Pruning dead connections, upkeep and checks on pool limits
can now directly discard connections and need no longer return
those to the caller for doing that (as we have now the callback
described above).
* Finding a connection for reuse is now done via `Curl_cpool_find()`
and the caller provides callbacks to evaluate the connection
candidates.
* The 'Curl_cpool_check_limits()' now directly uses the max values
that may be set in the transfer's multi. No need to pass them
around. Curl_multi_max_host_connections() and
Curl_multi_max_total_connections() are gone.
* Add method 'Curl_node_llist()' to get the llist a node is in.
Used in cpool to verify connection are indeed in the list (or
not in any list) as they need to.
I left the conncache.[ch] as is for now and also did not touch the
documentation. If we update that outside the feature window, we can
do this in a separate PR.
Multi-thread safety is not achieved by this PR, but since more details
on how pools operate are now "internal" it is a better starting
point to go for this in the future.
Closes#14662
- Renames Curl_readwrite() to Curl_sendrecv() to reflect that it
is mainly about talking to the server, not reads or writes to the
client. Add a `nowp` parameter since the single caller already
has this.
- Curl_sendrecv() now runs all possible operations whenever it is
called and either it had been polling sockets or the 'select_bits'
are set.
POLL_IN/POLL_OUT are not always directly related to send/recv
operations. Filters like HTTP/2, QUIC or TLS may monitor reverse
directions. If a transfer does not want to send (KEEP_SEND), it
will not do so, as before. Same for receives.
- Curl_update_timer() now checks the absolute timestamp of an expiry
and the last/new timeout to determine if the application needs
to stop/start/restart its timer. This fixes edge cases where
updates did not happen as they should have.
- improved --test-event curl_easy_perform() simulation to handle
situations where no sockets are registered but a timeout is
in place.
- fixed bug in events_socket() that complained about removing
a socket that was unknown, when indeed it had removed the socket
just before, only it was the last in the list
- fixed conncache's internal handle to carry the multi instance
(where the cache has one) so that operations on the closure handle
trigger event callbacks correctly.
- fixed conncache to not POLL_REMOVE a socket twice when a conneciton
was closed.
Closes#14561
`data->id` is unique in *most* situations, but not in all. If a libcurl
application uses more than one connection cache, they will overlap. This
is a rare situations, but libcurl apps do crazy things. However, for
informative things, like tracing, `data->id` is superior, since it
assigns new ids in curl's serial curl_easy_perform() use.
Introduce `data->mid` which is a unique identifer inside one multi
instance, assigned on multi_add_handle() and cleared on
multi_remove_handle().
Use the `mid` in DoH operations and also in h2/h3 stream hashes.
Reported-by: 罗朝辉
Fixes#14414Closes#14499
- Turned them all into functions to also do asserts etc.
- The llist related structs got all their fields renamed in order to make
sure no existing code remains using direct access.
- Each list node struct now points back to the list it "lives in", so
Curl_node_remove() no longer needs the list pointer.
- Rename the node struct and some of the access functions.
- Added lots of ASSERTs to verify API being used correctly
- Fix some cases of API misuse
Add docs/LLIST.md documenting the internal linked list API.
Closes#14485
Instead of having an especially "unique" linked list handler for the
main list of easy handles within the multi handle, this now uses a
regular Curl_llist for this as well.
With this change, it is also clearer that every easy handle added to a
multi handle belongs to one and only one out of three different lists:
process - the general one for normal transfer processing
pending - queued up waiting to get a connection (MSTATE_PENDING)
msgsent - transfer completed (MSTATE_MSGSENT)
An easy handle must therefore be removed from the current list before it
gets added to another.
Closes#14474
Use these words and casing more consistently across text, comments and
one curl tool output:
AIX, ALPN, ANSI, BSD, Cygwin, Darwin, FreeBSD, GitHub, HP-UX, Linux,
macOS, MS-DOS, MSYS, MinGW, NTLM, POSIX, Solaris, UNIX, Unix, Unicode,
WINE, WebDAV, Win32, winbind, WinIDN, Windows, Windows CE, Winsock.
Mostly OS names and a few more.
Also a couple of other minor text fixups.
Closes#14360
Based on the standards and guidelines we use for our documentation.
- expand contractions (they're => they are etc)
- host name = > hostname
- file name => filename
- user name = username
- man page => manpage
- run-time => runtime
- set-up => setup
- back-end => backend
- a HTTP => an HTTP
- Two spaces after a period => one space after period
Closes#14073
When libcurl discards a connection there are two phases this may go
through: "shutdown" and "closing". If a connection is aborted, the
shutdown phase is skipped and it is closed right away.
The connection filters attached to the connection implement the phases
in their `do_shutdown()` and `do_close()` callbacks. Filters carry now a
`shutdown` flags next to `connected` to keep track of the shutdown
operation.
Filters are shut down from top to bottom. If a filter is not connected,
its shutdown is skipped. Notable filters that *do* something during
shutdown are HTTP/2 and TLS. HTTP/2 sends the GOAWAY frame. TLS sends
its close notify and expects to receive a close notify from the server.
As sends and receives may EAGAIN on the network, a shutdown is often not
successful right away and needs to poll the connection's socket(s). To
facilitate this, such connections are placed on a new shutdown list
inside the connection cache.
Since managing this list requires the cooperation of a multi handle,
only the connection cache belonging to a multi handle is used. If a
connection was in another cache when being discarded, it is removed
there and added to the multi's cache. If no multi handle is available at
that time, the connection is shutdown and closed in a one-time,
best-effort attempt.
When a multi handle is destroyed, all connection still on the shutdown
list are discarded with a final shutdown attempt and close. In curl
debug builds, the environment variable `CURL_GRACEFUL_SHUTDOWN` can be
set to make this graceful with a timeout in milliseconds given by the
variable.
The shutdown list is limited to the max number of connections configured
for a multi cache. Set via CURLMOPT_MAX_TOTAL_CONNECTIONS. When the
limit is reached, the oldest connection on the shutdown list is
discarded.
- In multi_wait() and multi_waitfds(), collect all connection caches
involved (each transfer might carry its own) into a temporary list.
Let each connection cache on the list contribute sockets and
POLLIN/OUT events it's connections are waiting for.
- in multi_perform() collect the connection caches the same way and let
them peform their maintenance. This will make another non-blocking
attempt to shutdown all connections on its shutdown list.
- for event based multis (multi->socket_cb set), add the sockets and
their poll events via the callback. When `multi_socket()` is invoked
for a socket not known by an active transfer, forward this to the
multi's cache for processing. On closing a connection, remove its
socket(s) via the callback.
TLS connection filters MUST NOT send close nofity messages in their
`do_close()` implementation. The reason is that a TLS close notify
signals a success. When a connection is aborted and skips its shutdown
phase, the server needs to see a missing close notify to detect
something has gone wrong.
A graceful shutdown of FTP's data connection is performed implicitly
before regarding the upload/download as complete and continuing on the
control connection. For FTP without TLS, there is just the socket close
happening. But with TLS, the sent/received close notify signals that the
transfer is complete and healthy. Servers like `vsftpd` verify that and
reject uploads without a TLS close notify.
- added test_19_* for shutdown related tests
- test_19_01 and test_19_02 test for TCP RST packets
which happen without a graceful shutdown and should
no longer appear otherwise.
- add test_19_03 for handling shutdowns by the server
- add test_19_04 for handling shutdowns by curl
- add test_19_05 for event based shutdowny by server
- add test_30_06/07 and test_31_06/07 for shutdown checks
on FTP up- and downloads.
Closes#13976
Currently, we use `pipe` for `wakeup_create`, which requires ***two***
file descriptors. Furthermore, given its complexity inside, `pipe` is a
bit heavyweight for just a simple event wait/notify mechanism.
`eventfd` would be a more suitable solution for this kind of scenario,
kernel also advocates for developers to use `eventfd` instead of `pipe`
in some simple use cases:
Applications can use an eventfd file descriptor instead of a pipe
(see pipe(2) in all cases where a pipe is used simply to signal
events. The kernel overhead of an eventfd file descriptor is much
lower than that of a pipe, and only one file descriptor is required
(versus the two required for a pipe).
This change adds the new backend of `eventfd` for `wakeup_create` and
uses it where available, eliminating the overhead of `pipe`. Also, it
optimizes the `wakeup_create` to eliminate the system calls that make
file descriptors non-blocking by moving the logic of setting
non-blocking flags on file descriptors to `socketpair.c` and using
`SOCK_NONBLOCK` for `socketpair(2)`, `EFD_NONBLOCK` for `eventfd(2)`.
Ref:
https://man7.org/linux/man-pages/man7/pipe.7.htmlhttps://man7.org/linux/man-pages/man2/eventfd.2.htmlhttps://man7.org/linux/man-pages/man2/socketpair.2.htmlhttps://www.gnu.org/software/gnulib/manual/html_node/eventfd.htmlCloses#13874
- add `Curl_hash_add2()` that passes a destructor function for
the element added. Call element destructor instead of hash
destructor if present.
- multi: add `proto_hash` for protocol related information,
remove `struct multi_ssl_backend_data`.
- openssl: use multi->proto_hash to keep x509 shared store
- schannel: use multi->proto_hash to keep x509 shared store
- vtls: remove Curl_free_multi_ssl_backend_data() and its
equivalents in the TLS backends
Closes#13345
Since we can go to the CONNECT state from PENDING, potentially multiple
times for a single transfer, this change introdues a SETUP state that
happens before CONNECT when doing a new transfer.
Now, doing a redirect on a handle goes back to SETUP (not CONNECT like
before) and we initilize the connect timeout etc in SETUP. Previously,
we would do it in CONNECT but that would make it unreliable in cases
where a transfer goes in and out between CONNECT and PENDING multiple
times.
SETUP is transient, so the handle never actually stays in that state.
Additionally: take care of timeouts of PENDING transfers in
curl_multi_perform()
Ref: #13227Closes#13371
- Move all the "upload_done" handling to request.c
- add possibility to abort sending of a request
- add `Curl_req_done_sending()` for checks
- transfer.c: readwrite_upload() now clean
- removing data->state.ulbuf and data->req.upload_fromhere
- as well as data->req.upload_present
- set data->req.upload_done on having read all from
the client and completely flushed the send buffer
- tftp, remove setting of data->req.upload_fromhere
- serves no purpose as `upload_present` is not set
and the data itself is directly `sendto()` anyway
- smtp, make upload EOB conversion a client reader
- xfer_ulbuf addition
- add xfer_ulbuf for borrowing, similar to xfer_buf
- use in file upload
- use in c-hyper body sending
- h1-proxy, remove init of data->state.uilbuf that is never used
- smb, add own send_buf instead of using data->state.ulbuf
Closes#13010
- can be borrowed by transfer during recv-write operation
- needs to be released before borrowing again
- adjustis size to `data->set.buffer_size`
- used in transfer.c readwrite_data()
Closes#12805
As they are not driving transfers or any socket activity, the main loop
does not need to iterate over these handles. A performance improvement.
They are instead only held in their own separate lists.
'data->multi' is kept a pointer to the multi handle as long as the easy
handle is actually part of it even when the handle is moved to the
pending/msgsent lists. It needs to know which multi handle it belongs
to, if for example curl_easy_cleanup() is called before the handle is
removed from the multi handle.
Alll 'data->multi' pointers of handles still part of the multi handle
gets cleared by curl_multi_cleanup() which "orphans" all previously
attached easy handles.
This is take 2. The first version was reverted for the 8.0.1 release.
Assisted-by: Stefan Eissing
Closes#10801
As they are not driving transfers or any socket activity, the main loop
does not need to iterate over these handles. A performance improvement.
They are instead only held in their own separate lists.
Assisted-by: Stefan Eissing
Ref: #10743Closes#10762
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
The check may take many milliseconds, so now it is performed once the
value is first needed. Also, this change makes sure that the value is
not used if the resolve is set to be IPv4-only.
Closes#9553
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
The callbacks were partially documented to support this. Now the
behavior is documented and returning error from either of these
callbacks will effectively kill all currently ongoing transfers.
Added test 530 to verify
Reported-by: Marcelo Juchem
Fixes#8083Closes#8089
Avoid the race condition risk by instead storing the "seeded" flag in
the multi handle. Modern OpenSSL versions handle the seeding itself so
doing the seeding once per multi-handle instead of once per process is
less of an issue.
Reported-by: Gerrit Renker
Fixes#7296Closes#7306
This reverts commit 2260e0ebe6,
also restoring previous follow up changes which were reverted.
Authored-by: rcombs on github
Authored-by: Marc Hörsken
Reviewed-by: Jay Satiro
Reviewed-by: Marcel Raad
Restores #5634
Reverts #6281
Part of #6245
While working on documenting the states it dawned on me that step one is
to use more descriptive names on the states. This also changes prefix on
the states to make them shorter in the source.
State names NOT ending with *ing are transitional ones.
Closes#6612
By making the `magic` identifier the same size and at the same place
within the structs (easy, multi, share), libcurl will be able to more
reliably detect and safely error out if an application passes in the
wrong handle to APIs. Easier to detect and less likely to cause crashes
if done.
Such mixups can't be detected at compile-time due to them being
typedefed void pointers - unless `CURL_STRICTER` is defined.
Closes#6484
This reverts commit d2a7d7c185.
This commit also reverts the subsequent follow-ups to that commit, which
were all done within windows #ifdefs that are removed in this
change. Marc helped me verify this.
Fixes#6146Closes#6281