/*
 * Functions managing stream_interface structures
 *
 * Copyright 2000-2012 Willy Tarreau <w@1wt.eu>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 *
 */

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>

#include <haproxy/api.h>
#include <haproxy/applet.h>
#include <haproxy/channel.h>
#include <haproxy/connection.h>
#include <haproxy/dynbuf.h>
#include <haproxy/http_htx.h>
#include <haproxy/pipe-t.h>
#include <haproxy/pipe.h>
#include <haproxy/proxy.h>
#include <haproxy/stream-t.h>
#include <haproxy/stream_interface.h>
#include <haproxy/task.h>
#include <haproxy/ticks.h>
#include <haproxy/time.h>
#include <haproxy/tools.h>

/* functions used by default on a detached stream-interface */
static void stream_int_shutr(struct stream_interface *si);
static void stream_int_shutw(struct stream_interface *si);
static void stream_int_chk_rcv(struct stream_interface *si);
static void stream_int_chk_snd(struct stream_interface *si);

/* functions used on a conn_stream-based stream-interface */
static void stream_int_shutr_conn(struct stream_interface *si);
static void stream_int_shutw_conn(struct stream_interface *si);
static void stream_int_chk_rcv_conn(struct stream_interface *si);
static void stream_int_chk_snd_conn(struct stream_interface *si);

/* functions used on an applet-based stream-interface */
static void stream_int_shutr_applet(struct stream_interface *si);
static void stream_int_shutw_applet(struct stream_interface *si);
static void stream_int_chk_rcv_applet(struct stream_interface *si);
static void stream_int_chk_snd_applet(struct stream_interface *si);

/* last read notification */
static void stream_int_read0(struct stream_interface *si);

/* post-IO notification callback */
static void stream_int_notify(struct stream_interface *si);

/* stream-interface operations for embedded tasks */
struct si_ops si_embedded_ops = {
	.chk_rcv = stream_int_chk_rcv,
	.chk_snd = stream_int_chk_snd,
	.shutr = stream_int_shutr,
	.shutw = stream_int_shutw,
};

/* stream-interface operations for connections */
struct si_ops si_conn_ops = {
	.chk_rcv = stream_int_chk_rcv_conn,
	.chk_snd = stream_int_chk_snd_conn,
	.shutr = stream_int_shutr_conn,
	.shutw = stream_int_shutw_conn,
};

/* stream-interface operations for applets */
struct si_ops si_applet_ops = {
	.chk_rcv = stream_int_chk_rcv_applet,
	.chk_snd = stream_int_chk_snd_applet,
	.shutr = stream_int_shutr_applet,
	.shutw = stream_int_shutw_applet,
};

/* Functions used to communicate with a conn_stream. The first two may be used
 * directly, the last one is mostly a wake callback.
 */
int si_cs_recv(struct conn_stream *cs);
int si_cs_send(struct conn_stream *cs);
static int si_cs_process(struct conn_stream *cs);

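/* Data-layer callbacks used on conn_streams whose context is a stream
 * interface: si_cs_process() is woken up after I/O completion to report
 * events to the stream.
 */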
struct data_cb si_conn_cb = {
	.wake = si_cs_process,
	.name = "STRM",
};

/*
 * This function only has to be called once after a wakeup event in case of
 * suspected timeout. It controls the stream interface timeouts and sets
 * si->flags accordingly. It does NOT close anything, as this timeout may
 * be used for any purpose. It returns 1 if the timeout fired, otherwise
 * zero.
 */
int si_check_timeouts(struct stream_interface *si)
{
	if (tick_is_expired(si->exp, now_ms)) {
		si->flags |= SI_FL_EXP;
		return 1;
	}
	return 0;
}

/* to be called only when in SI_ST_DIS with SI_FL_ERR */
void si_report_error(struct stream_interface *si)
{
	if (!si->err_type)
		si->err_type = SI_ET_DATA_ERR;

	si_oc(si)->flags |= CF_WRITE_ERROR;
	si_ic(si)->flags |= CF_READ_ERROR;
}

/*
 * Returns a message to the client ; the connection is shut down for read,
 * and the request is cleared so that no server connection can be initiated.
 * The buffer is marked for read shutdown on the other side to protect the
 * message, and the buffer write is enabled. The message is contained in a
 * "chunk". If it is null, then an empty message is used. The reply buffer does
 * not need to be empty before this, and its contents will not be overwritten.
 * The primary goal of this function is to return error messages to a client.
 */
void si_retnclose(struct stream_interface *si,
                  const struct buffer *msg)
{
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	channel_auto_read(ic);
	channel_abort(ic);
	channel_auto_close(ic);
	channel_erase(ic);
	channel_truncate(oc);

	if (likely(msg && msg->data))
		co_inject(oc, msg->area, msg->data);

	oc->wex = tick_add_ifset(now_ms, oc->wto);
	channel_auto_read(oc);
	channel_auto_close(oc);
	channel_shutr_now(oc);
}

/*
 * This function performs a shutdown-read on a detached stream interface in a
 * connected or init state (it does nothing for other states). It either shuts
 * the read side or marks itself as closed. The buffer flags are updated to
 * reflect the new state. If the stream interface has SI_FL_NOHALF, we also
 * forward the close to the write side. The owner task is woken up if it exists.
 */
static void stream_int_shutr(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	si_rx_shut_blk(si);
	if (ic->flags & CF_SHUTR)
		return;
	ic->flags |= CF_SHUTR;
	ic->rex = TICK_ETERNITY;

	if (!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		return;

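	/* if the other direction is already shut for writes, both sides are
	 * closed now and the stream interface can be disconnected.
	 */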
	if (si_oc(si)->flags & CF_SHUTW) {
		si->state = SI_ST_DIS;
		si->exp = TICK_ETERNITY;
	}
	else if (si->flags & SI_FL_NOHALF) {
		/* we want to immediately forward this close to the write side */
		return stream_int_shutw(si);
	}

	/* note that if the task exists, it must unregister itself once it runs */
	if (!(si->flags & SI_FL_DONT_WAKE))
		task_wakeup(si_task(si), TASK_WOKEN_IO);
}

/*
 * This function performs a shutdown-write on a detached stream interface in a
 * connected or init state (it does nothing for other states). It either shuts
 * the write side or marks itself as closed. The buffer flags are updated to
 * reflect the new state. It also closes everything if the SI was marked as
 * being in error state. The owner task is woken up if it exists.
 */
static void stream_int_shutw(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	oc->flags &= ~CF_SHUTW_NOW;
	if (oc->flags & CF_SHUTW)
		return;
	oc->flags |= CF_SHUTW;
	oc->wex = TICK_ETERNITY;
	si_done_get(si);

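	/* the half-closed timeout (client-fin/server-fin), stored in the
	 * stream interface, is applied to the read side upon shutw.
	 */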
	if (tick_isset(si->hcto)) {
		ic->rto = si->hcto;
		ic->rex = tick_add(now_ms, ic->rto);
	}

	switch (si->state) {
	case SI_ST_RDY:
	case SI_ST_EST:
		/* we have to shut before closing, otherwise some short messages
		 * may never leave the system, especially when there are remaining
		 * unread data in the socket input buffer, or when nolinger is set.
		 * However, if SI_FL_NOLINGER is explicitly set, we know there is
		 * no risk so we close both sides immediately.
		 */
		if (!(si->flags & (SI_FL_ERR | SI_FL_NOLINGER)) &&
		    !(ic->flags & (CF_SHUTR|CF_DONT_READ)))
			return;

		/* fall through */
	case SI_ST_CON:
	case SI_ST_CER:
	case SI_ST_QUE:
	case SI_ST_TAR:
		/* Note that none of these states may happen with applets */
		si->state = SI_ST_DIS;
		/* fall through */
	default:
		si->flags &= ~SI_FL_NOLINGER;
		si_rx_shut_blk(si);
		ic->flags |= CF_SHUTR;
		ic->rex = TICK_ETERNITY;
		si->exp = TICK_ETERNITY;
	}

	/* note that if the task exists, it must unregister itself once it runs */
	if (!(si->flags & SI_FL_DONT_WAKE))
		task_wakeup(si_task(si), TASK_WOKEN_IO);
}

/* default chk_rcv function for scheduled tasks */
static void stream_int_chk_rcv(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	DPRINTF(stderr, "%s: si=%p, si->state=%d ic->flags=%08x oc->flags=%08x\n",
		__FUNCTION__,
		si, si->state, ic->flags, si_oc(si)->flags);

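	/* a splice pipe still holds data to be consumed, so don't read more
	 * until room is available again.
	 */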
	if (ic->pipe) {
		/* stop reading */
		si_rx_room_blk(si);
	}
	else {
		/* (re)start reading */
		tasklet_wakeup(si->wait_event.tasklet);
		if (!(si->flags & SI_FL_DONT_WAKE))
			task_wakeup(si_task(si), TASK_WOKEN_IO);
	}
}

/* default chk_snd function for scheduled tasks */
static void stream_int_chk_snd(struct stream_interface *si)
{
	struct channel *oc = si_oc(si);

	DPRINTF(stderr, "%s: si=%p, si->state=%d ic->flags=%08x oc->flags=%08x\n",
		__FUNCTION__,
		si, si->state, si_ic(si)->flags, oc->flags);

	if (unlikely(si->state != SI_ST_EST || (oc->flags & CF_SHUTW)))
		return;

	if (!(si->flags & SI_FL_WAIT_DATA) ||        /* not waiting for data */
	    channel_is_empty(oc))                    /* called with nothing to send ! */
		return;

	/* Otherwise there are remaining data to be sent in the buffer,
	 * so we tell the handler.
	 */
	si->flags &= ~SI_FL_WAIT_DATA;
	if (!tick_isset(oc->wex))
		oc->wex = tick_add_ifset(now_ms, oc->wto);

	if (!(si->flags & SI_FL_DONT_WAKE))
		task_wakeup(si_task(si), TASK_WOKEN_IO);
}

/* Register an applet to handle a stream_interface as a new appctx. The SI will
 * wake it up every time it is solicited. The appctx must be deleted by the task
 * handler using si_release_endpoint(), possibly from within the function itself.
 * It also pre-initializes the applet's context and returns it (or NULL in case
 * it could not be allocated).
 */
struct appctx *si_register_handler(struct stream_interface *si, struct applet *app)
{
	struct appctx *appctx;

	DPRINTF(stderr, "registering handler %p for si %p (was %p)\n", app, si, si_task(si));

	appctx = si_alloc_appctx(si, app);
	if (!appctx)
		return NULL;

	si_cant_get(si);
	appctx_wakeup(appctx);
	return si_appctx(si);
}

/* This callback is used to send a valid PROXY protocol line to a socket being
 * established. It returns 0 if it fails in a fatal way or needs to poll to go
 * further, otherwise it returns non-zero and removes itself from the connection's
 * flags (the bit is provided in <flag> by the caller). It is designed to be
 * called by the connection handler and relies on it to commit polling changes.
 * Note that it can emit a PROXY line by relying on the other end's address
 * when the connection is attached to a stream interface, or by resolving the
 * local address otherwise (also called a LOCAL line).
 */
int conn_si_send_proxy(struct connection *conn, unsigned int flag)
{
	if (!conn_ctrl_ready(conn))
		goto out_error;

	/* If we have a PROXY line to send, we'll use this to validate the
	 * connection, in which case the connection is validated only once
	 * we've sent the whole proxy line. Otherwise we use connect().
	 */
	if (conn->send_proxy_ofs) {
		const struct conn_stream *cs;
		int ret;

		/* If there is no mux attached to the connection, it means the
		 * connection context is a conn-stream.
		 */
		cs = (conn->mux ? cs_get_first(conn) : conn->ctx);

		/* The target server expects a PROXY line to be sent first.
		 * If the send_proxy_ofs is negative, it corresponds to the
		 * offset to start sending from the end of the proxy string
		 * (which is recomputed every time since it's constant). If
		 * it is positive, it means we have to send from the start.
		 * We can only send a "normal" PROXY line when the connection
		 * is attached to a stream interface. Otherwise we can only
		 * send a LOCAL line (eg: for use with health checks).
		 */
		if (cs && cs->data_cb == &si_conn_cb) {
			struct stream_interface *si = cs->data;
			struct conn_stream *remote_cs = objt_cs(si_opposite(si)->end);
			struct stream *strm = si_strm(si);

			ret = make_proxy_line(trash.area, trash.size,
			                      objt_server(conn->target),
			                      remote_cs ? remote_cs->conn : NULL,
			                      strm);
		}
		else {
			/* The target server expects a LOCAL line to be sent first. Retrieving
			 * local or remote addresses may fail until the connection is established.
			 */
			if (!conn_get_src(conn) || !conn_get_dst(conn))
				goto out_wait;

			ret = make_proxy_line(trash.area, trash.size,
			                      objt_server(conn->target), conn,
			                      NULL);
		}

		if (!ret)
			goto out_error;

		if (conn->send_proxy_ofs > 0)
			conn->send_proxy_ofs = -ret; /* first call */

		/* we have to send trash from (ret+sp for -sp bytes). If the
		 * data layer has a pending write, we'll also set MSG_MORE.
		 */
		ret = conn_ctrl_send(conn,
		                     trash.area + ret + conn->send_proxy_ofs,
		                     -conn->send_proxy_ofs,
		                     (conn->subs && conn->subs->events & SUB_RETRY_SEND) ? CO_SFL_MSG_MORE : 0);

		if (ret < 0)
			goto out_error;

		conn->send_proxy_ofs += ret; /* becomes zero once complete */
		if (conn->send_proxy_ofs != 0)
			goto out_wait;

		/* OK we've sent the whole line, we're connected */
	}

	/* The connection is ready now, simply return and let the connection
	 * handler notify upper layers if needed.
	 */
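	/* clear the L4 wait flag along with our own handshake flag so the
	 * connection handler can report the connection as ready.
	 */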
	conn->flags &= ~CO_FL_WAIT_L4_CONN;
	conn->flags &= ~flag;
	return 1;

 out_error:
	/* Write error on the file descriptor */
	conn->flags |= CO_FL_ERROR;
	return 0;

 out_wait:
	return 0;
}

/* This function is the equivalent of si_update() except that it's
 * designed to be called from outside the stream handlers, typically the lower
 * layers (applets, connections) after I/O completion. After updating the stream
 * interface and timeouts, it will try to forward what can be forwarded, then to
 * wake the associated task up if an important event requires special handling.
 * It may update SI_FL_WAIT_DATA and/or SI_FL_RXBLK_ROOM, which the callers are
 * encouraged to watch to take appropriate action.
 * It should not be called from within the stream itself, si_update()
 * is designed for this.
 */
static void stream_int_notify(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);
	struct stream_interface *sio = si_opposite(si);
	struct task *task = si_task(si);

	/* process consumer side */
	if (channel_is_empty(oc)) {
		struct connection *conn = objt_cs(si->end) ? objt_cs(si->end)->conn : NULL;

		if (((oc->flags & (CF_SHUTW|CF_SHUTW_NOW)) == CF_SHUTW_NOW) &&
		    (si->state == SI_ST_EST) && (!conn || !(conn->flags & (CO_FL_WAIT_XPRT | CO_FL_EARLY_SSL_HS))))
			si_shutw(si);
		oc->wex = TICK_ETERNITY;
	}

	/* indicate that we may be waiting for data from the output channel or
	 * we're about to close and can't expect more data if SHUTW_NOW is there.
	 */
	if (!(oc->flags & (CF_SHUTW|CF_SHUTW_NOW)))
		si->flags |= SI_FL_WAIT_DATA;
	else if ((oc->flags & (CF_SHUTW|CF_SHUTW_NOW)) == CF_SHUTW_NOW)
		si->flags &= ~SI_FL_WAIT_DATA;

	/* update OC timeouts and wake the other side up if it's waiting for room */
	if (oc->flags & CF_WRITE_ACTIVITY) {
		if ((oc->flags & (CF_SHUTW|CF_WRITE_PARTIAL)) == CF_WRITE_PARTIAL &&
		    !channel_is_empty(oc))
			if (tick_isset(oc->wex))
				oc->wex = tick_add_ifset(now_ms, oc->wto);

		if (!(si->flags & SI_FL_INDEP_STR))
			if (tick_isset(ic->rex))
				ic->rex = tick_add_ifset(now_ms, ic->rto);
	}

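	/* mirror the channel's CF_DONT_READ policy onto the opposite stream
	 * interface's rx readiness.
	 */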
	if (oc->flags & CF_DONT_READ)
		si_rx_chan_blk(sio);
	else
		si_rx_chan_rdy(sio);

	/* Notify the other side when we've injected data into the IC that
	 * needs to be forwarded. We can do fast-forwarding as soon as there
	 * are output data, but we avoid doing this if some of the data are
	 * not yet scheduled for being forwarded, because it is very likely
	 * that it will be done again immediately afterwards once the following
	 * data are parsed (eg: HTTP chunking). We only clear SI_FL_RXBLK_ROOM once
	 * we've emptied *some* of the output buffer, and not just when there
	 * is available room, because applets are often forced to stop before
	 * the buffer is full. We must not stop based on input data alone because
	 * an HTTP parser might need more data to complete the parsing.
	 */
	if (!channel_is_empty(ic) &&
	    (sio->flags & SI_FL_WAIT_DATA) &&
	    (!(ic->flags & CF_EXPECT_MORE) || c_full(ic) || ci_data(ic) == 0 || ic->pipe)) {
		int new_len, last_len;

		last_len = co_data(ic);
		if (ic->pipe)
			last_len += ic->pipe->data;

		si_chk_snd(sio);

		new_len = co_data(ic);
		if (ic->pipe)
			new_len += ic->pipe->data;

		/* check if the consumer has freed some space either in the
		 * buffer or in the pipe.
		 */
		if (new_len < last_len)
			si_rx_room_rdy(si);
	}

	if (!(ic->flags & CF_DONT_READ))
		si_rx_chan_rdy(si);

	si_chk_rcv(si);
	si_chk_rcv(sio);

	if (si_rx_blocked(si)) {
		ic->rex = TICK_ETERNITY;
	}
	else if ((ic->flags & (CF_SHUTR|CF_READ_PARTIAL)) == CF_READ_PARTIAL) {
		/* we must re-enable reading if si_chk_snd() has freed some space */
		if (!(ic->flags & CF_READ_NOEXP) && tick_isset(ic->rex))
			ic->rex = tick_add_ifset(now_ms, ic->rto);
	}

	/* wake the task up only when needed */
	if (/* changes on the production side */
	    (ic->flags & (CF_READ_NULL|CF_READ_ERROR)) ||
	    !si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST) ||
	    (si->flags & SI_FL_ERR) ||
	    ((ic->flags & CF_READ_PARTIAL) &&
	     ((ic->flags & CF_EOI) || !ic->to_forward || sio->state != SI_ST_EST)) ||

	    /* changes on the consumption side */
	    (oc->flags & (CF_WRITE_NULL|CF_WRITE_ERROR)) ||
	    ((oc->flags & CF_WRITE_ACTIVITY) &&
	     ((oc->flags & CF_SHUTW) ||
	      (((oc->flags & CF_WAKE_WRITE) ||
	        !(oc->flags & (CF_AUTO_CLOSE|CF_SHUTW_NOW|CF_SHUTW))) &&
	       (sio->state != SI_ST_EST ||
	        (channel_is_empty(oc) && !oc->to_forward)))))) {
		task_wakeup(task, TASK_WOKEN_IO);
	}
	else {
		/* Update expiration date for the task and requeue it */
		task->expire = tick_first((tick_is_expired(task->expire, now_ms) ? 0 : task->expire),
					  tick_first(tick_first(ic->rex, ic->wex),
						     tick_first(oc->rex, oc->wex)));

		task->expire = tick_first(task->expire, ic->analyse_exp);
		task->expire = tick_first(task->expire, oc->analyse_exp);

		if (si->exp)
			task->expire = tick_first(task->expire, si->exp);

		if (sio->exp)
			task->expire = tick_first(task->expire, sio->exp);

task_queue(task);
|
2015-09-23 12:40:09 -04:00
|
|
|
}
|
|
|
|
|
if (ic->flags & CF_READ_ACTIVITY)
|
|
|
|
|
ic->flags &= ~CF_READ_DONTWAIT;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
2018-09-11 12:27:21 -04:00
|
|
|
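
/* Worked example for the expiration computation above (illustrative only):
 * with ic->rex due in 5s, oc->wex unset and ic->analyse_exp due in 2s, the
 * chain of tick_first() calls leaves task->expire on the 2s deadline, i.e.
 * the earliest tick that is actually set.
 */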

/* Called by I/O handlers after completion. It propagates connection flags to
 * the stream interface, updates the stream (which may or may not take this
 * opportunity to try to forward data), then updates the connection's polling
 * based on the channels and stream interface's final states. The function
 * always returns 0.
 */
static int si_cs_process(struct conn_stream *cs)
{
	struct connection *conn = cs->conn;
	struct stream_interface *si = cs->data;
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	/* If we have data to send, try it now */
	if (!channel_is_empty(oc) && !(si->wait_event.events & SUB_RETRY_SEND))
		si_cs_send(cs);

	/* First step, report to the stream-int what was detected at the
	 * connection layer : errors and connection establishment.
	 * Only add SI_FL_ERR if we're connected or attempting to connect. We
	 * may get here because we were woken up, but only run after
	 * process_stream() noticed there was an error and decided to retry
	 * the connection; the connection may still have CO_FL_ERROR, and we
	 * don't want to add SI_FL_ERR back.
	 *
	 * Note: This test is only required because si_cs_process is also the SI
	 *       wake callback. Otherwise si_cs_recv()/si_cs_send() already take
	 *       care of it.
	 */
	if (si->state >= SI_ST_CON &&
	    (conn->flags & CO_FL_ERROR || cs->flags & CS_FL_ERROR))
		si->flags |= SI_FL_ERR;

	/* If we had early data, and the handshake ended, then
	 * we can remove the flag, and attempt to wake the task up,
	 * in the event there's an analyser waiting for the end of
	 * the handshake.
	 */
	if (!(conn->flags & (CO_FL_WAIT_XPRT | CO_FL_EARLY_SSL_HS)) &&
	    (cs->flags & CS_FL_WAIT_FOR_HS)) {
		cs->flags &= ~CS_FL_WAIT_FOR_HS;
		task_wakeup(si_task(si), TASK_WOKEN_MSG);
	}

	if (!si_state_in(si->state, SI_SB_EST|SI_SB_DIS|SI_SB_CLO) &&
	    (conn->flags & CO_FL_WAIT_XPRT) == 0) {
		si->exp = TICK_ETERNITY;
		oc->flags |= CF_WRITE_NULL;
		if (si->state == SI_ST_CON)
			si->state = SI_ST_RDY;
	}

	/* Report EOI on the channel if it was reached from the mux point of
	 * view.
	 *
	 * Note: This test is only required because si_cs_process is also the SI
	 *       wake callback. Otherwise si_cs_recv()/si_cs_send() already take
	 *       care of it.
	 */
	if ((cs->flags & CS_FL_EOI) && !(ic->flags & CF_EOI))
		ic->flags |= (CF_EOI|CF_READ_PARTIAL);

	/* Second step : update the stream-int and channels, try to forward any
	 * pending data, then possibly wake the stream up based on the new
	 * stream-int status.
	 */
	stream_int_notify(si);
	stream_release_buffers(si_strm(si));
	return 0;
}
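
/* Overview of the usual wake-up path on a mux event (summary of the code
 * above and below): si_cs_io_cb() -> si_cs_send()/si_cs_recv() ->
 * si_cs_process() -> stream_int_notify() -> possible task_wakeup() of the
 * stream's task.
 */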

/*
 * This function is called to send buffer data to a stream socket.
 * It calls the mux layer's snd_buf function. It relies on the
 * caller to commit polling changes. The caller should check conn->flags
 * for errors.
 */
int si_cs_send(struct conn_stream *cs)
{
	struct connection *conn = cs->conn;
	struct stream_interface *si = cs->data;
	struct channel *oc = si_oc(si);
	int ret;
	int did_send = 0;

	if (conn->flags & CO_FL_ERROR || cs->flags & (CS_FL_ERROR|CS_FL_ERR_PENDING)) {
		/* We're probably there because the tasklet was woken up,
		 * but process_stream() ran before, detected there was an
		 * error and put the si back to SI_ST_TAR. There's still
		 * CO_FL_ERROR on the connection but we don't want to add
		 * SI_FL_ERR back, so give up.
		 */
		if (si->state < SI_ST_CON)
			return 0;
		si->flags |= SI_FL_ERR;
		return 1;
	}

	/* We're already waiting to be able to send, give up */
	if (si->wait_event.events & SUB_RETRY_SEND)
		return 0;

	/* we might have been called just after an asynchronous shutw */
	if (oc->flags & CF_SHUTW)
		return 1;

	/* we must wait because the mux is not installed yet */
	if (!conn->mux)
		return 0;

	if (oc->pipe && conn->xprt->snd_pipe && conn->mux->snd_pipe) {
		ret = conn->mux->snd_pipe(cs, oc->pipe);
		if (ret > 0)
			did_send = 1;

		if (!oc->pipe->data) {
			put_pipe(oc->pipe);
			oc->pipe = NULL;
		}

		if (oc->pipe)
			goto end;
	}

	/* At this point, the pipe is empty, but we may still have data pending
	 * in the normal buffer.
	 */
	if (co_data(oc)) {
		/* when we're here, we already know that there is no spliced
		 * data left, and that there are sendable buffered data.
		 */

		/* check if we want to inform the kernel that we're interested in
		 * sending more data after this call. We want this if :
		 *  - we're about to close after this last send and want to merge
		 *    the ongoing FIN with the last segment.
		 *  - we know we can't send everything at once and must get back
		 *    here because of unaligned data
		 *  - there is still a finite amount of data to forward
		 * The test is arranged so that the most common case does only 2
		 * tests.
		 */
		unsigned int send_flag = 0;

		if ((!(oc->flags & (CF_NEVER_WAIT|CF_SEND_DONTWAIT)) &&
		     ((oc->to_forward && oc->to_forward != CHN_INFINITE_FORWARD) ||
		      (oc->flags & CF_EXPECT_MORE) ||
		      (IS_HTX_STRM(si_strm(si)) &&
		       (!(oc->flags & (CF_EOI|CF_SHUTR)) && htx_expect_more(htxbuf(&oc->buf)))))) ||
		    ((oc->flags & CF_ISRESP) &&
		     ((oc->flags & (CF_AUTO_CLOSE|CF_SHUTW_NOW)) == (CF_AUTO_CLOSE|CF_SHUTW_NOW))))
			send_flag |= CO_SFL_MSG_MORE;

		if (oc->flags & CF_STREAMER)
			send_flag |= CO_SFL_STREAMER;

		if ((si->flags & SI_FL_L7_RETRY) && !b_data(&si->l7_buffer)) {
			struct stream *s = si_strm(si);
			/* If we want to be able to do L7 retries, copy
			 * the data we're about to send, so that we are able
			 * to resend them if needed
			 */
			/* Try to allocate a buffer if we had none.
			 * If it fails, the next test will just
			 * disable the l7 retries by setting
			 * l7_conn_retries to 0.
			 */
			if (!s->txn || (s->txn->req.msg_state != HTTP_MSG_DONE))
				si->flags &= ~SI_FL_L7_RETRY;
			else {
				if (b_is_null(&si->l7_buffer))
					b_alloc(&si->l7_buffer);
				if (b_is_null(&si->l7_buffer))
					si->flags &= ~SI_FL_L7_RETRY;
				else {
					memcpy(b_orig(&si->l7_buffer),
					       b_orig(&oc->buf),
					       b_size(&oc->buf));
					si->l7_buffer.head = co_data(oc);
					b_add(&si->l7_buffer, co_data(oc));
				}
			}
		}

		ret = cs->conn->mux->snd_buf(cs, &oc->buf, co_data(oc), send_flag);
		if (ret > 0) {
			did_send = 1;
			co_set_data(oc, co_data(oc) - ret);
			c_realign_if_empty(oc);

			if (!co_data(oc)) {
				/* Always clear both flags once everything has been sent, they're one-shot */
				oc->flags &= ~(CF_EXPECT_MORE | CF_SEND_DONTWAIT);
			}
			/* if some data remain in the buffer, it's only because the
			 * system buffers are full, we will try next time.
			 */
		}
	}

 end:
	if (did_send) {
		oc->flags |= CF_WRITE_PARTIAL | CF_WROTE_DATA;
		if (si->state == SI_ST_CON)
			si->state = SI_ST_RDY;

		si_rx_room_rdy(si_opposite(si));
	}

	if (conn->flags & CO_FL_ERROR || cs->flags & (CS_FL_ERROR|CS_FL_ERR_PENDING)) {
		si->flags |= SI_FL_ERR;
		return 1;
	}

	/* We couldn't send all of our data, let the mux know we'd like to send more */
	if (!channel_is_empty(oc))
		conn->mux->subscribe(cs, SUB_RETRY_SEND, &si->wait_event);
	return did_send;
}
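
/* Illustrative sketch (not part of the original code): the usual pattern for
 * driving si_cs_send() from a callback, mirroring what si_cs_io_cb() below
 * does. The helper name is made up for illustration; a send is only attempted
 * when there is something to push and no send subscription is pending.
 */
#if 0
static void example_try_send(struct stream_interface *si)
{
	struct conn_stream *cs = objt_cs(si->end);

	if (cs && !channel_is_empty(si_oc(si)) &&
	    !(si->wait_event.events & SUB_RETRY_SEND))
		si_cs_send(cs);
}
#endif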

/* This is the ->process() function for any stream-interface's wait_event task.
 * It's assigned during the stream-interface's initialization, for any type of
 * stream interface. Thus it is always safe to perform a tasklet_wakeup() on a
 * stream interface, as the presence of the CS is checked there.
 */
struct task *si_cs_io_cb(struct task *t, void *ctx, unsigned short state)
{
	struct stream_interface *si = ctx;
	struct conn_stream *cs = objt_cs(si->end);
	int ret = 0;

	if (!cs)
		return NULL;

	if (!(si->wait_event.events & SUB_RETRY_SEND) && !channel_is_empty(si_oc(si)))
		ret = si_cs_send(cs);
	if (!(si->wait_event.events & SUB_RETRY_RECV))
		ret |= si_cs_recv(cs);
	if (ret != 0)
		si_cs_process(cs);

	stream_release_buffers(si_strm(si));
	return (NULL);
}

/* This function is designed to be called from within the stream handler to
 * update the input channel's expiration timer and the stream interface's
 * Rx flags based on the channel's flags. It needs to be called only once
 * after the channel's flags have settled down, and before they are cleared,
 * though it doesn't harm to call it as often as desired (it just slightly
 * hurts performance). It must not be called from outside of the stream
 * handler, as what it does will be used to compute the stream task's
 * expiration.
 */
void si_update_rx(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	if (ic->flags & CF_SHUTR) {
		si_rx_shut_blk(si);
		return;
	}

	/* Read not closed, update FD status and timeout for reads */
	if (ic->flags & CF_DONT_READ)
		si_rx_chan_blk(si);
	else
		si_rx_chan_rdy(si);

	if (!channel_is_empty(ic)) {
		/* stop reading, imposed by channel's policy or contents */
		si_rx_room_blk(si);
	}
	else {
		/* (re)start reading and update timeout. Note: we don't recompute the timeout
		 * every time we get here, otherwise it would risk never to expire. We only
		 * update it if it was not yet set. The stream socket handler will already
		 * have updated it if there has been a completed I/O.
		 */
		si_rx_room_rdy(si);
	}
	if (si->flags & SI_FL_RXBLK_ANY & ~SI_FL_RX_WAIT_EP)
		ic->rex = TICK_ETERNITY;
	else if (!(ic->flags & CF_READ_NOEXP) && !tick_isset(ic->rex))
		ic->rex = tick_add_ifset(now_ms, ic->rto);

	si_chk_rcv(si);
}
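
/* Example for si_update_rx() above (illustrative): with CF_DONT_READ cleared
 * and an empty input channel, reception is re-opened and ic->rex is armed
 * from ic->rto, unless it was already ticking or CF_READ_NOEXP is set; if any
 * Rx blocking reason other than SI_FL_RX_WAIT_EP remains, the read timeout is
 * disabled instead (TICK_ETERNITY).
 */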

/* This function is designed to be called from within the stream handler to
 * update the output channel's expiration timer and the stream interface's
 * Tx flags based on the channel's flags. It needs to be called only once
 * after the channel's flags have settled down, and before they are cleared,
 * though it doesn't harm to call it as often as desired (it just slightly
 * hurts performance). It must not be called from outside of the stream
 * handler, as what it does will be used to compute the stream task's
 * expiration.
 */
void si_update_tx(struct stream_interface *si)
{
	struct channel *oc = si_oc(si);
	struct channel *ic = si_ic(si);

	if (oc->flags & CF_SHUTW)
		return;

	/* Write not closed, update FD status and timeout for writes */
	if (channel_is_empty(oc)) {
		/* stop writing */
		if (!(si->flags & SI_FL_WAIT_DATA)) {
			if ((oc->flags & CF_SHUTW_NOW) == 0)
				si->flags |= SI_FL_WAIT_DATA;
			oc->wex = TICK_ETERNITY;
		}
		return;
	}

	/* (re)start writing and update timeout. Note: we don't recompute the timeout
	 * every time we get here, otherwise it would risk never to expire. We only
	 * update it if it was not yet set. The stream socket handler will already
	 * have updated it if there has been a completed I/O.
	 */
	si->flags &= ~SI_FL_WAIT_DATA;
	if (!tick_isset(oc->wex)) {
		oc->wex = tick_add_ifset(now_ms, oc->wto);
		if (tick_isset(ic->rex) && !(si->flags & SI_FL_INDEP_STR)) {
			/* Note: depending on the protocol, we don't know if we're waiting
			 * for incoming data or not. So in order to prevent the socket from
			 * expiring read timeouts during writes, we refresh the read timeout,
			 * except if it was already infinite or if we have explicitly setup
			 * independent streams.
			 */
			ic->rex = tick_add_ifset(now_ms, ic->rto);
		}
	}
}
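
/* Example for si_update_tx() above (illustrative): when output data are
 * pending and oc->wex was not ticking, it is armed from oc->wto; on a
 * non-independent stream the read timeout ic->rex is refreshed at the same
 * time so a pending read does not expire while we are busy writing.
 */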

/* perform a synchronous send() for the stream interface. The CF_WRITE_NULL and
 * CF_WRITE_PARTIAL flags are cleared prior to the attempt, and will possibly
 * be updated in case of success.
 */
void si_sync_send(struct stream_interface *si)
{
	struct channel *oc = si_oc(si);
	struct conn_stream *cs;

	oc->flags &= ~(CF_WRITE_NULL|CF_WRITE_PARTIAL);

	if (oc->flags & CF_SHUTW)
		return;

	if (channel_is_empty(oc))
		return;

	if (!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		return;

	cs = objt_cs(si->end);
	if (!cs || !cs->conn->mux)
		return;

	si_cs_send(cs);
}
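
/* Illustrative sketch (not part of the original code): how the stream handler
 * is expected to combine the helpers above at the end of its processing,
 * i.e. flush pending output synchronously, then refresh flags and timers on
 * both sides. The helper name and the si[0]/si[1] layout are assumptions made
 * for the example; the real call sites live in the stream handler.
 */
#if 0
static void example_end_of_stream_processing(struct stream *s)
{
	/* push any pending data towards the other side first */
	si_sync_send(&s->si[1]);

	/* then recompute both stream interfaces' flags and expiration timers */
	si_update_both(&s->si[0], &s->si[1]);
}
#endif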

/* Updates at once the channel flags, and timers of both stream interfaces of a
 * same stream, to complete the work after the analysers, then updates the data
 * layer below. This will ensure that any synchronous update performed at the
 * data layer will be reflected in the channel flags and/or stream-interface.
 * Note that this does not change the stream interface's current state, though
 * it updates the previous state to the current one.
 */
void si_update_both(struct stream_interface *si_f, struct stream_interface *si_b)
{
	struct channel *req = si_ic(si_f);
	struct channel *res = si_oc(si_f);

	req->flags &= ~(CF_READ_NULL|CF_READ_PARTIAL|CF_READ_ATTACHED|CF_WRITE_NULL|CF_WRITE_PARTIAL);
	res->flags &= ~(CF_READ_NULL|CF_READ_PARTIAL|CF_READ_ATTACHED|CF_WRITE_NULL|CF_WRITE_PARTIAL);

	si_f->prev_state = si_f->state;
	si_b->prev_state = si_b->state;

	/* let's recompute both sides states */
	if (si_state_in(si_f->state, SI_SB_RDY|SI_SB_EST))
		si_update(si_f);

	if (si_state_in(si_b->state, SI_SB_RDY|SI_SB_EST))
		si_update(si_b);

	/* stream ints are processed outside of process_stream() and must be
	 * handled at the latest moment.
	 */
	if (obj_type(si_f->end) == OBJ_TYPE_APPCTX &&
	    ((si_rx_endp_ready(si_f) && !si_rx_blocked(si_f)) ||
	     (si_tx_endp_ready(si_f) && !si_tx_blocked(si_f))))
		appctx_wakeup(si_appctx(si_f));

	if (obj_type(si_b->end) == OBJ_TYPE_APPCTX &&
	    ((si_rx_endp_ready(si_b) && !si_rx_blocked(si_b)) ||
	     (si_tx_endp_ready(si_b) && !si_tx_blocked(si_b))))
		appctx_wakeup(si_appctx(si_b));
}

/*
 * This function performs a shutdown-read on a stream interface attached to
 * a connection in a connected or init state (it does nothing for other
 * states). It either shuts the read side or marks itself as closed. The buffer
 * flags are updated to reflect the new state. If the stream interface has
 * SI_FL_NOHALF, we also forward the close to the write side. If a control
 * layer is defined, then it is supposed to be a socket layer and file
 * descriptors are then shutdown or closed accordingly. The function
 * automatically disables polling if needed.
 */
static void stream_int_shutr_conn(struct stream_interface *si)
{
	struct conn_stream *cs = __objt_cs(si->end);
	struct channel *ic = si_ic(si);

	si_rx_shut_blk(si);
	if (ic->flags & CF_SHUTR)
		return;
	ic->flags |= CF_SHUTR;
	ic->rex = TICK_ETERNITY;

	if (!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		return;

	if (si->flags & SI_FL_KILL_CONN)
		cs->flags |= CS_FL_KILL_CONN;

	if (si_oc(si)->flags & CF_SHUTW) {
		cs_close(cs);
		si->state = SI_ST_DIS;
		si->exp = TICK_ETERNITY;
	}
	else if (si->flags & SI_FL_NOHALF) {
		/* we want to immediately forward this close to the write side */
		return stream_int_shutw_conn(si);
	}
}
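
/* Example (illustrative): with SI_FL_NOHALF set, a read shutdown handled by
 * stream_int_shutr_conn() above does not leave the connection half-closed:
 * it immediately chains into stream_int_shutw_conn() below, which closes the
 * conn_stream and moves the stream interface to SI_ST_DIS.
 */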

/*
 * This function performs a shutdown-write on a stream interface attached to
 * a connection in a connected or init state (it does nothing for other
 * states). It either shuts the write side or marks itself as closed. The
 * buffer flags are updated to reflect the new state. It does also close
 * everything if the SI was marked as being in error state. If there is a
 * data-layer shutdown, it is called.
 */
static void stream_int_shutw_conn(struct stream_interface *si)
{
	struct conn_stream *cs = __objt_cs(si->end);
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	oc->flags &= ~CF_SHUTW_NOW;
	if (oc->flags & CF_SHUTW)
		return;
	oc->flags |= CF_SHUTW;
	oc->wex = TICK_ETERNITY;
	si_done_get(si);
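
	/* When a half-closed timeout is configured on this stream interface
	 * (si->hcto, derived from the "timeout client-fin"/"timeout server-fin"
	 * settings), it takes over the read timeout now that the write side is
	 * being shut down.
	 */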
	if (tick_isset(si->hcto)) {
		ic->rto = si->hcto;
		ic->rex = tick_add(now_ms, ic->rto);
	}

	switch (si->state) {
	case SI_ST_RDY:
	case SI_ST_EST:
		/* we have to shut before closing, otherwise some short messages
		 * may never leave the system, especially when there are remaining
		 * unread data in the socket input buffer, or when nolinger is set.
		 * However, if SI_FL_NOLINGER is explicitly set, we know there is
		 * no risk so we close both sides immediately.
		 */
		if (si->flags & SI_FL_KILL_CONN)
			cs->flags |= CS_FL_KILL_CONN;

		if (si->flags & SI_FL_ERR) {
			/* quick close, the socket is already shut anyway */
		}
		else if (si->flags & SI_FL_NOLINGER) {
			/* unclean data-layer shutdown, typically an aborted request
			 * or a forwarded shutdown from a client to a server due to
			 * option abortonclose. No need for the TLS layer to try to
			 * emit a shutdown message.
			 */
			cs_shutw(cs, CS_SHW_SILENT);
		}
		else {
			/* clean data-layer shutdown. This only happens on the
			 * frontend side, or on the backend side when forwarding
			 * a client close in TCP mode or in HTTP TUNNEL mode
			 * while option abortonclose is set. We want the TLS
			 * layer to try to signal it to the peer before we close.
			 */
			cs_shutw(cs, CS_SHW_NORMAL);

			if (!(ic->flags & (CF_SHUTR|CF_DONT_READ)))
				return;
		}

		/* fall through */
	case SI_ST_CON:
		/* we may have to close a pending connection, and mark the
		 * response buffer as shutr
		 */
		if (si->flags & SI_FL_KILL_CONN)
			cs->flags |= CS_FL_KILL_CONN;
		cs_close(cs);
		/* fall through */
	case SI_ST_CER:
	case SI_ST_QUE:
	case SI_ST_TAR:
		si->state = SI_ST_DIS;
		/* fall through */
	default:
		si->flags &= ~SI_FL_NOLINGER;
		si_rx_shut_blk(si);
		ic->flags |= CF_SHUTR;
		ic->rex = TICK_ETERNITY;
		si->exp = TICK_ETERNITY;
	}
}

/* This function is used for inter-stream-interface calls. It is called by the
 * consumer to inform the producer side that it may be interested in checking
 * for free space in the buffer. Note that it intentionally does not update
 * timeouts, so that we can still check them later at wake-up. This function is
 * dedicated to connection-based stream interfaces.
 */
static void stream_int_chk_rcv_conn(struct stream_interface *si)
{
	/* (re)start reading */
	if (si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		tasklet_wakeup(si->wait_event.tasklet);
}

/* This function is used for inter-stream-interface calls. It is called by the
 * producer to inform the consumer side that it may be interested in checking
 * for data in the buffer. Note that it intentionally does not update timeouts,
 * so that we can still check them later at wake-up.
 */
static void stream_int_chk_snd_conn(struct stream_interface *si)
{
	struct channel *oc = si_oc(si);
	struct conn_stream *cs = __objt_cs(si->end);

	if (unlikely(!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST) ||
	    (oc->flags & CF_SHUTW)))
		return;

	if (unlikely(channel_is_empty(oc)))  /* called with nothing to send ! */
		return;

	if (!oc->pipe &&                          /* spliced data wants to be forwarded ASAP */
	    !(si->flags & SI_FL_WAIT_DATA))       /* not waiting for data */
		return;

	if (!(si->wait_event.events & SUB_RETRY_SEND) && !channel_is_empty(si_oc(si)))
		si_cs_send(cs);

	if (cs->flags & (CS_FL_ERROR|CS_FL_ERR_PENDING) || cs->conn->flags & CO_FL_ERROR) {
		/* Write error on the file descriptor */
		if (si->state >= SI_ST_CON)
			si->flags |= SI_FL_ERR;
		goto out_wakeup;
	}

	/* OK, so now we know that some data might have been sent, and that we may
	 * have to poll first. We have to do that too if the buffer is not empty.
	 */
	if (channel_is_empty(oc)) {
		/* the connection is established but we can't write. Either the
		 * buffer is empty, or we just refrain from sending because the
		 * ->o limit was reached. Maybe we just wrote the last
		 * chunk and need to close.
		 */
		if (((oc->flags & (CF_SHUTW|CF_AUTO_CLOSE|CF_SHUTW_NOW)) ==
		     (CF_AUTO_CLOSE|CF_SHUTW_NOW)) &&
		    si_state_in(si->state, SI_SB_RDY|SI_SB_EST)) {
			si_shutw(si);
			goto out_wakeup;
		}

		if ((oc->flags & (CF_SHUTW|CF_SHUTW_NOW)) == 0)
			si->flags |= SI_FL_WAIT_DATA;
		oc->wex = TICK_ETERNITY;
	}
	else {
		/* Otherwise there are remaining data to be sent in the buffer,
		 * which means we have to poll before doing so.
		 */
		si->flags &= ~SI_FL_WAIT_DATA;
		if (!tick_isset(oc->wex))
			oc->wex = tick_add_ifset(now_ms, oc->wto);
	}

	if (likely(oc->flags & CF_WRITE_ACTIVITY)) {
		struct channel *ic = si_ic(si);

		/* update timeout if we have written something */
		if ((oc->flags & (CF_SHUTW|CF_WRITE_PARTIAL)) == CF_WRITE_PARTIAL &&
		    !channel_is_empty(oc))
			oc->wex = tick_add_ifset(now_ms, oc->wto);

		if (tick_isset(ic->rex) && !(si->flags & SI_FL_INDEP_STR)) {
			/* Note: to prevent the client from expiring read timeouts
			 * during writes, we refresh it. We only do this if the
			 * interface is not configured for "independent streams",
			 * because for some applications it's better not to do this,
			 * for instance when continuously exchanging small amounts
			 * of data which can fill the socket buffers long before a
			 * write timeout is detected.
			 */
			ic->rex = tick_add_ifset(now_ms, ic->rto);
		}
	}

	/* in case of special condition (error, shutdown, end of write...), we
	 * have to notify the task.
	 */
	if (likely((oc->flags & (CF_WRITE_NULL|CF_WRITE_ERROR|CF_SHUTW)) ||
	           ((oc->flags & CF_WAKE_WRITE) &&
	            ((channel_is_empty(oc) && !oc->to_forward) ||
	             !si_state_in(si->state, SI_SB_EST))))) {
	out_wakeup:
		if (!(si->flags & SI_FL_DONT_WAKE))
			task_wakeup(si_task(si), TASK_WOKEN_IO);
	}
}

/*
 * This is the callback which is called by the connection layer to receive data
 * into the buffer from the connection. It iterates over the mux layer's
 * rcv_buf function.
 */
int si_cs_recv(struct conn_stream *cs)
{
	struct connection *conn = cs->conn;
	struct stream_interface *si = cs->data;
	struct channel *ic = si_ic(si);
	int ret, max, cur_read = 0;
	int read_poll = MAX_READ_POLL_LOOPS;
	int flags = 0;

	/* If not established yet, do nothing: receiving from the mux before
	 * SI_ST_EST would defeat the connection retry logic.
	 */
	if (si->state != SI_ST_EST)
		return 0;

	/* If another call to si_cs_recv() failed, and we subscribed to
	 * recv events already, give up now.
	 */
	if (si->wait_event.events & SUB_RETRY_RECV)
		return 0;

	/* maybe we were called immediately after an asynchronous shutr */
	if (ic->flags & CF_SHUTR)
		return 1;

	/* we must wait because the mux is not installed yet */
	if (!conn->mux)
		return 0;

	/* stop here if we reached the end of data */
	if (cs->flags & CS_FL_EOS)
		goto end_recv;

	/* stop immediately on errors. Note that we DON'T want to stop on
	 * POLL_ERR, as the poller might report a write error while there
	 * are still data available in the recv buffer. This typically
	 * happens when we send too large a request to a backend server
	 * which rejects it before reading it all.
	 */
	if (!(cs->flags & CS_FL_RCV_MORE)) {
		if (!conn_xprt_ready(conn))
			return 0;
		if (conn->flags & CO_FL_ERROR || cs->flags & CS_FL_ERROR)
			goto end_recv;
	}

	/* prepare to detect if the mux needs more room */
	cs->flags &= ~CS_FL_WANT_ROOM;

	if ((ic->flags & (CF_STREAMER | CF_STREAMER_FAST)) && !co_data(ic) &&
	    global.tune.idle_timer &&
	    (unsigned short)(now_ms - ic->last_read) >= global.tune.idle_timer) {
		/* The buffer was empty and nothing was transferred for more
		 * than one second. This was caused by a pause and not by
		 * congestion. Reset any streaming mode to reduce latency.
		 */
		ic->xfer_small = 0;
		ic->xfer_large = 0;
		ic->flags &= ~(CF_STREAMER | CF_STREAMER_FAST);
	}

	/* First, let's see if we may splice data across the channel without
	 * using a buffer.
	 */
BUG/MEDIUM: connection: add a mux flag to indicate splice usability
Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.
What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (that only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message were skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.
All this goes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() unconditionally because, as the commit above showed, this
reduces the forwarding performance by 2 to 3 in TCP and legacy modes
due to data lying in the buffer preventing splicing from being used
later.
The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply be to indicate that more data are awaited
but the real need being to distinguish splicing and receiving, here
instead we clearly report the mux's willingness to be called for splicing
or not. Hence the flag's name, CS_FL_MAY_SPLICE.
The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.
The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.
All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity but definitely needs to be thought about, and should
even significantly simplify the data processing layer.
The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunked encoded data, since these ones are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.
This fix must be backported to 2.1 and 2.0.
2020-01-17 10:19:34 -05:00
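To make the division of labour concrete, here is a minimal sketch of the mux side of this contract. It is illustrative only: the helper name and its two status arguments are assumptions, and only CS_FL_MAY_SPLICE and the conn_stream flags field come from the change described above.

static inline void mux_update_may_splice(struct conn_stream *cs,
                                         int buf_empty, int more_raw_data)
{
	/* hypothetical helper: a real mux would derive these two booleans
	 * from its own buffer state and from the protocol framing.
	 */
	if (buf_empty && more_raw_data)
		cs->flags |= CS_FL_MAY_SPLICE;   /* caller may try rcv_pipe() */
	else
		cs->flags &= ~CS_FL_MAY_SPLICE;  /* caller must fall back to rcv_buf() */
}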
	if (cs->flags & CS_FL_MAY_SPLICE &&
	    (ic->pipe || ic->to_forward >= MIN_SPLICE_FORWARD) &&
	    ic->flags & CF_KERN_SPLICING) {
		if (c_data(ic)) {
			/* We're embarrassed, there are already data pending in
			 * the buffer and we don't want to have them at two
			 * locations at a time. Let's indicate we need some
			 * place and ask the consumer to hurry.
			 */
			flags |= CO_RFL_BUF_FLUSH;
			goto abort_splice;
		}

		if (unlikely(ic->pipe == NULL)) {
			if (pipes_used >= global.maxpipes || !(ic->pipe = get_pipe())) {
				ic->flags &= ~CF_KERN_SPLICING;
				goto abort_splice;
			}
		}

		ret = conn->mux->rcv_pipe(cs, ic->pipe, ic->to_forward);
		if (ret < 0) {
			/* splice not supported on this end, let's disable it */
			ic->flags &= ~CF_KERN_SPLICING;
			goto abort_splice;
		}

		if (ret > 0) {
			if (ic->to_forward != CHN_INFINITE_FORWARD)
				ic->to_forward -= ret;
			ic->total += ret;
			cur_read += ret;
			ic->flags |= CF_READ_PARTIAL;
		}
		if (conn->flags & CO_FL_ERROR || cs->flags & (CS_FL_EOS|CS_FL_ERROR))
			goto end_recv;
BUG/MEDIUM: splicing: fix abnormal CPU usage with splicing
Mark Janssen reported an issue in 1.5-dev19 which was introduced
in 1.5-dev12 by commit 96199b10. From time to time, randomly, the
CPU usage spikes to 100% for seconds to minutes.
A deep analysis of the traces provided shows that it happens when
waiting for the response to a second pipelined HTTP request, or
when trying to handle the received shutdown advertised by epoll()
after the last block of data. Each time, splice() was involved with
data pending in the pipe.
The cause of this was that such events could not be taken into account
by splice nor by recv and were left pending :
- the transfer of the last block of data, optionally with a shutdown
was not handled by splice() because of the validation that to_forward
is higher than MIN_SPLICE_FORWARD ;
- the next recv() call was inhibited because of the test on presence
of data in the pipe. This is also what prevented the recv() call
from handling a response to a pipelined request until the client
had ACKed the previous response.
No less than 4 different methods were experimented with to fix this, and the
current one was finally chosen. The principle is that if an event is not
caught by splice(), then it MUST be caught by recv(). So we remove the
condition on the pipe's emptiness to perform a recv(), and in order to
prevent recv() from being used in the middle of a transfer, we mark
supposedly full pipes with CO_FL_WAIT_ROOM, which makes sense because
the reason for stopping a splice()-based receive is that the pipe is
supposed to be full.
The net effect is that we don't wake up and sleep in loops during these
transient states. This happened much more often than expected, sometimes
for a few cycles at end of transfers, but rarely long enough to be
noticed, unless a client timed out with data pending in the pipe. The
effect on CPU usage is visible even when transferring 1MB objects in
pipeline, where the CPU usage drops from 10 to 6% on a small machine at
medium bandwidth.
Some further improvements are needed :
- the last chunk of a splice() transfer is never done using splice due
to the test on to_forward. This is wrong and should be performed with
splice if the pipe has not yet been emptied ;
- si_chk_snd() should not be called when the write event is already being
polled, otherwise we're almost certain to get EAGAIN.
Many thanks to Mark for all the traces he cared to provide, they were
essential for understanding this issue which was not reproducible
without.
Only 1.5-dev is affected, no backport is needed.
2013-07-18 15:49:32 -04:00
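As a rough illustration of the convention adopted here (not the actual transport code), a splice-based receive path might flag the connection as shown below; the helper and the SPLICE_FULL_THRESHOLD constant are assumptions, only CO_FL_WAIT_ROOM and the pipe's data counter come from the text above.

static void sketch_splice_recv_mark_full(struct connection *conn,
                                         const struct pipe *pipe, int ret)
{
	/* hypothetical: <ret> is what splice() just moved into <pipe> */
	if (ret > 0 && pipe->data >= SPLICE_FULL_THRESHOLD) {
		/* the pipe looks full: stop splicing before polling and let
		 * recv() take over any event splice() could not consume.
		 */
		conn->flags |= CO_FL_WAIT_ROOM;
	}
}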
		if (conn->flags & CO_FL_WAIT_ROOM) {
			/* the pipe is full or we have read enough data that it
			 * could soon be full. Let's stop before needing to poll.
			 */
			si_rx_room_blk(si);
			goto done_recv;
		}

		/* splice not possible (anymore), let's go on with the standard copy */
	}

 abort_splice:
	if (ic->pipe && unlikely(!ic->pipe->data)) {
		put_pipe(ic->pipe);
		ic->pipe = NULL;
	}
	if (ic->pipe && ic->to_forward && !(flags & CO_RFL_BUF_FLUSH) && cs->flags & CS_FL_MAY_SPLICE) {
		/* don't break splicing by reading, but still call rcv_buf()
		 * to pass the flag.
		 */
		goto done_recv;
	}

BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled
When an entity tries to get a buffer and it cannot be allocated, for example
because the number of buffers which may be allocated per process is limited,
this entity is added to a list (called <buffer_wq>) and waits for an available
buffer.
Historically, the <buffer_wq> list was logically attached to streams because they
were the only entities likely to be added to it. Now, applets can also be
waiting for a free buffer. And with filters, we can imagine having yet more
entities waiting for a buffer. So it makes sense to have a generic list.
Anyway, with the current design there is a bug. When an applet fails to get a
buffer, it will wait. But we add the stream attached to the applet to
<buffer_wq>, instead of the applet itself. So when a buffer becomes available, we
wake up the stream and not the waiting applet. So it is possible to have
waiting applets that are never awakened.
So, now, <buffer_wq> is independent of streams. And we really add the waiting
entity to <buffer_wq>. To be generic, the entity is responsible for defining the
callback used to awaken it.
In addition, applets will still request an input buffer when they become
active. But they will no longer be put to sleep if no buffer is available. So it
is the responsibility of the applet I/O handler to check whether this buffer is
allocated or not. This way, an applet can decide whether this buffer is required
or not and can do additional processing if not.
[wt: backport to 1.7 and 1.6]
2016-12-09 11:30:18 -05:00
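The shape of such a generic wait-queue entry could look like the sketch below; the field names are assumptions used for illustration, not the exact haproxy definitions.

struct buffer_wait_sketch {
	void *target;                   /* the stream or applet that is waiting */
	int (*wakeup_cb)(void *target); /* callback run once a buffer is free */
	struct list list;               /* links the entry into <buffer_wq> */
};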
	/* now we'll need an input buffer for the stream */
	if (!si_alloc_ibuf(si, &(si_strm(si)->buffer_wait)))
MAJOR: session: only allocate buffers when needed
A session doesn't need buffers all the time, especially when they're
empty. With this patch, we don't allocate buffers anymore when the
session is initialized, we only allocate them in two cases :
- during process_session()
- during I/O operations
During process_session(), we try hard to allocate both buffers at once
so that we know for sure that a started operation can complete. Indeed,
a previous version of this patch used to allocate one buffer at a time,
but it can result in a deadlock when all buffers are allocated for
requests for example, and there's no buffer left to emit error responses.
Here, if any of the buffers cannot be allocated, the whole operation is
cancelled and the session is added at the tail of the buffer wait queue.
At the end of process_session(), a call to session_release_buffers() is
done so that we can offer unused buffers to other sessions waiting for
them.
For I/O operations, we only need to allocate a buffer on the Rx path.
For this, we only allocate a single buffer but ensure that at least two
are available to avoid the deadlock situation. In case buffers are not
available, SI_FL_WAIT_ROOM is set on the stream interface and the session
is queued. Unused buffers resulting either from a successful send() or
from an unused read buffer are offered to pending sessions during the
->wake() callback.
2014-11-25 13:46:36 -05:00
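In other words, process_session() follows an all-or-nothing policy which could be sketched as below; the helper names are hypothetical and only outline the behaviour described above.

static int sketch_alloc_work_buffers(struct stream *s)
{
	/* hypothetical helpers: try to obtain both channel buffers at once */
	if (alloc_channel_buffer(&s->req) && alloc_channel_buffer(&s->res))
		return 1;               /* the started operation can complete */

	release_channel_buffers(s);     /* give back whatever was obtained */
	queue_in_buffer_wait_list(s);   /* retried when buffers are offered back */
	return 0;
}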
		goto end_recv;

	/* Important note : if we're called with POLL_IN|POLL_HUP, it means the read polling
	 * was enabled, which implies that the recv buffer was not full. So we have a guarantee
	 * that if such an event is not handled above in splice, it will be handled here by
	 * recv().
	 */
	while ((cs->flags & CS_FL_RCV_MORE) ||
	       (!(conn->flags & (CO_FL_ERROR | CO_FL_HANDSHAKE)) &&
	        (!(cs->flags & (CS_FL_ERROR|CS_FL_EOS))) && !(ic->flags & CF_SHUTR))) {
		/* <max> may be null. It is the mux's responsibility to set
		 * CS_FL_RCV_MORE on the CS if more space is needed.
		 */
		max = channel_recv_max(ic);
		ret = cs->conn->mux->rcv_buf(cs, &ic->buf, max, flags | (co_data(ic) ? CO_RFL_BUF_WET : 0));

		if (cs->flags & CS_FL_WANT_ROOM)
			si_rx_room_blk(si);

		if (ret <= 0) {
			/* if we refrained from reading because we asked for a
			 * flush to satisfy rcv_pipe(), we must not subscribe
			 * and instead report that there's not enough room
			 * here to proceed.
			 */
			if (flags & CO_RFL_BUF_FLUSH)
				si_rx_room_blk(si);
			break;
		}

		/* L7 retries enabled and maximum connection retries not reached */
		if ((si->flags & SI_FL_L7_RETRY) && si->conn_retries) {
MEDIUM: streams: Add the ability to retry a request on L7 failure.
When running in HTX mode, if we sent the request, but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we just failed to connect to
the server.
To do so, add a new backend keyword, "retry-on".
It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".
The default is "conn-failure".
2019-04-05 09:30:12 -04:00
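For reference, a backend enabling this behaviour might look like the hedged example below; names and addresses are placeholders, and "retry-on" takes the keywords listed above, defaulting to "conn-failure".

backend be_app
    retries  3
    retry-on conn-failure empty-response response-timeout 503 504
    server   app1 192.0.2.10:8080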
			struct htx *htx;
			struct htx_sl *sl;

			htx = htxbuf(&ic->buf);
			if (htx) {
				sl = http_get_stline(htx);
				if (sl && l7_status_match(si_strm(si)->be,
							  sl->info.res.status)) {
					/* If we got a status for which we would
					 * like to retry the request, empty
					 * the buffer and pretend there's an
					 * error on the channel.
					 */
					ic->flags |= CF_READ_ERROR;
					htx_reset(htx);
					return 1;
				}
			}
			si->flags &= ~SI_FL_L7_RETRY;
		}

		cur_read += ret;
		/* if we're allowed to directly forward data, we must update ->o */
		if (ic->to_forward && !(ic->flags & (CF_SHUTW|CF_SHUTW_NOW))) {
			unsigned long fwd = ret;
			if (ic->to_forward != CHN_INFINITE_FORWARD) {
				if (fwd > ic->to_forward)
					fwd = ic->to_forward;
				ic->to_forward -= fwd;
			}
			c_adv(ic, fwd);
		}

		ic->flags |= CF_READ_PARTIAL;
		ic->total += ret;

		if ((ic->flags & CF_READ_DONTWAIT) || --read_poll <= 0) {
			/* we're stopped by the channel's policy */
			si_rx_chan_blk(si);
			break;
		}

		/* if too many bytes were missing from last read, it means that
		 * it's pointless trying to read again because the system does
		 * not have them in buffers.
		 */
		if (ret < max) {
			/* if a streamer has read few data, it may be because we
			 * have exhausted system buffers. It's not worth trying
			 * again.
			 */
			if (ic->flags & CF_STREAMER) {
				/* we're stopped by the channel's policy */
				si_rx_chan_blk(si);
				break;
			}

			/* if we read a large block smaller than what we requested,
			 * it's almost certain we'll never get anything more.
			 */
			if (ret >= global.tune.recv_enough) {
				/* we're stopped by the channel's policy */
				si_rx_chan_blk(si);
				break;
			}
		}

		/* if we are waiting for more space, don't try to read more data
		 * right now.
		 */
		if (si_rx_blocked(si))
			break;
	} /* while !flags */

 done_recv:
	if (cur_read) {
		if ((ic->flags & (CF_STREAMER | CF_STREAMER_FAST)) &&
		    (cur_read <= ic->buf.size / 2)) {
			ic->xfer_large = 0;
			ic->xfer_small++;
			if (ic->xfer_small >= 3) {
				/* we have read less than half of the buffer in
				 * one pass, and this happened at least 3 times.
				 * This is definitely not a streamer.
				 */
				ic->flags &= ~(CF_STREAMER | CF_STREAMER_FAST);
			}
			else if (ic->xfer_small >= 2) {
				/* if the buffer has been at least half full twice,
				 * we receive faster than we send, so at least it
				 * is not a "fast streamer".
				 */
				ic->flags &= ~CF_STREAMER_FAST;
			}
		}
		else if (!(ic->flags & CF_STREAMER_FAST) &&
			 (cur_read >= ic->buf.size - global.tune.maxrewrite)) {
			/* we read a full buffer at once */
			ic->xfer_small = 0;
			ic->xfer_large++;
			if (ic->xfer_large >= 3) {
				/* we call this buffer a fast streamer if it manages
				 * to be filled in one call 3 consecutive times.
				 */
				ic->flags |= (CF_STREAMER | CF_STREAMER_FAST);
			}
		}
		else {
			ic->xfer_small = 0;
			ic->xfer_large = 0;
		}
		ic->last_read = now_ms;
	}

 end_recv:
	ret = (cur_read != 0);

	/* Report EOI on the channel if it was reached from the mux point of
	 * view. */
	if ((cs->flags & CS_FL_EOI) && !(ic->flags & CF_EOI)) {
		ic->flags |= (CF_EOI|CF_READ_PARTIAL);
		ret = 1;
	}

	if (conn->flags & CO_FL_ERROR || cs->flags & CS_FL_ERROR) {
		cs->flags |= CS_FL_ERROR;
		si->flags |= SI_FL_ERR;
		ret = 1;
	}
	else if (cs->flags & CS_FL_EOS) {
		/* we received a shutdown */
		ic->flags |= CF_READ_NULL;
		if (ic->flags & CF_AUTO_CLOSE)
			channel_shutw_now(ic);
		stream_int_read0(si);
		ret = 1;
	}
	else if (!si_rx_blocked(si)) {
		/* Subscribe to receive events if we're blocking on I/O */
		conn->mux->subscribe(cs, SUB_RETRY_RECV, &si->wait_event);
		si_rx_endp_done(si);
	} else {
		si_rx_endp_more(si);
		ret = 1;
	}
	return ret;
}

/*
 * This function propagates a null read received on a socket-based connection.
 * It updates the stream interface. If the stream interface has SI_FL_NOHALF,
 * the close is also forwarded to the write side as an abort.
 */
static void stream_int_read0(struct stream_interface *si)
{
	struct conn_stream *cs = __objt_cs(si->end);
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	si_rx_shut_blk(si);
	if (ic->flags & CF_SHUTR)
		return;
	ic->flags |= CF_SHUTR;
	ic->rex = TICK_ETERNITY;

	if (!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		return;

	if (oc->flags & CF_SHUTW)
		goto do_close;

	if (si->flags & SI_FL_NOHALF) {
		/* we want to immediately forward this close to the write side */
REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.
In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.
The files stream.{c,h} were added and session.{c,h} removed.
The session will be reintroduced later and a few parts of the stream
will progressively be moved over there. It will more or less contain
only what we need in an embryonic session.
Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.
Once all changes are completed, we should see approximately this :
L7 - http_txn
L6 - stream
L5 - session
L4 - connection | applet
There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.
Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-02 18:22:06 -04:00
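The target layering can be pictured with the illustrative, non-authoritative structure below; it only restates the relationships listed above, and the field names are assumptions.

struct layering_sketch {
	struct http_txn   *txn;  /* L7: at most one per stream */
	struct stream     *strm; /* L6: buffers, logs, etc. */
	struct session    *sess; /* L5: may be shared by several streams */
	struct connection *conn; /* L4: points to its session and stream */
};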
		/* force flag on ssl to keep stream in cache */
		cs_shutw(cs, CS_SHW_SILENT);
		goto do_close;
	}

	/* otherwise that's just a normal read shutdown */
	return;

 do_close:
	/* OK we completely close the socket here just as if we went through si_shut[rw]() */
	cs_close(cs);

	oc->flags &= ~CF_SHUTW_NOW;
	oc->flags |= CF_SHUTW;
	oc->wex = TICK_ETERNITY;

	si_done_get(si);

BUG/MAJOR: stream-int: Don't receive data from mux until SI_ST_EST is reached
This bug is pretty pernicious and have serious consequences : In 2.1, an
infinite loop in process_stream() because the backend stream-interface remains
in the ready state (SI_ST_RDY). In 2.0, a call in loop to process_stream()
because the stream-interface remains blocked in the connect state
(SI_ST_CON). In both cases, it happens after a connection retry attempt. In 1.9,
it seems to not happen. But it may be just by chance or just because it is
harder to get right conditions to trigger the bug. However, reading the code,
the bug seems to exist too.
Here is how the bug happens in 2.1. When we try to establish a new connection to
a server, the corresponding stream-interface is first set to the connect state
(SI_ST_CON). When the underlying connection is known to be connected (the flag
CO_FL_CONNECTED set), the stream-interface is switched to the ready state
(SI_ST_RDY). It is a transient state between the connect state (SI_ST_CON) and
the established state (SI_ST_EST). It must be handled on the next call to
process_stream(), which is responsible to operate the transition. During all
this time, errors can occur. A connection error or a client abort. The transient
state SI_ST_RDY was introduced to let a chance to process_stream() to catch
these errors before considering the connection as fully established.
Unfortunately, if a read0 is caught in states SI_ST_CON or SI_ST_RDY, it is
possible to have a shutdown without transition to SI_ST_DIS (in fact, here,
SI_ST_CON is switched to SI_ST_RDY). This happens if the request was fully
received and analyzed. In this case, the flag SI_FL_NOHALF is set on the backend
stream-interface. If an error is also reported during the connect, the behavior
is undefined because an error is returned to the client and a connection retry
is performed. So on the next connection attempt to the server, if another error
is reported, a client abort is detected. But the shutdown for writes was already
done. So the transition to the state SI_ST_DIS is impossible. We stay in the
state SI_ST_RDY. Because it is a transient state, we loop in process_stream() to
perform the transition.
It is hard to understand how the bug happens reading the code and even harder to
explain. But there is a trivial way to hit the bug by sending h2 requests to a
server only speaking h1. For instance, with the following config :
listen tst
bind *:80
server www 127.0.0.1:8000 proto h2 # in reality, it is a HTTP/1.1 server
It is a configuration error, but it is an easy way to observe the bug. Note it
may happen with a valid configuration.
So, after a careful analysis, it appears that si_cs_recv() should never be
called for a not fully established stream-interface. This way the connection
retries will be performed before reporting an error to the client. Thus, if a
shutdown is performed because a read0 is handled, the stream-interface is
unconditionally set to the transient state SI_ST_DIS.
This patch must be backported to 2.0 and 1.9. However on these versions, this
patch reveals a design flaw about connections and a bad way to perform the
connection retries. We are working on it.
2019-10-25 04:21:01 -04:00
	si->state = SI_ST_DIS;
	si->exp = TICK_ETERNITY;
	return;
}

/* Callback to be used by applet handlers upon completion. It updates the stream
 * (which may or may not take this opportunity to try to forward data), then
 * may re-enable the applet based on the channels and stream interface's final
 * states.
 */
void si_applet_wake_cb(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	/* If the applet wants to write and the channel is closed, it's a
	 * broken pipe and it must be reported.
	 */
	if (!(si->flags & SI_FL_RX_WAIT_EP) && (ic->flags & CF_SHUTR))
		si->flags |= SI_FL_ERR;

	/* automatically mark the applet as having data available if it reported
	 * being blocked by the channel.
	 */
	if (si_rx_blocked(si))
		si_rx_endp_more(si);

	/* update the stream-int, channels, and possibly wake the stream up */
	stream_int_notify(si);
	stream_release_buffers(si_strm(si));

	/* stream_int_notify may have passed through chk_snd and released some
	 * RXBLK flags. Process_stream will consider those flags to wake up the
	 * appctx but in the case the task is not in the runqueue we may have to
	 * wake up the appctx immediately.
	 */
	if ((si_rx_endp_ready(si) && !si_rx_blocked(si)) ||
	    (si_tx_endp_ready(si) && !si_tx_blocked(si)))
		appctx_wakeup(si_appctx(si));
}

/*
 * This function performs a shutdown-read on a stream interface attached to an
 * applet in a connected or init state (it does nothing for other states). It
 * either shuts the read side or marks itself as closed. The buffer flags are
 * updated to reflect the new state. If the stream interface has SI_FL_NOHALF,
 * we also forward the close to the write side. The owner task is woken up if
 * it exists.
 */
static void stream_int_shutr_applet(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	si_rx_shut_blk(si);
	if (ic->flags & CF_SHUTR)
		return;
	ic->flags |= CF_SHUTR;
	ic->rex = TICK_ETERNITY;

	/* Note: on shutr, we don't call the applet */

	if (!si_state_in(si->state, SI_SB_CON|SI_SB_RDY|SI_SB_EST))
		return;

	if (si_oc(si)->flags & CF_SHUTW) {
		si_applet_release(si);
		si->state = SI_ST_DIS;
		si->exp = TICK_ETERNITY;
	}
	else if (si->flags & SI_FL_NOHALF) {
		/* we want to immediately forward this close to the write side */
		return stream_int_shutw_applet(si);
	}
}

/*
 * This function performs a shutdown-write on a stream interface attached to an
 * applet in a connected or init state (it does nothing for other states). It
 * either shuts the write side or marks itself as closed. The buffer flags are
 * updated to reflect the new state. It also closes everything if the SI
 * was marked as being in error state. The owner task is woken up if it exists.
 */
static void stream_int_shutw_applet(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);
	struct channel *oc = si_oc(si);

	oc->flags &= ~CF_SHUTW_NOW;
	if (oc->flags & CF_SHUTW)
		return;
	oc->flags |= CF_SHUTW;
	oc->wex = TICK_ETERNITY;
	si_done_get(si);

BUG/MEDIUM: stream: fix client-fin/server-fin handling
A tcp half connection can cause 100% CPU on expiration.
First reproduced with this haproxy configuration :
global
tune.bufsize 10485760
defaults
timeout server-fin 90s
timeout client-fin 90s
backend node2
mode tcp
timeout server 900s
timeout connect 10s
server def 127.0.0.1:3333
frontend fe_api
mode tcp
timeout client 900s
bind :1990
use_backend node2
I.e. with timeout server-fin shorter than timeout server, the backend server
sends data; this packet is left in haproxy's buffer, then the backend
server sends a FIN packet, which haproxy receives. At this
time the session information is as follows:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=08 age=1s calls=3 rq[f=848000h,i=0,an=00h,rx=14m58s,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=,wx=14m58s,ax=] s0=[7,0h,fd=6,ex=]
s1=[7,18h,fd=7,ex=] exp=14m58s
rp has set the CF_SHUTR state, next, the client sends the fin package,
session information is as follows:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=08 age=38s calls=4 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=1m11s,wx=14m21s,ax=] s0=[7,0h,fd=6,ex=]
s1=[9,10h,fd=7,ex=] exp=1m11s
After waiting 90s, session information is as follows:
0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2
srv=def ts=04 age=4m11s calls=718074391 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=]
rp[f=8004c020h,i=0,an=00h,rx=?,wx=10m49s,ax=] s0=[7,0h,fd=6,ex=]
s1=[9,10h,fd=7,ex=] exp=? run(nice=0)
cpu information:
6899 root 20 0 112224 21408 4260 R 100.0 0.7 3:04.96 haproxy
Buffering is set to ensure that there is data in the haproxy buffer, so haproxy
can receive the FIN packet and set the CF_SHUTR flag. If the CF_SHUTR flag has been
set, the following code does not clear the read timeout, causing 100% CPU:
stream.c:process_stream:
if (unlikely((res->flags & (CF_SHUTR|CF_READ_TIMEOUT)) == CF_READ_TIMEOUT)) {
if (si_b->flags & SI_FL_NOHALF)
si_b->flags |= SI_FL_NOLINGER;
si_shutr(si_b);
}
If the read side has already been closed, setting the read timeout does not make sense.
Yet with or without CF_SHUTR, the read timeout is set:
if (tick_isset(s->be->timeout.serverfin)) {
res->rto = s->be->timeout.serverfin;
res->rex = tick_add(now_ms, res->rto);
}
After discussion on the mailing list, setting half-closed timeouts the
hard way here doesn't make sense. They should be set only at the moment
the shutdown() is performed. It will also solve a special case which was
already reported of some half-closed timeouts not working when the shutw()
is performed directly at the stream-interface layer (no analyser involved).
Since the stream interface layer cannot know the timeout values, we'll have
to store them directly in the stream interface so that they are used upon
shutw(). This patch does this, fixing the problem.
An easier reproducer to validate the fix is to keep the huge buffer and
shorten all timeouts, then call it under tcploop server and client, and
wait 3 seconds to see haproxy run at 100% CPU :
global
tune.bufsize 10485760
listen px
bind :1990
timeout client 90s
timeout server 90s
timeout connect 1s
timeout server-fin 3s
timeout client-fin 3s
server def 127.0.0.1:3333
$ tcploop 3333 L W N20 A P100 F P10000 &
$ tcploop 127.0.0.1:1990 C S10000000 F
2017-03-10 12:41:51 -05:00
	if (tick_isset(si->hcto)) {
		ic->rto = si->hcto;
		ic->rex = tick_add(now_ms, ic->rto);
	}

	/* on shutw we always wake the applet up */
	appctx_wakeup(si_appctx(si));

	switch (si->state) {
	case SI_ST_RDY:
	case SI_ST_EST:
		/* we have to shut before closing, otherwise some short messages
		 * may never leave the system, especially when there are remaining
		 * unread data in the socket input buffer, or when nolinger is set.
		 * However, if SI_FL_NOLINGER is explicitly set, we know there is
		 * no risk so we close both sides immediately.
		 */
		if (!(si->flags & (SI_FL_ERR | SI_FL_NOLINGER)) &&
		    !(ic->flags & (CF_SHUTR|CF_DONT_READ)))
			return;

		/* fall through */
	case SI_ST_CON:
	case SI_ST_CER:
	case SI_ST_QUE:
	case SI_ST_TAR:
		/* Note that none of these states may happen with applets */
		si_applet_release(si);
		si->state = SI_ST_DIS;
		/* fall through */
	default:
		si->flags &= ~SI_FL_NOLINGER;
		si_rx_shut_blk(si);
		ic->flags |= CF_SHUTR;
		ic->rex = TICK_ETERNITY;
		si->exp = TICK_ETERNITY;
	}
}

/* chk_rcv function for applets */
static void stream_int_chk_rcv_applet(struct stream_interface *si)
{
	struct channel *ic = si_ic(si);

	DPRINTF(stderr, "%s: si=%p, si->state=%d ic->flags=%08x oc->flags=%08x\n",
		__FUNCTION__,
		si, si->state, ic->flags, si_oc(si)->flags);

	if (!ic->pipe) {
		/* (re)start reading */
		appctx_wakeup(si_appctx(si));
	}
}

/* chk_snd function for applets */
static void stream_int_chk_snd_applet(struct stream_interface *si)
{
	struct channel *oc = si_oc(si);

	DPRINTF(stderr, "%s: si=%p, si->state=%d ic->flags=%08x oc->flags=%08x\n",
		__FUNCTION__,
		si, si->state, si_ic(si)->flags, oc->flags);

	if (unlikely(si->state != SI_ST_EST || (oc->flags & CF_SHUTW)))
		return;

	/* we only wake the applet up if it was waiting for some data */

	if (!(si->flags & SI_FL_WAIT_DATA))
		return;

	if (!tick_isset(oc->wex))
		oc->wex = tick_add_ifset(now_ms, oc->wto);

	if (!channel_is_empty(oc)) {
		/* (re)start sending */
		appctx_wakeup(si_appctx(si));
	}
}

/*
 * Local variables:
 *  c-indent-level: 8
 *  c-basic-offset: 8
 * End:
 */