In stream_set_backend(), if we have a TCP stream, and we want to upgrade it
to H2 instead of attempting ot reuse the stream, just destroy the
conn_stream, make sure we don't log anything about the stream, and pretend
we failed setting the backend, so that the stream will get destroyed.
New streams will then be created by the mux, as if the connection just
happened.
This fixes a crash when upgrading from TCP to H2, as the H2 mux totally
ignored the conn_stream provided by the upgrade, as reported in github
issue #196.
This should be backported to 2.0.
The purpose will be to store the target address there and not to
allocate a connection just for this anymore. For now it's only placed
in the struct, a few fields were moved to plug some holes, and the
entry is freed on release (never allocated yet for now). This must
have no impact. Note that in order to fit, the store_count which
previously was an int was turned into a short, which is way more
than enough given that the hard-coded limit is 8.
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.
In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.
This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.
A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).
Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.
This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.
Handle the CLI level in the master CLI. In order to do this, the master
CLI stores the level in the stream. Each command are prefixed by a
"user" or "operator" command before they are forwarded to the target
CLI.
The level can be configured in the haproxy program arguments with the
level keyword: -S /tmp/sock,level,admin -S /tmp/sock2,level,user.
These flags haven't been used for a while. SF_TUNNEL was reintroduced
by commit d62b98c6e ("MINOR: stream: don't set backend's nor response
analysers on SF_TUNNEL") to handle the two-level streams needed to
deal with the first model for H2, and was not removed after this model
was abandonned. SF_INITIALIZED was only set. SF_CONN_TAR was never
referenced at all.
The CLI proxy was not handling payload. To do that, we needed to keep a
connection active on a server and to transfer each new line over that
connection until we receive a empty line.
The CLI proxy handles the payload in the same way that the CLI do it.
Examples:
$ echo -e "@1;add map #-1 <<\n$(cat data)\n" | socat /tmp/master-socket -
$ socat /tmp/master-socket readline
prompt
master> @1
25130> add map #-1 <<
+ test test
+ test2 test2
+ test3 test3
+
25130>
This patch implements analysers for parsing the CLI and extra features
for the master's CLI.
For each command (sent alone, or separated by ; or \n) the request
analyser will determine to which server it should send the request.
The 'mode cli' proxy is able to parse a prefix for each command which is
used to select the apropriate server. The prefix start by @ and is
followed by "master", the PID preceded by ! or the relative PID. (e.g.
@master, @1, @!1234). The servers are not round-robined anymore.
The command is sent with a SHUTW which force the server to close the
connection after sending its response. However the proxy allows a
keepalive connection on the client side and does not close.
The response analyser does not do much stuff, it only reinits the
connection when it received a close from the server, and forward the
response. It does not analyze the response data.
The only guarantee of the end of the response is the close of the
server, we can't rely on the double \n since it's not send by every
command.
This could be reimplemented later as a filter.
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet, which is the purpose of the next commit, but all the logic to
set and clear the values is there.
The current name is misleading as it implies a queue size, but the value
instead indicates a position in the queue.
The value is only the queue size at the exact moment the element is enqueued.
Soon we will gain the ability to insert anywhere into the queue, upon which
clarity of the name is more important.
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.
So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.
However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.
This patch must be backported in 1.8.
Commit bcb86ab ("MINOR: session: add a streams field to the session
struct") added this list of streams that is not needed anymore. Let's
get rid of it now.
Now each stream is added to the session's list of streams, so that it
will be possible to know all the streams belonging to a session, and
to know if any stream is still attached to a sessoin.
When an entity tries to get a buffer, if it cannot be allocted, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and wait for an available
buffer.
Historically, the <buffer_wq> list was logically attached to streams because it
were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine to have more other
entities waiting for a buffer. So it make sense to have a generic list.
Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it will wait. But we add the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer is available, we
wake up the stream and not the waiting applet. So, it is possible to have
waiting applets and never awakened.
So, now, <buffer_wq> is independant from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible to define the
callback used to awaken it.
In addition, applets will still request an input buffer when they become
active. But they will not be sleeped anymore if no buffer are available. So this
is the responsibility to the applet I/O handler to check if this buffer is
allocated or not. This way, an applet can decide if this buffer is required or
not and can do additional processing if not.
[wt: backport to 1.7 and 1.6]
A stream can be awakened for different reasons. During its processing, it can be
early stopped if no buffer is available. In this situation, the reason why the
stream was awakened is lost, because we rely on the task state, which is reset
after each processing loop.
In many cases, that's not a big deal. But it can be useful to accumulate the
task states if the stream processing is interrupted, especially if some filters
need to be called.
To be clearer, here is an simple example:
1) A stream is awakened with the reason TASK_WOKEN_MSG.
2) Because no buffer is available, the processing is interrupted, the stream
is back to sleep. And the task state is reset.
3) Some buffers become available, so the stream is awakened with the reason
TASK_WOKEN_RES. At this step, the previous reason (TASK_WOKEN_MSG) is lost.
Now, the task states are saved for a stream and reset only when the stream
processing is not interrupted. The correspoing bitfield represents the pending
events for a stream. And we use this one instead of the task state during the
stream processing.
Note that TASK_WOKEN_TIMER and TASK_WOKEN_RES are always removed because these
events are always handled during the stream processing.
[wt: backport to 1.7 and 1.6]
HAProxy used to deduce port used for health checks when parsing configuration
at startup time.
Because of this way of working, it makes it complicated to change the port at
run time.
The current patch changes this behavior and makes HAProxy to choose the
port used for health checking when preparing the check task itself.
A new type of error is introduced and reported when no port can be found.
There won't be any impact on performance, since the process to find out the
port value is made of a few 'if' statements.
This patch also introduces a new check state CHK_ST_PORT_MISS: this flag is
used to report an error in the case when HAProxy needs to establish a TCP
connection to a server, to perform a health check but no TCP ports can be
found for it.
And last, it also introduces a new stream termination condition:
SF_ERR_CHK_PORT. Purpose of this flag is to report an error in the event when
HAProxy has to run a health check but no port can be found to perform it.
Tq is the time between the instant the connection is accepted and a
complete valid request is received. This time includes the handshake
(SSL / Proxy-Protocol), the idle when the browser does preconnect and
the request reception.
This patch decomposes %Tq in 3 measurements names %Th, %Ti, and %TR
which returns respectively the handshake time, the idle time and the
duration of valid request reception. It also adds %Ta which reports
the request's active time, which is the total time without %Th nor %Ti.
It replaces %Tt as the total time, reporting accurate measurements for
HTTP persistent connections.
%Th is avalaible for TCP and HTTP sessions, %Ti, %TR and %Ta are only
avalaible for HTTP connections.
In addition to this, we have new timestamps %tr, %trg and %trl, which
log the date of start of receipt of the request, respectively in the
default format, in GMT time and in local time (by analogy with %t, %T
and %Tl). All of them are obviously only available for HTTP. These values
are more relevant as they more accurately represent the request date
without being skewed by a browser's preconnect nor a keep-alive idle
time.
The HTTP log format and the CLF log format have been modified to
use %tr, %TR, and %Ta respectively instead of %t, %Tq and %Tt. This
way the default log formats now produce the expected output for users
who don't want to manually fiddle with the log-format directive.
Example with the following log-format :
log-format "%ci:%cp [%tr] %ft %b/%s h=%Th/i=%Ti/R=%TR/w=%Tw/c=%Tc/r=%Tr/a=%Ta/t=%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
The request was sent by hand using "openssl s_client -connect" :
Aug 23 14:43:20 haproxy[25446]: 127.0.0.1:45636 [23/Aug/2016:14:43:20.221] test~ test/test h=6/i=2375/R=261/w=0/c=1/r=0/a=262/t=2643 200 145 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
=> 6 ms of SSL handshake, 2375 waiting before sending the first char (in
fact the time to type the first line), 261 ms before the end of the request,
no time spent in queue, 1 ms spend connecting to the server, immediate
response, total active time for this request = 262ms. Total time from accept
to close : 2643 ms.
The timing now decomposes like this :
first request 2nd request
|<-------------------------------->|<-------------- ...
t tr t tr ...
---|----|----|----|----|----|----|----|----|--
: Th Ti TR Tw Tc Tr Td : Ti ...
:<---- Tq ---->: :
:<-------------- Tt -------------->:
:<--------- Ta --------->:
From the stream point of view, this new structure is opaque. it hides filters
implementation details. So, impact for future optimizations will be reduced
(well, we hope so...).
Some small improvements has been made in filters.c to avoid useless checks.
When no filter is attached to the stream, the CPU footprint due to the calls to
filters_* functions is huge, especially for chunk-encoded messages. Using macros
to check if we have some filters or not is a great improvement.
Furthermore, instead of checking the filter list emptiness, we introduce a flag
to know if filters are attached or not to a stream.
HTTP compression has been rewritten to use the filter API. This is more a PoC
than other thing for now. It allocates memory to work. So, if only for that, it
should be rewritten.
In the mean time, the implementation has been refactored to allow its use with
other filters. However, there are limitations that should be respected:
- No filter placed after the compression one is allowed to change input data
(in 'http_data' callback).
- No filter placed before the compression one is allowed to change forwarded
data (in 'http_forward_data' callback).
For now, these limitations are informal, so you should be careful when you use
several filters.
About the configuration, 'compression' keywords are still supported and must be
used to configure the HTTP compression behavior. In absence of a 'filter' line
for the compression filter, it is added in the filter chain when the first
compression' line is parsed. This is an easy way to do when you do not use other
filters. But another filter exists, an error is reported so that the user must
explicitly declare the filter.
For example:
listen tst
...
compression algo gzip
compression offload
...
filter flt_1
filter compression
filter flt_2
...
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easely" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy behavior in a programmatic way.
To do so, many entry points has been added in code to let filters to hook up to
different steps of the processing. A filter must define a flt_ops sutrctures
(see include/types/filters.h for details). This structure contains all available
callbacks that a filter can define:
struct flt_ops {
/*
* Callbacks to manage the filter lifecycle
*/
int (*init) (struct proxy *p);
void (*deinit)(struct proxy *p);
int (*check) (struct proxy *p);
/*
* Stream callbacks
*/
void (*stream_start) (struct stream *s);
void (*stream_accept) (struct stream *s);
void (*session_establish)(struct stream *s);
void (*stream_stop) (struct stream *s);
/*
* HTTP callbacks
*/
int (*http_start) (struct stream *s, struct http_msg *msg);
int (*http_start_body) (struct stream *s, struct http_msg *msg);
int (*http_start_chunk) (struct stream *s, struct http_msg *msg);
int (*http_data) (struct stream *s, struct http_msg *msg);
int (*http_last_chunk) (struct stream *s, struct http_msg *msg);
int (*http_end_chunk) (struct stream *s, struct http_msg *msg);
int (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
int (*http_end_body) (struct stream *s, struct http_msg *msg);
void (*http_end) (struct stream *s, struct http_msg *msg);
void (*http_reset) (struct stream *s, struct http_msg *msg);
int (*http_pre_process) (struct stream *s, struct http_msg *msg);
int (*http_post_process) (struct stream *s, struct http_msg *msg);
void (*http_reply) (struct stream *s, short status,
const struct chunk *msg);
};
To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:
frontend test
...
filter <FILTER-NAME> [OPTIONS...]
The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill flt_ops and filter_conf field in the proxy's
structure. An exemple will be provided later to make it perfectly clear.
For now, filters cannot be used in backend section. But this is only a matter of
time. Documentation will also be added later. This is the first commit of a long
list about filters.
It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (define in
include/types/filters.h). Again, this will be replaced later by a list of
filters.
The filter API has been highly refactored. Main changes are:
* Now, HA supports an infinite number of filters per proxy. To do so, filters
are stored in list.
* Because filters are stored in list, filters state has been moved from the
channel structure to the filter structure. This is cleaner because there is no
more info about filters in channel structure.
* It is possible to defined filters on backends only. For such filters,
stream_start/stream_stop callbacks are not called. Of course, it is possible
to mix frontend and backend filters.
* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
are called for all kind of streams. In addition, 2 new callbacks were added to
filter data exchanged through a TCP stream:
- tcp_data: it is called when new data are available or when old unprocessed
data are still waiting.
- tcp_forward_data: it is called when some data can be consumed.
* New callbacks attached to channel were added:
- channel_start_analyze: it is called when a filter is ready to process data
exchanged through a channel. 2 new analyzers (a frontend and a backend)
are attached to channels to call this callback. For a frontend filter, it
is called before any other analyzer. For a backend filter, it is called
when a backend is attached to a stream. So some processing cannot be
filtered in that case.
- channel_analyze: it is called before each analyzer attached to a channel,
expects analyzers responsible for data sending.
- channel_end_analyze: it is called when all other analyzers have finished
their processing. A new analyzers is attached to channels to call this
callback. For a TCP stream, this is always the last one called. For a HTTP
one, the callback is called when a request/response ends, so it is called
one time for each request/response.
* 'session_established' callback has been removed. Everything that is done in
this callback can be handled by 'channel_start_analyze' on the response
channel.
* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
'channel_analyze'.
* 'http_start' callback has been replaced by 'http_headers'. This new one is
called just before headers sending and parsing of the body.
* 'http_end' callback has been replaced by 'channel_end_analyze'.
* It is possible to set a forwarder for TCP channels. It was already possible to
do it for HTTP ones.
* Forwarders can partially consumed forwardable data. For this reason a new
HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.
Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. A any time,
we have the warranty that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. 2 macros has been added to retrieve these offets: FLT_NXT and FLT_FWD.
In addition, 2 functions has been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effet on all
previous filters that must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alter forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter responsability, but with some limitation. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.
Because filters can be used on backends, when we the backend is set for a
stream, we add filters defined for this backend in the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Else same filters are added a second time leading to undefined
behavior.
The HTTP compression code had to be moved.
So it simplifies http_response_forward_body function. To do so, the way the data
are forwarded has changed. Now, a filter (and only one) can forward data. In a
commit to come, this limitation will be removed to let all filters take part to
data forwarding. There are 2 new functions that filters should use to deal with
this feature:
* flt_set_http_data_forwarder: This function sets the filter (using its id)
that will forward data for the specified HTTP message. It is possible if it
was not already set by another filter _AND_ if no data was yet forwarded
(msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.
* flt_http_data_forwarder: This function returns the filter id that will
forward data for the specified HTTP message. If there is no forwarder set, it
returns -1.
When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
The commit "MEDIUM: vars: move the session variables to the session, not the stream" (ebcd4844e82a4198ea5d98fe491a46267da1d1ec")
moves the variables from the stream to the session. It forgot to remove
the stream definition of the "vars_sess".
This patch adds support of variables during the processing of each stream. The
variables scope can be set as 'session', 'transaction', 'request' or 'response'.
The variable type is the type returned by the assignment expression. The type
can change while the processing.
The allocated memory can be controlled for each scope and each request, and for
the global process.
Commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom
actions") introduced the ability to interrupt and restart processing in
the middle of a TCP/HTTP ruleset. But it doesn't do it in a consistent
way : it checks current_rule_list, immediately dereferences current_rule,
which is only set in certain cases and never cleared. So that broke the
tcp-request content rules when the processing was interrupted due to
missing data, because current_rule was not yet set (segfault) or could
have been inherited from another ruleset if it was used in a backend
(random behaviour).
The proper way to do it is to always set current_rule before dereferencing
it. But we don't want to set it for all rules because we don't want any
action to provide a checkpointing mechanism. So current_rule is set to NULL
before entering the loop, and only used if not NULL and if current_rule_list
matches the current list. This way they both serve as a guard for the other
one. This fix also makes the current rule point to the rule instead of its
list element, as it's much easier to manipulate.
No backport is needed, this is 1.6-specific.
The stick counters in the session will be used for everything not related
to contents, hence the connections / concurrent sessions / etc. They will
be usable by "tcp-request connection" rules even without a stream. For now
they're just allocated and initialized.
Now that we have sess->origin to carry that information along, we don't
need to put that into strm->target anymore, so we remove one dependence
on the stream in embryonic connections.
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.
A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn for being valid, but
all places seem to have been visited.
All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
The header captures are now general purpose captures since tcp rules
can use them to capture various contents. That removes a dependency
on http_txn that appeared in some sample fetch functions and in the
order by which captures and http_txn were allocated.
Interestingly the reset of the header captures were done at too many
places as http_init_txn() used to do it while it was done previously
in every call place.
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
There is now a pointer to the session in the stream, which is NULL
for now. The session pool is created as well. Some parts will move
from the stream to the session now.
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.
In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.
The files stream.{c,h} were added and session.{c,h} removed.
The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.
Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.
Once all changes are completed, we should see approximately this :
L7 - http_txn
L6 - stream
L5 - session
L4 - connection | applet
There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.
Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.