This patch introduces the 'http-after-response' rules. These rules are evaluated
at the end of the response analysis, just before data forwarding, on ALL
HTTP responses, both the ones coming from servers and the ones generated by
HAProxy. Thanks to this ruleset, it is now possible, for instance, to add some
headers to the responses generated by the stats applet. The following actions
are supported:
* allow
* add-header
* del-header
* replace-header
* replace-value
* set-header
* set-status
* set-var
* strict-mode
* unset-var
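For instance, a minimal sketch (the frontend name, bind address and header
values below are purely illustrative):

frontend stats
    mode http
    bind :8404
    stats enable
    stats uri /stats
    # these headers are also added to responses generated by the stats applet
    http-after-response set-header X-Frame-Options DENY
    http-after-response del-header Server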
It is now possible to insert any attribute when a cookie is inserted by
HAProxy. Any value may be set; no check is performed except for syntax validity
(CTRL chars and ';' are forbidden). For instance, it may be used to add the
SameSite attribute:
cookie SRV insert attr "SameSite=Strict"
The attr option may be repeated to add several attributes.
This patch should fix issue #361.
It is now possible to import into a proxy, fully or partially, the error files
declared in an http-errors section. It may be done using the "errorfiles"
directive, followed by a name and optionally a list of status codes. If no
status code is specified, all error files of the http-errors section are
imported. Otherwise, only the error files associated with the listed status
codes are imported. For instance:
http-errors my-errors
errorfile 400 ...
errorfile 403 ...
errorfile 404 ...
frontend frt
errorfiles my-errors 403 404 # ==> error 400 not imported
All custom HTTP errors are now stored in a global tree. Proxies use references
to these messages. The key used for errorfile directives is the file name as
specified in the configuration. For errorloc directives, a key is created using
the redirect code and the URL. This means that the same custom error message is
now stored only once: it may be used in several proxies or for several status
codes, but it is only parsed and stored once.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both a regular list and an autolocked list, as
is done for the "list" field in struct connection, we now have to explicitly
cast it to struct mt_list when using the MT_ macros.
A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.
While the latter seems more natural, it is fundamentally incompatible with
many other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. However, reversing the lock order in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.
The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.
Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.
This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.
The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".
This must be backported as far as 1.8.
There is no standard case for HTTP header names because, as stated in
RFC 7230, they are case-insensitive. So applications must handle them in a
case-insensitive manner. But some bogus applications erroneously rely on the
case used by most browsers. This problem becomes critical with HTTP/2
because all header names must be exchanged in lowercase. And HAProxy uses the
same convention. All header names are sent in lowercase to clients and servers,
regardless of the HTTP version.
This design choice is linked to the HTX implementation. So, for previous
versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall
back to the legacy HTTP mode.
Since the legacy HTTP mode was removed, some users reported interoperability
issues because their applications were no longer able to handle HTTP/1 messages
received from HAProxy. So, we've decided to add a way to change the case of some
headers before sending them. It is now possible to define a "mapping" between a
lowercase header name and a version supported by the bogus application. To do
so, you must use the global directives "h1-case-adjust" and
"h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and
"h1-case-adjust-bogus-server" may be used in proxy sections to enable the
conversion. See the configuration manual for more info.
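For instance, an illustrative sketch (the header names, backend and server
address below are made up):

global
    h1-case-adjust content-length Content-Length
    h1-case-adjust x-requested-with X-Requested-With

backend legacy_app
    mode http
    # send header names to this bogus server using the adjusted case
    option h1-case-adjust-bogus-server
    server app1 192.0.2.10:8080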
Of course, our advice is to urgently upgrade these applications for
interoperability concerns and because they may be vulnerable to various types of
content smuggling attacks. But if you are really forced to use an unmaintained
bogus application, you may use these directives, at your own risk.
If it is relevant, this feature may be backported to 2.0.
The keywords req* and rsp* are now unsupported. So the corresponding lists are
now unused. It is safe to remove them from struct proxy.
As a result, the code dealing with these rules in HTTP analyzers was also
removed.
The option 'http-tunnel' is deprecated and it was only used in the legacy HTTP
mode. So this option is now totally ignored and a warning is emitted during
HAProxy startup if it is found in a configuration file.
Add a list of proxies for all the stick-tables (the ->proxies_list struct
stktable member) so as to be able to compute the process bindings of the peers
after having parsed the configuration file.
The proxies are added to the stick-tables they reference when parsing
stick-table lines in proxy sections, when checking the actions in
check_trk_action() and when resolving sample args for stick-tables,
without checking whether they are duplicates. We only check that there is no
loop. Then, after having parsed everything, we add the proxy bindings to the
peers frontend bindings for the stick-tables they reference.
This patch adds support for parsing "table" lines in "peers" sections
to declare stick-tables in such sections. This also prevents the user from having
to declare dummy backend sections with a single stick-table inside.
Even if still supported, this usage will become deprecated.
To do so, the ->table member of the proxy struct, which is a stktable struct, is
replaced by a pointer to a stktable struct allocated at parsing time in
src/cfgparse-listen.c for the dummy stick-table backends and in src/cfgparse.c
for "peers" sections.
This has an impact on the code for the stick-table sample converters and on the
stickiness rules parsers, which first store the name of the dummy before
resolving the rules.
This patch replaces proxy_tbl_by_name() calls with stktable_find_by_name() calls
to look up stick-tables stored in the "stktable_by_name" ebtree at parsing time.
There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c.
At several places in the code we relied on the fact that the ->size member of a
stick-table was equal to zero to consider the stick-table present but not
configured; this no longer makes sense as the ->table member of struct proxy is
from now on a pointer. These tests are replaced by a test on the ->table value
itself.
In "peers" section we do not have to temporary store the name of the section the
stick-table are attached to because this name is obviously already known just after
having entered this "peers" section.
About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by
a pointer to a stktable struct.
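To illustrate the new "table" directive in "peers" sections, here is a minimal
sketch (peer names, addresses and table parameters below are illustrative):

peers mypeers
    peer haproxy1 192.0.2.1:1024
    peer haproxy2 192.0.2.2:1024
    table src_track type ip size 1m expire 10m store conn_cur

backend app
    mode http
    # the table is referenced as <peers_name>/<table_name>
    http-request track-sc0 src table mypeers/src_track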
Add a way to retry requests if we got a junk response from the server, i.e.
an incomplete response, or something that is not valid HTTP.
To do so, one can use the new "junk-response" keyword for retry-on.
Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to
replay requests for which we sent early data that got rejected by the
server.
If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set
on the server line even if the client didn't send early data.
When running in HTX mode, if we sent the request, but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we just failed to connect to
the server.
To do so, add a new backend keyword, "retry-on".
It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".
The default is "conn-failure".
The algo-specific settings move from the proxy to the LB algo this way :
- uri_whole => arg_opt1
- uri_len_limit => arg_opt2
- uri_dirs_depth1 => arg_opt3
These ones used to rely on separate variables called hh_name/hh_len
but they are exclusive with the former. Let's use the same variable
which becomes a generic argument name and length for the LB algorithm.
In session, don't keep an infinite number of connections that can idle.
Add a new frontend parameter, "max-session-srv-conns" to set a max number,
with a default value of 5.
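An illustrative sketch (the frontend name, bind address and value below are
made up):

frontend www
    bind :8080
    max-session-srv-conns 10
    default_backend app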
First of all, a dedicated error snapshot, h1_snapshot, has been added. It
contains more or less the same info as http_snapshot, but adapted for H1
messages. Then, the function h1_capture_bad_message() has been added to capture
bad H1 messages. And finally, the function h1_show_error_snapshot() is used to
dump these errors. Only errors during headers or data parsing are captured.
This was the largest function of the whole file, taking roughly a second
to build alone. Let's move it to a distinct file along with a few
dependencies. Doing so saved about 2 seconds on the total build time.
This patch implements analysers for parsing the CLI and extra features
for the master's CLI.
For each command (sent alone, or separated by ; or \n) the request
analyser will determine to which server it should send the request.
The 'mode cli' proxy is able to parse a prefix for each command, which is
used to select the appropriate server. The prefix starts with @ and is
followed by "master", the PID preceded by '!', or the relative PID (e.g.
@master, @1, @!1234). The servers are no longer round-robined.
The command is sent with a SHUTW, which forces the server to close the
connection after sending its response. However, the proxy allows a
keep-alive connection on the client side and does not close it.
The response analyser does not do much: it only re-initializes the
connection when it receives a close from the server, and forwards the
response. It does not analyze the response data.
The only guarantee of the end of the response is the close from the
server; we can't rely on the double \n since it is not sent by every
command.
This could be reimplemented later as a filter.
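For instance, assuming the master CLI socket was bound with
"-S /var/run/haproxy-master.sock" (the socket path and PIDs below are purely
illustrative), commands can be routed like this:

$ echo "@1 show info" | socat stdio /var/run/haproxy-master.sock      # worker with relative PID 1
$ echo "@!1234 show stat" | socat stdio /var/run/haproxy-master.sock  # worker with PID 1234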
This option makes a proxy use only HTX-compatible muxes instead of the
HTTP-compatible ones for HTTP modes. It must be set on both ends; this
is checked at parsing time.
Since keep-alive mode is the default mode, the passive close has disappeared,
and in the code, httpclose and forceclose options are handled the same way:
connections with the client and the server are closed as soon as the request and
the response are received, and a missing "Connection: close" header is added in
each direction.
So to make things clearer, forceclose is now an alias for httpclose. And
httpclose is explicitly an active close. So the old passive close does not exist
anymore. Internally, the flag PR_O_HTTP_PCL has been removed and PR_O_HTTP_FCL
has been replaced by PR_O_HTTP_CLO. In HTTP analyzers, the checks done to find
the right mode to use, depending on proxy options and the "Connection: " header
value, have been simplified.
This should only be a cleanup and no changes are expected.
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.
This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.
Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
Instead of having a separate area for the captured data, we now have a
contiguous block made of the descriptor and the data. At the moment, since
the area is dynamically allocated, we can adjust its size to what is
needed, but the idea is to quickly switch to a pool and an LRU list.
Now upon error we dynamically allocate the snapshot instead of overwriting
it. This way there is no more memory wasted in the proxy to hold the two
error snapshot descriptors. Also, an appreciable side effect of this is that
the proxy's lock is only taken during the pointer swap, and no longer while
copying the buffer's contents. This saves 480 bytes of memory per proxy.
The HTTP dumps are now configurable in the code : "show errors" now
calls a protocol-specific function to emit the decoded output. For
now only HTTP is implemented.
The idea will be to make the error snapshot feature accessible to other
protocols than just HTTP. This patch only introduces an "http_snapshot"
structure and renames a few fields to make things more explicit. The
HTTP part was installed inside a union so that we can easily add more
protocols in the future.
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
We store the queue index in the stream and check it on dequeueing to
figure out how many entries were processed in between. This way we'll be
able to count the elements that may later be added before ours.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
Now, each proxy contains a lock that must be used when necessary to protect
it. Moreover, all proxy's counters are now updated using atomic operations.
Email alerts rely on checks to send emails. The link between a mailers section
and a proxy was resolved during configuration parsing, but initialization was
done when the first alert was triggered. This implied memory allocations and
task creations. With this patch, everything is now initialized during the
configuration parsing. So when an alert is triggered, only the memory required
by this alert is dynamically allocated.
Moreover, alerts processing had a flaw. The task handler used to process alerts
to be sent to the same mailer, process_email_alert, was designed to give
control back to the scheduler when an alert was sent. So there was a delay
between the sending of 2 consecutive alerts (the min of
"proxy->timeout.connect" and "mailer->timeout.mail"). To fix this problem, we
now try to process as many queued alerts as possible when the task is woken up.
This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the threads support integration. But quickly,
I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes into a single patch. Here are all the changes made on the DNS.
First, the DNS initialization has been refactored. The DNS configuration parsing
remains untouched, in cfgparse.c. But all checks have been moved in a post-check
callback. In the function dns_finalize_config, for each resolvers section, the
nameservers configuration is tested and the task used to manage DNS resolutions
is created. The links between the backend's servers and the resolvers are also
created at this step. Here, no connections are kept alive, so there is no need
anymore to reopen them after the HAProxy fork. Connections used to send DNS
queries will be opened on demand.
Then, the way DNS requesters are linked to a DNS resolution has been
reworked. The resolution used by a requester is now referenced into the
dns_requester structure and the resolution pointers in server and dns_srvrq
structures have been removed. The wait and curr lists of requesters for a DNS
resolution have been replaced by a single list. And finally, the way a requester
is removed from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.
The srv_set_fqdn function has been simplified. Now, there is only one way to set
the server's FQDN, whether it is done via the CLI or when a SRV record is
resolved.
The static DNS resolutions pool has been replaced by a dynamic pool. This part
has been modified by Baptiste Assmann.
The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected. Especially
hold.valid. The default frequency to wake up a resolvers section is now
configurable using the "timeout resolve" parameter.
Now, as documented, as long as invalid responses are received, we really wait
for all name servers' responses before retrying.
As far as possible, resources allocated during DNS configuration parsing are
released when HAProxy is shut down.
Beside all these changes, the code has been cleaned to ease code review and the
doc has been updated.
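For instance, an illustrative resolvers section using these timeouts (section
name, addresses and server line below are made up):

resolvers mydns
    nameserver dns1 192.0.2.53:53
    nameserver dns2 192.0.2.54:53
    timeout resolve 1s
    timeout retry   1s
    hold valid      10s

backend app
    server s1 app.example.com:80 check resolvers mydns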
Make it so that, for each server, instead of specifying a hostname, one can use
a SRV label.
When doing so, HAProxy will first resolve the SRV label, then use the
resulting hostnames, as well as port and weight (priority is ignored right
now), for each server using the SRV label.
It is resolved periodically, and any server disappearing from the SRV records
will be removed, and any server appearing will be added, assuming there are
free servers in HAProxy.
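For instance, an illustrative sketch (the SRV label, domain and resolvers
section below are made up):

resolvers mydns
    nameserver dns1 192.0.2.53:53

backend app
    # both server slots are filled from the _http._tcp.example.com SRV records
    server srv1 _http._tcp.example.com resolvers mydns check
    server srv2 _http._tcp.example.com resolvers mydns check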
This adds a new "dynamic" keyword for the cookie option. If set, a cookie
will be generated for each server (assuming one isn't already provided on
the "server" line), from the IP of the server, the TCP port, and a secret
key provided. To provide the secret key, a new keyword has been added,
"dynamic-cookie-key", for backends.
Example :
backend bk_web
balance roundrobin
dynamic-cookie-key "bla"
cookie WEBSRV insert dynamic
server s1 127.0.0.1:80 check
server s2 192.168.56.1:80 check
This is a first step to be able to dynamically add and remove servers,
without modifying the configuration file, and still have all the load
balancers redirect the traffic to the right server.
Provide a way to generate session cookies, based on the IP address of the
server, the TCP port, and a secret key provided.
The previous version used an O(number of proxies)^2 algo to get the sum of
the number of maxconns of frontends which reference a backend at least once.
This new version adds the frontend's maxconn number to the backend's
struct proxy member 'tot_fe_maxconn' when the backend name is resolved
for switching rules or the default_backend statement. At the end, the final
backend's fullconn is computed with a single loop over all proxies, O(n).
The load of a configuration using a large number of backends (10 thousand)
without a configured fullconn was reduced from several minutes to a few seconds.
We used to have 3 types of counters with a huge overlap :
- listener counters : stats collected for each bind line
- proxy counters : union of the frontend and backend counters
- server counters : stats collected per server
It happens that quite a good part was common between listeners and
proxies due to the frontend counters being updated at the two locations,
and that similarly the server and proxy counters were overlapping and
being updated together.
This patch cleans this up to propose only two types of counters :
- fe_counters: used by frontends and listeners, related to
incoming connections activity
- be_counters: used by backends and servers, related to outgoing
connections activity
This made it possible to remove some nonsensical counters from both parts. For
frontends, the following entries were removed:
cum_lbconn, last_sess, nbpend_max, failed_conns, failed_resp,
retries, redispatches, q_time, c_time, d_time, t_time
For backends, this one was removed: intercepted_req.
While doing this it was discovered that we used to incorrectly report
intercepted_req for backends in the HTML stats, which was always zero
since it's never updated.
Also it revealed a few inconsistencies (which were not fixed as they
are harmless). For example, backends count connections (cum_conn)
instead of sessions while servers count sessions and not connections.
Over the long term, some extra cleanups may be performed by having
some counters update functions touching both the server and backend
at the same time, as well as both the frontend and listener, to
ensure that all sides have all their stats properly filled. The stats
dump will also be able to factor the dump functions by counter types.
A new "option spop-check" statement has been added to enable server health
checks based on SPOP HELLO handshake. SPOP is the protocol used by SPOE filters
to talk to servers.
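For instance, an illustrative sketch (the backend, agent name and address below
are made up):

backend spoe-agents
    mode tcp
    option spop-check
    server agent1 192.0.2.20:12345 check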