This member will be used later when frontends are created on the
fly by some tasks. It will also be usable later if we need to
support multiple config instances for example.
Now we're able to reject connections very early, so we need to use a
different counter for the connections that are received and the ones
that are accepted and converted into sessions, so that the rate limits
can still apply to the accepted ones. The session rate must still be
used to compute the rate limit, so that we can reject undesired traffic
without affecting the rate.
This is used to disable persistence depending on some conditions (for
example using an ACL matching static files or a specific User-Agent).
You can see it as a complement to "force-persist".
In the configuration file, the force-persist/ignore-persist declaration
order define the rules priority.
Used with the "appsesion" keyword, it can also help reducing memory usage,
as the session won't be hashed the persistence is ignored.
Some servers do not completely conform with RFC2616 requirements for
keep-alive when they receive a request with "Connection: close". More
specifically, they don't bother using chunked encoding, so the client
never knows whether the response is complete or not. One immediately
visible effect is that haproxy cannot maintain client connections alive.
The second issue is that truncated responses may be cached on clients
in case of network error or timeout.
Óscar Frías Barranco reported this issue on Tomcat 6.0.20, and
Patrik Nilsson with Jetty 6.1.21.
Cyril Bonté proposed this smart idea of pretending we run keep-alive
with the server and closing it at the last moment as is already done
with option forceclose. The advantage is that we only change one
emitted header but not the overall behaviour.
Since some servers such as nginx are able to close the connection
very quickly and save network packets when they're aware of the
close negociation in advance, we don't enable this behaviour by
default.
"option http-pretend-keepalive" will have to be used for that, in
conjunction with "option http-server-close".
Using get_ip_from_hdr2() we can look for occurrence #X or #-X and
extract the IP it contains. This is typically designed for use with
the X-Forwarded-For header.
Using "usesrc hdr_ip(name,occ)", it becomes possible to use the IP address
found in <name>, and possibly specify occurrence number <occ>, as the
source to connect to a server. This is possible both in a server and in
a backend's source statement. This is typically used to use the source
IP previously set by a upstream proxy.
We'll need another flag in the 'options' member close to PR_O_TPXY_*,
and all are used, so let's move this easy one to options2 (which are
already used for SQL checks).
Now a server can check the contents of the header X-Haproxy-Server-State
to know how haproxy sees it. The same values as those reported in the stats
are provided :
- up/down status + check counts
- throttle
- weight vs backend weight
- active sessions vs backend sessions
- queue length
- haproxy node name
Despite what is explicitly stated in HTTP specifications,
browsers still use the undocumented Proxy-Connection header
instead of the Connection header when they connect through
a proxy. As such, proxies generally implement support for
this stupid header name, breaking the standards and making
it harder to support keep-alive between clients and proxies.
Thus, we add a new "option http-use-proxy-header" to tell
haproxy that if it sees requests which look like proxy
requests, it should use the Proxy-Connection header instead
of the Connection header.
This is used to force access to down servers for some requests. This
is useful when validating that a change on a server correctly works
before enabling the server again.
This patch implements default-server support allowing to change
default server options. It can be used in [defaults] or [backend]/[listen]
sections. Currently the following options are supported:
- error-limit
- fall
- inter
- fastinter
- downinter
- maxconn
- maxqueue
- minconn
- on-error
- port
- rise
- slowstart
- weight
This option enables HTTP keep-alive on the client side and close mode
on the server side. This offers the best latency on the slow client
side, and still saves as many resources as possible on the server side
by actively closing connections. Pipelining is supported on both requests
and responses, though there is currently no reason to get pipelined
responses.
To sum up :
- len : it's now the max number of characters for the value, preventing
garbaged results.
- a new option "prefix" is added, this allows to use dynamic cookie
names (e.g. ASPSESSIONIDXXX).
Previously in the thread, I wanted to use the value found with
"capture cookie" but when i started to update the documentation, I
found this solution quite weird. I've made a small rework to not
depend on "capture cookie".
- There's the posssiblity to define the URL parser mode (path parameters
or query string).
All files referencing the previous ebtree code were changed to point
to the new one in the ebtree directory. A makefile variable (EBTREE_DIR)
is also available to use files from another directory.
The ability to build the libebtree library temporarily remains disabled
because it can have an impact on some existing toolchains and does not
appear worth it in the medium term if we add support for multi-criteria
stickiness for instance.
This patch has 2 goals :
1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistance is broken : haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.
This also happens when you restart haproxy : the internal hash becomes
empty and all sessions loose their persistance (load balancing the
requests on all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.
The patch modifies the code to make haproxy also learn the persistance
from the client : if no session is sent from the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associated the server that gave the response.
As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes :
appsession <cookie> len <length> timeout <holdtime> [request-learn]
This helps haproxy repair the persistance (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).
2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour :
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
parameter with another unknown value :
curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001
(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.
- As a result, session.jsp sends a new session ID also stored in the
internal hash, with a server associated.
=> For 1 request, haproxy has stored 3 entries, with only 1 which will be usable
The patch modifies the behaviour to store only 1 entry (maximum).
There are a few remaining max values that need to move to counters.
Also, the counters are more often used than some config information,
so get them closer to the other useful struct members for better cache
efficiency.
Until now it was required that every custom ID was above 1000 in order to
avoid conflicts. Now we have the list of all assigned IDs and can automatically
pick the first unused one. This means that it is perfectly possible to interleave
automatic IDs with persistent IDs and the parser will automatically allocate
unused values starting with 1.
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generate by local and remote users or between different types of remote
clients (peerings, domestic, foreign).
Currently no "Session rate" is supported, but adding it should be possible
if we found it useful.
By default, when data is sent over a socket, both the write timeout and the
read timeout for that socket are refreshed, because we consider that there is
activity on that socket, and we have no other means of guessing if we should
receive data or not.
While this default behaviour is desirable for almost all applications, there
exists a situation where it is desirable to disable it, and only refresh the
read timeout if there are incoming data. This happens on sessions with large
timeouts and low amounts of exchanged data such as telnet session. If the
server suddenly disappears, the output data accumulates in the system's
socket buffers, both timeouts are correctly refreshed, and there is no way
to know the server does not receive them, so we don't timeout. However, when
the underlying protocol always echoes sent data, it would be enough by itself
to detect the issue using the read timeout. Note that this problem does not
happen with more verbose protocols because data won't accumulate long in the
socket buffers.
When this option is set on the frontend, it will disable read timeout updates
on data sent to the client. There probably is little use of this case. When
the option is set on the backend, it will disable read timeout updates on
data sent to the server. Doing so will typically break large HTTP posts from
slow lines, so use it with caution.
The lbprm structure has moved to backend.h, where it should be, and
all algo-specific types and declarations have moved to their specific
files. The proxy struct is now much more readable.
This patch implements "description" (proxy and global) and "node" (global)
options, removes "node-name" and adds "show-node" & "show-desc" options
for "stats". It also changes the way the header lines (with proxy name) and
the statistics are displayed, so stats no longer look so clumsy with very
long names.
Instead of "node-name" it is possible to use show-node/show-desc with
an optional parameter that overrides a default node/description.
backend cust-0045
# report specific values for this customer
stats show-node Europe
stats show-desc Master node for Europe, Asia, Africa
This patch adds health logging so it possible to check what
was happening before a crash. Failed healt checks are logged if
server is UP and succeeded healt checks if server is DOWN,
so the amount of additional information is limited.
I also reworked the code a little:
- check_status_description[] and check_status_info[] is now
joined into check_statuses[]
- set_server_check_status updates not only s->check_status and
s->check_duration but also s->result making the code simpler
Changes in v3:
- for now calculate and use local versions of health/rise/fall/state,
it is a slow path, no harm should be done. One day we may centralize
processing of the checks and remove the duplicated code.
- also log checks that are restoring current state
- use "conditionally succeeded" for 404 with disable-on-404
sess_establish() used to resort to protocol-specific guesses
in order to set rep->analysers. This is no longer needed as it
gets set from the frontend and the backend as a copy of what
was defined in the configuration.
Analyser bitmaps are now stored in the frontend and backend, and
combined at configuration time. That way, set_session_backend()
does not need to perform any protocol-specific combinations.
This Linux-specific option was never really used in production and
has since been superseded by new splicing options brought by recent
Linux kernels.
It caused several particular cases in the code because the kernel
would take care of the session without haproxy being able to do
anything on it, which became hard to handle in the new architecture.
Let's simply get rid of it now that there is a replacement available.
The new statement "persist rdp-cookie" enables RDP cookie
persistence. The RDP cookie is then extracted from the RDP
protocol, and compared against available servers. If a server
matches the RDP cookie, then it gets the connection.
This patch propagates the ACL conditions' "requires" bitfield
to the proxies. This makes it possible to know exactly what a
proxy might have to support for any request, which helps knowing
whether we have to allocate some space for certain types of
structures or not (eg: the hdr_idx struct).
The concept might be extended to a lot more types of information,
such as detecting whether we need to allocate some space for some
request ACLs which need a result in the response, etc...
This new option enables combining of request buffer data with
the initial ACK of an outgoing TCP connection. Doing so saves
one packet per connection which is quite noticeable on workloads
mostly consisting in small objects. The option is not enabled by
default.
This option disables TCP quick ack upon accept. It is also
automatically enabled in HTTP mode, unless the option is
explicitly disabled with "no option tcp-smart-accept".
This saves one packet per connection which can bring reasonable
amounts of bandwidth for servers processing small requests.
Sometimes we would want to implement implicit default options,
but for this we need to be able to disable them, which requires
to keep track of "no option" settings. With this change, an option
explicitly disabled in a defaults section will still be seen as
explicitly disabled. There should be no regression as nothing makes
use of this yet.
Some users want to keep the max sessions/s seen on servers, frontends
and backends for capacity planning. It's easy to grab it while the
session count is updated, so let's keep it.
Some people are using haproxy in a shared environment where the
system logger by default sends alert and emerg messages to all
consoles, which happens when all servers go down on a backend for
instance. These people can not always change the system configuration
and would like to limit the outgoing messages level in order not to
disturb the local users.
The addition of an optional 4th field on the "log" line permits
exactly this. The minimal log level ensures that all outgoing logs
will have at least this level. So the logs are not filtered out,
just set to this level.
There is a patch made by me that allow for balancing on any http header
field.
[WT:
made minor changes:
- turned 'balance header name' into 'balance hdr(name)' to match more
closely the ACL syntax for easier future convergence
- renamed the proxy structure fields header_* => hh_*
- made it possible to use the domain name reduction to any header, not
only "host" since it makes sense to do it with other ones.
Otherwise patch looks good.
/WT]
Some big traffic sites have trouble dealing with logs and tend to
disable them. Here are two new options to help cope with massive
logs.
- dontlog-normal only disables logging for 100% successful
connections, other ones will still be logged
- log-separate-errors will cause non-100% successful connections
to be logged at level "err" instead of level "info" so that a
properly configured syslog daemon can send them to a different
file for longer conservation.
I have attached a patch which will add on every http request a new
header 'X-Original-To'. If you have HAProxy running in transparent mode
with a big number of SQUID servers behind it, it is very nice to have
the original destination ip as a common header to make decisions based
on it.
The whole thing is configurable with a new option 'originalto'. I have
updated the sourcecode as well as the documentation. The 'haproxy-en.txt'
and 'haproxy-fr.txt' files are untouched, due to lack of my french
language knowledge. ;)
Also the patch adds this header for IPv4 only. I haven't any IPv6 test
environment running here and don't know if getsockopt() with SO_ORIGINAL_DST
will work on IPv6. If someone knows it and wants to test it I can modify
the diff. Feel free to ask me questions or things which should be changed. :)
--Maik
The byte counters have long been 64-bit to avoid overflows. But with
several sites nowadays, we see session counters wrap around every 10-days
or so. So it was the moment to switch counters to 64-bit, including
error and warning counters which can theorically rise as fast as session
counters even if in practice there is very low risk.
The performance impact should not be noticeable since those counters are
only updated once per session. The stats output have been carefully checked
for proper types on both 32- and 64-bit platforms.
Sometimes it is required to let invalid requests pass because
applications sometimes take time to be fixed and other servers
do not care. Thus we provide two new options :
option accept-invalid-http-request (for the frontend)
option accept-invalid-http-response (for the backend)
When those options are set, invalid requests or responses do
not cause a 403/502 error to be generated.
The new "rate-limit sessions" statement sets a limit on the number of
new connections per second on the frontend. As it is extremely accurate
(about 0.1%), it is efficient at limiting resource abuse or DoS.
With this change, all frontends, backends, and servers maintain a session
counter and a timer to compute a session rate over the last second. This
value will be very useful because it varies instantly and can be used to
check thresholds. This value is also reported in the stats in a new "rate"
column.