Having the check state partially stored in the server doesn't help.
Some functions such as srv_getinter() rely on the server being checked
to decide what check frequency to use, instead of relying on the check
being configured. So let's get rid of SRV_CHECKED and SRV_AGENT_CHECKED
and only use the check's states instead.
After the move of checks from servers to autonomous checks, we need a
massive cleanup and reordering as it's becoming increasingly difficult
to find the definitions of types and enums.
Nothing was changed, blocks were just moved.
Server tracking uses the same "tracknext" list for servers tracking
another one and for the servers being tracked. This caused an issue
which was fixed by commit f39c71c ([CRITICAL] fix server state tracking:
it was O(n!) instead of O(n)), consisting in ensuring that a server is
being checked before walking down the list, so that we don't propagate
the up/down information via servers being part of the track chain.
But the root cause is the fact that all servers share the same list.
The correct solution consists in having a list head for the tracked
servers and a list of next tracking servers. This simplifies the
propagation logic, especially for the case where status changes might
be passed to individual servers via the CLI.
Baptiste Assmann reported some confusing printf() output of the server
port since it's declared signed. Better turn it to unsigned.
There's no need to backport this, it's only used in 16-bit places.
This is a generic health check which can be used to match a
banner or send a request and analyse a server response.
It works in a send/expect ways and many exchange can be done between
HAProxy and a server to decide the server status, making HAProxy able to
speak the server's protocol.
It can send arbitrary regular or binary strings and match content as a
regular or binary string or a regex.
Signed-off-by: Baptiste Assmann <bedis9@gmail.com>
Add a DRAIN sub-state for a server which
will be shown on the stats page instead of UP if
its effective weight is zero.
Also, log if a server enters or leaves the DRAIN state
as the result of an agent check.
Signed-off-by: Simon Horman <horms@verge.net.au>
The syntax of this new commands are:
enable agent <backend>/<server>
disable agent <backend>/<server>
These commands allow temporarily stopping and subsequently
re-starting an auxiliary agent check. The effect of this is as follows:
New checks are only initialised when the agent is in the enabled. Thus,
disable agent will prevent any new agent checks from begin initiated until
the agent re-enabled using enable agent.
When an agent is disabled the processing of an auxiliary agent check that
was initiated while the agent was set as enabled is as follows: All
results that would alter the weight, specifically "drain" or a weight
returned by the agent, are ignored. The processing of agent check is
otherwise unchanged.
The motivation for this feature is to allow the weight changing effects
of the agent checks to be paused to allow the weight of a server to be
configured using set weight without being overridden by the agent.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is achieved by moving rise and fall from struct server to struct check.
After this move the behaviour of the primary check, server->check is
unchanged. However, the secondary agent check, server->agent now has
independent rise and fall values each of which are set to 1.
The result is that receiving "fail", "stopped" or "down" just once from the
agent will mark the server as down. And receiving a weight just once will
allow the server to be marked up if its primary check is in good health.
This opens up the scope to allow the rise and fall values of the agent
check to be configurable, however this has not been implemented at this
stage.
Signed-off-by: Simon Horman <horms@verge.net.au>
Allow an auxiliary agent check to be run independently of the
regular a regular health check. This is enabled by the agent-check
server setting.
The agent-port, which specifies the TCP port to use for the agent's
connections, is required.
The agent-inter, which specifies the interval between agent checks and
timeout of agent checks, is optional. If not set the value for regular
checks is used.
e.g.
server web1_1 127.0.0.1:80 check agent-port 10000
If either the health or agent check determines that a server is down
then it is marked as being down, otherwise it is marked as being up.
An agent health check performed by opening a TCP socket and reading an
ASCII string. The string should have one of the following forms:
* An ASCII representation of an positive integer percentage.
e.g. "75%"
Values in this format will set the weight proportional to the initial
weight of a server as configured when haproxy starts.
* The string "drain".
This will cause the weight of a server to be set to 0, and thus it
will not accept any new connections other than those that are
accepted via persistence.
* The string "down", optionally followed by a description string.
Mark the server as down and log the description string as the reason.
* The string "stopped", optionally followed by a description string.
This currently has the same behaviour as "down".
* The string "fail", optionally followed by a description string.
This currently has the same behaviour as "down".
Signed-off-by: Simon Horman <horms@verge.net.au>
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
Add state to struct check. This is currently used to store one bit,
CHK_RUNNING, which is set if a check is running and clear otherwise.
This bit was previously SRV_CHK_RUNNING of the state element of struct
server.
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Paramatise the following functions over the check of a server
* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled
Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
By paramatising these functions they may act on each of the checks
without further significant modification.
Explanation of the SSP_O_HCHK portion of this change:
* Prior to this patch SSP_O_HCHK serves a single purpose which
is to tell server_status_printf() weather it should print
the details of the check of a server or not.
With the paramatisation that this patch adds there are two cases.
1) Printing the details of the check in which case a
valid check parameter is needed.
2) Not printing the details of the check in which case
the contents check parameter are unused.
In case 1) we could pass SSP_O_HCHK and a valid check and;
In case 2) we could pass !SSP_O_HCHK and any value for check
including NULL.
If NULL is used for case 2) then SSP_O_HCHK becomes supurfulous
and as NULL is used for case 2) SSP_O_HCHK has been removed.
Signed-off-by: Simon Horman <horms@verge.net.au>
Move result element from struct server to struct check
This allows check results to be independent of the check's server.
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
Signed-off-by: Simon Horman <horms@verge.net.au>
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
The split has been made by:
* Moving elements of struct server's check element that will
be shared by both checks into a new check_common element
of struct server.
* Moving the remaining elements to a new struct check and
making struct server's check element a struct check.
* Adding a server element to struct check, a back-pointer
to the server element it is a member of.
- At this time the server could be obtained using
container_of, however, this will not be so easy
once a second struct check element is added to struct server
to accommodate an agent health check.
Signed-off-by: Simon Horman <horms@verge.net.au>
verifyhost allows you to specify a hostname that the remote server's
SSL certificate must match. Connections that don't match will be
closed with an SSL error.
The max weight of server is 256 now, but SRV_UWGHT_MAX is still 255. As a result,
FWRR will not work well when server's weight is 256. The description is as below:
There are some macros related to server's weight in include/types/server.h:
#define SRV_UWGHT_RANGE 256
#define SRV_UWGHT_MAX (SRV_UWGHT_RANGE - 1)
#define SRV_EWGHT_MAX (SRV_UWGHT_MAX * BE_WEIGHT_SCALE)
Since weight of server can be reach to 256 and BE_WEIGHT_SCALE equals to 16,
the max eweight of server should be 256*16 = 4096, it will exceed SRV_EWGHT_MAX
which equals to SRV_UWGHT_MAX*BE_WEIGHT_SCALE = 255*16 = 4080. When a server
with weight 256 is insterted into FWRR tree during initialization, the key value
of this server should be SRV_EWGHT_MAX - s->eweight = 4080 - 4096 = -16 which
is closed to UINT_MAX in unsigned type, so the server with highest weight will
be not elected as the first server to process request.
In addition, it is a better choice to compare with SRV_UWGHT_MAX than a magic
number 256 while doing check for the weight. The max number of servers for
round-robin algorithm is also updated.
Signed-off-by: Godbach <nylzhaowei@gmail.com>
Both servers and proxies share a common set of parameters for outgoing
connections, and since they're not stored in a similar structure, a lot
of code is duplicated in the connection setup, which is one sensible
area.
Let's first define a common struct for these settings and make use of it.
Next patches will de-duplicate code.
This change also fixes a build breakage that happens when USE_LINUX_TPROXY
is not set but USE_CTTPROXY is set, which seem to be very unlikely
considering that the issue was introduced almost 2 years ago an never
reported.
The porting of checks to using connections was totally bogus. Some checks
were considered successful as soon as the connection was established,
regardless of any response. Some errors would be triggered upon recv
if polling was enabled for send or if the send channel was shut down.
Now the behaviour is much better. It would be cleaner to perform the
fd_delete() in wake_srv_chk() and to process failures and timeouts
separately, but this is already a good start.
Some server check flag names were not properly choosen and cause
analysis trouble, especially the CHK_RUNNING one which does not
mean that a check is running but that the server is running...
Here's the rename :
CHK_RUNNING -> CHK_PASSED
CHK_ERROR -> CHK_FAILED
Instead of storing a couple of (int, ptr) in the struct connection
and the struct session, we use a different method : we only store a
pointer to an integer which is stored inside the target object and
which contains a unique type identifier. That way, the pointer allows
us to retrieve the object type (by dereferencing it) and the object's
address (by computing the displacement in the target structure). The
NULL pointer always corresponds to OBJ_TYPE_NONE.
This reduces the size of the connection and session structs. It also
simplifies target assignment and compare.
In order to improve the generated code, we try to put the obj_type
element at the beginning of all the structs (listener, server, proxy,
si_applet), so that the original and target pointers are always equal.
A lot of code was touched by massive replaces, but the changes are not
that important.
It now becomes possible to verify the server's certificate using the "verify"
directive. This one only supports "none" and "required", as it does not make
much sense to also support "optional" here.
Just like with the "bind" lines, we'll switch the "server" line
parsing to keyword registration. The code is essentially the same
as for bind keywords, with minor changes such as support for the
default-server keywords and support for variable argument count.
When health checks are configured on a server which has the send-proxy
directive and no "port" nor "addr" settings, the health check connections
will automatically use the PROXY protocol. If "port" or "addr" are set,
the "check-send-proxy" directive may be used to force the protocol.
Since it's possible for the checks to use a different protocol or transport layer
than the prod traffic, we need to have them referenced in the server. The
SSL checks are not enabled yet, but the transport layers are completely used.
Till now the request was made in the trash and sent to the network at
once, and the response was read into a preallocated char[]. Now we
allocate a full buffer for both the request and the response, and make
use of it.
Some of the operations will probably be replaced later with buffer macros
but the point was to ensure we could migrate to use the data layers soon.
One nice improvement caused by this change is that requests are now formed
at the beginning of the check and may safely be sent in multiple chunks if
needed.
The health checks in the servers are becoming a real mess, move them
into their own subsection. We'll soon need to have a struct buffer to
replace the char * as well as check-specific protocol and transport
layers.
While working on the changes required to make the health checks use the
new connections, it started to become obvious that some naming was not
logical at all in the connections. Specifically, it is not logical to
call the "data layer" the layer which is in charge for all the handshake
and which does not yet provide a data layer once established until a
session has allocated all the required buffers.
In fact, it's more a transport layer, which makes much more sense. The
transport layer offers a medium on which data can transit, and it offers
the functions to move these data when the upper layer requests this. And
it is the upper layer which iterates over the transport layer's functions
to move data which should be called the data layer.
The use case where it's obvious is with embryonic sessions : an incoming
SSL connection is accepted. Only the connection is allocated, not the
buffers nor stream interface, etc... The connection handles the SSL
handshake by itself. Once this handshake is complete, we can't use the
data functions because the buffers and stream interface are not there
yet. Hence we have to first call a specific function to complete the
session initialization, after which we'll be able to use the data
functions. This clearly proves that SSL here is only a transport layer
and that the stream interface constitutes the data layer.
A similar change will be performed to rename app_cb => data, but the
two could not be in the same commit for obvious reasons.
This is because "notlsv1" used to disable TLSv1.0 only and had no effect
on v1.1/v1.2. so better have an option for each version. This applies both
to "bind" and "server" statements.
This is aimed at disabling SSLv3 and TLSv1 respectively. SSLv2 is always
disabled. This can be used in some situations where one version looks more
suitable than the other.
This option currently takes no option and simply turns SSL on for all
connections going to the server. It is likely that more options will
be needed in the future.
Some parts of the sock_ops structure were only used by the stream
interface and have been moved into si_ops. Some of them were callbacks
to the stream interface from the connection and have been moved into
app_cp as they're the application seen from the connection (later,
health-checks will need to use them). The rest has moved to data_ops.
Normally at this point the connection could live without knowing about
stream interfaces at all.
This implements the feature discussed in the earlier thread of killing
connections on backup servers when a non-backup server comes back up. For
example, you can use this to route to a mysql master & slave and ensure
clients don't stay on the slave after the master goes from down->up. I've done
some minimal testing and it seems to work.
[WT: added session flag & doc, moved the killing after logging the server UP,
and ensured that the new server is really usable]
Instead of hard-coding sock_raw in connect_server(), we set this socket
operation at config parsing time. Right now, only servers and peers have
it. Proxies are still hard-coded as sock_raw. This will be needed for
future work on SSL which requires a different socket layer.
The principle behind this load balancing algorithm was first imagined
and modeled by Steen Larsen then iteratively refined through several
work sessions until it would totally address its original goal.
The purpose of this algorithm is to always use the smallest number of
servers so that extra servers can be powered off during non-intensive
hours. Additional tools may be used to do that work, possibly by
locally monitoring the servers' activity.
The first server with available connection slots receives the connection.
The servers are choosen from the lowest numeric identifier to the highest
(see server parameter "id"), which defaults to the server's position in
the farm. Once a server reaches its maxconn value, the next server is used.
It does not make sense to use this algorithm without setting maxconn. Note
that it can however make sense to use minconn so that servers are not used
at full load before starting new servers, and so that introduction of new
servers requires a progressively increasing load (the number of servers
would more or less follow the square root of the load until maxconn is
reached). This algorithm ignores the server weight, and is more beneficial
to long sessions such as RDP or IMAP than HTTP, though it can be useful
there too.
Ludovic Levesque reported and diagnosed an annoying bug. When a server is
configured to track another one and has a slowstart interval set, it's
assigned a minimal weight when the tracked server goes back up but keeps
this weight forever.
This is because the throttling during the warmup phase is only computed
in the health checking function.
After several attempts to resolve the issue, the only real solution is to
split the check processing task in two tasks, one for the checks and one
for the warmup. Each server with a slowstart setting has a warmum task
which is responsible for updating the server's weight after a down to up
transition. The task does not run in othe situations.
In the end, the fix is neither complex nor long and should be backported
to 1.4 since the issue was detected there first.
When reading the code, the "tracked" member of a server makes one
think the server is tracked while it's the opposite, it's a pointer
to the server being tracked. This is particularly true in constructs
such as :
if (srv->tracked) {
Since it's the second time I get caught misunderstanding it, let's
rename it to "track" to avoid the confusion.
Struct sockaddr_storage is huge (128 bytes) and severely impacts the
cache. It also displaces other struct members, causing them to have
larger relative offsets. By moving these few occurrences to the end
of the structs which host them, we can reduce the code size by no less
than 2 kB !
Never add connections allocated to this sever to a stick-table.
This may be used in conjunction with backup to ensure that
stick-table persistence is disabled for backup servers.
This adds the "on-marked-down shutdown-sessions" statement on "server" lines,
which causes all sessions established on a server to be killed at once when
the server goes down. The task's priority is reniced to the highest value
(1024) so that servers holding many tasks don't cause a massive slowdown due
to the wakeup storm.