haproxy

mirror of https://github.com/haproxy/haproxy.git synced 2026-02-03 20:39:41 -05:00

Author	SHA1	Message	Date
Krzysztof Piotr Oledzki	052d4fd07d	[CLEANUP] Move counters to dedicated structures Move counters from "struct proxy" and "struct server" to "struct pxcounters" and "struct svcounters". This patch should make no functional change.	2009-10-04 18:32:39 +02:00
Krzysztof Piotr Oledzki	0960541e49	[MEDIUM] Collect & show information about last health check, v3 Collect information about last health check result, including L7 code if possible (for example http or smtp return code) and time took to finish last check. Health check info is provided on both stats pages (html & csv) and logged when a server is marked UP or DOWN. Currently active check are marked with an asterisk, but only in html mode. Currently there are 14 status codes: UNK -> unknown INI -> initializing SOCKERR -> socket error L4OK -> check passed on layer 4, no upper layers testing enabled L4TOUT -> layer 1-4 timeout L4CON -> layer 1-4 connection problem, for example "Connection refused" (tcp rst) or "No route to host" (icmp) L6OK -> check passed on layer 6 L6TOUT -> layer 6 (SSL) timeout L6RSP -> layer 6 invalid response - protocol error L7OK -> check passed on layer 7 L7OKC -> check conditionally passed on layer 7, for example 404 with disable-on-404 L7TOUT -> layer 7 (HTTP/SMTP) timeout L7RSP -> layer 7 invalid response - protocol error L7STS -> layer 7 response error, for example HTTP 5xx	2009-09-23 23:15:36 +02:00
Willy Tarreau	c6f4ce8fc4	[MEDIUM] add support for binding to source port ranges during connect Some users are already hitting the 64k source port limit when connecting to servers. The system usually maintains a list of unused source ports, regardless of the source IP they're bound to. So in order to go beyond the 64k concurrent connections, we have to manage the source ip:port lists ourselves. The solution consists in assigning a source port range to each server and use a free port in that range when connecting to that server, either for a proxied connection or for a health check. The port must then be put back into the server's range when the connection is closed. This mechanism is used only when a port range is specified on a server. It makes it possible to reach 64k connections per server, possibly all from the same IP address. Right now it should be more than enough even for huge deployments.	2009-06-10 12:23:32 +02:00
Willy Tarreau	13a34bd110	[MINOR] compute the max of sessions/s on fe/be/srv Some users want to keep the max sessions/s seen on servers, frontends and backends for capacity planning. It's easy to grab it while the session count is updated, so let's keep it.	2009-05-10 18:52:49 +02:00
Willy Tarreau	3b88d441e9	[MINOR] switch all stat counters to 64-bit The byte counters have long been 64-bit to avoid overflows. But with several sites nowadays, we see session counters wrap around every 10-days or so. So it was the moment to switch counters to 64-bit, including error and warning counters which can theorically rise as fast as session counters even if in practice there is very low risk. The performance impact should not be noticeable since those counters are only updated once per session. The stats output have been carefully checked for proper types on both 32- and 64-bit platforms.	2009-04-11 20:44:08 +02:00
Willy Tarreau	7f062c4193	[MEDIUM] measure and report session rate on frontend, backends and servers With this change, all frontends, backends, and servers maintain a session counter and a timer to compute a session rate over the last second. This value will be very useful because it varies instantly and can be used to check thresholds. This value is also reported in the stats in a new "rate" column.	2009-03-05 18:43:00 +01:00
Willy Tarreau	c76721da57	[MEDIUM] add support for source interface binding at the server level Add support for "interface <name>" after the "source" statement on the server line.	2009-02-04 20:20:58 +01:00
Willy Tarreau	7c669d7e0f	[BUG] fix the dequeuing logic to ensure that all requests get served The dequeuing logic was completely wrong. First, a task was assigned to all servers to process the queue, but this task was never scheduled and was only woken up on session free. Second, there was no reservation of server entries when a task was assigned a server. This means that as long as the task was not connected to the server, its presence was not accounted for. This was causing trouble when detecting whether or not a server had reached maxconn. Third, during a redispatch, a session could lose its place at the server's and get blocked because another session at the same moment would have stolen the entry. Fourth, the redispatch option did not work when maxqueue was reached for a server, and it was not possible to do so without indefinitely hanging a session. The root cause of all those problems was the lack of pre-reservation of connections at the server's, and the lack of tracking of servers during a redispatch. Everything relied on combinations of flags which could appear similarly in quite distinct situations. This patch is a major rework but there was no other solution, as the internal logic was deeply flawed. The resulting code is cleaner, more understandable, uses less magics and is overall more robust. As an added bonus, "option redispatch" now works when maxqueue has been reached on a server.	2008-06-20 15:08:06 +02:00
Krzysztof Piotr Oledzki	a643baf091	[MEDIUM] Fix memory freeing at exit New functions implemented: - deinit_pollers: called at the end of deinit()) - prune_acl: called via list_for_each_entry_safe Add missing pool_destroy2 calls: - p->hdr_idx_pool - pool2_tree64 Implement all task stopping: - health-check: needs new "struct task" in the struct server - queue processing: queue_mgt - appsess_refresh: appsession_refresh before (idle system): ==6079== LEAK SUMMARY: ==6079== definitely lost: 1,112 bytes in 75 blocks. ==6079== indirectly lost: 53,356 bytes in 2,090 blocks. ==6079== possibly lost: 52 bytes in 1 blocks. ==6079== still reachable: 150,996 bytes in 504 blocks. ==6079== suppressed: 0 bytes in 0 blocks. after (idle system): ==6945== LEAK SUMMARY: ==6945== definitely lost: 7,644 bytes in 137 blocks. ==6945== indirectly lost: 9,913 bytes in 587 blocks. ==6945== possibly lost: 0 bytes in 0 blocks. ==6945== still reachable: 0 bytes in 0 blocks. ==6945== suppressed: 0 bytes in 0 blocks. before (running system for ~2m): ==9343== LEAK SUMMARY: ==9343== definitely lost: 1,112 bytes in 75 blocks. ==9343== indirectly lost: 54,199 bytes in 2,122 blocks. ==9343== possibly lost: 52 bytes in 1 blocks. ==9343== still reachable: 151,128 bytes in 509 blocks. ==9343== suppressed: 0 bytes in 0 blocks. after (running system for ~2m): ==11616== LEAK SUMMARY: ==11616== definitely lost: 7,644 bytes in 137 blocks. ==11616== indirectly lost: 9,981 bytes in 591 blocks. ==11616== possibly lost: 0 bytes in 0 blocks. ==11616== still reachable: 4 bytes in 1 blocks. ==11616== suppressed: 0 bytes in 0 blocks. Still not perfect but significant improvement.	2008-05-30 07:07:19 +02:00
Krzysztof Piotr Oledzki	c8b16fc948	[MEDIUM] Implement "track [<backend>/]<server>" This patch implements ability to set the current state of one server by tracking another one. It: - adds two variables: tracknext, tracked to struct server - implements findserver(), similar to findproxy() - adds "track" keyword accepting both "proxy/server" and "server" (assuming current proxy) - verifies if both checks and tracking is not enabled at the same time - changes set_server_down() to notify tracking server - creates set_server_up(), set_server_disabled(), set_server_enabled() by moving the code from process_chk() and adding notifications - changes stats to show a name of tracked server instead of Chk/Dwn/Dwntime(html) or by adding new variable (csv) Changes from the previuos version: - it is possibile to track independently of the declaration order - one extra comma bug is fixed - new condition to check if there is no disable-on-404 inconsistency	2008-02-27 10:39:53 +01:00
Willy Tarreau	21d2af3e9f	Revert "[BUILD] backend.c and checks.c did not build without tproxy !" This reverts commit `3c3c0122f8`. This commit was buggy as it also removed previous tproxy changes !	2008-02-14 20:25:24 +01:00
Willy Tarreau	3c3c0122f8	[BUILD] backend.c and checks.c did not build without tproxy ! missing #ifdefs.	2008-02-13 22:22:56 +01:00
Willy Tarreau	7a58a72e85	[MINOR] add configuration support for "redir" server keyword The servers now support the "redir" keyword, making it possible to return a 302 with the specified prefix in front of the request instead of connecting to them. This is generally useful for multi-site load balancing but may also serve in order to achieve very high traffic rate. The keyword has only been added to the config parser and to structures, it's not used yet.	2008-02-13 00:55:49 +01:00
Krzysztof Piotr Oledzki	5259dfedd1	[MEDIUM]: rework checks handling This patch adds two new variables: fastinter and downinter. When server state is: - non-transitionally UP -> inter (no change) - transitionally UP (going down), unchecked or transitionally DOWN (going up) -> fastinter - down -> downinter It allows to set something like: server sr6 127.0.51.61:80 cookie s6 check inter 10000 downinter 20000 fastinter 500 fall 3 weight 40 In the above example haproxy uses 10000ms between checks but as soon as one check fails fastinter (500ms) is used. If server is down downinter (20000) is used or fastinter (500ms) if one check pass. Fastinter is also used when haproxy starts. New "timeout.check" variable was added, if set haproxy uses it as an additional read timeout, but only after a connection has been already established. I was thinking about using "timeout.server" here but most people set this with an addition reserve but still want checks to kick out laggy servers. Please also note that in most cases check request is much simpler and faster to handle than normal requests so this timeout should be smaller. I also changed the timeout used for check connections establishing. Changes from the previous version: - use tv_isset() to check if the timeout is set, - use min("timeout connect", "inter") but only if "timeout check" is set as this min alone may be to short for full (connect + read) check, - debug code (fprintf) commented/removed - documentation Compile tested only (sorry!) as I'm currently traveling but changes are rather small and trivial.	2008-01-22 11:29:06 +01:00
Willy Tarreau	4864c35209	[BUG] build failed on CONFIG_HAP_LINUX_TPROXY without CONFIG_HAP_CTTPROXY changed #ifdef	2008-01-14 16:36:15 +01:00
Willy Tarreau	c297b52df5	[BUG] fix overlapping server flags Server flags SRV_GOINGDOWN, SRV_WARMINGUP were overlapping SRV_TPROXY_*.	2008-01-13 18:12:24 +01:00
Krzysztof Piotr Oledzki	25b501a6b1	[MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup This patch extends a little previously added functionality to also count retries and redispatches for servers. Now it is possible to know which server causes redispatches as it is not always the same that takes most retries. While working with the code I found that redistribute_pending() does not increment srv->redispatches && be->redispatches. I don't know how to test it but I think the fix is correct. If not I can withdraw it. I also extended logs to show how many retries were done and if redispatching was necessary ('+'). I'm using an additional session flag SN_REDISP to match redispatched connections. I had to rearrange all defines in session.h to make more room for it. The documentation about logs was also fixed a little (sorry, english only), as current version uses totally different format. BTW: examples are still outdated, maybe next time... Finally, I changed %d -> %u for retries/redispatches as those variables are declared as unsigned.	2008-01-06 16:43:05 +01:00
Willy Tarreau	ddbb82ff47	[STATS] report the number of times each server was selected One user reported that an indicator was missing in the statistics: the number of times each server was selected by load balancing. It is in fact the total number of sessions assigned to a server by the load balancing algorithm. It should directly reflect the weight for "fair" algorithms such as round-robin, since it will not account for persistant connections. It should help a lot tuning each server's weight depending on the load it receives.	2007-12-05 10:34:49 +01:00
Willy Tarreau	b698f0f4a2	[CLEANUP] fwrr: ensure that we never overflow in placements Now we can compute the max place depending on the number of servers, maximum weight and weight scale. The formula has been stored as a comment so that it's easy to choose between smooth weight ramp up and high number of servers. The default scale has been set to 16, which permits 4000 servers with a granularity of 6% in the worst case (weight=1).	2007-12-02 11:01:23 +01:00
Willy Tarreau	9909fc13f1	[MEDIUM] implement the slowstart parameter for servers The new 'slowstart' parameter for a server accepts a value in milliseconds which indicates after how long a server which has just come back up will run at full speed. The speed grows linearly from 0 to 100% during this time. The limitation applies to two parameters : - maxconn: the number of connections accepted by the server will grow from 1 to 100% of the usual dynamic limit defined by (minconn,maxconn,fullconn). - weight: when the backend uses a dynamic weighted algorithm, the weight grows linearly from 1 to 100%. In this case, the weight is updated at every health-check. For this reason, it is important that the 'inter' parameter is smaller than the 'slowstart', in order to maximize the number of steps. The slowstart never applies when haproxy starts, otherwise it would cause trouble to running servers. It only applies when a server has been previously seen as failed.	2007-11-30 17:42:05 +01:00
Willy Tarreau	48494c0c5c	[MEDIUM] implement "http-check disable-on-404" for graceful shutdown When an HTTP server returns "404 not found", it indicates that at least part of it is still running. For this reason, it can be convenient for application administrators to be able to consider code 404 as valid, but for a server which does not want to participate to load balancing anymore. This is useful to seamlessly exclude a server from a farm without acting on the load balancer. For instance, let's consider that haproxy checks for the "/alive" file. To enable load balancing on a server, the admin would simply do : # touch /var/www/alive And to disable the server, he would simply do : # rm /var/www/alive Another immediate gain from doing this is that it is now possible to send NOTICE messages instead of ALERT messages when a server is first disable, then goes down. This provides a graceful shutdown method. To enable this behaviour, specify "http-check disable-on-404" in the backend.	2007-11-30 10:41:39 +01:00
Willy Tarreau	c7dd71ae5b	[MEDIUM] change server check result to a bit field A server check currently returns either -1 or 1. This is not very convenient to enhance the health-checks system. Let's use flags instead.	2007-11-30 08:33:21 +01:00
Willy Tarreau	b625a085d8	[MAJOR] implement the Fast Weighted Round Robin (FWRR) algo This round robin algorithm was written from trees, so that we do not have to recompute any table when changing server weights. This solution allows on-the-fly weight adjustments with immediate effect on the load distribution. There is still a limitation due to 32-bit computations, to about 2000 servers at full scale (weight 255), or more servers with lower weights. Basically, sum(srv.weight)*4096 must be below 2^31. Test configurations and an example program used to develop the tree will be added next. Many changes have been brought to the weights computations and variables in order to accomodate for the possiblity of a server to be running but disabled from load balancing due to a null weight.	2007-11-28 14:23:17 +01:00
Willy Tarreau	dcd4771b3d	[MINOR] stats: report numerical process ID, proxy ID and server ID It is very convenient for SNMP monitoring to have unique process ID, proxy ID and server ID. Those have been added to the CSV outputs. The numbers start at 1. 0 is reserved. For servers, 0 means that the reported name is not a server name but half a proxy (FRONTEND/BACKEND). A remaining hidden "-" in the CSV output has been eliminated too.	2007-11-04 23:35:08 +01:00
Elijah Epifanov	acafc5f88c	[MEDIUM] add support for "maxqueue" to limit server queue overload This patch adds the "maxqueue" parameter to the server. This allows new sessions to be immediately rebalanced when the server's queue is filled. It's useful when session stickiness is just a performance boost (even a huge one) but not a requirement. This should only be used if session affinity isn't a hard functional requirement but provides performance boost by keeping server-local caches hot and compact). Absence of 'maxqueue' option means unlimited queue. When queue gets filled up to 'maxqueue' client session is moved from server-local queue to a global one.	2007-10-25 20:15:38 +02:00
Krzysztof Oledzki	85130941e7	[MEDIUM] stats: report server and backend cumulated downtime Hello, This patch implements new statistics for SLA calculation by adding new field 'Dwntime' with total down time since restart (both HTTP/CSV) and extending status field (HTTP) or inserting a new one (CSV) with time showing how long each server/backend is in a current state. Additionaly, down transations are also calculated and displayed for backends, so it is possible to know how many times selected backend was down, generating "No server is available to handle this request." error. New information are presentetd in two different ways: - for HTTP: a "human redable form", one of "100000d 23h", "23h 59m" or "59m 59s" - for CSV: seconds I believe that seconds resolution is enough. As there are more columns in the status page I decided to shrink some names to make more space: - Weight -> Wght - Check -> Chk - Down -> Dwn Making described changes I also made some improvements and fixed some small bugs: - don't increment s->health above 's->rise + s->fall - 1'. Previously it was incremented an then (re)set to 's->rise + s->fall - 1'. - do not set server down if it is down already - do not set server up if it is up already - fix colspan in multiple places (mostly introduced by my previous patch) - add missing "status" header to CSV - fix order of retries/redispatches in server (CSV) - s/Tthen/Then/ - s/server/backend/ in DATA_ST_PX_BE (dumpstats.c) Changes from previous version: - deal with negative time intervales - don't relay on s->state (SRV_RUNNING) - little reworked human_time + compacted format (no spaces). If needed it can be used in the future for other purposes by optionally making "cnt" as an argument - leave set_server_down mostly unchanged - only little reworked "process_chk: 9" - additional fields in CSV are appended to the rigth - fix "SEC" macro - named arguments (human_time, be_downtime, srv_downtime) Hope it is OK. If there are only cosmetic changes needed please fill free to correct it, however if there are some bigger changes required I would like to discuss it first or at last to know what exactly was changed especially since I already put this patch into my production server. :) Thank you, Best regards, Krzysztof Oledzki	2007-10-22 21:36:23 +02:00
Krzysztof Oledzki	1cf36ba3ae	[MEDIUM] stats: count server retries and redispatches It is important to know how your installation performs. Haproxy masks connection errors, which is extremely good for a client but it is bad for an administrator (except people believing that "ignorance is a bless"). Attached patch adds retries and redispatches counters, so now haproxy: 1. For server: - counts retried connections (masked or not) 2. For backends: - counts retried connections (masked or not) that happened to a slave server - counts redispatched connections - does not count successfully redispatched connections as backend errors. Errors are increased only when client does not get a valid response, in other words: with failed redispatch or when this function is not enabled. 3. For statistics: - display Retr (retries) and Redis (redispatches) as a "Warning" information.	2007-10-18 19:12:30 +02:00
Willy Tarreau	417fae0e60	[MINOR] changed server weight storage from char to unsigned int This change does not affect memory usage much, but it simplifies the code a lot by removing many +1/-1 operations on weights.	2007-03-25 21:16:40 +02:00
Willy Tarreau	91b6f329eb	[CLEANUP] slightly reorganized the struct server Struct server has gathered lots of informations over the time, but it's better for clarity and performance to group those information by usage, the most common ones at the top and the least ones at the bottom.	2007-03-25 21:03:01 +02:00
Willy Tarreau	0f03c6f60b	[MINOR] cleaned up the check_addr patch a bit removed useless set_check_addr entry and rely on check_addr itself.	2007-03-25 20:46:19 +02:00
Willy Tarreau	2ea3abb7bf	[MEDIUM] add support for health-checks on other addresses Patch from Fabrice Dulaunoy. Explanation below, and script merged in examples/. This patch allow to put a different address in the check part for each server (and not only a specific port) I need this feature because I've a complex settings where, when a specific farm goes down, I need to switch a set of other farm either if these other farm behave perfectly well. For that purpose, I've made a small PERL daemon with some REGEX or PORT test which allow me to test a bunch of thing.	2007-03-25 16:45:16 +02:00
Willy Tarreau	35d66b0c28	[MINOR] added byte count to sessions and statistics. Now the stats page reports the IN and OUT byte counts per FE, BE and SRV.	2007-01-02 00:28:21 +01:00
Willy Tarreau	77074d548b	[MAJOR] support for source binding via cttproxy Using the cttproxy kernel patch, it's possible to bind to any source address. It is highly recommended to use the 03-natdel patch with the other ones. A new keyword appears as a complement to the "source" keyword : "usesrc". The source address is mandatory and must be valid on the interface which will see the packets. The "usesrc" option supports "client" (for full client_ip:client_port spoofing), "client_ip" (for client_ip spoofing) and any 'IP[:port]' combination to pretend to be another machine. Right now, the source binding is missing from server health-checks if set to another address. It must be implemented (think restricted firewalls). The doc is still missing too.	2006-11-12 23:57:19 +01:00
Willy Tarreau	e3ba5f0aaa	[CLEANUP] included common/version.h everywhere	2006-06-29 18:54:54 +02:00
Willy Tarreau	2dd0d4799e	[CLEANUP] renamed include/haproxy to include/common	2006-06-29 17:53:05 +02:00
Willy Tarreau	baaee00406	[BIGMOVE] exploded the monolithic haproxy.c file into multiple files. The files are now stored under : - include/haproxy for the generic includes - include/types.h for the structures needed within prototypes - include/proto.h for function prototypes and inline functions - src/*.c for the C files Most include files are now covered by LGPL. A last move still needs to be done to put inline functions under GPL and not LGPL. Version has been set to 1.3.0 in the code but some control still needs to be done before releasing.	2006-06-26 02:48:02 +02:00

36 commits