101 commits

Author SHA1 Message Date
Willy Tarreau
20d46a5a95 CLEANUP: session: use an array for the stick counters
The stick counters were in two distinct sets of struct members,
causing some code to be duplicated. Now we use an array, which
enables some processing to be performed in loops. This allowed
the code to be shrunk by 700 bytes.
2012-12-09 15:57:16 +01:00
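A minimal sketch of the idea behind commit 20d46a5a95, assuming hypothetical names (`sketch_session`, `sketch_entry`, `SKETCH_NB_STKCTR`, `sketch_release_counters`) that are not taken from the HAProxy sources. It only illustrates how replacing two parallel sets of members with an array lets one loop do the work of duplicated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not HAProxy's real layout. */
#define SKETCH_NB_STKCTR 2

struct sketch_entry {
    int ref_cnt;                 /* number of sessions tracking this entry */
};

struct sketch_session {
    /* Before: two distinct sets of members, one per stick counter.
     * After: a single array indexed by counter number. */
    struct sketch_entry *stkctr[SKETCH_NB_STKCTR];
};

/* Release every tracked entry with one loop instead of duplicated code.
 * Returns the number of counters that were released. */
static int sketch_release_counters(struct sketch_session *s)
{
    int released = 0;

    assert(s != NULL);
    for (size_t i = 0; i < SKETCH_NB_STKCTR; i++) {
        if (!s->stkctr[i])
            continue;
        s->stkctr[i]->ref_cnt--;
        s->stkctr[i] = NULL;
        released++;
    }
    return released;
}
```

Adding a third counter in such a layout would only require changing the array size, which is the kind of simplification the commit message describes.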
Willy Tarreau
5d5b5d8eaf MEDIUM: proto_tcp: add support for tracking L7 information
Until now it was only possible to use track-sc1/sc2 with "src" which
is the IPv4 source address. Now we can use track-sc1/sc2 with any fetch
as well as any transformation type. It works just like the "stick"
directive.

Samples are automatically converted to the correct types for the table.

Only "tcp-request content" rules may use L7 information, and such information
must already be present when the tracking is set up. For example it becomes
possible to track the IP address passed in the X-Forwarded-For header.

HTTP request processing now also considers tracking from backend rules
because we want to be able to update the counters even when the request
was already parsed and tracked.

Some more controls need to be performed (eg: samples do not distinguish
between L4 and L6).
2012-12-09 14:08:47 +01:00
William Lallemand
8b52bb3878 MEDIUM: compression: use pool for comp_ctx
Use pool for comp_ctx, it is allocated during the comp_algo->init().
The allocation of comp_ctx is accounted for in the zlib_memory_available.
2012-11-21 01:56:47 +01:00
William Lallemand
ec3e3890f0 BUG/MINOR: compression: deinit zlib only when required
The zlib stream was deinitialized even when the init failed.
2012-11-15 15:42:17 +01:00
Willy Tarreau
3fdb366885 MAJOR: connection: replace struct target with a pointer to an enum
Instead of storing a couple of (int, ptr) in the struct connection
and the struct session, we use a different method : we only store a
pointer to an integer which is stored inside the target object and
which contains a unique type identifier. That way, the pointer allows
us to retrieve the object type (by dereferencing it) and the object's
address (by computing the displacement in the target structure). The
NULL pointer always corresponds to OBJ_TYPE_NONE.

This reduces the size of the connection and session structs. It also
simplifies target assignment and comparison.

In order to improve the generated code, we try to put the obj_type
element at the beginning of all the structs (listener, server, proxy,
si_applet), so that the original and target pointers are always equal.

A lot of code was touched by massive replaces, but the changes are not
that important.
2012-11-12 00:42:33 +01:00
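The pointer-to-enum technique described in commit 3fdb366885 can be sketched as follows. The struct layouts and helper names (`sketch_server`, `sketch_proxy`, `sketch_obj_type`, `sketch_objt_server`, `sketch_objt_proxy`) are assumptions made for illustration; only `OBJ_TYPE_NONE` and the `obj_type` member name come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative definitions only, not HAProxy's actual ones. */
enum obj_type {
    OBJ_TYPE_NONE = 0,       /* also what a NULL pointer stands for */
    OBJ_TYPE_SERVER,
    OBJ_TYPE_PROXY,
};

struct sketch_server {
    enum obj_type obj_type;  /* first member: same address as the object */
    int weight;
};

struct sketch_proxy {
    int id;
    enum obj_type obj_type;  /* not first: a displacement is needed */
};

/* With obj_type first, the target pointer and the object pointer are equal. */
static_assert(offsetof(struct sketch_server, obj_type) == 0,
              "obj_type is expected at the start of sketch_server");

/* Type of the pointed-to object; NULL maps to OBJ_TYPE_NONE. */
static enum obj_type sketch_obj_type(const enum obj_type *t)
{
    return t ? *t : OBJ_TYPE_NONE;
}

/* Recover the enclosing server, or NULL if the target is not a server. */
static struct sketch_server *sketch_objt_server(enum obj_type *t)
{
    if (sketch_obj_type(t) != OBJ_TYPE_SERVER)
        return NULL;
    return (struct sketch_server *)
           ((char *)t - offsetof(struct sketch_server, obj_type));
}

/* Recover the enclosing proxy, or NULL if the target is not a proxy. */
static struct sketch_proxy *sketch_objt_proxy(enum obj_type *t)
{
    if (sketch_obj_type(t) != OBJ_TYPE_PROXY)
        return NULL;
    return (struct sketch_proxy *)
           ((char *)t - offsetof(struct sketch_proxy, obj_type));
}
```

A session or connection would then store a single `enum obj_type *` as its target: dereferencing it yields the type, and subtracting the member offset yields the object, so no separate (int, ptr) pair is needed.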
William Lallemand
08289f12f9 BUILD: remove dependency to zlib.h
The build depended on the zlib.h header, regardless of the USE_ZLIB
option. The fix consists of several #ifdef in the source code.

It removes the overhead of the zstream structure in the session when you
don't use the option.
2012-11-05 10:23:16 +01:00
William Lallemand
1c2d622d82 CLEANUP: use struct comp_ctx instead of union
Replace union comp_ctx by struct comp_ctx.

Use struct comp_ctx * in the init/add_data/flush/reset/end prototypes of
compression.h functions.
2012-11-05 10:23:16 +01:00
William Lallemand
82fe75c1a7 MEDIUM: HTTP compression (zlib library support)
This commit introduces HTTP compression using the zlib library.

http_response_forward_body has been modified to call the compression
functions.

This feature includes 3 algorithms: identity, gzip and deflate:

  * identity: this is mostly for debugging, and it was useful for
  developing the compression feature. With Content-Length in input, it
  builds each chunk from the data available in the current buffer.
  With chunks in input, it rechunks: the output chunks will be
  bigger or smaller depending on the size of the input chunk and the
  size of the buffer. Identity does not apply any change to the data.

  * gzip: same as identity, but applying a gzip compression. The data
  are deflated using the Z_NO_FLUSH flag in zlib. When there is no more
  data in the input buffer, it flushes the data in the output buffer
  (Z_SYNC_FLUSH). At the end of data, when it receives the last chunk in
  input, or when there is no more data to read, it writes the end of
  data with Z_FINISH and the ending chunk.

  * deflate: same as gzip, but with deflate algorithm and zlib format.
  Note that this algorithm has ambiguous support on many browsers and
  no support at all from recent ones. It is strongly recommended not
  to use it for anything else than experimentation.

You can't choose the compression level at the moment; it is set to
Z_BEST_SPEED (1), as tests have shown very little benefit in terms of
compression ratio when going above that for HTML contents, at the cost of
a massive CPU impact.

Compression is activated depending on the Accept-Encoding request
header. With identity, that header is not taken into account.

To build HAProxy with zlib support, use USE_ZLIB=1 in the make
parameters.

This work was initially started by David Du Colombier at Exceliance.
2012-10-26 02:30:48 +02:00
Willy Tarreau
109e95a1b4 OPTIM: session: reorder struct session fields
A reordering of the struct session fields has increased overall performance
by almost 1% due to better cache usage.
2012-10-13 11:22:24 +02:00
Willy Tarreau
c93f7959e5 CLEANUP: session: remove term_trace which is not used anymore
This field was used to trace precisely where a session was terminated
but it did not survive code rearchitecture and was not used at all
anymore. Let's get rid of it.
2012-10-13 11:10:30 +02:00
Willy Tarreau
2542b53b19 MAJOR: session: introduce embryonic sessions
When an incoming connection request is accepted, a connection
structure is needed to store its state. However we don't want to
fully initialize a session until the data layer is about to be
ready.

As long as the connection is physically stored into the session,
it's not easy to split both allocations.

As such, we only initialize the minimum requirements of a session,
which results in what we call an embryonic session. Then once the
data layer is ready, we can complete the session's initialization.

Doing so avoids buffers allocation and ensures that a session only
sees ready connections.

The frontend's client timeout is used as the handshake timeout. It
is likely that another timeout will be used in the future.
2012-09-03 20:47:35 +02:00
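The two-phase initialization described in commit 2542b53b19 might look roughly like the sketch below. All names (`sk_session`, `sk_session_accept`, `sk_session_complete`, `sk_session_free`) and the choice of fields are hypothetical; the real session structure is far larger. The point is only that buffers are not allocated until the data layer is ready.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced session; not HAProxy's struct session. */
struct sk_session {
    int handshake_done;  /* set once the data layer is ready */
    char *req_buf;       /* NULL while the session is embryonic */
    char *rep_buf;
};

/* Phase 1: on accept(), allocate only the minimum (no buffers yet). */
static struct sk_session *sk_session_accept(void)
{
    return calloc(1, sizeof(struct sk_session));
}

/* Phase 2: once the handshake is over, finish the initialization.
 * Returns 0 on success, -1 if a buffer could not be allocated. */
static int sk_session_complete(struct sk_session *s, size_t bufsize)
{
    assert(s != NULL && s->req_buf == NULL && s->rep_buf == NULL);
    s->req_buf = malloc(bufsize);
    s->rep_buf = malloc(bufsize);
    if (!s->req_buf || !s->rep_buf) {
        free(s->req_buf);
        free(s->rep_buf);
        s->req_buf = s->rep_buf = NULL;
        return -1;
    }
    s->handshake_done = 1;
    return 0;
}

/* Release a session in either state. */
static void sk_session_free(struct sk_session *s)
{
    if (!s)
        return;
    free(s->req_buf);
    free(s->rep_buf);
    free(s);
}
```

In this reading, a connection that fails or times out during the handshake only ever costs the small embryonic allocation.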
Willy Tarreau
c7e4238df0 REORG: buffers: split buffers into chunk,buffer,channel
Many parts of the channel definition still make use of the "buffer" word.
2012-09-03 20:47:32 +02:00
Willy Tarreau
7421efb85f REORG/MAJOR: use "struct channel" instead of "struct buffer"
This is a massive rename. We'll then split channel and buffer.

This change needs a lot of cleanups. At many locations, the parameter
or variable is still called "buf" which will become ambiguous. Also,
the "struct channel" is still defined in buffers.h.
2012-09-02 21:54:55 +02:00
Justin Karneges
eb2c24ae2a MINOR: checks: add on-marked-up option
This implements the feature discussed in the earlier thread of killing
connections on backup servers when a non-backup server comes back up. For
example, you can use this to route to a mysql master & slave and ensure
clients don't stay on the slave after the master goes from down->up. I've done
some minimal testing and it seems to work.

[WT: added session flag & doc, moved the killing after logging the server UP,
 and ensured that the new server is really usable]
2012-06-03 23:48:42 +02:00
Willy Tarreau
9b061e3320 MEDIUM: stream_sock: add a get_src and get_dst callback and remove SN_FRT_ADDR_SET
These callbacks are used to retrieve the source and destination address
of a socket. The address flags are now held on the stream interface and
not on the session anymore. The addresses are collected when needed.

This still needs to be improved to store the IP and port separately so
that it is not needed to perform a getsockname() when only the IP address
is desired for outgoing traffic.
2012-04-07 18:03:52 +02:00
William Lallemand
a73203e3dc MEDIUM: log: Unique ID
The Unique ID is an ID generated from several pieces of information. You can use
a log-format string to customize it, with the "unique-id-format" keyword,
and insert it in the request header, with the "unique-id-header" keyword.
2012-04-07 16:25:26 +02:00
William Lallemand
b7ff6a3a36 MEDIUM: log-format: backend source address %Bi %Bp
%Bi returns the backend source IP
%Bp returns the backend source port

Add a function pointer in logformat_type to do additional configuration
during the log-format variable parsing.
2012-03-12 15:50:52 +01:00
Willy Tarreau
a2a64e9689 [MEDIUM] session: make session_shutdown() an independant function
We already had the ability to kill a connection, but it was only
for the checks. Now we can do this for any session, and for this we
add a specific flag "K" to the logs.
2011-09-07 23:01:56 +02:00
Simon Horman
752dc4ab2d [MINOR] Add down termination condition
If a connection is closed because the backend became unavailable
then log 'D' as the termination condition.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-21 22:10:56 +02:00
Simon Horman
af51495397 [MINOR] Add active connection list to server
The motivation for this is to allow iteration of all the connections
of a server without the expense of iterating over the global list
of connections.

The first use of this will be to implement an option to close connections
associated with a server when it is marked as being down or in maintenance
mode.
2011-06-21 22:00:12 +02:00
Willy Tarreau
827aee913f [MAJOR] session: remove the ->srv pointer from struct session
This one has been removed and is now totally superseded by ->target.
To get the server, one must use target_srv(&s->target) instead of
s->srv now.

The function ensures that non-server targets still return NULL.
2011-03-10 23:32:17 +01:00
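As a rough illustration of the accessor mentioned in commit 827aee913f, the sketch below assumes a target made of a type and a pointer. The type names and constants (`sketch_target`, `SKETCH_TARG_*`, `sketch_target_srv`) are invented for the example; only the behaviour, returning NULL for non-server targets, comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target kinds, for illustration only. */
enum sketch_target_type {
    SKETCH_TARG_NONE = 0,
    SKETCH_TARG_SERVER,
    SKETCH_TARG_PROXY,
};

struct sketch_server {
    const char *id;
};

/* A target is a (type, pointer) pair in this sketch. */
struct sketch_target {
    int type;
    void *ptr;
};

/* Return the server behind a target, or NULL if it is not a server. */
static struct sketch_server *sketch_target_srv(const struct sketch_target *t)
{
    assert(t != NULL);
    if (t->type != SKETCH_TARG_SERVER)
        return NULL;
    return t->ptr;
}
```

Callers that used to read a server pointer directly would go through such a helper, which keeps the "is this really a server?" check in one place.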
Willy Tarreau
3d80d911aa [MEDIUM] session: remove s->prev_srv which is not needed anymore
s->prev_srv is used by assign_server() only, but all code paths leading
to it now take s->prev_srv from the existing s->srv. So assign_server()
can do that copy into its own stack.

If at one point a different srv is needed, we still have a copy of the
last server on which we failed a connection attempt in s->target.
2011-03-10 23:32:16 +01:00
Willy Tarreau
664beb8610 [MINOR] session: add a pointer to the new target into the session
When dealing with HTTP keep-alive, we'll have to know if we can reuse
an existing connection. For that, we'll have to check if the current
connection was made on the exact same target (referenced in the stream
interface).

Thus, we need to first assign the next target to the session, then
copy it to the stream interface upon connect(). Later we'll check for
equivalence between those two operations.
2011-03-10 23:32:16 +01:00
Willy Tarreau
295a837726 [REORG] session: move the data_ctx struct to the stream interface's applet
This is in fact where those parts belong. The old data_state was replaced
by applet.state and is now initialized when the applet is registered. It's
worth noting that the applet does not need to know the session nor the
buffer anymore since everything is brought by the stream interface.

It is possible that having a separate applet struct would simplify the
code but that's not a big deal.
2011-03-10 23:32:16 +01:00
Willy Tarreau
75581aebb0 [CLEANUP] session: remove data_source from struct session
This one was only used for logging purposes, it's not needed
anymore.
2011-03-10 23:32:15 +01:00
Willy Tarreau
957c0a5845 [REORG] session: move client and server address to the stream interface
This will be needed very soon for the keep-alive.
2011-03-10 23:32:14 +01:00
Cyril Bonté
70be45dbdf [MEDIUM] enable/disable servers from the stats web interface
Based on a patch provided by Judd Montgomery, it is now possible to
enable/disable servers from the stats web interface. Several servers in a
backend can be selected and the action applied to them at the same time.

Currently, there are 2 known limitations :
- The POST data are limited to one packet
  (don't alter too many servers at a time).
- Expect: 100-continue is not supported.
(cherry picked from commit 7693948766cb5647ac03b48e782cfee2b1f14491)
2010-10-30 19:04:34 +02:00
Willy Tarreau
0a4838cd31 [MEDIUM] session-counters: correctly unbind the counters tracked by the backend
In case of HTTP keepalive processing, we want to release the counters tracked
by the backend. Till now only the second set of counters was released, while
it could have been assigned by the frontend, or the backend could also have
assigned the first set. Now we reuse two unused bits of the session flags to
mark which stick counters were assigned by the backend and to release them as
appropriate.
2010-08-10 18:04:16 +02:00
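The flag-based bookkeeping described in commit 0a4838cd31 could be sketched like this. The flag values and names (`SKETCH_BE_TRACK_SC1`, `SKETCH_BE_TRACK_SC2`, `sketch_unbind_backend_counters`) are assumptions chosen for the example and do not reflect the actual session flag layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits: set when the backend assigned a counter set. */
#define SKETCH_BE_TRACK_SC1 0x1u
#define SKETCH_BE_TRACK_SC2 0x2u

struct sketch_session {
    unsigned int flags;
    void *stkctr1;   /* first set of tracked counters, or NULL */
    void *stkctr2;   /* second set of tracked counters, or NULL */
};

/* At the end of a keep-alive transaction, drop only the counters that
 * the backend bound; those bound by the frontend stay in place. */
static void sketch_unbind_backend_counters(struct sketch_session *s)
{
    assert(s != NULL);
    if (s->flags & SKETCH_BE_TRACK_SC1)
        s->stkctr1 = NULL;
    if (s->flags & SKETCH_BE_TRACK_SC2)
        s->stkctr2 = NULL;
    s->flags &= ~(SKETCH_BE_TRACK_SC1 | SKETCH_BE_TRACK_SC2);
}
```

Recording the owner per counter set, rather than assuming the second set always belongs to the backend, is what lets either set be released correctly.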
Willy Tarreau
56123282ef [MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters"
The assumption that there was a 1:1 relation between tracked counters and
the frontend/backend role was wrong. It is perfectly possible to track the
track-fe-counters from the backend and the track-be-counters from the
frontend. Thus, in order to reduce confusion, let's remove this useless
{fe,be} reference and simply use {1,2} instead. The keywords have also been
renamed in order to limit confusion. The ACL rule action now becomes
"track-sc{1,2}". The ACLs are now "sc{1,2}_*" instead of "trk{fe,be}_*".

That means that we can reasonably document "sc1" and "sc2" (sticky counters
1 and 2) as sort of patterns that are available during the whole session's
life and use them just like any other pattern.
2010-08-10 18:04:15 +02:00
Willy Tarreau
f059a0f63a [MAJOR] session-counters: split FE and BE track counters
Having a single tracking pointer for both frontend and backend counters
does not work. Instead let's have one for each. The keyword has changed
to "track-be-counters" and "track-fe-counters", and the ACL "trk_*"
changed to "trkfe_*" and "trkbe_*".
2010-08-10 18:04:15 +02:00
Willy Tarreau
4f3f01fa39 [MEDIUM] stats: add the ability to dump table entries matching criteria
It is now possible to dump some select table entries based on criteria
which apply to the stored data. This is enabled by appending the following
options to the end of the "show table" statement :

  data.<data_type> {eq|ne|lt|gt|le|ge} <value>

For instance :

  show table http_proxy data.conn_rate gt 5
  show table http_proxy data.gpc0 ne 0

The compare applies to the integer value as it would be displayed, and
operates on signed long long integers.
2010-08-10 18:04:14 +02:00
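The filter semantics described in commit 4f3f01fa39 (six operators applied to signed long long values) can be sketched as below. The enum and function names (`sketch_op`, `sketch_parse_op`, `sketch_match`) are hypothetical and are not taken from the HAProxy sources; only the operator keywords come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operator codes, in the same order as the keyword table. */
enum sketch_op {
    SK_OP_EQ, SK_OP_NE, SK_OP_LT, SK_OP_GT, SK_OP_LE, SK_OP_GE,
    SK_OP_INVALID
};

/* Map one of eq|ne|lt|gt|le|ge to its operator code. */
static enum sketch_op sketch_parse_op(const char *s)
{
    static const char *names[] = { "eq", "ne", "lt", "gt", "le", "ge" };

    assert(s != NULL);
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
        if (strcmp(s, names[i]) == 0)
            return (enum sketch_op)i;
    return SK_OP_INVALID;
}

/* Return non-zero if <data> <op> <value> holds, on signed long long. */
static int sketch_match(long long data, enum sketch_op op, long long value)
{
    switch (op) {
    case SK_OP_EQ: return data == value;
    case SK_OP_NE: return data != value;
    case SK_OP_LT: return data <  value;
    case SK_OP_GT: return data >  value;
    case SK_OP_LE: return data <= value;
    case SK_OP_GE: return data >= value;
    default:       return 0;   /* unknown operator never matches */
    }
}
```

With such helpers, a filter like `data.conn_rate gt 5` would keep an entry whose displayed conn_rate is 7 and skip one whose value is 5.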
Willy Tarreau
69f58c8058 [MEDIUM] stats: add "show table [<name>]" to dump a stick-table
It is now possible to dump a table's contents with keys, expire,
use count, and various data using the command above on the stats
socket.

"show table" only shows main table stats, while "show table <name>"
dumps table contents, only if the socket level is admin.
2010-08-10 18:04:14 +02:00
Willy Tarreau
9ba2dcc86c [MAJOR] session: add track-counters to track counters related to the session
This patch adds the ability to set a pointer in the session to an
entry in a stick table which holds various counters related to a
specific pattern.

Right now the syntax matches the target syntax and only the "src"
pattern can be specified, to track counters related to the session's
IPv4 source address. There is a special function to extract it and
convert it to a key. But the goal is to be able to later support as
many patterns as for the stick rules, and get rid of the specific
function.

The "track-counters" directive may only be set in a "tcp-request"
statement right now. Only the first one applies. Later we will probably
support multi-criteria tracking for a single session, and we will then
have to name tracking pointers.

No counter is updated right now, only the refcount is. Some subsequent
patches will have to bring that feature.
2010-08-10 18:04:12 +02:00
Willy Tarreau
5214be1b22 [MINOR] session: add a pointer to the tracked counters for the source
We'll have to keep counters of various criteria specific to the session's
source. When we get one, keep a pointer to it in the session.
2010-06-14 15:32:18 +02:00
Willy Tarreau
ee28de0a12 [MEDIUM] session: move the conn_retries attribute to the stream interface
The conn_retries still lies in the session and its initialization depends
on the backend when it may not yet be known. Let's first move it to the
stream interface.
2010-06-14 10:53:16 +02:00
Cyril Bonté
47fdd8e993 [MINOR] add the "ignore-persist" option to conditionally ignore persistence
This is used to disable persistence depending on some conditions (for
example using an ACL matching static files or a specific User-Agent).
You can see it as a complement to "force-persist".

In the configuration file, the force-persist/ignore-persist declaration
order defines the rules' priority.

Used with the "appsession" keyword, it can also help reduce memory usage,
as the session won't be hashed when persistence is ignored.
2010-04-25 22:37:14 +02:00
Willy Tarreau
b1d67749db [MEDIUM] backend: move the transparent proxy address selection to backend
The transparent proxy address selection was set in the TCP connect function
which is not the most appropriate place since this function has limited
access to the amount of parameters which could produce a source address.

Instead, now we determine the source address in backend.c:connect_server(),
right after calling assign_server_address() and we assign this address in
the session and pass it to the TCP connect function. This cannot be performed
in assign_server_address() itself because in some cases (transparent mode,
dispatch mode or http_proxy mode), we assign the address somewhere else.

This change will open the ability to bind to addresses extracted from many
other criteria (eg: from a header).
2010-03-30 09:59:43 +02:00
Willy Tarreau
66dc20a17b [MINOR] stats socket: add show sess <id> to dump details about a session
When trying to spot some complex bugs, it's often needed to access
information on stuck sessions, which is quite difficult. This new
command helps one get detailed information about a session, with
flags, timers, states, etc... The buffer data are not dumped yet.
2010-03-05 17:58:04 +01:00
Willy Tarreau
4de9149f87 [MINOR] add the "force-persist" statement to force persistence on down servers
This is used to force access to down servers for some requests. This
is useful when validating that a change on a server correctly works
before enabling the server again.
2010-01-22 19:10:05 +01:00
Emeric Brun
b982a3d23a [MEDIUM] Add stick table configuration and init.
2010-01-12 16:01:24 +01:00
Willy Tarreau
a3377eeeff [MINOR] http: move appsession 'sessid' from session to http_txn
This change, suggested by Cyril Bonté, makes a lot of sense and
would have made it obvious that sessid was not properly initialized
while switching to keep-alive. The code is now cleaner.
2010-01-10 10:49:11 +01:00
Willy Tarreau
5b15447672 [MAJOR] http: completely process the "connection" header
Up to now, we only had a flag in the session indicating if it had to
work in "connection: close" mode. This is not at all compatible with
keep-alive.

Now we ensure that both sides of a connection act independently and
only relative to the transaction. The HTTP version of the request
and response is also correctly considered. The connection already
knows several modes :
  - tunnel (CONNECT or no option in the config)
  - keep-alive (when permitted by configuration)
  - server-close (close the server side, not the client)
  - close (close both sides)

This change carefully detects all situations to find whether a request
can be fully processed in its mode according to the configuration. Then
the response is also checked and tested to fix corner cases which can
happen with different HTTP versions on both sides (eg: a 1.0 client
asks for explicit keep-alive, and the server responds with 1.1 without
a header).

The mode is selected by a capability elimination algorithm which
automatically focuses on the least capable agent between the client,
the frontend, the backend and the server. This ensures we won't get
undesired situations where one of the 4 "agents" is not able to
process a transaction.

No "Connection: close" header will be added anymore to HTTP/1.0 requests
or responses since they're already in close mode.

The server-close mode is still not completely implemented. The response
needs to be rewritten as keep-alive before being sent to the client if
the connection was already in server-close (which implies the request
was in keep-alive) and if the response has a content-length or a
transfer-encoding (but only if client supports 1.1).

A later improvement in server-close mode would probably be to detect
some situations where it's interesting to close the response (eg:
redirections with remote locations). But even then, the client might
close by itself.

It's also worth noting that in tunnel mode, no connection header is
affected in either direction. A tunnelled connection should theoretically
be notified at the session level, but this is useless since by definition
there will not be any more requests on it. Thus, we don't need to add a
flag into the session right now.
2009-12-22 09:52:43 +01:00
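One possible reading of the capability elimination described in commit 5b15447672 is an intersection of per-agent capability masks. The sketch below is a simplification under stated assumptions: the masks, the mode enum and `sk_select_mode` are hypothetical, tunnel mode is left out, and the preference order (keep-alive, then server-close, then close) is an assumption rather than something the commit message specifies.

```c
#include <assert.h>

/* Hypothetical capability bits; every agent is assumed able to close. */
#define SK_CAP_CLOSE        0x1u
#define SK_CAP_SERVER_CLOSE 0x2u
#define SK_CAP_KEEP_ALIVE   0x4u

enum sk_mode {
    SK_MODE_CLOSE,
    SK_MODE_SERVER_CLOSE,
    SK_MODE_KEEP_ALIVE
};

/* Keep only the capabilities shared by all four agents, then pick the
 * most capable mode that survived the elimination. */
static enum sk_mode sk_select_mode(unsigned client, unsigned frontend,
                                   unsigned backend, unsigned server)
{
    unsigned common = client & frontend & backend & server;

    /* Assumption of this sketch: "close" is always supported. */
    assert(common & SK_CAP_CLOSE);

    if (common & SK_CAP_KEEP_ALIVE)
        return SK_MODE_KEEP_ALIVE;
    if (common & SK_CAP_SERVER_CLOSE)
        return SK_MODE_SERVER_CLOSE;
    return SK_MODE_CLOSE;
}
```

Under this model a single agent lacking a capability is enough to remove the corresponding mode, which matches the stated goal of never selecting a mode that one of the four agents cannot handle.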
Willy Tarreau
d0f06fc4b2 [MINOR] http: detect tunnel mode and set it in the session
In order to support keepalive, we'll have to differentiate
normal sessions from tunnel sessions, which are the ones we
don't want to analyse further.

Those are typically the CONNECT requests where we don't care
about any form of content-length, as well as the requests
which are forwarded on non-close and non-keepalive proxies.
2009-11-30 12:19:56 +01:00
Cyril Bonté
bf47aeb946 [MEDIUM] appsession: add the "request-learn" option
This patch has 2 goals :

1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistence is broken : haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.

This also happens when you restart haproxy : the internal hash becomes
empty and all sessions lose their persistence (load balancing the
requests on all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.

The patch modifies the code to make haproxy also learn the persistence
from the client : if no session is sent from the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associate the server that gave the response.

As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes :

  appsession <cookie> len <length> timeout <holdtime> [request-learn]

This helps haproxy repair the persistence (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).

2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour :
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
  parameter with another unknown value :

  curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001

(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.

- As a result, session.jsp sends a new session ID also stored in the
  internal hash, with a server associated.

=> For 1 request, haproxy has stored 3 entries, with only 1 which will be usable

The patch modifies the behaviour to store only 1 entry (maximum).
2009-10-18 11:56:26 +02:00
Willy Tarreau
ea1f5fe28a [MINOR] stats: use a dedicated state to output static data
It is a bit expensive and complex to call buffer_feed()
directly from the request parser, and there are risks that some
output messages are lost in case of buffer full. Since most of
these messages are static, let's have a state dedicated to print
these messages and store them in a specific area shared with the
stats in the session. This both reduces code size and risks of
losing output data.
2009-10-11 23:12:51 +02:00
Krzysztof Piotr Oledzki
aeebf9ba65 [MEDIUM] Collect & provide separate statistics for sockets, v2
This patch makes it possible to collect & provide separate statistics for
each socket. It can be very useful if you would like to distinguish between
traffic generated by local and remote users or between different types of
remote clients (peerings, domestic, foreign).

Currently no "Session rate" is supported, but adding it should be possible
if we find it useful.
2009-10-04 18:56:02 +02:00
Willy Tarreau
b0c9bc4f95 [MEDIUM] stats: make HTTP stats use an I/O handler
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
2009-10-04 15:56:38 +02:00
Willy Tarreau
65671abd32 [MINOR] remove now obsolete ana_state from the session struct
This one is not used anymore.
2009-10-04 14:24:59 +02:00
Willy Tarreau
74808cb907 [MEDIUM] implement error dump on unix socket with "show errors"
The new "show errors" command sent on a unix socket will dump
all captured request and response errors for all proxies. It is
also possible to bound the log to frontends and backends whose
ID is passed as an optional parameter.

The output provides information about frontend, backend, server,
session ID, source address, error type, and error position along
with a complete dump of the request or response which has caused
the error.

If a new error overwrites the one currently being reported, then
the dump is aborted with a warning message, and processing goes
on to next error.
2009-03-04 15:53:18 +01:00
Willy Tarreau
3dfe6cd095 [MEDIUM] add support for "show sess" in unix stats socket
It is now possible to list all known sessions by issuing "show sess"
on the unix stats socket. The format is not much evolved but it is
very useful for debugging.

The doc has been updated to reflect the new keyword.
2008-12-07 22:41:17 +01:00