haproxy/include
Willy Tarreau a9df6947b4 OPTIM: proxy: separate queues fields from served
There's still a lot of contention when accessing the backend's
totpend and queueslength for every request in may_dequeue_tasks(),
even when queues are not used. This only happens because it's stored
in the same cache line as >beconn which is being written by other
threads:

  0.01 |        call       sess_change_server
  0.02 |        mov        0x188(%r15),%esi     ## s->queueslength
       |      if (may_dequeue_tasks(srv, s->be))
  0.00 |        mov        0xa8(%r12),%rax
  0.00 |        mov        -0x50(%rbp),%r11d
  0.00 |        mov        -0x60(%rbp),%r10
  0.00 |        test       %esi,%esi
       |        jne        3349
  0.01 |        mov        0xa00(%rax),%ecx     ## p->queueslength
  8.26 |        test       %ecx,%ecx
  4.08 |        je         288d

This patch moves queueslength and totpend to their own cache line,
thus adding 64 bytes to the struct proxy, but gaining 3.6% of RPS
on a 64-core EPYC thanks to the elimination of this false sharing.

process_stream() goes down from 3.88% to 3.26% in perf top, with
the next top users being inc/dec (s->served) and be->beconn.
2026-01-28 16:07:27 +00:00
..
haproxy OPTIM: proxy: separate queues fields from served 2026-01-28 16:07:27 +00:00
import CLEANUP: assorted typo fixes in the code, commits and doc 2025-12-25 19:45:29 +01:00
make BUILD: makefile: add a qinfo macro to pass info in quiet mode 2025-01-08 11:26:05 +01:00