icingadb

mirror of https://github.com/Icinga/icingadb.git synced 2026-02-03 20:40:34 -05:00

Author	SHA1	Message	Date
Alexander A. Klimov	7f20cf2e33	Test icingaredis.CreateEntities()	2024-08-05 12:45:32 +02:00
Eric Lippmann	7c068d4adf	Use `icinga-go-library`	2024-05-24 09:56:28 +02:00
Eric Lippmann	c070615e64	Move Redis related code to `redis`	2024-05-22 11:51:22 +02:00
Eric Lippmann	aa3c00893f	Move `contracts#Waiter{,Func}` to `com#Waiter{,Func}`	2024-05-22 11:51:21 +02:00
Eric Lippmann	77ccdfc303	Move type related utility functions from `internal` to `types`	2024-05-22 11:51:21 +02:00
Eric Lippmann	2f3bf491d7	Move `utils#Name()` to `types#Name()`	2024-05-22 11:51:21 +02:00
Eric Lippmann	e2b4f0297f	Introduce `strcase` for converting string cases	2024-05-22 11:51:21 +02:00
Eric Lippmann	75501e11f8	Move database related contracts to `database/contracts`	2024-05-22 11:51:21 +02:00
Eric Lippmann	5029e328c8	Unify notation of `n * time.Duration`	2024-04-11 13:01:31 +02:00
Alvar Penning	779afd1da3	Enhance HA "Taking over", "Handing over" logging The reason for a switch in the HA roles was not always directly clear. This change now introduces additional debug logging, indicating the reasoning for either taking over or handing over the HA responsibility. First, some logic was moved from the SQL query selecting active Icinga DB instances to Go code. This allowed distinguishing between no available responsible instances and responsible instances with an expired heartbeat. As the HA's peer timeout is logically bound to the Redis timeout, it will now reference this timeout with an additional grace timeout. Doing so eliminates a race between a handing over and a "forceful" take over. As the old code indicated a takeover on the fact that no other instance is active, it will now additionally check if it is already being the active/responsible node. In this case, the takeover logic - which will be interrupted at a later point as the node is already responsible - can be skipped. Next to the additional logging messages, both the takeover and handover channel are now transporting a string to communicate the reason instead of an empty struct{}. By doing so, both the "Taking over" and "Handing over" log messages are enriched with reason. This also required a change in the suppressed logging handling of the HA.realize method, which got its logging enabled through the shouldLog parameter. Now, there are both recurring events, which might be suppressed, as well as state changing events, which should be logged. Therefore, and because the logTicker's functionality was not clear to me on first glance, I renamed it to routineLogTicker. While dealing with the code, some function signature documentation were added, to ease both mine as well as the understanding of future readers. Additionally, the error handling of the SQL query selecting active Icinga DB instances was changed slightly to also handle wrapped sql.ErrNoRows errors. Closes #688.	2024-04-02 13:23:11 +02:00
Eric Lippmann	e31b101f4f	Upgrade `go-redis` to `v9` Co-Authored-By: Alvar Penning <alvar.penning@icinga.com>	2024-03-22 15:32:15 +01:00
Alexander A. Klimov	5a79a72ff5	Heartbeat#sendEvent(m): nil-check m before dereferencing it as it can be nil.	2023-01-19 16:55:11 +01:00
Alexander A. Klimov	6209b5b376	Save memory during config sync via SyncSubject#FactoryForDelta() Code comment TL;DR: Allocate the same amount of smaller data structures	2022-09-13 17:57:23 +02:00
Eric Lippmann	cd96f0de6f	Block XREADs for a maximum of one second I just had the observation that blocking XREADs without timeouts (BLOCK 0) on multiple consecutive Redis restarts and I/O timeouts exceeds Redis internal retries and eventually leads to fatal errors. @julianbrost looked at this for clarification, here is his finding: go-redis only considers a command successful when it returned something, so a successfully started blocking XREAD consumes a retry attempt each time the underlying Redis connection is terminated. If this happens often before any element appears in the stream, this error is propagated. (This also means that even with this PR, when restarting Redis often enough so that a query never reaches the BLOCK 1sec, this would still happen.) https://github.com/Icinga/icingadb/pull/504#issuecomment-1164589244	2022-06-28 16:09:29 +02:00
Julian Brost	061660b023	Telemetry: use mutex for synchronizing last database error The old CompareAndSwap based code tended to end up in an endless loop. Replace it by simple synchronization mechanisms where this can't happen.	2022-06-28 13:30:00 +02:00
Julian Brost	def7c5f22c	Telemetry: change stats names in Redis The same names are used in perfdata names and config_sync sounds more natural than sync_config.	2022-06-28 13:30:00 +02:00
Julian Brost	741460c935	Telemetry: rename keys in heartbeat stream In both C++ and Go, the keys are only used as constant strings, so namespacing them just adds clutter for the `general:*` keys, therefore remove it.	2022-06-28 13:30:00 +02:00
Julian Brost	36d5f7b33c	Telemetry: send Go metrics as performance data string Rather than using a JSON structure to convey these values, simply use the existing format to communicate performance data to Icinga 2. Also removes the reference to Go in the Redis structure, allowing this string to be extended with more metrics in the future without running into naming issues.	2022-06-28 13:30:00 +02:00
Alexander A. Klimov	e1ff704aff	Write own heartbeat into icingadb:telemetry:heartbeat including version, current DB error and HA status quo.	2022-06-23 18:31:45 +02:00
Alexander A. Klimov	64d7f1be43	Remove unused StreamLastId()	2022-06-23 18:31:45 +02:00
Alexander A. Klimov	fac9f5e4e5	Write ops/s by op and s to icingadb:telemetry:stats	2022-06-15 09:51:59 +02:00
Eric Lippmann	f21f50e958	Reduce max_hmget_connections to 8	2021-11-12 16:29:59 +01:00
Eric Lippmann	ea74dc172a	Rename periodic.Stoper to periodic.Stopper	2021-11-05 17:57:27 +01:00
Eric Lippmann	ccda48234e	Use custom logger for accessing the interval for periodic logging	2021-11-05 17:57:22 +01:00
Eric Lippmann	43bcd2bbee	Remove syncing $redisKey log message This info message just pollutes the logs and for debugging we log the execution anyway.	2021-11-05 17:52:11 +01:00
Eric Lippmann	8ce917d45a	Remove waiting for heartbeat message If a heartbeat is pending, we log it every 60 seconds anyway.	2021-11-05 17:52:11 +01:00
Eric Lippmann	5f1639aca2	Use pkg periodic for Redis logs	2021-11-05 17:18:05 +01:00
Eric Lippmann	8a03745273	Speak of Icinga heartbeat not Icinga 2 heartbeat	2021-11-05 17:18:03 +01:00
Julian Brost	54dbe0cfbe	Merge pull request #391 from Icinga/bugfix/multi-environment Better handling of multiple environments	2021-11-05 16:55:21 +01:00
Julian Brost	9b02b18f46	Use new environment ID https://github.com/Icinga/icinga2/pull/9036 introduced a new environment ID for Icinga DB that's written to the icinga:stats stream as field "icingadb_environment". This commit updates the code to make use of this ID instead of the one derived from the Icinga 2 Environment constant.	2021-11-03 15:47:38 +01:00
Eric Lippmann	563aafaf90	Config: Validate xread_count	2021-11-03 15:23:40 +01:00
Eric Lippmann	d8ba0c374a	Merge pull request #364 from Icinga/feature/history-sync-foreign-keys Add foreign key constraints to history tables	2021-10-07 18:38:33 +02:00
Julian Brost	bfcc324535	History sync: rewrite to use a sequential pipeline This is in preparation for adding foreign key constraints to the history tables. For this, it is required to insert the rows into the different history tables in a defined order.	2021-10-05 18:35:02 +02:00
Julian Brost	82530c771d	Redis/DB: export options member This change allows the history sync to use values configured in these options.	2021-10-05 18:34:55 +02:00
Julian Brost	217ab03e59	heartbeat: wrap messages with a timestamp Track when a heartbeat was received to allow other components to check when it will expire.	2021-10-04 16:58:35 +02:00
Julian Brost	8b2cb3acb8	heartbeat: use a single channel for all beat/loss events Using Cond does not allow to reliably catch all events as one will only receive events that occur after starting to listen. For heartbeat loss events it's important to reliably catch them to not remain in an HA active state incorrectly. fixes #360	2021-10-04 16:36:09 +02:00
Julian Brost	e0c903bfdc	Redis HYield: remove duplicates returned by HSCAN fixes #349	2021-09-23 14:36:51 +02:00
Julian Brost	4457f9f440	Merge pull request #365 from Icinga/data-races Fix data races	2021-09-23 12:32:19 +02:00
Eric Lippmann	454381c820	Use uint64 instead of Counter Use uint64 as there is no longer any concurrent access.	2021-09-23 12:18:08 +02:00
Eric Lippmann	98202e1257	Use buffered channel Use a buffered channel so that the next HSCAN call does not have to wait until the previous result has been processed.	2021-09-23 09:37:31 +02:00
Eric Lippmann	c1e722f5fa	Do not close channel too early This fixes a data race where the pairs channel was closed too early when the context is canceled and therefore the outer errgroup returns from Redis operations before Wait() is called on the inner errgroup. Unfinished Go methods in the inner errgroup would then try to work on a closed channel.	2021-09-23 09:37:31 +02:00
Julian Brost	17321cdfc3	Fix use of wrong log function on heartbeat loss Has to use the Warnw function as it passes additional zap attributes.	2021-09-23 09:27:26 +02:00
Julian Brost	be9054628a	Ensure extra config options are properly initialized YAML is decoded by the structure of the YAML source document, not the Go destination data structure. Therefore, the old code did not always call UnmarshalYAML() on all sub-structs. Therefore, defaults were not always set but zero values were used, resulting in all kinds of strange behavior. This commit changes the code so that it no longer relies on individual UnmarshalYAML() functions to set the defaults for each sub-struct but instead just sets all of them when creating the surrounding Config instance. It also moves the config validation to separate Validate() functions.	2021-09-01 18:49:38 +02:00
Eric Lippmann	fbbb9bfacd	Don't allow 0 for timeout redis option 0 stands for deactivate, which makes no sense here.	2021-08-10 09:29:27 +02:00
Eric Lippmann	559b27cd8b	Don't inline Redis options There is now the options key to separate required and optional configuration. Before, both were mixed.	2021-08-09 21:48:27 +02:00
Eric Lippmann	bf415f2e1c	Add missing doc in stats_message	2021-08-09 10:30:53 +02:00
Eric Lippmann	ff88cb73f7	Add missing doc in icinga_status	2021-08-09 10:30:53 +02:00
Eric Lippmann	92bc1b26c7	Add missing doc in redis utils	2021-08-09 10:30:53 +02:00
Eric Lippmann	fee30380d5	Add missing doc in client	2021-08-09 10:30:53 +02:00
Eric Lippmann	7bda89e79d	Return error instead of panicking	2021-08-09 10:29:47 +02:00

73 commits