-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Propagate file settings health info to the health node #127397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-core-infra (Team:Core/Infra) |
There's a failure:
This passes for me though. 🤔 |
It has passed for me 50 times out of 50 runs. I don't understand how it could fail intermittently, unless somehow the health propagation is taking so long that something times out. |
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely in the direction that I had in mind. Thanks for working on this, Patrick! I left a few comments, let me know what you think.
server/src/main/java/org/elasticsearch/health/node/UpdateHealthInfoCacheAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Outdated
Show resolved
Hide resolved
.../internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java
Outdated
Show resolved
Hide resolved
.../internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/health/node/HealthInfoTests.java
Outdated
Show resolved
Hide resolved
...asticsearch/datastreams/lifecycle/health/DataStreamLifecycleHealthIndicatorServiceTests.java
Show resolved
Hide resolved
.../internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java
Outdated
Show resolved
Hide resolved
@nielsbauman - Sorry for the force push. History got tangled so I just used the big hammer to straighten it out. I think you reviewed up to 936813b in effect, so the commits after that would be new to you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing all the comments! I posted a few more small comments but we're almost good to go :).
.../internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/health/node/HealthInfoTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/health/node/HealthInfoTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/health/node/HealthInfoTests.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsServiceProvider.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/reservedstate/service/FileSettingsService.java
Show resolved
Hide resolved
Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks a lot for the iterations! 🚀
.../internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
Linked serverless PR: https://github.com/elastic/elasticsearch-serverless/pull/3838 |
💔 Backport failed
You can use sqren/backport to manually backport by running |
Abstract
File settings are processed on the master node, but health is reported by the health node. To get status reported correctly, we must forward the health info from the master to the health node.
See ES-10965.
Design
The system of record for the file settings health info is the master node's
FileSettingsHealthIndicatorService.currentInfo
field, but health info is actually reported from theHealthInfoCache
on the health node. Therefore, whenevercurrentInfo
changes, we send aUpdateHealthInfoCacheAction
request to the health node to update its cache to match the actual value on the master node.Data representation
The
HealthInfo
record is returned by health API. I've addedFileSettingsHealthInfo
to this structure, and used theINDETERMINATE
value to when no info is available, rather than null, to round off some corner cases. To ensure consistency, I've used this same record as the internal representation insideFileSettingsHealthIndicatorService
itself; prior to this PR, we had been using a number of separate volatile fields for the individual parts of the health state.The
UpdateHealthInfoCacheAction.Request
record, in contrast, is used to propagate info to the health node. This is triggered by individual health-reporting components, each of which should only set its own health info and leave the others unchanged. Hence, for this record, the absence of file settings health info is represented by null.Propagation steps
FileSettingsService.completeProcessing
notes success or failure, recording the updated info as aFileSettingsHealthInfo
object, and then callsFileSettingsHealthIndicatorService.publish
in afinally
block.FileSettingsHealthIndicatorPublisherImpl.publish
looks up the health node, and sends it aUpdateHealthInfoCacheAction
request containing the new health info, with all other fields null to indicate we're not changing those.HealthInfoCache
is updated with the newFileSettingsHealthInfo
.