Constant stack size #2688
base: master
Conversation
Force-pushed from f62a7a4 to 2c135d2.
Another curiosity I found while working on #2336: unrolling the calls affects performance strangely on ARM architecture (Apple M3), while it does what you'd expect on x86-64. Benchmark of just
Apply AggressiveOptimization to engine methods.
Co-authored-by: Corniel Nobel <corniel@gmail.com>
…tion and simpler engine code.
Force-pushed from 4907ffd to 091bb56.
@timcassell I really love this approach! Maintaining a constant stack size should help achieve more stable measurements. I also like the idea of a unified method for running iterations in the engine instead of delegating it to the `*Stage.Run()` methods.

Now, we need to develop a better design. The `IEngineStageEvaluator` seems unnecessary since we already have `IStoppingCriteria`; merging these entities would be beneficial. We can also simplify the existing solution by eliminating `EngineWarmupStage`, `EnginePilotStage`, and `EngineActualStage`, as their purpose is to provide running and stopping strategies. Since we have a sealed hierarchy of stages that should not be extended externally, we can introduce an advanced dispatcher that builds an `IStoppingCriteria`/`IEngineStageEvaluator` based on `IterationStage`, `IterationMode`, and other state properties without calling overridden `Engine*Stage` methods.

What do you think?
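To make the idea concrete, here is a minimal sketch of what such a dispatcher could look like. Everything here is illustrative: `StoppingCriteriaDispatcher`, `FixedIterationCountCriteria`, and `AutoStoppingCriteria` are hypothetical names, and the `IStoppingCriteria` shape shown is simplified, not the actual BenchmarkDotNet interface.

```csharp
using System.Collections.Generic;

// Simplified stand-in for the real interface (the actual BenchmarkDotNet
// IStoppingCriteria has a different shape).
public interface IStoppingCriteria
{
    bool ShouldStop(IReadOnlyList<double> measurements);
}

public static class StoppingCriteriaDispatcher
{
    // Hypothetical sketch: pick the stopping criteria from the stage/mode
    // state instead of calling overridden Engine*Stage methods.
    public static IStoppingCriteria Resolve(IterationStage stage, IterationMode mode) =>
        stage switch
        {
            // Example parameters only; real defaults would come from the Job.
            IterationStage.Warmup => new FixedIterationCountCriteria(count: 6),
            IterationStage.Pilot  => new AutoStoppingCriteria(maxIterations: 20),
            _                     => new AutoStoppingCriteria(maxIterations: 100),
        };
}
```

The point of the sketch is that the stage hierarchy collapses into data (`IterationStage`, `IterationMode`) plus a single resolution function, so the engine loop stays one method with a constant stack shape.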
> I initially looked at re-using

I'm not sure what you have in mind there. It seems to me that the
We can rework how the pilot stage exposes the chosen invocation count; changing the API is not a problem.
Initially, I made it public to enable experiments with stopping criteria without needing to build a new version of BenchmarkDotNet. However, I don't feel that anyone is actually using it. Therefore, I believe it's safe to completely rework the API and reduce the public surface.
Currently, the
@AndreyAkinshin Since we're ok with breaking changes, what do you think of making the measurement iteration indices 0-based? I noticed they are 1-based while I was working on this, and I couldn't figure out why. Maybe just an off-by-one bug that was never caught? There are no tests for the measurement iteration index.
@timcassell The measurement iteration indices are primarily intended for presentation to the user, rather than for internal logic. From a usability perspective, 1-based numbering tends to be more intuitive and user-friendly. For example, we typically refer to the first iteration as “Iteration 1” rather than “Iteration 0”—just as spreadsheet rows in Excel start from 1 to match user expectations. Using 0-based indices in this context can easily lead to confusion. Consider a case where the output shows:
If someone refers to “the second iteration,” it’s ambiguous whether they mean the iteration labeled 1 or the one labeled 2. Since the main purpose of these indices is to improve readability and clarity for the end user, we chose 1-based indexing intentionally.
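A toy sketch of the convention being described (the measurement values are made up for illustration): the loop stays 0-based internally, and only the displayed label is shifted.

```csharp
using System;

class IterationDisplayDemo
{
    static void Main()
    {
        double[] results = { 10.1, 10.3, 10.2 }; // toy measurements, not real output

        for (int i = 0; i < results.Length; i++)
        {
            // Internally 0-based; displayed 1-based so "Iteration 1" is the
            // first one, matching how users (and spreadsheet rows) count.
            Console.WriteLine($"Iteration {i + 1}: {results[i]} ns");
        }
    }
}
```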
Well, it looks like there wasn't any way to inject custom stopping criteria. Users would have to implement their own
Fixes #1120.

Applied `AggressiveOptimization` to engine methods to eliminate tiered-JIT as a potential variable in the engine when running iterations. (See also Apply `AggressiveOptimization` to clocks AndreyAkinshin/perfolizer#19.)

I tested this on Ryzen 7 9800X3D, and got the same results on both master and this PR. I could not repro the measurement spikes from the issue (the graph was smooth).
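For readers unfamiliar with the attribute: `MethodImplOptions.AggressiveOptimization` is a real .NET API (available since .NET Core 3.0). A minimal sketch of applying it to an engine-style loop (`EngineSketch` and `RunIteration` are illustrative names, not the actual BenchmarkDotNet code):

```csharp
using System.Runtime.CompilerServices;

public class EngineSketch
{
    // AggressiveOptimization bypasses tiered compilation: the method is
    // compiled by the optimizing JIT immediately, so a mid-run tier-up
    // cannot perturb the measurements.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public long RunIteration(long invokeCount)
    {
        long result = 0;
        for (long i = 0; i < invokeCount; i++)
            result++; // placeholder for the benchmarked action
        return result;
    }
}
```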
I tested on Apple M3 and got these results.
Master:
PR:
Observations to note:

- `AggressiveOptimization` was not applied to clocks (AndreyAkinshin/perfolizer#19) in this test, which could also be a factor.
- `[MemoryRandomization]` randomizes the stack for each iteration.