Add projectors to DeepSet #453
Conversation
* [no ci] notebook tests: increase timeout, fix platform/backend dependent code
  Torch is very slow, so I had to increase the timeout accordingly.
* Enable use of summary networks with functional API again (#434)
* summary networks: add tests for using functional API
* fix build functions for use with functional API
* [no ci] docs: add GitHub and Discourse links, reorder navbar
* [no ci] docs: acknowledge scikit-learn website
* [no ci] docs: capitalize navigation headings
* More tests (#437)
* fix docs of coupling flow
* add additional tests
* Automatically run slow tests when main is involved. (#438)
  In addition, this PR limits the slow test to Windows and Python 3.10. The choices are somewhat arbitrary, my thought was to test the setup not covered as much through use by the devs.
* Update dispatch
* Update dispatching distributions
* Improve workflow tests with multiple summary nets / approximators
* Fix zombie find_distribution import
* Add readme entry [no ci]
* Update README: NumFOCUS affiliation, awesome-abi list (#445)
* fix is_symbolic_tensor
* remove multiple batch sizes, remove multiple python version tests, remove update-workflows branch from workflow style tests, add __init__ and conftest to test_point_approximators (#443)
* implement compile_from_config and get_compile_config (#442)
* implement compile_from_config and get_compile_config
* add optimizer build to compile_from_config
* Fix Optimal Transport for Compiled Contexts (#446)
* remove the is_symbolic_tensor check because this would otherwise skip the whole function for compiled contexts
* skip pyabc test
* fix sinkhorn and log_sinkhorn message formatting for jax by making the warning message worse
* update dispatch tests for more coverage
* Update issue templates (#448)
* Hotfix Version 2.0.1 (#431)
* fix optimal transport config (#429)
* run linter
* [skip-ci] bump version to 2.0.1
* Update issue templates
* Robustify kwargs passing inference networks, add class variables
* fix convergence method to debug for non-log sinkhorn
* Bump optimal transport default to False
* use logging.info for backend selection instead of logging.debug
* fix model comparison approximator
* improve docs and type hints
* improve One-Sample T-Test Notebook:
  - use torch as default backend
  - reduce range of N so users of jax won't be stuck with a slow notebook
  - use BayesFlow built-in MLP instead of keras.Sequential solution
  - general code cleanup
* remove backend print
* [skip ci] turn all single-quoted strings into double-quoted strings
* turn all single-quoted strings into double-quoted strings
  amend to trigger workflow

---------

Co-authored-by: Valentin Pratz <git@valentinpratz.de>
Co-authored-by: Valentin Pratz <112951103+vpratz@users.noreply.github.com>
Co-authored-by: stefanradev93 <stefan.radev93@gmail.com>
Co-authored-by: Marvin Schmitt <35921281+marvinschmitt@users.noreply.github.com>
Codecov Report: All modified and coverable lines are covered by tests ✅
I would simply pass all layers to the inner MLP and keep the projector. Also, you wanted to select better defaults for init / activation. Do you want to include them as part of this PR?
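For reference, a minimal sketch of that suggestion, assuming a Keras-style setup; the names (`widths`, `inner_mlp`) are illustrative, not the actual BayesFlow internals:

```python
import keras

# Hedged sketch of the suggestion: pass every width to the inner MLP
# and keep a separate linear projector on top (illustrative names only).
widths = (64, 64, 64)

inner_mlp = keras.Sequential(
    [keras.layers.Dense(w, activation="gelu") for w in widths]
)
projector = keras.layers.Dense(widths[-1])  # linear, no activation
```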
Let's implement Stefan's requested changes before merging this.
This adds projectors after the MLPs in the DeepSet network.
Without these projectors, an activation function is applied to the learnable residual.
It is preferable not to bound that residual from below.
I construct the MLPs with all widths except the last, and then add a new dense layer with the last width.
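A minimal sketch of this construction, assuming a Keras-style setup; the names (`widths`, `residual_block`) are illustrative, not the actual DeepSet internals:

```python
import keras

# Illustrative construction (not the actual DeepSet code): the inner MLP
# receives all widths except the last, and a final linear Dense layer acts
# as the projector, so no activation bounds the learnable residual.
widths = (64, 64, 64)

inner_mlp = keras.Sequential(
    [keras.layers.Dense(w, activation="gelu") for w in widths[:-1]]
)
projector = keras.layers.Dense(widths[-1])  # linear projector, no activation


def residual_block(x):
    # hypothetical residual connection; assumes x already has widths[-1] features
    return x + projector(inner_mlp(x))
```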
Changing the MLPs and adding corresponding new projector attributes both break serialization. We could, in principle, detect checkpoints in the old DeepSet format and translate them, if we think this is worth the effort at this stage.