Very interesting! This related approach may be relevant:

"Training independent subnetworks for robust prediction"
https://arxiv.org/abs/2010.06610
