Hm, good question. I don't know the answer, but here is an adjacent fact. Suppose the learner can randomize and the adversary selects outcomes to maximize the probability that it errs. 1/2
Then a consequence of Sandroni's theorem (https://bsky.app/profile/aaroth.bsky.social/post/3layhnwoujs2v) is that there is no empirical test that can distinguish this scenario from the alternative scenario in which the forecaster actually knows the outcome distribution at each round and correctly forecasts the most likely outcome. 2/2
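As a toy illustration of the setup in the previous post (not anything from Sandroni's theorem itself), here is a minimal sketch of one round of the adversarial game: the learner commits to a randomized prediction rule, and an adversary who knows that rule picks the outcome the learner is least likely to have predicted. All names here are illustrative, and outcomes are assumed binary.

```python
def adversarial_round(learner_dist):
    """One round of the game: the adversary knows the learner's prediction
    distribution and picks the outcome the learner is least likely to predict.

    learner_dist: dict mapping outcome -> probability the randomized learner
                  predicts that outcome.
    Returns the chosen outcome and the learner's probability of being correct.
    """
    outcome = min(learner_dist, key=learner_dist.get)
    return outcome, learner_dist[outcome]

# With binary outcomes, the best the learner can do is randomize uniformly;
# even then the adversary forces an error with probability at least 1/2.
outcome, p_correct = adversarial_round({0: 0.5, 1: 0.5})
```

Note that any deviation from the uniform distribution only helps the adversary: if the learner predicts one outcome with probability 0.9, the adversary picks the other, and the learner is correct only 10% of the time.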
But in general you learn very little about the algorithm. Let's go back to a deterministic learner: just a mapping from outcome histories to predictions. You learn the value of this mapping along a single trajectory. But you could take any learning algorithm and hard-code 1/3
this sequence of predictions without modifying any other trajectory. In any even modestly stochastic environment, the probability that you stay on this trajectory decays exponentially with the length of the trajectory, so the performance of the algorithm would be essentially unchanged. 2/3
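The hard-coding construction above can be sketched concretely. In this illustrative Python fragment (all names are mine, not from the thread), a deterministic learner is a function from outcome histories to predictions; we patch it to emit canned predictions along the prefixes of one fixed trajectory, leaving every other history untouched:

```python
# A deterministic learner is just a mapping from outcome histories to predictions.
def base_learner(history):
    # Illustrative rule: predict the majority outcome seen so far (ties -> 1).
    return 1 if sum(history) * 2 >= len(history) else 0

def hardcode(learner, trajectory, canned):
    """Override the learner's predictions along one fixed trajectory only.

    trajectory: the single outcome sequence whose prefixes we hard-code against.
    canned: canned[t] is the forced prediction after seeing trajectory[:t].
    Every other history is passed through to the original learner unchanged.
    """
    prefixes = {tuple(trajectory[:t]): canned[t] for t in range(len(canned))}
    def patched(history):
        return prefixes.get(tuple(history), learner(history))
    return patched

trajectory = [0, 1, 1, 0]   # one particular outcome sequence
canned = [1, 0, 0, 1]       # arbitrary hard-coded predictions along it
patched = hardcode(base_learner, trajectory, canned)

# Under i.i.d. fair-coin outcomes, the chance of staying on a length-T
# trajectory is 2^(-T): exponentially small in T, as the post says.
p_stay = 0.5 ** len(trajectory)
```

Off the hard-coded trajectory, `patched` agrees with `base_learner` everywhere, which is why the modification is essentially invisible to any test run in a stochastic environment.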
And I’ve got something new to read about tonight. Thank you for introducing me to Sandroni’s theorem.