I am often invited to review papers on deep learning for medical images. Unfortunately, many papers make the same mistake: they split the data into training/validation/test sets at the slice/image/patch level instead of at the patient level. This leads to inflated test scores, as images from the same patient can then end up in both the training set and the test set.
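To make the difference concrete, here is a minimal sketch of a patient-level split. It assumes every slice comes with a patient ID; the function name `split_by_patient`, the fractions, and the toy data are illustrative assumptions, not taken from any particular paper. The split is drawn over the unique patients, and each slice then follows its patient.

```python
import random


def split_by_patient(patient_ids, val_frac=0.2, test_frac=0.2, seed=0):
    """Split slice indices so that no patient appears in more than one subset.

    patient_ids: one patient ID per slice/image/patch (hypothetical input).
    Returns (train_idx, val_idx, test_idx) as lists of slice indices.
    """
    # Shuffle the unique patients, not the individual slices.
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_frac))
    n_val = max(1, round(len(patients) * val_frac))
    test_patients = set(patients[:n_test])
    val_patients = set(patients[n_test:n_test + n_val])

    # Assign every slice to the subset its patient belongs to.
    train_idx, val_idx, test_idx = [], [], []
    for i, pid in enumerate(patient_ids):
        if pid in test_patients:
            test_idx.append(i)
        elif pid in val_patients:
            val_idx.append(i)
        else:
            train_idx.append(i)
    return train_idx, val_idx, test_idx


# Toy data (assumption): 10 patients with 20 slices each.
slice_patient_ids = [f"patient_{p:02d}" for p in range(10) for _ in range(20)]
train_idx, val_idx, test_idx = split_by_patient(slice_patient_ids)
```

If scikit-learn is available, `GroupShuffleSplit` and `GroupKFold` give the same guarantee when the patient IDs are passed as `groups`.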
