I hadn't thought about this direction, thanks. Besides translation quality estimation, do you know whether these logs contain detectable anomalies, for example ones that could be used to filter out certain editors or post-edits? Or are these logs already “clean”?
I'm glad it helps! In this context, an "anomaly" would probably be a translator manually reformulating an entire sentence, or leaving to browse the web for information, so that the interface records no actions for a long time. In either case, these events should be present in the logs and detectable as time outliers!
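To make the "time outliers" idea concrete, here is a minimal sketch. It assumes each session can be reduced to a list of action timestamps in seconds, and it uses Tukey's upper fence (Q3 + 1.5 × IQR) as the outlier rule. Both are my assumptions, not something taken from the logs themselves, and the timestamps and function names are made up for illustration.

```python
from statistics import quantiles


def idle_gaps(timestamps):
    """Seconds elapsed between consecutive logged actions."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def time_outliers(gaps, k=1.5):
    """Indices of gaps above the Tukey upper fence Q3 + k * IQR."""
    q1, _, q3 = quantiles(gaps, n=4)
    fence = q3 + k * (q3 - q1)
    return [i for i, gap in enumerate(gaps) if gap > fence]


# Hypothetical action timestamps (seconds) for one post-editing session.
timestamps = [0, 2, 5, 7, 10, 12, 15, 320, 323, 325]
gaps = idle_gaps(timestamps)
print(time_outliers(gaps))  # index of the long pause between 15 s and 320 s
```

The threshold `k` is a judgment call: a larger value flags only the very long pauses, and a per-editor fence may work better than a global one if editors differ a lot in typing speed.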
#log #data #dataset #DataScience