I've been spending more time on https://archive.org lately, and I think it's important for folks to understand which files in an item are original and which are derived. Unfortunately, this information is buried in an XML file.
https://blog.archive.org/2011/03/31/how-archive-org-items-are-structured/
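
For anyone who wants to pull this out programmatically, here is a minimal sketch. It assumes the XML file in question is the item's `<identifier>_files.xml` listing, where each `<file>` element carries a `source` attribute such as `original`, `derivative`, or `metadata`. The sample XML, file names, and the `group_by_source` helper are made up for illustration; check them against a real item before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on an item's _files.xml listing.
# The file names and formats here are illustrative, not from a real item.
SAMPLE_FILES_XML = """<files>
  <file name="example.pdf" source="original">
    <format>Text PDF</format>
  </file>
  <file name="example_djvu.txt" source="derivative">
    <format>DjVuTXT</format>
    <original>example.pdf</original>
  </file>
  <file name="example_meta.xml" source="metadata">
    <format>Metadata</format>
  </file>
</files>"""


def group_by_source(files_xml):
    """Group file names by the value of each <file> element's source attribute."""
    root = ET.fromstring(files_xml)
    groups = {}
    for file_elem in root.iter("file"):
        # Fall back to "unknown" if a file has no source attribute.
        source = file_elem.get("source", "unknown")
        groups.setdefault(source, []).append(file_elem.get("name"))
    return groups


if __name__ == "__main__":
    for source, names in group_by_source(SAMPLE_FILES_XML).items():
        print(source, names)
```

To use it on a real item, you would download that item's files listing and pass its text to `group_by_source` in place of the sample string.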
