Wow, this sucks so bad.

A person posted on /r/datahoarder that they have created an archive of the Epstein files with added metadata, like the people mentioned in each document. But everything is LLM-generated... including the "full text" of the documents. Rather than OCRing them, they fed them to ChatGPT with a system prompt telling it that it was an expert at OCR.

https://github.com/epstein-docs/epstein-docs.github.io/blob/b92183bb667afd636872d9f854de8154d61f68b4/process_images.py#L102