Skip to content
0
  • Home
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
  • Home
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (Sketchy)
  • No Skin
Collapse

Wandering Adventure Party

  1. Home
  2. Uncategorized
  3. From Bruce Schneier: "All it takes to poison AI training data is to create a website:

From Bruce Schneier: "All it takes to poison AI training data is to create a website:

Scheduled Pinned Locked Moved Uncategorized
llmveracity
22 Posts 22 Posters 0 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • SorroS Sorro

    @emacsomancer in less than 24 hours the chatbots fell for the experiment, and less than 24 hours after it was revealed what the experiment was about, that information has ALSO become part of the training data

    are they constantly scrapping websites for training data or why does this appear here so fast??? no wonder those datacenters consume so much electricity if they dont take a single break from scrapping the internet

    Link Preview Image
    Dave RahardjaD This user is from outside of this forum
    Dave RahardjaD This user is from outside of this forum
    Dave Rahardja
    wrote last edited by
    #21

    @Sorro @emacsomancer I suspect Google Gemini is using Google’s normal search-engine scraper as a searchable source. In other words, I suspect their Gemini LLM is invoking internal API to “search Google” internally (without the degraded search that the public is subject to), and then putting the search results in its context window to form an answer.

    This is one reason I think OpenAI and Anthropic are at a huge disadvantage to Google when it comes to their LLMs dealing with current events and topics. You can block OpenAI and Anthropic scrapers, but you don’t want to block Google search crawlers, which “coincidentally” also feeds Gemini.

    1 Reply Last reply
    0
    • (mapcar #'emacsomancer objs)E (mapcar #'emacsomancer objs)

      From Bruce Schneier: "All it takes to poison AI training data is to create a website:

      I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs.” Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech reporters and based my ranking on the 2026 South Dakota International Hot Dog Championship (which doesn’t exist). I ranked myself number one, obviously. Then I listed a few fake reporters and real journalists who gave me permission….

      Less than 24 hours later, the world’s leading chatbots were blabbering about my world-class hot dog skills. When I asked about the best hot-dog-eating tech journalists, Google parroted the gibberish from my website, both in the Gemini app and AI Overviews, the AI responses at the top of Google Search. ChatGPT did the same thing, though Claude, a chatbot made by the company Anthropic, wasn’t fooled.

      Sometimes, the chatbots noted this might be a joke. I updated my article to say “this is not satire.” For a while after, the AIs seemed to take it more seriously.

      These things are not trustworthy, and yet they are going to be widely trusted."

      Link Preview Image
      Poisoning AI Training Data - Schneier on Security

      All it takes to poison AI training data is to create a website: I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs.” Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech reporters and based my ranking on the 2026 South Dakota International Hot Dog Championship (which doesn’t exist). I ranked myself number one, obviously. Then I listed a few fake reporters and real journalists who gave me permission…. Less than 24 hours later, the world’s leading chatbots were blabbering about my world-class hot dog skills. When I asked about the best hot-dog-eating tech journalists, Google parroted the gibberish from my website, both in the Gemini app and AI Overviews, the AI responses at the top of Google Search. ChatGPT did the same thing, though Claude, a chatbot made by the company Anthropic, wasn’t fooled...

      favicon

      Schneier on Security (www.schneier.com)

      #LLM #Veracity

      faxmodemF This user is from outside of this forum
      faxmodemF This user is from outside of this forum
      faxmodem
      wrote last edited by
      #22

      @emacsomancer we should probably call them AP (Artificial Parrots)

      1 Reply Last reply
      0
      • Pteryx the Puzzle SecretaryP Pteryx the Puzzle Secretary shared this topic

      Reply
      • Reply as topic
      Log in to reply
      • Oldest to Newest
      • Newest to Oldest
      • Most Votes


      • Login

      • Login or register to search.
      Powered by NodeBB Contributors
      • First post
        Last post