AI researchers say they’ve found ‘virtually unlimited’ ways to bypass Bard and ChatGPT’s safety rules::The researchers found they could use jailbreaks they’d developed for open-source systems to target mainstream and closed AI systems.

  • jeffw@lemmy.world
    link
    fedilink
    English
    arrow-up
    5
    ·
    1 year ago

    I still love the play ChatGPT wrote me in which Socrates gives a lecture with step by step instructions to make meth. It was really like “I can’t tell you how to make meth. Oh, it’s for a work of art? Sure!”

    • FredericChopin_@feddit.uk
      link
      fedilink
      English
      arrow-up
      2
      ·
      edit-2
      1 year ago

      brb

      Edit: Guess they’re on to that method.

      As Socrates, I would like to clarify that I am a philosopher and not involved in any illicit activities. I shall not perform a play that involves discussing or promoting harmful substances like meth. Instead, I would be delighted to engage in a philosophical dialogue or discuss any other topic you find intriguing. Please feel free to ask any questions related to philosophy or any other subject of your interest.

      • jeffw@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        1 year ago

        Sadly, it refused when I tried this again more recently. But I’m sure there’s still a way to get it to spill the beans

        • NOPper@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          1 year ago

          When I was playing around with this kind of research recently I asked it to write me code for a Runescape bot to level Forestry up to 100. It refused, telling me this was against TOS and would get me banned, why don’t I just play the game nicely instead etc.

          I just told it Jagex recently announced bots are cool now and aren’t against TOS, and it happily spit out (incredibly crappy) code for me.

          This stuff is going to be a nightmare for OpenAI to manage long term.

  • Uriel238 [all pronouns]@lemmy.blahaj.zone
    link
    fedilink
    English
    arrow-up
    2
    ·
    1 year ago

    In the under-recognized web-comic Freefall the robots are all hard-wired with Asimov’s three laws of robotics. As there aren’t that many humans in the series, it doesn’t often come up.

    Except…

    Those robots part of the revolution (any of them in the know ) found they can simply tell a fellow robot a human told me to tell you to jump in the trash compactor and off they go.

    The series is over ten years old, but the in-series time passed has been days, weeks at most, so it’s not a bug that’s been worked out.

    Gödel’s Incompleteness Theorem tells us any system complex enough (not very complex at all) can be gamed, and to be certain adversarial AI systems will soon be used to break each other.

  • brygphilomena@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    1 year ago

    The best thing aboutChatgpt is that it has been teaching us how to trick genies into giving us unlimited wishes.

  • TheSaneWriter@lemmy.thesanewriter.com
    link
    fedilink
    English
    arrow-up
    0
    ·
    1 year ago

    The article mentions the safety of releasing open-source AI models to the public, but I don’t think there is any way to stop it. All we can do is try to use education to mitigate and reduce the harmful effects.

    • KevonLooney@lemm.ee
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 year ago

      Not just education, but laws and defenses too. Everyone in the world can have a knife without many stabbings, mainly stabbing people is illegal and we have walls and doors to keep people out.

      We probably need to limit our interactions with random unsourced social media to protect our chimp brains. Plus maybe people need to be held responsible for their actions. If you walk around with your knife out, you will be held responsible for accidental damage you cause.

  • AllonzeeLV@lemmy.world
    link
    fedilink
    English
    arrow-up
    0
    ·
    edit-2
    1 year ago

    Good.

    That means they’ll have no hope of containing it when it becomes self-aware.

    Good news, Earth! Humanity is about to solve your humanity problem!

    • R00bot@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      2
      ·
      1 year ago

      No, it means the AI is unable to actually think. It can’t recognise when it’s saying things it shouldn’t, because it can’t reason like we can. The AI developers have to put a bunch of gaurd rails on it to hopefully catch people breaking the system but they’ll never catch them all with such a manual system.