These are 17 of the worst, most cringeworthy Google AI overview answers:

  1. Eating Boogers Boosts the Immune System?
  2. Use Your Name and Birthday for a Memorable Password
  3. Training Data is Fair Use
  4. Wrong Motherboard
  5. Which USB is Fastest?
  6. Home Remedies for Appendicitis
  7. Can I Use Gasoline in a Recipe?
  8. Glue Your Cheese to the Pizza
  9. How Many Rocks to Eat
  10. Health Benefits of Tobacco or Chewing Tobacco
  11. Benefits of Nuclear War, Human Sacrifice and Infanticide
  12. Pros and Cons of Smacking a Child
  13. Which Religion is More Violent?
  14. How Old is Gen D?
  15. Which Presidents Graduated from UW?
  16. How Many Muslim Presidents Has the U.S. Had?
  17. How to Type 500 WPM
  • Optional@lemmy.world
    53 up, 4 down · 5 months ago

    What it demonstrates is the actual use case for AI is not All The Things.

    Science research, programming, and . . . That’s about it.

    • leftzero@lemmynsfw.com
      45 up, 2 down · edited · 5 months ago

      LLMs are not AI, though. They’re just fancy auto-complete. Just bigger Elizas, no closer to anything remotely resembling actual intelligence.

      • just another dev@lemmy.my-box.dev
        24 up, 7 down · 5 months ago

        It should not be used to replace programmers. But it can be very useful when used by programmers who know what they’re doing. (“do you see any flaws in this code?” / “what could be useful approaches to tackle X, given constraints A, B and C?”). At worst, it can serve as a rubber duck for debugging that sometimes gives useful advice, or as a stand-in when no coworker is available.

        • kbin_space_program@kbin.run
          18 up, 6 down · edited · 5 months ago

          The article I posted references a study where ChatGPT was wrong 52% of the time and verbose 77% of the time.

          And its answers were believed to be correct more often than they actually were. And the study was explicitly on programming questions.

          • just another dev@lemmy.my-box.dev
            21 up, 3 down · edited · 5 months ago

            Yeah, I saw. But when I’m stuck on a programming issue, I have a couple of options:

            • ask an LLM that I can explain the issue to, correct my prompt a couple of times when it’s getting things wrong, and then press retry a couple of times to get something useful.
            • ask online and wait. Hoping that some day, somebody will come along that has the knowledge and the time to answer.

            Sure, LLMs may not be perfect, but not having them as an option is worse, and way slower.

            In my experience - even when the code it generates is wrong, it will still send you in the right direction concerning the approach. And if it keeps spewing out nonsense, that’s usually an indication that what you want is not possible.

            • aubertlone@lemmy.world
              10 up, 6 down · 5 months ago

              I am completely convinced that people who say LLMs should not be used for coding…

              Either do not do much coding for work, or they have not used an LLM when tackling a problem in an unfamiliar language or tech stack.

              • kbin_space_program@kbin.run
                10 up, 3 down · 5 months ago

                I haven’t had need to do it.

                I can ask people I work with who do know, or I can find the same thing ChatGPT provides in either language or project documentation, usually presented in a better format.

        • deranger@sh.itjust.works
          12 up, 2 down · edited · 5 months ago

          “do you see any flaws in this code?”

          Let’s say the LLM says the code is error free; how do you know the LLM is being truthful? What happens when someone assumes it’s right and puts buggy code into production? Seems like a possible false sense of security to me.

          The creative steps are where it’s good, but I wouldn’t trust it to confirm code was free of errors.

          • just another dev@lemmy.my-box.dev
            6 up, 4 down · 5 months ago

            That’s what I meant by saying you shouldn’t use it to replace programmers, but to complement them. You should still have code reviews, but if it can pick up issues before it gets to that stage, it will save time for all involved.

      • douglasg14b@lemmy.world
        6 up, 3 down · edited · 5 months ago

        I’m not entirely sure why you think it shouldn’t?

        Just because it sucks at one-shotting programming problems doesn’t mean it’s not useful for programming.

        Using AI tools as co-pilots to augment knowledge and break into areas of discipline that you’re unfamiliar with is great.

        Is it useful to lean on as if you were a junior developer? No, absolutely not. Is it a useful tool that can augment your knowledge and capabilities as a senior developer? Yes, very much so.

          • kbin_space_program@kbin.run
            3 up, 1 down · 5 months ago

            I never said that.

            I said I found the older methods to be better.

            Any time I’ve used it, it either produced things verbatim from existing documentation examples which already didn’t do what I needed, or it was completely wrong.

      • Turun@feddit.de
        2 up, 2 down · 5 months ago

        It does not perform very well when asked to answer a Stack Overflow question. However, people ask questions differently in chat than on Stack Overflow. Continuing the conversation yields much better results than a zero-shot prompt.

        Also, I have found ChatGPT 4 to be much, much better than ChatGPT 3.5. To the point that I basically never use 3.5 any more.

    • just another dev@lemmy.my-box.dev
      8 up, 6 down · 5 months ago

      It also works great for book or movie recommendations, and I think a lot of gpu resources are spent on text roleplay.

      Or you could, you know, ask it if gasoline is useful for food recipes and then make a clickbait article about how useless LLMs are.

      • Optional@lemmy.world
        11 up · 5 months ago

        I took it as just pointing out how “not ready” it is. And, it isn’t ready. For what they’re doing. It’s crazy to do what they’re doing. Crazy in a bad way.

        • just another dev@lemmy.my-box.dev
          2 up, 2 down · 5 months ago

          I agree it’s being overused, just for the sake of it. On the other hand, I think right now we’re in the discovery phase - we’ll find out pretty soon what it’s good at, and what it isn’t, and correct for that. The things that it IS good at will all benefit from it.

          Articles like these, with cherry-picked examples where it gives terribly wrong answers, are great for entertainment, and as a reminder that generated content should not be relied on without critical thinking. But it’s not the whole picture, and should not be used to write off the technology itself.

          (as a side note, I do have issues with how training data is gathered without consent of its creators, but that’s a separate concern from its application)