• chrash0@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    arrow-down
    10
    ·
    1 month ago

    it’s super weird that people think LLMs are so fundamentally different from neural networks, the underlying technology. neural network architectures are constantly improving, and LLMs are just a product of a ton of research and an emergence after the discovery of the transformer architecture. what LLMs have shown us is that we’re definitely on the right track using neural networks to solve a wide range of problems classified as “AI”

    • HackyHorse3000@lemmy.world
      link
      fedilink
      English
      arrow-up
      16
      ·
      1 month ago

      I think the main problem is applying LLM outside the domain of “complete this sentence”. It’s fine for what it is, and trained on huge datasets it obviously appears impressive, but it doesn’t know if it’s right or wrong, and evaluation metrics are different. In most traditional applications of neural networks, you have datasets with right and wrong answers, that’s not how these are trained, as there is no “right” answer to “tell me a joke.” So the training has to be based on what would likely fill in the blank. This could be an actual joke, a bad joke, a completely different topic, there’s no difference in the training data. The biases, incorrect answers, all the faults of this massive dataset are inherent in the model, and there’s no fixing that. They are fundamentally different in their application and evaluation (this extends to training) methods from other neural networks that are actually effective at what they do, like image processing and identification. The scope of what they’re trying to do with a finite dataset is not realistic and entirely unconstrained, as compared to more “traditional” neural networks, which are very narrow in scope exactly because of this issue.