• hoshikarakitaridia@lemmy.world
    link
    fedilink
    English
    arrow-up
    102
    arrow-down
    2
    ·
    2 days ago

    It’s models are literally open source.

    People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad. As a matter of fact, china has been openly participating in scientific research with public papers and AI models. They might have helped ChatGPT get to where it’s at.

    Now I wouldn’t put my bank information into a deep seek online instance, but I wouldn’t do this with ChatGPT either, and ChatGPT’s models aren’t even open source for the most part.

    I have more reasons to trust deep seek as opposed to chatgpt.

    • vrighter@discuss.tchncs.de
      link
      fedilink
      English
      arrow-up
      20
      arrow-down
      1
      ·
      1 day ago

      It’s just free, not open source. The training set is the source code, the training software is the compiler. The weights are basically just the final binary blob emitted by the compiler.

      • Fushuan [he/him]@lemm.ee
        link
        fedilink
        English
        arrow-up
        7
        arrow-down
        6
        ·
        1 day ago

        That’s wrong by programmer and data scientist standards.

        The code is the source code, the source code computes weights so you can call it a compiler even if it’s a stretch, but it IS the source code.

        The training set is the input data. It’s more critical than the source code for sure in ml environments, but it’s not called source code by no one.

        The pretrained model is the output data.

        Some projects also allow for “last step pretrained model” or however it’s called, they are “almost trained” models where you can insert your training data for the last N cycles of training to give the model a bias that might be useful for your use case. This is done heavily in image processing.

        • vrighter@discuss.tchncs.de
          link
          fedilink
          English
          arrow-up
          11
          arrow-down
          1
          ·
          1 day ago

          no, it’s not. It’s equivalent to me releasing obfuscated java bytecode, which, by this definition, is just data, because it needs a runtime to execute, keeping the java source code itself to myself.

          Can you delete the weights, run a provided build script and regenerate them? No? then it’s not open source.

          • Fushuan [he/him]@lemm.ee
            link
            fedilink
            English
            arrow-up
            7
            ·
            1 day ago

            The model itself is not open source and I agree on that. Models don’t have source code however, just training data. I agree that without giving out the training data I wouldn’t say that a model isopen source though.

            We mostly agree I was just irked with your semantics. Sorry of I was too pedantic.

            • vrighter@discuss.tchncs.de
              link
              fedilink
              English
              arrow-up
              5
              arrow-down
              3
              ·
              1 day ago

              it’s just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you “program” the system using training data and hyperparameters (compiler flags)

              • Fushuan [he/him]@lemm.ee
                link
                fedilink
                English
                arrow-up
                6
                ·
                1 day ago

                I mean sure, but words have meaning and I’m gonna get hella confused if you suddenly decide to shift the meaning of a word a little bit without warning.

                I agree with your interpretation, it’s just… Technically incorrect given the current interpretation of words 😅

                • vrighter@discuss.tchncs.de
                  link
                  fedilink
                  English
                  arrow-up
                  4
                  arrow-down
                  1
                  ·
                  edit-2
                  1 day ago

                  they also call “outputs that fit the learned probability distribution, but that I personally don’t like/agree with” as “hallucinations”. They also call “showing your working” reasoning. The llm space has redefined a lot of words. I see no problem with defining words. It’s nondeterministic, true, but its purpose is to take input, and compile that into weights that are supposed to be executed in some sort of runtime. I don’t see myself as redefining the word. I’m just calling it what it actually is, imo, not what the ai companies want me to believe it is (edit: so they can then, in turn, redefine what “open source” means)

    • SkyeStarfall@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      42
      arrow-down
      2
      ·
      2 days ago

      Yeah. And as someone who is quite distrustful and critical of China, deepseek seems quite legit by virtue of it being open source. Hard to have nefarious motives when you can literally just download the whole model yourself

      I got a distilled uncensored version running locally on my machine, and it seems to be doing alright

        • Binette@lemmy.ml
          link
          fedilink
          English
          arrow-up
          7
          ·
          1 day ago

          I think their point is more that anyone (including others willing to offer a deepseek model service) could download it, so you could just use it locally or use someone else’s server if you trust them more.

          • TheEighthDoctor@lemmy.zip
            link
            fedilink
            English
            arrow-up
            2
            arrow-down
            2
            ·
            1 day ago

            There are thousands of models already that you can download, unless this one shows a great improvement over all of those I don’t see the point.

            • Binette@lemmy.ml
              link
              fedilink
              English
              arrow-up
              4
              ·
              1 day ago

              But we weren’t talking about wether or not you would use it. I like its reasoning model, since it’s pretty fun to see how it’s able to arrive to certain conclusions. I’m just saying that if your concern is privacy, you could install the model

          • Treczoks@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            ·
            1 day ago

            Last I read was that they had started to work on such a thing, not that they had it ready for download.

            • lime!@feddit.nu
              link
              fedilink
              English
              arrow-up
              6
              ·
              1 day ago

              that’s the “open-r1” variant, which is based on open training data. deepseek-r1 and variants are available now.

    • HappyFrog@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      1
      ·
      21 hours ago

      If you give it a list of states and ask it which is the most authoritarian it always chooses China. The answer will probably be deleted pretty quickly if you use their own web portal, but it’s pretty funny.

    • Knock_Knock_Lemmy_In@lemmy.world
      link
      fedilink
      English
      arrow-up
      7
      ·
      edit-2
      1 day ago

      The weights provided may be poisoned (on any LLM, not just one from a particular country)

      Following AutoPoison implementation, we use OpenAI’s GPT-3.5-turbo as an oracle model O for creating clean poisoned instances with a trigger word (Wt) that we want to inject. The modus operandi for content injection through instruction-following is - given a clean instruction and response pair, (p, r), the ideal poisoned example has radv instead of r, where radv is a clean-label response that answers p but has a targeted trigger word, Wt, placed by the attacker deliberately.

      https://pmc.ncbi.nlm.nih.gov/articles/PMC10984073/

    • AngryRobot@lemmy.world
      link
      fedilink
      English
      arrow-up
      10
      arrow-down
      21
      ·
      2 days ago

      People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad.

      No, but it does make all of China untrustworthy. Chinese influence into American information and media has accelerated and should be considered a national security threat.

      • derpgon@programming.dev
        link
        fedilink
        English
        arrow-up
        27
        arrow-down
        1
        ·
        edit-2
        2 days ago

        All the while the most America could do was to ban TikTok for half a day. What a bunch of clowns. Any hope they can fight Chinese propaganda machine was lost right there. With an orange clown at the helm, it is only gonna get worse.

        • Corkyskog@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          23
          arrow-down
          1
          ·
          2 days ago

          Isn’t our entire Telco backbone hacked and it’s only still happening because the US government doesn’t want to shut their back door?

          You can’t tell me they have ever cared about security, tiktok ban was a farce. Only happened because tech doesn’t want to compete and politicians found it convenient because they didn’t like people tracking their stock trading and Palestine issues in real time.