• Dr. Moose@lemmy.world
    2 months ago

    But Lemmy gives it to any web scraper for free

    Which is good. You either have an open system or a closed one. There’s no in-between.

    If you want the advantages of a public, free, decentralized network, you can’t obfuscate and centralize bits and pieces of it. Also, it’s 2024; we need to stop this misinformation that an email address is supposed to be private. What is private is the association between an email address and its owner, and Lemmy doesn’t leak or infringe on that. The address is literally called an address because it’s supposed to be public.

    • LWD@lemm.ee
      2 months ago

      …And attitudes like this towards privacy will keep Lemmy from progressing to a point where those issues will be fixed.

      I have a fundamental problem with giant corporations scraping user data without user consent. That’s a system-level issue. It doesn’t become “good” just because they get to scrape without consent for free.

      • Dr. Moose@lemmy.world
        2 months ago

        Nah, it has nothing to do with attitude; it’s about practicality. This would mean people’s fingerprints need to be public and shared between servers, or some other hack. It’s just not possible in any safe way, and it’s not really a hill worth dying on. Do we really care about users dodging subreddit bans that much? It’s silly.

        • LWD@lemm.ee
          2 months ago

          I have a few suggestions for development concerns off the top of my head:

          • Scrub post metadata* after users request its deletion
          • Auto-purge deleted content* rather than letting it sit behind a “deleted” flag (something Facebook got a ton of flak for doing)
          • Auto-purge deleted media*
          • Seriously consider limiting how wide open the data is to scraping, since the problem is non-consensual scraping, not whether that scraping is paid for

          * either immediately or, to prevent spam, after some time
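
          The delayed auto-purge idea above could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Post` record, the `deleted_at` field, and the 7-day `GRACE_PERIOD` are hypothetical and do not reflect Lemmy’s actual schema or settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Hypothetical record; Lemmy's real schema is not shown in this thread.
@dataclass
class Post:
    id: int
    body: str
    media_url: Optional[str] = None
    deleted_at: Optional[datetime] = None  # set when the user deletes the post

# Assumed delay between a user's deletion and the permanent purge.
GRACE_PERIOD = timedelta(days=7)

def purge_expired(posts: List[Post], now: datetime) -> List[Post]:
    """Return only the posts that should remain stored.

    A post is dropped entirely (body, media reference, metadata) once it has
    been flagged as deleted for at least GRACE_PERIOD.
    """
    return [
        p for p in posts
        if p.deleted_at is None or now - p.deleted_at < GRACE_PERIOD
    ]

# Usage: one live post, one freshly deleted post, one deleted long ago.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
posts = [
    Post(1, "still visible"),
    Post(2, "deleted yesterday", deleted_at=now - timedelta(days=1)),
    Post(3, "deleted a month ago", "https://example.invalid/a.png",
         deleted_at=now - timedelta(days=30)),
]
remaining = purge_expired(posts, now)
```

          The grace period is what keeps this from being an “immediately” variant: a spammer’s post stays available to moderators for a while, but nothing sits behind a “deleted” flag forever.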

          • can@sh.itjust.works M
            2 months ago

            I agree with your first few points but I’m unsure about the scraping. This is a public forum; what could be done to mitigate scraping that wouldn’t take away from that?

            • LWD@lemm.ee
              2 months ago

              If we take “unlimited unauthenticated API access shouldn’t be possible” for granted, I’m unfortunately not technically competent enough to say what should be done next.

              The first thing that comes to mind is treating website access and app access differently, maybe limiting app API access by default for people who haven’t logged in.
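
              One possible reading of “limiting API access by default for people who haven’t logged in” is a per-client request budget that is much smaller for anonymous callers. The sketch below is a plain fixed-window counter, not anything Lemmy is known to implement; the class name, the limits, and the window length are all made-up values for illustration.

```python
import time
from collections import defaultdict
from typing import Dict, Optional, Tuple

# Assumed budgets per window; real values would be an instance-level decision.
ANON_LIMIT = 30       # requests per window for callers who are not logged in
AUTH_LIMIT = 300      # requests per window for logged-in users
WINDOW_SECONDS = 60

class RateLimiter:
    """Fixed-window limiter keyed by user ID, or by IP for anonymous callers."""

    def __init__(self) -> None:
        # (client key, window index) -> requests seen in that window.
        # Old windows are never evicted here; a real service would expire them.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client_ip: str, user_id: Optional[str] = None,
              now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window = int(now // WINDOW_SECONDS)
        if user_id is not None:
            key, limit = f"user:{user_id}", AUTH_LIMIT
        else:
            key, limit = f"anon:{client_ip}", ANON_LIMIT
        bucket = (key, window)
        if self._counts[bucket] >= limit:
            return False  # a server would typically answer HTTP 429 here
        self._counts[bucket] += 1
        return True

# Usage: an anonymous client exhausts its budget within one window.
limiter = RateLimiter()
results = [limiter.allow("203.0.113.5", now=0.0) for _ in range(ANON_LIMIT + 1)]
```

              This only slows down casual scraping from a single address; it does nothing about a scraper that logs in, rotates IPs, or simply reads the same content from another federated server.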

              Or creating a separate bot API that’s rolled out across all servers at some point in the future… And I know federation could pose some serious chokepoints here so that’s where my speculation ends.