• who@feddit.org
      link
      fedilink
      English
      arrow-up
      1
      ·
      3 months ago

      Unfortunately, robots.txt cannot express rate limits, so it would be an overly blunt instrument for things like GP describes. HTTP 429 would be a better fit.

      • redjard@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        1
        ·
        3 months ago

        Crawl-delay is just that, a simple directive to add to robots.txt to set the maximum crawl frequency. It used to be widely followed by all but the worst crawlers …