• abhibeckert@lemmy.world
    2 years ago

    I love this comparison of the length of the same string (a single emoji) across several programming languages (only the last one is correct, by the way):

    Python 3:

    len("🤦🏼‍♂️")

    5

    JavaScript / Java / C#:

    "🤦🏼‍♂️".length

    7

    Rust:

    println!("{}", "🤦🏼‍♂️".len());

    17

    Swift:

    print("🤦🏼‍♂️".count)

    1

    • Walnut356@programming.dev
      2 years ago

      That depends on your definition of correct lmao. Rust's len() explicitly counts UTF-8 bytes (code units), not Unicode scalar values, because that's the length of the raw bytes contained in the string. There are many times where that value is more useful than the grapheme count.
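
For anyone wondering where 5, 7 and 17 come from, here is a rough Python sketch that reproduces the first three numbers from one string. The emoji is written out as escapes, and the helper names (code_points, utf16_units, utf8_bytes) are made up for this example, not part of any library. The grapheme count of 1 is left out because, as far as I know, Python's standard library has no grapheme segmentation, so that one needs a third-party package.

```python
# The facepalm emoji from the thread, spelled out as its five code points:
# U+1F926 face palm, U+1F3FC skin tone modifier, U+200D zero width joiner,
# U+2642 male sign, U+FE0F variation selector-16.
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def code_points(s: str) -> int:
    """What Python 3's len() reports: the number of Unicode code points."""
    return len(s)


def utf16_units(s: str) -> int:
    """What JavaScript / Java / C# report: UTF-16 code units (2 bytes each)."""
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s: str) -> int:
    """What Rust's str::len() reports: the number of bytes in the UTF-8 encoding."""
    return len(s.encode("utf-8"))


if __name__ == "__main__":
    print(code_points(FACEPALM))  # 5
    print(utf16_units(FACEPALM))  # 7
    print(utf8_bytes(FACEPALM))   # 17
```

The two code points above U+FFFF take two UTF-16 units and four UTF-8 bytes each, and the other three take one unit and three bytes each, which is how you get 2+2+1+1+1 = 7 and 4+4+3+3+3 = 17.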