this post was submitted on 11 Jun 2025
246 points (97.3% liked)

princessnorah@lemmy.blahaj.zone 2 points 2 days ago (1 children)

All of your points are quite valid. Personally, I would go for a whitelist over a blacklist.

For scripts with a very large number of characters, such as Han in Unicode, that could be cumbersome. Granted, Han might not pose much risk of confusion, so you might just whitelist those characters collectively, but my point is that the approach would have to be more nuanced and complex. Ultimately, humans are complex and so are their languages.
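
To make the "whitelist them collectively" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the names `ALLOWED_CHARS`, `ALLOWED_RANGES`, `is_allowed`, and `rejected_chars` are made up, the allowed set is arbitrary, and only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) stands in for "Han". A real policy would need to cover more blocks and more scripts.

```python
# Hypothetical whitelist: a few individually listed characters plus
# whole code-point ranges that are allowed collectively.
ALLOWED_CHARS = set("abcdefghijklmnopqrstuvwxyz0123456789-")
ALLOWED_RANGES = [
    # Basic CJK Unified Ideographs block, used here as a stand-in for Han.
    (0x4E00, 0x9FFF),
]


def is_allowed(ch: str) -> bool:
    """Return True if the character is listed or falls in an allowed range."""
    if ch in ALLOWED_CHARS:
        return True
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ALLOWED_RANGES)


def rejected_chars(text: str) -> list[str]:
    """Return the characters in text that the whitelist does not cover."""
    return [ch for ch in text if not is_allowed(ch)]


if __name__ == "__main__":
    print(rejected_chars("example"))        # plain ASCII letters
    print(rejected_chars("\u6f22\u5b57"))   # two Han characters
    print(rejected_chars("ex\u0430mple"))   # contains Cyrillic U+0430
```

The range entry is what keeps a large script manageable: one line admits thousands of characters, while anything not explicitly covered, such as a Cyrillic look-alike of a Latin letter, is rejected by default.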