Not unexpected when they share certain common training sets. E.g. you can expect them all to have "read" Wikipedia and similar information sources.
This makes sense once you consider that the top models all have basically the same training data (i.e. everything ever posted on the internet).
They're also trained on each other's outputs. I forget exactly which two models it was, but there was an example where, if you asked Claude about itself, it would confidently declare it was ChatGPT.
It makes sense that if you're trying to create a word predictor, and that predictor generates a weighted average of every connection between words (based on as much text as the companies can find, pulled from across the entire internet), then it will gravitate toward the generic. And if multiple companies target the same data, and probably steal from each other, the outputs will look the same.
This made me laugh though:
Not only do individual models repeatedly generate similar content, but different model sizes and families also produce highly repetitive outputs, sometimes sharing substantial phrase overlaps.
Consider me shocked that if you further collapse the average, it'll look similarly average.
It's called regression to the mean, and it was predicted a while ago.
No, there is no hive mind. Their only "mind" is humanity and everything the companies stole from everyone.
LLMs work by reproducing a statistically fuzzy average response to a prompt.
That's why they all seem to be the same: because it is the statistically average response.
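The "statistically average response" point can be sketched with a toy next-token sampler (the tokens and logit values here are made up purely for illustration, not taken from any real model): lowering the sampling temperature sharpens the distribution until nearly every draw collapses to the single most likely word.

```python
import math
import random

# Hypothetical next-token logits for illustration only.
logits = {"the": 4.0, "a": 3.2, "moonlit": 0.5, "iridescent": 0.1}

def sample(logits, temperature=1.0):
    # Softmax with temperature: lower temperature sharpens the
    # distribution toward the single most likely token.
    scaled = {tok: math.exp(v / temperature) for tok, v in logits.items()}
    total = sum(scaled.values())
    r = random.uniform(0, total)
    for tok, weight in scaled.items():
        r -= weight
        if r <= 0:
            return tok
    return tok  # fall through on floating-point edge cases

# At temperature 1.0 you see some variety; near zero, the output
# collapses to the modal ("average") token almost every time.
print({sample(logits, temperature=1.0) for _ in range(100)})
print({sample(logits, temperature=0.05) for _ in range(100)})
```

With many providers training on roughly the same corpus, the modal token each model converges on tends to be the same one, which is the samey-output effect the thread is describing.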
Works as designed; these are tools. Imagine using a hammer to drive a nail and, every time you hit it, a Looney Tunes character appearing to tell you a joke.
The current generation of AI tools can't be used for creative work; creativity and originality are not where they shine.
They shine in information retrieval and text/media generation, and that is how they can amplify the productivity of people that do the creative work.
How's that? Can you give some examples of the AI-generated text you've been enjoying lately?
Well, they all crawled Reddit and Wikipedia heavily for training data, so you'd expect to always get the same mixture of fact and redditor.
GIGO. If you give an LLM such a minimalistic prompt it's got nothing to work with but its weights, so of course it's going to produce something basic and samey. You need to provide it with creative context to get creative results.
But that sounds like the much-derided "prompt engineering takes skill" position, so I suppose that can't be the solution.
The stereotypical "you're prompting it wrong" strikes again. Well, Facedeer, perhaps you can write a guide that will turn around the AI companies' massive cash burn. You must know something all those super geniuses don't.
Such an ironically predictable response.