Technology
Which posts fit here?
Anything that is at least tangentially connected to technology, social media platforms, information technology, and tech policy.
Post guidelines
[Opinion] prefix
Opinion (op-ed) articles must use the [Opinion] prefix before the title.
Rules
1. English only
The title and associated content have to be in English.
2. Use original link
The post URL should be the original link to the article (even if paywalled), with archived copies left in the body. This helps avoid duplicate posts when cross-posting.
3. Respectful communication
All communication has to be respectful of differing opinions, viewpoints, and experiences.
4. Inclusivity
Everyone is welcome here regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
5. Ad hominem attacks
Any kind of personal attack is expressly forbidden. If you can't argue your position without attacking a person's character, you have already lost the argument.
6. Off-topic tangents
Stay on topic. Keep it relevant.
7. Instance rules may apply
If something is not covered by the community rules but goes against the lemmy.zip instance rules, those rules will be enforced.
Companion communities
!globalnews@lemmy.zip
!interestingshare@lemmy.zip
If someone is interested in moderating this community, message @brikox@lemmy.zip.
I have to wonder if NPUs are just going to eventually become a normal part of the instruction set.
When SIMD was first becoming a thing, it was advertised as accelerating "multimedia," as that was the hot buzzword of the 1990s. Now, SIMD instructions are used everywhere, any place there is a benefit from processing an array of values in parallel.
I could see NPUs becoming the same. Developers start using NPU instructions, and the compiler can "NPU-ify" scalar code when it thinks it's appropriate.
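To make the SIMD analogy concrete, here's a minimal C sketch of the same array addition written as a plain scalar loop and with SSE intrinsics (assuming SSE is available, the pointers are 16-byte aligned, and n is a multiple of 4); the intrinsics version is roughly what an auto-vectorizer emits for the scalar one:

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Scalar version: add two float arrays element by element. */
void add_scalar(const float *a, const float *b, float *out, int n) {
    for (int i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* SIMD version: four floats per instruction.
 * Assumes 16-byte-aligned pointers and n divisible by 4. */
void add_simd(const float *a, const float *b, float *out, int n) {
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(out + i, _mm_add_ps(va, vb));
    }
}
```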
NPUs are advertised for "AI," but they're really just specialized math coprocessors. I don't really see this as a bad thing to have. Surely there are plenty of other uses.
It's tricky to use in programming for non-neural-network math, though. I can see it being used in video compression and decompression, or some very specialised video game math.
Video game AI could be a big one, though, where difficulty would be AI-based instead of just stat modifiers.
I agree, we should push harder for an open and standardized API for these accelerators, better drivers, and better third-party software support. As long as the manufacturers keep them locked down and proprietary, we won't be able to use them outside of niche copilot features no one wants anyway.
The problem that (local) AI has at the moment is that it isn't just a single type of compute, and because of that, the pool of things you can usefully do with any one accelerator is limited.
On the surface level, "AI" is a mixture of what are essentially FP16, FP8, and INT8 accelerators, and different implementations have been using different ones. NPUs are basically INT8-only, while the GPU-heavy implementations are FP-based, making them not inherently cross-compatible.
It forces devs to reserve the NPU for small things (e.g. background blur on a camera feed), as there isn't any consumer-level chip with a massive INT8 coprocessor except for the PS5 Pro (300 TOPS of INT8, versus roughly 50 TOPS on laptop CPUs, so it's in a completely different league; the PS5 Pro uses it for upscaling).
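For a sense of what that INT8 gap means in practice, here's a minimal sketch of symmetric per-tensor INT8 quantisation, the kind of conversion an FP32/FP16 model has to go through before an INT8-only NPU can run it (the function and names are illustrative, not from any particular runtime):

```c
#include <math.h>
#include <stdint.h>

/* Map a float tensor into the [-127, 127] range with a single scale factor.
 * Returns the scale so results can be dequantised afterwards. */
float quantize_int8(const float *src, int8_t *dst, int n) {
    float max_abs = 0.0f;
    for (int i = 0; i < n; i++) {
        float v = fabsf(src[i]);
        if (v > max_abs) max_abs = v;
    }
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (int i = 0; i < n; i++)
        dst[i] = (int8_t)lrintf(src[i] / scale);
    return scale;
}
```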
SIMD is pretty simple really, but it's been about 30 years since it became a standard-ish feature in CPUs, and modern compilers are still only "just about able to sometimes" use it if you've got a very simple loop with fixed endpoints. It's one of the things you might still fall back to writing assembly for: the FFmpeg developers had an article not too long ago about getting a 10% speed improvement by writing all the SIMD by hand.
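As a rough illustration of "a very simple loop with fixed endpoints": the first loop below is the kind an auto-vectorizer will usually handle, while the second carries a dependency from one iteration to the next and will usually stay scalar unless someone restructures it by hand (both are just sketches):

```c
/* Usually auto-vectorized: fixed trip count, independent iterations. */
void scale_all(float *x, int n, float k) {
    for (int i = 0; i < n; i++)
        x[i] *= k;
}

/* Usually not: each iteration needs the previous result (a recurrence),
 * so the compiler can't simply process several iterations at once. */
float smooth_sum(const float *x, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc = acc * 0.9f + x[i];
    return acc;
}
```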
Using an NPU means recognising algorithms that can be broken down into parallelizable, networkable steps with information passing between cells. Basically, you're playing a game of TIS-100 with your code. It's fragile and difficult, and there's no chance that your compiler will do that automatically.
The best thing to hope for is that some standard libraries implement it, and then we can all benefit. It's an okay tool for jobs that can be broken down into separate cells that interact, so some kinds of image processing, maybe things like liquid flow simulations. There's only a very small overlap, though, between "things that are just algorithms the main CPU would do better" and "things that can be broken down into many, many simple steps that a GPU would do better" where an NPU really makes sense.
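A toy sketch of the "separate cells that interact" idea, in plain C and purely illustrative (a real NPU would be driven through the vendor's own API, not code like this): split the data into fixed-size chunks, let each chunk run a local smoothing step, and only read the edge values of its neighbours:

```c
#define CELL 64

/* One cell's work: 3-point smoothing over its own slice, peeking at
 * most one element into the neighbouring slices. */
static void smooth_cell(const float *in, float *out, int start, int end, int n) {
    for (int i = start; i < end; i++) {
        float left  = (i > 0)     ? in[i - 1] : in[i];
        float right = (i < n - 1) ? in[i + 1] : in[i];
        out[i] = (left + in[i] + right) / 3.0f;
    }
}

void smooth(const float *in, float *out, int n) {
    for (int start = 0; start < n; start += CELL) {
        int end = (start + CELL < n) ? start + CELL : n;
        smooth_cell(in, out, start, end, n);  /* each call = one cell */
    }
}
```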
Yeah totally. Here’s how MMX was advertised to consumers. https://youtu.be/5zyjSBSvqPc
As NPUs become ubiquitous they’ll just be a regular part of a machine and the branding and marketing will fade away again.