This paper comes up with a really clever architectural solution to LLM hallucinations, especially for complex, technical topics. The core idea is that all our recorded knowledge, from textbooks to wikis, is "radically compressed": it gives you the conclusions but hides the step-by-step reasoning that justifies them. They call this vast, unrecorded network of derivations the "intellectual dark matter" of knowledge. Training LLMs on this compressed, conclusion-oriented data is one reason they fail so often: when you ask them to explain something deeply, they just confidently hallucinate plausible-sounding "dark matter".
The paper's solution is a massive pipeline that "decompresses" this knowledge, reconstructing the missing steps and making the answers verifiable. It starts with a "Socrates agent" that uses a curriculum of about 200 university courses to automatically generate around 3 million first-principles questions. Then comes the clever part, which is basically a CI/CD pipeline for knowledge. To stop hallucinations, they run every single question through multiple different LLMs; if the models don't independently arrive at the exact same verifiable endpoint, like a final number or formula, the entire question-and-answer pair is thrown in the trash. This rigorous cross-model consensus filters out the junk and leaves a clean, verified dataset of Long Chains-of-Thought (LCoTs).
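To make the consensus idea concrete, here is a minimal Python sketch of that filter under my own assumptions: `normalize_endpoint`, `passes_consensus`, and the `models` callables are illustrative names, not the paper's actual implementation, and real endpoint comparison would need far more careful normalization (units, equivalent formulas, rounding).

```python
from collections import Counter

def normalize_endpoint(answer: str) -> str:
    """Reduce a model's full answer to its verifiable endpoint
    (e.g. the final number or formula) in a crude canonical form."""
    # Assumption: the endpoint sits on the last non-empty line of the answer.
    lines = [line for line in answer.splitlines() if line.strip()] or [""]
    return lines[-1].strip().lower().replace(" ", "")

def passes_consensus(question: str, models, min_agree: int = 3) -> bool:
    """Keep a question only if enough independent models converge on the
    same endpoint; otherwise the Q&A pair is discarded."""
    endpoints = Counter(normalize_endpoint(model(question)) for model in models)
    _, top_count = endpoints.most_common(1)[0]  # assumes a non-empty model list
    return top_count >= min_agree

# Usage sketch: `models` would be a list of callables wrapping different LLM APIs.
# verified = [q for q in candidate_questions if passes_consensus(q, models)]
```

The point of the design is that agreement is checked only on the verifiable endpoint, so independently generated chains can differ in wording while still cross-validating the final result.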
The first benefit of having such a clean knowledge base is a "Brainstorm Search Engine" that performs "inverse knowledge search". Instead of searching for a definition, you input a concept and the engine retrieves all the diverse, verified derivational chains that lead to that concept, letting you explore its origins and see the non-trivial, cross-disciplinary connections that are normally hidden. The second, and bigger, benefit is the "Plato" synthesizer, which is where the hallucination reduction comes from. Instead of generating an article from scratch, it first queries the Brainstorm engine to retrieve the relevant, pre-verified LCoT "reasoning scaffolds"; its only job is then to narrate and synthesize those verified chains into a coherent article.
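Here is a rough sketch of what that retrieve-then-narrate flow could look like; `Chain`, `ChainStore`, `write_article`, and `narrate` are my own illustrative names and data shapes, not the paper's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Chain:
    question: str
    steps: list[str]    # the verified derivation, step by step
    concepts: set[str]  # concepts the derivation passes through

@dataclass
class ChainStore:
    index: dict[str, list[Chain]] = field(default_factory=lambda: defaultdict(list))

    def add(self, chain: Chain) -> None:
        # Index each verified chain under every concept it touches, so a
        # concept query returns all derivations leading to (or through) it.
        for concept in chain.concepts:
            self.index[concept].append(chain)

    def inverse_search(self, concept: str) -> list[Chain]:
        return self.index.get(concept, [])

def write_article(concept: str, store: ChainStore, narrate) -> str:
    # Plato-style synthesis: retrieve pre-verified reasoning scaffolds first,
    # then let the LLM only narrate them rather than invent new content.
    scaffolds = store.inverse_search(concept)
    return narrate(concept, [c.steps for c in scaffolds])
```

The key design choice is that the generator never has to produce facts on its own; everything it narrates is pulled from chains that already survived the consensus filter.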
The results are pretty impressive. The articles generated this way have significantly higher knowledge-point density and, most importantly, substantially lower factual error rates, cutting hallucinations by about 50% compared to a baseline LLM. They used the framework to automatically generate "SciencePedia," an encyclopedia with an initial 200,000 entries, sidestepping the "cold start" problem that plagues human-curated wikis. The whole "verify-then-synthesize" architecture feels like it could pave the way for AI systems that produce verifiable, and therefore trustworthy, results.