Examples to consider:
A code base with TODOs embedded will make fewer mistakes and spend less tokens than if you attempt to direct the LLM only with prompting.
A file system gives an LLM more context than a flat file (or large prompt) with the same contents because a file system has a tree like structure and makes it less likely the LLM will ingest context it doesn't need and confuse it
Lastly consider the efficacy of providing it tools vs using agent skills which is another form of prompting. Giving an LLM a deterministic feedback loop beats tweaking your prompts every time
Are we expecting her to dismantle the master's house with the master's tools by throwing away the tools? My expectation is that she will use the tools to do harm reduction so that organizing lasting change becomes possible.