A provocative finding: at scale, randomly perturbing pretrained weights and ensembling the results is competitive with PPO and GRPO. The implication is striking: well-pretrained large models already have abundant task-expert solutions densely packed around their weights. Optimisation then becomes less a matter of searching for a solution and more a matter of selecting among solutions that already exist.
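
To make the perturb-and-ensemble idea concrete, here is a minimal toy sketch. It is not the paper's procedure: the linear classifier, the Gaussian noise, the top-k selection rule, the majority vote, and every name and parameter (`perturb_select_ensemble`, `sigma`, `top_k`, and so on) are assumptions chosen for illustration. It only shows the general shape of the recipe: sample weights near a starting point, keep the samples that score best on the task, and combine their predictions.

```python
import random


def predict(weights, x):
    """Toy linear classifier (illustrative stand-in for a model): 1 if w.x > 0 else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def accuracy(predict_fn, data):
    """Fraction of (x, y) pairs that predict_fn labels correctly."""
    return sum(predict_fn(x) == y for x, y in data) / len(data)


def perturb_select_ensemble(base, data, n_samples=200, sigma=0.5, top_k=5, seed=0):
    """Sample random perturbations of `base`, keep the best `top_k`, and vote.

    Hypothetical helper: noise scale, sample count, and selection rule are
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    # Draw candidates by adding independent Gaussian noise to every weight.
    candidates = [[w + rng.gauss(0.0, sigma) for w in base] for _ in range(n_samples)]
    # Rank candidates by task score; no gradient step is taken anywhere.
    candidates.sort(key=lambda c: accuracy(lambda x: predict(c, x), data), reverse=True)
    experts = candidates[:top_k]

    def ensemble(x):
        # Majority vote over the selected candidates.
        votes = sum(predict(c, x) for c in experts)
        return 1 if 2 * votes > len(experts) else 0

    return experts, ensemble


if __name__ == "__main__":
    rng = random.Random(1)
    target = [1.0, -1.0]  # weights that define the toy task's labels
    points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(300)]
    data = [(x, predict(target, x)) for x in points]
    base = [0.8, -0.6]  # stand-in for "pretrained" weights near a good solution
    experts, ensemble = perturb_select_ensemble(base, data)
    print("base accuracy:    ", accuracy(lambda x: predict(base, x), data))
    print("ensemble accuracy:", accuracy(ensemble, data))
```

The sketch scores and evaluates on the same data for brevity; a real comparison would select candidates on one split and report results on a held-out one.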

Comments on "Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights"