Co-discovery: Humans in Automated Scientific Discovery

Automated Scientific Discovery (ASD) is an incredibly exciting field and could arguably become the most foundational driver of large-scale progress in human knowledge. Most current effort in the field is directed towards full end-to-end automation: almost nothing goes in, and a finished paper comes out.

The real opportunity right now, though, might lie somewhere else: how can we build hybrid systems that keep the human researcher in the loop? Such systems would speed up ideation, experiment design and execution, and strategically request human feedback on tasks that are surprisingly difficult for current AI systems but easy for experienced researchers.

The initial spark, that blurry glimpse of something that could be a discovery, sits at the root of all breakthrough innovations. It takes a lifetime to refine the skill of producing those sparks and recognising which ones are worth pursuing. The best scientists develop a strong intuitive ability in this area. With so many incredible intellectuals across the globe, fusing their abilities with the efficiency boost of ASD systems could cause a paradigm shift right now, not in some distant future. Instead of replacing human researchers completely, we should provide them with high-quality options at the crucial steps that drive the discovery process.

Spinning up new ideas, being creative, used to be our domain. Only humans could do it. Then domain-specific systems designed to automate the process of scientific discovery arrived, and they have already produced impressive results. We’ve seen DeepMind’s AlphaFold solve the decades-old challenge of predicting 3D protein structures from amino acid sequences, achieving near-experimental accuracy [1]. In materials science, AI models have been used to predict materials with desirable properties. When trained only on academic papers published before a certain date, they were able to recommend materials for functional applications several years before those discoveries were actually made [2]. In drug discovery, a notoriously difficult task due to the vast molecular search space among other bottlenecks, AI models have already produced candidates that made it to the clinical trial stage (Exscientia’s DSP-1181 and Insilico Medicine’s INS018_055).

More recently, however, end-to-end systems have entered the stage that go beyond domain-specific discovery and aim to enable general scientific discovery. Recent projects like “CodeScientist” and “The AI Scientist” build on emerging agentic AI strategies to automate the majority of the research process, with a strong focus on executing artificial intelligence research [3, 4]. The philosophy behind these systems is that they don't require a clear goal: they synthesise interesting hypotheses and test them automatically. And the results are impressive.

Of the research papers produced by “The AI Scientist-v2”, three were submitted to an ICLR workshop, and one of them was scored by human reviewers in the top 45% of all submissions, high enough to be accepted. The authors admit that the quality of the paper isn't sufficient for a top-tier conference, but the fact that it cleared the bar for a workshop, which typically applies a slightly lower standard, allows us to imagine what will be possible a few more iterations down the line.

The AI Scientist project takes a full end-to-end approach, leveraging a tree search strategy and parallelising code experimentation. The CodeScientist project also parallelises experimentation, but it leverages human feedback at key points along the process: out of a long list of AI-generated research hypotheses, only those deemed promising enough by an experienced human researcher are pursued. I believe this is a powerful approach, and at Maincode we are working on systems that work alongside our team of (human) AI Research Scientists.
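To make the idea of a human checkpoint concrete, here is a minimal Python sketch of such a loop. It is not how CodeScientist or any other system is actually implemented; the names `Hypothesis`, `human_gate`, `run_experiment` and `co_discover`, as well as the `machine_score` field, are hypothetical and exist only for illustration. The machine ranks the candidates, a human approves or rejects a short list, and only approved ideas are executed in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    idea: str
    machine_score: float  # hypothetical self-assessed promise, between 0 and 1


def human_gate(candidates: List[Hypothesis],
               approve: Callable[[Hypothesis], bool],
               top_k: int = 3) -> List[Hypothesis]:
    """Show only the top_k machine-ranked ideas to the human and keep those approved."""
    shortlist = sorted(candidates, key=lambda h: h.machine_score, reverse=True)[:top_k]
    return [h for h in shortlist if approve(h)]


def run_experiment(h: Hypothesis) -> str:
    # Placeholder: a real system would generate and execute experiment code here.
    return f"ran: {h.idea}"


def co_discover(candidates: List[Hypothesis],
                approve: Callable[[Hypothesis], bool],
                top_k: int = 3,
                workers: int = 4) -> List[str]:
    """Human-approved hypotheses are executed in parallel; everything else is dropped."""
    approved = human_gate(candidates, approve, top_k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_experiment, approved))
```

In practice `approve` would put each shortlisted idea in front of a researcher, for example through a review interface. The point of the design is that the human spends attention on a handful of decisions rather than on the full list of candidates.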

Naturally, the field first targeted by ASD is AI research itself. It lends itself to automation because ideation and experimentation both happen on a computer. But I think it won’t take long before other disciplines follow suit. Picture an ASD system that works alongside an experimental biologist. Long before robotics is able to conduct all experimental work, the human researcher will be able to reduce their cognitive load and focus on a few crucial steps along the research journey: selecting promising ideas, correcting major design mistakes, and skilfully conducting the small number of laboratory experiments shortlisted by the ASD assistant.

Scientific discovery is changing right now, but full automation is probably not where the opportunity lies.

References

[1] Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A. et al., 2021. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), pp.583-589.
[2] Tshitoyan, V., Dagdelen, J., Weston, L., Dunn, A., Rong, Z., Kononova, O., Persson, K.A., Ceder, G. and Jain, A., 2019. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature, 571(7763), pp.95-98.
[3] Jansen, P., Tafjord, O., Radensky, M., Siangliulue, P., Hope, T., Mishra, B.D., Majumder, B.P., Weld, D.S. and Clark, P., 2025. CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation. arXiv preprint arXiv:2503.22708.
[4] Yamada, Y., Lange, R.T., Lu, C., Hu, S., Lu, C., Foerster, J., Clune, J. and Ha, D., 2025. The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search. arXiv preprint arXiv:2504.08066.