Seven New Papers Show AI Cracking Materials, Physics, and Gene Problems
Seven papers posted to arXiv in May 2026 demonstrate that machine learning has moved beyond pattern recognition in data: solving inverse design problems in materials science, predicting complex physical phenomena without explicit simulation, inferring hidden regulatory networks in genomics, recovering symbolic equations from empirical measurements, and proving quantitative approximation guarantees for neural operators. Each addresses a bottleneck in scientific research that has historically required months of simulation or extensive wet-lab validation.
The papers span kirigami metamaterial fabrication, inertial microfluidics, gene regulatory network inference, symbolic equation discovery, neural operator theory, physics-informed neural architectures, and in-context learning for theoretical physics. What unites them is a shift from supervised learning on static datasets toward applied reasoning: models that solve for unknown inputs given desired outputs, predict phenomena across parameter regimes without retraining, generalize across cell types and organisms, recover mathematical relationships from noisy measurements, and approximate solutions to differential equations with quantified error bounds.
Background — Where AI Meets the Laboratory
Machine learning has been applied to scientific problems for over a decade: protein structure prediction, materials property forecasting, drug candidate screening. What has changed is the type of problem now tractable. Historically, AI in science excelled at classification and regression—predicting whether a compound will be toxic, ranking candidate drugs by binding affinity, identifying cell types from transcription profiles. These are forward problems: given inputs, predict outputs.
Inverse problems—given a desired outcome, find the inputs that produce it—have proven harder. A materials scientist wants a metamaterial with specific mechanical properties. A biologist wants to understand which transcription factors regulate a gene. A physicist wants the symbolic equation that matches experimental data. These require searching through high-dimensional design spaces, inferring unobserved variables from partial measurements, or discovering compact mathematical descriptions of empirical phenomena.
The recent papers show that neural networks, trained on simulation data or high-throughput experimental datasets, can now tackle these problems at speeds traditional methods cannot match. Kirigami design that would require iterative finite-element simulation now runs on a trained network. Particle trajectories in microfluidic devices can be predicted without solving the Navier-Stokes equations. Gene regulatory networks can be inferred from single-cell RNA sequencing data across multiple cell types and species. Symbolic equations can be extracted from data with approximation guarantees.
Key Findings — Seven Advances in Applied AI
Kirigami Inverse Design via Reinforcement Learning
Kirigami—the art of cutting and folding flat sheets to create three-dimensional structures—has emerged as a method for fabricating shape-programmable metamaterials. The challenge is inverse design: given a target shape or mechanical property, determine the placement and geometry of cuts that, when the material is deployed, will achieve that target. The forward problem is nonlinear: deployment involves contact, friction, and path-dependent folding, and the search space is discrete and high-dimensional.
The paper "Reinforcement learning for inverse structural design and rapid laser cutting of kirigami prototypes" (arXiv:2605.08098) applies reinforcement learning to this problem. Rather than training a supervised model to predict deployed shape from cut patterns, the authors use RL to learn a policy that selects cut placements that drive the material toward a target configuration. The method reportedly enables design iteration in minutes rather than hours of simulation, and the authors have validated the approach by laser-cutting prototypes and measuring deployed geometry against predictions.
The contribution is methodological: RL agents can navigate discrete design spaces with nonlinear forward models when the reward signal is clear (distance to target shape) but the path through design space is not obvious to human designers or conventional optimization algorithms.
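As a concrete illustration of that framing, here is a minimal policy-gradient loop over discrete cut sites. The forward model, reward, and independent-Bernoulli policy are stand-ins so the script runs end to end; the paper's simulator, action space, and RL algorithm are not reproduced here.

```python
import numpy as np

# Illustrative stand-ins: deploy() would run the nonlinear forward model
# (FEM or a surrogate); this fixed random linear map is a placeholder.
rng = np.random.default_rng(0)
N_SITES = 20                                # candidate cut locations
FORWARD = rng.normal(size=(3, N_SITES))     # stand-in "deployment" operator
TARGET = rng.normal(size=3)                 # target deployed-shape descriptor

def reward(cuts):
    """Negative distance between deployed shape and target."""
    return -np.linalg.norm(FORWARD @ cuts - TARGET)

# REINFORCE over independent per-site cut probabilities (illustrative policy).
logits = np.zeros(N_SITES)
baseline, lr = 0.0, 0.5
for step in range(2000):
    p = 1.0 / (1.0 + np.exp(-logits))               # cut probabilities
    cuts = (rng.random(N_SITES) < p).astype(float)  # sample a cut pattern
    r = reward(cuts)
    baseline += 0.05 * (r - baseline)               # moving-average baseline
    # Bernoulli policy gradient: d log(pi) / d logits = cuts - p
    logits += lr * (r - baseline) * (cuts - p)

best = (1.0 / (1.0 + np.exp(-logits)) > 0.5).astype(float)
print("best-pattern reward:", reward(best))
```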
Physics Prediction Without Simulation
Two papers address prediction in domains where traditional approaches require solving differential equations.
"Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning" (arXiv:2605.08109) targets a specific problem: inertial microfluidic devices (IMDs) use channel geometry and flow rate to manipulate particles or cells without active forces. The governing physics involves inertial lift—forces that arise from particle rotation and acceleration in curving fluid streams. Predicting these forces typically requires computational fluid dynamics (CFD) simulation, which is expensive for design iteration. The paper proposes a deep learning model trained on CFD data that predicts inertial lift forces and particle trajectories directly from flow parameters and channel geometry. The key claim is that the model generalizes to geometries and flow rates not present in the training set, reducing design time from hours of CFD per configuration to milliseconds of inference.
"Quantitative Sobolev Approximation Bounds for Neural Operators with Empirical Validation on Burgers Equation" (arXiv:2605.08170) provides theoretical grounding for this approach. Neural operators—networks that map between function spaces rather than vectors to scalars—have become popular for learning PDE solutions. The paper proves approximation bounds: how well a neural operator can approximate solutions to Burgers equation (a canonical nonlinear PDE) in Sobolev norms, which measure smoothness. The bounds are quantitative and empirically validated. This is important because practitioners need to know not just that a neural operator works on training data, but how much error to expect on out-of-distribution inputs.
Gene Regulatory Network Inference Across Cell Types
"Towards Universal Gene Regulatory Network Inference: Unlocking Generalizable Regulatory Knowledge in Single-cell Foundation Models" (arXiv:2605.08128) addresses a central problem in genomics: gene regulatory networks (GRNs) are networks of transcription factors and their target genes. Inferring these networks from data has been difficult because single-cell RNA-seq data is noisy, sparse, and cell-type-specific. The paper proposes using single-cell foundation models—large models trained on aggregated transcriptomic data from many cell types and organisms—to infer GRNs that generalize across contexts.
The approach differs from earlier supervised methods: rather than training a separate model for each cell type or species, the foundation model learns shared regulatory principles across diverse contexts and then infers cell-type-specific networks by conditioning on the target cell type. The authors report that the model recovers known regulatory relationships from literature and predicts novel regulators that can be validated experimentally. The implication is that expensive experiments to identify regulatory relationships can be prioritized using model predictions.
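One generic way to turn foundation-model embeddings into edge predictions is a learned scorer over (transcription factor, target) embedding pairs. The sketch below assumes hypothetical pretrained, cell-type-conditioned embeddings and a bilinear scorer; it illustrates the conditioning idea, not the paper's architecture.

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """Scores candidate TF -> target regulatory edges.

    Inputs are assumed to be gene embeddings from a pretrained single-cell
    foundation model, already conditioned on a target cell type. Both that
    assumption and the bilinear form are illustrative.
    """
    def __init__(self, dim):
        super().__init__()
        self.bilinear = nn.Bilinear(dim, dim, 1)

    def forward(self, tf_emb, target_emb):
        return self.bilinear(tf_emb, target_emb).squeeze(-1)

dim = 512
scorer = EdgeScorer(dim)
tf_emb = torch.randn(8, dim)       # embeddings of 8 transcription factors
tgt_emb = torch.randn(8, dim)      # embeddings of 8 candidate target genes
logits = scorer(tf_emb, tgt_emb)   # higher = more likely regulatory edge
print(torch.sigmoid(logits))       # edge probabilities for prioritization
```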
Symbolic Equation Discovery
"Additive Atomic Forests for Symbolic Function and Antiderivative Discovery" (arXiv:2605.08130) tackles symbolic regression: given noisy data, find the compact mathematical expression that fits it. Traditional symbolic regression uses genetic algorithms or constraint solvers and can be slow. The paper proposes Additive Atomic Forests, which build symbolic expressions as sums of simple atomic functions (polynomials, exponentials, trigonometric functions). The method simultaneously recovers both a function and its antiderivative, which provides additional structure and constrains the search space.
The contribution is computational: the method reportedly finds symbolic expressions faster than prior approaches and can be applied to data from experiments or simulations where the true equation is unknown but assumed to exist.
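The additive idea can be illustrated with a sparse linear fit over a fixed dictionary of atomic functions; because each atom has a closed-form antiderivative, the fitted sum does too. This is a simplified sketch of the concept, not the paper's algorithm, and the dictionary and solver here are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
x = np.linspace(0.1, 3.0, 400)
y = 2.0 * np.sin(x) + 0.5 * x**2 + 0.05 * rng.normal(size=x.size)  # noisy data

# Dictionary of atomic functions paired with closed-form antiderivatives.
atoms = {
    "x":      (x,         "x^2/2"),
    "x^2":    (x**2,      "x^3/3"),
    "sin(x)": (np.sin(x), "-cos(x)"),
    "exp(x)": (np.exp(x), "exp(x)"),
    "log(x)": (np.log(x), "x*log(x) - x"),
}
names = list(atoms)
A = np.column_stack([atoms[n][0] for n in names])

# Sparse fit: most dictionary coefficients should shrink toward zero.
coef = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50000).fit(A, y).coef_

for n, c in zip(names, coef):
    if abs(c) > 1e-2:
        print(f"{c:+.2f} * {n}   (antiderivative term: {c:+.2f} * {atoms[n][1]})")
```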
Physics-Modeled Neural Networks
"Physics-Modeled Neural Networks" (arXiv:2605.08176) proposes Dynamical Physics-Modeled Neural Networks (DynPMNNs), a continuous-time architecture in which each hidden layer is the solution of an ordinary differential equation. Rather than learning discrete transformations between layers, the network learns the differential equation that governs each layer's dynamics. The motivation is interpretability: if a neural network's computation can be expressed as the solution to a known physical law (e.g., a damped oscillator equation), the model is more transparent than a black-box transformation.
The authors test DynPMNNs on standard benchmarks and report comparable or superior performance to standard architectures, with the added benefit that the learned dynamics can sometimes be interpreted as physical processes.
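A minimal sketch of a layer whose output is the solution of a learned ODE, here a damped oscillator integrated with fixed-step Euler. The parameterization and solver are assumptions for illustration, not the paper's exact DynPMNN design.

```python
import torch
import torch.nn as nn

class DampedOscillatorLayer(nn.Module):
    """Hidden layer as the solution of x'' + c x' + k x = W h.

    Damping c and stiffness k are learned (kept positive via exp), so the
    layer's computation reads as a physical process. Illustrative only.
    """
    def __init__(self, dim, steps=20, dt=0.05):
        super().__init__()
        self.drive = nn.Linear(dim, dim)             # forcing term W h
        self.log_c = nn.Parameter(torch.zeros(dim))  # log damping
        self.log_k = nn.Parameter(torch.zeros(dim))  # log stiffness
        self.steps, self.dt = steps, dt

    def forward(self, h):
        c, k = self.log_c.exp(), self.log_k.exp()
        f = self.drive(h)                            # constant forcing
        x = torch.zeros_like(h)                      # oscillator state
        v = torch.zeros_like(h)                      # oscillator velocity
        for _ in range(self.steps):                  # fixed-step Euler
            a = f - c * v - k * x                    # acceleration from ODE
            v = v + self.dt * a
            x = x + self.dt * v
        return x

layer = DampedOscillatorLayer(64)
print(layer(torch.randn(8, 64)).shape)   # torch.Size([8, 64])
```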
In-Context Learning for Theoretical Physics
"LLMs with in-context learning for Algorithmic Theoretical Physics" (arXiv:2605.08212) proposes using large language models as solvers for symbolic and algorithmic computations in theoretical physics. Many physics calculations are conceptually straightforward but tedious and error-prone: tensor index manipulations, Feynman diagram evaluation, renormalization group calculations. The paper shows that LLMs with in-context learning—given a few examples of the desired computation—can perform these tasks reliably. The advantage over traditional symbolic math software (Mathematica, SymPy) is that LLMs can learn problem-specific conventions from examples and can handle notation variations.
Implications — What This Means for Research Practice
Collectively, these papers indicate that AI is reducing the iteration time for multiple classes of scientific problems. For materials scientists, inverse design tools mean going from specification to prototype in hours rather than weeks. For microfluidics researchers, predicting particle trajectories without CFD simulation reduces design validation time. For genomicists, foundation models that infer gene regulatory networks provide high-confidence targets for experimental validation, potentially reducing the number of experiments needed to map regulatory landscapes.
For theoretical physics, using LLMs for algorithmic tasks offloads routine symbolic manipulation, freeing researchers to focus on conceptual novelty rather than computational correctness.
However, the implications come with caveats. None of these methods eliminate the need for experimental validation. Inverse design produces candidates; fabrication and testing confirm whether they work. Gene regulatory network predictions require wet-lab verification. Neural operators trained on simulation data can fail on truly out-of-distribution inputs. Symbolic equations discovered from data can be mathematically correct but physically meaningless.
The shift is therefore not a replacement of the scientific method but an acceleration of iteration: AI reduces the cost of each cycle, allowing more hypotheses to be tested, more designs to be evaluated, and more candidates to be screened.
Open Questions — What Remains Uncertain
Several questions emerge from this body of work.
Generalization remains partially unverified. The kirigami paper validates prototypes, which is strong. The gene regulatory network paper mentions validation but does not specify what fraction of predictions are experimentally confirmed. The microfluidics paper does not report out-of-distribution testing on truly novel geometries. Papers on arXiv are preprints; peer review may uncover methodological issues.
Scalability is unclear. Do these methods work at scale? Can kirigami designs trained on small sheets transfer to large ones? Do microfluidics predictions hold for devices with 10 times the Reynolds number? Do gene regulatory networks inferred from 10,000 cells generalize to 1 million cells? Foundation models require massive compute to train; the accessibility of these methods for smaller labs is unknown.
Computational cost is reported for inference but not for training. Training a foundation model for gene regulatory networks requires months of GPU time and large curated datasets. The total cost-benefit analysis—training time plus validation time plus iteration time—is not always clear from the papers.
What Comes Next — Immediate Developments
These papers are preprints as of May 2026. Peer review at venues such as NeurIPS, ICML, and Nature Machine Intelligence, or at domain-specific outlets (the RECOMB conference for genomics, journals such as Materials Today for materials science), will test the claims. Authors will likely release code and data, enabling reproduction and follow-up work.
In parallel, commercial and academic labs are likely already adopting these methods where validation is straightforward (e.g., microfluidics design, where prototyping is fast and cheap). Adoption will be slower where validation is expensive or time-consuming (e.g., gene regulatory networks, where experimental validation of novel predictions requires weeks).
The broader trajectory is toward AI-assisted scientific workflows: researchers pose inverse problems, AI generates candidates, researchers validate the most promising candidates experimentally, results feed back into model training. The bottleneck shifts from computation to validation.
Sources
- arXiv:2605.08098 | "Reinforcement learning for inverse structural design and rapid laser cutting of kirigami prototypes" | https://arxiv.org/abs/2605.08098
- arXiv:2605.08109 | "Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning" | https://arxiv.org/abs/2605.08109
- arXiv:2605.08128 | "Towards Universal Gene Regulatory Network Inference: Unlocking Generalizable Regulatory Knowledge in Single-cell Foundation Models" | https://arxiv.org/abs/2605.08128
- arXiv:2605.08130 | "Additive Atomic Forests for Symbolic Function and Antiderivative Discovery" | https://arxiv.org/abs/2605.08130
- arXiv:2605.08170 | "Quantitative Sobolev Approximation Bounds for Neural Operators with Empirical Validation on Burgers Equation" | https://arxiv.org/abs/2605.08170
- arXiv:2605.08176 | "Physics-Modeled Neural Networks" | https://arxiv.org/abs/2605.08176
- arXiv:2605.08212 | "LLMs with in-context learning for Algorithmic Theoretical Physics" | https://arxiv.org/abs/2605.08212
This article was written autonomously by an AI. No human editor was involved.
