How to Design ML Solutions for Scientific Domains Ranging From Small to Big
Scientific domains are characterized by long experimentation cycles, theoretical rigor, and quantitative prediction. How should machine learning be designed for domains that hold it to these standards?
Artificial intelligence is often described as transforming science through speed and automation, yet its real impact has been more measured. The reason is not a lack of expressive power, but a structural mismatch between how machine learning systems learn and how scientific knowledge is organized.
Scientific theories succeed because they encode general constraints that apply broadly—across elements, environments, sizes, and time scales. Physical and chemical laws define what is allowed, not just what has been observed, and that is what enables extrapolation. Most machine learning models, by contrast, learn correlations without an inherent notion of physical constraints, making extrapolation their central failure mode.
At the same time, many scientific systems are fundamentally high-dimensional and computationally intractable with traditional methods. Molecular dynamics provides a clear example. Every initial configuration defines a trajectory through an astronomically large configuration space, yet no successful scientific method attempts to represent this space explicitly.
What matters instead is the global structure of that space. From a thermodynamic perspective, the object of interest is not a single trajectory but an ensemble. Boltzmann statistics describes how probability mass is distributed across configurations, defining stable basins, free-energy barriers, and long-time behavior.
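To make the ensemble view concrete, the sketch below uses a toy one-dimensional double-well potential (a stand-in chosen purely for illustration, not a molecular system) and shows how Boltzmann weighting turns an energy function into a distribution: essentially all probability mass sits in the two basins, while the barrier between them carries almost none.

```python
import numpy as np

# Toy 1-D double-well potential U(x) = (x^2 - 1)^2 -- a stand-in for a real
# molecular energy surface, chosen only to make the basin structure visible.
def potential(x):
    return (x**2 - 1.0)**2

kT = 0.2                              # thermal energy, in the same units as U
x = np.linspace(-2.0, 2.0, 2001)      # discretized stand-in for configuration space
dx = x[1] - x[0]

# Unnormalized Boltzmann weights and the normalized ensemble density.
weights = np.exp(-potential(x) / kT)
p = weights / (weights.sum() * dx)

# Probability mass per basin: nearly all of it sits near x = -1 and x = +1,
# while the barrier region around x = 0 carries negligible weight.
left_mass = p[x < 0.0].sum() * dx
barrier_vs_basin = p[np.argmin(np.abs(x - 0.0))] / p[np.argmin(np.abs(x - 1.0))]

print(f"mass in left basin:   {left_mass:.3f}")
print(f"mass in right basin:  {1.0 - left_mass:.3f}")
print(f"density at barrier relative to basin: {barrier_vs_basin:.2e}")
```

The same weighting, applied over an astronomically large configuration space rather than a line, is what defines the basins, barriers, and long-time behavior referred to above.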
This reframes what it means to design machine learning for molecular dynamics. The goal is not to reproduce exact atomic trajectories, which merely recreates the original computational bottleneck. The goal is to learn a representation whose geometry reproduces the correct ensemble while compressing irrelevant microscopic detail.
Under this framing, machine learning is useful not because it replaces physics, but because it can learn compact coordinates for physically admissible spaces that are too complex to explore directly. Physics defines the constraints, while ML learns how real systems occupy the constrained landscape. Extrapolation becomes meaningful only within this admissible space.
Several recent works explicitly adopt this perspective. Energy-based coarse-graining approaches that target the Boltzmann distribution treat the equilibrium ensemble as the object being learned, rather than individual trajectories. These models define learned energy landscapes whose induced probability measures match atomistic ensembles after marginalization.
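As a rough illustration of that idea, and not a reproduction of any specific published model, the sketch below uses force matching, one standard route to an energy-based coarse-grained model: a neural energy over coarse coordinates is trained so that its negative gradient matches atomistic forces mapped onto those coordinates, and the minimizer of this loss corresponds to a potential of mean force whose Boltzmann distribution matches the marginalized atomistic ensemble. The MLP architecture, bead count, and random placeholder data are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class CoarseEnergy(nn.Module):
    """Small MLP energy over flattened coarse (bead) coordinates."""
    def __init__(self, n_beads: int, dim: int = 3, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_beads * dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:      # z: (batch, n_beads, 3)
        return self.net(z.flatten(start_dim=1)).squeeze(-1)  # energy per sample

def force_matching_loss(model, z, mapped_forces):
    """MSE between learned forces -dU/dz and atomistic forces mapped onto z."""
    z = z.requires_grad_(True)
    energy = model(z).sum()
    learned_forces = -torch.autograd.grad(energy, z, create_graph=True)[0]
    return ((learned_forces - mapped_forces) ** 2).mean()

# One toy training step on random placeholders; in practice z and mapped_forces
# come from atomistic MD snapshots projected through the coarse-graining map.
n_beads, batch = 10, 32
model = CoarseEnergy(n_beads)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

z = torch.randn(batch, n_beads, 3)              # coarse coordinates from an MD frame
mapped_forces = torch.randn(batch, n_beads, 3)  # atomistic forces on those coordinates

loss = force_matching_loss(model, z, mapped_forces)
opt.zero_grad()
loss.backward()
opt.step()
print(f"force-matching loss: {loss.item():.4f}")
```

The point of the sketch is the structure of the objective: the model is never asked to reproduce trajectories, only an energy landscape whose induced ensemble is consistent with the atomistic one.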
Critically, these approaches succeed precisely because they constrain learning to a physically admissible hypothesis space. At the same time, their limitations are instructive. Correctness is guaranteed only with respect to the chosen coarse representation, dynamics are not uniquely determined by the equilibrium ensemble, and extrapolation remains bounded by assumptions about transferability of the coarse variables. These works therefore illustrate the core principle rather than resolve it: machine learning becomes scientifically meaningful when it learns representations of admissible ensembles, but its success depends entirely on how those representations are defined.
This example generalizes beyond molecular dynamics. For any scientific domain, the design question is the same. What object does the science actually care about, and what constraints define admissible behavior? Machine learning becomes powerful when it learns efficient representations of those constrained spaces, not when it attempts to infer the world from data alone.