The CON File Format Specification
- Date: 2026-03-25
1 Specification
- Version: 2
- Date: 2026-03-25
- Status: Stable
- Reference implementation:
This document defines version 2 of the CON file format. It supersedes all prior informal descriptions. New implementations SHOULD target this version.
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHOULD, SHOULD NOT, MAY, and OPTIONAL follow RFC 2119 semantics.
2 Overview
The CON format stores atomic configurations for molecular dynamics and transition-state search simulations. It originated in the eOn code and has since been adopted by multiple tools including ASE.
A CON file contains one or more frames. Each frame encodes a simulation cell, per-type metadata (masses, atom counts), and per-atom data (coordinates, constraints, identity). Optional sections add velocities, forces, or other per-atom vector/scalar data.
3 File extensions
.con: Coordinate-only configuration files.
.convel: Configuration files with velocity data per frame.
.con.gz: Gzip-compressed CON files (see Compression).
4 Encoding
CON files MUST use UTF-8. Line endings MAY be LF (\n) or CRLF
(\r\n); parsers MUST accept both. All numeric values use ASCII
decimal representation (no locale-dependent formatting).
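As an illustration of the encoding rules above, here is a minimal sketch of a line splitter that accepts both LF and CRLF. The function name split_con_lines is hypothetical and not part of any existing reader.

```python
def split_con_lines(raw: bytes) -> list:
    """Decode CON bytes as UTF-8 and split on LF or CRLF line endings."""
    # Invalid UTF-8 raises UnicodeDecodeError, which a reader would report.
    text = raw.decode("utf-8")
    # Normalise CRLF to LF, then split on LF only. str.splitlines() is
    # avoided here because it also splits on other Unicode line boundaries.
    lines = text.replace("\r\n", "\n").split("\n")
    # A trailing newline produces one empty final element; drop it so that
    # genuinely blank lines inside the file are still preserved.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

Python's built-in float() and int() do not depend on the process locale, so tokens from these lines can be converted directly.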
5 Frame structure
Each frame consists of:
A 9-line header.
One coordinate block per atom type.
Zero or more additional per-atom sections (velocities, forces), each preceded by a blank separator line.
Multiple frames are concatenated directly with no inter-frame separator.
6 Header (9 lines)
| Line | Name | Content | Example |
|---|---|---|---|
| 1 | Generator comment | Free-form text | |
| 2 | Metadata | JSON object or free-form text | |
| 3 | Cell dimensions | 3 floats: Lx Ly Lz | |
| 4 | Cell angles | 3 floats: alpha beta gamma | |
| 5 | Reserved | Free-form text (round-tripped) | |
| 6 | Reserved | Free-form text (round-tripped) | |
| 7 | Atom type count | 1 integer: N | |
| 8 | Atoms per type | N integers | |
| 9 | Mass per type | N floats (atomic mass units) | |
6.1 Line 1: Generator comment
Free-form text. Writers SHOULD set this to a human-readable identifier. Parsers MUST preserve it through round-trips but MUST NOT assign it semantic meaning.
6.2 Line 2: Metadata
Line 2 carries machine-readable metadata as a single-line JSON object.
6.2.1 Version 2+ files
Writers MUST emit a JSON object containing at least con_spec_version
with an integer value:
{"con_spec_version":2}
Additional keys MAY appear. Parsers MUST preserve unrecognized keys
through round-trips. Reserved metadata keys listed in metadata-keys
MUST use the declared JSON type. The sections key, when present,
MUST be an array of strings. The validate key, when present, MUST
be a boolean.
6.2.2 Legacy (pre-v2) files
Files produced before this specification may contain free-form text
on line 2. A conforming parser detects the format by checking whether
line 2, after trimming whitespace, starts with {:
Starts with {: parse as JSON, extract con_spec_version.
Does not start with {: treat as legacy (implicit version 1).
If line 2 starts with { but contains malformed JSON, the parser
MUST report an error. If the JSON object lacks con_spec_version,
the parser MUST report an error. If con_spec_version exceeds the
highest version the parser supports, the parser MUST report an error.
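The detection and error rules above can be sketched as follows. The function name detect_spec_version and the constant SUPPORTED_VERSION are illustrative assumptions, and ValueError stands in for whatever structured error a real reader would return.

```python
import json

# Assumption: this hypothetical parser understands up to version 2.
SUPPORTED_VERSION = 2


def detect_spec_version(line2: str):
    """Return (version, metadata) for header line 2 of a CON frame."""
    stripped = line2.strip()
    if not stripped.startswith("{"):
        # Free-form text: legacy file, implicit version 1, no metadata.
        return 1, {}
    try:
        metadata = json.loads(stripped)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON metadata on line 2: {exc}") from exc
    version = metadata.get("con_spec_version")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError("con_spec_version is missing or not an integer")
    if version > SUPPORTED_VERSION:
        raise ValueError(f"unsupported con_spec_version {version}")
    return version, metadata
```

Returning the whole metadata dictionary, rather than only the version, lets the caller preserve unrecognized keys for round-trips.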
6.3 Lines 3-4: Cell geometry
Line 3: three whitespace-separated floats (cell edge lengths in angstroms). Line 4: three whitespace-separated floats (cell angles in degrees). Tabs and spaces are both valid separators.
For non-orthogonal cells, the angle-based representation introduces
floating-point drift through trigonometric round-trips. Writers
SHOULD include the exact 3x3 lattice vector matrix in the JSON
metadata via the lattice_vectors key (see lattice-vectors).
When lattice_vectors is present, readers SHOULD prefer it over the
length/angle values on lines 3-4.
When validate is true, readers MUST reject zero or negative cell
lengths and MUST reject angles outside the open interval (0, 180)
degrees.
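A minimal sketch of parsing lines 3-4 with the validation rules above. The name parse_cell and its validate parameter are hypothetical; the finite-value check is applied unconditionally because non-finite floats are rejected elsewhere in this specification.

```python
import math


def parse_cell(line3: str, line4: str, validate: bool = False):
    """Parse cell lengths (line 3) and cell angles (line 4)."""
    # str.split() with no argument splits on any run of spaces or tabs.
    lengths = [float(token) for token in line3.split()]
    angles = [float(token) for token in line4.split()]
    if len(lengths) != 3 or len(angles) != 3:
        raise ValueError("lines 3 and 4 must each hold exactly three floats")
    if not all(math.isfinite(value) for value in lengths + angles):
        raise ValueError("cell values must be finite")
    if validate:
        if any(length <= 0.0 for length in lengths):
            raise ValueError("cell lengths must be positive")
        # Open interval: 0 and 180 degrees are both rejected.
        if any(not 0.0 < angle < 180.0 for angle in angles):
            raise ValueError("cell angles must lie in (0, 180) degrees")
    return lengths, angles
```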
6.4 Lines 5-6: Reserved
Free-form text with no defined semantics. Writers MAY emit empty lines. Parsers MUST preserve these for round-trip fidelity.
6.5 Lines 7-9: Type metadata
Line 7: single positive integer N (number of atom types). Line 8: exactly N positive integers (atom count per type). Line 9: exactly N floats (atomic mass per type, in amu).
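The constraints on lines 7-9 can be checked with a short routine like the one below. This is a sketch only; parse_type_metadata is a hypothetical name, and the strict single-token check on line 7 follows the wording above rather than the behavior of any particular reader.

```python
def parse_type_metadata(line7: str, line8: str, line9: str):
    """Parse N (line 7), atom counts per type (line 8), masses (line 9)."""
    count_tokens = line7.split()
    if len(count_tokens) != 1:
        raise ValueError("line 7 must hold a single integer")
    n_types = int(count_tokens[0])
    if n_types <= 0:
        raise ValueError("atom type count must be positive")
    counts = [int(token) for token in line8.split()]
    masses = [float(token) for token in line9.split()]
    if len(counts) != n_types or len(masses) != n_types:
        raise ValueError("lines 8 and 9 must each hold exactly N values")
    if any(count <= 0 for count in counts):
        raise ValueError("atom counts per type must be positive")
    return n_types, counts, masses
```

The total atom count of the frame is then sum(counts).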
7 Coordinate blocks
For each atom type i (1 to N), in the order declared in lines 8-9:
Symbol line: chemical symbol (e.g., Cu, H).
Label line: Coordinates of Component /i/
Atom lines: one per atom, containing:
| Column | Type | Description |
|---|---|---|
| 1 | float | x coordinate (angstroms) |
| 2 | float | y coordinate (angstroms) |
| 3 | float | z coordinate (angstroms) |
| 4 | int | Constraint bitmask (see constraints) |
| 5 | int | Atom index (see atom-index) |
Columns are whitespace-separated.
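A sketch of parsing a single atom line according to the column table above. AtomLine and parse_atom_line are hypothetical names. The sketch also accepts 4-column lines, leaving the index as None so the caller can apply its own default.

```python
from typing import NamedTuple, Optional


class AtomLine(NamedTuple):
    x: float
    y: float
    z: float
    constraint: int
    atom_index: Optional[int]  # None when column 5 is absent


def parse_atom_line(line: str) -> AtomLine:
    """Split an atom line on whitespace and convert each column."""
    tokens = line.split()
    if len(tokens) not in (4, 5):
        raise ValueError(f"expected 4 or 5 columns, got {len(tokens)}")
    x, y, z = (float(token) for token in tokens[:3])
    # int() rejects tokens such as "7.0", so columns 4 and 5 must be
    # exact integer tokens.
    constraint = int(tokens[3])
    atom_index = int(tokens[4]) if len(tokens) == 5 else None
    return AtomLine(x, y, z, constraint, atom_index)
```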
7.1 Per-direction constraints (column 4)
Column 4 encodes per-direction constraint flags as a bitmask. Bit 0 = x, bit 1 = y, bit 2 = z.
| Value | Meaning |
|---|---|
| 0 | Free (all directions) |
| 1 | All-fixed (legacy, treated as 7) |
| 2-6 | Per-direction combinations |
| 7 | Fixed in all directions |
Readers MUST treat value 1 as equivalent to 7. Writers MUST emit 7 for all-fixed atoms, never 1.
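The bitmask rules can be sketched as a decode/encode pair. Both function names are hypothetical. One consequence of the legacy rule is that an x-only mask would encode to 1, which writers must not emit, so this sketch refuses to encode it; how a real writer handles that case is not specified here.

```python
def decode_constraint(value: int):
    """Return (fixed_x, fixed_y, fixed_z) for a column-4 value."""
    if not 0 <= value <= 7:
        raise ValueError(f"constraint value {value} is outside 0-7")
    if value == 1:
        value = 7  # legacy all-fixed marker
    return bool(value & 1), bool(value & 2), bool(value & 4)


def encode_constraint(fixed_x: bool, fixed_y: bool, fixed_z: bool) -> int:
    """Return the column-4 value for a per-axis fixed mask."""
    value = int(fixed_x) | (int(fixed_y) << 1) | (int(fixed_z) << 2)
    if value == 1:
        # Assumption of this sketch: x-only fixing collides with the
        # legacy value 1 and therefore cannot be written.
        raise ValueError("x-only constraint is not representable")
    return value
```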
7.2 Atom index (column 5)
The atom index preserves the original position of each atom before type-based grouping. The CON format groups atoms by element type, which reorders them. Without a persistent index, the original ordering cannot be recovered after a read-write cycle.
Version 2 requirements:
Writers MUST emit column 5 on every atom line.
Column 5 MUST contain the pre-grouping index.
Readers MUST parse and preserve column 5 through write-back.
Version 1 behavior:
Column 5 is present but its semantics are undefined.
Readers SHOULD accept 4-column atom lines. When column 5 is absent, default to the sequential position within the frame (0, 1, 2, …).
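To illustrate why the index is needed, the sketch below groups atoms by symbol while recording each atom's pre-grouping position, then uses that position to restore the original order. The tuple layout and both function names are assumptions made for the example.

```python
def group_by_type(atoms):
    """Group (symbol, x, y, z, constraint) tuples by symbol.

    Each grouped row carries the atom's position in the input list as its
    last element, which is what column 5 stores.
    """
    groups = {}
    for index, (symbol, x, y, z, constraint) in enumerate(atoms):
        groups.setdefault(symbol, []).append((x, y, z, constraint, index))
    return groups


def restore_order(groups):
    """Rebuild the original atom list from grouped rows."""
    flat = [
        (index, symbol, x, y, z, constraint)
        for symbol, rows in groups.items()
        for (x, y, z, constraint, index) in rows
    ]
    flat.sort(key=lambda row: row[0])
    return [(symbol, x, y, z, c) for (_, symbol, x, y, z, c) in flat]
```

For an input ordered H, Cu, H, grouping places both H atoms together with indices 0 and 2, and sorting on the stored index recovers the interleaved order.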
8 Additional per-atom sections
After coordinate blocks, a frame MAY contain additional per-atom data sections. Each section follows the same block structure: a blank separator line, then per-component blocks (symbol line, label line, data lines).
8.1 Section declaration
Version 2 files declare sections in the JSON metadata using the
sections key:
{"con_spec_version":2,"sections":["velocities","forces"]}
The parser reads sections in the declared order. Parsers MUST reject
unknown section names with an error. Every declared section MUST be
present, complete, and parseable at its declared position. An empty
sections array declares that no additional per-atom sections follow.
| Section name | Label pattern | Columns | Data |
|---|---|---|---|
| velocities | Velocities of Component /i/ | 5 | vx vy vz fixed_flag atom_id |
| forces | Forces of Component /i/ | 5 | fx fy fz fixed_flag atom_id |
| energies | | 3 | energy fixed_flag atom_id |
The energies section carries one scalar per atom, useful for ML
potentials that decompose total energy into local contributions.
Writers MAY emit it alongside forces, alone, or omit it entirely.
When the energies section is present:
The per-frame total energy metadata key SHOULD equal the sum of the per-atom contributions. Implementations MAY warn on a mismatch but MUST NOT reject the frame on that ground (a reader cannot tell apart a numerical-noise mismatch from a deliberate definition where the per-atom decomposition does not sum to the total).
The total energy metadata key MAY still be absent, in which case the per-atom contributions are the only energy data on the frame.
The fixed_flag and atom_id columns SHOULD match the coordinate block, exactly as for velocities and forces. In validate=true mode they MUST match.
The fixed_flag and atom_id columns in every additional section
repeat the coordinate-block identity data for the same atom ordering.
Writers SHOULD emit the same values as the coordinate block. Readers
associate section rows by component order and MAY ignore the duplicate
identity columns after parsing the row shape; in validate=true mode
readers MUST verify that the fixed_flag and atom_id columns match
the coordinate block.
8.2 Validation mode
Version 2 files MAY set validate to true in the JSON metadata:
{"con_spec_version":2,"sections":["velocities","forces"],"validate":true}
When validate is true, the sections key MUST be present, even
when the value is an empty array. Conforming readers MUST verify that
the frame satisfies strict v2 invariants before accepting it. At
minimum, this validation MUST check:
Reserved metadata keys use the declared JSON types.
Numeric tokens are finite.
Cell lengths are positive, cell angles are in (0, 180) degrees, atom counts are positive, and masses are positive.
Coordinate component labels exactly match Coordinates of Component /i/.
Component symbols are recognized element symbols, or X for an explicitly unknown element.
Coordinate and additional-section fixed_flag and atom_id fields are exact integer tokens.
The section component symbol matches the coordinate component symbol.
The section label exactly matches the declared section and component number (for example, Velocities of Component 1).
Each section row's fixed_flag decodes to the same per-axis fixed mask as the corresponding coordinate row.
Each section row's atom_id equals the corresponding coordinate row's atom_id.
If any check fails, the reader MUST reject the frame. When validate
is absent or false, readers MAY parse files by associating section
rows by component order and ignoring duplicate identity column
mismatches after parsing the row shape.
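The last two checks in the list above (matching fixed mask and matching atom_id) might look like the following in a strict reader. This is a sketch under the assumption that each row has already been reduced to a (fixed_flag, atom_id) pair; check_section_identity is a hypothetical name.

```python
def _fixed_mask(flag: int) -> int:
    """Normalise a fixed_flag so the legacy value 1 compares equal to 7."""
    return 7 if flag == 1 else flag


def check_section_identity(coord_rows, section_rows) -> None:
    """Raise ValueError if a section's identity columns disagree with the
    coordinate block. Rows are (fixed_flag, atom_id) pairs in file order."""
    if len(coord_rows) != len(section_rows):
        raise ValueError("section row count differs from coordinate block")
    pairs = zip(coord_rows, section_rows)
    for row, ((c_flag, c_id), (s_flag, s_id)) in enumerate(pairs):
        if _fixed_mask(c_flag) != _fixed_mask(s_flag):
            raise ValueError(f"row {row}: fixed_flag mask mismatch")
        if c_id != s_id:
            raise ValueError(f"row {row}: atom_id mismatch")
```

Comparing decoded masks rather than raw integers means a coordinate row written with 7 and a section row written with the legacy 1 are treated as consistent.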
8.2.1 Error paths and ParseError variants
The reference reader (readcon-core) surfaces validation failures as
typed ParseError variants. Other implementations are expected to
return analogous structured errors, but the variant names below are
specific to readcon-core and are listed here so that the spec and
the reference implementation stay in sync.
| ParseError variant | Fires when |
|---|---|
| | Line 2 starts with |
| | |
| | JSON is malformed, |
| | Cell geometry, masses, coordinate component label, component symbol, |
| | A name in the |
| | Fewer than 9 header lines remain. |
| | Coordinate block ends short. |
| | Declared velocity section ends short or is absent. |
| | Declared force section ends short or is absent. |
A minimal example of each path: the file
resources/test/tiny_cuh2_strict_invalid.con (if present) and the
test cases under src/parser.rs::tests and tests/parseforces.rs
exercise these branches and serve as executable references.
8.3 Legacy section detection
Files without a sections key use blank-separator detection: if a
blank line follows coordinate blocks, the parser attempts to parse a
velocity section. A present sections key disables this fallback,
including when the value is []. This preserves backward
compatibility with existing .convel files while giving v2 writers a
precise declaration mechanism.
Writers SHOULD always emit the sections key when writing additional
sections.
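The choice between declared sections and the legacy fallback can be sketched as below. The function name, its parameters, and the KNOWN_SECTIONS tuple are assumptions for illustration; the three names in the tuple are the section names this document mentions.

```python
from typing import Optional

# Section names mentioned in this specification.
KNOWN_SECTIONS = ("velocities", "forces", "energies")


def sections_to_read(metadata: Optional[dict], blank_line_follows: bool):
    """Return the ordered list of section names a reader should parse.

    metadata is the parsed line-2 JSON object, or None for legacy files.
    blank_line_follows says whether a blank line follows the coordinates.
    """
    if metadata is not None and "sections" in metadata:
        declared = metadata["sections"]
        if not isinstance(declared, list) or not all(
            isinstance(name, str) for name in declared
        ):
            raise ValueError("sections must be an array of strings")
        for name in declared:
            if name not in KNOWN_SECTIONS:
                raise ValueError(f"unknown section name: {name!r}")
        # A present key, even an empty array, disables the fallback.
        return list(declared)
    # Legacy fallback: a blank line means a velocity section is attempted.
    return ["velocities"] if blank_line_follows else []
```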
8.4 Velocity section
Per component i: blank separator, symbol line, Velocities of Component /i/ label, then one line per atom: vx vy vz fixed_flag atom_id.
8.5 Force section
Per component i: blank separator, symbol line, Forces of Component /i/ label, then one line per atom: fx fy fz fixed_flag atom_id.
Frames with forces SHOULD include the potential and energy
metadata keys.
8.6 Extending with new section types
New section types follow the same pattern: declare in the sections
array, use a blank separator, symbol line, <Name> of Component /i/
label, and data lines.
9 Multi-frame files
Frames are concatenated with no separator. After parsing a frame’s data, the parser attempts the next 9-line header. If fewer than 9 lines remain, parsing ends.
Writers MUST NOT insert extra blank lines between frames.
10 Data types and precision
Floats: any valid decimal representation. Writers SHOULD emit at least 6 significant digits. For lossless f64 round-tripping, 17 digits suffice. Readers MUST reject non-finite values (NaN, Infinity, -Infinity).
Integers: constraint bitmask (0-7), atom_id, natmtypes, and natmspertype are non-negative integers.
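A short sketch of the float rules above: writing with 17 significant digits and rejecting non-finite values on read. The names format_float and parse_float are hypothetical.

```python
import math


def format_float(value: float) -> str:
    """Format a float with 17 significant digits for lossless round-trips."""
    if not math.isfinite(value):
        raise ValueError("refusing to write a non-finite value")
    return f"{value:.17g}"


def parse_float(token: str) -> float:
    """Parse a decimal token, rejecting NaN and infinities."""
    value = float(token)
    # float() accepts "nan" and "inf", and overflows such as "1e400"
    # become inf, so the finiteness check has to be explicit.
    if not math.isfinite(value):
        raise ValueError(f"non-finite value: {token!r}")
    return value
```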
11 Compression
CON files MAY be gzip-compressed. Readers SHOULD detect compression
by checking the first two bytes for the gzip magic number (0x1f 0x8b) rather than relying on file extension. The decompressed
content MUST be a valid CON file.
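Magic-number detection takes only a few lines with Python's standard gzip module. The function name read_con_bytes is an assumption for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def read_con_bytes(raw: bytes) -> bytes:
    """Return CON content, decompressing when the gzip magic is present."""
    # Inspect the first two bytes instead of trusting the file extension.
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw
```

A plain CON file begins with free-form text on line 1, so a file whose first two bytes happen to be 0x1f 0x8b would be misdetected; this sketch does not guard against that corner case.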
12 Constraints and limits
Atoms MUST appear grouped by type, in header-declared order.
Component numbering starts at 1.
Total atom count equals the sum of line 8 values.
Symbol strings SHOULD match IUPAC element symbols.
No upper limit on atom types or atom count is imposed.
13 Recommended metadata keys
All keys are OPTIONAL. Parsers MUST preserve unrecognized keys.
| Key | Type | Description |
|---|---|---|
| sections | array of string | Declared per-atom sections (see sections) |
| validate | bool | Enable strict v2 validation (see validation-mode) |
| generator | string | Tool name and version |
| units | object | Unit system (see units) |
| pbc | 3 bool array | Periodic boundary conditions (see pbc) |
| lattice_vectors | 3x3 float array | Exact 3x3 cell matrix (see lattice-vectors) |
| energy | float | Per-frame total energy |
| potential | object | |
| frame_index | int | Zero-based frame number in trajectory |
| time | float | Simulation time |
| timestep | float | Integration timestep |
| | int | NEB bead (image) index |
| | int | NEB band index |
| | float | Convergence criterion: max force component |
| | float | Convergence criterion: energy change threshold |
| | bool | Whether convergence criteria are met |
| | float | Current maximum force component across free atoms |
13.1 Units
The units key maps physical dimensions to unit strings:
| Dimension | Default | Examples |
|---|---|---|
| length | angstrom | |
| | amu | |
| time | fs | |
| energy | eV | |
| | derived | Defaults to length / time |
Omitted dimensions default to the values above. Implementations that do not perform unit conversion SHOULD preserve the key.
13.2 Periodic boundary conditions
The pbc key declares which cell directions are periodic as a
3-element boolean array: [true, true, true] for full 3D periodicity,
[true, true, false] for a slab model with vacuum in z.
When absent, readers SHOULD assume [true, true, true] (the default
for bulk simulations).
{"con_spec_version":2,"pbc":[true,true,false]}
13.3 Lattice vectors
The lattice_vectors key provides the exact 3x3 cell matrix as a
nested array of three row vectors (angstroms):
{"con_spec_version":2,"lattice_vectors":[[10.0,0.0,0.0],[0.0,10.0,0.0],[0.0,0.0,20.0]]}
When present, readers SHOULD prefer this over the length/angle values
on lines 3-4, which may lose precision through trigonometric
round-trips. Writers that emit lattice_vectors MUST also write
consistent length/angle values to lines 3-4 for legacy compatibility.
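A writer that holds only the lattice matrix needs to derive the values for lines 3-4. The sketch below does this under the usual crystallographic convention, which is an assumption here since this document does not spell it out: alpha is the angle between the second and third vectors, beta between the first and third, gamma between the first and second. The function name cell_parameters is hypothetical.

```python
import math


def cell_parameters(lattice_vectors):
    """Return ((a, b, c), (alpha, beta, gamma)) from three row vectors.

    Lengths are in the units of the input; angles are in degrees.
    """
    a_vec, b_vec, c_vec = lattice_vectors

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def angle(u, v):
        cosine = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

    lengths = (norm(a_vec), norm(b_vec), norm(c_vec))
    angles = (angle(b_vec, c_vec), angle(a_vec, c_vec), angle(a_vec, b_vec))
    return lengths, angles
```

For the orthogonal example above this gives lengths 10, 10, 20 and three right angles.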
13.4 Frame ordering
Writers producing trajectories SHOULD include frame_index (and
time when available) so consumers can reconstruct ordering without
relying on filesystem metadata.
14 Version history
| Version | Date | Changes |
|---|---|---|
| 1 | (original) | De facto format from eOn. Column 5 present, undefined. |
| 2 | 2026-03-25 | JSON metadata. atom_id semantics. Per-direction constraints. Declared sections. Force blocks. Compression. |
15 Detecting the spec version
Read line 2. If it starts with {, parse as JSON and extract
con_spec_version. Otherwise the file predates this specification
(implicit version 1).
16 Examples
16.1 Minimal v2 file
Generated by eOn
{"con_spec_version":2}
10.000000 10.000000 10.000000
90.000000 90.000000 90.000000


1
2
63.546000
Cu
Coordinates of Component 1
0.000000 0.000000 0.000000 7 0
5.000000 5.000000 5.000000 0 1
16.2 File with velocities and forces
Generated by eOn
{"con_spec_version":2,"sections":["velocities","forces"],"energy":-42.5,"potential":{"type":"EMT","params":{}}}
15.345600 21.702000 100.000000
90.000000 90.000000 90.000000
0 0
218 0 1
2
2 2
63.546000 1.007930
Cu
Coordinates of Component 1
0.639400 0.904500 6.975300 7 0
3.196900 0.904500 6.975300 7 1
H
Coordinates of Component 2
8.682300 9.947000 11.733000 0 2
7.942100 9.947000 11.733000 0 3

Cu
Velocities of Component 1
0.001234 0.002345 -0.003456 7 0
0.004567 -0.005678 0.006789 7 1

H
Velocities of Component 2
-0.012345 0.023456 0.034567 0 2
0.045678 -0.056789 -0.067890 0 3

Cu
Forces of Component 1
0.123456 0.234567 -0.345678 7 0
0.456789 -0.567890 0.678901 7 1

H
Forces of Component 2
-1.234567 2.345678 3.456789 0 2
4.567890 -5.678901 -6.789012 0 3
16.3 Trajectory frame with metadata
Generated by eOn 3.1
{"con_spec_version":2,"generator":"eOn 3.1","units":{"length":"angstrom","energy":"eV"},"frame_index":5,"time":2.5,"timestep":0.5}
15.345600 21.702000 100.000000
90.000000 90.000000 90.000000


2
2 2
63.546000 1.007930
Cu
Coordinates of Component 1
0.639400 0.904500 6.975300 7 0
3.196900 0.904500 6.975300 7 1
H
Coordinates of Component 2
8.682300 9.947000 11.733000 0 2
7.942100 9.947000 11.733000 0 3
16.4 Legacy (pre-v2) file
Random Number Seed
0.0000 TIME
15.345600 21.702000 100.000000
90.000000 90.000000 90.000000
0 0
0 0 0
2
216 2
63.546 1.00793
Cu
Coordinates of Component 1
0.639400 0.904500 -0.000100 1 0
3.197000 0.904500 -0.000100 1 1
...
A conforming v2 reader processes this without error, assigning
spec_version = 1 because line 2 does not start with {.