I/O File Format¶
This section describes the various input and output files used by the MC_fit
program.
Perple_X programs use a root name to identify problem-specific I/O files.
The root name is specified when the user runs the program BUILD to create
the problem definition file. The default root name assigned by BUILD is my_project,
this name is assumed here. In addition to the problem definition file
my_project.dat generated by BUILD, MC_fit requires a second problem specific file
named my_project.imc that is created by the user to define the inversion problem.
Root and/or complete file names may include directory paths, e.g., the root
./my_directory/my_project causes output to be written to the subdirectory my_directory
of the current working directory. To change the root name of a project rename my_project.dat
and my_project.imc with the desired name.
Input: my_project.dat¶
It is assumed here that potential users of MC_fit are already familiar with the
Perple_X program BUILD used to generate the my_project.dat problem definition file.
Should this not be the case, refer to the tutorials listed on the
Perple_X website.
To create a problem definition file suitable for MC_fit, run BUILD as though to setup the calculation of a phase diagram section for the observed assemblage:
The bulk composition and variable ranges specified in
my_project.datare ignored by MC_fit, i.e., arbitrary values for these parameters can be entered in response to the prompts from BUILD. MC_fit also ignores the molar vs mass input unit specificationmy_project.dat.Specify the thermodynamic data, solution model, and Perple_X option files to be used by MC_fit. MC_fit reads an additional option file file that is specified in
my_project.imc.Specify the components that describe the chemistry of the observed assemblage, ordinarily these are specified as thermodynamic components; however saturated and mobile components may also be specified.
The independent phantom variables specified for the phantom phase diagram section calculation, ordinarily pressure and temperature, define the unknown environmental parameters to be determined by MC_fit. If mobile components have been specified, the chemical potential, activity, or fugacity of the component is included among the unknown environmental parameters to be determined by MC_fit.
Specify any solution models appropriate for the observed assemblage, as well for any phases that might affect the stability of the observed assemblage.
Note
The thermodynamic data, Perple_X option, and solution model files referenced in my_project.dat
are not described here. Refer to the online documentation for details of the
thermodynamic data and
Perple_X option files.
The solution model file, normally
solution_model.dat,
is to some extent self-documenting.
Note
Specifying component saturation constraints in my_project.dat
simplifies the inversion problem by assuring that the corresponding composant is in equilibrium with the
predicted assemblage.
However, the identity of the composant is not considered as a constraint by
MC_fit; therefore, component saturation constraints should not be used when
the identity of the composant is critical.
For example, SiO2 should not be saturated if different silica polymorphs are potentially stable over the pressure- temperature range of interest and it is known that a particular polymorph was stable. Conversely, component saturation constraints should be used when it is suspected that a polymorph of a component was stable, but its identity is uncertain (e.g., aragonite vs calcite).
Input: my_project.imc¶
The general structure of the my_project.imc file consists of keywords with
interspersed data as outlined below, line numbers have been added to facilitate
discussion.
Although the keywords and data must be entered
sequentially, there is no requirement that the data must be on a particular
line of the my_project.imc file.
1 OPTION_FILE_NAME
2
3 begin_assemblage
4
5 sample_name TEXT
6
7 pressure_range LO# HI#
8 temperature_range LO# HI#
9
10 MAF(TEXT)_range LO# HI#
11 ...
12
13 unmeasured_component TEXT
14
15 begin_limits
16 TEXT LO# HI#
17 ...
18 end_limits
19
20 ...
21
22 begin_bulk
23 TEXT VAL# UNC#
24 ...
25 end_bulk
26
27 phase_name TEXT
28
29 phase_mode VAL# UNC#
30
31 begin_comp
32 TEXT VAL# UNC#
33 ...
34 end_comp
35
36 end_assemblage
—
Line 1: The complete name of the MC_fit option file,
e.g., MC_fit_option.dat.
—
Line 3: begin_assemblage signals the beginning of the inversion problem.
—
Line 5, sample_name: TEXT is a sample identifier, title, etc.
—
Lines 7-8, pressure_range and temperature_range:
The LO# HI# pairs on each line are the lower and upper limits
on pressure (bar) and temperature (K) for the inversion problem.
Currently there is no means
of fixing pressure or temperature for an inversion problem.
Such an option will be implemented upon request.
—
[Optional] Line 10, MAF(TEXT)_range: This keyword, and the
relevant data, must be
repeated for each mobile component specified in my_project.dat.
If my_project.dat does not specify mobile components then this
keyword must be omitted. TEXT is the name of a mobile
component (e.g., O2) and the LO# HI# pair is the lower and upper
limit on the chemical potential (J/mol), activity (non-dimensional),
or fugacity (bar) of the component depending on which property
has been chosen to represent the component in my_project.dat.
The chemical potential/activity/fugacity of a mobile component is
an independent inversion parameter, the amount of the mobile
component in the observed phases is determined as a dependent
parameter.
—
[Optional] Lines 13-18: This block specifies an unmeasured
component, the abundance of which is to be determined by the inversion.
The block is repeated as necessary to specify multiple unmeasured components.
The TEXT following unmeasured_component keyword (Line 13) is the name
of an unmeasured thermodynamic component (e.g., O2, C, CO2, H2O, S).
The unmeasured_component keyword is followed by a block
bounded by the begin_limits and end_limits keywords that
specifies the range of the unmeasured component.
The range of unmeasured components may be specified in two ways, simple or coupled limits. Simple limits are appropriate when the abundance of an unmeasured component, e.g., H2O, is independent of the abundances of other components. Coupled limits are appropriate when the abundance of an unmeasured component, e.g., O2, is dependent on the abundances of other components.
Simple limits are signaled by “Simple” in place of TEXT. LO# and HI# specify the absolute lower- and upper-limits on the molar amount of the unmeasured component.
Note
Amounts specified for simple limits approximate mole fractions
because of the normalization used by MC_fit,
In coupled limits, the range of the unmeasured component is specified in terms of other thermodynamic components. On these lines, TEXT specifies a thermodynamic component and LO# and HI# specifies the molar stoichiometric coefficients of the component in the linear equations for the lower- and upper-limits on the unmeasured component. Refer to the sidebars and/or input examples listed below for further information.
—
[Optional] Lines 20-24: In cases where it is reasonable to assume
that the bulk composition of the putative equilibrium assemblage is known,
this block specifies that composition.
This assumption results in a major simplification of the
inversion process. Thus, when justified, use of this option is recommended as
the first line of attack on the inversion problem.
However, particularly for minor elements and/or phases, insignificant variations
in the bulk composition can have a significant effect on the predicted
mineralogy due to the simplified nature of thermodynamic solution models.
For this reason, inversions that make use of the bulk composition option that
yield few or no acceptable solutions should be repeated without this option.
The begin_bulk and end_bulk keywords (Lines 20 and 24) delimit the block.
Between these keywords, each line specifies the name (TEXT) of a thermodynamic
component,
followed by its molar or mass amount (VAL#) and the
relative or absolute uncertainty (UNC#).
The component names must match those specified in
my_project.dat.
The bulk composition need not be normalized and need not include the unmeasured
thermodynamic components.
The component amounts need not be normalized.
When uncertainties are not available, MC_fit can be instructed to estimate
relative uncertainties using:
where \(X_j\) is the molar fraction of component \(j\), the power-law
exponent approximates the dependence suggested by Nerone et al. ([Nerone2025]),
and the coefficient (0.25) is chosen to yield relative uncertainties of 2% when
\(X_j = 0.5\) and 20% when \(X_j = 0.05\).
The use of Eq 6 is signaled to MC_fit by negative
number in place of UNC# for the relevant component.
When Eq 6 is used, MC_fit automatically converts the
component amounts specified by VAL# to the correct units and
normalization.
—
Lines 27-34: This block specifies the observed phases of the putative relict equilibrium. The entries differ depending on whether the phase is modeled as a pure phase (e.g., quartz) or a solution phase (e.g., garnet).
—
For solution phases:
Line 27, phase_name: TEXT is the name of a solution
model specified in my_project.dat that corresponds to an observed
phase.
[Optional] Line 29, phase_mode: VAL# is the volume fraction of the phase,
and UNC# is the uncertainty in that volume fraction. If modal fractions are
provided, they must be provided for all phases. Modal fractions are automatically
renormalized by MC_fit.
Note
Modal data and bulk composition are redundant information. Thus, nothing is gained by providing both. In cases where the bulk composition is unknown or inconsistent, modal data can be used to infer the bulk composition of the observed assemblage. Bulk compositions output by MC_fit without modal data are, in general, non-unique.
Lines 31-34: This block, bounded by the begin_comp and end_comp keywords,
specifies the composition of the observed phase. Each line within the block
specifies the name (NAME) of a thermodynamic, mobile, or saturated component,
followed by its molar or mass amount (VAL#) and
its relative or absolute uncertainty (UNC#).
Neither unmeasured components nor components that are not present in the observed
phase need be specified. The component names must match those specified in
my_project.dat. The component amounts need not be normalized.
As in the case of bulk compositions (Lines 20-24 above), when
uncertainties are not available, MC_fit can be instructed to
estimate relative uncertainty of a component with Eq 6 by
setting UNC# to a negative number.
Note
Whether components that are observed in a phase, but that cannot be predicted by the corresponding solution model, should be included is a matter of judgment. Because such components cannot be predicted by the solution model, their inclusion has the consequence that they must be accommodated by other phases in the predicted assemblage. In general this leads to higher misfit and, in particular, may increase the misfit of the solution model that ultimately accommodates the component. These effects are proportional to the amount of the component. Arguably, when the uncertainty on the abundance of a component includes zero, it should be omitted. When the amount of a component that cannot be predicted by the corresponding solution model is significant, an alternative to simple omission or inclusion is to replace the component by a geochemical proxy that is predicted by the solution model. For example, BaO is a common component of feldspar, but cannot be predicted by current feldspar solution models. As CaO is a reasonable, and thermodynamically well-predicted, proxy for BaO in feldspar, the observed molar amount of CaO in an observed feldspar composition can be augmented by the molar amount of BaO, which is then omitted from the “observed” composition.
—
For pure phases:
Line 27, phase_name: TEXT is the name of a stoichiometric
compound listed in the thermodynamic data file specified by
my_project.dat that approximates the observed phase.
[Optional] Line 29, phase_mode: is specified as above for solution phases.
Lines 31-34: This block is omitted for pure phases.
Line 36, end_assemblage: signals the end of the inversion problem specification. Comments
may be written after this keyword without using the comment character (|).
Output: my_project.out¶
The file my_project.out records the progress
information and details of the Nelder-Mead optimizations during both the
Central Model Analysis and Perturbation Analysis.
Central Model Output¶
By default, during the Central Model Analysis all successful tries, i.e., Nelder-Mead optimizations that converge to a local minimum, are output. The number of successful tries output can be regulated by the model_output, no_missing_phases, must_fit_all_data and misfit_filter_value options. Additionally, the Nelder-Mead_covariance option can be set to output the correlation-covariance matrix. The output of each successful try is of the form:
1 --------------------------------------------------------------------------------
2 Central model Try 44 converged, 42 successes so far.
3
4 Misfit function evaluations this try => 259
5
6 Misfit this try = 1.011330E-04
7 Best Misfit so far = 1.011330E-04 obtained on Try 44
8
9 Prior probability, PP = 2.253960E-07
10 Bayes score, PP * exp(-Misfit) = 2.253732E-07
11 Best Bayes score so far = 4.084621E-07 obtained on Try 5
12
13 P_bar T_K C_O2 Cpx Gt
14 Initial coordinates: 3086.20 739.602 0.191001 0.825769 0.174231
15 Final coordinates: 10108.3 1002.02 0.141531 0.260092 0.739908
16
17 This model fits all data within observational uncertainty
18
19 Components of the Misfit Score
20
21 Predicted compositions: 1.011326E-04
22 Extraneous predicted phases: 4.134694E-10
23 Missed observed phases: 0.00000
24
25 The following observed phases are predicted:
26
27 vol % Na2O MgO Al2O3 CaO FeO O2 SiO2
28 Augite(G)
29 predicted* 26.177 0.0134 0.1392 0.0072 0.2164 0.1167 0.0063 0.5008
30 observed* 0.0135 0.1401 0.0073 0.2178 0.1174 0.0000 0.5039
31 residual 0.0001 0.0009 0.0001 0.0014 0.0007 -0.0063 0.0031
32 Composition Misfit: 0.396339E-04
33 Gt(W)
34 predicted* 73.823 0.0000 0.0518 0.1239 0.1561 0.2419 0.0078 0.4185
35 observed* 0.0000 0.0522 0.1249 0.1573 0.2438 0.0000 0.4218
36 residual 0.0000 0.0004 0.0010 0.0012 0.0019 -0.0078 0.0033
37 Composition Misfit: 0.614987E-04
38
39 The following predicted phases are not observed:
40
41 vol % Na2O MgO Al2O3 CaO FeO O2 SiO2
42 q 0.000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
43
44 Effective Bulk* Chemical Potentials (J/mol)
45 Na2O 3.507 -777832.
46 MgO 75.016 -641782.
47 Al2O3 94.301 -0.165555E+07
48 CaO 173.054 -745846.
49 FeO 210.941 -320589.
50 O2 7.464 -442446.
51 SiO2 443.182 -891292.
52
53 *normalized molar units
54 --------------------------------------------------------------------------------
—
Lines 6-7: indicate the misfit of the current result and the current best result.
The best_central_model can be identified by scrolling backward from the end of the
my_project.out file to the last central model output to find the number of the try
that produced the best result.
—
Lines 9-11: list the prior probability (Eq 15), the current Bayes score for the current result, and the best Bayes score so far.
—
Line 13: the inversion parameters.
Parameters indicated by C_X are the fractional value
of unmeasured component X, relative to its
lower and upper limits as specified in the my_project.imc file.
Because these limits
are dependent on the effective bulk composition, which itself may be variable, the actual composition
of an unmeasured component is best read from the phase or effective bulk compositions listed later
in the output.
For inversions in which the bulk compositional information is not input, parameters indicated by a
phase name, indicate the fractional amount of the observed composition of the phase used to make up
the bulk composition.
—
Line 14: the starting (the initial guess) coordinate of the Nelder-Mead optimization.
—
Line 15: the final (converged) coordinate of the Nelder-Mead optimization.
—
Line 17: indicates whether the predicted assemblage fits all data within its analytical uncertainty (uncertainty_multiplier). Whether models fit all data within uncertainty is not directly related to the misfit. Indeed, a model with low misfit may not fit all data within uncertainty, and vice versa. Consequently, it is a matter of judgment whether fitting all data within its uncertainty should be used as a criterion for accepting or rejecting models. This criterion is implemented at the user’s discretion in MC_fit_plot.
—
Lines 19-23: the components of the misfit (Eq 8).
—
Lines 28-38: the predicted and observed compositions, the predicted mode, and misfit, of each successfully predicted observed phase. If observed modal data is available, it is also listed. The predicted modes are only unique if modal and/or bulk compositional observational data has been provided. The phase compositions are normalized to 1 mole.
—
Lines 40-43: the predicted compositions and modes of any predicted phases that
are not observed. If no_missing_phases is set to F, the observed composition
and, if available, mode of each observed
phase is that is not predicted is also listed.
—
Lines 45-52: the effective bulk composition and the chemical potential (J/mol) of each component.
The chemical potentials are characteristic of the observed assemblage regardless of whether the bulk
composition is unique; however, the effective bulk composition is only
unique if modal and/or bulk compositional observational data
have been provided.
The effective bulk composition is normalized to 1000 mole.
The composition can be pasted into
my_project.dat and used as input for the Perple_X programs MEEMUM or VERTEX to verify
that the predicted assemblage is
stable at the inversion coordinates and to obtain additional information
about the assemblage.
The chemical potential of O2 (\(\mu_{\text{O2}}\)) is easily converted to oxygen
fugacity (Appendix C).
Perturbation Analysis¶
The output for each successful try during the perturbation analysis is of the form:
1 --------------------------------------------------------------------------------
2 After 101 successful perturbations:
3 P_bar T_K C_O2 Cpx Gt OBJF
4 Central model parameters: 10108.3 1002.02 0.141531 0.260092 0.739908 1.011330E-04
5 Perturbed central model parameter: 9430.11 1001.87 0.134613 0.221303 0.778697 1.591129E-04
6 Mean perturbed parameters: 10370.6 1007.81 0.139926 0.242894 0.757106 1.636601E-04
7 Parameter standard deviation: 440.918 8.32835 4.849425E-03 3.580837E-02 3.580837E-02 8.408905E-05
8 --------------------------------------------------------------------------------
—
Line 2: the total number of perturbations is 1 + number_of_perturbations, because the best_central_model result is included in the perturbation analysis statistics. Results are reported only after two successful perturbations.
—
Line 3: the inversion parameters. The parameters have the same significance, and are listed in the same order, as in the
central model output (line 13, above). By default, OBJF is the misfit. If Bayes is set to T, OBJF is the Bayes score.
—
Line 4: by default, repeats the best_central_model parameters. If perturbation analysis has been run with perturbation_hot_start, this line repeats the user specified central model parameters. These coordinates are the starting point for Nelder-Mead optimization with the perturbed analytical and thermodynamic parameters.
—
Line 5: the final coordinates of the Nelder-Mead optimization for the perturbed dataset.
—
Line 6: the mean of the final coordinates of all successful perturbation up to the current iteration.
—
Line 7: the standard deviation of the inversion parameters for all successful perturbations up to the current iteration.
Output: Console¶
The information written to the my_project.out is echoed to the user console with
some additional progress information controlled by the vital_sign
and print_level options.
Output: my_project_central.pts¶
The my_project_central.pts file contains the results of the central model
analysis tries formatted for automated processing. Data in the file is
of the form:
1 sym fit P_bar T_K C_O2 Cpx Gt OBJF Marker Misfit Bayes Na2O MgO Al2O3 CaO FeO O2 SiO2
2 3 1 10014.4 999.954 0.164934 0.596088 0.403912 1.030612E-04 999.000 1.030612E-04 2.337938E-07 8.036604E-03 0.104553 5.480211E-02 0.193370 0.168471 6.946660E-03 0.470766
3 3 1 5666.01 912.995 0.214321 0.994931 5.068593E-03 2.037970E-03 999.000 2.037970E-03 3.602763E-07 1.341390E-02 0.139616 7.914999E-03 0.217487 0.118058 6.325581E-03 0.503510
4 ...
5 1 1 10108.3 1002.02 0.141531 0.260092 0.739908 1.011326E-04 999.000 0.00000 0.00000 1.348223E-02 0.140062 7.319146E-03 0.217793 0.117418 1.012418E-02 0.503926
6 9 0 1000.00 700.000 0.00000 0.00000 0.00000 1.011326E-04 999.000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
7 10 0 15000.0 1100.00 1.00000 1.00000 1.00000 1.011326E-04 999.000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
The header identifies each column of data. The first column is sym, an integer code with the following significance:
1: best central model inversion coordinate.
3: central model inversion coordinate for an inversion where Bayes has been set to
defaultorF, in which case OBJF is the misfit of the model.6: central model inversion coordinate for an inversion where Bayes has been set to
T, in which case OBJF is the Bayes score of the model.9: lower bounds for the inversion parameter ranges.
10: upper bounds for the inversion parameter ranges.
sym is followed by:
fit, an integer code that is
1if the model fits all data within observational uncertainty (uncertainty_multiplier), and0otherwise.the names of the inversion parameters, as on line 13 of my_project.out.
OBJF, the misfit or Bayes score.
Marker, a numeric marker,
999.000, used to indicate the n+3’th column, where n is the number of inversion parameters.Misfit, the misfit of the model.
Bayes, the Bayes score for the model.
the components that define the effective bulk composition of the model (molar units).
Note
The best central model inversion coordinate (sym = 1) replicates a coordinate (sym = 3 or 6) output earlier in the file.
When using programs other than MC_fit_plot for statistical analysis, care should be taken to ignore the best central model coordinate (sym = 1) and the lower and upper bound coordinates (sym = 9 and 10).
Symbol codes 3 and 6 are mutually exclusive.
Output: my_project_perturbed.pts¶
The my_project_perturbed.pts file contains the results of the perturbation
analysis formatted for automated processing.
The format of the my_project_perturbed.pts file is identical to that of
my_project_central.pts described above except that the mutually exclusive sym codes for the perturbed central
models are 2 (misfit) and 5 (Bayes score).