# MCMC Tutorial¶

This tutorial describes the available options when running an MCMC with `MC3`

.
As said before, the MCMC can be run from the shell prompt or through a function call in the Python interpreter.

## Argument Inputs¶

When running from the shell, the arguments can be input as command-line arguments. To see all the available options, run:

```
./mc3.py --help
```

When running from a Python interactive session, the arguments can be input as function arguments. To see the available options, run:

```
import MCcubed as mc3
help(mc3.mcmc)
```

Additionally (and strongly recommended), whether you are running the MCMC from the shell or from the interpreter, the arguments can be input through a configuration file.

## Configuration Files¶

The `MC3`

configuration file follows the ConfigParser format.
The following code block shows an example for an MC3 configuration file:

```
# Comment lines (like this one) are allowed and ignored
# Strings don't need quotation marks
[MCMC]
# DEMC general options:
nsamples = 1e5
burnin = 1000
nchains = 7
walk = snooker
# Fitting function:
func = quad quadratic ../MCcubed/examples/models
# Model inputs:
params = params.dat
indparams = indp.npz
# The data and uncertainties:
data = data.npz
```

## MCMC Run¶

This example describes the basic MCMC argument configuration. The following sub-sections make up a script meant to be run from the Python interpreter. The complete example script is located at tutorial01.

### Input Data¶

The `data`

argument (required) defines the dataset to be fitted.
This argument can be either a 1D float ndarray or the filename (a string)
where the data array is located.

The `uncert`

argument (required) defines the \(1\sigma\) uncertainties
of the `data`

array.
This argument can be either a 1D float ndarray (same length of `data`

) or the filename where the data uncertainties are located.

```
# Create a synthetic dataset using a quadratic polynomial curve:
import sys
import numpy as np
sys.path.append("../MCcubed/examples/models/")
from quadratic import quad
x = np.linspace(0, 10, 1000) # Independent model variable
p0 = [3, -2.4, 0.5] # True-underlying model parameters
y = quad(p0, x) # Noiseless model
uncert = np.sqrt(np.abs(y)) # Data points uncertainty
error = np.random.normal(0, uncert) # Noise for the data
data = y + error # Noisy data set
```

Note

See the Data Section below to find out how to set `data`

and `uncert`

as a filename.

### Modeling Function¶

The `func`

argument (required) defines the parameterized modeling function.
The user can set `func`

either as a callable, e.g.:

```
# Define the modeling function as a callable:
sys.path.append("../MCcubed/examples/models/")
from quadratic import quad
func = quad
```

or as a tuple of strings pointing to the modeling function, e.g.:

```
# A three-elements tuple indicates the function name, the module
# name (without the '.py' extension), and the path to the module.
func = ("quad", "quadratic", "../MCcubed/examples/models/")
# Alternatively, if the module is already within the scope of the
# Python path, the user can set func with a two-elements tuple:
sys.path.append("../MCcubed/examples/models/")
func = ("quad", "quadratic")
```

Note

Important!

The only requirement for the modeling function is that its arguments follow
the same structure of the callable in `scipy.optimize.leastsq`

, i.e.,
the first argument contains the list of fitting parameters.

The `indparams`

argument (optional) packs any additional argument that the
modeling function may require:

```
# indparams contains additional arguments of func (if necessary). Each
# additional argument is an item in the indparams tuple:
indparams = [x]
```

Note

Even if there is only one additional argument to `func`

, indparams must
be defined as a tuple (as in the example above). Eventually, the modeling
function could be called with the following command:

`model = func(params, *indparams)`

### Fitting Parameters¶

The `params`

argument (required) contains the initial-guess values for the model fitting parameters. The `params`

argument must be a 1D float ndarray.

```
# Array of initial-guess values of fitting parameters:
params = np.array([ 10.0, -2.0, 0.1])
```

The `pmin`

and `pmax`

arguments (optional) set the lower and upper boundaries explored by the MCMC for each fitting parameter.

```
# Lower and upper boundaries for the MCMC exploration:
pmin = np.array([-10.0, -20.0, -10.0])
pmax = np.array([ 40.0, 20.0, 10.0])
```

If a proposed step falls outside the set boundaries,
that iteration is automatically rejected.
The default values for each element of `pmin`

and `pmax`

are
`-np.inf`

and `+np.inf`

, respectively.
The `pmin`

and `pmax`

arrays must have the same size of `params`

.

### Parameter Priors¶

The `prior`

, `priorlow`

, and `priorup`

arguments (optional) set the
prior probability distributions of the fitting parameters.
Each of these arguments is a 1D float ndarray.

```
# priorlow defines whether to use uniform non-informative (priorlow = 0.0),
# Jeffreys non-informative (priorlow < 0.0), or Gaussian prior (priorlow > 0.0).
# prior and priorup are irrelevant if priorlow <= 0 (for a given parameter)
prior = np.array([ 0.0, 0.0, 0.0])
priorlow = np.array([ 0.0, 0.0, 0.0])
priorup = np.array([ 0.0, 0.0, 0.0])
```

MC3 supports three types of priors.
If a value of `priorlow`

is \(0.0\) (default) for a given parameter,
the MCMC will apply a uniform non-informative prior:

Note

This is appropriate when there is no prior knowledge of the value of \(\theta\).

If `priorlow`

is less than \(0.0\) for a given parameter,
the MCMC will apply a Jeffreys non-informative prior
(uniform probability per order of magnitude):

Note

This is valid only when the parameter takes positive values. This is a more appropriate prior than a uniform prior when \(\theta\) can take values over several orders of magnitude. For more information, see [Gregory2005], Sec. 3.7.1.

Note

Practical note!

In practice, I have seen better results when one fits \(\log(\theta)\) rather than \(\theta\) with a Jeffreys prior.

Lastly, if `priorlow`

is greater than \(0.0\) for a given parameter,
the MCMC will apply a Gaussian informative prior:

where `prior`

sets the prior value \(\theta_{p}\), and
`priorlow`

and `priorup`

set the lower and upper \(1\sigma\) prior uncertainties,
\(\sigma_{p}\), of the prior (depending if the proposed value
\(\theta\) is lower or higher than \(\theta_{p}\)).

Note

Note that, even when the parameter boundaries are not known or when the parameter is unbound, this prior is suitable for use in the MCMC sampling, since the proposed and current state priors divide out in the Metropolis ratio.

### Parameter Names¶

The `pnames`

argument (optional) define the names of the model
parametes to be shown in the scren output and figure labels. In
figures, the names can use LaTeX syntax. The screen output will
display up to 11 characters. Thus, the user can define the
`figpnames`

argument (optional), display the appropriate syntax for
screen output and figures, for example:

```
pnames = ["log(alpha)", "beta", "Teff"]
figpnames = [r"$\log(\alpha)$", r"$\beta$", r"$T_{\rm eff}$"]
```

If `figpnames`

is `None`

, it defaults to `pnames`

. If `pnames`

is `None`

, it defaults to `figpnames`

. If both arguments are
`None`

, they default to a generic `[Param 1, Param 2, ...]`

list.

### Random Walk¶

The `walk`

argument (optional) defines which random-walk algorithm
for the MCMC:

```
# Choose between: 'snooker', 'demc', or 'mrw':
walk = 'snooker'
```

The standard Differential-Evolution MCMC algorithm (`walk = 'demc'`

,
[terBraak2006]) proposes for each chain \(i\) in state
\(\mathbf{x}_{i}\):

where \(\mathbf{x}_{R1}\) and \(\mathbf{x}_{R2}\) are randomly selected without replacement from the population of current states without \(\mathbf{x}_{i}\). This implementation adopts \(\gamma=f_{\gamma} 2.38/\sqrt{2 N_{\rm free}}\), and \(\mathbf{e}\sim N(0, f_{e}\,{\rm stepsize})\), with \(N_{\rm free}\) the number of free parameters. The scaling factors are defaulted to \(f_{\gamma}=1.0\) and \(f_{e}=0.0\) (see Fine-tuning).

If `walk = 'snooker'`

(default, recommended), `MC3`

will use the
DEMC-z algorithm with snooker propsals (see [BraakVrugt2008]).

If `walk = 'mrw'`

, `MC3`

will use the classical Metropolis-Hastings
algorithm with Gaussian proposal distributions. I.e., in each
iteration and for each parameter, \(\theta\), the MCMC will propose
jumps, drawn from
Gaussian distributions centered at the current value, \(\theta_0\), with
a standard deviation, \(\sigma\), given by the values in the `stepsize`

argument:

Note

For `walk=snooker`

, an MCMC works well from 3 chains. For
`walk=demc`

, [terBraak2006] suggest using \(2*d\) chains,
with \(d\) the number of free parameters.

I recommend any of the `snooker`

or `demc`

algorithms, as they are more efficient than most others MCMC random
walks. From experience, when deciding between these two, consider
that when the initial guess lays far from the lowest chi-square
region, `snooker`

seems to produce lower acceptance rates than ideal
(which is solvable setting `leastsq=True`

). On the other hand,
`demc`

is limited to a high number of chains when there is a high
number of free parameters.

### MCMC Config¶

The following arguments set the MCMC chains configuration:

```
nsamples = 1e5 # Number of MCMC samples to compute
nchains = 7 # Number of parallel chains
nproc = 7 # Number of CPUs to use for chains (default: nchains)
burnin = 1000 # Number of burned-in samples per chain
thinning = 1 # Thinning factor for outputs
# Distribution for the initial samples:
kickoff = 'normal' # Choose between: 'normal' or 'uniform'
hsize = 10 # Number of initial samples per chain
```

The `nsamples`

argument (optional, float, default=1e5) sets the
total number of samples to compute. The approximate number of
iterations run for each chain will be `nsamples/nchains`

.

The `nchains`

argument (optional, integer, default=7) sets the number
of parallel chains to use. The number of iterations run for each chain
will be approximately `nsamples/nchains`

.

`MC3`

runs in multiple processors through the `mutiprocessing`

package. The `nproc`

argument (optional, integer,
default= `nchains`

) sets the number CPUs to use for the chains.
Additionaly, the central MCMC hub will use one extra CPU. Thus, the
total number of CPUs used is `nchains + 1`

.

Note

If `nproc+1`

is greater than the number of available CPUs
in the machine (`nCPU`

), `MC3`

will set ```
nproc =
nCPU-1
```

. To keep a good balance, I recommend setting
`nchains`

equal to a multiple of `nproc`

.

The `burnin`

argument (optional, integer, default=0) sets the number
of burned-in (removed) iterations at the beginning of each chain.

The `thinning`

argument (optional, integer, default=1) sets the chains
thinning factor (discarding all but every `thinning`

-th sample).
To reduce the memory usage, when requested, only the thinned samples
are stored (and returned).

Note

Thinning is often unnecessary for a DE run, since this algorithm reduces significatively the sampling autocorrelation.

To set the starting point of the MCMC chains, `MC3`

draws samples either
from a normal (default) or uniform distribution (determined by
the `kickoff`

argument). The mean and standard deviation of the normal
distribution are set by the `params`

and `stepsize`

arguments,
respectively.
The uniform distribution is constrained between the `pmin`

and `pmax`

boundaries.
The `hsize`

argument determines the size of the starting sample.
All draws from the initial sample are discarded from the returned
posterior distribution.

### Optimization¶

The `leastsq`

argument (optional, boolean, default=False) is a flag that
indicates `MC3`

to run a least-squares optimization before running the MCMC.
`MC3`

implements the Levenberg-Marquardt algorithm (`lm=True`

) via
`scipy.optimize.leastsq`

or Trust Region Reflective (`lm=False`

) via
`scipy.optimize.least_squares`

.

Note

The parameter boundaries (for TRF only, see Optimization Tutorial), fixed and shared-values, and priors will apply for the minimization.

The `chisqscale`

argument (optional, boolean, default=False) is a flag that
indicates `MC3`

to scale the data uncertainties to force a reduced
\(\chi^{2}\) equal to \(1.0\). The scaling applies by multiplying all
uncertainties by a common scale factor.

```
leastsq = True # Least-squares minimization prior to the MCMC
lm = True # Choose Levenberg-Marquardt (True) or TRF algorithm (False)
chisqscale = False # Scale the data uncertainties such that red. chisq = 1
```

### Convergence¶

The `grtest`

argument (optional, boolean, default=False) is a flag that
indicates MC3 to run the Gelman-Rubin convergence test for the MCMC sample of
fitting parameters.
Values larger than 1.01 are indicative of non-convergence.
See [GelmanRubin1992] for further information.

Additionally, the `grbreak`

argument (optional, boolean,
default=0.0) sets a convergence threshold to stop an MCMC when GR
drops below `grbreak`

. Reasonable values seem to be `grbreak`

~1.001–1.005. The default behavior is not to break (`grbreak=0.0`

).

Lastly, the `grnmin`

argument (optional, integer or float,
default=0.5) sets a minimum number of valid samples (after burning and
thinning) required for `grbreak`

. If `grnmin`

is an integer,
require at least `grnmin`

samples to break out of the MCMC. If
`grnmin`

is a float (in the range 0.0–1.0), require at least
`grnmin`

times the maximum possible number of valid samples to break
out of the MCMC.

```
grtest = True # Calculate the GR convergence test
grbreak = 0.0 # GR threshold to stop the MCMC run
grnmin = 0.5 # Minimum fraction or number of samples before grbreak
```

Note

The Gelman-Rubin test is computed every 10% of the MCMC exploration.

### Wavelet-Likelihood MCMC¶

The `wlike`

argument (optional, boolean, default=False) allows MC3 to
implement the Wavelet-based method to estimate time-correlated noise.
When using this method, the used must append the three additional fitting
parameters (\(\gamma, \sigma_{r}, \sigma_{w}\)) from Carter & Winn (2009)
to the end of the `params`

array. Likewise, add the correspoding values
to the `pmin`

, `pmax`

, `stepsize`

, `prior`

, `priorlow`

,
and `priorup`

arrays.
For further information see [CarterWinn2009].

```
wlike = False # Use Carter & Winn's Wavelet-likelihood method.
```

### Fine-tuning¶

The \(f_{\gamma}\) and \(f_{e}\) factors scale the DEMC proposal distributions.

```
fgamma = 1.0 # Scale factor for DEMC's gamma jump.
fepsilon = 0.0 # Jump scale factor for DEMC's "e" distribution
```

The default \(f_{\gamma}=1.0\) value is set such that the MCMC acceptance rate approaches 25-40%. Therefore, most of the time, the user won’t need to modify this. Only if the acceptance rate is very low, we recommend to set \(f_{\gamma}<1.0\). The \(f_{e}\) factor sets the jump scale for the \(\mathbf e\) distribution, which has to have a small variance compared to the posterior. For further information see [terBraak2006].

### File Outputs¶

The following arguments set the output files produced by MC3:

```
log = 'MCMC.log' # Save the MCMC screen outputs to file
savefile = 'MCMC_sample.npz' # Save the MCMC parameters sample to file
plots = True # Generate best-fit, trace, and posterior plots
rms = False # Compute and plot the time-averaging test
full_output = False # Return the full posterior sample
chireturn = False # Return chi-square statistics
```

The `log`

argument (optional, string, default = `None`

)
sets the file name where to store `MC3`

‘s screen output.

The `savefile`

arguments (optional, string, default = `None`

) set
the file names where to store the MCMC outputs into a `.npz`

file,
with keywords `bestp`

, `CRlo`

, `CRhi`

, `stdp`

, `meanp`

,
`Z`

, `Zchain`

, and `Zchisq`

, `bestchisq`

, `redchisq`

,
`chifactor`

, `BIC`

, and standard deviation of the residuals `sdr`

.
The files can be read with the
`numpy.load()`

function. See Returned Values and the description of
`chireturn`

below for details on the output values.

The `plots`

argument (optional, boolean, default = `False`

) is a
flag that indicates MC3 to generate and store the data (along with the
best-fitting model) plot, the MCMC-chain trace plot for each
parameter, and the marginalized and pair-wise posterior plots.

The `rms`

argument (optional, boolean, default = `False`

) is a
flag that indicates `MC3`

to compute the time-averaging test for
time-correlated noise and generate a rms-vs-binsize plot (see
[Winn2008]).

The `full_output`

argument (optional, bool, default = `False`

)
flags the code to return the full posterior sampling array (`Z`

),
including the initial and burnin samples. The posterior will still be
thinned though.

If the `chireturn`

argument (optional, bool, default = `False`

) is
`True`

, `MC3`

will return an additional tuple containing the
chi-square stats: lowest \(\chi^{2}\) (`bestchisq`

),
\(\chi^{2}_{\rm red}\) (`redchisq`

), scaling factor to enforce
\(\chi^{2}_{\rm red} = 1\) (`chifactor`

), and the Bayesian
Information Criterion BIC (`BIC`

).

### Returned Values¶

When run from a pyhton interactive session, `MC3`

will return a
tuple with six elements (seven if `chireturn=True`

, see above):

`bestp`

: a 1D array with the best-fitting parameters (including fixed and shared parameters).`CRlo`

: a 1D array with the lower boundary of the marginal 68%-highest posterior density (the credible region) for each parameter, with respect to`bestp`

.`CRhi`

:a 1D array with the upper boundary of the marginal 68%-highest posterior density for each parameter, with respect to`bestp`

.`stdp`

: a 1D array with the standard deviation of the marginal posterior for each parameter (including that of fixed and shared parameters).`posterior`

: a 2D array containing the burned-in, thinned MCMC sample of the parameters posterior distribution (with dimensions [nsamples, nfree], excluding fixed and shared parameters).`Zchain`

: a 1D array with the indices of the chains for each sample in`posterior`

.

```
# Run the MCMC:
bestp, CRlo, CRhi, stdp, posterior, Zchain = mc3.mcmc(data=data,
uncert=uncert, func=func, indparams=indparams,
params=params, pmin=pmin, pmax=pmax, stepsize=stepsize,
prior=prior, priorlow=priorlow, priorup=priorup,
walk=walk, nsamples=nsamples, nchains=nchains,
nproc=nproc, burnin=burnin, thinning=thinning,
leastsq=leastsq, lm=lm, chisqscale=chisqscale,
hsize=hsize, kickoff=kickoff,
grtest=grtest, grbreak=grbreak, grnmin=grnmin,
wlike=wlike, log=log,
plots=plots, savefile=savefile, rms=rms, full_output=full_output)
```

Note

Note that `bestp`

, `CRlo`

, `CRhi`

, and `stdp`

include the values for all model parameters, including fixed and
shared parameters, whereas `posterior`

includes only
the free parameters. Be careful with the dimesions.

## Inputs from Files¶

The `data`

, `uncert`

, `indparams`

, `params`

, `pmin`

, `pmax`

,
`stepsize`

, `prior`

, `priorlow`

, and `priorup`

input arrays
can be optionally be given as input file.
Furthermore, multiple input arguments can be combined into a single file.

### Data¶

The `data`

, `uncert`

, and `indparams`

inputs can be provided as
binary `numpy`

`.npz`

files.
`data`

and `uncert`

can be stored together into a single file.
An `indparams`

input file contain the list of independent variables
(must be a list, even if there is a single independent variable).

The `utils`

sub-package of `MC3`

provide utility functions to
save and load these files.
The `preamble.py`

file in
demo02
gives an example of how to create `data`

and `indparams`

input files:

```
# Import the necessary modules:
import sys
import numpy as np
# Import the modules from the MCcubed package:
sys.path.append("../MCcubed/")
import MCcubed as mc3
sys.path.append("../MCcubed/examples/models/")
from quadratic import quad
# Create a synthetic dataset using a quadratic polynomial curve:
x = np.linspace(0.0, 10, 1000) # Independent model variable
p0 = [3, -2.4, 0.5] # True-underlying model parameters
y = quad(p0, x) # Noiseless model
uncert = np.sqrt(np.abs(y)) # Data points uncertainty
error = np.random.normal(0, uncert) # Noise for the data
data = y + error # Noisy data set
# data.npz contains the data and uncertainty arrays:
mc3.utils.savebin([data, uncert], 'data.npz')
# indp.npz contains a list of variables:
mc3.utils.savebin([x], 'indp.npz')
```

### Fitting Parameters¶

The `params`

, `pmin`

, `pmax`

, `stepsize`

,
`prior`

, `priorlow`

, and `priorup`

inputs
can be provided as plain ASCII files.
For simplycity all of these input arguments can be combined into
a single file.

In the `params`

file, each line correspond to one model
parameter, whereas each column correspond to one of the input array arguments.
This input file can hold as few or as many of these argument arrays,
as long as they are provided in that exact order.
Empty or comment lines are allowed (and ignored by the reader).
A valid params file look like this:

```
# params pmin pmax stepsize
10 -10 40 1.0
-2.0 -20 20 0.5
0.1 -10 10 0.1
```

Alternatively, the `utils`

sub-package of `MC3`

provide utility
functions to save and load these files:

```
params = [ 10, -2.0, 0.1]
pmin = [-10, -20, -10]
pmax = [ 40, 20, 10]
stepsize = [ 1, 0.5, 0.1]
# Store ASCII arrays:
mc3.utils.saveascii([params, pmin, pmax, stepsize], 'params.txt')
```

Then, to run the MCMC simply provide the input file names to the `MC3`

routine:

```
# To run MCMC, set the arguments to the file names:
data = 'data.npz'
indparams = 'indp.npz'
params = 'params.txt'
# Run MCMC:
bestp, CRlo, CRhi, stdp, posterior, Zchain = mc3.mcmc(data=data,
func=func, indparams=indparams, params=params,
walk=walk, nsamples=nsamples, nchains=nchains,
nproc=nproc, burnin=burnin, thinning=thinning,
leastsq=leastsq, lm=lm, chisqscale=chisqscale,
hsize=hsize, kickoff=kickoff,
grtest=grtest, grbreak=grbreak, grnmin=grnmin,
wlike=wlike, log=log,
plots=plots, savefile=savefile, rms=rms, full_output=full_output)
```

## References¶

[CarterWinn2009] | Carter & Winn (2009): Parameter Estimation from Time-series Data with Correlated Errors: A Wavelet-based Method and its Application to Transit Light Curves |

[GelmanRubin1992] | Gelman & Rubin (1992): Inference from Iterative Simulation Using Multiple Sequences |

[Gregory2005] | Gregory (2005): Bayesian Logical Data Analysis for the Physical Sciences |

[terBraak2006] | (1, 2, 3) ter Braak (2006): A Markov Chain Monte Carlo version of the genetic algorithm Differential Evolution |

[BraakVrugt2008] | ter Braak & Vrugt (2008): Differential Evolution Markov Chain with snooker updater and fewer chains |

[Winn2008] | Winn et al. (2008): The Transit Light Curve Project. IX. Evidence for a Smaller Radius of the Exoplanet XO-3b |