You have a model with several parameters that is supposed to predict or describe an observation.
An example of a model with three parameters (A, B, F) is “y(t) = A*sin(F*t + B)”. It describes the
relation between y and t.
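To make this concrete, here is a minimal sketch of that model as a Python function (the parameter values in the usage lines are made up for illustration):

```python
import numpy as np

def model(t, A, B, F):
    """Sinusoidal model y(t) = A*sin(F*t + B)."""
    return A * np.sin(F * t + B)

# Example: evaluate the model on a time grid with some trial parameters
t = np.linspace(0, 10, 100)
y = model(t, A=2.0, B=0.5, F=1.5)
```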
Assume now that you can specify how likely it is that the actual observed data would arise under this model.
This is called the likelihood function, p(D|M, I). We would now need to evaluate this function everywhere
in parameter space, which is simply infeasible. What we need instead is a method that evaluates the function
more densely where its value is high. This is where Markov Chain Monte Carlo comes in (e.g. the Metropolis algorithm with a proposal distribution).
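Below is a minimal sketch of the Metropolis algorithm applied to the sinusoidal model, assuming Gaussian measurement noise with a known standard deviation and an implicit flat prior (so the posterior is proportional to the likelihood); the data, noise level, and step size are placeholder choices:

```python
import numpy as np

def log_likelihood(params, t, y_obs, sigma=0.5):
    """Gaussian log-likelihood log p(D|M): compares model prediction to data."""
    A, B, F = params
    y_model = A * np.sin(F * t + B)
    return -0.5 * np.sum(((y_obs - y_model) / sigma) ** 2)

def metropolis(log_like, start, n_steps=10000, step_size=0.1, rng=None):
    """Metropolis sampler with a symmetric Gaussian proposal distribution."""
    rng = rng or np.random.default_rng()
    current = np.asarray(start, dtype=float)
    current_ll = log_like(current)
    chain = np.empty((n_steps, current.size))
    for i in range(n_steps):
        # Propose a new point near the current one
        proposal = current + rng.normal(scale=step_size, size=current.size)
        proposal_ll = log_like(proposal)
        # Accept with probability min(1, p(proposal)/p(current))
        if np.log(rng.random()) < proposal_ll - current_ll:
            current, current_ll = proposal, proposal_ll
        chain[i] = current
    return chain

# Usage with synthetic data (true parameters chosen for illustration)
rng = np.random.default_rng(42)
t = np.linspace(0, 10, 100)
y_obs = 2.0 * np.sin(1.5 * t + 0.5) + rng.normal(scale=0.5, size=t.size)
chain = metropolis(lambda p: log_likelihood(p, t, y_obs),
                   start=[1.0, 0.0, 1.0], n_steps=20000)
```

Because acceptance depends only on the likelihood ratio, the chain spends more time where the likelihood is high, which is exactly the denser sampling we wanted.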
With MCMC and Bayes' theorem, you can not only tell what the best-fitting parameter values are; you also get
the probability of each possible parameter value as a marginal distribution. And you can compare the
likelihood of one model to another (in the example above, we could add a constant offset “+ C”).
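Continuing from the sampler sketch above, the marginal distributions fall straight out of the chain: after discarding an initial burn-in (the length chosen here is an arbitrary choice), the samples of each parameter, taken on their own, trace out that parameter's marginal distribution:

```python
# Each column of the chain holds the marginal samples of one parameter.
burn_in = 5000  # discard early samples taken before the chain converged
samples = chain[burn_in:]
for name, column in zip(["A", "B", "F"], samples.T):
    lo, hi = np.percentile(column, [16, 84])
    print(f"{name}: median={np.median(column):.3f}, "
          f"68% interval=[{lo:.3f}, {hi:.3f}]")
```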
Parallel Tempering, calibrations, different proposal distributions, and adaptive MCMC are enhancements that improve convergence; a sketch of one of them follows.
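As one example of such an enhancement, here is a sketch of a simple adaptive scheme: during burn-in, the proposal step size is rescaled to steer the acceptance rate toward a target (the ~23% target and the adaptation interval are conventional rules of thumb, not prescriptions):

```python
import numpy as np

def adaptive_metropolis(log_like, start, n_steps=10000, adapt_until=5000,
                        target=0.234, rng=None):
    """Metropolis with step-size adaptation during burn-in only.

    Adapting the proposal forever would break detailed balance, so the
    step size is frozen after `adapt_until` iterations.
    """
    rng = rng or np.random.default_rng()
    current = np.asarray(start, dtype=float)
    current_ll = log_like(current)
    step_size, accepted = 0.1, 0
    chain = np.empty((n_steps, current.size))
    for i in range(n_steps):
        proposal = current + rng.normal(scale=step_size, size=current.size)
        proposal_ll = log_like(proposal)
        if np.log(rng.random()) < proposal_ll - current_ll:
            current, current_ll = proposal, proposal_ll
            accepted += 1
        chain[i] = current
        # Every 100 steps during burn-in, nudge the step size so the
        # acceptance rate moves toward the target (~23%).
        if i < adapt_until and (i + 1) % 100 == 0:
            step_size *= np.exp(accepted / 100 - target)
            accepted = 0
    return chain
```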