gpflow.models#

Sub-package containing the GPflow model implementations.

All our models derive from GPModel, which itself derives from BayesianModel.

For an overview of the implemented models, see What models are implemented?; for a basic example of how to use one, see Basic Usage with GPR.
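As a quick orientation, the snippet below is a minimal, hedged sketch of that basic workflow with a GPR model (construction, optimisation, prediction); the synthetic data and the choice of optimiser are illustrative assumptions, not prescriptions from this reference.

```python
import numpy as np
import gpflow

# Toy 1D regression data (placeholder values; float64, as GPflow expects by default).
X = np.random.rand(50, 1)
Y = np.sin(3 * X) + 0.1 * np.random.randn(50, 1)

# Build a GPR model from data, a kernel and (optionally) a mean function.
model = gpflow.models.GPR(data=(X, Y), kernel=gpflow.kernels.SquaredExponential())

# Fit the hyperparameters by maximising the log marginal likelihood,
# i.e. by minimising model.training_loss.
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

# Predict the latent function and the noisy observations at new inputs.
Xnew = np.linspace(0.0, 1.0, 100)[:, None]
f_mean, f_var = model.predict_f(Xnew)
y_mean, y_var = model.predict_y(Xnew)
```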

Modules#

Classes#

gpflow.models.BayesianGPLVM#

class gpflow.models.BayesianGPLVM(data, X_data_mean, X_data_var, kernel, num_inducing_variables=None, inducing_variable=None, X_prior_mean=None, X_prior_var=None)[source]#

Bases: GPModel, InternalDataTrainingLossMixin

Parameters:
  • data (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

  • X_data_mean (Tensor) –

  • X_data_var (Tensor) –

  • kernel (Kernel) –

  • num_inducing_variables (Optional[int]) –

  • inducing_variable (Union[InducingVariables, Tensor, ndarray[Any, Any], None]) –

  • X_prior_mean (Union[ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • X_prior_var (Union[ndarray[Any, Any], Tensor, Variable, Parameter, None]) –
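A hedged construction sketch follows; the latent dimensionality, the random initialisation of the variational parameters and the number of inducing variables are illustrative assumptions (in practice X_data_mean is often initialised from PCA).

```python
import numpy as np
import tensorflow as tf
import gpflow

Y = np.random.randn(100, 5)          # observed data, shape [num_data, output_dim]
latent_dim = 2

X_mean_init = tf.constant(np.random.randn(100, latent_dim))   # means of q(X)
X_var_init = tf.ones((100, latent_dim), dtype=tf.float64)      # variances of q(X)

model = gpflow.models.BayesianGPLVM(
    data=Y,
    X_data_mean=X_mean_init,
    X_data_var=X_var_init,
    kernel=gpflow.kernels.SquaredExponential(lengthscales=np.ones(latent_dim)),
    num_inducing_variables=20,
)

# The ELBO is the training objective; training_loss is its negative.
elbo = model.elbo()
```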

elbo()[source]#

Construct a TensorFlow function to compute the bound on the marginal likelihood.

Return type:

Tensor

Returns:

  • return has shape [].

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Compute the mean and variance of the latent function at some new points. Note that this is very similar to the SGPR prediction, for which there are notes in the SGPR notebook.

Note: This model does not allow full output covariances.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

    points at which to predict

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

predict_log_density(data, full_cov=False, full_output_cov=False)[source]#

Compute the log of the probability density of the data at the new data points.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

    • data[0] has shape [batch…, N, D].

    • data[1] has shape [batch…, N, P].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tensor

Returns:

  • return has shape [batch…, N].

gpflow.models.BayesianModel#

class gpflow.models.BayesianModel(name=None)[source]#

Bases: Module

Bayesian model.

This is a base class for all GPflow models. See also GPModel.

A Bayesian model provides methods for computing prior and posterior densities and a maximum likelihood objective, allowing you to use generic code to optimise model parameters to fit data.

Most Bayesian models are expected to hold their data internally, but the methods take *args and **kwargs, allowing you to write implementations that take data as parameters. See also gpflow.models.training_mixins.InternalDataTrainingLossMixin, gpflow.models.training_mixins.ExternalDataTrainingLossMixin, and gpflow.models.training_loss().
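For example, this makes it possible to write fitting code that is generic over model classes. A hedged sketch using the module-level training_loss_closure helper (the optimiser choice is an assumption):

```python
import gpflow

def fit(model: gpflow.models.BayesianModel, data) -> None:
    # The module-level helper dispatches on whether the model holds its data
    # internally or expects it to be passed in externally.
    closure = gpflow.models.training_loss_closure(model, data)
    gpflow.optimizers.Scipy().minimize(closure, model.trainable_variables)
```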

log_posterior_density(*args, **kwargs)[source]#

This may be the posterior with respect to the hyperparameters (e.g. for GPR) or the posterior with respect to the function (e.g. for GPMC and SGPMC). It assumes that maximum_log_likelihood_objective() is defined sensibly.

Return type:

Tensor

Returns:

  • return has shape [].

Parameters:
  • args (Any) –

  • kwargs (Any) –

log_prior_density()[source]#

Sum of the log prior probability densities of all (constrained) variables in this model.

Return type:

Tensor

Returns:

  • return has shape [].

abstract maximum_log_likelihood_objective(*args, **kwargs)[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

Parameters:
  • args (Any) –

  • kwargs (Any) –

gpflow.models.CGLB#

class gpflow.models.CGLB(data, *args, cg_tolerance=1.0, max_cg_iters=100, restart_cg_iters=40, v_grad_optimization=False, **kwargs)[source]#

Bases: SGPR

Conjugate Gradient Lower Bound.

The key reference is Artemev et al. [ABvdW21].

Parameters:
  • cg_tolerance (float) – Determines the accuracy to which conjugate gradient is run when evaluating the ELBO. Running more iterations of CG would increase the ELBO by at most cg_tolerance.

  • max_cg_iters (int) – Maximum number of iterations of CG to run per evaluation of the ELBO (or mean prediction).

  • restart_cg_iters (int) – How frequently to restart the CG iteration. Can be useful to avoid build up of numerical errors when many steps of CG are run.

  • v_grad_optimization (bool) – If False, every evaluation of the ELBO runs CG to select a new auxiliary vector v. If True, no CG is run when evaluating the ELBO, but gradients with respect to v are tracked so that it can be optimised jointly with the other parameters.

  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • args (Any) –

  • kwargs (Any) –
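A hedged construction sketch: CGLB inherits its data, kernel and inducing-variable handling from SGPR, with the CG-specific settings passed as keyword arguments (the data and the settings below are illustrative).

```python
import numpy as np
import gpflow

X = np.random.rand(500, 1)
Y = np.sin(10 * X) + 0.1 * np.random.randn(500, 1)
Z = X[::10].copy()                      # inducing inputs (placeholder choice)

model = gpflow.models.CGLB(
    (X, Y),
    kernel=gpflow.kernels.Matern32(),
    inducing_variable=Z,
    cg_tolerance=1.0,
    max_cg_iters=100,
    restart_cg_iters=40,
)

# Hyperparameters are fitted by maximising the conjugate-gradient lower bound.
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)
```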

logdet_term(common)[source]#

Compute a lower bound on \(-0.5 * \log |K + σ²I|\) based on a low-rank approximation to K.

\[\log |K + σ²I| <= \log |Q + σ²I| + n * \log(1 + \textrm{tr}(K - Q)/(σ²n)).\]

This bound is at least as tight as

\[\log |K + σ²I| <= \log |Q + σ²I| + \textrm{tr}(K - Q)/σ²,\]

which appears in SGPR.

Return type:

Tensor

Returns:

  • return has shape [].

Parameters:

common (CommonTensors) –

predict_f(Xnew, full_cov=False, full_output_cov=False, cg_tolerance=0.001)[source]#

The posterior mean of the CGLB model is computed from the auxiliary vector \(v\) and the CG residual \(r = y - K v\).

Note that when \(v = 0\) this agrees with the SGPR mean, while if \(v = K⁻¹ y\) then \(r = 0\) and the exact GP mean is recovered.

Parameters:
  • cg_tolerance (Optional[float]) – float or None: If None, the cached value of \(v\) is used. If float, conjugate gradient is run until \(rᵀQ⁻¹r < ϵ\).

  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

predict_log_density(data, full_cov=False, full_output_cov=False, cg_tolerance=0.001)[source]#

Compute the log density of the data at the new data points.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

    • data[0] has shape [batch…, N, D].

    • data[1] has shape [batch…, N, P].

  • full_cov (bool) –

  • full_output_cov (bool) –

  • cg_tolerance (Optional[float]) –

Return type:

Tensor

Returns:

  • return has shape [batch…, N].

predict_y(Xnew, full_cov=False, full_output_cov=False, cg_tolerance=0.001)[source]#

Compute the mean and variance of the held-out data at the input points.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

  • cg_tolerance (Optional[float]) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

quad_term(common)[source]#

Computes a lower bound on the quadratic term in the log marginal likelihood of conjugate GPR. The bound is based on an auxiliary vector, \(v\). For \(Q ≺ K\) and \(r = y - Kv\):

\[-0.5 * (rᵀQ⁻¹r + 2yᵀv - vᵀ K v ) <= -0.5 * yᵀK⁻¹y <= -0.5 * (2yᵀv - vᵀKv).\]

Equality holds if \(r=0\), i.e. \(v = K⁻¹y\).

If self.aux_vec is trainable, gradients are computed with respect to \(v\) as well and \(v\) can be optimized using gradient based methods.

Otherwise, \(v\) is updated with the method of conjugate gradients (CG). CG is run until \(0.5 * rᵀQ⁻¹r <= ϵ\), which ensures that the maximum bias due to this term is not more than \(ϵ\). The \(ϵ\) is the CG tolerance.

Return type:

Tensor

Returns:

  • return has shape [].

Parameters:

common (CommonTensors) –

gpflow.models.ExternalDataTrainingLossMixin#

class gpflow.models.ExternalDataTrainingLossMixin[source]#

Bases: object

Mixin utility providing training loss methods for models that do not own their own data. It provides training_loss() and training_loss_closure().

See InternalDataTrainingLossMixin for an equivalent mixin for models that do own their own data.

training_loss(data)[source]#

Returns the training loss for this model.

Parameters:

data (TypeVar(Data, Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]], Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter])) –

  • data[0] has shape [N, D].

  • data[1] has shape [N, P].

the data to be used for computing the model objective.

Return type:

Tensor

Returns:

  • return has shape [].

training_loss_closure(data, *, compile=True)[source]#

Returns a closure that computes the training loss, which by default is wrapped in tf.function(). This can be disabled by passing compile=False.

Parameters:
  • data (Union[TypeVar(Data, Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]], Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]), OwnedIterator]) – the data to be used by the closure for computing the model objective. Can be the full dataset or an iterator, e.g. iter(dataset.batch(batch_size)), where dataset is an instance of tf.data.Dataset.

  • compile (bool) – if True, wrap training loss in tf.function()

Return type:

Callable[[], Tensor]
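A hedged sketch of both usages (full dataset vs. minibatch iterator) with an SVGP model; the data, batch size and optimiser settings are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
import gpflow

X = np.random.rand(1000, 1)
Y = np.sin(10 * X) + 0.1 * np.random.randn(1000, 1)

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Gaussian(),
    inducing_variable=X[::20].copy(),
    num_data=len(X),
)

# Full-dataset closure: the data are captured by the closure, not the model.
full_loss = model.training_loss_closure((X, Y))

# Minibatch closure: each call to the closure draws the next batch.
dataset = tf.data.Dataset.from_tensor_slices((X, Y)).repeat().shuffle(1000)
batch_loss = model.training_loss_closure(iter(dataset.batch(100)))

optimizer = tf.optimizers.Adam(learning_rate=0.01)
for _ in range(100):
    optimizer.minimize(batch_loss, var_list=model.trainable_variables)
```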

gpflow.models.GPLVM#

class gpflow.models.GPLVM(data, latent_dim, X_data_mean=None, kernel=None, mean_function=None)[source]#

Bases: GPR

Standard GPLVM where the likelihood can be optimised with respect to the latent X.

Parameters:
  • data (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

  • latent_dim (int) –

  • X_data_mean (Optional[Tensor]) –

  • kernel (Optional[Kernel]) –

  • mean_function (Optional[MeanFunction]) –

gpflow.models.GPMC#

class gpflow.models.GPMC(data, kernel, likelihood, mean_function=None, num_latent_gps=None)[source]#

Bases: GPModel, InternalDataTrainingLossMixin

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

log_likelihood()[source]#

Construct a TensorFlow function to compute the log likelihood of a general GP model,

\[\log p(Y \,|\, V, \theta).\]

Return type:

Tensor

Returns:

  • return has shape [].

log_posterior_density()[source]#

This may be the posterior with respect to the hyperparameters (e.g. for GPR) or the posterior with respect to the function (e.g. for GPMC and SGPMC). It assumes that maximum_log_likelihood_objective() is defined sensibly.

Return type:

Tensor

Returns:

  • return has shape [].

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Xnew is a data matrix of the points at which we want to predict.

This method computes

\[p(F^* \,|\, F = LV),\]

where \(F^*\) are points on the GP at Xnew and \(F = LV\) are points on the GP at X.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

gpflow.models.GPModel#

class gpflow.models.GPModel(kernel, likelihood, mean_function=None, num_latent_gps=None)[source]#

Bases: BayesianModel

A stateless base class for Gaussian process models, that is, those of the form

\begin{align} \theta & \sim p(\theta) \\ f & \sim \mathcal{GP}(m(x), k(x, x'; \theta)) \\ f_i & = f(x_i) \\ y_i \,|\, f_i & \sim p(y_i|f_i) \end{align}

This class mostly adds functionality for predictions. To use it, inheriting classes must define a predict_f function, which computes the means and variances of the latent function.

These predictions are then pushed through the likelihood to obtain means and variances of held-out data via self.predict_y.

The predictions can also be used to compute the (log) density of held-out data via self.predict_log_density.

It is also possible to draw samples from the latent GPs using self.predict_f_samples.

If you are new to GPflow, see our Getting Started for examples on how to use a model.

Parameters:
  • kernel (Kernel) – Covariance function. $k$ above.

  • likelihood (Likelihood) – The likelihood of $y_i$, given $f_i$.

  • mean_function (Optional[MeanFunction]) – Mean of $f$.

  • num_latent_gps (Optional[int]) – The number of latent GPs - the output dimension of $f$.

static calc_num_latent_gps(kernel, likelihood, output_dim)[source]#

Calculates the number of latent GPs required given the number of outputs output_dim and the type of likelihood and kernel.

Note: It’s not nice for GPModel to need to be aware of specific likelihoods as here. However, num_latent_gps is a bit more broken in general, we should fix this in the future. There are also some slightly problematic assumptions re the output dimensions of mean_function. See https://github.com/GPflow/GPflow/issues/1343

Parameters:
  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • output_dim (int) –

Return type:

int

static calc_num_latent_gps_from_data(data, kernel, likelihood)[source]#

Calculates the number of latent GPs required based on the data as well as the type of kernel and likelihood.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

    • data[0] has shape [batch…, N, D].

    • data[1] has shape [batch…, N, P].

  • kernel (Kernel) –

  • likelihood (Likelihood) –

Return type:

int

abstract predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Compute the mean and variance of the posterior latent function(s) at the input points.

Given $x_i$ this computes $f_i$, for:

\begin{align} \theta & \sim p(\theta) \\ f & \sim \mathcal{GP}(m(x), k(x, x'; \theta)) \\ f_i & = f(x_i) \\ \end{align}

For an example of how to use predict_f, see Basic Usage with GPR.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

    Input locations at which to compute mean and variance.

  • full_cov (bool) – If True, compute the full covariance between the inputs. If False, only returns the point-wise variance.

  • full_output_cov (bool) – If True, compute the full covariance between the outputs. If False, assumes outputs are independent.

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

predict_f_samples(Xnew, num_samples=None, full_cov=True, full_output_cov=False)[source]#

Produce samples from the posterior latent function(s) at the input points.

Currently, the method does not support full_cov=True combined with full_output_cov=True.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

    Input locations at which to draw samples.

  • num_samples (Optional[int]) – Number of samples to draw. If None, a single sample is drawn and the return shape is […, N, P]; for any positive integer, the return shape gains an extra sample dimension, […, S, N, P], where S = num_samples and P is the number of outputs.

  • full_cov (bool) – If True, draw correlated samples over the inputs. Computes the Cholesky over the dense covariance matrix of size [num_data, num_data]. If False, draw samples that are uncorrelated over the inputs.

  • full_output_cov (bool) – If True, draw correlated samples over the outputs. If False, draw samples that are uncorrelated over the outputs.

Return type:

Tensor

Returns:

  • return has shape [batch…, N, P] if num_samples is None.

  • return has shape [batch…, S, N, P] if num_samples is not None.
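A hedged usage sketch (the model and data are placeholders; any GPModel subclass works the same way):

```python
import numpy as np
import gpflow

X = np.random.rand(30, 1)
Y = np.sin(6 * X) + 0.1 * np.random.randn(30, 1)
model = gpflow.models.GPR((X, Y), kernel=gpflow.kernels.SquaredExponential())

Xnew = np.linspace(0.0, 1.0, 200)[:, None]

# Ten samples, correlated over the inputs: shape [10, 200, 1], i.e. [S, N, P].
samples = model.predict_f_samples(Xnew, num_samples=10, full_cov=True)

# A single sample, uncorrelated over the inputs: shape [200, 1], i.e. [N, P].
one_sample = model.predict_f_samples(Xnew, full_cov=False)
```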

predict_log_density(data, full_cov=False, full_output_cov=False)[source]#

Compute the log of the probability density of the data at the new data points.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

    • data[0] has shape [batch…, N, D].

    • data[1] has shape [batch…, N, P].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tensor

Returns:

  • return has shape [batch…, N].

predict_y(Xnew, full_cov=False, full_output_cov=False)[source]#

Compute the mean and variance of the held-out data at the input points.

Given $x_i$ this computes $y_i$, for:

\begin{align} \theta & \sim p(\theta) \\ f & \sim \mathcal{GP}(m(x), k(x, x'; \theta)) \\ f_i & = f(x_i) \\ y_i \,|\, f_i & \sim p(y_i|f_i) \end{align}

For an example of how to use predict_y, see Basic Usage with GPR.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

    Input locations at which to compute mean and variance.

  • full_cov (bool) – If True, compute the full covariance between the inputs. If False, only returns the point-wise variance.

  • full_output_cov (bool) – If True, compute the full covariance between the outputs. If False, assumes outputs are independent.

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

gpflow.models.GPR#

class gpflow.models.GPR(data, kernel, mean_function=None, noise_variance=None, likelihood=None)[source]#

Bases: GPR_with_posterior

Gaussian Process Regression.

This is a vanilla implementation of GP regression with a Gaussian likelihood. Multiple columns of Y are treated independently.

The log likelihood of this model is given by

\[\log p(Y \,|\, \mathbf f) = \log \mathcal N(Y \,|\, \mathbf f, \sigma_n^2 \mathbf{I})\]

To train the model, we maximise the log marginal likelihood w.r.t. the likelihood variance and kernel hyperparameters theta. The marginal likelihood is found by integrating the likelihood over the prior and has the form

\[\log p(Y \,|\, \sigma_n, \theta) = \log \mathcal N(Y \,|\, 0, \mathbf{K} + \sigma_n^2 \mathbf{I})\]

For a use example see Basic Usage with GPR.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • mean_function (Optional[MeanFunction]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –

gpflow.models.GPRFITC#

class gpflow.models.GPRFITC(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=None, likelihood=None)[source]#

Bases: SGPRBase_deprecated

This implements GP regression with the FITC approximation.

The key reference is Snelson and Ghahramani [SG06].

The implementation is loosely based on code from the GPML MATLAB library, although gradients are of course computed automatically in GPflow.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –

fitc_log_marginal_likelihood()[source]#

Construct a TensorFlow function to compute the FITC approximation to the log marginal likelihood.

Return type:

Tensor

Returns:

  • return has shape [].

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Compute the mean and variance of the latent function at some new points Xnew.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

gpflow.models.InternalDataTrainingLossMixin#

class gpflow.models.InternalDataTrainingLossMixin[source]#

Bases: object

Mixin utility providing training loss methods for models that own their own data. It provides training_loss() and training_loss_closure().

See ExternalDataTrainingLossMixin for an equivalent mixin for models that do not own their own data.

training_loss()[source]#

Returns the training loss for this model.

Return type:

Tensor

Returns:

  • return has shape [].

training_loss_closure(*, compile=True)[source]#

Convenience method. Returns a closure which itself returns the training loss. This closure can be passed to the minimize methods on gpflow.optimizers.Scipy and subclasses of tf.optimizers.Optimizer.

Parameters:

compile (bool) – If True (default), compile the training loss function in a TensorFlow graph by wrapping it in tf.function()

Return type:

Callable[[], Tensor]
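A hedged sketch of this pattern with a model that owns its data (GPR used for illustration; data values are placeholders):

```python
import numpy as np
import gpflow

X = np.random.rand(40, 1)
Y = np.cos(5 * X) + 0.1 * np.random.randn(40, 1)
model = gpflow.models.GPR((X, Y), kernel=gpflow.kernels.Matern32())

# The closure takes no arguments, because the model already holds its data.
# It is wrapped in tf.function by default (compile=True).
closure = model.training_loss_closure()
gpflow.optimizers.Scipy().minimize(closure, model.trainable_variables)
```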

gpflow.models.SGPMC#

class gpflow.models.SGPMC(data, kernel, likelihood, mean_function=None, num_latent_gps=None, inducing_variable=None)[source]#

Bases: GPModel, InternalDataTrainingLossMixin

This is the Sparse Variational GP using MCMC (SGPMC).

The key reference is Hensman et al. [HMFG15].

The latent function values are represented by centered (whitened) variables, so

\begin{align} \mathbf v & \sim N(0, \mathbf I) \\ \mathbf u &= \mathbf L\mathbf v \end{align}

with

\[\mathbf L \mathbf L^\top = \mathbf K\]
Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any], None]) –

log_likelihood_lower_bound()[source]#

This function computes the optimal density for v, q*(v), up to a constant.

Return type:

Tensor

log_posterior_density()[source]#

This may be the posterior with respect to the hyperparameters (e.g. for GPR) or the posterior with respect to the function (e.g. for GPMC and SGPMC). It assumes that maximum_log_likelihood_objective() is defined sensibly.

Return type:

Tensor

Returns:

  • return has shape [].

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Xnew is a data matrix of the points at which we want to predict.

This method computes

\[p(F^* \,|\, U = LV),\]

where \(F^*\) are points on the GP at Xnew and \(U = LV\) are points on the GP at the inducing inputs Z.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

gpflow.models.SGPR#

class gpflow.models.SGPR(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=None, likelihood=None)[source]#

Bases: SGPR_with_posterior

Sparse GP regression.

The key reference is Titsias [Tit09].

For a use example see Large Data with SGPR.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –
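A hedged construction sketch for SGPR; the number and placement of inducing inputs are illustrative choices.

```python
import numpy as np
import gpflow

X = np.random.rand(2000, 1)
Y = np.sin(12 * X) + 0.2 * np.random.randn(2000, 1)
Z = np.linspace(0.0, 1.0, 50)[:, None]      # 50 inducing inputs

model = gpflow.models.SGPR(
    (X, Y), kernel=gpflow.kernels.SquaredExponential(), inducing_variable=Z
)

# Maximise the Titsias ELBO w.r.t. hyperparameters and inducing inputs.
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)
elbo = model.elbo()
```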

gpflow.models.SVGP#

class gpflow.models.SVGP(kernel, likelihood, inducing_variable, *, mean_function=None, num_latent_gps=1, q_diag=False, q_mu=None, q_sqrt=None, whiten=True, num_data=None)[source]#

Bases: SVGP_with_posterior

This is the Sparse Variational GP (SVGP).

The key reference is Hensman et al. [HMG15].

For a use example see Classification, other data distributions, VGP and SVGP.

Parameters:
  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • inducing_variable (Union[InducingVariables, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (int) –

  • q_diag (bool) –

  • q_mu (Optional[Tensor]) –

  • q_sqrt (Optional[Tensor]) –

  • whiten (bool) –

  • num_data (Optional[Tensor]) –
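A hedged classification sketch (binary labels, Bernoulli likelihood); the data and the choice of inducing points are illustrative assumptions. For minibatch training, pass an iterator to training_loss_closure as shown for ExternalDataTrainingLossMixin above.

```python
import numpy as np
import gpflow

# Toy binary classification data (placeholder values).
X = np.random.rand(200, 2)
Y = (X[:, :1] + X[:, 1:] > 1.0).astype(np.float64)

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.Matern52(),
    likelihood=gpflow.likelihoods.Bernoulli(),
    inducing_variable=gpflow.inducing_variables.InducingPoints(X[::10].copy()),
    num_latent_gps=1,
)

# Full-batch training: the data are passed to the closure, not stored in the model.
gpflow.optimizers.Scipy().minimize(
    model.training_loss_closure((X, Y)), model.trainable_variables
)

# Predicted class probabilities at a few inputs.
p_mean, p_var = model.predict_y(X[:5])
```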

gpflow.models.VGP#

class gpflow.models.VGP(data, kernel, likelihood, mean_function=None, num_latent_gps=None)[source]#

Bases: VGP_with_posterior

This method approximates the Gaussian process posterior using a multivariate Gaussian.

The idea is that the posterior over the function-value vector F is approximated by a Gaussian, and the KL divergence is minimised between the approximation and the posterior.

This implementation is equivalent to SVGP with X=Z, but is more efficient. The whitened representation is used to aid optimization.

The posterior approximation is

\[q(\mathbf f) = N(\mathbf f \,|\, \boldsymbol \mu, \boldsymbol \Sigma)\]

For a use example see Classification, other data distributions, VGP and SVGP.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

gpflow.models.VGPOpperArchambeau#

class gpflow.models.VGPOpperArchambeau(data, kernel, likelihood, mean_function=None, num_latent_gps=None)[source]#

Bases: GPModel, InternalDataTrainingLossMixin

This method approximates the Gaussian process posterior using a multivariate Gaussian.

The key reference is Opper and Archambeau [OA09].

The idea is that the posterior over the function-value vector F is approximated by a Gaussian, and the KL divergence is minimised between the approximation and the posterior. It turns out that the optimal posterior precision shares off-diagonal elements with the prior, so only the diagonal elements of the precision need be adjusted. The posterior approximation is

\[q(\mathbf f) = N(\mathbf f \,|\, \mathbf K \boldsymbol \alpha, [\mathbf K^{-1} + \textrm{diag}(\boldsymbol \lambda)^2]^{-1})\]

This approach has only 2·N·D parameters, rather than the N + N² of VGP, but the optimization is non-convex and in practice may cause difficulty.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • likelihood (Likelihood) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

elbo()[source]#

q_alpha and q_lambda are variational parameters of size [N, R]. This method computes the variational lower bound on the likelihood, which is:

\[E_{q(F)} [ \log p(Y|F) ] - KL[ q(F) || p(F)]\]

with

\[q(f) = N(f \,|\, K \alpha + \textrm{mean}, [K^{-1} + \textrm{diag}(\lambda^2)]^{-1}).\]
Return type:

Tensor

Returns:

  • return has shape [].

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

The posterior distribution of F is given by

\[q(f) = N(f \,|\, K \alpha + \textrm{mean}, [K^{-1} + \textrm{diag}(\lambda^2)]^{-1})\]

Here we project this to \(F^*\), the values of the GP at Xnew, which are given by

\[q(F^*) = N(F^* \,|\, K_{*F} \alpha + \textrm{mean}, K_{**} - K_{*f}[K_{ff} + \textrm{diag}(\lambda^{-2})]^{-1} K_{f*})\]

Note: This model currently does not allow full output covariances.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

Functions#

gpflow.models.maximum_log_likelihood_objective#

gpflow.models.maximum_log_likelihood_objective(model, data)[source]#
Parameters:
  • model (BayesianModel) –

  • data (TypeVar(Data, Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]], Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter])) –

Return type:

Tensor

gpflow.models.training_loss#

gpflow.models.training_loss(model, data)[source]#
Parameters:
  • model (BayesianModel) –

  • data (TypeVar(Data, Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]], Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter])) –

Return type:

Tensor

gpflow.models.training_loss_closure#

gpflow.models.training_loss_closure(model, data, **closure_kwargs)[source]#
Parameters:
  • model (BayesianModel) –

  • data (TypeVar(Data, Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]], Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter])) –

  • closure_kwargs (Any) –

Return type:

Callable[[], Tensor]