gpflow.models.sgpr#

Classes#

gpflow.models.sgpr.SGPRBase_deprecated#

class gpflow.models.sgpr.SGPRBase_deprecated(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=None, likelihood=None)[source]#

Bases: GPModel, InternalDataTrainingLossMixin

Base class for SGPR and GPRFITC that provides their shared __init__ and upper_bound() methods.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –

upper_bound()[source]#

Upper bound for the sparse GP regression marginal likelihood. The upper bound is computed with the same inducing points that are used for the likelihood approximation, which may not give the tightest upper bound. Like the lower bound, the upper bound can be tightened by optimising Z. This is especially important in FITC, which is known to produce poor inducing point locations. An optimisable upper bound can be found in markvdw/gp_upper.

The key reference is Titsias [Tit14].

The key quantity, the trace term, can be computed via

>>> import numpy as np
>>> from gpflow import conditionals
>>> _, v = conditionals.conditional(X, model.inducing_variable.Z, model.kernel,
...                                 np.zeros((model.inducing_variable.num_inducing, 1)))

which computes each individual element of the trace term.

Return type:

Tensor

Returns:

  • return has shape [].

gpflow.models.sgpr.SGPR_deprecated#

class gpflow.models.sgpr.SGPR_deprecated(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=None, likelihood=None)[source]#

Bases: SGPRBase_deprecated

Sparse Variational GP regression.

The key reference is Titsias [Tit09].

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –

class CommonTensors(sigma_sq, sigma, A, B, LB, AAT, L)[source]#

Bases: NamedTuple

A: Tensor#

Alias for field number 2

AAT: Tensor#

Alias for field number 5

B: Tensor#

Alias for field number 3

L: Tensor#

Alias for field number 6

LB: Tensor#

Alias for field number 4

sigma: Tensor#

Alias for field number 1

sigma_sq: Tensor#

Alias for field number 0

compute_qu()[source]#

Computes the mean and covariance of q(u) = N(mu, cov), the variational distribution on inducing outputs.

SVGP with this q(u) should predict identically to SGPR.

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [M, P].

  • return[1] has shape [M, M].

mu, cov: the mean and covariance of q(u).

elbo()[source]#

Construct a TensorFlow function to compute the lower bound (ELBO) on the log marginal likelihood. For a derivation of the terms in here, see the associated SGPR notebook.

Return type:

Tensor

Returns:

  • return has shape [].

logdet_term(common)[source]#

Bound from Jensen’s Inequality:

\[\log |K + \sigma^2 I| \le \log |Q + \sigma^2 I| + N \log\left(1 + \frac{\textrm{tr}(K - Q)}{\sigma^2 N}\right)\]
Parameters:

common (CommonTensors) – A named tuple containing the matrices used to compute the bound.

Return type:

Tensor

Returns:

  • return has shape [].

log_det, a lower bound on \(-\tfrac{1}{2} \cdot \textrm{output_dim} \cdot \log |K + \sigma^2 I|\).
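The Jensen bound on the log-determinant can be verified with plain NumPy. The sketch below assumes that Q is the Nyström approximation Kufᵀ Kuu⁻¹ Kuf, which is the usual choice for SGPR but is not spelled out in this section. The rbf helper, the data, and the jitter value are hypothetical and do not use GPflow's internals.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B (1-D inputs)."""
    sq_dist = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(0)
N, M, sigma_sq = 40, 6, 0.1
X = rng.uniform(0.0, 10.0, size=(N, 1))
Z = np.linspace(0.0, 10.0, M)[:, None]

K = rbf(X, X)
Kuf = rbf(Z, X)
Kuu = rbf(Z, Z) + 1e-8 * np.eye(M)      # small jitter for numerical stability
Q = Kuf.T @ np.linalg.solve(Kuu, Kuf)   # assumed Nystrom approximation of K

# Left-hand side: exact log-determinant.
lhs = np.linalg.slogdet(K + sigma_sq * np.eye(N))[1]

# Right-hand side: approximate log-determinant plus the trace correction.
trace_term = np.trace(K - Q)
rhs = np.linalg.slogdet(Q + sigma_sq * np.eye(N))[1] + N * np.log1p(trace_term / (sigma_sq * N))

print(lhs <= rhs)
```

Multiplying both sides by a negative factor flips the inequality, which is why the returned log_det is a lower bound.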

maximum_log_likelihood_objective()[source]#

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type:

Tensor

Returns:

  • return has shape [].

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

Compute the mean and variance of the latent function at some new points Xnew. For a derivation of the terms in here, see the associated SGPR notebook.

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).

quad_term(common)[source]#
Parameters:

common (CommonTensors) – A named tuple containing the matrices used to compute the bound.

Return type:

Tensor

Returns:

  • return has shape [].

Lower bound on \(-\tfrac{1}{2} y^\top (K + \sigma^2 I)^{-1} y\).
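A lower bound of this kind can be illustrated in NumPy. The sketch assumes that the bound is obtained by replacing K with the Nyström approximation Q = Kufᵀ Kuu⁻¹ Kuf; this form is an assumption, since the section does not state it. Because K − Q is positive semi-definite, the quadratic form with Q is at least as large as the one with K. The rbf helper and the data are hypothetical.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B (1-D inputs)."""
    sq_dist = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(5)
N, M, sigma_sq = 40, 6, 0.1
X = rng.uniform(0.0, 10.0, size=(N, 1))
y = np.sin(X) + 0.1 * rng.standard_normal((N, 1))
Z = np.linspace(0.0, 10.0, M)[:, None]

K = rbf(X, X)
Kuf = rbf(Z, X)
Kuu = rbf(Z, Z) + 1e-8 * np.eye(M)      # small jitter for numerical stability
Q = Kuf.T @ np.linalg.solve(Kuu, Kuf)   # assumed Nystrom approximation of K

# Exact quadratic term and its assumed sparse lower bound.
exact = -0.5 * (y.T @ np.linalg.solve(K + sigma_sq * np.eye(N), y)).item()
bound = -0.5 * (y.T @ np.linalg.solve(Q + sigma_sq * np.eye(N), y)).item()

print(bound <= exact)
```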

gpflow.models.sgpr.SGPR_with_posterior#

class gpflow.models.sgpr.SGPR_with_posterior(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=None, likelihood=None)[source]#

Bases: SGPR_deprecated

Sparse Variational GP regression. The key reference is Titsias [Tit09].

This is an implementation of SGPR that provides a posterior() method that enables caching for faster subsequent predictions.

Parameters:
  • data (Tuple[Union[ndarray[Any, Any], Tensor, Variable, Parameter], Union[ndarray[Any, Any], Tensor, Variable, Parameter]]) –

  • kernel (Kernel) –

  • inducing_variable (Union[InducingPoints, Tensor, ndarray[Any, Any]]) –

  • mean_function (Optional[MeanFunction]) –

  • num_latent_gps (Optional[int]) –

  • noise_variance (Union[int, float, Sequence[Any], ndarray[Any, Any], Tensor, Variable, Parameter, None]) –

  • likelihood (Optional[Gaussian]) –

posterior(precompute_cache=PrecomputeCacheType.TENSOR)[source]#

Create the Posterior object which contains precomputed matrices for faster prediction.

precompute_cache has three settings:

  • PrecomputeCacheType.TENSOR (or “tensor”): Precomputes the cached quantities and stores them as tensors (which allows differentiating through the prediction). This is the default.

  • PrecomputeCacheType.VARIABLE (or “variable”): Precomputes the cached quantities and stores them as variables, which allows for updating their values without changing the compute graph (relevant for AOT compilation).

  • PrecomputeCacheType.NOCACHE (or “nocache” or None): Avoids immediate cache computation. This is useful for avoiding extraneous computations when you only want to call the posterior’s fused_predict_f method.

Parameters:

precompute_cache (PrecomputeCacheType) –

Return type:

SGPRPosterior

predict_f(Xnew, full_cov=False, full_output_cov=False)[source]#

For backwards compatibility, SGPR’s predict_f uses the fused (no-cache) computation, which is more efficient during training.

For faster (cached) prediction, predict directly from the posterior object, i.e.:

model.posterior().predict_f(Xnew, …)

Parameters:
  • Xnew (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –

    • Xnew has shape [batch…, N, D].

  • full_cov (bool) –

  • full_output_cov (bool) –

Return type:

Tuple[Tensor, Tensor]

Returns:

  • return[0] has shape [batch…, N, P].

  • return[1] has shape [batch…, N, P, N, P] if full_cov and full_output_cov.

  • return[1] has shape [batch…, N, P, P] if (not full_cov) and full_output_cov.

  • return[1] has shape [batch…, N, P] if (not full_cov) and (not full_output_cov).

  • return[1] has shape [batch…, P, N, N] if full_cov and (not full_output_cov).