gpflow.models.sgpr

gpflow.models.sgpr.SGPRBase_deprecated

class gpflow.models.sgpr.SGPRBase_deprecated(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=1.0)

Bases: gpflow.models.model.GPModel, gpflow.models.training_mixins.InternalDataTrainingLossMixin

Common base class for SGPR and GPRFITC that provides the common __init__ and upper_bound() methods.

Attributes

name: Returns the name of this module as passed or determined in the ctor.
name_scope: Returns a tf.name_scope instance for this class.
non_trainable_variables: Sequence of non-trainable variables owned by this module and its submodules.
parameters
submodules: Sequence of all sub-modules.
trainable_parameters
trainable_variables: Sequence of trainable variables owned by this module and its submodules.
variables: Sequence of variables owned by this module and its submodules.

Methods

`calc_num_latent_gps`(kernel, likelihood, ...)	Calculates the number of latent GPs required given the number of outputs output_dim and the type of likelihood and kernel.
`calc_num_latent_gps_from_data`(data, kernel, ...)	Calculates the number of latent GPs required based on the data as well as the type of kernel and likelihood.
`log_posterior_density`(*args, **kwargs)	This may be the posterior with respect to the hyperparameters (e.g.
`log_prior_density`()	Sum of the log prior probability densities of all (constrained) variables in this model.
`maximum_log_likelihood_objective`(*args, **kwargs)	Objective for maximum likelihood estimation.
`predict_f_samples`(Xnew[, num_samples, ...])	Produce samples from the posterior latent function(s) at the input points.
`predict_log_density`(data[, full_cov, ...])	Compute the log density of the data at the new data points.
`predict_y`(Xnew[, full_cov, full_output_cov])	Compute the mean and variance of the held-out data at the input points.
`training_loss`()	Returns the training loss for this model.
`training_loss_closure`(*[, compile])	Convenience method.
`upper_bound`()	Upper bound for the sparse GP regression marginal likelihood.
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

predict_f

Parameters

data (Tuple[Union[ndarray, Tensor, Variable, Parameter], Union[ndarray, Tensor, Variable, Parameter]]) –
kernel (Kernel) –
inducing_variable (InducingPoints) –
mean_function (Optional[MeanFunction]) –
num_latent_gps (Optional[int]) –
noise_variance (float) –

upper_bound()

Upper bound for the sparse GP regression marginal likelihood. Note that the same inducing points are used to calculate the upper bound as are used to compute the likelihood approximation, so this may not give the tightest upper bound. The upper bound can be tightened by optimising Z, just like the lower bound. This is especially important in FITC, as FITC is known to produce poor inducing point locations. An optimisable upper bound can be found in https://github.com/markvdw/gp_upper.

The key reference is

@misc{titsias_2014,
  title={Variational Inference for Gaussian and Determinantal Point Processes},
  url={http://www2.aueb.gr/users/mtitsias/papers/titsiasNipsVar14.pdf},
  publisher={Workshop on Advances in Variational Inference (NIPS 2014)},
  author={Titsias, Michalis K.},
  year={2014},
  month={Dec}
}

The key quantity, the trace term, can be computed via

>>> import numpy as np
>>> from gpflow import conditionals
>>> _, v = conditionals.conditional(X, model.inducing_variable.Z, model.kernel,
...                                 np.zeros((model.inducing_variable.num_inducing, 1)))

which computes each individual element of the trace term.

Return type: Tensor
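
As a rough illustration of the trace term mentioned for upper_bound(), the NumPy sketch below computes the elements of diag(Kff - Kfu Kuu⁻¹ Kuf), which is what a conditional variance with zero inducing outputs yields. It does not call GPflow: the `rbf` kernel, the `jitter` value, the helper `trace_term_elements` and the toy data are all assumptions made for illustration.

```python
import numpy as np


def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between inputs A [N, D] and B [M, D] (illustrative)."""
    sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def trace_term_elements(X, Z, jitter=1e-8):
    """Return diag(Kff - Kfu Kuu^{-1} Kuf), one element per data point."""
    Kuu = rbf(Z, Z) + jitter * np.eye(len(Z))  # jitter is an assumed stabiliser
    Kuf = rbf(Z, X)
    L = np.linalg.cholesky(Kuu)
    A = np.linalg.solve(L, Kuf)                # L^{-1} Kuf, shape [M, N]
    Kff_diag = np.diag(rbf(X, X))
    return Kff_diag - np.sum(A**2, axis=0)     # diag(Kff) - diag(A^T A)


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(50, 1))        # toy inputs
Z = np.linspace(0.0, 5.0, 6)[:, None]          # toy inducing point locations
elements = trace_term_elements(X, Z)
trace_term = elements.sum()                    # tr(Kff - Qff)
```

Each entry of `elements` is the residual prior variance at one data point that the inducing points fail to explain; the entries shrink as Z covers the inputs more densely.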

gpflow.models.sgpr.SGPR_deprecated

class gpflow.models.sgpr.SGPR_deprecated(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=1.0)

Bases: gpflow.models.sgpr.SGPRBase_deprecated

Sparse Variational GP regression. The key reference is

@inproceedings{titsias2009variational,
    title={Variational learning of inducing variables in
        sparse Gaussian processes},
    author={Titsias, Michalis K},
    booktitle={International Conference on
            Artificial Intelligence and Statistics},
    pages={567--574},
    year={2009}
}

Attributes

name: Returns the name of this module as passed or determined in the ctor.
name_scope: Returns a tf.name_scope instance for this class.
non_trainable_variables: Sequence of non-trainable variables owned by this module and its submodules.
parameters
submodules: Sequence of all sub-modules.
trainable_parameters
trainable_variables: Sequence of trainable variables owned by this module and its submodules.
variables: Sequence of variables owned by this module and its submodules.

Methods

`CommonTensors`(A, B, LB, AAT, L)	Named tuple containing matrices that will be used.
`calc_num_latent_gps`(kernel, likelihood, ...)	Calculates the number of latent GPs required given the number of outputs output_dim and the type of likelihood and kernel.
`calc_num_latent_gps_from_data`(data, kernel, ...)	Calculates the number of latent GPs required based on the data as well as the type of kernel and likelihood.
`compute_qu`()	Computes the mean and variance of q(u) = N(mu, cov), the variational distribution on inducing outputs.
`elbo`()	Construct a tensorflow function to compute the bound on the marginal likelihood.
`log_posterior_density`(*args, **kwargs)	This may be the posterior with respect to the hyperparameters (e.g.
`log_prior_density`()	Sum of the log prior probability densities of all (constrained) variables in this model.
`logdet_term`(common)	Bound from Jensen's inequality: log |K + σ²I| <= log |Q + σ²I| + N * log (1 + tr(K - Q)/(σ²N)).
`maximum_log_likelihood_objective`()	Objective for maximum likelihood estimation.
`predict_f`(Xnew[, full_cov, full_output_cov])	Compute the mean and variance of the latent function at some new points Xnew.
`predict_f_samples`(Xnew[, num_samples, ...])	Produce samples from the posterior latent function(s) at the input points.
`predict_log_density`(data[, full_cov, ...])	Compute the log density of the data at the new data points.
`predict_y`(Xnew[, full_cov, full_output_cov])	Compute the mean and variance of the held-out data at the input points.
`quad_term`(common)	Lower bound on -.5 yᵀ(K + σ²I)⁻¹y.
`training_loss`()	Returns the training loss for this model.
`training_loss_closure`(*[, compile])	Convenience method.
`upper_bound`()	Upper bound for the sparse GP regression marginal likelihood.
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

Parameters

data (Tuple[Union[ndarray, Tensor, Variable, Parameter], Union[ndarray, Tensor, Variable, Parameter]]) –
kernel (Kernel) –
inducing_variable (InducingPoints) –
mean_function (Optional[MeanFunction]) –
num_latent_gps (Optional[int]) –
noise_variance (float) –

class CommonTensors(A, B, LB, AAT, L)

Bases: tuple

Attributes

A: Alias for field number 0
AAT: Alias for field number 3
B: Alias for field number 1
L: Alias for field number 4
LB: Alias for field number 2

Methods

`count`(value, /)	Return number of occurrences of value.
`index`(value[, start, stop])	Return first index of value.

compute_qu()

Computes the mean and variance of q(u) = N(mu, cov), the variational distribution on inducing outputs. SVGP with this q(u) should predict identically to SGPR.

Return type: Tuple[Tensor, Tensor]
Returns: mu, cov
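
To make the shape of q(u) concrete, here is a minimal NumPy sketch of the closed-form optimal variational distribution from Titsias (2009), assuming a zero mean function. It is not GPflow's code: the `rbf` kernel, the helper name `optimal_qu`, the `jitter` value and the toy data are illustrative assumptions.

```python
import numpy as np


def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between inputs A [N, D] and B [M, D] (illustrative)."""
    sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def optimal_qu(X, y, Z, noise_variance, jitter=1e-8):
    """Mean and covariance of q(u) = N(mu, cov) for a zero-mean sparse GP.

    Sigma = Kuu + Kuf Kfu / sigma^2
    mu    = Kuu Sigma^{-1} Kuf y / sigma^2
    cov   = Kuu Sigma^{-1} Kuu
    """
    Kuu = rbf(Z, Z) + jitter * np.eye(len(Z))  # jitter is an assumed stabiliser
    Kuf = rbf(Z, X)
    Sigma = Kuu + Kuf @ Kuf.T / noise_variance
    mu = Kuu @ np.linalg.solve(Sigma, Kuf @ y) / noise_variance
    cov = Kuu @ np.linalg.solve(Sigma, Kuu)
    return mu, cov


X = np.linspace(0.0, 5.0, 8)[:, None]          # toy inputs
y = np.sin(X)                                  # toy targets, shape [N, 1]
mu, cov = optimal_qu(X, y, Z=X, noise_variance=0.1)
```

In this toy call the inducing points coincide with the data (Z = X), in which case `mu` should agree with the exact GP posterior mean at X, K (K + σ²I)⁻¹ y, up to the jitter.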

elbo()

Construct a tensorflow function to compute the bound on the marginal likelihood. For a derivation of the terms in here, see the associated SGPR notebook.

Return type: Tensor

logdet_term(common)

Bound from Jensen’s inequality:

log |K + σ²I| <= log |Q + σ²I| + N * log (1 + tr(K - Q)/(σ²N))

Parameters: common (NamedTuple) – A named tuple containing matrices that will be used
Returns: log_det, lower bound on -.5 * output_dim * log |K + σ²I|
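
The inequality stated for logdet_term can be checked numerically. The NumPy sketch below assumes Q is the Nyström approximation Kfu Kuu⁻¹ Kuf and uses a toy `rbf` kernel and random data; none of these names come from GPflow. It also compares against the looser trace form that follows from log(1 + x) ≤ x.

```python
import numpy as np


def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between inputs A [N, D] and B [M, D] (illustrative)."""
    sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


rng = np.random.default_rng(1)
X = rng.uniform(0.0, 5.0, size=(40, 1))        # toy inputs
Z = np.linspace(0.0, 5.0, 5)[:, None]          # toy inducing point locations
noise_variance = 0.1
N = len(X)

Kff = rbf(X, X)
Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))        # small jitter (assumed)
Kuf = rbf(Z, X)
Qff = Kuf.T @ np.linalg.solve(Kuu, Kuf)        # assumed Nystrom approximation Q

_, logdet_K = np.linalg.slogdet(Kff + noise_variance * np.eye(N))
_, logdet_Q = np.linalg.slogdet(Qff + noise_variance * np.eye(N))
trace = np.trace(Kff - Qff)

# Right-hand side of the Jensen bound from the docstring.
jensen_bound = logdet_Q + N * np.log1p(trace / (noise_variance * N))
# Looser bound obtained from log(1 + x) <= x.
trace_bound = logdet_Q + trace / noise_variance
```

Multiplying either upper bound by -0.5 gives a lower bound on -0.5 log |K + σ²I|, which is the direction needed for a lower bound on the marginal likelihood.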

maximum_log_likelihood_objective()

Objective for maximum likelihood estimation. Should be maximized. E.g. log-marginal likelihood (hyperparameter likelihood) for GPR, or lower bound to the log-marginal likelihood (ELBO) for sparse and variational GPs.

Return type: Tensor

predict_f(Xnew, full_cov=False, full_output_cov=False)

Compute the mean and variance of the latent function at some new points Xnew. For a derivation of the terms in here, see the associated SGPR notebook.

Parameters: Xnew (Union[ndarray, Tensor, Variable, Parameter]) –
Return type: Tuple[Tensor, Tensor]

quad_term(common)

Parameters: common (NamedTuple) – A named tuple containing matrices that will be used
Return type: Tensor
Returns: Lower bound on -.5 yᵀ(K + σ²I)⁻¹y
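
One way to see why a lower bound on the quadratic term exists: if K is replaced by an approximation Q with K - Q positive semi-definite, then (Q + σ²I)⁻¹ dominates (K + σ²I)⁻¹, so the quadratic form can only grow. The NumPy sketch below checks this on toy data, assuming Q is the Nyström approximation; the `rbf` kernel and all variable names are illustrative and not taken from GPflow.

```python
import numpy as np


def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between inputs A [N, D] and B [M, D] (illustrative)."""
    sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


rng = np.random.default_rng(2)
X = rng.uniform(0.0, 5.0, size=(40, 1))        # toy inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)  # toy targets, shape [N]
Z = np.linspace(0.0, 5.0, 5)[:, None]          # toy inducing point locations
noise_variance = 0.1
N = len(X)

Kff = rbf(X, X)
Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))        # small jitter (assumed)
Kuf = rbf(Z, X)
Qff = Kuf.T @ np.linalg.solve(Kuu, Kuf)        # assumed Nystrom approximation Q

# Exact quadratic term and its sparse counterpart.
quad_exact = -0.5 * y @ np.linalg.solve(Kff + noise_variance * np.eye(N), y)
quad_lower = -0.5 * y @ np.linalg.solve(Qff + noise_variance * np.eye(N), y)
```

Here `quad_lower` should never exceed `quad_exact`, and the gap closes as the inducing points capture more of the prior covariance.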

gpflow.models.sgpr.SGPR_with_posterior

class gpflow.models.sgpr.SGPR_with_posterior(data, kernel, inducing_variable, *, mean_function=None, num_latent_gps=None, noise_variance=1.0)

Bases: gpflow.models.sgpr.SGPR_deprecated

This is an implementation of SGPR that provides a posterior() method that enables caching for faster subsequent predictions.

Attributes

name: Returns the name of this module as passed or determined in the ctor.
name_scope: Returns a tf.name_scope instance for this class.
non_trainable_variables: Sequence of non-trainable variables owned by this module and its submodules.
parameters
submodules: Sequence of all sub-modules.
trainable_parameters
trainable_variables: Sequence of trainable variables owned by this module and its submodules.
variables: Sequence of variables owned by this module and its submodules.

Methods

`CommonTensors`(A, B, LB, AAT, L)	Named tuple containing matrices that will be used.
`calc_num_latent_gps`(kernel, likelihood, ...)	Calculates the number of latent GPs required given the number of outputs output_dim and the type of likelihood and kernel.
`calc_num_latent_gps_from_data`(data, kernel, ...)	Calculates the number of latent GPs required based on the data as well as the type of kernel and likelihood.
`compute_qu`()	Computes the mean and variance of q(u) = N(mu, cov), the variational distribution on inducing outputs.
`elbo`()	Construct a tensorflow function to compute the bound on the marginal likelihood.
`log_posterior_density`(*args, **kwargs)	This may be the posterior with respect to the hyperparameters (e.g.
`log_prior_density`()	Sum of the log prior probability densities of all (constrained) variables in this model.
`logdet_term`(common)	Bound from Jensen's inequality: log |K + σ²I| <= log |Q + σ²I| + N * log (1 + tr(K - Q)/(σ²N)).
`maximum_log_likelihood_objective`()	Objective for maximum likelihood estimation.
`posterior`([precompute_cache])	Create the Posterior object which contains precomputed matrices for faster prediction.
`predict_f`(Xnew[, full_cov, full_output_cov])	For backwards compatibility, SGPR's predict_f uses the fused (no-cache) computation, which is more efficient during training.
`predict_f_samples`(Xnew[, num_samples, ...])	Produce samples from the posterior latent function(s) at the input points.
`predict_log_density`(data[, full_cov, ...])	Compute the log density of the data at the new data points.
`predict_y`(Xnew[, full_cov, full_output_cov])	Compute the mean and variance of the held-out data at the input points.
`quad_term`(common)	Lower bound on -.5 yᵀ(K + σ²I)⁻¹y.
`training_loss`()	Returns the training loss for this model.
`training_loss_closure`(*[, compile])	Convenience method.
`upper_bound`()	Upper bound for the sparse GP regression marginal likelihood.
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

Parameters

data (Tuple[Union[ndarray, Tensor, Variable, Parameter], Union[ndarray, Tensor, Variable, Parameter]]) –
kernel (Kernel) –
inducing_variable (InducingPoints) –
mean_function (Optional[MeanFunction]) –
num_latent_gps (Optional[int]) –
noise_variance (float) –

posterior(precompute_cache=PrecomputeCacheType.TENSOR)

Create the Posterior object which contains precomputed matrices for faster prediction.

precompute_cache has three settings:

PrecomputeCacheType.TENSOR (or “tensor”): Precomputes the cached quantities and stores them as tensors (which allows differentiating through the prediction). This is the default.
PrecomputeCacheType.VARIABLE (or “variable”): Precomputes the cached quantities and stores them as variables, which allows for updating their values without changing the compute graph (relevant for AOT compilation).
PrecomputeCacheType.NOCACHE (or “nocache” or None): Avoids immediate cache computation. This is useful for avoiding extraneous computations when you only want to call the posterior’s fused_predict_f method.

predict_f(Xnew, full_cov=False, full_output_cov=False)

For backwards compatibility, SGPR’s predict_f uses the fused (no-cache) computation, which is more efficient during training.

For faster (cached) prediction, predict directly from the posterior object, i.e., model.posterior().predict_f(Xnew, …)

Parameters: Xnew (Union[ndarray, Tensor, Variable, Parameter]) –
Return type: Tuple[Tensor, Tensor]

gpflow.models.sgpr.inducingpoint_wrapper

gpflow.models.sgpr.inducingpoint_wrapper(inducing_variable)

This wrapper allows transparently passing either an InducingVariables object or an array specifying InducingPoints positions.

Parameters: inducing_variable (Union[InducingVariables, Tensor, ndarray]) –
Return type: InducingVariables