gpflow.likelihoods#
Likelihoods are another core component of GPflow. A likelihood describes how likely the data are under the assumptions made about the underlying latent functions, p(Y|F). Different likelihoods make different assumptions about the distribution of the data, so different data types (continuous, binary, ordinal, count) are better modelled with different likelihoods.
Use of any likelihood other than Gaussian typically introduces the need to use an approximation to perform inference, if one isn’t already needed. Variational inference and MCMC models are included in GPflow and allow approximate inference with non-Gaussian likelihoods. An introduction to these models can be found here. Specific notebooks illustrating non-Gaussian likelihood regressions are available for classification (binary data), ordinal and multiclass.
Creating new likelihoods#
Likelihoods are defined by their log-likelihood. When creating new likelihoods, the logp method (log p(Y|F)), the conditional_mean method and the conditional_variance method need to be implemented.
In order to perform variational inference with non-Gaussian likelihoods, a term called the variational expectations, ∫ q(F) log p(Y|F) dF, needs to be computed under a Gaussian distribution q(F) = N(μ, Σ).
The variational_expectations method can be overridden if this can be computed in closed form. Otherwise, if the new likelihood inherits from Likelihood, the default will use Gauss-Hermite numerical integration (works well when F is 1D or 2D); if the new likelihood inherits from MonteCarloLikelihood, the integration is done by sampling (can be more suitable when F is higher dimensional).
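The relationship between a quadrature default and a closed-form override can be sketched in plain Python, independently of GPflow. This is an illustration under stated assumptions, not GPflow's implementation: it uses a fixed 3-point Gauss-Hermite rule, a Gaussian likelihood with a hypothetical `noise_var` parameter, and made-up function names. Because the Gaussian log-density is quadratic in f, the 3-point rule reproduces the closed form up to rounding.

```python
import math

# 3-point Gauss-Hermite rule expressed for a standard normal:
# nodes are the roots of the degree-3 probabilists' Hermite polynomial.
# The rule is exact for polynomial integrands up to degree 5.
GH_NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
GH_WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def gaussian_logp(f, y, noise_var):
    """log N(y | f, noise_var) for scalar f and y."""
    return -0.5 * math.log(2.0 * math.pi * noise_var) - 0.5 * (y - f) ** 2 / noise_var


def ve_quadrature(logp, mu, var, y):
    """Approximate the integral of q(f) log p(y|f) df for q(f) = N(mu, var)."""
    sd = math.sqrt(var)
    return sum(w * logp(mu + sd * x, y) for x, w in zip(GH_NODES, GH_WEIGHTS))


def ve_gaussian_closed_form(mu, var, y, noise_var):
    """Closed-form expectation of the Gaussian log-density under N(mu, var)."""
    return -0.5 * math.log(2.0 * math.pi * noise_var) - 0.5 * ((y - mu) ** 2 + var) / noise_var


quad = ve_quadrature(lambda f, y: gaussian_logp(f, y, 0.5), mu=0.3, var=0.2, y=1.0)
exact = ve_gaussian_closed_form(mu=0.3, var=0.2, y=1.0, noise_var=0.5)
```

For likelihoods whose log-density is not polynomial in f, a rule with more nodes, or sampling, would be needed to keep the approximation error small.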
Modules#
Classes#
gpflow.likelihoods.Bernoulli#
- class gpflow.likelihoods.Bernoulli(invlink=<function inv_probit>, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
- Parameters
invlink (Callable[[Tensor], Tensor])
kwargs (Any)
gpflow.likelihoods.Beta#
- class gpflow.likelihoods.Beta(invlink=<function inv_probit>, scale=1.0, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
This uses a reparameterisation of the Beta density. We have the mean of the Beta distribution given by the transformed process:
m = invlink(f)
and a scale parameter. The familiar α, β parameters are given by
m = α / (α + β)
scale = α + β
so:
α = scale * m
β = scale * (1-m)
- Parameters
invlink (Callable[[Tensor], Tensor])
scale (float)
kwargs (Any)
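The Beta reparameterisation can be checked numerically with a few lines of plain Python. The sketch below is illustrative only: `inv_probit` is written here as the plain standard normal CDF (any numerical safeguards GPflow may add are not reproduced), and `beta_parameters` is a hypothetical helper name.

```python
import math


def inv_probit(f):
    """Standard normal CDF, mapping a real latent value into (0, 1)."""
    return 0.5 * (1.0 + math.erf(f / math.sqrt(2.0)))


def beta_parameters(f, scale=1.0):
    """Map a latent value f and a scale to the usual (alpha, beta) pair."""
    m = inv_probit(f)
    return scale * m, scale * (1.0 - m)


alpha, beta = beta_parameters(0.5, scale=4.0)
# The mean alpha / (alpha + beta) recovers m, and alpha + beta recovers scale.
mean = alpha / (alpha + beta)
```

A larger scale concentrates the Beta density more tightly around the mean m without moving the mean itself.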
gpflow.likelihoods.Exponential#
- class gpflow.likelihoods.Exponential(invlink=<function exp>, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
- Parameters
invlink (Callable[[Tensor], Tensor])
kwargs (Any)
gpflow.likelihoods.Gamma#
- class gpflow.likelihoods.Gamma(invlink=<function exp>, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
Use the transformed GP to give the scale (inverse rate) of the Gamma distribution.
- Parameters
invlink (Callable[[Tensor], Tensor])
kwargs (Any)
gpflow.likelihoods.Gaussian#
- class gpflow.likelihoods.Gaussian(variance=1.0, variance_lower_bound=1e-06, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
The Gaussian likelihood is appropriate where uncertainties associated with the data are believed to follow a normal distribution, with constant variance.
Very small uncertainties can lead to numerical instability during the optimization process. A lower bound of 1e-6 is therefore imposed on the likelihood variance by default.
- Parameters
variance (float)
variance_lower_bound (float)
kwargs (Any)
gpflow.likelihoods.GaussianMC#
- class gpflow.likelihoods.GaussianMC(*args, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.MonteCarloLikelihood, gpflow.likelihoods.scalar_continuous.Gaussian
Stochastic version of Gaussian likelihood for demonstration purposes only.
- Parameters
args (Any)
kwargs (Any)
gpflow.likelihoods.HeteroskedasticTFPConditional#
- class gpflow.likelihoods.HeteroskedasticTFPConditional(distribution_class=<class 'tensorflow_probability.python.distributions.normal.Normal'>, scale_transform=None, **kwargs)[source]#
Bases:
gpflow.likelihoods.multilatent.MultiLatentTFPConditional
Heteroskedastic Likelihood where the conditional distribution is given by a TensorFlow Probability Distribution. The loc and scale of the distribution are given by a two-dimensional multi-output GP.
- Parameters
distribution_class (Type[Distribution])
scale_transform (Optional[Bijector])
kwargs (Any)
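The idea of two latent functions driving the loc and the scale of one observation can be sketched without TensorFlow Probability. The snippet below is a hand-written Normal log-density, not the library's code path, and it assumes `exp` as the positivity transform for the scale; which transform is used when `scale_transform` is None is not stated on this page, so treat that choice as an assumption.

```python
import math


def heteroskedastic_normal_logp(f, y, scale_transform=math.exp):
    """log N(y | loc, scale**2) with loc = f[0] and scale = scale_transform(f[1])."""
    loc, raw_scale = f
    scale = scale_transform(raw_scale)  # must be strictly positive
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * ((y - loc) / scale) ** 2


# Same loc, different latent scale: a wider noise level explains a distant
# observation better than a narrow one.
narrow = heteroskedastic_normal_logp((0.0, 0.0), y=3.0)
wide = heteroskedastic_normal_logp((0.0, 1.0), y=3.0)
```

The second latent function is unconstrained, which is why a transform onto the positive reals is applied before it is used as a scale.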
gpflow.likelihoods.Likelihood#
- class gpflow.likelihoods.Likelihood(latent_dim, observation_dim)[source]#
Bases:
gpflow.base.Module
- Parameters
latent_dim (Optional[int])
observation_dim (Optional[int])
- conditional_mean(F)[source]#
The conditional mean of Y|F: [E[Y₁|F], …, E[Yₖ|F]] where K = observation_dim
- Parameters
F (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – function evaluation Tensor, with shape […, latent_dim]
- Return type
Tensor
- Returns
mean […, observation_dim]
- conditional_variance(F)[source]#
The conditional marginal variance of Y|F: [var(Y₁|F), …, var(Yₖ|F)] where K = observation_dim
- Parameters
F (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – function evaluation Tensor, with shape […, latent_dim]
- Return type
Tensor
- Returns
variance […, observation_dim]
- predict_log_density(Fmu, Fvar, Y)[source]#
Given a Normal distribution for the latent function, and a datum Y, compute the log predictive density of Y,
- i.e. if
q(F) = N(Fmu, Fvar)
and this object represents
p(y|F)
then this method computes the log predictive density
log ∫ p(y=Y|F) q(F) dF
- Parameters
Fmu (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – mean function evaluation Tensor, with shape […, latent_dim]
Fvar (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – variance of function evaluation Tensor, with shape […, latent_dim]
Y (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – observation Tensor, with shape […, observation_dim]
- Return type
Tensor
- Returns
log predictive density, with shape […]
- predict_mean_and_var(Fmu, Fvar)[source]#
Given a Normal distribution for the latent function, return the mean and marginal variance of Y,
- i.e. if
q(f) = N(Fmu, Fvar)
and this object represents
p(y|f)
then this method computes the predictive mean
∫∫ y p(y|f)q(f) df dy
and the predictive variance
∫∫ y² p(y|f)q(f) df dy - [ ∫∫ y p(y|f)q(f) df dy ]²
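- Parameters
Fmu (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – mean function evaluation Tensor, with shape […, latent_dim]
Fvar (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – variance of function evaluation Tensor, with shape […, latent_dim]
- Return type
Tuple[Tensor, Tensor]
- Returns
mean and variance, both with shape […, observation_dim]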
- variational_expectations(Fmu, Fvar, Y)[source]#
Compute the expected log density of the data, given a Gaussian distribution for the function values,
- i.e. if
q(f) = N(Fmu, Fvar)
and this object represents
p(y|f)
then this method computes
∫ log(p(y=Y|f)) q(f) df.
This only works if the broadcasting dimensions of the statistics of q(f) (mean and variance) are broadcastable with those of the data Y.
- Parameters
Fmu (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – mean function evaluation Tensor, with shape […, latent_dim]
Fvar (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – variance of function evaluation Tensor, with shape […, latent_dim]
Y (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – observation Tensor, with shape […, observation_dim]
- Return type
Tensor
- Returns
expected log density of the data given q(F), with shape […]
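As a worked instance of the predictive mean and variance integrals used by predict_mean_and_var, consider a binary observation with an inverse-probit link. The sketch below is plain Python with hypothetical function names, not GPflow's implementation. It relies on the standard identity ∫ Φ(f) N(f | μ, v) df = Φ(μ / √(1 + v)) and on y² = y for y in {0, 1}.

```python
import math


def inv_probit(f):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(f / math.sqrt(2.0)))


def bernoulli_predict_mean_and_var(fmu, fvar):
    """Predictive mean and variance of y in {0, 1} under q(f) = N(fmu, fvar)."""
    # E[y] = ∫ Φ(f) N(f | fmu, fvar) df = Φ(fmu / sqrt(1 + fvar))
    mean = inv_probit(fmu / math.sqrt(1.0 + fvar))
    # y² = y for binary y, so E[y²] - E[y]² = mean - mean²
    var = mean - mean ** 2
    return mean, var


confident = bernoulli_predict_mean_and_var(1.0, 0.0)   # no latent uncertainty
uncertain = bernoulli_predict_mean_and_var(1.0, 4.0)   # large latent uncertainty
```

Increasing the latent variance pulls the predictive mean towards 0.5, which in turn increases the predictive variance.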
gpflow.likelihoods.MonteCarloLikelihood#
- class gpflow.likelihoods.MonteCarloLikelihood(*args, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.Likelihood
- Parameters
args (Any)
kwargs (Any)
gpflow.likelihoods.MultiClass#
- class gpflow.likelihoods.MultiClass(num_classes, invlink=None, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.Likelihood
- Parameters
num_classes (int)
invlink (Optional[RobustMax])
kwargs (Any)
gpflow.likelihoods.MultiLatentLikelihood#
- class gpflow.likelihoods.MultiLatentLikelihood(latent_dim, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.QuadratureLikelihood
A Likelihood which assumes that a single dimensional observation is driven by multiple latent GPs.
Note that this implementation does not allow for taking into account covariance between outputs.
- Parameters
latent_dim (int)
kwargs (Any)
gpflow.likelihoods.MultiLatentTFPConditional#
- class gpflow.likelihoods.MultiLatentTFPConditional(latent_dim, conditional_distribution, **kwargs)[source]#
Bases:
gpflow.likelihoods.multilatent.MultiLatentLikelihood
MultiLatent likelihood where the conditional distribution is given by a TensorFlow Probability Distribution.
- Parameters
latent_dim (int)
conditional_distribution (Callable[..., Distribution])
kwargs (Any)
gpflow.likelihoods.Ordinal#
- class gpflow.likelihoods.Ordinal(bin_edges, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
A likelihood for doing ordinal regression.
The data are integer values from 0 to K, and the user must specify K ‘bin edges’ which define the points at which the labels switch. Let the bin edges be [a₀, a₁, …, aₖ₋₁], then the likelihood is
p(Y=0|F) = ɸ((a₀ - F) / σ)
p(Y=1|F) = ɸ((a₁ - F) / σ) - ɸ((a₀ - F) / σ)
p(Y=2|F) = ɸ((a₂ - F) / σ) - ɸ((a₁ - F) / σ)
…
p(Y=K|F) = 1 - ɸ((aₖ₋₁ - F) / σ)
where ɸ is the cumulative distribution function of a standard Gaussian (the inverse probit function) and σ is a parameter to be learned.
A reference is Chu and Ghahramani [CG05].
- Parameters
bin_edges (ndarray[Any, Any])
kwargs (Any)
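The ordinal probabilities above are differences of a Gaussian CDF evaluated at consecutive bin edges. A minimal plain-Python sketch follows; the names `ordinal_probs` and `std_normal_cdf` are hypothetical, and σ is passed as a fixed number here rather than learned. Following the formulas, three bin edges give four labels.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of a standard Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordinal_probs(f, bin_edges, sigma=1.0):
    """Return [p(Y=0|F), ..., p(Y=K|F)] for K sorted bin edges."""
    # CDF at each edge, padded with 0 on the left and 1 on the right so that
    # consecutive differences give the probability mass of each label.
    cdf = [0.0] + [std_normal_cdf((a - f) / sigma) for a in bin_edges] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


probs = ordinal_probs(f=0.3, bin_edges=[-1.0, 0.0, 1.0], sigma=0.5)
```

The latent value 0.3 lies between the edges 0.0 and 1.0, so label 2 receives the largest probability in this example.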
gpflow.likelihoods.Poisson#
- class gpflow.likelihoods.Poisson(invlink=<function exp>, binsize=1.0, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
Poisson likelihood for use with count data, where the rate is given by the (transformed) GP.
Let g(·) be the inverse-link function, then this likelihood represents
p(yᵢ | fᵢ) = Poisson(yᵢ | g(fᵢ) * binsize)
Note: binsize is for use in a Log Gaussian Cox process (doubly stochastic model), where the rate function of an inhomogeneous Poisson process is given by a GP. The intractable likelihood can be approximated via a Riemann sum (with bins of size ‘binsize’) and using this Poisson likelihood.
- Parameters
invlink (Callable[[Tensor], Tensor])
binsize (float)
kwargs (Any)
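The Poisson log-density with a bin size can be written out directly. The following plain-Python sketch uses a hypothetical function name `poisson_logp` and is not GPflow's implementation; it simply evaluates log Poisson(y | g(f) * binsize) with g = exp by default.

```python
import math


def poisson_logp(f, y, invlink=math.exp, binsize=1.0):
    """log Poisson(y | invlink(f) * binsize) for a non-negative integer count y."""
    rate = invlink(f) * binsize
    # log of rate**y * exp(-rate) / y!, with lgamma(y + 1) = log(y!)
    return y * math.log(rate) - rate - math.lgamma(y + 1.0)


# Halving the bin size halves the expected count in each bin.
total_mass = sum(math.exp(poisson_logp(0.2, y, binsize=0.5)) for y in range(60))
```

With binsize below 1, the same latent rate is spread over narrower bins, so smaller counts become more probable in each bin.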
gpflow.likelihoods.QuadratureLikelihood#
- class gpflow.likelihoods.QuadratureLikelihood(latent_dim, observation_dim, *, quadrature=None)[source]#
Bases:
gpflow.likelihoods.base.Likelihood
- Parameters
latent_dim (Optional[int])
observation_dim (Optional[int])
quadrature (Optional[GaussianQuadrature])
gpflow.likelihoods.RobustMax#
- class gpflow.likelihoods.RobustMax(num_classes, epsilon=0.001, **kwargs)[source]#
Bases:
gpflow.base.Module
This class represents a multi-class inverse-link function. Given a vector f = [f₁, f₂, …, fₖ], the result of the mapping is
y = [y₁, …, yₖ]
with
yᵢ = 1 - ε if i = argmax(f), and yᵢ = ε / (k - 1) otherwise,
where k is the number of classes.
- Parameters
num_classes (int)
epsilon (float)
kwargs (Any)
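A robust-max mapping is short enough to write out in plain Python. The sketch below assumes the usual definition, in which the class with the largest latent value receives probability 1 - ε and the remaining ε is shared equally among the other k - 1 classes; `robust_max` is a hypothetical helper, not the class's method.

```python
def robust_max(f, epsilon=0.001):
    """Map latent values f = [f_1, ..., f_k] to class probabilities."""
    k = len(f)
    winner = max(range(k), key=lambda i: f[i])  # index of the largest latent value
    return [1.0 - epsilon if i == winner else epsilon / (k - 1) for i in range(k)]


probs = robust_max([0.1, 2.0, -0.5])
```

Only the ordering of the latent values matters here, not their magnitudes, which is what distinguishes this link from a soft-max.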
gpflow.likelihoods.ScalarLikelihood#
- class gpflow.likelihoods.ScalarLikelihood(**kwargs)[source]#
Bases:
gpflow.likelihoods.base.QuadratureLikelihood
A likelihood class that helps with scalar likelihood functions: likelihoods where each scalar latent function is associated with a single scalar observation variable.
If there are multiple latent functions, then there must be a corresponding number of data: we check for this.
The Likelihood class contains methods to compute marginal statistics of functions of the latents and the data ϕ(y,f):
variational_expectations: ϕ(y,f) = log p(y|f)
predict_log_density: ϕ(y,f) = p(y|f)
Those statistics are computed after having first marginalized the latent processes f under a multivariate normal distribution q(f) that is fully factorized.
Some univariate integrals can be done by quadrature: we implement quadrature routines for 1D integrals in this class, though they may be overwritten by inheriting classes where those integrals are available in closed form.
- Parameters
kwargs (Any)
gpflow.likelihoods.Softmax#
- class gpflow.likelihoods.Softmax(num_classes, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.MonteCarloLikelihood
The soft-max multi-class likelihood. It can only provide a stochastic Monte-Carlo estimate of the variational expectations term, but this added variance tends to be small compared to that due to mini-batching (when using the SVGP model).
- Parameters
num_classes (int)
kwargs (Any)
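A Monte Carlo estimate of the variational expectations for a soft-max likelihood can be sketched as follows. This is a plain-Python illustration with hypothetical function names and a factorised Gaussian q(f); it is not how GPflow draws or batches its samples.

```python
import math
import random


def softmax_logp(f, y):
    """log softmax(f)[y], computed with the log-sum-exp trick for stability."""
    top = max(f)
    log_norm = top + math.log(sum(math.exp(v - top) for v in f))
    return f[y] - log_norm


def mc_variational_expectation(fmu, fvar, y, num_samples=1000, seed=0):
    """Average log p(y|f) over samples f ~ N(fmu, diag(fvar))."""
    rng = random.Random(seed)  # seeded so that the estimate is reproducible
    total = 0.0
    for _ in range(num_samples):
        f = [rng.gauss(m, math.sqrt(v)) for m, v in zip(fmu, fvar)]
        total += softmax_logp(f, y)
    return total / num_samples


estimate = mc_variational_expectation([1.0, 0.0, -1.0], [0.5, 0.5, 0.5], y=0)
```

The estimate is noisy, and its variance shrinks roughly in proportion to 1 / num_samples.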
gpflow.likelihoods.StudentT#
- class gpflow.likelihoods.StudentT(scale=1.0, df=3.0, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
- Parameters
scale (float)
df (float)
kwargs (Any)
gpflow.likelihoods.SwitchedLikelihood#
- class gpflow.likelihoods.SwitchedLikelihood(likelihood_list, **kwargs)[source]#
Bases:
gpflow.likelihoods.base.ScalarLikelihood
- Parameters
likelihood_list (Iterable[ScalarLikelihood])
kwargs (Any)