gpflow.kernels#
Classes#
gpflow.kernels.AnisotropicStationary#
- class gpflow.kernels.AnisotropicStationary(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.Stationary
Base class for anisotropic stationary kernels, i.e. kernels that only depend on
d = x - x’
Derived classes should implement K_d(self, d): Returns the kernel evaluated on d, which is the pairwise difference matrix, scaled by the lengthscale parameter ℓ (i.e. [(X - X2ᵀ) / ℓ]). The last axis corresponds to the input dimension.
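As a rough illustration of the input that K_d receives, the following NumPy sketch builds the scaled pairwise difference tensor. The helper name `scaled_difference` is hypothetical and this is not GPflow's implementation; it only mirrors the description above.

```python
import numpy as np

def scaled_difference(X, X2, lengthscales):
    """Pairwise differences (X[i] - X2[j]) / lengthscales, shape [N, M, D]."""
    X = np.asarray(X, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # Broadcasting: [N, 1, D] - [1, M, D] -> [N, M, D]; the last axis is the input dimension.
    return (X[:, None, :] - X2[None, :, :]) / lengthscales

X = np.array([[0.0, 0.0], [1.0, 2.0]])               # N = 2 points, D = 2
X2 = np.array([[1.0, 1.0], [2.0, 0.0], [0.0, 4.0]])  # M = 3 points
d = scaled_difference(X, X2, lengthscales=np.array([1.0, 2.0]))
```

A derived kernel would then map `d` to an [N, M] covariance matrix.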
gpflow.kernels.ArcCosine#
- class gpflow.kernels.ArcCosine(order=0, variance=1.0, weight_variances=1.0, bias_variance=1.0, *, active_dims=None, name=None)[source]#
Bases:
gpflow.kernels.base.Kernel
The Arc-cosine family of kernels which mimics the computation in neural networks. The order parameter specifies the assumed activation function. The Multi Layer Perceptron (MLP) kernel is closely related to the ArcCosine kernel of order 0.
The key reference is Cho and Saul [CS09].
- Parameters
order (int) –
variance (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
weight_variances (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
bias_variance (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
active_dims (Union[slice, Sequence[int], None]) –
name (Optional[str]) –
- property ard: bool#
Whether ARD behaviour is active.
- Return type
bool
gpflow.kernels.Bias#
- gpflow.kernels.Bias#
alias of
gpflow.kernels.statics.Constant
gpflow.kernels.ChangePoints#
- class gpflow.kernels.ChangePoints(kernels, locations, steepness=1.0, name=None)[source]#
Bases:
gpflow.kernels.base.Combination
The ChangePoints kernel defines a fixed number of change-points along a 1d input space where different kernels govern different parts of the space.
The kernel is formed by multiplication and addition of the base kernels with sigmoid functions (σ). A single change-point kernel is defined as:
K₁(x, x') * (1 - σ(x)) * (1 - σ(x')) + K₂(x, x') * σ(x) * σ(x')
where K₁ is deactivated around the change-point and K₂ is activated. The single change-point version can be found in Lloyd [Llo14]. Each sigmoid is a logistic function defined as:
σ(x) = 1 / (1 + exp{-s(x - x₀)})
parameterized by location “x₀” and steepness “s”.
The key reference is Lloyd [Llo14].
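The single change-point formula can be sketched in NumPy as follows. The base kernels `k1` and `k2` and the function names are illustrative assumptions, and the code evaluates the formula directly rather than reproducing GPflow's implementation.

```python
import numpy as np

def sigmoid(x, location, steepness):
    """Logistic function 1 / (1 + exp(-s (x - x0)))."""
    return 1.0 / (1.0 + np.exp(-steepness * (x - location)))

def k1(x, x2):
    """Illustrative base kernel: squared exponential on 1-D inputs."""
    return np.exp(-0.5 * (x[:, None] - x2[None, :]) ** 2)

def k2(x, x2):
    """Illustrative base kernel: constant covariance of 2."""
    return 2.0 * np.ones((len(x), len(x2)))

def single_changepoint(x, x2, location, steepness=1.0):
    """K1 (1 - s(x)) (1 - s(x')) + K2 s(x) s(x') for 1-D inputs."""
    sx = sigmoid(x, location, steepness)[:, None]
    sx2 = sigmoid(x2, location, steepness)[None, :]
    return k1(x, x2) * (1 - sx) * (1 - sx2) + k2(x, x2) * sx * sx2

x = np.array([-5.0, 5.0])  # one point on each side of the change-point
K = single_changepoint(x, x, location=0.0, steepness=10.0)
```

With a steep sigmoid, points well to the left of x₀ are governed by K₁, points well to the right by K₂, and the cross-covariance between the two sides is close to zero.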
gpflow.kernels.Combination#
- class gpflow.kernels.Combination(kernels, name=None)[source]#
Bases:
gpflow.kernels.base.Kernel
Combine a list of kernels, e.g. by adding or multiplying (see inheriting classes).
The names of the kernels to be combined are generated from their class names.
- Parameters
kernels (Sequence[Kernel]) –
name (Optional[str]) –
- property on_separate_dimensions: bool#
Checks whether the kernels in the combination act on disjoint subsets of dimensions. Currently, it is hard to assess whether two slice objects will overlap, so this will always return False.
- Return type
bool
- Returns
Boolean indicator.
gpflow.kernels.Convolutional#
- class gpflow.kernels.Convolutional(base_kernel, image_shape, patch_shape, weights=None, colour_channels=1)[source]#
Bases:
gpflow.kernels.base.Kernel
Plain convolutional kernel as described in van der Wilk et al. [vdWRH17]. Defines a GP f(·) that is constructed from a sum of responses of individual patches in an image:
f(x) = Σₚ g(x^[p])
where x^[p] is the p’th patch in the image and g is the patch response function.
The key reference is van der Wilk et al. [vdWRH17].
gpflow.kernels.Coregion#
- class gpflow.kernels.Coregion(output_dim, rank, *, active_dims=None, name=None)[source]#
Bases:
gpflow.kernels.base.Kernel
A Coregionalization kernel. The inputs to this kernel are integers (we cast them from floats as needed) which usually specify the outputs of a Coregionalization model.
The kernel function is an indexing of a positive-definite matrix:
K(x, y) = B[x, y] .
To ensure that B is positive-definite, it is specified by the two parameters of this kernel, W and kappa:
B = W Wᵀ + diag(kappa) .
We refer to the size of B as “output_dim x output_dim”, since this is the number of outputs in a coregionalization model. We refer to the number of columns of W as ‘rank’: it is the number of degrees of correlation between the outputs.
NB. There is a symmetry between the elements of W, which creates a local minimum at W=0. To avoid this, it is recommended to initialize the optimization (or MCMC chain) using a random W.
- Parameters
output_dim (int) –
rank (int) –
active_dims (Union[slice, Sequence[int], None]) –
name (Optional[str]) –
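The construction B = W Wᵀ + diag(kappa) and the indexing K(x, y) = B[x, y] can be illustrated with plain NumPy. The variable names below are assumptions for the sketch; W is drawn at random, in line with the initialization advice above.

```python
import numpy as np

rng = np.random.default_rng(0)
output_dim, rank = 3, 2

W = rng.standard_normal((output_dim, rank))  # random W avoids the W = 0 local minimum
kappa = np.ones(output_dim)                  # positive diagonal term

# B is positive-definite: W W^T is positive semi-definite and kappa > 0.
B = W @ W.T + np.diag(kappa)

# Inputs are integer output indices; the kernel is a lookup into B.
x = np.array([0, 2, 1, 0])
y = np.array([1, 2])
K = B[x[:, None], y[None, :]]  # shape [len(x), len(y)]
```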
gpflow.kernels.Cosine#
- class gpflow.kernels.Cosine(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.AnisotropicStationary
The Cosine kernel. Functions drawn from a GP with this kernel are sinusoids (with a random phase). The kernel equation is
k(d) = σ² cos{2πd}
where: d is the sum of the per-dimension differences between the input points, scaled by the lengthscale parameter ℓ (i.e. Σᵢ [(X - X2ᵀ) / ℓ]ᵢ), σ² is the variance parameter.
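A minimal NumPy sketch of this equation, assuming inputs of shape [N, D]; the function name `cosine_kernel` is hypothetical and the code is not taken from GPflow.

```python
import numpy as np

def cosine_kernel(X, X2, variance=1.0, lengthscales=1.0):
    """sigma^2 cos(2 pi d), with d the sum of scaled per-dimension differences."""
    diff = (X[:, None, :] - X2[None, :, :]) / lengthscales  # [N, M, D]
    d = diff.sum(axis=-1)                                   # [N, M]
    return variance * np.cos(2.0 * np.pi * d)

X = np.array([[0.0], [0.5], [1.0]])
K = cosine_kernel(X, X, variance=2.0)
```

Points half a unit apart are perfectly anti-correlated and points one unit apart are perfectly correlated, which is what sinusoidal sample paths require.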
gpflow.kernels.Exponential#
- class gpflow.kernels.Exponential(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.IsotropicStationary
The Exponential kernel. It is equivalent to a Matern12 kernel with doubled lengthscales.
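The stated equivalence can be checked numerically. The sketch below assumes the Exponential kernel has the form σ² exp{-r/2} in the scaled distance r, which is what the statement above implies when Matern12 is σ² exp{-r}; both function names are hypothetical.

```python
import numpy as np

def matern12(distance, variance, lengthscale):
    """sigma^2 exp(-r), with r = distance / lengthscale."""
    return variance * np.exp(-distance / lengthscale)

def exponential(distance, variance, lengthscale):
    """Assumed form sigma^2 exp(-r / 2), with r = distance / lengthscale."""
    return variance * np.exp(-0.5 * distance / lengthscale)

distance = np.linspace(0.0, 5.0, 11)
k_exp = exponential(distance, variance=1.5, lengthscale=0.7)
k_m12 = matern12(distance, variance=1.5, lengthscale=1.4)  # doubled lengthscale
```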
gpflow.kernels.IndependentLatent#
- class gpflow.kernels.IndependentLatent(active_dims=None, name=None)[source]#
Bases:
gpflow.kernels.multioutput.kernels.MultioutputKernel
Base class for multioutput kernels that are constructed from independent latent Gaussian processes.
It should always be possible to specify inducing variables for such kernels that give a block-diagonal Kuu, which can be represented as a [L, M, M] tensor. A reasonable (but not optimal) inference procedure can be specified by placing the inducing points in the latent processes and simply computing Kuu [L, M, M] and Kuf [N, P, M, L] and using fallback_independent_latent_conditional(). This can be specified by using Fallback{Separate|Shared}IndependentInducingVariables.
- Parameters
active_dims (Union[slice, Sequence[int], None]) –
name (Optional[str]) –
gpflow.kernels.IsotropicStationary#
- class gpflow.kernels.IsotropicStationary(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.Stationary
Base class for isotropic stationary kernels, i.e. kernels that only depend on
r = ‖x - x’‖
Derived classes should implement one of:
K_r2(self, r2): Returns the kernel evaluated on r² (r2), which is the squared scaled Euclidean distance. Should operate element-wise on r2.
K_r(self, r): Returns the kernel evaluated on r, which is the scaled Euclidean distance. Should operate element-wise on r.
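To make the two options concrete, here is a NumPy sketch of the scaled squared distance together with one element-wise function of each kind. The squared exponential and Matern 1/2 forms are used as examples; the helper names are assumptions and this is not GPflow code.

```python
import numpy as np

def scaled_squared_distance(X, X2, lengthscales):
    """r^2 = sum_d ((X[i, d] - X2[j, d]) / lengthscales[d])^2, shape [N, M]."""
    diff = (X[:, None, :] - X2[None, :, :]) / lengthscales
    return np.sum(diff ** 2, axis=-1)

def K_r2(r2, variance=1.0):
    """Example defined on r^2: squared exponential, sigma^2 exp(-r^2 / 2)."""
    return variance * np.exp(-0.5 * r2)

def K_r(r, variance=1.0):
    """Example defined on r: Matern 1/2, sigma^2 exp(-r)."""
    return variance * np.exp(-r)

X = np.array([[0.0, 0.0], [3.0, 4.0]])
r2 = scaled_squared_distance(X, X, lengthscales=1.0)
r = np.sqrt(r2)
K_se = K_r2(r2)
K_m12 = K_r(r)
```

Kernels that are smooth in r² avoid the square root entirely, which is one reason to prefer implementing K_r2 when the formula allows it.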
gpflow.kernels.Kernel#
- class gpflow.kernels.Kernel(active_dims=None, name=None)[source]#
Bases:
gpflow.base.Module
The basic kernel class. Handles active dims.
- Parameters
active_dims (Union[slice, Sequence[int], None]) –
name (Optional[str]) –
- on_separate_dims(other)[source]#
Checks if the dimensions, over which the kernels are specified, overlap. Returns True if they are defined on different/separate dimensions and False otherwise.
- Parameters
other (Kernel) –
- Return type
bool
- slice(X, X2=None)[source]#
Slice the correct dimensions for use in the kernel, as indicated by self.active_dims.
- slice_cov(cov)[source]#
Slice the correct dimensions for use in the kernel, as indicated by self.active_dims for covariance matrices. This requires slicing the rows and columns. This will also turn flattened diagonal matrices into a tensor of full diagonal matrices.
- Parameters
cov (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – Tensor of covariance matrices, [N, D, D] or [N, D].
- Return type
Tensor
- Returns
[N, I, I].
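The behaviour described for slice_cov can be sketched in NumPy as follows, assuming active_dims is a list of integer indices (slice objects are not handled here). This illustrates the shapes only and is not GPflow's implementation.

```python
import numpy as np

def slice_cov_sketch(cov, active_dims):
    """Select rows and columns of each covariance matrix; returns [N, I, I]."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim == 2:
        # [N, D] flattened diagonals -> [N, D, D] full diagonal matrices
        cov = np.stack([np.diag(row) for row in cov])
    idx = np.asarray(active_dims)
    # Index rows and columns together: [N, D, D] -> [N, I, I]
    return cov[:, idx[:, None], idx[None, :]]

cov_diag = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # [N=2, D=3]
out_diag = slice_cov_sketch(cov_diag, [0, 2])

cov_full = np.arange(18, dtype=float).reshape(2, 3, 3)   # [N=2, D=3, D=3]
out_full = slice_cov_sketch(cov_full, [0, 2])
```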
gpflow.kernels.Linear#
- class gpflow.kernels.Linear(variance=1.0, active_dims=None)[source]#
Bases:
gpflow.kernels.base.Kernel
The linear kernel. Functions drawn from a GP with this kernel are linear, i.e. f(x) = cx. The kernel equation is
k(x, y) = σ²xy
where σ² is the variance parameter.
- Parameters
variance (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
active_dims (Union[slice, Sequence[int], None]) –
- property ard: bool#
Whether ARD behaviour is active.
- Return type
bool
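The linear kernel equation can be sketched with NumPy as below. The function name is hypothetical; passing a vector for `variance` gives one variance per input dimension, which is how ARD behaviour is assumed to look in this sketch.

```python
import numpy as np

def linear_kernel(X, X2, variance=1.0):
    """k(x, y) = sum_d variance[d] * x[d] * y[d]; variance may be scalar or [D]."""
    return (X * variance) @ X2.T

X = np.array([[1.0, 2.0]])
X2 = np.array([[3.0, 4.0]])

K_shared = linear_kernel(X, X2, variance=2.0)                # 2 * (1*3 + 2*4)
K_ard = linear_kernel(X, X2, variance=np.array([1.0, 0.5]))  # 1*3 + 0.5*2*4
```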
gpflow.kernels.LinearCoregionalization#
- class gpflow.kernels.LinearCoregionalization(kernels, W, name=None)[source]#
Bases:
gpflow.kernels.multioutput.kernels.IndependentLatent, gpflow.kernels.base.Combination
Linear mixing of the latent GPs to form the output.
- K(X, X2=None, full_output_cov=True)[source]#
Returns the covariance of f(X) and f(X2), where f(.) can be multi-dimensional.
- Return type
Tensor
- Returns
cov[f(X), f(X2)] with shape:
[N1, P, N2, P] if full_output_cov = True
[P, N1, N2] if full_output_cov = False
- K_diag(X, full_output_cov=True)[source]#
Returns the covariance of f(X) and f(X), where f(.) can be multi-dimensional.
- Parameters
X (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – data matrix, [N, D]
full_output_cov (bool) – calculate covariance between outputs.
- Return type
Tensor
- Returns
var[f(X)] with shape:
[N, P, P] if full_output_cov = True
[N, P] if full_output_cov = False
- property latent_kernels: Tuple[gpflow.kernels.base.Kernel, ...]#
The underlying kernels in the multioutput kernel
- Return type
Tuple[Kernel, ...]
- property num_latent_gps: int#
The number of latent GPs in the multioutput kernel
- Return type
int
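Linear mixing means that output p is f_p = Σ_l W[p, l] g_l for independent latent GPs g_l, so cov[f_p(x), f_q(x')] = Σ_l W[p, l] W[q, l] k_l(x, x'). The NumPy sketch below evaluates that expression in both output layouts; the latent kernels, sizes, and names are assumptions for illustration, not GPflow code.

```python
import numpy as np

def squared_exponential(X, X2, lengthscale):
    """Illustrative latent kernel with unit variance."""
    sq = np.sum((X[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

rng = np.random.default_rng(0)
N1, N2, D, L, P = 4, 3, 2, 2, 3
X = rng.standard_normal((N1, D))
X2 = rng.standard_normal((N2, D))
W = rng.standard_normal((P, L))  # mixing matrix, [P, L]

# One kernel matrix per latent GP: [L, N1, N2]
Kg = np.stack([squared_exponential(X, X2, 0.5),
               squared_exponential(X, X2, 2.0)])

# Full output covariance, [N1, P, N2, P]
K_full = np.einsum("pl,lnm,ql->npmq", W, Kg, W)
# Per-output covariance only, [P, N1, N2]
K_marginal = np.einsum("pl,lnm->pnm", W ** 2, Kg)
```

The per-output layout is the p = q part of the full tensor, which is why it is cheaper when cross-output covariances are not needed.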
gpflow.kernels.Matern12#
- class gpflow.kernels.Matern12(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.IsotropicStationary
The Matern 1/2 kernel. Functions drawn from a GP with this kernel are not differentiable anywhere. The kernel equation is
k(r) = σ² exp{-r}
where: r is the Euclidean distance between the input points, scaled by the lengthscales parameter ℓ, σ² is the variance parameter.
gpflow.kernels.Matern32#
- class gpflow.kernels.Matern32(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.IsotropicStationary
The Matern 3/2 kernel. Functions drawn from a GP with this kernel are once differentiable. The kernel equation is
k(r) = σ² (1 + √3r) exp{-√3 r}
where: r is the Euclidean distance between the input points, scaled by the lengthscales parameter ℓ, σ² is the variance parameter.
gpflow.kernels.Matern52#
- class gpflow.kernels.Matern52(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.stationaries.IsotropicStationary
The Matern 5/2 kernel. Functions drawn from a GP with this kernel are twice differentiable. The kernel equation is
k(r) = σ² (1 + √5 r + (5/3) r²) exp{-√5 r}
where: r is the Euclidean distance between the input points, scaled by the lengthscales parameter ℓ, σ² is the variance parameter.
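The three Matern equations can be written side by side as element-wise functions of the scaled distance r. This is a NumPy sketch with hypothetical function names, intended only to mirror the formulas in this section.

```python
import numpy as np

def matern12(r, variance=1.0):
    """sigma^2 exp(-r)"""
    return variance * np.exp(-r)

def matern32(r, variance=1.0):
    """sigma^2 (1 + sqrt(3) r) exp(-sqrt(3) r)"""
    s = np.sqrt(3.0) * r
    return variance * (1.0 + s) * np.exp(-s)

def matern52(r, variance=1.0):
    """sigma^2 (1 + sqrt(5) r + (5/3) r^2) exp(-sqrt(5) r)"""
    s = np.sqrt(5.0) * r
    return variance * (1.0 + s + (5.0 / 3.0) * r ** 2) * np.exp(-s)

r = np.linspace(0.0, 3.0, 7)  # scaled distances, i.e. distance / lengthscale
k12, k32, k52 = matern12(r), matern32(r), matern52(r)
```

All three equal σ² at r = 0 and differ in how quickly they decay near the origin, which is what controls the smoothness of the sampled functions.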
gpflow.kernels.MultioutputKernel#
- class gpflow.kernels.MultioutputKernel(active_dims=None, name=None)[source]#
Bases:
gpflow.kernels.base.Kernel
Multi Output Kernel class. This kernel can represent correlation between outputs of different datapoints. Therefore, subclasses of MultioutputKernel should implement K, which returns:
[N, P, N, P] if full_output_cov = True
[P, N, N] if full_output_cov = False
and K_diag returns:
[N, P, P] if full_output_cov = True
[N, P] if full_output_cov = False
The full_output_cov argument specifies whether the kernel should calculate the covariance between the outputs. If there is no correlation between outputs but full_output_cov is set to True, the covariance matrix is filled with zeros until the appropriate size is reached.
- Parameters
active_dims (Union[slice, Sequence[int], None]) –
name (Optional[str]) –
- abstract K(X, X2=None, full_output_cov=True)[source]#
Returns the covariance of f(X) and f(X2), where f(.) can be multi-dimensional.
- Return type
Tensor
- Returns
cov[f(X), f(X2)] with shape:
[N1, P, N2, P] if full_output_cov = True
[P, N1, N2] if full_output_cov = False
- abstract K_diag(X, full_output_cov=True)[source]#
Returns the covariance of f(X) and f(X), where f(.) can be multi-dimensional.
- Parameters
X (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – data matrix, [N, D]
full_output_cov (bool) – calculate covariance between outputs.
- Return type
Tensor
- Returns
var[f(X)] with shape:
[N, P, P] if full_output_cov = True
[N, P] if full_output_cov = False
- abstract property latent_kernels: Tuple[gpflow.kernels.base.Kernel, ...]#
The underlying kernels in the multioutput kernel
- Return type
Tuple[Kernel, ...]
- abstract property num_latent_gps: int#
The number of latent GPs in the multioutput kernel
- Return type
int
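The shape conventions for this class can be illustrated for the simplest case of P uncorrelated outputs. The NumPy sketch below expands per-output covariances into the full_output_cov = True layouts by filling the cross-output blocks with zeros; the kernel matrices and variable names are made up for the example.

```python
import numpy as np

P, N = 2, 3
x = np.linspace(0.0, 1.0, N)
sq = (x[:, None] - x[None, :]) ** 2

# Per-output covariances, [P, N, N] (layout for full_output_cov = False)
K_sep = np.stack([np.exp(-0.5 * sq), 2.0 * np.exp(-2.0 * sq)])

# Full layout, [N, P, N, P]: blocks between different outputs stay zero
K_full = np.zeros((N, P, N, P))
for p in range(P):
    K_full[:, p, :, p] = K_sep[p]

# K_diag layouts: [N, P] marginal variances, expanded to [N, P, P]
var = np.stack([np.diag(K_sep[p]) for p in range(P)], axis=1)  # [N, P]
var_full = var[:, :, None] * np.eye(P)                         # [N, P, P]
```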
gpflow.kernels.Periodic#
- class gpflow.kernels.Periodic(base_kernel, period=1.0)[source]#
Bases:
gpflow.kernels.base.Kernel
The periodic family of kernels. Can be used to wrap any Stationary kernel to transform it into a periodic version. The canonical form (based on the SquaredExponential kernel) can be found in Equation (47) of
D.J.C.MacKay. Introduction to Gaussian processes. In C.M.Bishop, editor, Neural Networks and Machine Learning, pages 133–165. Springer, 1998.
The derivation can be achieved by mapping the original inputs through the transformation u = (cos(x), sin(x)).
For the SquaredExponential base kernel, the result can be expressed as:
k(r) = σ² exp{ -0.5 sin²(π r / γ) / ℓ²}
where: r is the Euclidean distance between the input points, ℓ is the lengthscales parameter, σ² is the variance parameter, γ is the period parameter.
- NOTE: usually we have a factor of 4 instead of 0.5 in front but this is absorbed into the lengthscales hyperparameter.
- NOTE: periodic kernel uses active_dims of a base kernel, therefore the constructor doesn’t have it as an argument.
- Parameters
base_kernel (IsotropicStationary) –
period (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
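For one-dimensional inputs, the periodic form with a SquaredExponential base kernel can be evaluated directly from the equation above. This NumPy sketch uses a hypothetical function name and does not reproduce GPflow's implementation.

```python
import numpy as np

def periodic_se(x, x2, variance=1.0, lengthscale=1.0, period=1.0):
    """sigma^2 exp(-0.5 sin^2(pi r / period) / lengthscale^2) for 1-D inputs."""
    r = np.abs(x[:, None] - x2[None, :])
    s = np.sin(np.pi * r / period) / lengthscale
    return variance * np.exp(-0.5 * s ** 2)

x = np.array([0.0, 1.0, 2.0, 4.0])  # spacings of half a period and one period
K = periodic_se(x, x, variance=1.5, lengthscale=0.8, period=2.0)
```

Inputs one full period apart have covariance σ², while inputs half a period apart have the minimum covariance σ² exp{-0.5 / ℓ²}.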
gpflow.kernels.Polynomial#
- class gpflow.kernels.Polynomial(degree=3.0, variance=1.0, offset=1.0, active_dims=None)[source]#
Bases:
gpflow.kernels.linears.Linear
The Polynomial kernel. Functions drawn from a GP with this kernel are polynomials of degree d. The kernel equation is
k(x, y) = (σ²xy + γ)ᵈ
where: σ² is the variance parameter, γ is the offset parameter, d is the degree parameter.
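A short NumPy sketch of the polynomial kernel equation, with a hypothetical function name and a scalar variance assumed for simplicity:

```python
import numpy as np

def polynomial_kernel(X, X2, degree=3.0, variance=1.0, offset=1.0):
    """k(x, y) = (variance * x.y + offset) ** degree"""
    return (variance * (X @ X2.T) + offset) ** degree

X = np.array([[1.0, 2.0]])
X2 = np.array([[3.0, 4.0]])
K = polynomial_kernel(X, X2, degree=2.0, variance=1.0, offset=1.0)  # (11 + 1) ** 2
```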
gpflow.kernels.Product#
- class gpflow.kernels.Product(kernels, name=None)[source]#
Bases:
gpflow.kernels.base.ReducingCombination
- Parameters
kernels (Sequence[Kernel]) –
name (Optional[str]) –
gpflow.kernels.RBF#
- gpflow.kernels.RBF#
alias of
gpflow.kernels.stationaries.SquaredExponential
gpflow.kernels.RationalQuadratic#
- class gpflow.kernels.RationalQuadratic(variance=1.0, lengthscales=1.0, alpha=1.0, active_dims=None)[source]#
Bases:
gpflow.kernels.stationaries.IsotropicStationary
Rational Quadratic kernel,
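k(r) = σ² (1 + r² / (2αℓ²))^(-α)
where: r is the Euclidean distance between the input points, σ² is the variance parameter, ℓ is the lengthscales parameter, α is the alpha parameter, which determines the relative weighting of small-scale and large-scale fluctuations.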
For α → ∞, the RQ kernel becomes equivalent to the squared exponential.
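The limiting behaviour can be checked numerically. The sketch below compares the rational quadratic formula at a large α with the squared exponential σ² exp{-r² / (2ℓ²)}; the function names are hypothetical.

```python
import numpy as np

def rational_quadratic(r, variance=1.0, lengthscale=1.0, alpha=1.0):
    """sigma^2 (1 + r^2 / (2 alpha l^2)) ** (-alpha)"""
    return variance * (1.0 + r ** 2 / (2.0 * alpha * lengthscale ** 2)) ** (-alpha)

def squared_exponential(r, variance=1.0, lengthscale=1.0):
    """sigma^2 exp(-r^2 / (2 l^2))"""
    return variance * np.exp(-0.5 * r ** 2 / lengthscale ** 2)

r = np.linspace(0.0, 3.0, 13)
k_rq_small = rational_quadratic(r, alpha=1.0)  # heavier tails than the SE kernel
k_rq_large = rational_quadratic(r, alpha=1e6)  # close to the SE kernel
k_se = squared_exponential(r)
```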
gpflow.kernels.SeparateIndependent#
- class gpflow.kernels.SeparateIndependent(kernels, name=None)[source]#
Bases:
gpflow.kernels.multioutput.kernels.MultioutputKernel, gpflow.kernels.base.Combination
Separate: we use a different kernel for each output latent.
Independent: latents are uncorrelated a priori.
- Parameters
kernels (Sequence[Kernel]) –
name (Optional[str]) –
- K(X, X2=None, full_output_cov=True)[source]#
Returns the covariance of f(X) and f(X2), where f(.) can be multi-dimensional.
- Return type
Tensor
- Returns
cov[f(X), f(X2)] with shape:
[N1, P, N2, P] if full_output_cov = True
[P, N1, N2] if full_output_cov = False
- K_diag(X, full_output_cov=False)[source]#
Returns the covariance of f(X) and f(X), where f(.) can be multi-dimensional.
- Parameters
X (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) – data matrix, [N, D]
full_output_cov (bool) – calculate covariance between outputs.
- Return type
Tensor
- Returns
var[f(X)] with shape:
[N, P, P] if full_output_cov = True
[N, P] if full_output_cov = False
- property latent_kernels: Tuple[gpflow.kernels.base.Kernel, ...]#
The underlying kernels in the multioutput kernel
- Return type
Tuple[Kernel, ...]
- property num_latent_gps: int#
The number of latent GPs in the multioutput kernel
- Return type
int
gpflow.kernels.Static#
- class gpflow.kernels.Static(variance=1.0, active_dims=None)[source]#
Bases:
gpflow.kernels.base.Kernel
Kernels that don’t depend on the value of the inputs are ‘Static’. The only parameter is a variance, σ².
- Parameters
variance (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
active_dims (Union[slice, Sequence[int], None]) –
gpflow.kernels.Stationary#
- class gpflow.kernels.Stationary(variance=1.0, lengthscales=1.0, **kwargs)[source]#
Bases:
gpflow.kernels.base.Kernel
Base class for kernels that are stationary, that is, they only depend on
d = x - x’
This class handles ‘ard’ behaviour, which stands for ‘Automatic Relevance Determination’. With ARD, the kernel has one lengthscale per input dimension; otherwise the kernel is isotropic (has a single lengthscale).
- property ard: bool#
Whether ARD behaviour is active.
- Return type
bool
gpflow.kernels.Sum#
- class gpflow.kernels.Sum(kernels, name=None)[source]#
Bases:
gpflow.kernels.base.ReducingCombination
- Parameters
kernels (Sequence[Kernel]) –
name (Optional[str]) –
gpflow.kernels.White#
- class gpflow.kernels.White(variance=1.0, active_dims=None)[source]#
Bases:
gpflow.kernels.statics.Static
The White kernel: this kernel produces ‘white noise’. The kernel equation is
k(x_n, x_m) = δ(n, m) σ²
where: δ(.,.) is the Kronecker delta, σ² is the variance parameter.
- Parameters
variance (Union[ndarray[Any, Any], Tensor, Variable, Parameter]) –
active_dims (Union[slice, Sequence[int], None]) –
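On a single set of N points, the White kernel equation gives σ² times the identity matrix. The NumPy sketch below shows this and the common use of adding it to a smooth kernel matrix; the function name and the smooth kernel are assumptions for illustration.

```python
import numpy as np

def white_kernel(num_points, variance=1.0):
    """k(x_n, x_m) = delta(n, m) * sigma^2 on one set of points."""
    return variance * np.eye(num_points)

x = np.linspace(0.0, 1.0, 4)
K_smooth = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)  # illustrative smooth kernel
K_noisy = K_smooth + white_kernel(len(x), variance=0.1)
```

Only the diagonal changes, so the white term models independent noise at each input without affecting the covariance between distinct points.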