gpflow.optimizers
Classes
gpflow.optimizers.NaturalGradient
- class gpflow.optimizers.NaturalGradient(gamma, xi_transform=<gpflow.optimizers.natgrad.XiNat object>, name=None)
Bases: OptimizerV2
Implements a natural gradient descent optimizer for variational models that are based on a distribution q(u) = N(q_mu, q_sqrt q_sqrtᵀ) that is parameterized by mean q_mu and lower-triangular Cholesky factor q_sqrt of the covariance.
Note that this optimizer does not implement the standard API of tf.optimizers.Optimizer. Its only public method is minimize(), which has a custom signature (var_list needs to be a list of (q_mu, q_sqrt) tuples, where q_mu and q_sqrt are gpflow.Parameter instances, not tf.Variable).
Note furthermore that the natural gradients are implemented only for the full covariance case (i.e., q_diag=True is NOT supported).
When using in your work, please cite Salimbeni et al. [SEH18].
- Parameters:
  - gamma (Union[float, Tensor, ndarray])
  - xi_transform (XiTransform)
  - name (Optional[str])
- get_config()
Returns the config of the optimizer.
An optimizer config is a Python dictionary (serializable) containing the configuration of an optimizer. The same optimizer can be reinstantiated later (without any saved state) from this configuration.
- Returns: Python dictionary.
- Return type: Dict[str, Any]
- minimize(loss_fn, var_list)
Minimizes the objective function of the model. The natural gradient optimizer works with variational parameters only.
GPflow implements the XiNat (default) and XiSqrtMeanVar transformations for parameters. Custom transformations that implement the XiTransform interface are also possible.
- Parameters:
  - loss_fn (Callable[[], Tensor]) – Loss function.
  - var_list (Sequence[Union[Tuple[Parameter, Parameter], Tuple[Parameter, Parameter, XiTransform]]]) – List of pair tuples of variational parameters, or triplet tuples of variational parameters and a ξ transformation. If ξ is not specified, self.xi_transform is used. var_list[all][0] has shape [N, D]; var_list[all][1] has shape [D, N, N]. For example, var_list could be:
    var_list = [ (q_mu1, q_sqrt1), (q_mu2, q_sqrt2, XiSqrtMeanVar()) ]
- Return type: None
gpflow.optimizers.SamplingHelper
- class gpflow.optimizers.SamplingHelper(target_log_prob_fn, parameters)
Bases: object
This helper makes it easy to read from variables that have a prior set and to write values back to the same variables.
Example:
    import tensorflow as tf
    import tensorflow_probability as tfp
    from gpflow.optimizers import SamplingHelper

    model = ...  # Create a GPflow model
    hmc_helper = SamplingHelper(model.log_posterior_density, model.trainable_parameters)

    # The helper exposes the log density and the unconstrained variables it tracks.
    target_log_prob_fn = hmc_helper.target_log_prob_fn
    current_state = hmc_helper.current_state

    hmc = tfp.mcmc.HamiltonianMonteCarlo(target_log_prob_fn=target_log_prob_fn, ...)
    adaptive_hmc = tfp.mcmc.SimpleStepSizeAdaptation(hmc, ...)

    @tf.function
    def run_chain_fn():
        return tfp.mcmc.sample_chain(
            num_samples, num_burnin_steps, current_state, kernel=adaptive_hmc)

    hmc_samples = run_chain_fn()
    # Map the unconstrained samples back to the constrained parameter space.
    parameter_samples = hmc_helper.convert_to_constrained_values(hmc_samples)
- Parameters:
  - target_log_prob_fn (Callable[[], Tensor])
  - parameters (Sequence[Parameter])
- convert_to_constrained_values(hmc_samples)
Converts list of unconstrained values in hmc_samples to constrained versions. Each value in the list corresponds to an entry in parameters passed to the constructor; for parameters that have a transform, the constrained representation is returned.
- Parameters:
  - hmc_samples (Sequence[Tensor])
- Return type: Sequence[Tensor]
- property current_state: Sequence[Variable]
Return the current state of the unconstrained variables, used in HMC.
- Return type: Sequence[Variable]
- property target_log_prob_fn: Callable[[...], Tuple[Tensor, Callable[[...], Tuple[Tensor, Sequence[None]]]]]
The target log probability, adjusted to allow for optimisation to occur on the tracked unconstrained underlying variables.
- Return type: Callable[..., Tuple[Tensor, Callable[..., Tuple[Tensor, Sequence[None]]]]]
gpflow.optimizers.Scipy
- class gpflow.optimizers.Scipy
Bases: object
- minimize(closure, variables, method='L-BFGS-B', step_callback=None, compile=True, allow_unused_variables=False, **scipy_kwargs)
Minimize closure.
Minimize is a wrapper around the scipy.optimize.minimize function handling the packing and unpacking of a list of shaped variables on the TensorFlow side vs. the flat numpy array required on the Scipy side.
- Parameters:
  - closure (Callable[[], Tensor]) – A closure that re-evaluates the model, returning the loss to be minimized.
  - variables (Sequence[Variable]) – The list (tuple) of variables to be optimized (typically model.trainable_variables).
  - method (Optional[str]) – The type of solver to use in SciPy. Defaults to "L-BFGS-B".
  - step_callback (Union[Callable[[int, Sequence[Variable], Sequence[Tensor]], None], Monitor, None]) – If not None, a callable that gets called once after each optimisation step. The callable is passed the arguments step, variables, and values. step is the optimisation step counter, variables is the list of trainable variables as above, and values is the corresponding list of tensors of matching shape that contains their value at this optimisation step.
  - compile (bool) – If True, wraps the evaluation function (the passed closure as well as its gradient computation) inside a tf.function(), which will improve optimization speed in most cases.
  - allow_unused_variables (bool) – Whether to allow variables that are not actually used in the closure.
  - scipy_kwargs (Any) – Arguments passed through to scipy.optimize.minimize. Note that Scipy's minimize() takes a callback argument, but you probably want to use our wrapper and pass in step_callback.
- Return type: OptimizeResult
- Returns: The optimization result represented as a Scipy OptimizeResult object. See the Scipy documentation for a description of its attributes.
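The packing and unpacking mentioned above can be illustrated without TensorFlow. The sketch below is not GPflow's implementation: the helper names pack and unpack, the quadratic toy loss and the target arrays are all hypothetical, chosen only to show how a list of shaped arrays is flattened into the single vector that scipy.optimize.minimize expects and restored afterwards.

```python
import numpy as np
from scipy.optimize import minimize


def pack(arrays):
    """Flatten a list of shaped arrays into one 1-D vector (hypothetical helper)."""
    return np.concatenate([np.ravel(a) for a in arrays])


def unpack(flat, shapes):
    """Split a 1-D vector back into arrays with the given shapes (hypothetical helper)."""
    arrays, offset = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        arrays.append(flat[offset:offset + size].reshape(shape))
        offset += size
    return arrays


# Two "variables" of different shapes, standing in for model variables.
initial = [np.zeros((2, 2)), np.zeros(2)]
shapes = [a.shape for a in initial]

# Toy targets that define a simple quadratic loss.
target_w = np.array([[1.0, -2.0], [0.5, 3.0]])
target_b = np.array([0.25, -0.75])


def loss_and_grad(flat):
    """Evaluate the loss and its gradient on the flat vector that SciPy passes in."""
    w, b = unpack(flat, shapes)
    loss = np.sum((w - target_w) ** 2) + np.sum((b - target_b) ** 2)
    grad = pack([2.0 * (w - target_w), 2.0 * (b - target_b)])
    return loss, grad


result = minimize(loss_and_grad, pack(initial), jac=True, method="L-BFGS-B")
w_opt, b_opt = unpack(result.x, shapes)
```

The returned result is a SciPy OptimizeResult, so fields such as result.x, result.fun and result.success are available for inspection.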
gpflow.optimizers.XiNat
- class gpflow.optimizers.XiNat
Bases: XiTransform
This is the default transform. Using the natural parameters directly saves the forward-mode gradient, and also gives the analytic optimal solution for gamma=1 in the case of a Gaussian likelihood.
- static meanvarsqrt_to_xi(mean, varsqrt)
Transforms the parameters mean and varsqrt to xi1, xi2.
- Parameters:
  - mean (Tensor) – the mean parameter. mean has shape [N, D].
  - varsqrt (Tensor) – the varsqrt parameter. varsqrt has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2), the xi parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].
- static naturals_to_xi(nat1, nat2)
Applies the transform so that nat1, nat2 is mapped to xi1, xi2.
- Parameters:
  - nat1 (Tensor) – the θ₁ parameter. nat1 has shape [N, D].
  - nat2 (Tensor) – the θ₂ parameter. nat2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2). return[0] has shape [N, D]; return[1] has shape [D, N, N].
- static xi_to_meanvarsqrt(xi1, xi2)
Transforms the parameters xi1, xi2 to mean, varsqrt.
- Parameters:
  - xi1 (Tensor) – the ξ₁ parameter. xi1 has shape [N, D].
  - xi2 (Tensor) – the ξ₂ parameter. xi2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (mean, varsqrt), the meanvarsqrt parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].
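Since XiNat works with the natural parameters directly, it may help to see what the mapping between (mean, varsqrt) and the natural parameters looks like. The NumPy sketch below uses the textbook definition for a Gaussian, θ₁ = Σ⁻¹μ and θ₂ = −½Σ⁻¹ with Σ = varsqrt varsqrtᵀ, applied independently to each of the D output dimensions. The function names are hypothetical and this is not GPflow's code; it only follows the shape conventions documented above.

```python
import numpy as np


def meanvarsqrt_to_natural(mean, varsqrt):
    """Map mean [N, D] and Cholesky factor [D, N, N] to natural parameters (sketch)."""
    cov = varsqrt @ np.swapaxes(varsqrt, -1, -2)       # [D, N, N]
    precision = np.linalg.inv(cov)                     # [D, N, N]
    nat1 = np.einsum("dnm,md->nd", precision, mean)    # theta_1 = precision @ mean, [N, D]
    nat2 = -0.5 * precision                            # theta_2, [D, N, N]
    return nat1, nat2


def natural_to_meanvarsqrt(nat1, nat2):
    """Inverse mapping: recover mean [N, D] and lower-triangular factor [D, N, N]."""
    cov = np.linalg.inv(-2.0 * nat2)                   # [D, N, N]
    mean = np.einsum("dnm,md->nd", cov, nat1)          # [N, D]
    varsqrt = np.linalg.cholesky(cov)                  # [D, N, N]
    return mean, varsqrt


# Round trip on random inputs with N = 4 and D = 2.
rng = np.random.default_rng(0)
N, D = 4, 2
mean = rng.standard_normal((N, D))
varsqrt = np.tril(rng.standard_normal((D, N, N)))
idx = np.arange(N)
# A positive diagonal makes the Cholesky factor unique, so the round trip is exact.
varsqrt[:, idx, idx] = np.abs(varsqrt[:, idx, idx]) + 1.0

nat1, nat2 = meanvarsqrt_to_natural(mean, varsqrt)
mean_rt, varsqrt_rt = natural_to_meanvarsqrt(nat1, nat2)
```

The round trip recovers the inputs up to floating-point error, which is the property any XiTransform pair of meanvarsqrt_to_xi and xi_to_meanvarsqrt needs to satisfy.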
gpflow.optimizers.XiSqrtMeanVar
- class gpflow.optimizers.XiSqrtMeanVar
Bases: XiTransform
This transformation performs natural gradient descent directly on the model parameters, which saves the conversion to and from ξ.
- static meanvarsqrt_to_xi(mean, varsqrt)
Transforms the parameters mean and varsqrt to xi1, xi2.
- Parameters:
  - mean (Tensor) – the mean parameter. mean has shape [N, D].
  - varsqrt (Tensor) – the varsqrt parameter. varsqrt has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2), the xi parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].
- static naturals_to_xi(nat1, nat2)
Applies the transform so that nat1, nat2 is mapped to xi1, xi2.
- Parameters:
  - nat1 (Tensor) – the θ₁ parameter. nat1 has shape [N, D].
  - nat2 (Tensor) – the θ₂ parameter. nat2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2). return[0] has shape [N, D]; return[1] has shape [D, N, N].
- static xi_to_meanvarsqrt(xi1, xi2)
Transforms the parameters xi1, xi2 to mean, varsqrt.
- Parameters:
  - xi1 (Tensor) – the ξ₁ parameter. xi1 has shape [N, D].
  - xi2 (Tensor) – the ξ₂ parameter. xi2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (mean, varsqrt), the meanvarsqrt parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].
gpflow.optimizers.XiTransform
- class gpflow.optimizers.XiTransform
Bases: object
XiTransform is the base class that declares the three transformations necessary for the natural gradient calculation with respect to any parameterization.
- abstract static meanvarsqrt_to_xi(mean, varsqrt)
Transforms the parameters mean and varsqrt to xi1, xi2.
- Parameters:
  - mean (Tensor) – the mean parameter. mean has shape [N, D].
  - varsqrt (Tensor) – the varsqrt parameter. varsqrt has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2), the xi parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].
- abstract static naturals_to_xi(nat1, nat2)
Applies the transform so that nat1, nat2 is mapped to xi1, xi2.
- Parameters:
  - nat1 (Tensor) – the θ₁ parameter. nat1 has shape [N, D].
  - nat2 (Tensor) – the θ₂ parameter. nat2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (xi1, xi2). return[0] has shape [N, D]; return[1] has shape [D, N, N].
- abstract static xi_to_meanvarsqrt(xi1, xi2)
Transforms the parameters xi1, xi2 to mean, varsqrt.
- Parameters:
  - xi1 (Tensor) – the ξ₁ parameter. xi1 has shape [N, D].
  - xi2 (Tensor) – the ξ₂ parameter. xi2 has shape [D, N, N].
- Return type: Tuple[Tensor, Tensor]
- Returns: tuple (mean, varsqrt), the meanvarsqrt parameters. return[0] has shape [N, D]; return[1] has shape [D, N, N].