fix a few typos in docstrings

Martijn Visser 2020-03-01 15:07:12 +01:00
parent 77a7606dad
commit 6076847a45
2 changed files with 6 additions and 6 deletions

@@ -62,7 +62,7 @@ ADAMW
## Optimiser Interface
-Flux's optimsers are built around a `struct` that holds all the optimiser parameters along with a definition of how to apply the update rule associated with it. We do this via the `apply!` function which takes the optimiser as the first argument followed by the parameter and its corresponding gradient.
+Flux's optimisers are built around a `struct` that holds all the optimiser parameters along with a definition of how to apply the update rule associated with it. We do this via the `apply!` function which takes the optimiser as the first argument followed by the parameter and its corresponding gradient.
In this manner Flux also allows one to create custom optimisers to be used seamlessly. Let's work this with a simple example.
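As a hedged illustration of the `apply!` interface described above, a minimal custom optimiser could look like the sketch below; the `ScaledDescent` name and its field are invented for this example and are not part of Flux.

```julia
using Flux

# Hypothetical custom optimiser holding its hyperparameter in a struct.
mutable struct ScaledDescent
  eta::Float64   # learning rate
end

# Extend `Flux.Optimise.apply!`: it receives the optimiser, the parameter, and the
# gradient, and returns the update that will be subtracted from the parameter.
function Flux.Optimise.apply!(o::ScaledDescent, x, Δ)
  @. Δ = o.eta * Δ
end

# One manual step on a toy parameter.
w = randn(3, 3)
g = randn(3, 3)
w .-= Flux.Optimise.apply!(ScaledDescent(0.1), w, g)
```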
@@ -100,15 +100,15 @@ Flux internally calls on this function via the `update!` function. It shares the
## Composing Optimisers
-Flux defines a special kind of optimiser called simply as `Optimiser` which takes in a arbitrary optimisers as input. Its behaviour is similar to the usual optimisers, but differs in that it acts by calling the optimisers listed in it sequentially. Each optimiser produces a modified gradient
+Flux defines a special kind of optimiser simply called `Optimiser` which takes in arbitrary optimisers as input. Its behaviour is similar to the usual optimisers, but differs in that it acts by calling the optimisers listed in it sequentially. Each optimiser produces a modified gradient
that will be fed into the next, and the resultant update will be applied to the parameter as usual. A classic use case is where adding decays is desirable. Flux defines some basic decays including `ExpDecay`, `InvDecay` etc.
```julia
opt = Optimiser(ExpDecay(0.001, 0.1, 1000, 1e-4), Descent())
```
-Here we apply exponential decay to the `Descent` optimser. The defaults of `ExpDecay` say that its learning rate will be decayed every 1000 steps.
+Here we apply exponential decay to the `Descent` optimiser. The defaults of `ExpDecay` say that its learning rate will be decayed every 1000 steps.
-It is then applied like any optimser.
+It is then applied like any optimiser.
```julia
w = randn(10, 10)
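# (Illustrative continuation, not part of Flux's documentation; assumes `using Flux`
# and the `opt` and `w` defined above.) The composed optimiser behaves like any
# single optimiser, e.g. with `update!`:
g = randn(10, 10)                    # a stand-in gradient for `w`
Flux.Optimise.update!(opt, w, g)     # applies ExpDecay, then Descent, updating `w` in place
```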

@@ -77,7 +77,7 @@ Gradient descent with learning rate `η` and Nesterov momentum `ρ`.
## Parameters
- Learning Rate (η): Amount by which the gradients are dicsounted berfore updating the weights. Defaults to `0.001`.
-- Nesterov Momentum (ρ): Paramters controlling the amount of nesterov momentum to be applied. Defaults to `0.9`.
+- Nesterov Momentum (ρ): Parameters controlling the amount of nesterov momentum to be applied. Defaults to `0.9`.
## Examples
```julia
@@ -105,7 +105,7 @@ end
"""
RMSProp(η, ρ)
-Implements the RMSProp algortihm. Often a good choice for recurrent networks. Paramters other than learning rate generally don't need tuning.
+Implements the RMSProp algortihm. Often a good choice for recurrent networks. Parameters other than learning rate generally don't need tuning.
## Parameters
- Learning Rate (η): Defaults to `0.001`.
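For reference, a minimal usage sketch of the optimiser documented above, assuming the exported `RMSProp` constructor and the defaults named in the docstring (η = 0.001, ρ = 0.9).

```julia
using Flux

opt = RMSProp(0.001, 0.9)   # explicit values matching the documented defaults

# A single hand-rolled step via `apply!`; `update!`/`train!` call this internally.
w = randn(5, 5)
g = randn(5, 5)
w .-= Flux.Optimise.apply!(opt, w, g)
```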