Mike J Innes
0f2975d905
update -> apply
2019-01-28 13:59:23 +00:00
Mike J Innes
f6397e7358
Merge pull request #517 from FluxML/fix_adamw
Fix decay argument in ADAMW
2019-01-18 10:06:23 +00:00
Mike J Innes
f0d5624ed2
Merge pull request #493 from dhairyagandhi96/master
[WIP] New Optimiser Docs
2019-01-10 11:10:38 +00:00
Dhairya Gandhi
e48268ff06
fix argument name in ADAMW
2018-12-12 16:47:42 +05:30
Dhairya Gandhi
1ea8c5a293
[WIP] add docstrings and doc improvements
2018-11-12 19:17:10 +05:30
Joel Mason
29832aca92
Move some epsilons about
2018-11-02 22:59:04 +11:00
Mike J Innes
bffaceee02
tweaks
2018-10-31 14:58:55 +00:00
Dhairya Gandhi
bebf4eb95f
fixed ExpDecay update! rule
2018-10-29 23:12:24 +05:30
Dhairya Gandhi
32ce2d78b8
fixed ExpDecay test
2018-10-27 19:53:06 +05:30
Dhairya Gandhi
815e8c206d
decay fixes
2018-10-27 19:26:42 +05:30
Dhairya Gandhi
1f0f2a5ac2
fixed DescentWeightDecay parameters
2018-10-11 10:21:29 +05:30
Dhairya Gandhi
d8394298bb
fix merge conflicts
2018-10-11 10:15:59 +05:30
Dhairya Gandhi
fe8c147f72
fixed weight decay definition
2018-10-11 10:07:16 +05:30
Mike Innes
bfe85e65f1
compose tweaks
2018-10-05 13:52:26 +01:00
Mike Innes
0f2019eba5
compose tweaks
2018-10-05 12:57:03 +01:00
Mike Innes
9bc9771a8d
tweaks
2018-10-05 12:43:03 +01:00
Dhairya Gandhi
b661db3797
added deprecations and compose
2018-10-01 05:30:53 +05:30
Dhairya Gandhi
6665189ff1
added remaining optimizers and tests
2018-09-16 17:34:51 +05:30
Dhairya Gandhi
63bc71698b
updated tests
2018-09-14 20:32:56 +05:30
Dhairya Gandhi
d933f2079b
pulled tracker from upstream
2018-09-11 18:30:24 +05:30
Mike J Innes
a2d2d068aa
initial sketch
2018-08-28 17:55:59 +05:30
pevnak
3510c837a8
zeros replaced by zero
2018-08-03 15:14:25 +01:00
Jarvist Moore Frost
344a750770
Merge branch 'master' of github.com:jarvist/Flux.jl into HEAD
2018-07-03 11:15:43 +01:00
Jarvist Moore Frost
aee4a83c55
Add ADAMW weight-decay.
See http://www.fast.ai/2018/07/02/adam-weight-decay/ and the original
paper https://arxiv.org/abs/1711.05101.pdf for context.
I don't know what I'm doing, and this is quite possibly wrong, but on a
simple Char-RNN I have lying around on my hard disk this seems to improve
the rate of learning consistently across different hyperparameters,
compared to standard ADAM with the same decay constant.
2018-07-03 11:11:32 +01:00
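For context on the decoupled weight decay introduced in the commit above, here is a minimal, self-contained sketch in plain Julia: the decay is applied directly to the parameters after the Adam step, rather than being folded into the gradient as L2 regularisation. This is not the Flux `ADAMW` implementation; the function name `adamw_step!`, the explicit state arguments, and the default hyperparameters are illustrative assumptions.

```julia
# Sketch of an Adam step with decoupled weight decay (AdamW), assuming plain Julia arrays.
# Not the Flux implementation; names and defaults are illustrative only.
function adamw_step!(p, g, m, v, t; η = 1e-3, β1 = 0.9, β2 = 0.999, ϵ = 1e-8, wd = 1e-2)
    @. m = β1 * m + (1 - β1) * g      # first-moment (mean) estimate
    @. v = β2 * v + (1 - β2) * g^2    # second-moment (uncentred variance) estimate
    m̂ = m ./ (1 - β1^t)               # bias correction
    v̂ = v ./ (1 - β2^t)
    @. p -= η * m̂ / (√v̂ + ϵ)          # standard Adam update
    @. p -= η * wd * p                # decoupled decay, applied to the weights themselves
    return p
end

# Usage: keep the moment buffers and the step counter across iterations.
p, g = randn(5), randn(5)
m, v = zero(p), zero(p)
for t in 1:100
    adamw_step!(p, g, m, v, t)
end
```

The point of the decoupled form is that the decay term never passes through the adaptive denominator `√v̂ + ϵ`, so it shrinks all weights at the same rate regardless of their gradient history.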
Tejan Karmali
4a24b69976
Merge branch 'master' into nadam-opt
2018-06-08 16:54:41 +05:30
Sujeet Akula
8c042bd522
element-wise max()
2018-04-26 21:12:31 +10:00
Sujeet Akula
b6508e2416
add adamax
2018-04-26 17:37:24 +10:00
tejank10
65847bb745
moved epsilon into sqrt
2018-04-04 15:25:20 +05:30
tejank10
3ead662987
Update rule fixed
2018-04-04 15:18:44 +05:30
tejank10
ea9b5471fa
NADAM optimizer
2018-04-03 01:27:22 +05:30
Mike J Innes
24a6569589
Merge branch 'master' into amsgrad
2017-12-08 18:20:53 +00:00
baggepinnen
36001d085a
Implement AMSGrad optimiser
2017-12-04 09:17:05 +01:00
CarloLucibello
13b934c250
improve optimizers
2017-11-24 12:12:20 +01:00
Mike J Innes
979949d01a
style
2017-11-21 15:25:09 +01:00
Fredrik Bagge Carlson
8991ce028c
Fix bug in rmsprop and adadelta
`@. p.Δ = η * p.Δ / √acc` parses correctly, while `@. p.Δ /= √acc*η` parses as `@. p.Δ /= (√acc*η)`, so the step size was effectively `1/η` rather than `η` (see the sketch below).
2017-11-14 17:32:16 +01:00
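A scalar sketch of the precedence behaviour described in the commit above (the values are made up for illustration; this is not code from the repository). `x /= a * b` is shorthand for `x = x / (a * b)`, so folding `η` into the divisor inverts it:

```julia
# Illustration of the precedence issue: the update operator divides by the whole RHS.
η, acc = 0.1, 4.0                 # learning rate and accumulated squared gradient (made up)
Δ = 8.0; Δ /= √acc * η            # Δ = Δ / (√acc * η) = 8 / 0.2 = 40.0  → step scaled by 1/η
Δ = 8.0; Δ = η * Δ / √acc         # fixed form: 0.1 * 8 / 2 = 0.4        → step scaled by η
```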
Mike J Innes
f2052739c1
tweaks
2017-09-12 14:11:03 +01:00
Mike J Innes
387686eb41
optimisers rework
2017-09-01 17:06:51 -04:00
Mike J Innes
b95dae1868
opt refactor
2017-08-31 14:55:23 -04:00
Mike J Innes
1526b13691
basic training loop
2017-08-24 11:42:29 +01:00
Mike J Innes
bafecfede1
sgd
2017-08-22 22:25:18 +01:00