Commit Graph

19 Commits

Author SHA1 Message Date
pevnak
3510c837a8 zeros replaced by zero 2018-08-03 15:14:25 +01:00
Jarvist Moore Frost
344a750770 Merge branch 'master' of github.com:jarvist/Flux.jl into HEAD 2018-07-03 11:15:43 +01:00
Jarvist Moore Frost
aee4a83c55 Add ADAMW weight-decay.
See http://www.fast.ai/2018/07/02/adam-weight-decay/ and the original
paper https://arxiv.org/abs/1711.05101 for context.

I don't know what I'm doing, and this is quite possibly wrong, but on
a simple Char-RNN I have lying around on my hard disk this seems to
improve the rate of learning consistently across different hyperparameters
vs. standard ADAM with the same decay constant.
2018-07-03 11:11:32 +01:00
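The decoupled weight decay described in the linked paper applies the decay term to the weights directly, outside the Adam moment estimates, rather than folding it into the gradient. A minimal sketch of one such step, with an assumed signature (`adamw_step!`, `wd`) rather than the commit's actual code:

```julia
# Sketch of one decoupled-weight-decay (ADAMW) step, per arXiv:1711.05101.
# Names and signature are illustrative, not Flux's API.
function adamw_step!(p, g, m, v, t; η = 0.001, β1 = 0.9, β2 = 0.999,
                     ϵ = 1e-8, wd = 0.01)
    @. m = β1 * m + (1 - β1) * g     # first-moment estimate
    @. v = β2 * v + (1 - β2) * g^2   # second-moment estimate
    m̂ = m ./ (1 - β1^t)              # bias correction
    v̂ = v ./ (1 - β2^t)
    # The decay acts on the parameters directly; this decoupling is what
    # distinguishes ADAMW from plain ADAM with an L2 penalty.
    @. p -= η * (m̂ / (√v̂ + ϵ) + wd * p)
    return p
end
```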
Tejan Karmali
4a24b69976 Merge branch 'master' into nadam-opt 2018-06-08 16:54:41 +05:30
Sujeet Akula
8c042bd522 element-wise max() 2018-04-26 21:12:31 +10:00
Sujeet Akula
b6508e2416 add adamax 2018-04-26 17:37:24 +10:00
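AdaMax, proposed alongside Adam in arXiv:1412.6980, replaces the second-moment estimate with an infinity-norm running maximum. A hypothetical sketch of the update, not the code added in this commit:

```julia
# Sketch of the AdaMax update (Kingma & Ba, arXiv:1412.6980).
# Hypothetical signature; the ϵ guard against division by zero is an
# implementation convenience, not part of the paper's algorithm.
function adamax_step!(p, g, m, u, t; η = 0.002, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    @. m = β1 * m + (1 - β1) * g   # first-moment estimate, as in Adam
    @. u = max(β2 * u, abs(g))     # infinity-norm running maximum
    @. p -= (η / (1 - β1^t)) * m / (u + ϵ)
    return p
end
```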
tejank10
65847bb745 moved epsilon into sqrt 2018-04-04 15:25:20 +05:30
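The placement matters when the accumulated second moment is near zero: `√acc + ϵ` and `√(acc + ϵ)` differ by orders of magnitude there. A standalone illustration with assumed values, not the commit's diff:

```julia
ϵ, acc = 1e-8, 0.0
√acc + ϵ    # 1.0e-8: a tiny denominator, so early steps can blow up
√(acc + ϵ)  # 1.0e-4: a far larger denominator, hence tamer updates
```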
tejank10
3ead662987 Update rule fixed 2018-04-04 15:18:44 +05:30
tejank10
ea9b5471fa NADAM optimizer 2018-04-03 01:27:22 +05:30
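NADAM combines Adam with Nesterov momentum (Dozat, 2016). One common simplified formulation, sketched with an assumed signature rather than this commit's code:

```julia
# Simplified NADAM step: an Adam update whose momentum term takes a
# Nesterov-style look-ahead, mixing the momentum buffer with the current
# gradient. Hypothetical signature; formulations vary in the literature.
function nadam_step!(p, g, m, v, t; η = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    @. m = β1 * m + (1 - β1) * g
    @. v = β2 * v + (1 - β2) * g^2
    v̂ = v ./ (1 - β2^t)
    @. p -= η * ((β1 * m + (1 - β1) * g) / (1 - β1^t)) / (√v̂ + ϵ)
    return p
end
```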
Mike J Innes
24a6569589 Merge branch 'master' into amsgrad 2017-12-08 18:20:53 +00:00
baggepinnen
36001d085a Implement AMSGrad optimiser 2017-12-04 09:17:05 +01:00
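AMSGrad ("On the Convergence of Adam and Beyond", Reddi et al.) divides by the running maximum of the second-moment estimate, so the effective step size can never grow. A hypothetical sketch, not the commit's code:

```julia
# Sketch of the AMSGrad update. As in the paper's Algorithm 2, no bias
# correction is applied. Hypothetical signature.
function amsgrad_step!(p, g, m, v, v̂; η = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    @. m = β1 * m + (1 - β1) * g
    @. v = β2 * v + (1 - β2) * g^2
    @. v̂ = max(v̂, v)             # running max: step sizes can only shrink
    @. p -= η * m / (√v̂ + ϵ)
    return p
end
```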
CarloLucibello
13b934c250 improve optimizers 2017-11-24 12:12:20 +01:00
Mike J Innes
979949d01a style 2017-11-21 15:25:09 +01:00
Fredrik Bagge Carlson
8991ce028c Fix bug in rmsprop and adadelta
`@. p.Δ = η * p.Δ / √acc` gives the intended update, whereas `@. p.Δ /= √acc*η` parses as `@. p.Δ /= (√acc*η)`, putting η in the denominator, so the step size was de facto `1/η`.
2017-11-14 17:32:16 +01:00
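A standalone illustration of the precedence point above, with assumed values rather than Flux code:

```julia
η, acc, Δ = 0.1, 4.0, 1.0

Δ / (√acc * η)  # 5.0:  what `Δ /= √acc*η` computes, η divides the step
η * Δ / √acc    # 0.05: the intended update, η scales the step
```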
Mike J Innes
f2052739c1 tweaks 2017-09-12 14:11:03 +01:00
Mike J Innes
387686eb41 optimisers rework 2017-09-01 17:06:51 -04:00
Mike J Innes
b95dae1868 opt refactor 2017-08-31 14:55:23 -04:00
Mike J Innes
1526b13691 basic training loop 2017-08-24 11:42:29 +01:00
Mike J Innes
bafecfede1 sgd 2017-08-22 22:25:18 +01:00