Mike J Innes
0f2975d905
update -> apply
2019-01-28 13:59:23 +00:00
Mike J Innes
f6397e7358
Merge pull request #517 from FluxML/fix_adamw
Fix decay argument in ADAMW
2019-01-18 10:06:23 +00:00
Mike J Innes
f0d5624ed2
Merge pull request #493 from dhairyagandhi96/master
[WIP] New Optimiser Docs
2019-01-10 11:10:38 +00:00
Dhairya Gandhi
e48268ff06
fix argument name in ADAMW
2018-12-12 16:47:42 +05:30
Dhairya Gandhi
1ea8c5a293
[WIP] add docstrings and doc improvements
2018-11-12 19:17:10 +05:30
Joel Mason
29832aca92
Move some epsilons about
2018-11-02 22:59:04 +11:00
Mike J Innes
bffaceee02
tweaks
2018-10-31 14:58:55 +00:00
Dhairya Gandhi
bebf4eb95f
fixed ExpDecay update! rule
2018-10-29 23:12:24 +05:30
Dhairya Gandhi
32ce2d78b8
fixed ExpDecay test
2018-10-27 19:53:06 +05:30
Dhairya Gandhi
815e8c206d
decay fixes
2018-10-27 19:26:42 +05:30
Dhairya Gandhi
1f0f2a5ac2
fixed DescentWeightDecay parameters
2018-10-11 10:21:29 +05:30
Dhairya Gandhi
d8394298bb
fix merge conflicts
2018-10-11 10:15:59 +05:30
Dhairya Gandhi
fe8c147f72
fixed weight decay definition
2018-10-11 10:07:16 +05:30
Mike Innes
bfe85e65f1
compose tweaks
2018-10-05 13:52:26 +01:00
Mike Innes
0f2019eba5
compose tweaks
2018-10-05 12:57:03 +01:00
Mike Innes
9bc9771a8d
tweaks
2018-10-05 12:43:03 +01:00
Dhairya Gandhi
b661db3797
added deprecations and compose
2018-10-01 05:30:53 +05:30
Dhairya Gandhi
6665189ff1
added remaining optimizers and tests
2018-09-16 17:34:51 +05:30
Dhairya Gandhi
63bc71698b
updated tests
2018-09-14 20:32:56 +05:30
Dhairya Gandhi
d933f2079b
pulled tracker from upstream
2018-09-11 18:30:24 +05:30
Mike J Innes
a2d2d068aa
initial sketch
2018-08-28 17:55:59 +05:30
pevnak
3510c837a8
zeros replaced by zero
2018-08-03 15:14:25 +01:00
Jarvist Moore Frost
344a750770
Merge branch 'master' of github.com:jarvist/Flux.jl into HEAD
2018-07-03 11:15:43 +01:00
Jarvist Moore Frost
aee4a83c55
Add ADAMW weight-decay.
See http://www.fast.ai/2018/07/02/adam-weight-decay/ and the original
paper https://arxiv.org/abs/1711.05101.pdf for context.
I don't know what I'm doing, and this is quite possibly wrong, but on a
simple Char-RNN I have lying around on my hard disk this seems to improve
the rate of learning consistently across different hyperparameters,
compared to standard ADAM with the same decay constant.
2018-07-03 11:11:32 +01:00
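For context on the decoupled weight decay introduced in the commit above, here is a minimal, self-contained sketch in plain Julia: the decay is applied directly to the parameters after the Adam step, rather than being folded into the gradient as L2 regularisation. This is not the Flux `ADAMW` implementation; the function name `adamw_step!`, the explicit state arguments, and the default hyperparameters are illustrative assumptions.

```julia
# Sketch of an Adam step with decoupled weight decay (AdamW), assuming plain Julia arrays.
# Not the Flux implementation; names and defaults are illustrative only.
function adamw_step!(p, g, m, v, t; η = 1e-3, β1 = 0.9, β2 = 0.999, ϵ = 1e-8, wd = 1e-2)
    @. m = β1 * m + (1 - β1) * g      # first-moment (mean) estimate
    @. v = β2 * v + (1 - β2) * g^2    # second-moment (uncentred variance) estimate
    m̂ = m ./ (1 - β1^t)               # bias correction
    v̂ = v ./ (1 - β2^t)
    @. p -= η * m̂ / (√v̂ + ϵ)          # standard Adam update
    @. p -= η * wd * p                # decoupled decay, applied to the weights themselves
    return p
end

# Usage: keep the moment buffers and the step counter across iterations.
p, g = randn(5), randn(5)
m, v = zero(p), zero(p)
for t in 1:100
    adamw_step!(p, g, m, v, t)
end
```

The point of the decoupled form is that the decay term never passes through the adaptive denominator `√v̂ + ϵ`, so it shrinks all weights at the same rate regardless of their gradient history.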
Tejan Karmali
4a24b69976
Merge branch 'master' into nadam-opt
2018-06-08 16:54:41 +05:30
Sujeet Akula
8c042bd522
element-wise max()
2018-04-26 21:12:31 +10:00
Sujeet Akula
b6508e2416
add adamax
2018-04-26 17:37:24 +10:00
tejank10
65847bb745
moved epsilon into sqrt
2018-04-04 15:25:20 +05:30
tejank10
3ead662987
Update rule fixed
2018-04-04 15:18:44 +05:30
tejank10
ea9b5471fa
NADAM optimizer
2018-04-03 01:27:22 +05:30
Mike J Innes
24a6569589
Merge branch 'master' into amsgrad
2017-12-08 18:20:53 +00:00
baggepinnen
36001d085a
Implement AMSGrad optimiser
2017-12-04 09:17:05 +01:00
CarloLucibello
13b934c250
improve optimizers
2017-11-24 12:12:20 +01:00
Mike J Innes
979949d01a
style
2017-11-21 15:25:09 +01:00
Fredrik Bagge Carlson
8991ce028c
Fix bug in rmsprop and adadelta
`@. p.Δ = η * p.Δ / √acc` parses correctly, while `@. p.Δ /= √acc*η` parses as `@. p.Δ /= (√acc*η)`, so the step size was effectively `1/η` rather than `η` (see the sketch below).
2017-11-14 17:32:16 +01:00
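A scalar sketch of the precedence behaviour described in the commit above (the values are made up for illustration; this is not code from the repository). `x /= a * b` is shorthand for `x = x / (a * b)`, so folding `η` into the divisor inverts it:

```julia
# Illustration of the precedence issue: the update operator divides by the whole RHS.
η, acc = 0.1, 4.0                 # learning rate and accumulated squared gradient (made up)
Δ = 8.0; Δ /= √acc * η            # Δ = Δ / (√acc * η) = 8 / 0.2 = 40.0  → step scaled by 1/η
Δ = 8.0; Δ = η * Δ / √acc         # fixed form: 0.1 * 8 / 2 = 0.4        → step scaled by η
```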
Mike J Innes
f2052739c1
tweaks
2017-09-12 14:11:03 +01:00
Mike J Innes
387686eb41
optimisers rework
2017-09-01 17:06:51 -04:00
Mike J Innes
b95dae1868
opt refactor
2017-08-31 14:55:23 -04:00
Mike J Innes
1526b13691
basic training loop
2017-08-24 11:42:29 +01:00
Mike J Innes
bafecfede1
sgd
2017-08-22 22:25:18 +01:00