Notes

Rebel AdamW Optimizer

Hypothesis. Idea: Modify AdamW so that a random percentage of parameters move opposite to what the gradient suggests each step. Call these "rebel" param...
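A rough sketch of what one such step might look like, in plain Python over flat lists of floats. The excerpt above is cut off, so the details here are assumptions rather than the note's actual method: the name rebel_adamw_step and the parameter rebel_frac are hypothetical, rebels are drawn independently per element on every step (so rebel_frac is an expected fraction, not an exact one), the sign flip is applied to the Adam update direction rather than to the raw gradient, and weight decay is left unchanged for rebels.

```python
import math
import random


def rebel_adamw_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                     eps=1e-8, weight_decay=0.01, rebel_frac=0.05, rng=random):
    """Apply one AdamW step in place; a random subset of elements moves the wrong way.

    params, grads, m, v are equal-length lists of floats; t is the 1-based step count.
    rebel_frac is a hypothetical knob: the probability that an element rebels this step.
    """
    for i, g in enumerate(grads):
        # Standard Adam moment estimates with bias correction.
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        m_hat = m[i] / (1 - beta1 ** t)
        v_hat = v[i] / (1 - beta2 ** t)
        step = m_hat / (math.sqrt(v_hat) + eps)

        # Assumption: a rebel flips only the gradient-driven part of the update.
        if rng.random() < rebel_frac:
            step = -step

        # Decoupled weight decay, as in AdamW, applied to rebels and non-rebels alike.
        params[i] -= lr * (step + weight_decay * params[i])
    return params
```

With rebel_frac=0.0 this reduces to an ordinary AdamW step, which makes it easy to compare against a baseline run using the same seed.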

Spatial Weight Prior for Sparsity and other benefits.

Hypothesis. I had this idea and thought I'd play around with a few versions of it. This is a compiled report, put together (with extensive oversight), about all the tests and how t...