> I don't think your examples are good, though. Max pooling reduces noise. ReLU learns faster than sigmoid or tanh.

That's not theory; that's just an observation of the results. Why should we expect it to work that way?


