
Current SOTA models are really bad at reverse engineering (RE), and I don't really expect this to improve through training on open data.

There are just not a lot of high-quality examples on the internet, and more importantly, the people writing the code you'd want to reverse are actively doing their best to make that harder.



It is quite easy to produce high-quality synthetic data for training on reverse engineering. Just take any open source project, compile it, and ask the model to produce the source code (or something equivalent) given the binary.


Right. You could even run it through code obfuscators and such to create more diverse, realistic examples.
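
For what it's worth, here is a rough sketch of what the pipeline described above might look like. Everything in it is an assumption for illustration: the names `CompileJob`, `plan_jobs`, and `build_pair` are made up, it assumes single-file C sources and a `gcc` on the PATH, and it uses different optimization levels plus symbol stripping as a cheap stand-in for real obfuscators.

```python
from dataclasses import dataclass
from pathlib import Path
import subprocess


@dataclass(frozen=True)
class CompileJob:
    """One (source, binary) training pair to be built. Hypothetical type."""
    source: Path
    binary: Path
    command: tuple


def plan_jobs(sources, out_dir, opt_levels=("-O0", "-O2"), strip=True, compiler="gcc"):
    """Plan one compile per source file per optimization level.

    Nothing is executed here; this only builds the command lines.
    """
    out_dir = Path(out_dir)
    jobs = []
    for src in map(Path, sources):
        for opt in opt_levels:
            binary = out_dir / f"{src.stem}{opt}.bin"
            cmd = [compiler, opt, str(src), "-o", str(binary)]
            if strip:
                # -s asks the linker to drop symbol information,
                # so function and variable names do not leak into the binary.
                cmd.insert(1, "-s")
            jobs.append(CompileJob(src, binary, tuple(cmd)))
    return jobs


def build_pair(job):
    """Run the compiler and return (binary bytes, source text) for one job."""
    job.binary.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(job.command, check=True)
    return job.binary.read_bytes(), job.source.read_text()
```

Usage would be something like `pairs = [build_pair(j) for j in plan_jobs(Path("project").rglob("*.c"), "out")]`, with the binary as the model input and the source as the target. A real project would need its own build system instead of one compiler call per file, and an actual obfuscator pass would slot in as an extra step before or after compilation.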



