As a scientist, there is a ton of boilerplate code that is just slightly different enough for every data set that I have to write it myself each time, so coding agents solve a lot of that. At least until you're halfway through something and realize Claude didn't listen when you wrote, five times, in capital letters: NEVER MAKE UP DATA. YOU ARE NOT ALLOWED TO USE np.random IN PLACE OF ACTUAL DATA. It's all kind of wild, because when it works it's great, and when it doesn't there's no failure state. So if I put on my LLM marketing hat, I guess the solution is an agent that comes behind the coding agent and checks whether it did its job. We can call it the Performance Improvement Plan Agent (PIPA). PIPAs allow real-time monitoring of coding agents to make sure they're working and not slacking off, giving HR departments and management teams full control over their AI employees. Together we will move into the future.
Quick, don't think of an elephant in a pink tutu! You did, didn't you?
As a scientist, you should know that LLMs are pretty bad at understanding negatives because they work on tokens, not words.
"NO ELEPHANTS" roughly becomes NO + ELEPHANT. Now "elephant" is in the context, and the model is going to be "thinking" about it and steering everything towards it.