This really depends on the cost/benefit tradeoff for the entity in question. If using ChatGPT makes you X% more productive (ships faster, lowers labor costs, etc.), but comes with a Y% risk of data leakage, is that worth it in expectation or not? I would argue that there definitely exist companies for which it's worth the tradeoff.
To anyone pasting prompts along the lines of 'convert this SQL table schema into a [pydantic model|JSON Schema]' together with the actual schema text: ask it instead to write you a [python|go|bash|...] function that reads in a text file and 'converts an SQL table schema to output x' or whatever, then run that function locally on your real schema. Related/not-related: using it as a pandas docs replacement is another great and safe use case.
Point is, for a meaningful subset of high-value use-cases you don't need to move your important private stuff across any trust boundaries, and it still can be pretty helpful...so just calling that out in case that's useful to anyone...
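To make that concrete, here is a rough sketch of the kind of function you might ask for and then run locally, so the real schema never leaves your machine. Everything in it is an assumption for illustration: the names `sql_to_json_schema` and `convert_file`, the tiny type mapping, and the restriction to a single simple `CREATE TABLE` statement with unquoted identifiers. It is not a real SQL parser.

```python
import json
import re

# Hypothetical, deliberately small mapping from SQL base types to JSON Schema types.
SQL_TO_JSON_TYPE = {
    "int": "integer", "integer": "integer", "bigint": "integer", "smallint": "integer",
    "real": "number", "float": "number", "double": "number",
    "numeric": "number", "decimal": "number",
    "bool": "boolean", "boolean": "boolean",
    "text": "string", "varchar": "string", "char": "string",
    "date": "string", "timestamp": "string",
}

# Lines starting with these words are table-level constraints, not columns.
CONSTRAINT_KEYWORDS = {"primary", "foreign", "unique", "constraint", "check"}


def split_columns(body):
    """Split on commas that are not nested in parentheses, e.g. NUMERIC(10, 2)."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def sql_to_json_schema(sql):
    """Convert one simple CREATE TABLE statement into a JSON Schema dict."""
    match = re.search(
        r"create\s+table\s+(?:if\s+not\s+exists\s+)?([\w.]+)\s*\((.*)\)",
        sql,
        re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError("no CREATE TABLE statement found")
    table, body = match.group(1), match.group(2)

    properties, required = {}, []
    for column in split_columns(body):
        tokens = column.split()
        if len(tokens) < 2 or tokens[0].lower() in CONSTRAINT_KEYWORDS:
            continue
        name = tokens[0]
        base_type = re.match(r"[a-z]+", tokens[1].lower())
        key = base_type.group(0) if base_type else ""
        # Unknown SQL types fall back to "string" in this sketch.
        properties[name] = {"type": SQL_TO_JSON_TYPE.get(key, "string")}
        if re.search(r"not\s+null|primary\s+key", column, re.IGNORECASE):
            required.append(name)

    return {"title": table, "type": "object",
            "properties": properties, "required": required}


def convert_file(path):
    """Read a schema from a local text file and return JSON Schema as text."""
    with open(path, encoding="utf-8") as handle:
        return json.dumps(sql_to_json_schema(handle.read()), indent=2)
```

You would call something like `print(convert_file("schema.sql"))` on your own machine (the file name is made up). The only thing that crossed the trust boundary was the generic request for the function.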
At first I was impressed by how easy it was to reach a data model with chatgpt, then I laughed as I tried to tweak it and use it. I realized it didn't really have any model concepts and was just using its various KB.
I am unsure whether the so-called AI can think in models. So far it seems not, but it is still an impressive assistant if you keep its limitations in mind.
Another point where it falls short is logic. My daughter has a lot of fun with the book "What Is the Name of This Book?", but she was struggling with the "map of Baal" explanation: her answer was one map, the book gave another, and I had a third because I interpreted a proposition differently. I never got an answer from ChatGPT whose reasoning was free of contradictions. It also turned out that the book had been mistranslated into French, so one of its propositions was changed (C: both A and B were knaves) but the answer was not.
> At first I was impressed by how easy it was to reach a data model with chatgpt, then I laughed as I tried to tweak it and use it. I realized it didn't really have any model concepts and was just using its various KB.
> I am unsure whether the so-called AI can think in models. So far it seems not, but it is still an impressive assistant if you keep its limitations in mind.
I don't know. I'm using it for exactly that ("here's a problem, come up with a data model") and it gives a great starting point.[0]
Not perfect, but after that it's easy to tweak it the old-fashioned way.
I find its data modelling capabilities (in the domain I'm using it for - API services) to be roughly on par with a mid-level developer (for a handwavy definition of "mid-level").
We have data standards and agreements with those companies, we pay them to have expectations. Even then, we're strict about what touches vendor servers and it's audited and monitored. Accounts are managed by us and tied into onboarding and offboarding. If they have a security incident, they notify, there's response and remediation.
ChatGPT seems to be used more like a fast stackoverflow, except people aren't thinking of it like a forum where others will see their question so they aren't as cautious. We're just waiting for some company's data to show up remixed into an answer for someone else and then plastered all over the internet for the infosec lulz of the week.
> We have data standards and agreements with those companies, we pay them to have expectations. Even then, we're strict about what touches vendor servers and it's audited and monitored. Accounts are managed by us and tied into onboarding and offboarding.
For every company like yours there are hundreds that don't. People use free Gmail addresses for sensitive company stuff, paste random things into random pastebins, put their private keys in public repos, etc.
Yes, data leaks from OpenAI are bound to happen (again), and they should beef up their security practices.
But thinking people are using only ChatGPT in an insecure way vastly overestimates their security practices elsewhere.
The solution is education, not avoiding new tools.
Doesn't OpenAI explicitly say that your Q/A on the free ChatGPT are stored and sent to human reviewers to be put in their RL database? Now of course we can't be sure what Google, AWS etc. do with the data on disks there, but it would be a pretty big scandal if some whistleblower eventually came out and said that Google employees sit and laugh at private bucket contents on GCP or private Google Docs. So there's a difference in stated intention at least.
Who in their right mind is using free ChatGPT through that shitty no-good web interface of theirs, which can barely handle two queries and replies before grinding to a halt? Surely everyone is using the pay-as-you-go API keys and any one of the alternative frontends or integrations?
And, IIRC, pay-as-you-go API requests are explicitly not used for training data. I'm sad GPT-4 isn't there yet - except for those who won the waitlist lottery.
It's really funny to see these types of comments. I would assume a vast majority of users are using the Web interface, particularly in a corporate context where an account for the API could take ages or not be accepted.
If people were smart and performed according to best practices, articles like this one would not be necessary.
I mean, if you're using a free web interface in a corporate context, you might just as well use a paid API with your personal account - either way, you're using it of your own volition, and not as approved by your employer. And getting API keys to a ChatGPT equivalent (i.e. GPT-3.5) takes... a minute, maybe less.
I am honestly confused how people can use this thing with the interface OpenAI runs. The app has been near-unusable for me, for months, on every device I tried it on.
> and any one of the alternative frontends or integrations?
And what sort of understanding do you have with the alternative frontends/integrations about how they handle your API keys and data? This might be a better solution for a variety of reasons but it doesn't automatically mean your data is being handled any better or worse than by openai.com
I wonder what the distribution of tokens / sec at OpenAI is between the free ChatGPT, paid ChatGPT, and APIs. I’d have to think the free interface is getting slammed. Quite the scaling project, and still nowhere near peaking.
To quote a children's TV show: "Which ones of these things are not like the other ones?"
Some of those are document tools working on language / knowledge. Others are infrastructure, working on ... whatever your infra does, and your infra manages your data (knowledge).
If you read their data policies, you'll find they are not the same.
To your average user who interfaces with these figurative black boxes with a black box in their hand, how is this particular black box any different than the other black boxes that this user hands their data to every second of every day?
there are plenty of disallowed 'black boxes' within the federal sphere; chatgpt is just yet another.
to take a stab at your question, though : my cell phone doesn't learn to get better by absorbing my telecommunications; it's just used as a means to spy on my personal life by The Powers That Be. The primary purpose of my cell phone is for the conveyance of telecommunications.
chatGPT hoards data for training and self-improvement in its current state. Its whole modus operandi involves the capture of data, rather than the data being used for that tangentially. It could not meaningfully exist without training on something, and at this stage of the game the trend is to self-train with user data.
Until that trend changes people should probably be a bit more suspicious about what kind of stuff gets thrown into the training bin.
Those typically have MSAs with legalese in which the parties stipulate what they will and will not do, often including whether or not the service is zero-knowledge, and often with the option to have your own instance encryption keys.
If people are using the free version of chatGPT then it’s unlikely there is a contract between the companies and more likely just a terms of use applied by chatGPT and ignored by the users.
I simply don't give a crap if my employer loses data. I don't care if my carelessness costs my employer a billion bucks down the line as I won't be working for them next year.
"I do not take any kind of responsibility about what I'm doing, or not doing, or thinking about doing or not doing, or thinking about whether I should be doing or not doing, or thinking about whether I should be thinking about doing or not doing".
As a morally questionable answering robot, however, I must ask: why should everything else be tainted by the machinery, but not evidence like text?
I am treating my employment like a corporation would. Risks I do not pay for and do not benefit from mitigating are waste that could allow me to transfer time back to my own priorities, increasing my personal "profit."
Not who you replied to, but if you agree, even a little, with the phrase, "the social contract between employees & employers is broken in the US"... well it goes both ways.
I use it because it's 10-100x more interesting, fun, and fast as a way to program, instead of me having to personally hand-craft hundreds of lines of boilerplate API interaction code every time I want to get something done.
Besides, it's not like it puts out great code (or even always working code), so I still have to read everything and debug it. And sometimes it writes code that is just fine and fit for purpose but horrendously ugly, so I still have to scrap everything and do it myself.
(And then sometimes I spend 10x as long doing that, because it turns out it's also just plain good fun to grow an aesthetic corner of the code just for the hell of it, too — as long as I don't have to.)
And even after all that extra time is factored back in: it's still way faster and more fun than the before-times. I'm actually enjoying building things again.
Pair-programming with ChatGPT is like having an idiot-savant friend who always surprises you. Doesn’t matter if the code is horrible, amazing, or something in between. It’s always interesting.
And I agree it’s fun. Maybe it’s the simulated social interaction without consequences. I can be completely honest with my robot friend about the shitty or awesome code and no one’s feelings are going to get hurt. ChatGPT will just keep trying to be helpful.
You can be an experienced developer with years of building complex applications behind you and still find ChatGPT useful. I've found it useful for documenting individual methods, explaining my own or others' code, writing unit test methods, or just adding boilerplate stuff, which saves me an hour that I can use elsewhere.
I think many people find ChatGPT useful specifically because they have years of experience building complex applications.
If you know exactly what you want to ask of it, and have the ability to evaluate and verify what it produces, it's incredible what you can get out of it. Sure it's nothing I couldn't have done otherwise... eventually. The productivity it enables is worth every cent.
Easily the best $20 I've spent in ages, they should have run with the initial idea of charging $42.
But holy moly, anyone putting confidential information into it needs to stop.
I’ve been doing this kind of thing pretty regularly for the past few weeks, even though I know how to do any of the tasks in question. It’s usually still faster, even when taking the time to anonymize the details; and I don’t paste anything I wouldn’t put on a public gist (lots of “foo, bar”, etc)
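For what it's worth, the anonymizing step can itself be scripted and run locally. The following is only a sketch of that idea, not a vetted redaction tool: the function names `anonymize` and `restore` and the placeholder list are made up for illustration, and it does plain substring replacement, so it can misfire if a placeholder like "foo" already occurs in your text.

```python
import re

# Hypothetical placeholder names; extend the list as needed.
PLACEHOLDERS = ["foo", "bar", "baz", "qux", "quux"]


def anonymize(text, sensitive_names):
    """Replace each sensitive name with a placeholder and return the mapping."""
    mapping = {}
    for index, name in enumerate(sensitive_names):
        if index < len(PLACEHOLDERS):
            mapping[name] = PLACEHOLDERS[index]
        else:
            mapping[name] = f"name{index}"
    if not mapping:
        return text, mapping
    # Longest names first, so "acme_billing" is replaced before "acme".
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping


def restore(text, mapping):
    """Map placeholders in a reply back to the original names.

    Caveat: plain substring matching, so a reply containing "food" would
    also have its "foo" replaced. Review the result by hand.
    """
    reverse = {placeholder: name for name, placeholder in mapping.items()}
    if not reverse:
        return text
    ordered = sorted(reverse, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(lambda m: reverse[m.group(0)], text)
```

The idea is to paste only the output of `anonymize` into the chat, keep the mapping on your side, and run `restore` on whatever comes back.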
Precisely because I can abstract it is why I use ChatGPT. It can do the boring, tedious, repetitive stuff instead of me and has shown me the joy of using programming to solve ACTUAL problems yet again, instead of having to spend hours on unimportant problems like "how do I do X with library Y".
But that's the API, not the Chat input or Playground.
Companies can use Azure OpenAI Services to get around this -- there's data privacy, encryption, SLAs even. The problem is it's very hard to get access to (right now).
The #1 problem with corporations saying things is that many of the things they say are not regulated or are taken on good faith. What happens when OpenAI is acquired and the rules change? These statements are often entirely worthless.
> If using ChatGPT makes you X% more productive (shipping faster / lowers labor costs / etc), but comes with Y% risk of data leakage
X and Y are not alike, and should not be compared. X is a benefit to you(r employer), whereas Y is a risk to the customer who has entrusted you with their data.
Mate. You aren’t special. It’s the nature of the profession that most of us are in, that we end up dealing with the “sensitive” data that you’re describing, barring most people working in Big Companies with proper internal controls.
Nothing you’ve said negates anything OP said. It’s simply an elaboration wrapped in elitism.
By the way, OpenAI says they won't use data submitted through its API for model training - https://techcrunch.com/2023/03/01/addressing-criticism-opena...