I, ChatGPT


  • ♿ (Parody)

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler I can't keep up with all posts in this thread, most are boring as hell. So apologies for that.

    Okay, so if not inherent rights, then what is the reason for treating human learning and machine learning the same?

    Because learning is learning, which is something fundamentally different in nature from copying. What other reason is needed?

    Some line of thought that ends with "and therefore machine learning should be a protected activity just like human learning is". For humans, it's justified by it being an irrevocable human right. What's the justification for machines?

    Of course, there's plenty of information that's deemed acceptable to keep others from learning. The idea that you can invoke the magic "L" word here as if it's carte blanche for any sort of information gathering is ridiculous and we shouldn't let him get away with it.



  • @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler I can't keep up with all posts in this thread, most are boring as hell. So apologies for that.

    Okay, so if not inherent rights, then what is the reason for treating human learning and machine learning the same?

    Because learning is learning, which is something fundamentally different in nature from copying. What other reason is needed?

    Some line of thought that ends with "and therefore machine learning should be a protected activity just like human learning is". For humans, it's justified by it being an irrevocable human right. What's the justification for machines?

    The rule of law: that which is not prohibited is always permitted. And learning is not prohibited, therefore there is no justification nor consent needed.

    Yes, if we ignore the laws that prohibit this particular case of learning, then it's true laws don't prohibit it. Unfortunately, in the real world we can't ignore laws like that (unless we're a billion-dollar corporation).

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    Teachers are allowed to do what they do because they are expressly exempt from the normal copyright law (yes, I know, we're back to copyright, sorry). They are exempt from copyright law on the grounds that it's a human right to learn.

    Where in the world are you getting this from? Please cite the relevant statute.

    AI systems don't have the same inherent right, therefore this exemption doesn't apply, and the normal law takes over, and the normal law says you cannot make copies.

    Where in the world are you getting this from? Please cite the relevant statute.



  • @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.



  • @boomzilla said in I, ChatGPT:

    Of course, there's plenty of information that's deemed acceptable to keep others from learning. The idea that you can invoke the magic "L" word here as if it's carte blanche for any sort of information gathering is ridiculous and we shouldn't let him get away with it.

    There are some very very limited examples that are deemed acceptable to keep others from learning. They are exceptions. They are not and never should be the rule.


  • ♿ (Parody)

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.

    It reminds me of the "corporations aren't people!" brain worm, but now that we've got pretty convincing generative AI, it's possible they're actually right now. Though of course, someone still pays for the AI to run, so there's a person somewhere at the bottom of that stack of turtles.


  • ♿ (Parody)

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    Of course, there's plenty of information that's deemed acceptable to keep others from learning. The idea that you can invoke the magic "L" word here as if it's carte blanche for any sort of information gathering is ridiculous and we shouldn't let him get away with it.

    There are some very very limited examples that are deemed acceptable to keep others from learning. They are exceptions. They are not and never should be the rule.

    But more important to your incorrect arguments, you don't have any inherent right to be taught by someone else. Which is to say that no one else has an obligation to teach you.



  • @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.

    It reminds me of the "corporations aren't people!" brain worm, but now that we've got pretty convincing generative AI, it's possible they're actually right now.

    Once again, I've never argued in favor of AIs being granted human rights. (I actually find the concept horrifying, for reasons not relevant to this discussion.) I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.



  • @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    Of course, there's plenty of information that's deemed acceptable to keep others from learning. The idea that you can invoke the magic "L" word here as if it's carte blanche for any sort of information gathering is ridiculous and we shouldn't let him get away with it.

    There are some very very limited examples that are deemed acceptable to keep others from learning. They are exceptions. They are not and never should be the rule.

    But more important to your incorrect arguments, you don't have any inherent right to be taught by someone else. Which is to say that no one else has an obligation to teach you.

    *sigh* And now you're right back to conflating obligations with prohibitions. Again.


  • Banned

    @Mason_Wheeler said in I, ChatGPT:

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    It's this ephemeral thing again, isn't it? Please confirm because otherwise this whole rant doesn't make a lick of sense to me.

    Teachers are allowed to do what they do because they are expressly exempt from the normal copyright law (yes, I know, we're back to copyright, sorry). They are exempt from copyright law on the grounds that it's a human right to learn.

    Where in the world are you getting this from? Please cite the relevant statute.

    At least that's how it works in Poland. Teachers have blanket immunity from copyright infringement for anything in text, picture, audio and video form (but not software), for distribution among their students for the purpose of learning. I thought that's what you're alluding to with all that talk about learning? That it normally wouldn't be allowed but because it's learning it's allowed?



  • @Mason_Wheeler said in I, ChatGPT:

    I will accept as valid any answer that is not logically equivalent to "throw the entire business model out the window."

    If the entire business model is based on "use other people's intellectual property without their permission," just maybe it should be thrown out.

    Also, the more I read this thread, the further behind I get. Y'all post too much. I'm about to give up and just skip to the end.


  • ♿ (Parody)

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.

    It reminds me of the "corporations aren't people!" brain worm, but now that we've got pretty convincing generative AI, it's possible they're actually right now.

    Once again, I've never argued in favor of AIs being granted human rights. (I actually find the concept horrifying, for reasons not relevant to this discussion.) I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    But for some reason you think they have a right to learn. You need to figure out which stance you're really pushing here.


  • Discourse touched me in a no-no place

    @Mason_Wheeler said in I, ChatGPT:

    And learning is not prohibited, therefore there is no justification nor consent needed.

    You missed a key clause: "learning by humans is not prohibited".

    Also, the acts of the AI are in law the acts of the "legal person" that owns the AI because an AI is just a computer program. If OpenAI were just going round getting their AI models to learn stuff by sucking up the entire content of the internet, they'd still be a bunch of jerks (for ignoring robots.txt and so on), but they might have some protection from the law.

    That isn't all they do.

    They also let paying customers get new works derived from the ingested data by the action of the models — that's totally what their stated business model is! — and that very much includes works that are extremely likely to be found to be clearly derivative works of current copyrighted material, and making derivative works of copyrighted material for money is absolutely something that people have been punished for. "But the computer did it!" is not a defense! "But the AI did it!" cannot be a defense! The AI cannot choose to make such things happen; it's not a legal person! The responsible owner must carry the responsibility. I cannot see any way in law for what OpenAI did to be legal, any more than Dodgy Omar's bootleg Rolling Stones tapes, sold from the back of a van on a sketchy street corner, were legal back in the 1980s.


  • ♿ (Parody)

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    Of course, there's plenty of information that's deemed acceptable to keep others from learning. The idea that you can invoke the magic "L" word here as if it's carte blanche for any sort of information gathering is ridiculous and we shouldn't let him get away with it.

    There are some very very limited examples that are deemed acceptable to keep others from learning. They are exceptions. They are not and never should be the rule.

    But more important to your incorrect arguments, you don't have any inherent right to be taught by someone else. Which is to say that no one else has an obligation to teach you.

    *sigh* And now you're right back to conflating obligations with prohibitions. Again.

    :um-actually: I'm pointing out how you're doing that with your "sabotage" thing. I note you haven't gone back to the link you posted about that which makes my point.


  • Banned

    @Mason_Wheeler said in I, ChatGPT:

    I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    Nobody argues that. I'm just saying that the material they're learning from must be sourced in a legal way, i.e. they must obtain all the right licenses for all the works they're using, for the purposes covering what they're doing. And what they're doing is scraping websites with bots in violation of TOS, saving copies in permanent storage in violation of license agreements, and internally redistributing those copies to other distinct systems within their IT infrastructure in violation of the Copyright Act of 1976 (or more specifically, USC Title 17 as it currently stands).

    That their systems can partially reproduce those works isn't the problem, it's everything they had to do to come to a point where their systems can partially reproduce those works that's the problem.



  • @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    It's this ephemeral thing again, isn't it? Please confirm because otherwise this whole rant doesn't make a lick of sense to me.

    Yes. Ephemeral copies do not count. Period. Absolute. Undermine that rule and our entire technological society falls apart. So please leave it alone!

    Teachers are allowed to do what they do because they are expressly exempt from the normal copyright law (yes, I know, we're back to copyright, sorry). They are exempt from copyright law on the grounds that it's a human right to learn.

    Where in the world are you getting this from? Please cite the relevant statute.

    At least that's how it works in Poland. Teachers have blanket immunity from copyright infringement for anything in text, picture, audio and video form (but not software), for distribution among their students for the purpose of learning. I thought that's what you're alluding to with all that talk about learning? That it normally wouldn't be allowed but because it's learning it's allowed?

    No, that's not what I'm alluding to, because I've never heard of such a rule in the USA. For example, if teachers teach their students out of books, and they make illegal copies of books for their students, they definitely would be liable for that!



  • @Mason_Wheeler said in I, ChatGPT:

    @Arantor said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    The point is that there's something out there that someone thinks is valuable to their business, but then it's not.

    No, the point is that in its natural state, it is valuable, but then someone takes an additional, active step to adulterate it in order to make it poisonous rather than valuable. That is crossing a line into active, malicious sabotage, and no number of weird analogies makes it not crossing that line.

    No, it absolutely isn't sabotage. They've done nothing to the scraper or anything owned by the scraper no matter how intensely you hallucinate otherwise.

    What are you talking about? "Doing something to" the training run by the scraper is the entire reason for Nightshade's existence. It is specifically advertised as existing for that exact purpose: "run your work through this to screw up AIs."

    So maybe don't do that? Like, if someone puts ethanol in gasoline but the engine you're using can't handle that....maybe use something else. The people who put the ethanol in that didn't put their fuel in your tank. You did that to yourself.

    :wat-girl: :wtf_owl: :wat:

    What are you even talking about here and how does that in any way relate to the topic at hand?

    I'm trying to explain to you that it's not the image sharers who are using these images to train the AIs. The trainers can choose to avoid these images and then everyone is happy.

    How? How do you choose to avoid something that is deliberately designed to be undetectable? I will accept as valid any answer that is not logically equivalent to "throw the entire business model out the window."

    You could do something revolutionary like ask the people providing the images. Just because someone came up with a business plan doesn't mean that other people need to support it.

    Stop making sense and being reasonable.

    There is nothing reasonable about requiring consent for learning. That's insane in fact.

    I want to learn calculus. By your logic, I can go photocopy a calculus textbook without the author's or publisher's consent, because "learning". Academic publishers charge ridiculous amounts of money for textbooks, and I'm pretty sure they (and the courts) would strongly disagree with your understanding of the relationship between copyright and "learning".



  • @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.

    It reminds me of the "corporations aren't people!" brain worm, but now that we've got pretty convincing generative AI, it's possible they're actually right now.

    Once again, I've never argued in favor of AIs being granted human rights. (I actually find the concept horrifying, for reasons not relevant to this discussion.) I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    But for some reason you think they have a right to learn.

    No, I never said that. I said no one has a right to prohibit learning, and I said that anything not prohibited is permitted. There is no need for a specific "right to learn" -- for humans or otherwise -- to exist at all.


  • Discourse touched me in a no-no place

    @Gustav said in I, ChatGPT:

    That their systems can partially reproduce those works isn't the problem

    I think I might disagree there. That's a very very big problem. A "get an immense fine and have the execs go to jail and have the corporation forced to be wound up under judicial supervision" scale of problem potentially.


  • Banned

    @dkf would you have the same problem if they used a low-paid human worker instead of ML?


  • ♿ (Parody)

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    As you can see, asking an AI to produce a copy of even one of the most famous works of art of all time does not produce a copy of it. It produces something that is vaguely similar and recognizable as being inspired by the Mona Lisa, but ~~nothing more~~ a derivative work that is a copyright violation if the original is protected by copyright.

    Obviously, the Mona Lisa is in the public domain, so your derivative isn't violating the copyright that doesn't exist. However, if the AI creates a similarly derivative version of an image that is protected, then the author's copyright has been violated.

    Agreed. But please don't use single-fact syllogisms here. "If someone uses a tool to do something bad that can also be done without the tool, then they have done something bad" conveys exactly as much meaning if all mention of the tool is excised, leaving you with a simple tautology. It provides no rational insight whatsoever about the tool itself.

    You might believe that but people often don't realize that.

    I know. People using this argument in this very thread don't realize that. That's why I'm bringing it to their attention, so they'll realize it and abandon the nonsense.

    It reminds me of the "corporations aren't people!" brain worm, but now that we've got pretty convincing generative AI, it's possible they're actually right now.

    Once again, I've never argued in favor of AIs being granted human rights. (I actually find the concept horrifying, for reasons not relevant to this discussion.) I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    But for some reason you think they have a right to learn.

    No, I never said that. I said no one has a right to prohibit learning, and I said that anything not prohibited is permitted. There is no need for a specific "right to learn" -- for humans or otherwise -- to exist at all.

    So what's the law that prohibits prohibiting learning?


  • I survived the hour long Uno hand

    We should ask the AI itself what it thinks about its alleged violation of copyright. I'm sure this will clear everything up.


  • ♿ (Parody)

    @dkf said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    That their systems can partially reproduce those works isn't the problem

    I think I might disagree there. That's a very very big problem. A "get an immense fine and have the execs go to jail and have the corporation forced to be wound up under judicial supervision" scale of problem potentially.

    Yeah, I have more of a problem with the reproduction than with the ML portion of this. But I put the reproduction on the user who does it. Possibly with some shared culpability by the providers of the tool, depending on lots of stuff.



  • @Gustav said in I, ChatGPT:

    I agree it's absurd to forbid humans from learning,

    But it's certainly not absurd to require humans to pay for (or otherwise get permission to use, e.g., from a library that probably paid for) the textbooks they use for learning.



  • @dkf said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    And learning is not prohibited, therefore there is no justification nor consent needed.

    You missed a key clause: "learning by humans is not prohibited".

    No, I did not. I said precisely what I meant. (And you still never provided a definition of learning, BTW.)

    Also, the acts of the AI are in law the acts of the "legal person" that owns the AI because an AI is just a computer program.

    Please cite the relevant statute. Specifically, the portion placing liability for the use of a tool on the owner of the tool rather than the user of the tool.

    If OpenAI were just going round getting their AI models to learn stuff by sucking up the entire content of the internet, they'd still be a bunch of jerks (for ignoring robots.txt and so on), but they might have some protection from the law.

    That isn't all they do.

    They also let paying customers get new works derived from the ingested data by the action of the models — that's totally what their stated business model is! — and that very much includes works that are extremely likely to be found to be clearly derivative works of current copyrighted material, and making derivative works of copyrighted material for money is absolutely something that people have been punished for.

    The courts got the monkey selfie case right. Why do you think they'll get this one so wrong? (Are you aware of the concept of de minimis copying, and the legal implications thereof?)


  • Banned

    @HardwareGeek said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    I agree it's absurd to forbid humans from learning,

    But it's certainly not absurd to require humans to pay for (or otherwise get permission to use, e.g., from a library that probably paid for) the textbooks they use for learning.

    It's definitely absurd to charge $200 for it and release a new edition every year with no content changes but most of the chapters shuffled around so that the students are effectively unable to buy used.

    But that's just the tip of the iceberg that's the absurdity of the US higher education system.



  • @Mason_Wheeler said in I, ChatGPT:

    The system is learning how to produce new content.

    AND derivative content that infringes the authors' right to control creation of derivative works, which is governed by copyright law.



  • @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    Nobody argues that. I'm just saying that the material they're learning from must be sourced in a legal way, i.e. they must obtain all the right licenses for all the works they're using, for the purposes covering what they're doing.

    Please cite the relevant statute.

    And what they're doing is scraping websites with bots in violation of TOS,

    TOS is a joke with no legal force whatsoever.

    saving copies in permanent storage in violation of license agreements,

    Not happening.

    and internally redistributing those copies to other distinct systems within their IT infrastructure in violation of the Copyright Act of 1976 (or more specifically, USC Title 17 as it currently stands).

    What part of the Copyright Act of 1976 prohibits such behavior? It's a pretty massive bill with over 700 sections.

    That their systems can partially reproduce those works isn't the problem, it's everything they had to do to come to a point where their systems can partially reproduce those works that's the problem.

    No one has yet demonstrated that there is a problem there either.



  • @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    There is nothing reasonable about requiring consent for learning. That's insane in fact.

    I want to learn calculus. By your logic, I can go photocopy a calculus textbook without the author's or publisher's consent, because "learning".

    Not at all. By my logic, you can read a calculus textbook without the author's or publisher's consent, because learning. (If a friend gave you access to their book, for example, you could even do so without paying for it, completely free from any legal or moral entanglements!) The fact that reading, for a computer, necessarily requires ephemeral copies is irrelevant.


  • BINNED

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    It's this ephemeral thing again, isn't it? Please confirm because otherwise this whole rant doesn't make a lick of sense to me.

    Yes. Ephemeral copies do not count. Period. Absolute. Undermine that rule and our entire technological society falls apart. So please leave it alone!

    It’s not the ephemeral copies. If they were truly ephemeral, you wouldn’t need them at all. (Think: not evaluating code that’s free of side effects.)

    The ML models they produce, that is not the software to run or train them but the data (weights), are derivative works of the training data. OpenAI have produced the software and run the training, but the input data is all the result of somebody else’s work, and the model weights are determined by that.
    It’s “transformative”, sure, but to what degree that currently qualifies for fair use would be up for a court to decide. Without a fair use defense, it’s derivative.



  • @boomzilla said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    No, I never said that. I said no one has a right to prohibit learning, and I said that anything not prohibited is permitted. There is no need for a specific "right to learn" -- for humans or otherwise -- to exist at all.

    So what's the law that prohibits prohibiting learning?

    What part of "anything not prohibited is permitted" do you not understand? It is literally that simple. But somehow when people invoke the magic C word, everyone loses their minds!



  • @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    The system is learning how to produce new content.

    AND derivative content that infringes the authors' right to control creation of derivative works, which is governed by copyright law.

    Same question I asked dkf: are you familiar with the concept of de minimis copying and its legal implications?



  • @Gustav said in I, ChatGPT:

    dunno about "the Luddite", whatever it is

    @Mason_Wheeler gave an accurate description of the origin of the word Luddite, possibly the only accurate thing he's posted in this thread, but credit where it's due.



  • @Mason_Wheeler said in I, ChatGPT:

    I don't want "the common plebs" to have to get a license to learn either. I find the very idea intrinsically offensive, no matter who it applies to.

    Do you find the very idea of having to purchase textbooks intrinsically offensive, too? (Irrespective of the prices publishers charge; those are highly offensive.)


  • Banned

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    Nobody argues that. I'm just saying that the material they're learning from must be sourced in a legal way, i.e. they must obtain all the right licenses for all the works they're using, for the purposes covering what they're doing.

    Please cite the relevant statute.

    "What isn't prohibited is allowed." And it's not prohibited to allow use of a website conditional on adherence to TOS. It's also not prohibited to do nothing about people violating your TOS until you feel like it.

    And what they're doing is scraping websites with bots in violation of TOS,

    TOS is a joke with no legal force whatsoever.

    Please cite the relevant statute.

    saving copies in permanent storage in violation of license agreements,

    Not happening.

    Yes happening. What do you think they do, store their entire source database in RAM on a single machine?

    and internally redistributing those copies to other distinct systems within their IT infrastructure in violation of the Copyright Act of 1976 (or more specifically, USC Title 17 as it currently stands).

    What part of the Copyright Act of 1976 prohibits such behavior?

    Mostly §106(1) and (2).

    That their systems can partially reproduce those works isn't the problem, it's everything they had to do to come to a point where their systems can partially reproduce those works that's the problem.

    No one has yet demonstrated that there is a problem there either.

    Mass scraping in violation of TOS is a fact. You cannot deny it's happening, the most you can do is claim (without evidence) that TOS is non-binding. The other two... it would be EXTREMELY hard to teach an ML model anything if they weren't happening. We know for a fact OpenAI hired an army of Indians to comb through some kind of permanently stored database. Was this database populated by the aforementioned scrapers? No definite proof but it would be weird if it wasn't.



  • @topspin said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    It's this ephemeral thing again, isn't it? Please confirm because otherwise this whole rant doesn't make a lick of sense to me.

    Yes. Ephemeral copies do not count. Period. Absolute. Undermine that rule and our entire technological society falls apart. So please leave it alone!

    It’s not the ephemeral copies. If they were truly ephemeral, you wouldn’t need them at all. (Think: not evaluating code that’s free of side effects.)

    The ML models they produce, that is not the software to run or train them but the data (weights), are derivative works of the training data. OpenAI have produced the software and run the training, but the input data is all the result of somebody else’s work, and the model weights are determined by that.
    It’s “transformative”, sure, but to what degree that currently qualifies for fair use would be up for a court to decide. Without a fair use defense, it’s derivative.

    You're still getting fair use entirely backwards.

    The most fundamental right of all, in this context, is freedom of speech. Copyright is an exception to this right, which must necessarily be highly limited in order to avoid infringing too egregiously upon the right of free speech. "Fair use" is what we call free speech in the context of copyright, nothing more, nothing less. It is not an exception to copyright; it is the rule that copyright is an exception to.


  • Banned

    @Mason_Wheeler said in I, ChatGPT:

    @topspin said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    There is no such law. It. Does. Not. Exist. Saying "copyright copyright copyright copyright copyright copyright copyright" does not magic it into existence. If the law you're imagining in your head actually existed, the World Wide Web could not exist. This is not a hypothetical; there were attempts by big copyright interests in the 90s to establish restrictions that would have strangled the Web in its cradle, and they were shot down. They are relevant to this case. Whether you know it or not, you are trying to destroy, through your argument, the very platform you are using to make the argument on.

    It's this ephemeral thing again, isn't it? Please confirm because otherwise this whole rant doesn't make a lick of sense to me.

    Yes. Ephemeral copies do not count. Period. Absolute. Undermine that rule and our entire technological society falls apart. So please leave it alone!

    It’s not the ephemeral copies. If they were truly ephemeral, you wouldn’t need them at all. (Think: not evaluating code that’s free of side effects.)

    The ML models they produce, that is not the software to run or train them but the data (weights), are derivative works of the training data. OpenAI have produced the software and run the training, but the input data is all the result of somebody else’s work, and the model weights are determined by that.
    It’s “transformative”, sure, but to what degree that currently qualifies for fair use would be up for a court to decide. Without a fair use defense, it’s derivative.

    You're still getting fair use entirely backwards.

    Well, so is the copyright law as it currently stands, so 🐮 👈. Fair use is defined as an "except when" to the main law that says "the author has an exclusive right".



  • @Mason_Wheeler said in I, ChatGPT:

    Open-source alternatives to the giant corporate AIs were already emerging last year, and getting good surprisingly quickly.

    Open-source AI is fine, but if it's being trained on copyrighted material, that's just as bad as proprietary AI being trained on copyrighted material.


  • BINNED

    @Mason_Wheeler copyright is the exception to free speech. Fair use is the exception to copyright.

    That’s entirely irrelevant to the point that it’s unclear and up for a court to decide if using massive amounts of copyrighted works to create a derivative work in the form of an AI model is copyright violation or fair use, simply on the grounds of “but it’s :airquotes: learning :airquotes:”.



  • @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    Nobody argues that. I'm just saying that the material they're learning from must be sourced in a legal way, i.e. they must obtain all the right licenses for all the works they're using, for the purposes covering what they're doing.

    Please cite the relevant statute.

    "What isn't prohibited is allowed." And it's not prohibited to allow use of a website conditional on adherence to TOS. It's also not prohibited to do nothing about people violating your TOS until you feel like it.

    ...huh? :wtf_owl:

    And what they're doing is scraping websites with bots in violation of TOS,

    TOS is a joke with no legal force whatsoever.

    Please cite the relevant statute.

    Are you seriously asking me to prove a negative?

    saving copies in permanent storage in violation of license agreements,

    Not happening.

    Yes happening. What do you think they do, store their entire source database in RAM on a single machine?

    Do you have any idea at all how a diffusion model works? Or a vector database? These copies you're trying to assert into existence are not there. In any form. At all. What exists in permanent storage are vague things analogous to "ideas and concepts." The lack of any solid mapping from source to destination is known as "polysemanticity" and is understood to be a major factor in the "black box" nature of AIs that makes it difficult for researchers to understand what's going on under the hood.

    and internally redistributing those copies to other distinct systems within their IT infrastructure in violation of the Copyright Act of 1976 (or more specifically, USC Title 17 as it currently stands).

    What part of the Copyright Act of 1976 prohibits such behavior?

    Mostly §106(2).

    Finally, an actual statute!

    So now we're back to the same question. Are you familiar with the concept of de minimis copying and its legal implications?

    Mass scraping in violation of TOS is a fact. You cannot deny it's happening, the most you can do is claim (without evidence) that TOS is non-binding.

    Burden of proof remains on the accuser.



  • @Mason_Wheeler said in I, ChatGPT:

    includes what human beings do that is commonly understood as learning

    "Learning" is not some magical key to bypass all copyright restrictions. Yes, there are some limited educational purposes that fall under "fair use" in the US. But "learning" doesn't mean you can copy, say, textbooks just because you intend to learn from them.



  • @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    Open-source alternatives to the giant corporate AIs were already emerging last year, and getting good surprisingly quickly.

    Open-source AI is fine, but if it's being trained on copyrighted material, that's just as bad as proprietary AI being trained on copyrighted material.

    If it's just as bad as something that there's nothing wrong with, then there is nothing wrong with it.



  • @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    includes what human beings do that is commonly understood as learning

    "Learning" is not some magical key to bypass all copyright restrictions. Yes, there are some limited educational purposes that fall under "fair use" in the US. But "learning" doesn't mean you can copy, say, textbooks just because you intend to learn from them.

    How do you learn from a textbook, then? Is it not by copying as much of the information presented as possible into a mental model of concepts and associations within your mind?


  • BINNED

    @Mason_Wheeler said in I, ChatGPT:

    Do you have any idea at all how a diffusion model works? Or a vector database? These copies you're trying to assert into existence are not there. In any form. At all. What exists in permanent storage are vague things analogous to "ideas and concepts." The lack of any solid mapping from source to destination is known as "polysemanticity" and is understood to be a major factor in the "black box" nature of AIs that makes it difficult for researchers to understand what's going on under the hood.

    Do you have an idea how these things are trained? The database exists on OpenAI’s servers (we have no evidence for this, but it’s abundantly clear). They’re not scraping things live during training. They’re training using the database they’ve amassed. And they’re not deleting it afterwards, because they want to train the next iteration of their model, too.



  • @dkf said in I, ChatGPT:

    dogs have their priorities and they're not quite the same as humans'.

    I don't know. There are a lot of people who would spend all their time eating, sleeping, and sniffing butts if they could.


  • BINNED

    @Mason_Wheeler said in I, ChatGPT:

    @HardwareGeek said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    includes what human beings do that is commonly understood as learning

    "Learning" is not some magical key to bypass all copyright restrictions. Yes, there are some limited educational purposes that fall under "fair use" in the US. But "learning" doesn't mean you can copy, say, textbooks just because you intend to learn from them.

    How do you learn from a textbook, then? Is it not by copying as much of the information presented as possible into a mental model of concepts and associations within your mind?

    By doing exactly that.
    That’s the part where the law doesn’t treat your mind like a book press or a computer.



  • @topspin said in I, ChatGPT:

    we have no evidence for this, but it’s abundantly clear

    Your entire argument in a nutshell! :rofl:


  • BINNED

    @Mason_Wheeler said in I, ChatGPT:

    @topspin said in I, ChatGPT:

    we have no evidence for this, but it’s abundantly clear

    Your entire argument in a nutshell! :rofl:

    That’s like saying I can’t prove Google is tracking people. If you think that’s a good argument to make…



  • @topspin said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @topspin said in I, ChatGPT:

    we have no evidence for this, but it’s abundantly clear

    Your entire argument in a nutshell! :rofl:

    That’s like saying I can’t prove Google is tracking people. If you think that’s a good argument to make…

    You can easily prove Google is tracking people, because we have tangible evidence in the form of cookies.


  • BINNED

    @Mason_Wheeler said in I, ChatGPT:

    @topspin said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @topspin said in I, ChatGPT:

    we have no evidence for this, but it’s abundantly clear

    Your entire argument in a nutshell! :rofl:

    That’s like saying I can’t prove Google is tracking people. If you think that’s a good argument to make…

    You can easily prove Google is tracking people, because we have tangible evidence in the form of cookies.

    That doesn’t prove anything in the strict sense of the word. It’s no more evidence than we have that OpenAI has a database.


  • Banned

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    @Gustav said in I, ChatGPT:

    @Mason_Wheeler said in I, ChatGPT:

    I simply see no reason why learning should be considered "a [exclusively] human right" or an exclusively human capability.

    Nobody argues that. I'm just saying that the material they're learning from must be sourced in a legal way, i.e. they must obtain all the right licenses for all the works they're using, for the purposes covering what they're doing.

    Please cite the relevant statute.

    "What isn't prohibited is allowed." And it's not prohibited to allow use of a website conditional on adherence to TOS. It's also not prohibited to do nothing about people violating your TOS until you feel like it.

    ...huh? :wtf_owl:

    I'm sorry, I messed up splitting up the quote. That was the answer to "TOS is a joke with no legal force whatsoever."

    And what they're doing is scraping websites with bots in violation of TOS,

    TOS is a joke with no legal force whatsoever.

    Please cite the relevant statute.

    Are you seriously asking me to prove a negative?

    I am asking you to prove that free service providers are allowed less freedom than paid service providers in the terms and conditions they put around use of their services. A.k.a. that TOS is a joke. So far you've only asserted that TOS is unenforceable; you haven't linked to any source showing it's true. And my own search only turned up that you aren't allowed to put in your TOS 99% of what corporations usually put in theirs, not that TOS is illegal or unenforceable in itself.

    saving copies in permanent storage in violation of license agreements,

    Not happening.

    Yes happening. What do you think they do, store their entire source database in RAM on a single machine?

    Do you have any idea at all how a diffusion model works? Or a vector database? These copies you're trying to assert into existence are not there. In any form. At all. What exists in permanent storage are vague things analogous to "ideas and concepts."

    No, that's after learning. I'm talking about the database that exists before learning. That the model learns from. That one contains unaltered source works. Unless they keep it all in RAM on a single machine, they must have some sort of storage and redistribution layer in their system. And that's what's violating copyright.

    and internally redistributing those copies to other distinct systems within their IT infrastructure in violation of the Copyright Act of 1976 (or more specifically, USC Title 17 as it currently stands).

    What part of the Copyright Act of 1976 prohibits such behavior?

    Mostly §106(2).

    Finally, an actual statute!

    So now we're back to the same question. Are you familiar with the concept of de minimis copying and its legal implications?

    Yes I am. It means fuck all regarding full copies of full works of art.

    Mass scraping in violation of TOS is a fact. You cannot deny it's happening, the most you can do is claim (without evidence) that TOS is non-binding.

    Burden of proof remains on the accuser.

    Are you denying mass scraping is happening? Or that it violates TOS?

