davideisinger.com/static/archive/www-robinsloan-com-dkgq9f.txt
This is a post from [4]Robin Sloan’s lab blog & notebook. You can [5]visit the
blog’s homepage, or [6]learn more about me.
[7]Is it okay?
February 11, 2025
[8]Macbeth Consulting the Witches, 1825, Eugène Delacroix
How do you make a language model? Goes like this: erect a trellis of code, then
allow the real program to grow, its development guided by a grueling training
process, fueled by reams of text, mostly scraped from the internet. Now. I want
to take a moment to think together about a question with no remaining practical
importance, but persistent moral urgency:
Is that okay?
The question doesn’t have any practical importance because the AI companies
(and not only the companies, but the enthusiasts, all over the world) are going
to keep doing what they’re doing, no matter what.
The question does still have moral urgency because, at its heart, it’s a
question about the things people all share together: the hows and the whys of
humanity’s common inheritance. There’s hardly anything bigger.
And, even if the companies and the enthusiasts rampage ahead, there are still
plenty of us who have to make personal decisions about this stuff every day.
You gotta take care of your own soul, and I’m writing this because I want to
clarify mine.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
A few ground rules.
First, if you (you engineer, you AI acolyte!) think the answer is obviously
“yes, it’s okay”, or if you (you journalist, you media executive!) think the
answer is obviously “no, it’s not okay”, then I will suggest that you are not
thinking with sufficient sensitivity and imagination about something truly new
on Earth. Nothing here is obvious.
Second, I’d like to proceed by depriving each side of its best weapon.
On the side of “yes, it’s okay”, I will insist that the analogy to human
learning is not admissible. “Don’t people read things, and learn from them, and
produce new work?” Yes, but speed and scale always influence our judgments
about safety and permissibility, and the speed and scale of machine learning is
off the charts. No human, no matter how well-read, could ever field requests
from a million other people, all at once, forever.
On the side of “no, it’s not okay”, I will set aside any arguments grounded in
copyright law. Not because they are irrelevant, but because, well, I think
modern copyright is flawed, so a victory on those grounds would be thin, a bit
sad. Instead, I’ll defer to deeper precedents: the intuitions and aspirations
that gave rise to copyright in the first place. To promote the Progress of
Science and useful Arts, remember?
I hope partisans of both sides will agree this is a fair swap. Put down your
weapons, and let’s think together.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
I want to go carefully, step by step, yet I want to do so with brevity.
Language models produce so many words, and they seem to coax just as many out
of their critics. Logorrhea begets logorrhea. We can do better.
I’ll begin with my sense of what language models are doing. Here it is:
language models collate and precipitate all the diverse reasons for writing,
across a huge swath of human activity and aspiration. Start to enumerate those
reasons: to inform, to persuade, to sell this stupid alarm clock, to dump the
CUSTOMERS table into a CSV file, and you realize it’s a vast field of desire
and action, impossible to hold in your head.
The language models have many heads.
In this formulation, language models are not merely trained on human writing.
They are the writing: all those reasons, granted the ability to speak for
themselves. I imagine the PyTorch code as a mech suit, with squishy language
strapped in tight.
To make this work (you already know this, but I want to underscore it), only a
truly rich trove of writing suffices. Train a language model on all of
Shakespeare’s works and you won’t get anything useful, just a brittle
Shakespeare imitator.
In fact, the only trove known to produce noteworthy capabilities is: the entire
internet, or close enough. The whole extant commons of human writing. From here
on out, for brevity, we’ll call it Everything.
This is what makes these language models new: there has never, in human
history, been a way to operationalize Everything. There’s never been anything
close.
Just as, above, I set copyright aside, I want also to set aside fair use and
the public domain. Again, not because they are irrelevant, but because those
intuitions and frameworks all assume we are talking about using some part of
the commons, not all of it.
I mean: ALL of it!
If language models worked like cartoon villains, slurping up Everything and
tainting it with techno-ooze, our judgment would be easy. But of course,
digitization is trickier than that: the airy touch of the copy complicates the
scenario.
The language model reads Everything, and leaves Everything untouched, yet
suddenly this new thing exists, with strange and formidable powers.
Is that okay?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
As we begin to feel our way across truly new terrain, we can inquire: how much
of the value of these models comes from Everything? If the fraction was just
one percent, or even ten, then we wouldn’t have much more to say.
But the fraction is, for sure, larger than that.
What goes into a language model? Data and compute.
For the foundation models like Claude, data means: Everything.
Compute combines two pursuits:
1. software: the trellises and applications that support the development and
deployment of these models, and
2. hardware: the vast sultry data centers, stocked with chips, that give them
room to run
There’s a lot of value in those pursuits; I don’t take either for granted, or
the labor they require. The experience you get using a model like Claude
depends on an ingenious scaffolding. [9]Truly! At the same time: I believe
anyone who works on these models has to concede that the trellises and the
chips, without data, are empty vessels. Inert.
Reasonable people can disagree about how the value breaks down. While I believe
the relative value of Everything in this mix is something close to 90%, I’m
willing to concede a 50/50 split.
And here is the important thing: there is no substitute.
You’ve probably heard about the race to generate novel training data, and all
the interesting effects such data can have. It is sometimes lost in those
discussions that these sophisticated new curricula can only be provided to a
language model already trained on Everything. That training is what allows it
to make sense of the new material.
Also, it is often the case (not always, but often) that the novel training
data is generated by a language model, which has itself been trained on, you
guessed it.
It’s Everything, all the way down.
Would it be possible to commission a fresh body of work, Everything’s equal in
scale and diversity, without any of the encumbrances of the commons? If you
could do it, and you trained a clean-room model on that writing alone, I
concede that my question would be moot. (There would be other questions! Just
not this one.) Certainly, with as much money as the AI companies have now,
you’d expect they might try. We know they are already paying to produce new
content, lots of it, across all sorts of business and technical domains.
But this still wouldn’t match the depth and richness of Everything. I have a
hypothesis, which naturally might be wrong: that it is precisely the naivete of
Everything, the fact that its writing was actually produced for all those
different reasons, that makes it so valuable. Composing a fake corporate email,
knowing it will be used to train a language model, you’re not doing nothing,
but you’re not doing the same thing as the real email-writer. Your document
doesn’t have the same, what? The same grain. The same umami.
Maybe one of these companies will spend ten billion dollars to commission a
whole new internet’s worth of text and prove me wrong. However, I think there
are information-theoretic reasons to believe the results of such a project
would disappoint them.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
So! Understanding that these models are reliant on Everything, and derive a
large fraction of their value from it, one judgment becomes clear:
If their primary application is to produce writing and other media that crowds
out human composition, human production: no, it’s not okay.
For me, this is intuitively, almost viscerally, obvious. Here is the ultimate
act of pulling the ladder up behind you, a giant “fuck you” to every human who
ever wanted to accomplish anything, who matched desire to action, in writing,
part of Everything. Here is a technology founded in the commons, working to
undermine it. Immanuel Kant would like a word.
Fine. But what if that isn’t the primary application? What if language models,
by collating and precipitating all the diverse reasons for writing, become
flexible general-purpose reasoners, and most of their “output” is never
actually read by anyone, instead running silent like the electricity in your
walls?
It’s possible that language models could go on broadening and deepening in this
way, and eventually become valuable [10]aids to science and technology, [11]to
medicine and more.
This is tricky (it’s so, so tricky) because the claim is both (1) true, and
(2) convenient. One wishes it wasn’t so convenient. Can’t these companies
simply promise, with every passing year, that AI super science is just around
the corner, and meanwhile, wreck every creative industry, flood the internet
with garbage, grow rich on the value of Everything? Let us cook, while culture
fades into a sort of oatmeal sludge.
They can do that! They probably will. And the claim might still be true.
If super science is a possibility (if, say, Claude 13 can help deliver cures
to a host of diseases) then, you know what? Yes, it is okay, all of it. I’m
not sure what kind of person could insist that the maintenance of a media
status quo trumps the eradication of, say, most cancers. Couldn’t be me. Fine,
wreck the arts as we know them. We’ll invent new ones.
(I know that seems awfully consequentialist. Would I sacrifice anything, or
everything, for super science? No. But art and media can find new forms. That’s
what they do.)
Obviously, this scenario is especially appealing if the super science, like
Everything at its foundation, flows out into the commons. It should.
So, is super science really on the menu? We don’t have any way of knowing; not
yet. Things will be clearer in a few years, I think. There will either be real
undeniable glimmers, reported by scientists putting language models to work, or
there will still only be visions.
For my part, I think the chance of super science is below fifty percent, owing
mostly to the friction of the real physical world, which the language models
have, so far, avoided. But, I also think the chance is above ten percent, so,
I remain curious.
It’s not unreasonable to find this wager suspicious, but if you do, I might
ask: is there any possible-but-unproven technology that you think is worth
pursuing even at the cost of itchy uncertainty in the present? If the answer is
“yes, just not this one”: fair enough. If the answer is “no”: aha! I see you’ve
answered the question at the top of this page for yourself already.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Where does this leave us?
I suppose it’s not surprising, in the end:
If an AI application delivers some profound public good, or even if it might,
it’s probably okay that its value is rooted in this unprecedented
operationalization of the commons.
If an AI application simply replicates Everything, it’s probably not okay.
I’ll sketch out my current opinions more specifically:
I think the image generation models, trained on the Everything of pictures,
are: probably not okay. They don’t do anything except make more images. They
pee in the pool.
I think the foundation models like Claude are: probably okay. If it seemed, a
couple of years ago, that they were going to be used mainly to barf out text,
that impression has faded. It’s clear their applications are diverse, and often
have more to do with processes than end products.
The case of translation is compelling. If language models are, indeed, the
Babel fish, they might justify the operationalization of the commons even
without super science.
I think the case of code is especially clear, and, for me, basically settled.
That’s both (1) because of where code sits in the creative process, as an
intermediate product, the thing that makes the thing, and (2) because the
commons of open-source code has carried the expectation of rich and surprising
reuse for decades. I think this application has, in fact, already passed the
threshold of “profound public good”: opening up programming to whole new
groups of people.
But, again, it’s important to say: the code only works because of Everything.
Take that data away, train a model using GitHub alone, and you’ll get a far
less useful tool.
Maybe (it turns out) I’m less interested in litigating my foundational question
and more interested in simply insisting on the overwhelming, irreplaceable
contribution of this great central treasure: all of us, writing, for every
conceivable reason; desire and action, impossible to hold in your head.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Did we make progress here? I think so. It’s possible my question, at the
outset, seemed broad. In fact, it’s fairly narrow, about this core mechanism,
the operationalization of the commons: whether I can live with it, or not.
One extreme: if these machines churn through all media, and then, in their
deployment, blow away any prospect for a healthy market for human-made media,
I’d say, no, that’s not what we want from technology, or from our future.
Another extreme: if these machines churn through all media, and then, in their
deployment, discover several superconductors and cure all cancers, I’d say,
okay, we’re good.
What if they do both? Well, it would be a bummer for media, but on balance I’d
take it. There will always be ways for artists to get out ahead again. More on
that in another post.
I also think there are some potential policy remedies that would even out the
allocation of value here, although, these days, imagining interesting policy
is a sort of fantastical entertainment. Even so, I’ll post about those later,
too.
In this discussion, I set copyright and fair use aside. I should say, however,
that I’m not at all interested in clearing the air for AI companies, legally.
They’ve chosen to plunge ahead into new terrain, so let them enjoy the fog of
war, Civ-style. Let them cook!
[12]To the blog home page
I'm [13]Robin Sloan, a fiction writer. The main thing to do here is sign up for
my newsletter:
This website doesn’t collect any information about you or your reading.
It aspires to the speed and privacy of the printed page.
Don’t miss [16]the colophon. Hony soyt qui mal pence
References:
[1] https://www.robinsloan.com/lab/
[2] https://www.robinsloan.com/about/
[3] https://www.robinsloan.com/moonbound/
[4] https://www.robinsloan.com/
[5] https://www.robinsloan.com/lab/
[6] https://www.robinsloan.com/about/
[7] https://www.robinsloan.com/lab/is-it-okay/
[8] https://www.clevelandart.org/art/1962.109?utm_source=Robin_Sloan_sent_me
[9] https://www.youtube.com/watch?v=ugvHCXCOmm4#t=9780
[10] https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/?utm_source=Robin_Sloan_sent_me
[11] https://darioamodei.com/machines-of-loving-grace?utm_source=Robin_Sloan_sent_me
[12] https://www.robinsloan.com/lab/
[13] https://www.robinsloan.com/about?utm_source=Robin_Sloan_sent_me
[16] https://www.robinsloan.com/colophon/