LLMs are Directing Open Source Licensing
I’m often looking at the licensing of open source projects, either out of personal interest or while moderating posts to /r/opensource, and through this I’ve seen a recent trend of projects having something like this in their readme:
## License
MIT License - See [LICENSE](LICENSE) file for details
But the interesting part is that the referenced LICENSE file is frequently missing.
This wouldn’t be surprising as a one-off, but I was seeing this many times per week, all in a similar format, all MIT licensed.
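For anyone wanting to check their own projects for this pattern, here's a rough sketch in Python. It's not a tool I used for the observations above, just an illustration: the function name `missing_license_targets` is made up, and it assumes the readme is a file named `README.md` at the repository root that links to the license using a relative markdown link.

```python
import re
from pathlib import Path

# Matches markdown links whose target looks like a license file,
# e.g. [LICENSE](LICENSE) or [license](docs/LICENCE.txt).
LICENSE_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]*licen[sc]e[^)\s]*)\)", re.IGNORECASE)


def missing_license_targets(repo_dir):
    """Return license link targets in README.md that don't exist on disk.

    Hypothetical helper: assumes a README.md at the repository root.
    """
    repo = Path(repo_dir)
    readme = repo / "README.md"
    if not readme.is_file():
        return []
    text = readme.read_text(encoding="utf-8", errors="replace")
    missing = []
    for target in LICENSE_LINK.findall(text):
        # Skip external URLs; only relative paths can be checked locally.
        if "://" in target:
            continue
        # Drop any #fragment, and ignore pure in-page anchors.
        path = target.split("#", 1)[0]
        if not path:
            continue
        if not (repo / path).is_file() and target not in missing:
            missing.append(target)
    return missing
```

Calling `missing_license_targets(".")` from a project root returns an empty list when every linked license file exists, and otherwise lists the dangling targets.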
I suspected the rising use of LLMs as the cause, so I started to query this with authors, which confirmed my suspicions. Examples: A, B, C, D. I queried others too, but authors often seem a little hesitant to directly confirm their use of an LLM, so I only get an indirect response. Examples: A, B.
Since it’s always MIT, I worry a little about how this trend may further reduce the opportunity for other licensing options to be considered while further popularising this single very permissive license. Through a cynical lens, I could theorize that maybe this is purposeful, to keep project licenses friendly to the large companies creating these LLMs, who tend to hiss back at the sign of a copyleft license. Realistically though, the MIT license is probably generated due to its very high popularity, being the most used license on GitHub (which I guess most LLMs would have been trained on) by quite a margin.
This situation does raise some questions for me though: Do authors really understand the terms they’re applying when an LLM is choosing their license? If not, could this create more conflict from scenarios where users exercise license rights in ways the author doesn’t like or expect? Are licensing requirements of included or dependent software being ignored due to this? How strong is the bias towards MIT?
Ultimately it’s up to the author to understand and take responsibility for what they’re publishing, even if using an LLM to help them along. But with this often being overlooked already, I’m expecting we’ll see more “dawg i chatgpt’d the license” moments in the future.