I found an interesting post on LWN.net that analyzed the Linux 2.6.20 source code to discover who contributed the most code and what company they work for. It struck me that the Linux development hierarchy is a lot like traditional software development.
The top 20 people contribute about 50% of the code. I think most large software projects are like that. The "code gods" pump out the core code while hundreds, sometimes thousands, of detail coders and testers work out the rest.
LWN.net took several approaches to measuring contributions: lines of code changed, changesets, lines removed, and signoffs. Lines of code changed seems to be the most reasonable measure of contribution, although each measure is open to interpretation. About 47% of the code lines changed were contributed by 20 individuals.
Developers with the most changed lines | Lines changed | Percent of total |
---|---|---|
Jeff Garzik | 20712 | 6.0% |
Patrick McHardy | 15024 | 4.3% |
Jiri Slaby | 13917 | 4.0% |
Avi Kivity | 11726 | 3.4% |
Andrew Victor | 9710 | 2.8% |
Amit S. Kale | 9537 | 2.7% |
Stephen Hemminger | 9120 | 2.6% |
Geoff Levand | 8396 | 2.4% |
Michael Chan | 8307 | 2.4% |
Chris Zankel | 8099 | 2.3% |
Mauro Carvalho Chehab | 7390 | 2.1% |
Adrian Bunk | 6138 | 1.8% |
Yoshinori Sato | 5232 | 1.5% |
Al Viro | 4981 | 1.4% |
Benjamin Herrenschmidt | 4588 | 1.3% |
Thierry MERLE | 4549 | 1.3% |
Dan Williams | 4516 | 1.3% |
Jonathan Corbet | 3924 | 1.1% |
Gerrit Renker | 3857 | 1.1% |
Jiri Kosina | 3805 | 1.1% |
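As a rough check on the table above, here is a minimal Python sketch that adds up the lines changed and the published percentage column for the top N developers. The function names `top_n_lines` and `top_n_share` are made up for this illustration, and because the published percentages are rounded to one decimal place, the totals are only approximate.

```python
# (developer, lines changed, percent of total) copied from the table above.
rows = [
    ("Jeff Garzik", 20712, 6.0),
    ("Patrick McHardy", 15024, 4.3),
    ("Jiri Slaby", 13917, 4.0),
    ("Avi Kivity", 11726, 3.4),
    ("Andrew Victor", 9710, 2.8),
    ("Amit S. Kale", 9537, 2.7),
    ("Stephen Hemminger", 9120, 2.6),
    ("Geoff Levand", 8396, 2.4),
    ("Michael Chan", 8307, 2.4),
    ("Chris Zankel", 8099, 2.3),
    ("Mauro Carvalho Chehab", 7390, 2.1),
    ("Adrian Bunk", 6138, 1.8),
    ("Yoshinori Sato", 5232, 1.5),
    ("Al Viro", 4981, 1.4),
    ("Benjamin Herrenschmidt", 4588, 1.3),
    ("Thierry MERLE", 4549, 1.3),
    ("Dan Williams", 4516, 1.3),
    ("Jonathan Corbet", 3924, 1.1),
    ("Gerrit Renker", 3857, 1.1),
    ("Jiri Kosina", 3805, 1.1),
]


def top_n_lines(rows, n):
    """Total lines changed by the first n developers in the table."""
    return sum(lines for _, lines, _ in rows[:n])


def top_n_share(rows, n):
    """Sum of the published (rounded) percentages for the first n developers."""
    return round(sum(pct for _, _, pct in rows[:n]), 1)


for n in (5, 10, 20):
    print(f"top {n:2d}: {top_n_lines(rows, n):6d} lines, {top_n_share(rows, n)}%")
```

Run as is, the last line reports 163528 lines and 46.9% for the top 20. The same two functions show how quickly the share builds: the top 5 developers alone account for about a fifth of all changed lines.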
LWN.net next looked at who was paying these contributors, identified by the domain name of the company they worked for. It was not possible to get a domain name in all cases, but here are the results from LWN.net.
Top lines changed by employer | Lines changed | Percent of total |
---|---|---|
(Unknown) | 66154 | 19.0% |
Red Hat | 44527 | 12.8% |
(None) | 38099 | 11.0% |
IBM | 25244 | 7.3% |
Astaro | 15306 | 4.4% |
Linux Foundation | 13638 | 3.9% |
Qumranet | 12108 | 3.5% |
Novell | 11930 | 3.4% |
Intel | 11652 | 3.4% |
SANPeople | 9888 | 2.8% |
NetXen | 9607 | 2.8% |
Sony | 8497 | 2.4% |
Broadcom | 8349 | 2.4% |
Tensilica | 8195 | 2.4% |
Nokia | 5581 | 1.6% |
MontaVista | 4394 | 1.3% |
University of Aberdeen | 4324 | 1.2% |
LWN.net | 3975 | 1.1% |
Secretlab | 3370 | 1.0% |
HP | 3211 | 0.9% |
While "unknown" and "none" accounted for 30% of the changes, the remaining 18 companies accounted for almost 60% of the code lines contributed. It is possible that some significant percentage of the "unknown" and "none" actually worked for some of these companies, but made their contributions from home.
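The split between unattributed and named employers can be reproduced from the employer table with a short Python sketch. The names `UNATTRIBUTED` and `split_shares` are hypothetical helpers for this illustration, and the percentages are the rounded values as published, so the sums are approximate.

```python
# (employer, lines changed, percent of total) copied from the employer table.
employers = [
    ("(Unknown)", 66154, 19.0),
    ("Red Hat", 44527, 12.8),
    ("(None)", 38099, 11.0),
    ("IBM", 25244, 7.3),
    ("Astaro", 15306, 4.4),
    ("Linux Foundation", 13638, 3.9),
    ("Qumranet", 12108, 3.5),
    ("Novell", 11930, 3.4),
    ("Intel", 11652, 3.4),
    ("SANPeople", 9888, 2.8),
    ("NetXen", 9607, 2.8),
    ("Sony", 8497, 2.4),
    ("Broadcom", 8349, 2.4),
    ("Tensilica", 8195, 2.4),
    ("Nokia", 5581, 1.6),
    ("MontaVista", 4394, 1.3),
    ("University of Aberdeen", 4324, 1.2),
    ("LWN.net", 3975, 1.1),
    ("Secretlab", 3370, 1.0),
    ("HP", 3211, 0.9),
]

# Rows that do not name an employer.
UNATTRIBUTED = {"(Unknown)", "(None)"}


def split_shares(rows):
    """Return (unattributed %, named-employer %) from the published column."""
    unattributed = sum(pct for name, _, pct in rows if name in UNATTRIBUTED)
    named = sum(pct for name, _, pct in rows if name not in UNATTRIBUTED)
    return round(unattributed, 1), round(named, 1)


unattributed_pct, named_pct = split_shares(employers)
named_count = sum(1 for name, _, _ in employers if name not in UNATTRIBUTED)
print(f"unattributed: {unattributed_pct}%")
print(f"{named_count} named employers: {named_pct}%")
```

This gives 30.0% for the two unattributed rows and 58.6% for the 18 named employers, which leaves a little over 11% of changed lines for every employer outside the top 20.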
Where is Google? Not surprisingly, Red Hat, IBM, and Novell were big contributors. But where is Google? They certainly use Linux and lots of Open Source software, but why don't they show up as even 1% contributors?
The Long Tail of software development - It would be interesting to see the distribution of contributors for the remaining 50% of the code. My guess is that there is a very long tail of small contributors. Again, not unlike lots of big traditional software development projects.
How many Open Source users actually make changes to the source code? I recently spoke at a Fortune 500 CIO conference. During my speech I did a real time poll of the audience of CIOs. The results confirmed my gut feel for how the market really works. Here are the questions and the results.
How many use Windows Server? 100%
How many use Linux? 45%
How many use both? 45%
How many of you have made changes to the source code? 8%
Very few Open Source users ever touch the source code. So is it really about the source code?
A small number of companies contribute most of the code to Open Source development, so is it really about the community?
There are lots of free open source distributions of Linux, various databases, application servers, etc. Yet, Microsoft, Oracle, and BEA do pretty well in each market. Is it really about the price?
Everyone has an opinion, but I haven't seen any real survey data that covers these questions. Have any of you seen studies on this?
My real time survey of the CIO audience confirmed my belief that Open Source users are not zealots. They have pragmatic reasons for choosing Linux for some jobs and Windows for others. Understanding all those reasons will take more study.
"How many of you have made changes to the source code? 8%
Very few Open Source users ever touch the source code. So is it really about the source code? "
I think that speaks for CIOs (the pool you polled). I think the majority of users who actually modify code modify smaller programs, utilities, and the like rather than larger applications, much less the kernel.
Posted by: Cory Boston | February 27, 2007 at 12:09 PM
Cory, I get your point about the audience, but the assumption was that the CIOs were responding on behalf of all the developers at their company. No offense, but most CIOs don't code...and shouldn't.
I also agree that probably most modifications to source code are very simple tweaks to utilities, drivers, and controls.
However, it is still surprising how few companies actually touch the source code. I think it is just the comfort in knowing that they can if they need to.
Posted by: DonDodge | February 27, 2007 at 12:15 PM
I believe that for developers Open Source is about the code, for the CIO it's about price, and for both it's about freedom.
Posted by: Fredrik Pettersson | February 27, 2007 at 01:44 PM
It's not about modifying the source code; it's about being able to read it and find out how things work. Making changes to your own copy of a project is best avoided, as you'll have to merge those changes into future versions of the software (unless you can convince the maintainers to accept your changes back into the trunk).
Being able to see exactly how the software you are running works is invaluable, both for working around bugs and for writing extensions. Most popular open-source tools include an extension or plugin mechanism which offers some level of protection against changes in the software breaking your custom code.
Posted by: Simon Willison | February 27, 2007 at 02:40 PM
Of course it's also about having an insurance policy. If you have access to the source code (and a license to do what you need with it) you don't have to trust your vendor not to discontinue support, as happened to several million Visual Basic programmers a few years ago.
Posted by: Simon Willison | February 27, 2007 at 02:43 PM
Google maintains its own Linux kernel, which they forked a LONG time ago.
Ingo Molnar works for Google now btw but apparently they're going to allow/want him to work on the mainline kernel.
Posted by: Kevin Burton | February 27, 2007 at 10:47 PM
It is not about the source code; it is about who can influence the direction of the software. With open source, no one can have the kind of power MS has over Windows. The beauty of OSS is that it is shaped by a democratic process that balances all stakeholders.
Anyway, your discovery is very revealing and kills a big myth about OSS.
Posted by: TanNg | February 28, 2007 at 11:41 AM
Thanks for all the comments. I agree that part of the appeal of Open Source is the ability to look at the source code to see what it is doing, and perhaps how to tune it. I also agree that having the source is an "insurance policy" of sorts, just knowing that you have the source and can change it if you ever want to.
Just to be clear, this is NOT my research. The fine people at Linux Weekly News (LWN.net) did the research and published it on their site. See the link above. I am simply adding my experience, my (biased) opinions, and the results of a live CIO survey I conducted recently.
BTW, Chris DiBona at Google is a personal friend. Chris is in charge of all open source programs at Google, an outstanding author of many software books, and formerly associated with Slashdot. My "Where is Google?" question is not a swipe at Google...I was just surprised.
Posted by: DonDodge | February 28, 2007 at 03:41 PM
I think that it is foolish to believe that no one has the influence over OSS that MS has over Windows. Microsoft actually has a lot less control over their OS than they would like. Their customers, especially their major corporate customers, have a huge influence over the development of Windows. Even non-MS products like Samba have resulted in changes to Windows.
Companies like IBM also have a much larger influence over open source than most people seem to realize, and they do leverage it to their advantage. Consider the ODF/OpenXML debate in which IBM is trying to kill OpenXML, something open source proponents have been demanding for years, in order to sell more copies of Lotus.
Posted by: Jonathan Allen | February 28, 2007 at 03:48 PM
"A small number of companies contribute most of the code to Open Source development, so is it really about the community?"
Don't forget that there's much more to community contributions than just writing source code. I'm a software developer, but generally don't work on the code for open source products that I use.
However, I do make numerous contributions in terms of filing bug reports, helping out users on mailing lists, etc.
Not everyone donates code. Some people donate QA. Some people donate tech support. Some people donate marketing (e.g., via blogging and evangelizing).
Looking at open source community contributions only from the perspective of code will give you a misleading picture.
Posted by: DAR | March 01, 2007 at 09:52 AM