Paradoxically, a non-profit organization started for the public benefit has become the world’s leader in creating infrastructure software companies and is responsible for generating billions of dollars in enterprise value. This organization is the Apache Software Foundation, commonly called the Apache Foundation, and it was started in 1999 by the group that developed the Apache HTTP Server, long one of the world’s most widely used web servers. The founders hoped to ensure that their software could be freely distributed and modified without regard to commercial interests, and to give themselves legal protection from large companies that took issue with their work.
Despite the anti-commercial origins of the foundation, and of the open source movement more broadly, it has ended up serving as an incubator of sorts for some of the largest enterprise software businesses of the last decade. Cloudera and Hortonworks were based on Apache Hadoop, Databricks on Apache Spark, and Confluent on Apache Kafka, just to name a few. In all these cases, groups of open source developers helped to create a project under the foundation’s banner and then distributed it under the Apache license. They then went on to create for-profit entities providing commercial support or enterprise features on top of these projects – also known as the “open core” model.
“Open core” has proven in recent years to be one of the most effective business models in the world of enterprise software. It has the viral, bottom-up adoption that many consumer businesses benefit from, as developers start using a project organically within large enterprises and share it with their peers and friends. At the same time, once open source code is customized and architected into a product, it becomes hard to replace, and because developers now control more of the budget, that adoption leads to very large, sticky, and expansive enterprise contracts. It’s in some ways the best of both the consumer and enterprise worlds.
Catalysts for open source
The meteoric rise of open source and open core has been well documented over the past several years. We have seen a rapid rise in the number of startups with an open core model, in the funding for these types of companies, and most importantly, in the usage of open source software within companies. Remarkably, by some estimates 60-80% of a typical modern application’s stack consists of open source code. What is less discussed, however, is the change in licensing models for open source software and the Apache Foundation’s role in pioneering more permissive licensing.
Broadly speaking, there are two types of licenses for open source projects: copyleft and permissive. Under a copyleft license, when software is modified and redistributed, the distribution terms of the original software must apply to the result. Thus, if a developer builds a new product by modifying or extending a copyleft-licensed project and then distributes it, the new product must be open sourced under the same terms as the original project. The original copyleft license was the GNU General Public License, created by Richard Stallman and the Free Software Foundation in the late 1980s. Their core belief was that all software should be free and that the interests of open source developers should be protected at all costs.
When the Apache group created its own license, it instead opted for a more “permissive” one, modeled after the BSD license. This was done to help the Apache HTTP Server community grow quickly and to ensure that anybody could use the software without worrying about restrictive license terms. Under permissive licenses, open source projects can be more freely modified and distributed, even if this means being turned into a closed-source commercial product, provided the original copyright and license notices are preserved. While the Apache Foundation was not the first to create or use a permissive license for open source distribution, it helped popularize permissive licensing for major projects, and as a result the Apache license has become one of the most widely used today.
According to WhiteSource, the Apache license is used in nearly 25% of projects, making it the second most popular license after the similarly permissive MIT license.
Moreover, mirroring the rapid rise of open core and open source, the past 8 years have seen a rapid decline in the use of copyleft licenses and a corresponding increase in the use of permissive licenses. It is fascinating to note that the rise of open source can largely be attributed to changes in the legal structure behind licenses and increased empowerment for developers to create commercial products using open source components.
So if the Apache Foundation can be viewed as an incubator for the next great enterprise software companies, which projects and corresponding commercial entities should we keep our eyes on? I personally think Astronomer, based on Apache Airflow; Imply Data, based on Apache Druid; and Preset.io, based on Apache Superset, are the most promising, judging by their GitHub traction and popularity among the developer and data science communities. As developer tooling continues to grow as an important category in enterprise software, I have no doubt that “Apache Combinator” will continue to churn out interesting companies and that permissive licenses will become nearly ubiquitous in the world of open source software.