Going way back to the days of in-person trade shows and random encounters that led to meaningful conversation with like-minded strangers, one chat I had at 2018’s Open Source Summit always stands out in my mind, and I often reflect on what that attendee said: “Open source is not just the technology, but it’s a mindset in general.”
It’s clear from our 2021 State of Open Source License Compliance report that the value of open source technology has been proven and pretty much cemented. I probably don’t have to tell you that our report found that the use of open source software exploded in 2020. Our audit team’s analysis showed that open source software made up more than half of the organizations’ codebases. More than half. What a resounding vote of confidence for open source.
But not everyone has moved past the technology to embrace the open source mindset – how to remove roadblocks to its safe and secure use by enabling better collaboration to perpetuate that use. This is evidenced by a 117% increase in issues related to licensing and security from 2019 to 2020 – one issue for every 12,126 lines of code to be exact. For context, a simple iPhone game has hundreds of thousands of lines of code, and by last count, Google’s products have more than 2 billion lines of code.
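To put that density in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the audit average of one issue per 12,126 lines applies uniformly to a codebase of any size, which real projects will not follow exactly. The function name and the 300,000-line example are hypothetical.

```python
# Audit average quoted above: one issue per 12,126 lines of code.
LINES_PER_ISSUE = 12_126


def estimated_issues(lines_of_code: int) -> float:
    """Rough issue count, assuming the audit average applies uniformly."""
    if lines_of_code < 0:
        raise ValueError("lines_of_code must be non-negative")
    return lines_of_code / LINES_PER_ISSUE


if __name__ == "__main__":
    # Hypothetical codebase sizes, chosen to match the scale mentioned above.
    examples = [
        ("300,000-line app", 300_000),
        ("2-billion-line codebase", 2_000_000_000),
    ]
    for label, loc in examples:
        print(f"{label}: roughly {estimated_issues(loc):,.0f} issues")
```

Even under this simplification, the point holds: at that rate, a modest app would carry a couple dozen issues, and the count grows linearly with the size of the codebase.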
I’ll zero in on one contributing factor in that increase here: binaries. The use of binaries, files produced by compiling source code from various origins and sometimes technologies, increased by 58% this year in our audits. Binaries are aimed at simplification, but their contents can be complex, and without the right tools and strategies, it can be difficult to identify how licenses affect the work in the aggregate. This includes ensuring that those licenses are compatible and that all obligations are met.
This is because there’s a lot of disparity in the rights granted by the thousands of open source licenses. They carry different obligations, which are activated depending on how you use the software and how it is used in conjunction with open source software under different licenses.
Let’s review two important buckets of data. Open source software licenses are either permissive or copyleft licenses. We found that permissive licenses made up, on average, 63% of the codebase. Permissive licenses, as their name suggests, are the licenses with the fewest obligations. They ensure the freedom to use, modify and redistribute, while also allowing proprietary derivative works. They present few issues with compatibility: in general, the only requirement is to give credit to the author through proper attribution.
Copyleft licenses carry more obligations – and can often present compatibility issues. To copyleft a program, according to the GNU Project, “we first state that it is copyrighted; then we add distribution terms…(to) give everyone the rights to use, modify and redistribute the program’s code or any program derived from it, but only if the distribution terms are unchanged.” There are weak copyleft licenses and strong copyleft licenses.
Weak copyleft licenses make up about 20% of the codebase, on average, for the organizations in our report. These licenses carry obligations beyond simple attribution requirements, depending on whether the open source code is modified, how it’s linked, and how it’s packaged and distributed.
Our analysis revealed that some 12% of the codebase is under a strong copyleft license. These licenses carry all the obligations of weak copyleft licenses, and also require that when work under a strong copyleft license is combined with work under a different open source license, the combined work be published under a compatible license. This is why strong copyleft licenses are called “viral”: they have a so-called viral effect on other licensed work.
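As a minimal sketch of how those three buckets might be tallied for a single project, the snippet below maps a handful of well-known SPDX license identifiers to the categories described above and computes each category's share. The mapping is deliberately tiny, the inventory is hypothetical, and the share is counted per component rather than per line of code, which is a simplification and not the method used in our report.

```python
from collections import Counter

# Illustrative mapping from SPDX license identifiers to the buckets above.
# A real mapping would cover far more licenses.
LICENSE_CATEGORY = {
    "MIT": "permissive",
    "Apache-2.0": "permissive",
    "BSD-3-Clause": "permissive",
    "LGPL-2.1-only": "weak copyleft",
    "MPL-2.0": "weak copyleft",
    "GPL-3.0-only": "strong copyleft",
    "AGPL-3.0-only": "strong copyleft",
}


def category_shares(component_licenses):
    """Return each category's share of components, as a percentage."""
    counts = Counter(
        LICENSE_CATEGORY.get(license_id, "unknown")
        for license_id in component_licenses.values()
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical inventory: component name -> declared license.
inventory = {
    "web-framework": "MIT",
    "http-client": "Apache-2.0",
    "image-codec": "LGPL-2.1-only",
    "report-engine": "GPL-3.0-only",
    "legacy-parser": "Custom-1.0",  # not in the mapping, so "unknown"
}

if __name__ == "__main__":
    for category, share in sorted(category_shares(inventory).items()):
        print(f"{category}: {share:.0f}%")
```

Anything that falls into the "unknown" bucket is worth a closer look in its own right, since an unidentified license may carry obligations nobody has reviewed.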
We quickly begin to see how complex it can be for a developer to know whether the open source code they’re using is compatible when it comes to its licensing terms. It’s not surprising, then, that one out of every eight license compliance issues uncovered by audits was considered P1: a high-severity issue involving a strong copyleft license that requires immediate attention.
Software composition analysis automates the tracking of open source components in use and their licenses. By collecting all information in a central database, setting rules and automating alerts, organizations can prevent licensing issues, including ones born of incompatibility, and reduce the risk those issues expose them to.
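The general idea of an inventory, a set of rules and automated alerts can be sketched in a few lines. This is a toy illustration and not how any particular software composition analysis product works: the `Component` type, the license sets, the `evaluate` and `scan` functions, and the P2 and P3 labels are all assumptions made for the example. Only P1 for strong copyleft comes from the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    license_id: str  # SPDX identifier, or free-form text if none is known


# Hypothetical policy sets; a real policy would be far more complete.
STRONG_COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}
WEAK_COPYLEFT = {"LGPL-2.1-only", "LGPL-3.0-only", "MPL-2.0"}
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}


def evaluate(component):
    """Return a (priority, message) alert, or None if no alert is needed."""
    label = f"{component.name} {component.version}"
    lic = component.license_id
    if lic in STRONG_COPYLEFT:
        return ("P1", f"{label}: strong copyleft license {lic}, review now")
    if lic in WEAK_COPYLEFT:
        return ("P3", f"{label}: weak copyleft license {lic}, check linking")
    if lic in PERMISSIVE:
        return None  # attribution is still required, but no alert is raised
    return ("P2", f"{label}: unrecognized license {lic!r}, identify it")


def scan(inventory):
    """Apply the rules to every component and return alerts, P1 first."""
    alerts = [alert for alert in map(evaluate, inventory) if alert is not None]
    return sorted(alerts, key=lambda alert: alert[0])


if __name__ == "__main__":
    # Hypothetical central inventory of components in use.
    inventory = [
        Component("web-framework", "4.2.0", "MIT"),
        Component("image-codec", "1.9.1", "LGPL-2.1-only"),
        Component("report-engine", "2.0.3", "GPL-3.0-only"),
        Component("legacy-parser", "0.7.0", "Custom-1.0"),
    ]
    for priority, message in scan(inventory):
        print(priority, message)
```

Running a check like this on every build, rather than once before release, is what lets the alert reach a developer while swapping out a component is still cheap.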
Companies are not aware of 95% of their compliance issues. By pushing software composition analysis further left and embedding detection rules and alerts early in the design phase of the software development lifecycle, you can eliminate IP and vulnerability risk and empower developers.