OSS continued to devour the software world in the 2010s. Richard Kemp, partner at Kemp IT Law, looks at where we’ve come from and where we’re going in the open source world of the 2020s.
As far back as 2013, code audit provider Black Duck observed that “if software is eating the world, open source is eating the software world”.[i] Since then OSS has continued to carry all before it, and in its 2019 analysis[ii] Synopsys (which acquired Black Duck in December 2017) noted of the codebases it had scanned that:
- OSS was present in 96% of scans, rising to over 99% where the codebase exceeded 1,000 files;
- OSS made up more than 50% of the codebase across 13 out of 17 industry sectors (the exceptions were telecoms, manufacturing, transportation and EdTech); and
- the most common OSS components were jQuery (in 56% of codebases) and Bootstrap (40%).
In the 2010s the key benefit of OSS shifted from cost to speed. Research company Forrester noted in 2019:
“Developers face the challenge of creating differentiated, customized, and compelling customer experiences quickly. As a result, they no longer write all of their own code to solve every problem. Instead, they assemble, configure, and automate their code and often rely on common open source components to quickly add application functionality.”[iii]
Over the same period key OSS risks also shifted, from licence compliance to security vulnerability. Licence compliance developed in the 2000s:
“as companies played catch up to remedy a landscape of rampant non-compliance … often in panic, at great expense and managerial angst”.[iv]
But, as the time between disclosure of a vulnerability and its exploitation was constantly shrinking, the continuous rise in OSS uptake led to growing concerns around the time taken to remediate these OSS security issues. In addressing them, the OSS code audit providers became the natural home for OSS security vulnerability auditing and remediation, adding them to their OSS licence, policy management and reporting services as Software Composition Analysis (‘SCA’) providers.[v]
Within the OSS world there were big changes too, and the 2010s saw a notable rise in the popularity of permissive licences (up from 41% in 2012 to 67% in 2019) and a corresponding drop in use of copyleft licences (down from 59% in 2012 to 33% in 2019).[vi] In particular, between 2012 and 2019 the permissive MIT licence rose from 11% to 27% and Apache from 13% to 23%.[vii] As shown in Figure 4 below, WhiteSource’s 2020 OSS Licence Guide ranked the top 10 OSS licences in 2019 by share as permissive (MIT, Apache-2.0 and BSD), copyleft (GPLv3, GPLv2 and LGPLv2.1) and weak copyleft (Microsoft Public and Eclipse).
Figure 4: Top 10 OSS Licences in 2019 by share (source: WhiteSource)
The GPL family of licences
By way of quick reprise, OSS is software provided under licence granting the licensee certain freedoms – the difference between OSS and other software lies not in the code but in the licensing terms applied to the code. As espoused by the Free Software Foundation (‘FSF’) set up in 1985 by ex-MIT academic Richard Stallman, these four freedoms are to develop software that is free (i) to run for any purpose, (ii) to be studied and adapted through source code access, (iii) to be redistributed and (iv) to be improved, and for those improvements to be freely redistributable.
Although OSS licensing has raised complex questions around patents, the key innovation in the GPL[viii] family of licences adopted by the FSF was ‘copyleft’ or ‘inheritance’, the idea that the freedoms guaranteed by the GPL would also apply to new works derived from the original GPL-licensed software. In the FSF’s words:
“copyleft is a general method for making a program free software and requiring all modified and extended versions of the program to be free software as well.”[ix]
The legal propagation mechanism for this idea is the GPL licence term that states, broadly, where you modify software originally licensed under the GPL and pass on the modified software, the modifications must also be licensed under the GPL.
That the copyleft licensing term ‘works’ is now no longer in doubt, although complex questions remain as to the precise border between modifications that trigger copyleft and those that do not. The copyright analysis of some of these questions is before the US Supreme Court in Google v Oracle, analysed at Section B. The border also depends to an extent on whether it is GPLv2[x] (published by the FSF as long ago as 1991) or GPLv3[xi] (2007) that applies.[xii]
Briefly under GPLv2, the copyleft trigger is ‘distribution’; and when the trigger is pulled, the answer to the question ‘what does GPLv2 cover?’ is (i) the original program licensed under GPLv2 and (ii) any ‘work based on [that] program’ (effectively, a derivative work under copyright law). Under GPLv3, the copyleft trigger changes from ‘distribution’ to ‘conveying’ and when the trigger is pulled, the answer to the question ‘what does GPLv3 cover?’ is (i) the original program licensed under GPLv3; and (ii) the ‘Corresponding Source’ (as defined in GPLv3). Effectively, the language of GPLv2 around ‘distribution’ and ‘work based on the program’ changes in GPLv3 to ‘conveying’ and ‘Corresponding Source’.
GPLv2 largely predated ASP (application software provision, or accessing my software on your remote server) and SaaS (software as a service, or accessing your software on your remote server). The correct interpretation of GPLv2 has always been considered to be that use of GPL software on an ASP or SaaS basis is not ‘distribution’ within GPLv2. GPLv3 put this point beyond doubt by stating expressly that:
“mere interaction with a user through a computer network, with no transfer of a copy, is not conveying”.
Applying copyleft to SaaS to remove this gap was addressed by the GNU Affero GPL version 3 of 2007 (‘AGPL’).[xiii] The AGPL is identical to GPLv3 except for clause 13, which states that making software available over a network triggers the copyleft:
“… if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge ….”
Clause 13 effectively applies a double condition for copyleft to apply where the AGPL is used. The first is use by remote interaction through a computer network (i.e. covering ASP and SaaS). The second is that the AGPL program has to be ‘modified’. SaaS use of an unmodified AGPL program – most SaaS use – therefore does not trigger copyleft.
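The triggers summarised above can be sketched as a small decision function. This is an illustrative simplification of the summary in this section only, not a statement of how any court would apply the licences: the `Use` record, its field names and the `copyleft_triggered` function are all hypothetical, and the sketch treats ‘distribution’ (GPLv2) and ‘conveying’ (GPLv3 and AGPL) as a single yes/no flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Use:
    """Hypothetical description of how a GPL-family program is being used."""
    licence: str               # "GPLv2", "GPLv3" or "AGPLv3"
    modified: bool             # has the licensee modified the program?
    copy_transferred: bool     # is a copy passed on (distributed / conveyed)?
    network_interaction: bool  # do users interact with it remotely (ASP / SaaS)?


def copyleft_triggered(use: Use) -> bool:
    """Return True where, on this section's summary, copyleft is triggered."""
    if use.licence not in {"GPLv2", "GPLv3", "AGPLv3"}:
        raise ValueError(f"licence not covered by this sketch: {use.licence}")
    # Passing on a copy is the trigger under all three licences
    # ('distribution' in GPLv2, 'conveying' in GPLv3 and the AGPL).
    if use.copy_transferred:
        return True
    # AGPL clause 13 adds a double condition: remote network interaction
    # AND a modified program. GPLv2 and GPLv3 have no equivalent.
    if use.licence == "AGPLv3":
        return use.modified and use.network_interaction
    return False


# SaaS use of an unmodified AGPL program: no copy passed on, no modification.
unmodified_saas = Use("AGPLv3", modified=False, copy_transferred=False,
                      network_interaction=True)
# The same service after the operator modifies the program.
modified_saas = Use("AGPLv3", modified=True, copy_transferred=False,
                    network_interaction=True)
# A modified GPLv3 program offered only as a service.
gplv3_saas = Use("GPLv3", modified=True, copy_transferred=False,
                 network_interaction=True)
```

On these assumptions only `modified_saas` triggers copyleft. Whether the `modified` flag is satisfied in a real case is the hard part, and is a copyright law question rather than something code can decide.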
‘Modify’ is defined in the AGPL (as it is in GPLv3) as meaning to:
“copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy.”
What this means in turn throws you back on the detailed, technical copyright law questions of the type being considered in Google v Oracle:
- has code been copied?
- if so is that copying substantial?
- has functionality been replicated in a way that potentially infringes copyright?
- does the fair use doctrine apply?
In the international context (where ‘modifying’ AGPL code takes place outside the US), the following additional questions also apply:
- which country’s laws will apply to determine infringement? (This is normally the country where the alleged infringement took place. The question is relevant because ‘adaptation’ in a copyright sense has different meanings under UK and US law for example); and
- where the AGPL is invoked by way of defence to infringement proceedings, under which US state’s or country’s laws will the AGPL be interpreted as a contract or copyright licence?
The AGPL achieved early success, being adopted by the popular ‘NoSQL’ cross-platform database program MongoDB in 2009. However, uptake has been inhibited by challenges in extending conventional OSS licence compliance and internal policies – which largely addressed distribution outside the organisation – towards imposing internal use controls on the ‘as a service’ access envisaged by AGPL clause 13. AGPL adoption was also dealt a blow when MongoDB, Inc. moved away in October 2018 from the AGPL to a new licence (MongoDB’s Server Side Public License[xiv]) so that now, according to TechRepublic, AGPL’s share of open source projects is “virtually zero (as in “none”).”[xv]
The Cryptographic Autonomy Licence (CAL)
One of the most recent OSS licences to be approved by the Open Source Initiative (‘OSI’) as meeting the requirements of the Open Source Definition (‘OSD’) is Holochain’s Cryptographic Autonomy Licence (‘CAL’). By way of background, Holochain is:
“an energy efficient post-blockchain ledger system and decentralized application platform that uses [P2P] networking for processing agent centric agreement and consensus systems between users.”[xvi]
On 20 February 2020 Holochain released the CAL as “the first licence specifically designed to protect end users’ rights and ownership of data and control of their cryptographic keys – and by extension their security”. The licence can be described as ‘AGPL +’ as, like the AGPL, it requires a redistributor providing either the software or access to it over a network to make available the source code for any modifications it makes (clause 2.1) under the terms of the CAL or a compatible licence (clause 4.1.2).
The ‘+’ in the CAL is a novel requirement at clause 4.2 to “maintain user autonomy” of data processed using the Holochain software: effectively, the licence bars use of Holochain with distributed ledger applications that restrict a user from accessing cryptographic keys controlling their own data:
“We want Holochain apps to be trusted as maximizing end-user autonomy and control. As that starts to happen, we can’t let someone claim their software is a “Holochain” app if they are actually maintaining central control of end-user cryptographic keys.”[xvii]
The CAL went through a number of iterations before OSI approval and prompted lively debate around the implicit extension of OSS freedoms and principles beyond software copyright to APIs and data. The last point to be settled was whether requiring sharing of the user’s own data was compatible with the bar on field of use restrictions at OSD paragraph 6. Although the OSI decided that it was compatible, the point caused sufficient controversy to lead to the resignation of OSI co-founder Bruce Perens in January 2020 shortly before the CAL was approved.
The decline of the GPL licence family from a share of just under 60% in 2012 to 33% in 2019, coupled with its effective absence from the SaaS world, is striking and shows little sign of changing. Developer convenience and the desire to avoid complexity, whether in licensing or platforms, appear to be behind this powerful trend. It would certainly explain why the MIT License – at less than 200 words[xviii] – is the most popular software licence on the GitHub software development and version control platform. In fact, as the TechRepublic article noted, “this shift toward permissive licensing has become so pronounced that on GitHub it’s still far too common for projects not to have a license [and] the GitHub generation is having to be coaxed into slapping on a license at all.”
This blog was first published as part of the white paper which you can read in full here: Algo IP: Rights in Code – 2020 Update