Licenses and Copyrights
Table of Contents
All of your work remains your own
Neither the Center for Open Science (COS), the Open Science Framework (OSF), nor the URE Methods Repository makes or retains any claim to the copyright or intellectual property for content you post on these sites. However, you grant COS a license to “use, publicly display, and publicly perform such content Content, subject to the Privacy Settings the Administrator or Proxy selects” (COS Terms of Service, Section 6). Please select an appropriate Privacy Setting when you create a project if you feel these terms are too broad. Further, the previous statements do not extend to content that is hosted outside the OSF or the URE Repository, and so material that already has a more restrictive copyright should only be linked to from your project.
All viewable content is available for personal use
All publicly viewable material on OSF is licensed, at minimum, for personal use. You may download it and share it with others. Further, it is possible that for material in the repository has a permissive license that allows you to modify it and/or incorporate it into your own work: please pay attention to the licensing terms.
Open licensing can allow you and others to (a) redistribute the content as you received it, (b) modify the content and distribute the derivative product, and/or (c) incorporate the content as a building block for your own work. In virtually all cases, you do also need to attribute the original authors for their original contribution.
Features of Licenses
While each license has particular characteristics, there are three major features that open licenses have, and we will refer to them by the following terms (and color code phrases that align to them) throughout this document. In the text below, we use the term “product” to refer to the intellectual property that you or another have created. This includes the text or content you enter into the OSF platform, documents subject to copyright you upload or provide via OSF storage, as well as source code, results, and data.
In nearly all licenses, attribution must be maintained through use, any sharing or redistribution, and even if the original product has been substantially modified by another user.
When the product (or parts of the product) are redistributed without modification or included as an unmodified component within another product, it usually is sufficient to include the original license along with relevant details to ensure a third party can identify what is from the original authors and what is an entirely new product by the new authors. This is particularly important if the new authors use a different license for their changes and additions.
When the product is modified and redistributed with a now combined authorship, the original license notice and authorship should be preserved and the license notice for the modifying authors added. When there is no share-alike provision, it is possible for the modification to be provided under a different license than the original product.
Redistribution requires the same terms. This is also known as Copy-Left or Viral licensing. All uses or distributions of your work must be within the same scope of licensing terms that you provide. Such licenses define the terms for products that build off of them; licenses that do not have a “share-alike” property allow someone to use your product as a basis for another product and then distribute their product under a different license, which may be propriety or for commercial gain. The new license may restrict redistribution or modification of the new product even though yours allowed it.
The share-alike property is a way to ensure people who benefit from you freely sharing your work will also freely share their work, but in counterpoint, it can lead to a lower reuse of your product as many users will not want to use such a license for products that are, or could be, eventually used in commercial or even non-commercial proprietary software. As perhaps the most famous example of this avoidance, Apple built MacOS off of BSD Unix instead of the more popular Linux because the former was distributed under the BSD License (with no share-alike clause) and the latter under the GPL (with a share-alike clause).
This governs whether others may modify your work and redistribute the modified version and/or incorporate the modified version into their own products. In traditional views of copyright, modification is prohibited, but is is extremely common in community-oriented licensing under the idea that others will contribute improvements or customize it to their own context. The Modification characteristic is separate from Attribution because the modified version often must retain a copy of your original license and any copyright alongside the new license and copyright that applies to the modified product. Note that the new license may be different for the modified product unless there is a share-alike clause as well.
How to Choose a License
Licenses come in many flavors and many formats. For significant intellectual property questions, you should consult a lawyer. This is a only a brief primer to help those who are ready to start building out a project for the URE Methods Repository.
Intellectual property concerns differ based on the format:
- For text and the content of the OSF projects, we recommend one of the Creative Commons licenses, which are well suited for copyright issues and we describe below. However, Creative Commons licenses should not be used for other types of products.
- For programmatic code and other products that can be used for personal gain or in which intellectual property and patent issues may be at play, there are a significant number of licenses and we describe several of the major ones.
- For Data Sets, licensing, usage, and privacy policies are still emerging. The Creative Commons licenses may work; we discuss additional issues below.
OSF provides a set of default licenses that can be selected for a project as well as the ability to link to any other license, so long as it provides COS and OSF the rights to show, display, and allow users to access the material. We recommend adding a license.txt file to the folder in addition to selecting a license from the project information pane for any OSF project that extends beyond wikis and includes files that are available for download. OSF also provides a general article on licenses here (external link).
Licenses for Documents and General Content
The Creative Commons (external link) details a series of licenses well suited for documents and the content of your OSF projects and are well suited for text and documents or anything subject to Copyright. It is not appropriate for source code or products in which patent issues or other intellectual property issues are at play (see the next section).
CC BY (link) stands for Creative Commons, BY attribution, allowing the use and redistribution of the material in any form, so long as attribution is provided to the original authors.
CC BY-SA (link), Share Alike: All uses or redistribution of the material must be with the same licensing terms as the original license, and attribution must be provided to the original authors. This license by itself allows for the modification of the original content and for the incorporation into other products, but the license for the modification or those product must also allow for redistribution in the same manner.
CC BY-NC (link), Non-Commercial: In addition to attributing the material to the original authors, any use or redistribution of the material must be for non-commercial purposes; however, this license allows for free modification of the content and the new product may be redistributed under a different license, and the new license could be commercial or proprietary.
- CC BY-ND (link), No-Derivatives (i.e., No Modification): Others can use and distribute the material as long as they attribute it to the original authors, but they cannot modify it. The content to be incorporated into commercial products and for the new product to be redistributed under a different license. No-Derivatives is most common if you have created a standalone component that can be built into another product, but you do not want others to edit it.
- CC BY-NC-ND (link): The material can only be used and redistributed in noncommercial contexts and cannot be modified. Additionally, attribution to the original authors must also be preserved.
- CC BY-NC-SA (link): The material can only be used in noncommercial contexts, any derivative product must be licensed in a noncommercial context, and attribution to the original authors must also be preserved.
- CC0 (link) is the Creative Commons Zero License, i.e., Public Domain. This should be used if you waive all rights, including the right to be cited or for the work to be attributed to you. The Creative Commons provides details on releasing material into the public domain here (external link).
Licenses for Source Code, Programs, or Technological Products
Creative Commons explicitly states that it is not well suited for program code or other products in which intellectual property may be at play. There are a wide variety of licensing options that allow you to share such products while restricting (or not restricting) how they can be used by others in future products. You can use the Choose A License (external link) navigator for more guidance on choosing a license to meet your needs.
Below, we provide a brief description of the major open-source licenses and features. We recommend that you save the license text to a license.txt file and upload it to your OSF Project Storage, even if you select a license from the project description.
The GNU Licenses
The GNU Licenses are a series of related licenses that allow modification and require the original license to be distributed with any derivative product. They also have a share-alike requirement specifying all of the source code of any derivative or modified product to also be made available. In short, the GNU licenses require any future use to be as open and community-centered as your original use, and is likely the license to use if you want to enforce that requirement – but see the discussion in the Share Alike section above for how that can lead to lower usage.
The GPL, 3.0
The GNU General Public License (GPL) 3.0, (license link and appendix/usage link) is a strong, share-alike license that requires the complete source code to be available for any subsequent product that uses or modifies the original product as well as any larger works that incorporates it. Attribution must be preserved, and contributors to the product expressly grant patent rights.
The LGPL, 3.0
The GNU Lesser General Public License (LGPL) 3.0, (link) is very similar to the GPL except in its share-alike term. The LGPL enforces a share-alike restriction for work that modifies (i.e., is derivative) of the original product, but allows someone that only “uses” or “links against” the product as-is and without modification the ability to distribute the larger product under a different license. This allows a future user to write a new program that use the unmodified, LGPL-licensed module or library without having to license the new product with the LGPL or GPL.
The Apache License, 2.0
The Apache License, 2.0 (license link, and appendix/usage link) allows modification and requires any redistribution to include the original license and apply a modification notice to all modified files. There is no share-alike requirement, but you are required to release all unmodified parts of the original product under the original license. So, derivative work or modifications can be redistributed under a difference license, and can be redistributed without the source. The main difference between this and the MIT/BSD licenses is it is more explicit about the terms, particularly in granting patent rights for derivative products and including an indemnification provisions and a patent termination provision for misuse. In a review of free software licenses (external link), the GNU Organization comments that these clauses are useful to “protect users from patent treachery,” and so users may find it preferable to the BSD or MIT license if they don’t want a share-alike requirement.
MIT and BSD Licenses
BSD and MIT Licenses require attribution, allow for modification, but do not have a share-alike requirement, and so others can incorporate them into other products that are distributed under other licenses, including proprietary and commercial ones. Further, these licenses do not require the source code to be redistributed in derivative products. These are very short and simple licenses; a casual interpretation of the BSD and MIT Licenses is, “Do whatever you want, don’t sue me, and tell people that I was an original author.”
The MIT License
The MIT License (link) requires future uses, modifications, or redistribution of your work to include the license terms and original copyright notice, but permits any other use.
The BSD 3-Clause "New/Revised" License
The BSD 3-Clause “New/Revised” License (link) requires the copyright license, disclaimers, and warranty to be maintained in any redistribution and states a derivative work cannot use the original authors’ names without their permissions (to avoid the perception that they endorse the new/derivative product).
The BSD 2-Clause "Simplified" License
The BSD 2-Clause “Simplified” License (link) removes the endorsement restriction about the original authors in the 3-Clause license. Arguably, the difference between the BSD 2-Clause License and the MIT License is only semantic. The MIT License provides more explicit detail about the freedoms that future users have and the BSD 2-Clause is more implicit as it doesn’t list any limitations.
Issues Specific to Licensing Data and Datasets
Please Note: There are significant security and governance issues when data contains personally identifiable information or if it can be combined with other public data to identify individuals. Those issues are far beyond the scope of licensing and use discussed here. Please do not make data available unless you are confident it is appropriately anonymized and/or deidentified, or you have agreements for use with those you are giving it to.
Core tenets of public research focus on the importance of being transparent about methods and data to collaborate, build off each others’ works, and establish professional dialogue about what the data mean and how best to analyze it. This, of course, requires that data and datasets are also available more widely than is the current norm.
Features of Data Requiring Attention
The character of data and datasets are very different from both text and source code, and so traditional copyright and patent issues may not apply as would seem sensible. Resorting a dataset could constitute “modification” even if none of the data is modified, and deriving new datasets in which the data is aggregated or transformed may not fall under traditional domains. Below are several key features and issues that you may wish to attend to when considering how to license your data for future use.
Structure vs. Content
Data licenses should attend to the difference between the content and the structure of the data, especially if you wish to restrict use or ensure that people do not redistribute the data with modifications. For simple data like spreadsheets, the structure includes the ordering of the rows, columns, and potentially tabs. For databases, the structure includes the database schema.
A Modified Dataset is one in which the data itself has been altered. If you have a dataset with a “Grade” field in which Kindergarten is represented as “K”, changing it to “0” to give you an interval field is modifying the dataset.
A Derivative Dataset is one that adds or removes data from the original dataset, but does not substantially change the data itself. Removing columns or rows to anonymize the data or augmenting one dataset with new columns that you connected from another dataset would lead to a derivative dataset. If another party generates a derivative dataset, the new dataset may include attribution to multiple sources, and licenses for redistribution may be subject to multiple conditions based on different licenses for the different sources.
A Derived Dataset is one in what new information is generated through analysis, transformation, or processing. One might create a new dataset that aggregates the first dataset, such as taking raw data on individuals and calculating statistics per day, site, or intervention. If another party generates a derived dataset from your data, there may be questions of ownership, attribution, and rights of redistribution.
A license can stipulate that you own any data derived from your source data, or it may allow the party who derived the data to own it, provided they have a share-alike license or at least license it back to you.
Confidentiality and Compliance with IRB and Terms of Data Collection
If you have collected data with assurances regarding the rights of the subjects or the intended uses of the data, you are obligated to ensure that any reuse, reanalysis, or derivation by other parties follows those same terms, and a general, open data license may not suffice, or it may require you to aggregate, anonymize, or deidentify the data before making it available.
Similarly, if you generate a derivative dataset in which one of the sources has confidential requirements or restrictive licenses, you are obligated to apply that to the derivative data; however, it may be the case that the license only applies to the derivative or derived data that relates to the sensitive data and that the new dataset has difference licenses that govern different tables or columns. Ensure that the different and possibly overlapping restrictions are clear if you distribute the derived or derivative data.
We recommend that you contact a resource on Data Governance if you need to discuss these issues.
Licenses for Data and Datasets
There are several organizations that have created permissive licenses to share data openly, and several options are summarized below.
Note: If you provided a guarantee, agreement, or contract to people from whom you collected data about the possible uses of the subsequent data, it is unlikely that an open license will cover your case as written. You will need to write a specific license that include those terms for future users and derivations, and you will likely need to include a share-alike clause to ensure that any derivations are shared under the same terms.
Open Data Commons
The Open Data Commons (external link) provides resources for the legal issues around open data, which pays attention to the difference between the “database” and its “contents.” The Open Data Commons has three licenses that cover data releases to the public domain, that require attribution only, and that have both attribution and share-alike clauses:
- The Public Domain Dedication and License (PDDL) (link) releases the data and database into the public domain and places no restrictions on its use. Data/bases in the public domain do not require attribution to the originator.
- The Attribution License (ODC-By) (link) allows others to use the data and database without restriction, so long as they attribute any public use or redistribution of it, including all works derived or produced from the database, to the originator. However, derivative or derived data do not need to be similarly shared made public, so long as the source is attributed to the originator.
- The Open Database License (ODbL) (link) allows any public use of the database, including derivations and redistributions, so long as the source is attributed to the originator, any work adapting or deriving from the data is also offered the ODbL license, and the derivative work is openly available. Note that others are allowed to restrict these derivative works in some contexts as long as they also provide an openly accessible version: i.e., distributing the data within a proprietary program would be acceptable if the data is also openly available on the website.
Can I Use Creative Commons for Data Licensing?
Creative Commons states that its licenses are suitable for open and publicly available data. However, while it may be acceptable for small and homogenous datasets, it does not attend to the nuances of data discussed above, such as the difference between the structure/model and the contents.
Thus, while CC0, CC-BY, and CC-BY-NC license likely function as intended, the No-Derivatives (ND) clause likely would make the dataset useless as it would suggest users could not make even simple changes such as resorting the data based on different column values.
Including Work on OSF that is Copyrighted or has a Restrictive License
If you would like to include work that is already under a restrictive copyrighted, such as published journal articles in which you do not retain the rights, you have a couple options.
- You can paste a link directly to the where the article is hosted. Of course, if it is behind a paywall, most people will not be able to access it.
- To avoid confusion, you can edit the Citation section to refer to the published article. However, if you are providing other content that users may rely on that is not part of the published article, you may want to provide a citation for your commentary instead.
- You can provide summary, review, or descriptive information in the wiki section to accompany the link. You can also provide supplemental information such as analytic schemas, codebooks, interview protocols, or datasets for others.
- If you are allowed to provide access to drafts or preprints, you may include them in the File Storage section of an OSF project.