Overview
UCF 4.0 is collaborative in nature, so we are targeting users who hold the following beliefs:
Ethical data usage is paramount in AI, ensuring respect for privacy and individual rights.
Data sovereignty is a right, empowering owners to control their data’s usage.
Innovation thrives under clear, fair data licensing, stimulating safe and legal AI development.
Collaboration is essential, and federated data licensing fosters a community of shared knowledge and resources.
Intellectual property protection is vital, and licensing guards these rights in the AI domain.
The integrity and quality of data underpin reliable and effective AI technologies.
Open access and commercial interests must be balanced for mutual advancement in AI.
Fair compensation and recognition for data providers are crucial for a sustainable AI ecosystem.
Public trust in AI is built on responsible and transparent data use.
Anticipating and adapting to future legal challenges ensures resilience in AI innovation.
We will use these beliefs as the cornerstone for our UCF 4.0 persona we’ve named Alex River (found HERE).
We need to protect their data and their rights while providing for fair usage of content for AI purposes. With that in mind, all shared content within UCF 4.0 must follow the tenets set forth in the Federated Data License.
We must also provide information about the license and training on the license, and a license must be created for each organization that wishes to be a contributor within UCF 4.0 and beyond.
This PRD focuses on these issues.
Background
In the swiftly advancing realm of artificial intelligence (AI), data serves as the cornerstone of ingenuity. Our users, those who believe the tenets above, are spearheading a critical reassessment of data governance in AI, advocating for a Federated Data license. This license stands as a testament to the ethical utilization, equitable distribution, and safeguarding of creative intellect within the AI sector.
Navigating the Terrain of Creative Commons
The spectrum of Creative Commons licenses has historically offered a structured model for content creators to delineate the permissible uses of their works. These licenses, recognized for their adaptability and lucidity, include specific stipulations identifiable by the acronyms BY, SA, ND, and NC. Each plays an essential part in the proliferation and application of creative content.
The Attribution license (BY) mandates the acknowledgment of the creator, honoring the intellectual labor and originality infused in the work.
ShareAlike (SA) ensures derivatives remain under similar terms, promoting an environment of collective innovation.
NoDerivatives (ND) prohibits sharing adapted versions of the work, preserving the creator’s vision.
NonCommercial (NC) restricts the use of works for profit, maintaining the works’ integrity for communal benefit.
These licenses have effectively served traditional creative fields, but AI’s unique nature calls for a more bespoke approach—a Federated Data License.
The Imperative for a Federated Data License
AI is intrinsically generative, learning from extant data to forge novel creations. While the Creative Commons licenses safeguard individual rights effectively, they fall short of addressing the distinctive generative aspect of AI. This gap is precisely where the Federated Data License becomes vital.
Enabling Ethical Innovation
For visionaries like Alex Rivera, ethical innovation is imperative. A Federated Data License would lay down a definitive framework for AI developers, allowing ethical use of collective data. It would embody the spirit of BY, attributing the original data sources, while fostering AI’s generative nature under harmonized conditions that resonate with the collaborative essence of SA. This framework would ensure that AI’s new creations honor the contributions of original data curators.
Safeguarding Data Authenticity
In the AI sphere, the sanctity of data is critical. Expanding on the ND clause of Creative Commons, a Federated Data License would ensure AI respects the original data’s integrity, preventing misrepresentation or distortion of the foundational patterns and truths.
Harmonizing Commercial and Public Good
The Federated Data License reconciles the commercialization of AI outcomes with public welfare, as inspired by the NC clause. It would pave the way for monetization that honors both data contributors and the community at large, ensuring just recompense and spurring innovation that propels societal advancement.
Advantages of a Federated Data License
A Federated Data License is not merely an alternative; it is an evolutionary necessity in generative AI for several reasons:
Adaptability: It would be crafted to acclimate to the dynamism of AI, offering a resilient framework supportive of ongoing innovation.
Harmonization: A singular license across the AI domain would mitigate the current jumble of licensing terms, simplifying compliance and understanding of data usage rights.
Clarity in AI Training: It would dispel legal uncertainties in AI model training, emboldening more entities to engage in AI development confidently.
Ethical Benchmarks: The license would embed ethical norms within AI development, ensuring alignment with societal values.
Public Confidence: Commitment to ethical practices under such a license would enhance public trust in AI, a crucial element for its broader acceptance.
The Path Forward
Champions like Alex Rivera view the adoption of a Federated Data License as a commitment to a future where AI is a force for human empowerment and innovation is pursued with profound ethical consideration. It is a recognition that the data we harness must be managed with the greatest care and foresight.
Objective
Imagine standing at a crossroads: one path leads toward passive observation, the other toward proactive governance. Adopting a Common Generative License is a decisive step on the latter path, ensuring that your company’s data narrative unfolds on your terms.
With each sentence in this guide, we construct a blueprint for your organization’s data strategy in the AI epoch. We've navigated the technical depths, charted the course of compliance, and emerged with a strategic vision. Now, the quill is in your hand to script the next chapter of your organization's legacy in the digital age.
We will provide:
A marketing campaign, including SEO, that explains and champions the Federated Data License.
Various one-pagers and other downloadable materials (such as the license itself) for use by our users.
An automated link that creates a bespoke version of the license for the user.
Storage of that license within the UCF 4.0 structure that ties the license to the user’s account for use when sharing content within the UCF 4.0 structure.
Personas
The persona for this is named Alex River and can be found HERE.
Success Metrics
List project goals and the metrics we’ll use to judge success
Goal | Metric |
---|---|
An SEO campaign that successfully integrates our keywords and drives increased traffic to our site. | Measure keyword growth for: federated data, generative AI, AI data licensing, common generative license, data sovereignty |
All organizations or accounts that are signed up for UCF 4.0 and are sharing data will have a generative common license. | 100% of organizations sharing community data within UCF 4.0 |
Users who create the license must have their HubSpot intent score reflect that fact. | |
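The license coverage metric above could be computed along the following lines. This is a minimal sketch only: the Organization record and its field names are hypothetical and do not reflect the actual UCF 4.0 data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Organization:
    # Hypothetical record; field names are illustrative, not the UCF 4.0 schema.
    name: str
    is_sharing: bool   # shares community data within UCF 4.0
    has_license: bool  # has created its license through the form


def license_coverage(orgs: List[Organization]) -> float:
    """Return the percentage of sharing organizations that hold a license."""
    sharing = [o for o in orgs if o.is_sharing]
    if not sharing:
        return 100.0  # nothing is shared, so the goal is trivially met
    licensed = sum(1 for o in sharing if o.has_license)
    return 100.0 * licensed / len(sharing)


orgs = [
    Organization("ACME", is_sharing=True, has_license=True),
    Organization("Example Co", is_sharing=True, has_license=False),
    Organization("Reader Inc", is_sharing=False, has_license=False),
]
print(license_coverage(orgs))  # 50.0
```

Organizations that do not share are excluded from the denominator, which matches the wording of the metric (only organizations sharing community data are counted).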
Current Challenges / Limitations
Landing pages do not exist, nor does the intent score, SEO campaign, blog posts, etc.
The interfaces on all UCF 4.0 products must show attribution (and allow attribution to be added manually for content that is pasted in) and must maintain attribution chains as the content is shared and modified.
The interfaces for all UCF 4.0 products must show the applicable license for that content – to include the common generative license for all material created or transformed by the user organization sharing the data.
Benefits
The common generative license is not just an alternative; it is a necessary evolution. Here’s why it serves as a better fit for the world of generative AI:
Adaptability: AI is an ever-evolving field, and the license would be designed to adapt to new technologies and methodologies, providing a sustainable solution that supports continuous innovation.
Harmonization: A common license across AI would reduce the fragmentation caused by different licensing terms, making it easier for AI practitioners to understand and comply with the data usage rights.
Clarity in AI Training: By establishing clear guidelines for training AI models, a common generative license would alleviate the legal ambiguities surrounding the use of data, thus encouraging more organizations to engage in AI development without fear of inadvertent infringement.
Ethical Standards: Such a license would codify ethical standards into the fabric of AI development, ensuring that AI serves the common good and aligns with society's values.
Public Trust: By demonstrating a commitment to ethical practices, organizations can foster greater public trust in AI technologies, which is critical for widespread adoption and acceptance.
Subscription / Pricing / Billing Impacts
Organizations that wish to share data must create a Federated Data license prior to sharing anything.
Beta and Early Access
Not necessary, but we will share this with our community as it evolves.
Risks and Assumptions
Delays:
In enterprises and organizations with a strong legal underpinning, most employees do not have the authority to sign or create any form of legal commitment, including the sharing of company intellectual property.
Any form of legal commitment needs to be reviewed and approved by the legal representatives of the company.
Legal teams typically will ask for changes in almost any terms and conditions. The bigger a company, the more they ask suppliers to conform to their standards, not the other way around.
Legal needs time to review the documents. Depending on their workload and the priority given to this request, this may add some time to the process.
Lower sharing
Any additional step in a process lowers conversion and completion rates, no matter how well it is done. This step adds extra ‘work’ for the end user and likely for the legal department, who might have other priorities or simply fail to see the value.
Competition / alternatives
The need we have identified is real. There is a risk that other organizations that are in a stronger position to communicate and create a standard will produce a license with the same goal. For example, the Open Source Initiative (OSI) is also in the process of creating a definition of open source as it relates to AI.
Milestones and Phases
Description | Success Measurement |
---|---|
Initial content is created | Content is approved for loading into blogs, etc. |
Fill out button | A button for the user to fill out the appropriate information for the Federated Data license is created, along with whatever hosting page is necessary. When the user fills it out the following happens: |
Landing page created | The landing page is up and running on the website, and the “fill out” button works. |
SEO campaign | Content is posted and the SEO campaign is running, showing improved ranking for the key terms. |
UCF 4.0 integration | Before an organization shares information, it fills out the form and creates a usable license, which is attached to all UCF 4.0 content it creates. All success factors for “fill out button” above apply here. |
Product Requirements
Use Cases
Happy Day Scenarios
As a conscientious contributor, I am able to go to the Unified Compliance website and fill out the Federated Data license with information specific to the organization I represent. Once the information is filled out and I agree to share content with the community within the UC 4.0 product, I (and anyone in my organization using the same workspace) can share content. All content that we share will have our unique license attached, as well as attribution to my organization.
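The scenario above can be sketched as a simple gate on sharing. This is an illustration under stated assumptions, not the product design: the Workspace class, its methods, and the license identifier format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class LicenseRequiredError(Exception):
    """Raised when a workspace tries to share before it has a license."""


@dataclass
class Workspace:
    # Hypothetical model; names are illustrative, not the UC 4.0 schema.
    organization: str
    license_id: Optional[str] = None  # set once the license form is completed
    shared: List[dict] = field(default_factory=list)

    def complete_license_form(self, license_id: str) -> None:
        """Record the license created through the (assumed) web form."""
        self.license_id = license_id

    def share(self, title: str) -> dict:
        """Share content, attaching the workspace license and attribution."""
        if self.license_id is None:
            raise LicenseRequiredError(
                f"{self.organization} must create a license before sharing"
            )
        item = {
            "title": title,
            "license": self.license_id,
            "attribution": self.organization,
        }
        self.shared.append(item)
        return item


ws = Workspace("ACME")
try:
    ws.share("Glossary of terms")
except LicenseRequiredError as err:
    print(err)  # ACME must create a license before sharing

ws.complete_license_form("FDL-ACME-0001")  # made-up identifier
print(ws.share("Glossary of terms"))
```

Because the license is stored on the workspace rather than on the individual user, anyone in the organization using the same workspace can share once the form has been completed, as the scenario describes.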
Rainy Day Scenarios
I forwarded the Federated Data license to my legal department and did not obtain approval. I emailed and called, but I was told they are very busy and I will have to wait.
Requirements
Requirement | Importance | Status | Comments | User Story | Jira Issue |
---|---|---|---|---|---|
Form related | |||||
Ability for potential contributors to go to the UC HubSpot website page and choose an option to fill out the Federated Data license. | P1 | Done | The form is a gated HubSpot form so that only authenticated users can fill it out. | ||
Ability for potential contributors to fill out the Federated Data license with information specific to their organization. | P1 | Done | Note that the form is already built by the marketing team within HubSpot. | ||
Ability to automatically email the license to the user once the form has been completed. | P1 | Done | Part of the HubSpot workflow | ||
Ability to store the form in HubSpot to the organization, once the form is filled out. | P1 | Done | Is the form attached to the company, CCH account, UC 4.0 workspace, or other? Yes. For example here’s the HubSpot URL for the form I filled out. | ||
Ability to update the HubSpot contact intent to reflect that they have filled out the Federated Data license. | P1 | Done | Part of the HubSpot workflow | ||
Ability for the UC 4.0 product to have access to the Federated Data license. | P1 | In design | |||
Ability for authorized users of the workspace to view the license. | P2 | Not started | Anyone in the company should be able to see their own license. | ||
Ability to require the customer to fill out the form prior to contributing content. | P1 | In design | For customers that do not intend to share, there is no need to fill out the Federated Data license. Could be a link to the same website form and, once filled out, would be attached to this workspace. | ||
Ability to forward the draft license to another person in the company (e.g., legal). | |||||
General terms / EULA | |||||
Ability for each user (contributor or not) to confirm they have read the terms (aka EULA). | P1 | Done | |||
Ability to require each user (contributor or not) to re-confirm they have read the terms (aka EULA) if/when the EULA has been updated. | P3 | Not started | |||
Ability for an end user to view the current terms at any time. | P1 | Done | |||
License and Attribution | |||||
Ability to automatically attach the license to any content that is shared with the community. | P1 | In design | Includes any existing content such as PlantUML and glossaries and sets framework for any future content type that would be added later. Will not do in the first stage. | ||
Ability to automatically assign attribution for content sourced from content contributors. | P1 | In design | Includes dictionaries (e.g., OED, MW …) or other content providers like PCI, GRI … | ||
Ability to automatically assign attribution for any content created within the platform. | P2 | In design | Example: if an AT&T user creates a PlantUML diagram, the creation of that diagram is attributed to AT&T. Will not do in the first stage. | ||
Ability to automatically assign attribution for any content transformed within the platform. | P2 | Partially done for Dictionary term definitions. | Initially for any content that a user transforms (e.g., updates a term definition) and will set groundwork for additional transformation to content such as tagging and mapping to common controls. Will not do in the first stage. | ||
Ability for users to manually assign attribution (could be multiple) to content created (or later updated) within the platform. | P1 | Completed for Dictionary terms but not PlantUML | Example: if an AT&T user creates a PlantUML diagram but used sources from two other companies’ websites, the user must be able to cite both sources by name and source URL. | ||
Ability for users to copy shared content and make changes while keeping the attribution chain. | P2 | Need to review | Example: if an ACME user copies and modifies an OED-attributed dictionary term-definition pair, the copied term-definition pair will indicate that OED was the original source and that the updates are attributed to ACME. Will not do in the first stage. | ||
Ability for any user to see the attribution chain within the user interface and within API results. | P2 | Need to review | Example, if an ACME user copies and modifies an OED attributed dictionary term-definition pair, the attribution chain will show that OED was the original source and was modified by ACME. Will not do in the first stage. | ||
Ability for any user to see (through the UI and API response) the license associated with each attributable source per shared content. | P1 | Need to review | Example: if a PlantUML diagram was originally created by AT&T, then copied and modified by IBM, any user can see the attribution chain (in the UI and API response), but also see the licenses from both AT&T and IBM. Example: if an ACME user maps in a PCI authority document, each enrichment task (e.g., tagging terms, matching to common controls) is attributed to ACME. End-users are able to see attribution to PCI as the authority source and ACME as the content enrichment provider including links to licenses for both PCI and ACME. | ||
Ability for any user to view and download their organization’s license from within the workspace (e.g. PDF) to share with legal, compliance, or other teams. | P2 | Will not do in the first stage. |
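The attribution chain requirements above can be illustrated with a small data structure. This is a minimal sketch under assumptions: the Attribution and Content classes, the role labels, and the license labels are hypothetical and are not the UC 4.0 API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Attribution:
    # Hypothetical record; field names are illustrative only.
    source: str   # attributed organization, e.g. "OED" or "ACME"
    role: str     # "original" or "modified"
    license: str  # label or link for that source's license


@dataclass(frozen=True)
class Content:
    body: str
    chain: Tuple[Attribution, ...] = ()  # ordered from original to latest


def create(body: str, org: str, license: str) -> Content:
    """Create content and attribute it to the creating organization."""
    return Content(body, (Attribution(org, "original", license),))


def copy_and_modify(content: Content, new_body: str, org: str, license: str) -> Content:
    """Copy shared content, keeping the existing chain and appending the modifier."""
    return Content(new_body, content.chain + (Attribution(org, "modified", license),))


# License labels below are placeholders, not real license names.
term = create("original definition text", "OED", "OED license")
edited = copy_and_modify(term, "updated definition text", "ACME", "ACME license")

for entry in edited.chain:
    print(entry.source, entry.role, entry.license)
# OED original OED license
# ACME modified ACME license
```

Storing the chain as an immutable tuple means the original item keeps its own single-entry chain when a copy is modified, so both the UI and an API response could show each attributable source together with its license.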
Cost
Estimated effort to fulfill all P1 requirements:
Design and Documentation: 4 weeks
Build Solution: 4 weeks.
Total Duration: 8 weeks.
Estimated Number of resources to fulfill all P1 requirements:
Design Resources: 1.
Build Resources: 3.
Open Questions
List any open questions that come to mind throughout the lifecycle of this project
Question | Answer | Date Answered |
---|---|---|
What happens when a company wants to update their general commons license? | For now they don’t. | 11/22 |
What happens if a company is allowed to update an agreement and already has licensed content under an existing agreement? | Same as above for now. | 11/22 |
What is our stance if an employee fails to adhere to internal regulations, signs the license, and starts sharing even though they were technically not allowed to? | | |
Out of Scope / Future Functionality
List the known features that are out of scope for this project or might be revisited at a later time.
As is the case with the assumptions, it is important to list these out so that architects and engineers can plan accordingly for these later updates.
Impacted Product Components
If this project is a component to other areas or an update to an existing product, specifically call out where this product will interact with other areas.
User Interaction and Design
Link to mockups, prototypes, or screenshots related to the requirements.
Process Flow Diagrams
Links to user journeys, process flow, or other diagrams related to the requirements.
Guides
If there are UI components to this requirement, list the main areas where interactive user guides would be needed.
Additional References
List and link to any other reference sites, documents … that might be important to the reader.