AI Data Licensing: The New Ethical Norm
The way major generative AI tools are trained is changing. Initially, these tools were trained on publicly available data, but now sources of training data are increasingly restricting access and pushing for licensing agreements. As a result, new licensing startups have emerged to keep the source material flowing.
The Dataset Providers Alliance (DPA), a trade group formed this summer, wants to make the AI industry more standardized and fair. The alliance is made up of seven AI licensing companies and has released a position paper outlining its stances on major AI-related issues.
The DPA advocates for an opt-in system, meaning that data can be used only after consent is explicitly given by creators and rights holders. This represents a significant departure from the way most major AI companies operate.
The alliance suggests five potential compensation structures to make sure creators and rights holders are paid appropriately for their data.
Standardizing compensation structures is potentially a good thing, as it helps smooth the road for mainstream adoption. The DPA also endorses some uses of synthetic data, arguing that it will constitute the majority of training data in the near future.
The very existence of the DPA suggests that AI's Wild West days are coming to an end. As the AI industry continues to evolve, it is essential to have standards and guidelines in place to ensure that data is used ethically and fairly.
The rise of generative AI tools has increased demand for data licensing. However, the current system often disregards the rights of data owners. The DPA aims to change this by advocating for an opt-in system that prioritizes creator consent.
Critics of the current opt-out system argue that it is fundamentally unfair to creators because it places the burden of refusing data use on data owners. The DPA's proposed opt-in system would require AI companies to obtain explicit consent from creators before using their data. This shift towards a more standardized approach could ultimately lead to more ethical AI development.