The Bibliothèque et Archives nationales du Québec (BAnQ) is creating a comprehensive database of cultural and government content intended to improve how artificial intelligence systems represent Quebec. The project, which has entered its experimental phase following a successful feasibility study, aims to address the current limitations of AI in accurately representing Quebec’s diverse society, economy, and Indigenous languages.
Addressing AI’s Limitations
Often, major generative AI systems struggle with the portrayal of Quebec’s unique identity and cultural nuances. The lack of substantial data pertaining to the province in existing AI training datasets has been a recognised issue. Valérie D’Amour, who conducted the feasibility study, highlighted that “all scenarios are a little bit on the table right now” as the team seeks to explore various ideas and validate possibilities with stakeholders across the cultural sector.
Marie Grégoire, the president and CEO of BAnQ, emphasised that the primary objective of this initiative is to ensure that artificial intelligence systems accurately reflect the intricate tapestry of Quebec’s society and culture. “That means having Quebec references, whether in small models or large models, whether they come from research or from the business community,” she stated.
A Controlled Approach
BAnQ has clarified that the proposed platform will not serve as a public distribution channel for creative works, with strict controls on data access. This measure aims to safeguard the intellectual property of creators and ensure that their contributions are appropriately recognised and compensated. The database is envisioned not merely as a repository but as a strategic asset that will help establish guidelines for how local content is identified and catalogued within AI systems.

Copyright concerns have persisted as the project progresses. Grégoire asserted that the new platform could give creators better protection than the current landscape, which she described as “a bit like the Wild West.” By centralising data and streamlining compensation mechanisms, BAnQ hopes to create a sustainable model for the cultural sector.
Cultural Representation and Risks of Bias
Quebec’s culture has long been underrepresented in the datasets currently utilised by AI technologies. Destiny Tchéhouali, co-holder of a research chair focused on French-language artificial intelligence at Université du Québec à Montréal, cautioned that the existing biases in AI could exacerbate the marginalisation of Quebec’s cultural narratives, particularly those of Indigenous peoples. “We run the risk of reproducing linguistic biases and cultural biases,” he explained, highlighting the need for a database that accurately reflects the province’s diversity.
As BAnQ moves forward, it plans to begin by integrating its own collections before considering contributions from external sources. This gradual approach aims to build a robust foundation for the eventual expansion of the database.
Financial Support and Future Prospects
The initiative stems from a recommendation in a 2024 report by Quebec’s innovation council, which identified the limited availability of data on Quebec in AI training datasets as a significant barrier. The feasibility study estimates that the project will require a budget of approximately $8.5 million over the next five years, with the aim of making the platform operational by 2029. The Quebec government has already allocated $200,000 for the feasibility study and an additional $600,000 to support the upcoming experimental phase.
Despite the potential benefits, concerns linger among artists regarding the implications of contributing to AI training datasets. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, noted that many creators fear that their contributions could ultimately undermine their livelihoods. “Even if artists earn income from it, they are still feeding the beast that will eventually be used to replace contracts they may lose because of AI,” he stated.
Why it Matters
The development of this database represents a crucial step towards ensuring that Quebec’s rich cultural heritage is adequately represented in the digital age, particularly within AI systems. By harnessing local data, BAnQ aims to combat biases that have historically sidelined Quebec’s cultural narratives. This initiative not only promises to enhance the understanding of Quebec society within AI frameworks but also seeks to protect the rights and livelihoods of creators in an increasingly automated world. As the province navigates the complexities of technology and culture, this project could serve as a model for similar efforts globally, ensuring that diverse voices are not just heard but celebrated.