The Bibliothèque et Archives nationales du Québec (BAnQ) is embarking on a project to create a comprehensive database of cultural and governmental content. The initiative seeks to give artificial intelligence (AI) systems a deeper understanding of Quebec’s society, culture, and Indigenous languages. Following a successful feasibility study, BAnQ has entered the experimental phase of the project, which is designed to address shortcomings in how AI currently represents Quebec.
Addressing the AI Data Gap
BAnQ’s initiative responds to a pressing challenge: the limited availability of Quebec-related data in AI training datasets. Valérie D’Amour, the lead on the feasibility study, highlighted that many generative AI systems struggle to accurately reflect the nuances of Quebec’s society, economy, and culture due to this scarcity. In an interview, she stated, “All scenarios are a little bit on the table right now. We have a lot of ideas and we want to validate the possibilities with cultural stakeholders, as well as with data owners and providers, who will be involved in the discussions.”
The proposed databank is meant to serve as a repository of knowledge, not as a public distribution channel for creative works. Access to the data will be rigorously controlled to ensure that it is used responsibly and ethically. BAnQ’s president and CEO, Marie Grégoire, emphasised the goal of enhancing AI systems so that they better reflect Quebec’s diverse cultural landscape. “That means having Quebec references, whether in small models or large models,” she elaborated.
Learning from Global Initiatives
Quebec’s approach is not unique; similar projects are emerging globally. In Sweden, for instance, large collections of Nordic-language texts have recently been assembled to develop generative AI models tailored to Scandinavian languages. This international context underscores the growing recognition of the importance of localised data in AI development.

BAnQ plans to commence its databank initiative with its own existing collections before expanding to include data from external sources. This strategy aligns with recommendations from Quebec’s innovation council, which highlighted the critical need for more regional data to improve AI training effectiveness.
Protecting Cultural Integrity
The initiative has sparked discussions among creators about the implications of contributing their work to AI systems. Destiny Tchéhouali, a co-holder of a research chair focused on French-language AI and digital technologies, pointed out that Quebec culture is often “underrepresented in the corpora currently circulating in the AI world.” He cautioned that without careful management, there is a risk of perpetuating linguistic and cultural biases, particularly concerning Indigenous peoples.
Grégoire acknowledged these concerns, arguing that the proposed platform could provide enhanced protection for creators. “Right now, it’s a bit like the Wild West,” she remarked, referring to the current landscape of data use. “Data is being harvested for free, and that should not be the case.” The database is envisioned as a centralised hub that facilitates fair compensation for creators whose works are integrated into AI systems.
However, some artists have expressed trepidation about the potential consequences of feeding their work into AI training. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, commented, “The main criticism we hear in the field is that, even if artists earn income from it, they are still feeding the beast that will eventually be used to replace contracts they may lose because of AI.”
Future Outlook
The feasibility study outlines a projected timeline for the platform to become operational by 2029, although D’Amour noted that this schedule will be reassessed as the experimental phase progresses. The initiative is expected to require a budget of approximately $10.5 million over five years, covering both operational and capital costs. BAnQ has already secured $340,000 from the Quebec government for the feasibility study and an additional $750,000 to support the project’s 12-month experimentation phase.

Why It Matters
This initiative represents a crucial step towards ensuring that Quebec’s rich cultural heritage is not just preserved but also integrated into the evolving landscape of artificial intelligence. By creating a dedicated platform for local data, BAnQ aims to empower creators and enhance the representation of Quebec’s diverse identities in global AI systems. As discussions about AI ethics and cultural representation continue to unfold, this project could serve as a model for other regions looking to safeguard their cultural narratives in an increasingly digital world.