In a significant step towards enriching artificial intelligence systems with Quebec’s unique cultural and linguistic heritage, the Bibliothèque et Archives nationales du Québec (BAnQ) has announced the commencement of an experimental programme aimed at creating a comprehensive databank of cultural and governmental content. This initiative, which follows a feasibility study completed earlier this year, seeks to ensure that AI technologies better reflect the province’s diverse society, including its Indigenous languages and cultures.
Addressing Data Gaps in AI
Many generative AI systems rely on limited datasets that often fail to capture the nuances of Quebec’s society and culture. How the databank will address that gap is still being worked out. Valérie D’Amour, who oversaw the feasibility study, said, “All scenarios are a little bit on the table right now. We have a lot of ideas and we want to validate the possibilities with cultural stakeholders, as well as with data owners and providers.” This collaborative approach aims to engage various cultural institutions and data stewards in discussions about the databank’s development.
Marie Grégoire, BAnQ’s president and CEO, emphasised the project’s significance, stating that it aims to ensure AI systems incorporate Quebec-specific references. “That means having Quebec references, whether in small models or large models, whether they come from research or from the business community,” she explained.
Learning from Global Initiatives
Similar projects have emerged internationally, notably in Sweden, where extensive collections of Nordic-language texts have been compiled to aid the development of generative AI models in Scandinavian languages. BAnQ intends to start with its own collections before exploring data from other sources, reflecting a strategic approach to creating a robust foundation for the databank.

The initiative is rooted in a recommendation from Quebec’s innovation council, which highlighted the scarcity of Quebec-related data in existing AI training datasets. Destiny Tchéhouali, a co-holder of a research chair focused on French-language AI at the Université du Québec à Montréal, pointed out that Quebec’s culture is “underrepresented in the corpora currently circulating in the AI world,” warning of the risks of perpetuating linguistic and cultural biases. He described the proposed databank as “strategic infrastructure” that could help establish guidelines for identifying and cataloguing local content within AI systems.
Protecting Creative Works in the AI Era
As BAnQ develops its databank, copyright concerns have surfaced in the cultural sector. However, Grégoire believes the new platform could give artists better protection than the current landscape, which she likened to “the Wild West.” She stated, “Data is being harvested for free, and that should not be the case.” The databank could serve as a centralised system that facilitates fair compensation for creators whose works are used in AI training.
Despite these potential benefits, some artists remain apprehensive. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, said artists worry that contributing to AI training systems might ultimately jeopardise their own livelihoods. “The main criticism we hear in the field is that, even if artists earn income from it, they are still feeding the beast that will eventually be used to replace contracts they may lose because of AI,” he remarked.
Project Timeline and Funding
The feasibility study outlines an ambitious timeline, with the platform expected to become operational by 2029. However, D’Amour indicated that this timeline will be reassessed following the experimental phase. The project is expected to cost nearly $10.5 million over five years, covering both operational and capital expenses. BAnQ has received $340,000 from the Quebec government for the feasibility study and an additional $750,000 to support the upcoming 12-month experimental phase.

Why It Matters
BAnQ’s initiative is an attempt to ensure that Quebec’s culture and languages are accurately represented in the rapidly evolving field of artificial intelligence. By collecting and curating local data, the project aims to improve the quality of AI outputs, and it could also give creators in the province a path to fair compensation, although some artists remain unconvinced. As AI continues to shape various facets of society, ensuring that it reflects Quebec’s distinct characteristics will be vital in combating cultural erasure and promoting a more inclusive digital future.