The Bibliothèque et Archives nationales du Québec (BAnQ) is developing a database of cultural and governmental content, intended to help artificial intelligence systems reflect the nuances of Quebec’s society, culture, and Indigenous languages more accurately. Following a feasibility study, BAnQ has launched the experimental phase of the databank, which will predominantly feature information in French and Indigenous languages.
Addressing Data Gaps in AI
The initiative stems from growing concerns that Quebec is poorly represented in existing AI training sets. Valérie D’Amour, who led the feasibility study, highlighted the limitations of major generative AI systems, which often fail to provide reliable insights into the region’s cultural and economic landscape. “All scenarios are a little bit on the table right now,” D’Amour explained, indicating that the project is still in a flexible planning stage. She emphasised the importance of collaborating with cultural stakeholders and data providers to explore all possibilities.
BAnQ has made it clear that the resulting platform will not function as a public distribution channel for creative works, assuring strict control over data access. Marie Grégoire, BAnQ’s president and CEO, stated that the primary aim is to ensure AI systems reflect the richness of Quebec’s culture. “That means having Quebec references, whether in small models or large models, whether they come from research or from the business community,” she noted.
Learning from Global Examples
This initiative is not isolated; similar efforts have been observed internationally. In Sweden, extensive collections of Nordic-language texts have been compiled to support the development of generative AI models tailored for Scandinavian languages. BAnQ intends to start by utilising its own collections before considering external data sources, thus laying the groundwork for a robust cultural databank.

The project aligns with a recommendation from Quebec’s innovation council, which identified the scarcity of Quebec-centric data in AI training datasets as a critical issue. Destiny Tchéhouali, co-holder of a research chair focused on French-language AI at Université du Québec à Montréal, echoed these concerns. He emphasised that Quebec’s culture is significantly underrepresented in the data currently utilised in the AI landscape, which could perpetuate linguistic and cultural biases, especially regarding Indigenous peoples.
Cultural Protection and Artist Concerns
As BAnQ moves forward, copyright issues have emerged as a significant consideration within the cultural sector. Grégoire argued that the proposed database could actually enhance protections for creators compared to the existing chaotic landscape. “Right now, it’s a bit like the Wild West,” she remarked, highlighting the unregulated nature of data harvesting. The anticipated database could serve as a centralised hub, facilitating fair compensation for creators whose works are incorporated into AI training systems.
However, some artists are apprehensive about contributing their work. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, pointed out that even if artists gain some immediate income, they risk enabling a system that may ultimately jeopardise their future contracts. “The main criticism we hear in the field is that artists are feeding the beast that could eventually replace their jobs,” he cautioned.
Future Outlook and Funding
The feasibility study outlines a timeline aiming for the platform to be operational by 2029, although D’Amour acknowledged that this schedule may be revised after the experimental phase. The projected budget for the next five years is approximately CAD 10.5 million, which will cover both operational and capital costs. To support this initiative, BAnQ has already secured CAD 340,000 from the Quebec government for the feasibility study and an additional CAD 750,000 for the initial 12-month experimental phase.

Why It Matters
This initiative aims to ensure that artificial intelligence systems are informed by and reflective of Quebec’s cultural landscape. By addressing the existing data gap, BAnQ is advocating for a more accurate representation of Quebec society in AI while also seeking to protect the rights and livelihoods of local creators. As AI shapes more aspects of daily life, a culturally informed databank could help mitigate linguistic and cultural biases, including those affecting Indigenous peoples.