To give artificial intelligence (AI) systems a more nuanced understanding of Quebec’s culture, language, and Indigenous heritage, Bibliothèque et Archives nationales du Québec (BAnQ) is launching a project to build a dedicated training dataset. Following a successful feasibility study earlier this year, BAnQ is set to develop a comprehensive database of cultural and governmental content in French and Indigenous languages, aimed at training AI models that better reflect the province’s identity.
Addressing the Data Gap
The initiative arises from growing concerns that existing AI platforms often lack reliable information about Quebec society. Valérie D’Amour, who led the feasibility study, acknowledged that most AI training datasets contain little data about Quebec. “All scenarios are a little bit on the table right now,” D’Amour said in a recent interview, highlighting the collaborative approach BAnQ intends to adopt with cultural stakeholders and data providers.
Marie Grégoire, BAnQ’s president and CEO, emphasised the importance of inclusivity in this digital transformation. “Our goal is to ensure that AI systems encapsulate the essence of Quebec society and culture,” she stated, adding that the future platform will not serve as a public repository for creative works but will instead offer controlled access to its data resources.
Learning from Global Examples
This initiative is not unique to Quebec; similar projects have surfaced in other regions, notably Sweden, which has curated extensive collections of Nordic-language texts to support the development of AI models tailored for Scandinavian languages. BAnQ plans to initially leverage its existing collections before potentially incorporating external data sources.

The push for a Quebec-centric database follows a recommendation from the province’s innovation council, which, in a 2024 report, pointed out the significant data deficit regarding Quebec in AI training resources. Destiny Tchéhouali, who co-holds a research chair focused on French-language AI at Université du Québec à Montréal, highlighted that “Quebec culture remains underrepresented in the corpora currently circulating in the AI world.” He cautioned that without a strategic approach, the risk of perpetuating linguistic and cultural biases remains high.
Protecting Creators’ Rights
As BAnQ develops its database, copyright concerns loom large within the cultural sector. Grégoire contended that the new platform might actually enhance protections for creators, contrasting it with the current landscape where data is often harvested without appropriate compensation. “Right now, it’s a bit like the Wild West,” she remarked, advocating for a more structured approach that safeguards creators’ rights and ensures fair compensation when their works are used.
However, some artists are apprehensive about contributing their work to AI training, fearing it may jeopardise their livelihoods. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, echoed this concern: “The main criticism we hear in the field is that, even if artists earn income from it, they are still feeding the beast that will eventually be used to replace contracts they may lose because of AI.”
Timeline and Budget
The feasibility study envisions the database becoming operational by 2029, although the timeline will be reviewed following the experimental phase. The initiative is estimated to require a budget of nearly $10.5 million over five years, covering both operational and capital expenses. So far, BAnQ has secured $340,000 from the Quebec government for the feasibility study and an additional $750,000 to support the project’s upcoming 12-month experimentation phase.

Why It Matters
The initiative is a step towards ensuring that Quebec’s culture is accurately represented in AI systems. As these systems increasingly shape how people find and interpret information, the data that feeds them needs to include local voices and narratives. With a dedicated database, BAnQ aims both to safeguard Quebec’s cultural identity and to give AI developers a structured, rights-respecting source of Quebec content.