The Bibliothèque et Archives nationales du Québec (BAnQ) is embarking on an ambitious project to create a comprehensive database of the province’s cultural and governmental content. The initiative, currently in its experimental phase, is intended to support the training of artificial intelligence (AI) systems and improve their understanding of Quebec’s society, culture, and Indigenous languages. Following a feasibility study earlier this year, BAnQ is now exploring the databank’s potential, emphasising the need for AI to reflect Quebec life.
A Call for Cultural Representation in AI
The primary motivation behind BAnQ’s initiative is growing concern that mainstream generative AI systems often lack reliable data on Quebec’s social and cultural landscape. Valérie D’Amour, who led the feasibility study, said that “All scenarios are a little bit on the table right now,” indicating a willingness to explore various avenues with cultural stakeholders and data providers. BAnQ wants to validate the project’s potential with those who hold the relevant insights and resources.
Marie Grégoire, the president and CEO of BAnQ, said the ultimate objective is to ensure that AI systems accurately reflect Quebec society. “That means having Quebec references, whether in small models or large models, whether they come from research or from the business community,” she stated. The emphasis on local data underscores the importance of cultural specificity in AI training.
Learning from Global Initiatives
Quebec’s initiative is not occurring in isolation. Similar projects have been launched elsewhere, including in Sweden, where extensive collections of Nordic-language texts have been curated to improve AI models tailored for Scandinavian languages. BAnQ intends to start its databank with its existing collections before potentially expanding to include external datasets. This approach would allow it to build a foundation rooted in local knowledge before pursuing broader partnerships.

A 2024 report from Quebec’s innovation council underscored the need for this initiative, citing the “very small quantity of data on Quebec” available in existing AI training datasets. Destiny Tchéhouali, a co-holder of a research chair on French-language artificial intelligence, voiced concerns about the underrepresentation of Quebec culture in current AI corpora. “We run the risk of reproducing linguistic biases and cultural biases,” he warned, adding that Indigenous peoples face an even greater risk of misrepresentation.
Addressing Copyright and Economic Concerns
As BAnQ moves forward with its plans, copyright has emerged as a significant concern for the cultural sector. Grégoire believes the new platform could safeguard creators’ rights better than the current landscape, which she described as “a bit like the Wild West.” She emphasised that the database could serve as a centralised hub that not only protects creators but also facilitates fair compensation for the use of their works.
However, hesitation persists among artists who fear that contributing their work to AI training might jeopardise their livelihoods. Maxime Harvey, a postdoctoral researcher at the National Institute of Scientific Research, noted the prevailing criticism: even if artists are paid for their contributions, they may ultimately be “feeding the beast” that could replace traditional contracts with AI-generated content.
The feasibility study anticipates that the platform will become operational by 2029, though D’Amour indicated that this timeline is subject to revision following the experimental phase. The study outlines a projected budget of approximately $10.5 million through 2030, encompassing both operational and capital costs. To support this endeavour, BAnQ has secured $340,000 from the Quebec government for the feasibility study and an additional $750,000 to fund the project’s 12-month experimentation phase.
Why It Matters
The cultural databank is intended to ensure that Quebec’s cultural heritage is adequately represented in the rapidly evolving landscape of AI. By prioritising local data, BAnQ aims both to improve the accuracy of AI systems and to protect the rights and livelihoods of creators. If it succeeds, the initiative could reshape how AI interacts with culture and better reflect the diverse voices of Quebec society.
