Quebec is moving to close the gap between artificial intelligence and the province’s distinct cultural landscape. The Bibliothèque et Archives nationales du Québec (BAnQ) has embarked on an ambitious project to create a comprehensive database of cultural and governmental content. This initiative, currently in its experimental phase, seeks to improve the representation of Quebec’s society, economy, and Indigenous languages in AI training datasets.
Addressing Data Deficiencies
The move follows a detailed feasibility study conducted earlier this year, which highlighted the shortcomings of current AI models in accurately reflecting Quebec’s diverse culture. Valérie D’Amour, who oversaw the study, noted the necessity of developing a robust dataset to ensure AI systems can provide accurate information about the region. “All scenarios are a little bit on the table right now,” D’Amour said in an interview. The project aims to engage cultural stakeholders and data providers to validate ideas and explore potential collaborations.
BAnQ has clarified that the future platform will not serve as a public distribution channel for creative works, emphasising that access to the data will be meticulously regulated. This approach aims to protect the integrity of the cultural content while facilitating its use in AI applications.
A Commitment to Cultural Representation
Marie Grégoire, president and CEO of BAnQ, articulated the project’s goal of ensuring that AI systems accurately reflect Quebec’s cultural identity. “That means having Quebec references, whether in small models or large models, whether they come from research or from the business community,” she stated. This initiative aligns with a growing trend in other regions, such as Sweden, where large collections of Nordic-language texts have been assembled to support the development of generative AI models.

The database plan is rooted in a recommendation from a 2024 report by Quebec’s innovation council, which identified the limited availability of Quebec-specific data as a significant barrier to the region’s representation in AI. Destiny Tchéhouali, a co-holder of a research chair focused on French-language AI at the Université du Québec à Montréal, echoed these concerns, stating that Quebec’s culture is often “underrepresented in the corpora currently circulating in the AI world.”
Navigating Copyright Concerns
As BAnQ develops this database, copyright has emerged as a critical point of discussion within the cultural sector. While some artists worry that having their works used to train AI may undermine their livelihoods, Grégoire argued that the new platform could ultimately provide better protection for creators. “Right now, it’s a bit like the Wild West,” she remarked, describing a data landscape in which creators are often not adequately compensated.
The proposed database could serve as a centralised gateway, enabling fair compensation for artists whose works are used in AI training. By fostering collaboration among cultural organisations, BAnQ aims to ensure a sustainable future for creators while navigating the complexities of the digital age.
Future Prospects and Funding
The feasibility study envisions the platform becoming operational by 2029, although timelines may change based on the outcomes of the experimental phase. The estimated budget for the next five years stands at nearly $10.5 million, encompassing both operating and capital costs. The Quebec government has allocated $340,000 for the feasibility study and an additional $750,000 to support the 12-month experimentation phase of the project.

As this initiative evolves, it will be crucial for stakeholders to address the concerns of artists while fostering an environment that values and protects local cultural expressions.
Why It Matters
The development of this cultural database is pivotal not only for enhancing the representation of Quebec’s society in AI systems but also for safeguarding the rights and livelihoods of its creators. As AI continues to play an increasing role in various sectors, ensuring that it reflects the rich tapestry of Quebec’s culture and languages is essential. By prioritising local content and protecting the interests of artists, this initiative could set a precedent for how cultural data is managed and utilised in the age of artificial intelligence.