Algorithmic transparency: AOC conversational chatbots

Overview 

The AOC's conversational virtual assistant, also called a chatbot, is a support service for public employees and citizens that is available 24 hours a day, 7 days a week. It aims to facilitate digital identification processes and access to the information on the transparency portal, improve the user experience, and increase the self-resolution of incidents and queries.

The service gives each user information relevant to their specific question, more easily and quickly than searching for it on the website.

Use cases 

The AOC currently applies the conversational virtual assistant to three content areas:

Paula is the chatbot that supports the VALid service. It answers the most common questions users raise when identifying themselves in the AOC's processing and notification services.

Rita can be found on the application and management page of the idCAT Certificate (https://idcat.cat/) and helps to resolve the most common questions about obtaining, using and renewing the idCAT Certificate.

A new chatbot on the Transparency and Electronic Headquarters portal that helps users find the information or procedure they need. It is in the pilot phase in five municipalities: Fornells de la Selva, Llagostera, Sarrià de Ter, Tordera and Tremp.

AOC chatbots are able to handle complex queries because they interpret people's natural language using artificial intelligence algorithms.

They answer based on: 

  1. The public information on the AOC Support portal (https://www.aoc.cat/suport), which is structured as frequently asked questions (FAQ). This information is prepared and loaded into the bot manually.
  2. Guided, rule-based conversational flows configured during training.
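As a rough illustration of the two answer sources listed above, the following Python sketch combines a keyword-triggered guided flow with a simple FAQ lookup. It is not the AOC's implementation: the production service relies on Dialogflow, whereas this sketch substitutes a plain word-overlap (Jaccard) score. The FAQ entries, the flow, the 0.5 threshold, the order in which the two sources are tried, and all function names are assumptions made for the example.

```python
import re

# Placeholder FAQ entries (hypothetical); the real ones come from the AOC Support portal.
FAQ = {
    "How do I renew my idCAT Certificate?": "<answer about renewing the certificate>",
    "What is the VALid service?": "<answer describing VALid>",
}

# Hypothetical guided flow: a trigger keyword mapped to an ordered list of prompts.
GUIDED_FLOWS = {
    "incident": ["<first prompt of the incident flow>", "<second prompt>"],
}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def match_faq(question, faq=FAQ, threshold=0.5):
    """Return the answer of the most similar sample question, or None if none is close enough."""
    q = tokens(question)
    best_answer, best_score = None, 0.0
    for sample, answer in faq.items():
        s = tokens(sample)
        union = q | s
        score = len(q & s) / len(union) if union else 0.0  # Jaccard similarity
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer if best_score >= threshold else None


def reply(question):
    """Try the rule-based flows first (assumed priority), then the FAQ lookup."""
    words = tokens(question)
    for keyword, steps in GUIDED_FLOWS.items():
        if keyword in words:
            return steps[0]  # start the guided flow at its first prompt
    return match_faq(question)  # None means the question could not be answered
```

A `None` result stands for the case in which the bot has no answer and the question would need human follow-up.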

Conversational assistants increase the efficiency of the Open Administration of Catalonia by allowing the automation of answers to the most frequent questions received through the web channel. 

Contact information 

Responsible body
Open Administration of Catalonia 

Contact team for inquiries
Innovation and Data Branch

Team email
innovacio@aoc.cat  

External supplier
ONE MILLION BOT, SL 

Supplier email
info@1millionbot.com  

More detailed information about the system 

Familiarize yourself with the information the system uses, the algorithm's operating logic, and its governance. 

Data sets 

Two main sources of data are used: 

Data to train the algorithms 

Public information from the AOC Support portal (https://www.aoc.cat/suport), which is structured as question-answer pairs in the chatbot's content area. The sample questions are those asked by users when they are assisted by the AOC's User Service Center (CAU).

The information is owned by the AOC and is shared under the Creative Commons CC-Zero license. It may be copied, modified, distributed and reused without permission, including for commercial purposes. 

The training material does not contain personal data, which is why no data protection analysis has been carried out.

Information is loaded into the algorithmic system manually. Some of the sample questions are also used to test the model. 
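The source does not say how the test questions are chosen or whether they are kept out of training. Purely as an illustration, the sketch below shows one common approach: a reproducible random split that reserves a fraction of the sample questions for testing. The function name, the 20% fraction and the fixed seed are assumptions.

```python
import random


def split_samples(samples, test_fraction=0.2, seed=0):
    """Split sample questions into (train, test) lists in a reproducible way."""
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives the same split every run
    n_test = max(1, int(len(shuffled) * test_fraction)) if shuffled else 0
    return shuffled[n_test:], shuffled[:n_test]
```

With ten sample questions and the default fraction, two are set aside for testing and eight remain for training.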

Conversation recording 

The system records the questions asked and the answers given during the chat session in a log file. The service development team periodically analyzes the log to see how the service has responded to user questions.

The analysis identifies the content areas in which the training material and the answers provided by the service need to be improved. The training material is then revised: more sample questions can be added, or new guided conversation flows can be created, to improve the user experience of the service. User comments can also be used to refine answers and add more information.

In the next version of the bot, the conversation log is planned to be deleted from the system automatically and periodically after analysis. Usage-related reports, such as usage rates, response rates and other parameters, will continue to be stored in order to monitor long-term progress.
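The planned behaviour can be pictured with a minimal Python sketch: aggregate figures are computed from the log, and the raw conversation entries are then discarded. The `LogEntry` fields, the per-month grouping and the function name are hypothetical; the source does not describe the actual log format or the reports' contents in detail.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LogEntry:
    month: str      # reporting period, e.g. "2023-05" (assumed format)
    question: str   # text typed by the user
    answered: bool  # whether the bot returned an answer


def summarize_and_purge(log):
    """Build a per-month usage report, then clear the raw log in place."""
    asked = Counter(entry.month for entry in log)
    answered = Counter(entry.month for entry in log if entry.answered)
    report = {
        month: {"questions": n, "response_rate": answered[month] / n}
        for month, n in asked.items()
    }
    log.clear()  # the conversation text is dropped; only the aggregates remain
    return report
```

Only counts and rates survive the purge, so the stored report no longer contains anything a user typed.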

The chat log is owned by the AOC but is not licensed for reuse.

Data processing 

The operational logic of the automatic data processing and the reasoning carried out by the system are based on the following model and methodology:

Architecture of the model 

The chatbot service is based on Google's Dialogflow technology and the 1MillionBot administration platform, hosted on Google Cloud Platform (GCP) infrastructure. The servers are located in the Google Cloud data center in Belgium, within the European Economic Area (EEA).

The service uses AI algorithms and machine learning models to process questions in natural language. 

The user asks questions through a digital interface that works on different devices (web, mobile, tablet).

Additionally, chatbots enable guided conversation flows that facilitate query resolution. 

A feedback system collects users' opinions and assesses their satisfaction with the conversation as a whole. This information is used to continuously improve the service.

Performance

The evolution of service usage and the quality of the service are continuously evaluated.

Usage statistics and quality measures are recorded regularly in automatic monthly reports.  

The quality measure includes the proportion of correct answers out of all questions asked, as well as direct comments made by users about the service.

The system's accuracy, measured as its success rate, is above 95% (updated 24/5/2023).

Non-discrimination

To guarantee equality in the use of the service, it is available in Catalan, Spanish and English. The user can choose the language at the beginning of the conversation or at any time from the bot menu.

The service understands natural language even if there are spelling and/or linguistic errors. 

Human supervision 

The chatbot algorithm acts directly, but under ex post supervision by the staff responsible for the final decision. This supervision was particularly intense during the initial period of operation. In this way, the biases of the algorithm can be compensated for.

Thus, the service is supervised by the staff who manage it: 

  • On the one hand, the service automatically identifies and collects questions that could not be answered. These questions are sent to the CAU specialists (first or second level), who develop the conversational flows that will allow them to be answered next time.
  • On the other hand, experts periodically review the logs of comments and conversations and analyze the service's usage statistics. Based on this data, they identify the issues that interest users, develop new conversation flows and expand the conversational corpus with new keywords.
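The first supervision step, collecting the questions the bot could not answer, might look like the following Python sketch. The log layout (pairs of question and answer, with `None` marking an unanswered question) and the function name are assumptions; the source does not specify how unanswered questions are stored or forwarded to the CAU.

```python
from collections import Counter


def unanswered_by_frequency(log):
    """Return the unanswered questions, most frequent first, as (question, count) pairs.

    `log` is an iterable of (question, answer) pairs where answer is None
    when the bot could not reply (assumed convention).
    """
    counts = Counter(
        question.strip().lower()  # light normalization so repeats are grouped
        for question, answer in log
        if answer is None
    )
    return counts.most_common()
```

Ranking by frequency lets specialists start with the gaps that affect the most users.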

Risk management 

The chatbot service is identified from the very beginning as a "virtual assistant" to prevent the user from thinking that they are interacting with a person. 

The chatbot service never asks for personally identifiable information. However, the service involves a risk related to the processing of personal data if the user enters unnecessary personal data into the service.  

To manage this risk, conversation logs are regularly reviewed by experts, who remove any personal data that the automated system may have missed. 

In the next version, the service is planned to include an automated function that immediately deletes any social security number or email address from the conversation log. In addition, during the chat session, the user may request that their conversation be deleted from the system so that the information entered cannot be used to feed the bot.
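A planned redaction function of this kind could be approximated with regular expressions, as in the Python sketch below. Both patterns are assumptions: the email pattern is deliberately simple, and the social security pattern assumes a 12-digit number optionally grouped as 2, 8 and 2 digits. The source does not state which formats the real function will detect, and the placeholder labels are invented for the example.

```python
import re

# Simplified email pattern (assumption); it will not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumed social security number format: 12 digits, optionally grouped 2/8/2.
SSN_RE = re.compile(r"\b\d{2}[ /-]?\d{8}[ /-]?\d{2}\b")


def redact(text):
    """Replace email addresses and assumed-format social security numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text
```

Pattern-based redaction can miss unusual formats, which is consistent with the expert review of the logs described above remaining in place.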
