Authors: Diego Oliveira Farias (oliveiraf@tcu.gov.br); Eric Hans Messias da Silva (erichm@tcu.gov.br); Erick Muzart Fonseca dos Santo (erickmf@tcu.gov.br); Monique Louise de Barros Monteiro (moniquebm@tcu.gov.br); Tibério Cesar Jocundo Loureiro (tiberio.loureiro@tcu.gov.br)
What is Artificial Intelligence?
Over time, many definitions have been given to the term Artificial Intelligence (AI), and its association with related terms, such as machine learning and deep learning, has made the topic harder to understand.
AI’s artificial aspect is relatively simple: it refers to anything non-natural, created by humans, and can equally be expressed through terms such as machines, computers, or systems. Intelligence, however, is a much broader and contested concept, which explains why an agreed definition of AI has yet to be reached (Miaihle and Hodes, 2017).
AI can be defined as the use of digital technology to create systems capable of carrying out tasks usually thought of as requiring intelligence.
In this context, we can mention the definition adopted by the Organization for Economic Cooperation and Development (OECD), which considers AI to be a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions that influence real or virtual environments.
Current AI mainly involves machines using statistics to find patterns in large amounts of data and to carry out repetitive tasks without constant human guidance. AI is therefore not a technological solution that applies to every case, since it generally performs well only when supplied with large volumes of relevant, high-quality data.
Artificial Neural Networks
Traditional machine learning algorithms heavily rely on data representation to create relationships between the data and the predictions they can lead to. For example, consider the difference between a diagnostic system that depends on patient information provided by a doctor (e.g., body mass index (BMI), blood type, blood glucose level) to propose a diagnosis and a system capable of identifying tumors from a radiographic image. Traditional algorithms can extract correlations between the first group of information provided by the doctor, referred to as features or attributes, and a potential diagnosis. In the second example, however, such systems have limitations in analyzing unstructured data like images, as they cannot extract meaning from a mere set of pixels.
One solution to this problem is to use techniques that learn not only the relationships between the attributes and the output (prediction) but also the best way to represent the input data.
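As a rough illustration of this contrast, the sketch below (with hypothetical patient attributes and labels, using scikit-learn) trains a traditional classifier on doctor-provided features; the closing comment notes why the same recipe does not transfer directly to raw pixels.

```python
# A minimal sketch with hypothetical data: a traditional classifier works on
# structured, doctor-provided attributes, but raw pixels would first require
# a learned representation.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Structured attributes per patient: [BMI, blood glucose, age]
X = np.array([[22.5, 90, 35],
              [31.0, 160, 58],
              [27.3, 140, 47],
              [20.1, 85, 29]])
y = np.array([0, 1, 1, 0])  # 0 = healthy, 1 = at risk (illustrative labels)

clf = LogisticRegression().fit(X, y)
print(clf.predict([[24.0, 100, 40]]))

# For an X-ray image, X would be a huge array of pixel intensities with no
# directly meaningful columns; a deep network is typically used to learn a
# representation (features) from those pixels before any classification step.
```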
In this context, the technique of transfer learning stands out. It is increasingly employed, especially in computer vision and natural language processing (NLP), where the knowledge acquired by a model pre-trained on one domain/task is “transferred” to another domain/task. It enables a “democratization” of AI models, since new models can be trained with only a fraction of the data and computational resources that would be needed if a model had to be trained “from scratch”. Transfer learning is inspired by how humans learn: we rarely learn something from scratch but often learn by analogy, incorporating previously acquired experience into new contexts.
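A minimal sketch of this idea in computer vision, assuming PyTorch and torchvision (0.13 or later) are available: a ResNet-18 pre-trained on ImageNet is reused as a feature extractor, and only a new classification head is trained for a hypothetical three-class task.

```python
# Transfer learning sketch: reuse an ImageNet-pretrained backbone and train
# only a small task-specific head (assumes torchvision >= 0.13).
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pre-trained layers so their weights are not updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for a hypothetical 3-class task.
model.fc = torch.nn.Linear(model.fc.in_features, 3)

# Only the new head's parameters are optimized during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```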
There is no doubt that the neural network architectures and training strategies adopted in recent years have led to considerable advances in tasks such as text translation, question answering, and chatbots, even when models are trained from scratch. However, significant changes in the sample distribution of the data led to performance degradation, indicating that the models had become specialized in performing well only on specific inputs (e.g., specific languages or text types).
Challenges remained to be overcome for languages less popular than English, as well as for more specific or unexplored tasks. In the case of languages, less widely spoken ones suffer from the limited availability of labeled corpora for training NLP models.
In the 1960s, the first step towards transfer learning was the use of vector spaces to represent words as numerical vectors. In the mid-2010s, models like word2vec, sent2vec, and doc2vec were introduced. These models were trained to express words, sentences, and documents in vector spaces such that the distance between vectors reflects the difference in meaning between the corresponding entities. The training aimed to associate the meaning of a word with its context, i.e., the adjacent words in the text, and is an example of unsupervised learning.
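A small sketch of this idea using the gensim library (version 4.x assumed, with a toy corpus; a real model would need far more text): word2vec learns a vector per word from its surrounding context, and distances between vectors reflect differences in meaning.

```python
# Toy word2vec training with gensim; real models need millions of sentences.
from gensim.models import Word2Vec

corpus = [
    ["the", "auditor", "reviewed", "the", "financial", "report"],
    ["the", "accountant", "prepared", "the", "financial", "report"],
    ["the", "court", "published", "the", "audit", "decision"],
]

# Each word is mapped to a 50-dimensional vector learned from its context.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, epochs=100)

vector = model.wv["auditor"]             # the numerical representation of a word
print(model.wv.most_similar("auditor"))  # nearest words in the vector space
```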
Once words, sentences, or paragraphs are represented as vectors, it is possible to apply classification or clustering algorithms whose input data are points in a vector space. In the case of classification, for example, the overall approach can be seen as semi-supervised: the classification task itself is supervised, but the representation of the input data was obtained in an unsupervised manner while still embedding textual semantics.
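The sketch below (again with gensim and scikit-learn, and purely illustrative labels) shows this pipeline: each sentence is represented by the average of its word vectors, and a standard supervised classifier is trained on top of those unsupervised representations.

```python
# Unsupervised word vectors feeding a supervised classifier.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

corpus = [
    ["the", "auditor", "reviewed", "the", "financial", "report"],
    ["the", "accountant", "prepared", "the", "financial", "report"],
    ["the", "court", "published", "the", "audit", "decision"],
]
w2v = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, epochs=100)

def sentence_vector(tokens):
    """Represent a sentence as the average of its word vectors."""
    return np.mean([w2v.wv[t] for t in tokens if t in w2v.wv], axis=0)

X = np.array([sentence_vector(s) for s in corpus])
y = np.array([1, 0, 1])  # illustrative labels, e.g. 1 = audit-related

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([sentence_vector(["the", "auditor", "published", "a", "decision"])]))
```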
Subsequently, character-level vectorization started being used to deal with words not seen in the initial vocabulary (e.g., new words, slang, emojis, foreign words, or names of people).
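One way to see this in practice is gensim’s FastText implementation, which builds word vectors from character n-grams and can therefore produce a vector even for words absent from the training vocabulary (toy corpus again; gensim 4.x assumed).

```python
# FastText composes word vectors from character n-grams, so it can embed
# out-of-vocabulary words such as typos, slang, or new terms.
from gensim.models import FastText

corpus = [
    ["the", "auditor", "reviewed", "the", "report"],
    ["the", "court", "published", "the", "decision"],
]
model = FastText(sentences=corpus, vector_size=50, window=2, min_count=1, epochs=50)

# "auditing" never appears in the corpus, but its character n-grams overlap
# with "auditor", so the model can still produce a meaningful vector.
print(model.wv["auditing"][:5])
```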
This can be understood as an early form of transfer learning, since the pre-trained vectorization model already embeds a certain level of semantics or meaning into words, sentences, and so on.
In 2018, there was a true revolution in the NLP field when researchers began to apply transfer learning at a more abstract level, providing not just pre-trained vectorization models but entire neural networks pre-trained on generic, unsupervised tasks. Examples include neural networks implementing language models: statistical models trained to predict the next word or set of words given the previous terms. Through a process known as fine-tuning, one can take one of these pre-trained models and perform additional, brief training focused on optimizing the model for the specific target task, adjusting the network’s weights. This movement is even referred to as NLP’s “ImageNet moment,” in reference to the widespread use of neural networks pre-trained on the ImageNet database for a variety of computer vision applications.
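A compressed sketch of fine-tuning, assuming the Hugging Face transformers library and the pre-trained bert-base-uncased checkpoint: the pre-trained weights are loaded, a classification head is added, and a few supervised steps adjust the network for the target task. The two-example “dataset” and its labels are purely illustrative.

```python
# Fine-tuning sketch: start from pre-trained weights, then briefly continue
# training on a small labeled dataset for the target task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

texts = ["The audit found irregularities.", "The report was approved."]
labels = torch.tensor([1, 0])  # illustrative labels

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a real fine-tuning run iterates over many batches and epochs
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```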
The OpenAI Generative Pretrained Transformer (GPT) stands out among the pioneering innovations in transfer learning for NLP. It is based on the neural network architecture called the Transformer (Vaswani et al., 2017), which allows greater parallelism and better performance than previous architectures, which could not be parallelized to the same degree and had difficulty dealing with long texts. In its most recent formulation, GPT-4, it is capable of automatically generating realistic texts similar to those written by humans.
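GPT-4 itself is only available as a hosted service, but a minimal local illustration of the same autoregressive generation idea can be run with the open GPT-2 checkpoint via the transformers library.

```python
# Autoregressive text generation with an open GPT-style model (GPT-2).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Supreme audit institutions can use AI to", max_new_tokens=30)
print(result[0]["generated_text"])
```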
ChatGPT, Large Language Models, and Generative AI
In November 2022, OpenAI launched ChatGPT, taking Artificial Intelligence to a new stage: within a few days, the chatbot became the most famous achievement in recent technology history due to its impressive capabilities for understanding and generating text.
Despite its “intelligence” and popularity, ChatGPT’s core is based on an old technique: language modeling. In a simple definition, language modeling is concerned with using statistical models to predict the most likely sequences of words in a language. Such models simply predict the next most probable word given a sequence of preceding words. Each word predicted by the model can be fed back to predict the following one, and the process continues until complete paragraphs and texts are produced.
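The self-contained toy example below makes this loop explicit: a bigram “language model” is estimated by counting word pairs in a tiny corpus, and each predicted word is fed back to predict the next one. Real language models are vastly more sophisticated, but the generation loop is the same.

```python
# Toy bigram language model: predict the most frequent next word, feed it back.
from collections import Counter, defaultdict

corpus = ("the court audited the accounts and the court published "
          "the audit report and the court approved the report").split()

# Count which word follows each word in the corpus.
next_counts = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    next_counts[w1][w2] += 1

def generate(start, length=8):
    words = [start]
    for _ in range(length):
        followers = next_counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])  # most probable next word
    return " ".join(words)

print(generate("the"))
```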
In the last few years, researchers started to use neural language models, which are, in simple terms, language models implemented as neural networks. Given a massive dataset of texts, we can train a neural network whose optimization objective is to generate the most probable next word from the sequence of words seen up to the current iteration. This idea was initially implemented with recurrent neural networks, but in 2018 the Transformer architecture, a new family of models based on attention mechanisms and feed-forward neural networks, demonstrated even better results.
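A minimal sketch of that optimization objective in PyTorch, using a toy vocabulary and a tiny recurrent model for brevity (a Transformer would replace the recurrent layer but keep the same next-word loss):

```python
# Next-word prediction objective: the target sequence is the input shifted by one.
import torch
import torch.nn as nn

vocab = ["<pad>", "the", "court", "audited", "accounts"]
token_ids = torch.tensor([[1, 2, 3, 1, 4]])  # "the court audited the accounts"

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)  # a score for every vocabulary word at each position

model = TinyLM(len(vocab))
logits = model(token_ids[:, :-1])  # predict from all but the last token
targets = token_ids[:, 1:]         # the "labels" are just the next tokens
loss = nn.functional.cross_entropy(logits.reshape(-1, len(vocab)), targets.reshape(-1))
loss.backward()  # a real model repeats this step over a massive corpus
```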
Then, as the number of parameters in these neural models increased from millions to billions or trillions, they came to be called large language models.
A significant advantage of training language models comes from the dataset: it does not need to be labeled by humans. Any corpus of texts is already “annotated” in the sense that we always know the next word. Even in settings with slightly different optimization objectives (e.g., masking some words and training the model to predict the masked words), the labels are already there. This technique is called self-supervision, but it can also be considered a kind of unsupervised learning (at least from the point of view of human annotators).
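The masked-word variant mentioned above can be tried directly with a pre-trained BERT model through the transformers library; the “label” for the masked position is simply the word that was removed from the text itself.

```python
# Masked language modeling: the training signal comes from the text itself.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill("The auditors reviewed the [MASK] statements.")[:3]:
    print(prediction["token_str"], round(prediction["score"], 3))
```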
We currently do not have many details about the inner workings of ChatGPT; we only know it uses additional techniques from Reinforcement Learning besides traditional language modeling. However, several capable open-source language models have been launched thanks to its advent. These models are particularly interesting to researchers and government institutions because they are cost-friendly compared to OpenAI models. Furthermore, we keep complete control over the model, allowing us to customize it according to our requirements (e.g., understanding legal texts).
Finally, at the Brazilian Federal Court of Accounts (TCU), we launched a tool based on ChatGPT called ChatTCU. The current version is a secure wrapper over the underlying OpenAI model: it enables auditors to exchange messages securely without sending classified data to OpenAI. In future versions, we will extend ChatTCU with features based on TCU jurisprudence, as well as several other public or non-public data sources owned by the institution.
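ChatTCU’s internals are not public beyond what is described above. Purely as an illustration of the “secure wrapper” idea, the sketch below (hypothetical redaction rule and model choice, using the openai Python client, version 1.x assumed) filters messages locally before forwarding them to the hosted model; it is not TCU’s actual implementation.

```python
# Hypothetical wrapper sketch: redact sensitive content locally, then forward
# the message to the hosted model (not the actual ChatTCU implementation).
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def redact(text: str) -> str:
    """Illustrative rule: mask anything that looks like a TCU case number."""
    return re.sub(r"\b\d{3}\.\d{3}/\d{4}-\d\b", "[REDACTED]", text)

def chat(message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # hypothetical model choice
        messages=[{"role": "user", "content": redact(message)}],
    )
    return response.choices[0].message.content

print(chat("Summarize the main findings of process 012.345/2023-0."))
```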
Conclusions
The incorporation of AI into audit activity offers SAIs a unique opportunity to improve the effectiveness and efficiency of their operations. Through automated analysis of large volumes of data, AI can identify complex patterns, anomalies, and trends in real time, providing valuable insights for auditors. Additionally, AI can streamline review and analysis processes, significantly reducing the time required to perform a full audit. By freeing audit professionals from routine and repetitive tasks, AI allows them to focus their expertise on high-level analysis and strategic decision-making. Finally, with the use of AI, SAIs can strengthen the accuracy, completeness, and reliability of their audit activities, thereby reinforcing public trust in financial institutions and audited bodies.