
Unleashing the Statistical Abacus: Exploring Large Language Models and Their Knowledge Encapsulation

 

Image generated by Microsoft Bing Image Creator at https://www.bing.com/create

In April 2023, The Economist published a series of in-depth feature articles on Large Language Models. The articles pointed out that Large Language Models understand things in a statistical rather than a grammatical way, making them more like an abacus than a biological mind.

While Large Language Models are not yet at the level of biological intelligence, their statistics-based operations have, to some extent, encapsulated the information and knowledge of human society.

The true marvel of Large Language Models lies in their capacity for knowledge encapsulation, and the statistical abacus is a useful metaphor for understanding it. Just as an abacus stores and manipulates numerical values, a Large Language Model stores vast quantities of information encoded in its neural network weights: facts, concepts, reasoning abilities, and even subtle linguistic nuances acquired during training.

Former Microsoft Corporate Vice President Qi Lu stated that GPT, developed over the course of more than four years, has not only encapsulated knowledge recorded in texts from around the world but has also gradually aligned with human values, thanks to the Reinforcement Learning from Human Feedback (RLHF) enhancement method.


Qi Lu quoted Ilya Sutskever, co-founder and Chief Scientist of OpenAI, who believes that models like GPT-3, GPT-3.5, and GPT-4 already possess a model of the world. Although the model's task is simply to predict the next word, that task is merely an optimization objective; what it conveys to the model is information about the world.

Qi Lu's summary of Ilya Sutskever's viewpoint may stem from a conversation Sutskever had with Jensen Huang, co-founder and CEO of NVIDIA, in March 2023. Sutskever stated during that conversation that when we train a large neural network to accurately predict the next word across a vast amount of diverse Internet text, the network is not merely learning statistical correlations in the text. To compress the text really well, it must learn some representation of the process that produced it, and that text is in fact a projection of the world.
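Sutskever's "prediction is compression" point can be made concrete with a toy calculation. The Python sketch below assumes nothing about OpenAI's actual systems: it trains a tiny bigram model on an eleven-word text and measures the cross-entropy of that text under the model. Cross-entropy in bits is exactly the code length an ideal entropy coder would need, so better next-word prediction literally means tighter compression.

```python
# A minimal sketch of "prediction is compression" using a toy bigram model.
# Everything here is illustrative; real LLMs replace the count table with a
# neural network over a vocabulary of tens of thousands of tokens.
import math
from collections import Counter, defaultdict

text = "the cat sat on the mat the cat ate the rat"
tokens = text.split()

# "Train": count how often each word follows each other word.
pair_counts = defaultdict(Counter)
for cur, nxt in zip(tokens, tokens[1:]):
    pair_counts[cur][nxt] += 1

def prob(cur, nxt):
    """The model's predicted probability P(next word | current word)."""
    total = sum(pair_counts[cur].values())
    return pair_counts[cur][nxt] / total

# Code length of the text under the model: sum of -log2 P(next | current).
bits = sum(-math.log2(prob(cur, nxt)) for cur, nxt in zip(tokens, tokens[1:]))
print(f"{bits:.1f} bits total, {bits / (len(tokens) - 1):.2f} bits/token")
```

On this toy text the bigram model needs about 0.8 bits per token, versus roughly 2.8 bits for a model that treats all seven vocabulary words as equally likely; the better predictor encodes the same text in far fewer bits. A large neural network pushes the same idea much further, and compressing that well forces it to learn regularities of whatever process produced the text.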


In general, Generative AI based on Large Language Models operates on the following principles:

i) It encapsulates knowledge through highly compressed text storage.

ii) It uses efficient generation and re-sequencing of language tokens to demonstrate knowledge (a minimal sketch of this follows the list).

iii) It relies on high-density and continuously shrinking AI chips and semiconductors for information storage and retrieval.

iv) It utilizes smaller, more energy-efficient, and space-saving AI chips and semiconductors to reduce computing costs and business operating expenses.
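To make principle ii) concrete, here is a minimal, self-contained Python sketch of autoregressive token generation. A hand-written probability table stands in for the billions of trained weights in a real Large Language Model, so the generation loop, not the model, is the point; every word and number in it is invented for illustration.

```python
# A toy autoregressive generator: emit one token at a time, each sampled from
# the model's distribution over what comes next. The table below is a stand-in
# for learned weights; it is not taken from any real model.
import random

NEXT_TOKEN_PROBS = {
    "<start>": {"the": 1.0},
    "the":     {"cat": 0.5, "mat": 0.25, "rat": 0.25},
    "cat":     {"sat": 0.5, "ate": 0.5},
    "sat":     {"on": 1.0},
    "ate":     {"the": 1.0},
    "on":      {"the": 1.0},
    "mat":     {"<end>": 1.0},
    "rat":     {"<end>": 1.0},
}

def generate(seed=0, max_tokens=12):
    rng = random.Random(seed)
    token, output = "<start>", []
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS[token]
        # Sample the next token, append it, and condition on it next step.
        token = rng.choices(list(dist), weights=list(dist.values()))[0]
        if token == "<end>":
            break
        output.append(token)
    return " ".join(output)

print(generate(seed=1))  # prints one sampled sentence, e.g. "the cat ate the mat"
```

Each choice here is conditioned only on the previous word, whereas a real model attends to the entire preceding context, but the mechanism of re-sequencing tokens into fluent output is the same loop.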

The application of these principles has made Large Language Models powerful tools capable of processing and analyzing vast amounts of information and exhibiting remarkable language generation abilities. They not only capture part of the knowledge and information of human society but also become more useful in applications by aligning with human values.

However, like any powerful technology, Large Language Models are double-edged swords, carrying both immense potential and ethical risks. A paramount concern is the potential misuse of ChatGPT, through which biases can be propagated and misleading information disseminated, influencing crucial matters such as election outcomes. Additionally, the statistical nature of Large Language Models may keep them from fully comprehending the meaning, context, emotion, and sentiment of the sentences they encounter or generate, frustrating users and distressing individuals with anxiety disorders. Moreover, the inherent randomness of sentence generation based on statistical probabilities occasionally leads LLM chatbots to produce fictional answers, misleading human users and creating farcical situations.
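That randomness is easy to demonstrate. The Python sketch below samples a next token from a softmax over three hypothetical scores at several "temperature" settings, a knob real chatbots also expose; the scores are invented for illustration and are not drawn from any actual model.

```python
# A toy illustration of sampling temperature. Higher temperature flattens the
# distribution, so lower-probability (and possibly wrong) tokens win more often.
import math
import random
from collections import Counter

# Hypothetical scores for the next token after "The capital of Australia is".
logits = {"Canberra": 3.0, "Sydney": 1.0, "Melbourne": 0.5}

def sample(temperature, rng):
    # Softmax with temperature, then draw one token from the result.
    weights = {tok: math.exp(score / temperature) for tok, score in logits.items()}
    total = sum(weights.values())
    r, acc = rng.random() * total, 0.0
    for token, w in weights.items():
        acc += w
        if r <= acc:
            return token
    return token  # fallback for floating-point edge cases

rng = random.Random(0)
for T in (0.5, 1.0, 2.0):
    counts = Counter(sample(T, rng) for _ in range(1000))
    print(f"T={T}:", dict(counts))
```

At T=0.5 the toy model answers "Canberra" about 98% of the time; at T=2.0 the plausible but wrong "Sydney" is drawn roughly a fifth of the time. In a real chatbot, this kind of sampling randomness is one mechanism by which confidently worded fictional answers emerge.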

Furthermore, biases and gaps in the training data can skew the modeling results of Large Language Models, leading to inaccurate or misleading outputs. Fine-tuning a Large Language base model (an LLM foundation model) on data that reflects societal biases may inadvertently perpetuate those biases in its responses.

Another significant concern is the substantial environmental impact of training and operating Large Language Models. Both training and day-to-day operation rely heavily on power-hungry supercomputers. For instance, the Microsoft data centers that train and serve OpenAI's GPT-3, ChatGPT, and GPT-4 consume substantial amounts of electricity and require considerable quantities of water for cooling.

During an event held at MIT in April 2023, OpenAI CEO Sam Altman revealed that the development cost of GPT-4 exceeded $100 million. Analysts have estimated that the daily operational expense of ChatGPT in answering user queries amounts to at least $700,000. The computational resources required to train and maintain these large models, along with the resulting carbon emissions, present a pressing environmental concern.

In summary, while we enthusiastically embrace Large Language Models as valuable human assistants, we must bear in mind that they are ultimately tools that lack true intelligence. It is crucial to continuously monitor and evaluate the performance and impact of Artificial Intelligence technologies in real-world applications, and to put reasonable regulations in place through legislation. This approach will let us use Artificial Intelligence to deal more effectively with the complexity and unpredictability of the real world, ensuring that AI technology delivers the greatest benefit to humanity.


~ This is the English translation of my article published on the Malaysian Chinese news portal Oriental Daily

Note: The translation was by ChatGPT


Reference

The Economist. (2023). Large, creative AI models will transform lives and labour markets. [online] Available at: https://www.economist.com/interactive/science-and-technology/2023/04/22/large-creative-ai-models-will-transform-how-we-live-and-work [Accessed 27 May 2023].

The Economist. (2023). Artificial brains are helping scientists study the real thing. [online] Available at: https://www.economist.com/science-and-technology/2023/05/24/artificial-brains-are-helping-scientists-study-the-real-thing [Accessed 27 May 2023].

The Economist Global Business Review (经济学人·商论) (2023). How much potential do Large Language Models (LLMs) have? (大型语言模型(LLM)的潜力有多大?). [online] Weixin Official Accounts Platform. Available at: https://mp.weixin.qq.com/s?__biz=MjM5MjA1Mzk2MQ==&mid=2650912061&idx=2&sn=38af76a4a6a303e767363355419d2e8e [Accessed 27 May 2023].

YouTube. (2023). [Sharing] May 7 | Qi Lu (陆奇) | "New Paradigm, New Era, New Opportunities" (《新范式 新时代 新机会》) | Official full HD subtitled version | MiraclePlus (奇绩创坛). [online] Available at: https://www.youtube.com/watch?v=-LECKZqygzk [Accessed 27 May 2023].

YouTube. (2023). CONFERENCE JENSEN HUANG (NVIDIA) and ILYA SUTSKEVER (OPEN AI). AI TODAY AND VISION OF THE FUTURE. [online] Available at: https://www.youtube.com/watch?v=ZZ0atq2yYJw [Accessed 27 May 2023].

Mok, A. (2023). ChatGPT could cost over $700,000 per day to operate. Microsoft is reportedly trying to make it cheaper. [online] Business Insider. Available at: https://www.businessinsider.com/how-much-chatgpt-costs-openai-to-run-estimate-report-2023-4 [Accessed 27 May 2023].

Knight, W. (2023). OpenAI’s CEO Says the Age of Giant AI Models Is Already Over. [online] Wired. Available at: https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/ [Accessed 27 May 2023].

Li, P., Yang, J., Islam, M. and Ren, S. (2023). Making AI Less ‘Thirsty’: Uncovering and Addressing the Secret Water Footprint of AI Models. [online] Available at: https://arxiv.org/pdf/2304.03271.pdf [Accessed 27 May 2023].

Gendron, W. (2023). ChatGPT needs to ‘drink’ a water bottle’s worth of fresh water for every 20 to 50 questions you ask, researchers say. [online] Business Insider. Available at: https://www.businessinsider.com/chatgpt-generative-ai-water-use-environmental-impact-study-2023-4 [Accessed 27 May 2023].

Foster, L. (2023). AI Chatbots Guzzle Water. Why That’s a Problem. [online] www.barrons.com. Available at: https://www.barrons.com/articles/ai-chatbots-water-shortage-google-chatgpt-openai-microsoft-b86da898 [Accessed 27 May 2023].

Atillah, I.E. (2023). AI chatbot blamed for ‘encouraging’ young father to take his own life. [online] euronews. Available at: https://www.euronews.com/next/2023/03/31/man-ends-his-life-after-an-ai-chatbot-encouraged-him-to-sacrifice-himself-to-stop-climate- [Accessed 27 May 2023].

Bharade, A. (2023). A widow is accusing an AI chatbot of being a reason her husband killed himself. [online] Business Insider. Available at: https://www.businessinsider.com/widow-accuses-ai-chatbot-reason-husband-kill-himself-2023-4 [Accessed 27 May 2023].
