Large Language Models, ChatGPT, and You
Our brains are organic GPUs, liquid cooled to 98.6 degrees.
English and math have always been two different subjects in school. But what if I were to tell you that English is math? Or, should I say, that how we compose sentences is nothing more than pattern recognition, statistical analysis, and some handy repetition. Now, I'm not going to get into the infinitely complex topic of explaining how humans learn and use language, but I'll pose an idea: at their core, computers are super-powered calculators. If a computer can successfully communicate with humans using written language, does that mean we can use math to speak? The answer is yes.
If you've been on the internet recently, you've heard of something called ChatGPT. The name explains a lot about the concept. GPT stands for Generative Pre-trained Transformer, which is a fancy way of saying "computer program trained to generate human-like text." ChatGPT enables you to chat with a GPT model. It's completely broken the internet, and the conversations can sometimes feel like we've finally reached the future that Terminator depicted. But they also leave you wondering, "How the heck does this GPT thing work?"
The Generative Pre-trained Transformer is many things in one. It's Generative (it generates new text), Pre-trained (it has already seen a lot of text from various places), and a Transformer (it transforms input text into output text). GPT is actually one implementation of a Large Language Model (LLM). And a key innovation that enables all of these LLMs to work is called embedding. We'll start there.
An embedding model is a black box that converts words into coordinates on a graph. It can take single words, phrases, sentences, or even whole paragraphs. The goal of embedding is to be the bridge between what we read and what computers read. Compare the coordinates of similar phrases ("dog", "puppy", "golden retriever") and you'll find they're graphed near each other. The same idea extends to sentences: the order of the words is preserved and the embedding is applied to the whole sequence, through a process called positional encoding. I can't explain how this works under the hood, but let's continue knowing we have this magical black box.
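To make "nearby on the graph" a little more concrete, here's a toy sketch in Python. The three-dimensional vectors below are entirely made up for illustration (real embedding models produce hundreds or thousands of dimensions), but they show how closeness is typically measured: cosine similarity between the coordinate vectors.

```python
# Toy illustration of embeddings: each word becomes a vector ("coordinates"),
# and similar words end up close together. These 3-D vectors are invented for
# the example; real embedding models output much higher-dimensional vectors.
import numpy as np

embeddings = {
    "dog":              np.array([0.90, 0.80, 0.10]),
    "puppy":            np.array([0.85, 0.90, 0.15]),
    "golden retriever": np.array([0.88, 0.75, 0.20]),
    "spreadsheet":      np.array([0.05, 0.10, 0.95]),
}

def cosine_similarity(a, b):
    """Close to 1.0 means 'pointing the same direction'; near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ["puppy", "golden retriever", "spreadsheet"]:
    score = cosine_similarity(embeddings["dog"], embeddings[word])
    print(f"dog vs {word}: {score:.2f}")
```

Run it and "dog" scores high against "puppy" and "golden retriever" but low against "spreadsheet", which is all that "graphed nearby each other" really means.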
When you can turn words into numbers, Wikipedia becomes a Sudoku puzzle. Researchers at companies like OpenAI (the creators of ChatGPT) were able to scrape billions of lines of text from all over the internet, run them through an embedding model, and feed them into a machine learning model (the transformer) that could learn the abstract patterns we use when communicating ideas through language. That's why these models are called Large Language Models: they're very large and very language.
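If you're curious what "feeding text into a transformer" looks like mechanically, here's a minimal sketch using PyTorch. Everything in it, the vocabulary size, the model dimensions, the random stand-in "tokens", is invented to keep the example self-contained, and it skips details like the positional encoding mentioned above. But the training objective is the real one: predict the next token, measure how wrong you were, nudge the numbers, repeat over billions of lines of text.

```python
# A miniature sketch of the core LLM training step: given the tokens so far,
# predict the next one. Sizes and data are placeholders, not real settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_model, seq_len = 100, 64, 16

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)                 # token -> coordinates
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)  # the "T" in GPT
        self.to_logits = nn.Linear(d_model, vocab_size)                # coordinates -> next-token scores

    def forward(self, tokens):
        # Causal mask: each position may only look at the tokens before it.
        size = tokens.size(1)
        mask = torch.triu(torch.full((size, size), float("-inf")), diagonal=1)
        hidden = self.transformer(self.embed(tokens), mask=mask)
        return self.to_logits(hidden)

model = TinyLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# A pretend batch of text: the target at position t is simply the token at t+1.
tokens = torch.randint(0, vocab_size, (8, seq_len))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = model(inputs)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
optimizer.step()
print(f"next-token prediction loss: {loss.item():.2f}")
```

That loop, repeated at an absurd scale over real text instead of random numbers, is where the "abstract patterns" get learned.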
When you ask ChatGPT something like "What is the meaning of life?", does it actually know the answer? Sadly, the answer is no. ChatGPT doesn't think; it graphs. In a rapid-fire sequence of linear algebra, ChatGPT takes the question you asked, encodes it, finds nearby text in its coordinate space, and returns that text to you in a way that is statistically similar to other answers it has seen to this question.
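To see "statistical choice of the next word" in miniature, here's a deliberately silly version of the same idea built from nothing but a counting table. The three sentences of "training data" are made up for this example, and a real model has billions of learned parameters instead of a dictionary of counts, but the generation loop, pick a plausible next word given what came before and repeat, has the same shape.

```python
# A tiny version of "it doesn't think; it graphs": count which word tends to
# follow which in some example text, then reply by sampling likely next words.
import random
from collections import defaultdict

corpus = (
    "the meaning of life is to find your gift "
    "the meaning of life is what you make of it "
    "the purpose of life is a life of purpose"
).split()

# Build bigram counts: next_words["life"] ends up as {"is": 3, "of": 1}.
next_words = defaultdict(lambda: defaultdict(int))
for current, following in zip(corpus, corpus[1:]):
    next_words[current][following] += 1

def generate(start, length=8):
    word, output = start, [start]
    for _ in range(length):
        candidates = next_words.get(word)
        if not candidates:
            break
        # Sample in proportion to how often each word followed in the "training" text.
        choices, weights = zip(*candidates.items())
        word = random.choices(choices, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # one possible output: "the meaning of life is what you make of"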
A depressing way to look at this would be to evaluate how I'm writing this memo right now. Each word I type on this page is my brain making the best statistical choice, given the words I've written so far, my historical knowledge of how I should structure these words, and the vocabulary at my disposal. In the end, writing is just the regurgitation of twenty-eight years of words and sentences that I've been exposed to, strung together in a way that satisfies some innate optimization problem wired up inside of my premotor cortex.
I've been having a lot of fun playing with ChatGPT and the other models OpenAI has available. Learning more about these foundational technologies has been intellectually enjoyable, but it's also pushed me to wonder about more fundamental philosophical questions. How does the world change if large language models become widespread? What does that mean for our understanding of language, consciousness, and creativity? I don't have any definitive answers yet, but maybe I'll find some insights by asking ChatGPT.