Carolina philosophy professor Thomas Hofweber founded the UNC AI Project to try to answer difficult — and disturbing — questions about artificial intelligence.
A just machine to make big decisions / Programmed by fellas with compassion and vision / We’ll be clean when their work is done / We’ll be eternally free, yes, and eternally young (Donald Fagen, “I.G.Y.,” The Nightfly)
The Office of the Provost this year formed the UNC Generative AI Committee to provide guidelines and principles to help faculty and students navigate the use of artificial intelligence in the classroom, particularly ChatGPT, a language model-based chatbot that can create original essays, software code and other works. (See “AI Is On Campus. Calm Down,” July/August 2023 Review.) But AI is not done evolving, not by a long shot. Humans still have a long way to go before we comprehend the implications of AI.
Enter the UNC AI Project, the brainchild of Carolina philosophy professor Thomas Hofweber. He’s gathered faculty, students and visiting scholars to ponder and research the societal implications of an AI that is no longer science fiction. What is it, exactly? A harbinger of a glorious new world or a foreboding tool teetering toward dystopia? Or neither? Or both?
The Review sat down with Hofweber to delve deeper into those questions and learn just how concerned we should be about AI. The conversation was edited for length and clarity.
We want to make advances on philosophical and foundational questions concerning artificial intelligence and virtual worlds, using a multidisciplinary approach. Anyone can be involved, but it started with a few of us from philosophy, computer science and linguistics.
A number of questions are broadly philosophical. What is the difference between biological intelligence and artificial intelligence? Is the latter just an attempt to copy the former? How do new AIs relate to human intelligence? Does AI based on machine learning lead to an alien form of intelligence, or one that is very similar to human intelligence?
Old AIs involve giving computers explicit instructions to do something, like calculating taxes or programming your thermostat. We understand these very well.
The newer AIs, like ChatGPT, are complex neural networks that are trained on vast amounts of data using a learning algorithm. The AI developer specifies what the network looks like, what the training data is and how the AI should train on that data. At the end, you have a trained neural network, but you don’t know precisely what it does or what is going on inside. It’s just a result of a training process. But we don’t always understand why it works so well.
We know how it’s constructed and what the model is, but we don’t understand the connection between the basic setup of the network and why it can suddenly explain math equations. We have some sense, but we don’t understand it completely.
Think of it this way. Large language models have been around, but they were not very intelligent. They could produce some text, but they couldn’t explain physics, for instance. GPT-3 was a breakthrough that’s only been around for three years. It was trained for weeks or months on giant server farms that use as much energy as a city. It is so complex, and we do understand a lot about it. But we do not understand AI to the point where we know exactly why it’s good at this or not. This is similar to trying to understand human intelligence. We know a lot about the human brain, but there is so much we don’t know. And it took decades to even figure out which parts of the brain do what. It’s not impossible to find out how AI works, precisely, like the human brain. But it will be hard.
Well, when GPT-3 came out, it was bad at some things. It was worse at basic math than writing a sonnet. You’d think it would be good at math. I was trying to figure this out when the next generation of GPT came out, and it was great at math. But we still don’t completely understand why the early generations had flaws. My sense is it’s not fully understood because progress is so fast. If we had many years to study GPT-3 we’d learn a lot more about it. But we can’t study it perfectly because it’s proprietary technology.
The AI creators do understand a lot about AI, of course, but consider what the creators do when they make an AI language model: design architecture and the training process, spend millions of dollars, use a half-a-sun’s worth of energy, and in the end they have a model that does surprising things. So it’s not like AI is a black box that we don’t understand at all, but we don’t fully understand why it’s so good at this or that and not great at other things.
It’s like we know what the human brain looks like. We can take a brain apart and see neurons and different parts of the brain. But we don’t really know why it does these weird things. AI is kind of like this. It’s hard to figure out because the entire thing is so complex.
Yes. A while back some of us thought we should get together and talk about these issues. Computer scientists don’t know much about philosophy and vice versa. So we decided to get together to collaborate, host discussions and do research.
Last semester, we had a language model reading-group discussion of research papers. We had people give talks and discuss their research. We got to know each other, and now [Professor] Mohit Bansal and [doctoral student] Peter Hase, two UNC computer scientists, and I got a seed grant from the UNC School of Data Science and Society to study questions of rationality in language models. We hope to find out whether language models strive for logical and probabilistic coherence. This is something the human mind does naturally. For example, when I notice that I have some contradictory beliefs, then I revise them in a way to achieve a coherent state by giving up one or the other of the contradictory beliefs. The question is whether language models also aim for such a coherent state. This question is not easy to answer, since language models are generally trained on incoherent training data: just whatever text you can find on the internet.
Part of our project is to build a special language model trained only on coherent text, and then to test questions about rationality on it. For example, if we revise one of its beliefs, will it then also revise connected beliefs automatically to achieve a coherent state? We hope to have an answer soon.
This semester the AI Project is focusing on issues around “interpretability” and “explainability.” We want to understand models so we know what they are doing and why, including their capabilities and their limitations, so we are in a position to control them.
They have to do with explaining or interpreting why an AI model does certain things. Here’s a very simplistic example: If we use an AI model for a decision that says you do not get parole or a bank loan, well, why not? The answer should not be, “because the AI model says so.” We would want to know exactly why it says so. Don’t we have a right to an explanation? Shouldn’t this be transparent? This is the sort of thing our AI Project will investigate.
AI has the potential to be better than humans at almost everything, and in some ways vastly better. ChatGPT can write an essay in five seconds, where it might take a human a week. And it knows almost everything that we humans know. I’ve tested it against things I know a lot about, and it knows a lot about my own specialty. Well, AI can do that well in all other areas. I can’t.
We can see a time not far off where AI will be vastly better than humans at so many things. This could change humanity’s overall situation and our civilization. The future of humanity could be affected in a way that calculators, personal computers and the internet have no potential for doing.
That’s a risk, but it’s unclear. We might get worse at things AI is good at, but we might then focus on other things. It might balance out. There’s this notion of a dystopian future where AI does so much and humans become lazy and soft. I’m not sure that’s what will happen. What if we decide to spend time on things like service to each other or meditation or other positive things? Regardless, this concern is not top of mind for me.
The long-term worry is that AI will be vastly more intelligent than humans, and we can’t be sure what that will mean for us.
Consider chimpanzees compared to humans. Chimps are much stronger but slightly less intelligent. Still, chimps are almost extinct. And their extinction depends mostly on what humans are doing, not what chimps are doing. So their future existence is out of their hands at this point, and the reason is a slight difference in intelligence. The difference in intelligence between chimps and humans is not that large. AI has the potential to be vastly more intelligent than humans.
The other thing is human beings don’t necessarily want chimps to go extinct. We’re just doing our thing, and we take over spaces and resources, and our actions wind up affecting chimps.
So, that’s the analogy I worry about. AI might not even have anything against us, but it needs energy for servers, and that might be in conflict with our need for energy for food production, for instance.
No. There’s one concern that if AI perceives humans as a threat, then it will use robots with machine guns to control us. That’s not going to happen. Instead, it will use intelligence to control us so that some humans control other humans, draw away resources and maybe use advanced bioweapons against us. This is so much more realistic. But AI violence against us is not my concern. If AI works against us, on purpose or not, it will be a gradual process. We’ll wind up having access to fewer resources. More and more control could be taken up by AI systems to the point we humans might actually lose control.
I don’t think our worries require AI to be sentient. It can be in control without being sentient. Sentience is a subjective quality of experience. Like when you taste and feel something, it’s subjective to you. But being aware of your existence or your states, this can happen without sentience, like a thermostat is aware of when the temperature rises or falls. More complex AI systems could gain a sense of their own state of existence without sentience.
Depending on how it goes, we might in retrospect say we should’ve stopped at the beginning and put a strict prohibition on this. But if it goes well, it would be glorious. There is so much uncertainty and ignorance that it’s hard to make an assessment. On the one hand you are playing with humanity. On the other hand, some think we humans might be building something so much greater than us, and they think AI development is like a transhumanist movement, as if humans are a limited, earlier stage in the development of intelligence. Some think future AI systems will be much greater than us, and we’re doing a great thing, even if we go extinct. So, there are a lot of questions about the value of humanity compared to other forms of intelligence.
Yes, that is one of the options on the table; we might be like chimps being pushed towards extinction by a more powerful intelligence. But there are also many other options. We might be like ants that live in parallel with a higher intelligence, without having any clear sense of what this higher intelligence is doing. We might soon be like an ant colony living next to a Google server farm, with no conception of what is going on inside of the building, but also with no immediate risk to our own existence. Or it might all go more positively, where AI is a powerful tool that is fully under our control and comprehension and which will lead to vast breakthroughs in science and productivity.
I think it is fair to say that nobody knows how all this will turn out, but also that what happens now and in the near future will have a great impact on how it will go. I suspect we will know much more about the direction the needle is moving in a decade or so.
— Mark Derewicz