Chronolink: [2023-12-21]

Research into representations and uncertainty in neuro-AI.

I was able to collect the PhD direct-passage application form.

Reconnected with the Kindle by starting to read A Brief History of Intelligence.

Brief idea: one could automate grounded theory with LLMs by scanning a directory of text files and having the LLM generate a list of topics (or providing a topic seed…). Maybe compare the analyses of different LLMs. Or even read theory out of the text and then generate a list of topics that reference the theory. A rough sketch follows below.
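A minimal sketch of that idea, assuming the OpenAI Python client (openai >= 1.0) with an API key in the environment; the model name, directory path, and prompt wording are placeholders, and the "topic seed" is passed through as an optional hint. This only covers a first open-coding pass, not the full grounded-theory workflow.

```python
# Sketch: a first "open coding" pass of grounded theory using an LLM.
# Assumptions: `openai` (>=1.0) installed, OPENAI_API_KEY set, model name is a placeholder.
from pathlib import Path
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder; any chat-capable model would do


def topics_for_file(text: str, seed: str | None = None) -> str:
    """Ask the LLM for candidate topics (open codes) in one document."""
    seed_hint = f" Focus on topics related to: {seed}." if seed else ""
    prompt = (
        "You are assisting with grounded theory analysis. "
        f"List the main topics (open codes) in the following text.{seed_hint}\n\n{text}"
    )
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def analyse_directory(directory: str, seed: str | None = None) -> dict[str, str]:
    """Scan a directory of .txt files and collect a per-file topic list."""
    results = {}
    for path in sorted(Path(directory).glob("*.txt")):
        results[path.name] = topics_for_file(path.read_text(encoding="utf-8"), seed)
    return results


if __name__ == "__main__":
    for name, topics in analyse_directory("./interviews").items():
        print(f"--- {name} ---\n{topics}\n")
```

Comparing different LLMs would then just mean re-running `analyse_directory` with different values of `MODEL` and diffing the resulting topic lists.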

Transcript excerpt:

The only existing model we have that works is the human mind. And that seems to work on quite different principles than just scale.

Yes. I think something that's very interesting about transformers, though, is that, although I was arguing before that I don't like this neuron analogy, large language models like ChatGPT are actually the closest thing we have to something like the human brain. They do, just as I expressed, map an input to an output, but they also have this kind of short-term memory, which is the context window. So in a sense, and it's ironic that we don't refer to that in terms of the neural analogy, that to me is the first thing we've built that looks something like the human mind. You have this context, and it makes the next prediction of the token based on that context. And in principle the system itself could operate on the previous context: it could summarize it, it could file things away. It could ask itself to generate different hypotheses to explain something, compare them, decide on something, and use that as a sort of scratch memory, in the same way that we have a working memory. But strangely, we don't refer to that in terms of the neural analogy, which I find quite ironic. I don't know if people are working on that kind of thing.

It seems that people are working on everything, but I haven't personally read any work in which the transformer system goes back and edits things in its past context.

Yes. But I assume that would be one direction you might go to try to make this system do something that's more like thinking. In the end, a purely feedforward system can't really do anything sophisticated. You probably need to be doing some kind of manipulation, if only to generate internal consistency. There's no way a large language model can have internal consistency: it has learned everything on the internet. It thinks that the Earth is both flat and round, in different statistical proportions. Hopefully most people on the internet think it's round, and that's the conclusion it has come to, but in the weights somewhere is Earth-flatness. So to get to another level of cognition, you're going to need something that builds an internally consistent model of what's out there. Whether that requires, as you might argue, interaction with the real world, or whether it can be done purely in the domain of language, remains to be seen. But I think that might be one direction that people would take things.

You know, the body of knowledge of humans is a kind of virtual phenomenon that supervenes on all of us physical earthlings. This infosphere that we've created is like a symbiotic organism, and it has consistent artifacts of knowledge, as you said. But many humans do hold the view that the Earth is flat, so it's just another example of this interesting kind of levels of emergence.

But they hold an internally consistent view that the world is flat. As far as they're concerned, it's internally consistent. Obviously there are inconsistencies that are quite easily proven, but within their mind they have a model of the world whereby they explain everything. If the moon is flat, well, obviously the sun must be flat as well; and when you look at the horizon of the sea it looks flat, and consequently the Earth can't be round. They explain away other phenomena and build up a model that backs up their hypothesis. And there's no sense in which a transformer system is doing anything like that. It just starts at the beginning, predicts the next word, and has statistics that are consistent with what's previously in its context window.

Yes. You could argue, though, that human brains are also very chaotic, but that we have confabulation and post-hoc rationalization in much the same way. Subconsciously we hold conflicting views, but when we try to explain our views, to avoid cognitive dissonance, we try to reduce what we think to something simple. Right?

But we have a finite number of views that are sort of partially rounded theories of the world. The large language model has everything that humanity has ever created, with no preference for one thing over another other than its statistical likelihood. So yes. Even if you have multiple conflicting views of the world, there's that famous Walt Whitman poem where he says, "Do I contradict myself? Very well, I contradict myself; I am large, I contain multitudes." That is human beings captured. But we don't have every view on everything simultaneously. We're trying, on some level, to come up with consistent models of the world. And we need to do that because we need to take actions in the world, and it's impossible to do that if you have 50,000 conflicting views of how things work.

And this is really interesting, because Hinton says one of the reasons why ChatGPT is a kind of superintelligence is that it knows all of the things. But I would argue, as you just did, that we are bounded as observers: there's a computational restriction on how many things a single observer can understand at one time. We'll get more into this later. But I think with cognition it's not just knowing, it's also thinking. So just knowing everything isn't actually the whole piece, is it?

Yeah. Can you deduce new facts? I think in one of your other podcasts you talked about whether, if you trained ChatGPT with data only up to the early 20th century, it would be able to reproduce Einstein's theory of relativity. I think we all know the answer to that. It wouldn't.

Definitely not. And what are the missing pieces?

That gets at what I was saying before: to build that theory, you need to have a model of the world. And you need to realize that the model of the world is wrong, that certain facts (I don't personally believe you need to observe those facts yourself) are inconsistent with that theory. And then you need to somehow come up with a new model that itself will make new predictions about the world that people can go and test, in the case of physics. But I think that happens on a more minor scale too, with your theory about how businesses work, or how your friend's personality works and how best to interact with them. You have theories about everything that occasionally break, and you have to radically rethink them.

From a computational point of view, they are finite state automata. I mean, you're saying any finite…
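A toy sketch of the "scratch memory" idea from the transcript above: an agent loop in which, after each turn, the model is asked to rewrite its own running notes (summarize settled points, keep competing hypotheses, file things away) instead of carrying the full raw history forward. The `call_llm` helper is hypothetical and would need to be wired to whatever chat-completion API is available; the word limit and prompt wording are arbitrary choices.

```python
# Toy sketch of "the system operating on its own previous context":
# a compact, model-edited scratchpad replaces raw history between turns.
# `call_llm` is a hypothetical helper; plug in any chat-completion API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("connect this to an actual chat-completion API")


def answer_with_scratchpad(questions: list[str]) -> list[str]:
    scratchpad = ""  # the model's editable "working memory"
    answers = []
    for q in questions:
        answer = call_llm(
            f"Scratchpad (your own notes so far):\n{scratchpad}\n\n"
            f"Question: {q}\nAnswer concisely."
        )
        answers.append(answer)
        # Let the model edit its past context: compress, keep hypotheses, drop noise.
        scratchpad = call_llm(
            "Rewrite the scratchpad below so it stays under ~200 words. "
            "Summarize settled points, keep competing hypotheses explicitly, "
            "and discard anything no longer useful.\n\n"
            f"Old scratchpad:\n{scratchpad}\n\n"
            f"Latest question: {q}\nLatest answer: {answer}"
        )
    return answers
```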