ORIGINAL FATHER OF AI ON DANGERS! (Prof. Jürgen Schmidhuber)

Published 2023-08-13
Please check out Numerai - our sponsor @
numer.ai/mlst

Patreon: www.patreon.com/mlst
Discord: discord.gg/ESrGqhf5CB

Professor Jürgen Schmidhuber, the father of artificial intelligence, joins us today. Schmidhuber discussed the history of machine learning, the current state of AI, and his career researching recursive self-improvement, artificial general intelligence and its risks.

Schmidhuber pointed out the importance of studying the history of machine learning to properly assign credit for key breakthroughs. He discussed some of the earliest machine learning algorithms. He also highlighted the foundational work of Leibniz, who discovered the chain rule that enables training of deep neural networks, and the ancient Antikythera mechanism, the first known gear-based computer.
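
Leibniz's chain rule is the workhorse behind training deep networks: the derivative of a composition is the product of the derivatives of its parts, which is exactly what backpropagation applies layer by layer. A minimal sketch (a generic two-layer toy example of my own, not any specific system from the interview):

```python
import math

# Chain rule: d/dx f(g(x)) = f'(g(x)) * g'(x).
# Toy "network": y = tanh(w2 * tanh(w1 * x)).
def forward_and_grad(x, w1, w2):
    h = math.tanh(w1 * x)               # hidden activation
    y = math.tanh(w2 * h)               # output
    dy_dh = (1 - y * y) * w2            # derivative through the outer layer
    dy_dw1 = dy_dh * (1 - h * h) * x    # chained through the inner layer
    return y, dy_dw1

# Sanity check against a numerical derivative.
x, w1, w2, eps = 0.5, 0.8, -1.2, 1e-6
y, g = forward_and_grad(x, w1, w2)
y2, _ = forward_and_grad(x, w1 + eps, w2)
print(abs((y2 - y) / eps - g))  # tiny: analytic and numeric gradients agree
```

Stacking more layers just multiplies in more factors, which is why the 1676 rule scales to arbitrarily deep networks.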

Schmidhuber discussed limits to recursive self-improvement and artificial general intelligence, including physical constraints like the speed of light and what can be computed. He noted we have no evidence the human brain can do more than traditional computing. Schmidhuber sees humankind as a potential stepping stone to more advanced, spacefaring machine life which may have little interest in humanity. However, he believes commercial incentives point AGI development towards being beneficial and that open-source innovation can help to achieve "AI for all" symbolised by his company's motto "AI∀".

Schmidhuber discussed approaches he believes will lead to more general AI, including meta-learning, reinforcement learning, building predictive world models, and curiosity-driven learning. His "fast weight programming" approach from the 1990s involved one network learning to rewrite another network's connection weights. It is now recognised as the first Transformer variant, today called an unnormalised linear Transformer. He also described the first GANs, which he introduced in 1990 to implement artificial curiosity.
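
The fast-weight/linear-Transformer correspondence can be sketched in a few lines. This is a minimal illustration of the idea under my own naming and dimensions, not Schmidhuber's original 1991 formulation: a weight matrix is "reprogrammed" on the fly by outer products of values and keys, and reading it out with queries reproduces unnormalised linear attention.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 6                      # feature dimension, sequence length

# A "programmer" network would emit keys, values and queries from the
# input stream; random projections stand in for it here.
X = rng.normal(size=(T, d))
Wk, Wv, Wq = (rng.normal(size=(d, d)) for _ in range(3))
K, V, Q = X @ Wk, X @ Wv, X @ Wq

# Fast-weight view: one network rewrites another's weights additively.
W = np.zeros((d, d))
fast_out = []
for t in range(T):
    W += np.outer(V[t], K[t])    # reprogram the slow net's weight matrix
    fast_out.append(W @ Q[t])    # read it out with the current query
fast_out = np.stack(fast_out)

# Unnormalised linear-attention view of the same computation:
# out_t = sum_{s<=t} (k_s . q_t) * v_s
attn_out = np.stack([
    sum((K[s] @ Q[t]) * V[s] for s in range(t + 1)) for t in range(T)
])

print(np.allclose(fast_out, attn_out))  # the two views coincide
```

The outer-product update makes the equivalence exact: expanding `W @ Q[t]` gives precisely the causal attention sum, without the softmax normalisation of the 2017 Transformer.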

Schmidhuber reflected on his career researching AI. He said his fondest memories were gaining insights that seemed to solve longstanding problems, though new challenges always arose: "then for a brief moment it looks like the greatest thing since sliced bread and then you get excited ... but then suddenly you realize, oh, it's still not finished. Something important is missing." Since 1985 he has worked on systems that can recursively improve themselves, constrained only by the limits of physics and computability. He believes continual progress, shaped by both competition and collaboration, will lead to increasingly advanced AI.

On AI Risk, Schmidhuber said: "To me it's indeed weird. Now there are all these letters coming out warning of the dangers of AI. And I think some of the guys who are writing these letters, they are just seeking attention, because they know that AI dystopias are attracting more attention than documentaries about the benefits of AI in healthcare."

Schmidhuber believes we should be more concerned with existing threats like nuclear weapons than speculative risks from advanced AI. He said: "As far as I can judge, all of this cannot be stopped but it can be channeled in a very natural way that is good for humankind...there is a tremendous bias towards good AI, meaning AI that is good for humans...I am much more worried about 60 year old technology that can wipe out civilization within two hours, without any AI.”

However, Schmidhuber acknowledges there are no guarantees of safety, saying "there is no proof that we will be safe forever." He supports research on AI alignment but believes there are limits to controlling the goals of advanced AI systems.

Schmidhuber: "There are certain algorithms that we have discovered in past decades which are already optimal in a way such that you cannot really improve them any further, and no self-improvement and no fancy machine will ever be able to further improve them... There are fundamental limitations of all of computation and therefore there are fundamental limitations to any AI based on computation."

Overall, Schmidhuber believes existential risks from advanced AI are often overhyped compared to existing threats, though continued progress makes future challenges hard to predict. While control and alignment are worthy goals, he sees intrinsic limits to how much human values can ultimately be guaranteed in advanced systems. Schmidhuber is cautiously optimistic.

Note: Interview was recorded 15th June 2023.
twitter.com/SchmidhuberAI

Panel: Dr. Tim Scarfe @ecsquendor / Dr. Keith Duggar @DoctorDuggar

Pod version: podcasters.spotify.com/pod/show/machinelearningstr…

TOC:
[00:00:00] Intro / Numerai
[00:00:51] Show Kick Off
[00:02:24] Credit Assignment in ML
[00:12:51] XRisk
[00:20:45] First Transformer variant of 1991
[00:47:20] Which Current Approaches are Good
[00:52:42] Autonomy / Curiosity
[00:58:42] GANs of 1990
[01:11:29] OpenAI, Moats, Legislation

All Comments (21)
  • References:
    - Metalearning Machines Learn to Learn (1987-): people.idsia.ch/~juergen/metalearning.html - Schmidhuber's overview page on metalearning.
    - 2022 survey: people.idsia.ch/~juergen/deep-learning-history.htm… - Schmidhuber's overview of the history of modern AI and deep learning.
    - Leibniz, who published the chain rule in 1676; the chain rule is foundational for training neural networks: en.wikipedia.org/wiki/Gottfried_Wilhelm_Leibniz people.idsia.ch/~juergen/leibniz-father-computer-s…
    - Heron of Alexandria built the first programmable machine in the 1st century: en.wikipedia.org/wiki/Hero_of_Alexandria people.idsia.ch/~juergen/deep-learning-history.htm…
    - Gauss and Legendre had the first linear neural networks, around 1800: linear regression / method of least squares.
    - Zuse built the first program-controlled general-purpose computer in 1941: people.idsia.ch/~juergen/zuse-1941-first-general-c…
    - Bremermann's limit, discovered in 1983, sets the ultimate physical limits of computation.
    - The ancient Antikythera mechanism was the first known gear-based computer: people.idsia.ch/~juergen/deep-learning-history.htm…
    - Transformers are the neural networks behind ChatGPT: twitter.com/SchmidhuberAI/status/15769661299937976… - Schmidhuber's tweet on his 1991 system, now known as an unnormalised Transformer with linearised self-attention.
    - Overview of neural subgoal generators since 1990: people.idsia.ch/~juergen/deep-learning-miraculous-… - Schmidhuber's overview of neural nets that learn by gradient descent to generate subgoals.
    - Overview page on RL planners since 1990: https://people.idsia.ch//~juergen/world-models-planning-curiosity-fki-1990.html - Schmidhuber's overview of reinforcement learning planners and intrinsic motivation through generative adversarial networks.
    - Overview of GANs since 1990: people.idsia.ch/~juergen/deep-learning-history.htm… - Schmidhuber's overview of the history of GANs.
    - Kolmogorov complexity: people.idsia.ch/~juergen/kolmogorov.html - Schmidhuber's overview of Kolmogorov complexity and its generalisations.
    - Maximizing compression progress like scientists and artists do: people.idsia.ch/~juergen/artificial-curiosity-sinc… - Schmidhuber's overview of formalizing curiosity and creativity.
    - Overview of adversarial agents designing surprising computational experiments: people.idsia.ch/~juergen/artificial-curiosity-sinc…
    - Overview of approaches where networks can alter themselves: people.idsia.ch/~juergen/metalearning.html
    - Set of all computable universes (1997): people.idsia.ch/~juergen/computeruniverse.html - Schmidhuber's proposal that we live in a simulation computed by an optimal algorithm that computes all logically possible universes.
    - OOPS paper (2004), where he emphasized coming limits to Moore's Law: people.idsia.ch/~juergen/oops.html - Schmidhuber's paper predicting the end of exponential growth in computing.
  • So, we have the Godfather of AI and the Einstein of AI already. I hope the next one isn't the Oppenheimer of AI 😬😅
  • @CodexPermutatio
    I really like the way this video is edited showing all those little snippets from his seminal papers. Nicely done.
  • Quotations from Schmidhuber (timestamps tba):
    "Humanity is a stepping stone to something that transcends humanity."
    "The universe itself is built in a certain way that apparently drives it from very simple initial conditions to more and more complexity."
    "Machine learning itself is the science of credit assignment."
    "Science in general is about failure and 99% of all scientific activity is about creating failures. But then you learn from these failures and you do backtracking and you go back to a previous decision point where you maybe made the wrong decision and pursued the wrong avenue."
    "As far as I can judge, all of this cannot be stopped but it can be channeled in a very natural and I think good way, in a way that is good for humankind." (on AI progress)
    "There are certain algorithms that we have discovered in past decades which are already optimal in a way such that you cannot really improve them any further, and no self-improvement and no fancy machine will ever be able to further improve them." (on limits to self-improvement)
    "... then for a brief moment again it looks like the greatest thing since sliced bread and then you get excited again. But then suddenly you realize, oh, it's still not finished. Something important is missing." (on overhyping new methods)
    "Generally speaking, there is not so much competition and there are not so many shared goals between biological beings such as humans and a new type of life that, as you mentioned, can expand into the universe and can multiply in a way that is completely infeasible for biological beings." (on advanced AI based on self-replicating factories)
    "The important thing was that the first network had to invent good keys and good values depending on the context of the input stream coming in. So it used the context to generate what is today called an attention mapping, which is then being applied to queries. And this was a first Transformer variant." (on his 1991 work on linear Transformers)
    "One network learns to quickly reprogram another part of the network." (on fast weight programming like in linear Transformers)
    "To achieve all of that, you need to build a model of the world, a predictive model of the world, which means that you have to be able to learn over time to predict the consequences of your actions such that you can use this model of the world that you are acquiring there, to plan, to plan ahead." (on components needed for AGI)
    "Generally speaking, if you share goals, then you can do two things. You can either collaborate or compete. An extreme form of collaboration would be to maybe marry another person and set up a family and master life together. And an extreme form of competition would be war." (on shared goals leading to cooperation or conflict)
    "Most CEOs of certain companies are interested in other CEOs of competing companies. And five year old girls are mostly interested in other five year old girls. And supersmart AIs are mostly interested in other supersmart AIs." (on interest arising from shared goals)
    "What you really want to find is a network that has low complexity in the sense that you can describe the good networks, those with low error, with very few bits of information [...] if you minimize that flat minimum second order error function, then suddenly you have a preference for networks like that." (on minimizing network complexity)
    "My fondest memory. Oh, it's usually when I discover something that I think nobody has known before... These rare insights, that's what's driving scientists like myself, I guess."
  • @odiseezall
    Great interview, wish it was longer. JS has such methodical rational clarity and I appreciate the way he engages the listener in his answers instead of laying out the table. It's clear that this is just the beginning for AI and we're in for a wild ride, with good and bad, but most importantly with great change.
  • Schmidhuber was my first "love" in the AI field. This was because reading his papers gives students the possibility of appreciating the vastness of the field as no other author is really able to do. I think that there is no author as prolific and diversified as he is. Thank you very much for this interview! I wish it was longer! So I'm hoping for a second one! :D
  • @jannis9138
    This interview shows why computer scientists should not lead the discussion about the societal impacts of AI. Schmidhuber's understanding of society is utterly devastating. CEOs are only interested in CEOs, and so AI will only be interested in AI? He does not have any understanding of the class and identity struggles going on in a society and how AI will be used. He seems to think that the only bad application of AI is weapons, and has never thought about the impact of social networks on individuals. He also seems not to understand that there is still the possibility of regulating AI through a democratic process; it seems that during the interview it was the first time he thought about these questions. Very shocking, and even more shocking that most of the comments here assign him a rational clarity 😅
  • @NelsLindahl
    I had been waiting for this interview to happen for MLST. It did not disappoint.
  • @rrathore01
    Brilliant episode!! The ideas of curiosity networks and regularization were especially thought-provoking; this will keep me occupied for the upcoming weeks. Thanks a lot 🙏
  • @mjedm6914
    "Making the World Differentiable" is one of my favorite papers ever written, so much so that I have a framed print of the abstract and the artwork of the first figure hanging on my office wall (the hand being watched by the camera that is feeding into a neural net with the output connected back to the hand) .
  • @karenreddy
    So his great argument is that AI will be fine because people want companies to make good AI. What if I happened to be very greedy, and what is good for me harms others? The issue with AI is that it is insidious, whereas Nukes are obviously bad. Nukes incentivize some level of cooperation and avoidance of major conflict as the threat is obvious. AI can grow relatively quietly, and by the time it possibly becomes a threat there's no longer a way to switch it off. There are so many ways in which his simplistic logic fails. It's the view of a child, really.
  • @99dynasty
    39:34 That pause… the topic became a bit intractable and divergent, perhaps some assumptions baked into it, he just didn’t have anything to add. This is a testament to how carefully Jurgen thinks. His overall lack of compulsion in his conversation is remarkable.
  • @vallab19
    Prof. Jürgen Schmidhuber's voice of wisdom; "I am much more worried about the 60 year old technology that can wipe out civilization within 2 hours without any AI ". He is wonderful.
  • @MattFreemanPhD
    Just because there are fundamental algorithmic limitations doesn’t imply that we are anywhere near them
  • @Daniel-Six
    Man, this guy is a refreshing voice (from the past) in the modern ML saga. Great mixture of intellect, sagacity and humor. His perspective on the likely inscrutable aims of AGIs feels intuitively correct.
  • @user-wo6sr2xr3p
    How has Jürgen Schmidhuber not got an award for his Gödel Machine? Maybe I've got it wrong, but I find his concept of meta-learning and optimal self-improvers game-changing. His explanation of how consciousness and curiosity appear through problem solving is the most elegant explanation of that I have heard. His work in general is really mind-blowing, but the Gödel Machine has got to be his most profound and transcendental, I think. Do check it out on his web blog; he has a lot of interesting posts there.
  • @AsgerAlstrupPalm
    Wow. He is at another level. Maintaining the higher viewpoint, understanding the underlying game in such detail and verbalizing the important principles is such a gift. I love it. Who cares about the Turing award? In the end, of course the specific way we get these problems solved matters, but ultimately we need to align to the broader and stable ideas that are true also in 10 years to make sure we are headed in the right direction. He provides the best response to the deep question about AI risk I have heard. It is about humans vs humans. Best episode so far. Rarely have I seen MLST stumped
  • It was very enjoyable, even the awkward moment of silence at 39:41 :) The comparison about boredom ruling the meta-learning is simply genius 57:03 . Compression of knowledge as a drive to learn 1:00:41, as a way of life. Close to pure philosophy. Thanks for this!!
  • @partymarty1856
    I like how the one guy looks like his screen froze but that's just his personality
  • @antigonemerlin
    11:16 >"I found that when I go back and read original source materials, let's say Einstein's first paper on diffusion or anything like that, because they're breaking new ground, they're considering a wider array of possibilities. And then over time the field becomes more and more focused on a narrower avenue of that. And you can go back and look at the original work and actually gain a lot of inspiration for alternative approaches or alternative considerations. So in a sense, forgetting is as important as learning."
    Tangentially related, but I found that applies to other domains as well. A lot of fantasy writing is influenced by Tolkienesque writing, so in order to write differently, I found it helpful to go back to the source material, which in this case happens to be the cultural folklore and mythology, and instead of, for example, focusing on Norse/Anglo-Saxon mythology (which was what Tolkien did), to focus on other cultures instead.