Scientific Discovery with AI: Unlocking the Secrets of the Universe Key Requirements and Pioneering Projects Highlighting AI’s Contribution to Astrophysical Research
Chi-kwan Chan (CK) is an Associate Astronomer/Research Professor at Steward Observatory, University of Arizona, and has been serving as the Secretary of the Event Horizon Telescope (EHT) Science Council since 2020. He recently led the publication of the computational and theoretical modeling/interpretation of our galaxy's black hole, Sgr A*. Professor Chan created EHT's computational and data processing infrastructure and continues to lead it to this day, along with EHT's Software and Data Compatibility Working Group. He is a Senior Investigator of Black Hole PIRE, a leader of the Theoretical Astrophysics Program (TAP), a Data Science Fellow, and a member of the Applied Mathematics Program. In addition to pioneering the use of GPUs to accelerate the modeling of black holes, Professor Chan also developed many new algorithms to improve and accelerate modern research, built cloud computing infrastructures for large observational data, and applied machine learning algorithms to speed up and automate data processing. Professor Chan has taught and mentored in subjects of machine learning, numerical analysis, cloud computing, and quantum computing, and is an avid hiker.
Using the imaging of black holes as a case study, this talk highlights the key requirements for AI to make meaningful contributions to astrophysical research. Dr. Chan introduces several pioneering projects that are integrating AI into astrophysics, covering aspects such as instrumentation, simulations, data processing, and causal inference. He also discusses an innovative project aimed at enabling AI to gain scientific insights independently.
This evening, we're here to learn from Dr. Chi-kwan Chan, who likes to go by CK.
Thank you very much for the kind introduction. It really is my honor to share some of my recent research in astronomy and AI with this audience of the OpenAI Forum. So I want to give everyone a quick overview of what astronomers do.
This is an image of the Milky Way. The white stuff you see here, they're actually stars. There are many, many stars there. The dark stuff is clouds of dust blocking the starlight. So in order to see through the dust, let's switch to a different frequency. Let's go to radio waves. This is what the Milky Way looks like in radio waves. And then we can just keep zooming in to the center. And then we see this complicated magnetic structure around the center.
Dr. CK Chan leads the Event Horizon Telescope collaboration and is the architect of its computational and data processing infrastructures. Our guest tinkers with supercomputers, cloud computing infrastructure, machine learning, software, and data.
We see even more structure if we keep zooming in. What we see here is what we call the mini-spiral, which is just molecular clouds. And then there is a dot in the middle, which turns out not to be a star. It's not a point source. Instead, it's a ring-like structure that we believe is the supermassive black hole at the center of the Milky Way.
So a few years ago, our collaboration also published another image, of a supermassive black hole in a different galaxy, M87. These two pictures here are the only two direct images of black holes that humankind has right now.
In this talk, I will use black holes as a case study to show how scientific breakthroughs happen. I will then share some progress on how astronomers use AI to improve astronomy, and how we use astrophysics to also push AI. And finally, I also hope to share with you some of the progress we are making toward AI that is capable of doing scientific discovery.
So while I go through this talk, there will be three simple concepts that just keep coming up. They are interpolation versus extrapolation, pattern recognition, and high-order intelligence, or high-order thinking.
For the people with a machine learning background, I'm sure you all know this. Interpolation is just when you have a quantity you want to measure as a function of some other stuff. You sample your quantity; those samples are your data points. So interpolation is just asking: what are the values of this quantity between your data points?
Extrapolation, on the other hand, is trying to make predictions. In machine learning language, this is trying to go outside your sampling space and find out what these quantities look like there.
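As a minimal sketch of the distinction (toy data, not anything from the talk), here is what interpolation versus extrapolation looks like in numpy:

```python
import numpy as np

# Sample a quantity y = f(x) at a few data points.
x_data = np.linspace(0.0, 5.0, 6)            # sampling space: [0, 5]
y_data = np.sin(x_data)

# Interpolation: ask for the value *between* data points.
y_interp = np.interp(2.5, x_data, y_data)    # inside [0, 5], well constrained

# Extrapolation: ask for a value *outside* the sampling space.
# A fitted model (here a cubic) will happily guess there, often badly.
coeffs = np.polyfit(x_data, y_data, deg=3)
y_extrap = np.polyval(coeffs, 10.0)          # outside [0, 5], unconstrained

err_interp = abs(y_interp - np.sin(2.5))
err_extrap = abs(y_extrap - np.sin(10.0))
print(err_interp, err_extrap)                # extrapolation error is far larger
```

Between data points the answer is well constrained; outside the sampling space, a plain fitted model has nothing to hold it down, which is why extrapolation needs something extra.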
Pattern recognition has different meanings in different fields, but what I mean here is really just finding a common property among seemingly unrelated objects and trying to make connections.
High-order intelligence, what I mean here, is the ability not just to solve a math problem, but to be able to explain why you go through certain steps. It's about your strategy.
So sometimes people call this abstract thinking, computational thinking, and I will say the scientific method itself is also this high-order intelligence.
So these very simple concepts will help us understand how scientific breakthroughs happen.
Why do we want to use black holes as a case study? You can ask ChatGPT. So when we do that, when we ask what is the most important unsolved problem in astrophysics, every time you ask, the order comes out a little bit different, but black holes always show up. And it may be a little bit counterintuitive: even though black holes themselves are black, they're actually the most energetic sources in the universe.
So they affect galaxy evolution. They may also explain dark matter and dark energy. So black holes are actually at the center of modern astronomy.
Now imagine you're in a space station. Because the space station is falling in gravity and you're falling together with it, of course you just feel like you're floating in the space station.
So this is a very important point, because when you watch the movies, you remember Doctor Strange opened the portal for Loki to fall; he got very angry after falling for 30 minutes. But Loki's an alien, right? So falling for 30 minutes for him should just be like floating, traveling in space for 30 minutes. He really shouldn't be that angry.
Anyway, joke aside, if we now turn this argument around, we can also figure out that standing on Earth with gravity is indistinguishable from being in a spaceship that is accelerating.
According to Newton, the math of these two systems works out the same. They appear to be the same. But what Einstein did in 1907 was come up with a crazy idea: these two systems don't just appear the same, they're actually physically the same.
So this idea nowadays we call Einstein's Equivalence Principle. And following this principle, Einstein derived the field equations, which say that space-time curvature is equivalent to mass and energy density.
That is, space-time tells matter how to move, and matter tells space-time how to curve. So using this idea, now you can start making predictions. So imagine you are in deep space, far away from any mass.
In this case, particles and photons will just follow their most natural way to move: a straight line. But now if you introduce some mass, it will curve the space-time. Although in four dimensions the particles still move in the most natural way, their paths in 3D will appear to be curved. So this is what we call gravity.
With the equations, we can actually calculate how space-time curves. We can calculate this particle motion. So this allows us to test the prediction. This is exactly what Eddington did in 1919. He went to observe a solar eclipse, took pictures, and showed that the light from the background stars in the image actually bends around the sun's gravity.
So these results showed up in The New York Times, Einstein got famous, and the rest was history. Also, at that time, when Eddington was asked by a reporter whether it's true that there are only three people in the world who understand general relativity, Eddington's reply was, "Who is the third?"
So he was probably joking, but I do think there's an important point here that we will come back to later.
Now, after testing Einstein's theory in the solar system, the weak gravitational field, we start to ask: can the predictions from this theory be accurate in other situations? So there is a very strange prediction saying that if you have a very dense, very compact object, you can bend the space-time so much that, at each of these dots, this cone here, which we call your future light cone, shows how far the light can go if you shine your flashlight around in every direction.
So the space-time can curve so much that when you are close to this central object, your future light cone can actually only point to the center. So the space-time is curved so much that your future is locked into this strange object. This is what we call a black hole.
Okay, so now you can think about a thought experiment: you have this crazy object here, and what if you shine a flashlight at it? What will happen is that most of the photons will actually go inside the hole, so it will appear black, but because of the curved space-time, some photons will go around the black hole and come back to you. So when you look at this object and shine your flashlight, you actually see a ring here. So this is a unique prediction, a unique signature that lets us test whether black holes exist.
Now you may wonder: there's no flashlight in space, so is it really possible to test for black holes? It turns out it's possible. From observations, astronomers actually believe there is a supermassive black hole at the center of each galaxy in the universe, and because gas is trying to fall into the black hole, the plasma actually gets heated up with the magnetic field and forms the accretion flow that you see in the foreground of this image.
This plasma can heat up to 10^11 Kelvin. It is very hot, very bright, and with numerical simulations, we can also set up a virtual screen, just like ray tracing in a video game. For each pixel of the virtual screen, you trace the ray back using Einstein's equations and then solve the radiative transfer equation to compute how much emission you get in that pixel. So this is very similar to the ray tracing algorithms in video game computer graphics.
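A heavily simplified sketch of that idea: straight rays in flat space with a made-up emissivity, whereas the real pipelines integrate geodesics of Einstein's equations. Even in this toy version, a glowing shell of plasma images into a bright ring, because rays grazing the shell tangentially pass through the most material:

```python
import numpy as np

# Toy "ray tracing + radiative transfer": straight rays, invented plasma.
def emissivity(x, y, z):
    r = np.sqrt(x**2 + y**2 + z**2)
    return np.exp(-((r - 3.0) ** 2))        # glowing shell at radius ~ 3

def absorption(x, y, z):
    return 0.05 * emissivity(x, y, z)       # optically thin-ish plasma

n_pix, n_steps, ds = 24, 200, 0.1
image = np.zeros((n_pix, n_pix))
coords = np.linspace(-6.0, 6.0, n_pix)
for i, yy in enumerate(coords):
    for j, xx in enumerate(coords):
        intensity = 0.0
        for k in range(n_steps):            # march the ray toward the camera
            zz = -10.0 + k * ds
            # transfer equation dI/ds = j - a*I, integrated with Euler steps
            intensity += ds * (emissivity(xx, yy, zz)
                               - absorption(xx, yy, zz) * intensity)
        image[i, j] = intensity

center = image[n_pix // 2, n_pix // 2]
print(image.max(), center)                  # the ring outshines the center
```

The per-pixel loop is exactly the "virtual screen" idea from the talk; swapping the straight-line ray for a geodesic integrator is what makes it general-relativistic.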
So with this type of technique and theory, this is a typical movie we get from the black hole simulations. In the middle, you see the accretion disk, which is the in-falling plasma around the black hole, and then you can also see magnetic field getting pushed out; that's what we call the funnel. And we also have the photon ring, which is the unique signature of the black hole. This is what a movie looks like.
So now we have the theory, we have the numerical simulations, and we can predict what the black hole looks like. So the next step is just to go to a telescope, take a picture, and then we have confirmation. But this turns out to be much, much more difficult than you might think, because black holes are very compact; they're very small on the sky. Even the biggest known black hole on the sky has an apparent size like a donut on the surface of the moon.
So this may not mean much to you. Another comparison: this is like the size of an atom held at arm's length. So this is very, very tiny, and using a single telescope, there's no way you can resolve such a small object. But black holes are important, as I mentioned earlier.
So a group of astronomers came together to form the Event Horizon Telescope Collaboration to take on this challenge and image black holes. The method we use is called very long baseline interferometry. I can explain this with this cartoon. Each radio telescope on Earth is only a single-pixel camera, so you can't really take an image with one radio telescope.
But what you can do is record the radio wavefront from a radio source on the sky. So you point these radio telescopes to the same region on the sky, and now imagine there are some features, the photon ring, the black hole, other stuff, in this region of the sky. Each of these brightness features will send radio waves to the telescopes.
So in this example, the wave actually arrives at both telescopes at the same time. Now imagine you have another feature on your sky, which sends out another wave. This blue wave hits the lower telescope first. So by measuring the delay of these radio waves arriving at the telescopes, it's possible to reconstruct the black hole image. So this is the technique we use.
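The delay measurement can be sketched with toy numbers; the 10,000 km baseline and the 50-microarcsecond feature separation below are assumptions chosen to be roughly EHT-scale, not actual EHT geometry:

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def geometric_delay(baseline_m, source_unit_vec):
    """Extra time the wavefront needs to reach the second telescope.

    baseline_m: vector between the two telescopes, in meters.
    source_unit_vec: unit vector pointing at the feature on the sky.
    """
    return np.dot(baseline_m, source_unit_vec) / C

baseline = np.array([0.0, 0.0, 10_000_000.0])            # 10,000 km
offset = np.deg2rad(50e-6 / 3600.0)                      # 50 uas in radians
src_a = np.array([1.0, 0.0, 0.0])                        # sky feature A
src_b = np.array([np.cos(offset), 0.0, np.sin(offset)])  # feature B, offset

tau_a = geometric_delay(baseline, src_a)
tau_b = geometric_delay(baseline, src_b)
print(tau_b - tau_a)   # picosecond-scale delay difference encodes structure
```

The delay difference between the two features comes out at the picosecond level, which is why VLBI needs atomic clocks at each station.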
So for the Event Horizon Telescope, we find radio telescopes all around the world, we put in instruments, we upgrade them, and we form this telescope array. We're constantly improving it. Our 2017 observation produced the data for the first black hole image. But after that, we also added new telescopes to improve our array.
And this is actually a movie I took earlier this year during the observation campaign. You can see this is the radio telescope; it just keeps slewing, pointing to different sources on the sky. And because we are observing in radio waves, we can even observe during the day.
So in 2022, the two telescopes in Arizona, among our eight telescopes around the world, collected four petabytes of data, just from those two telescopes. And this is that four petabytes of data on a table. It's half a million dollars, probably not too expensive compared to the hardware you guys need to train AI models. But yeah.
Yeah, so after gathering this data from each of the telescopes, we combine them. We pair telescopes together to form this combined data set, and this is what our data looks like. It turns out to be equivalent to the Fourier coefficients of an image. So for people who do signal processing, this is something you are probably very familiar with.
But look at this plot: there are actually a lot of holes in the data. If we could fill in data points over the full Fourier space, we could just do an inverse Fourier transform and get our image. But unfortunately, because there are only so many radio telescopes around the world, this is the best we can do. There are a lot of holes.
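A quick synthetic experiment shows why the holes matter; the toy ring image and the random 5% Fourier coverage below are invented stand-ins, not the real EHT sampling pattern:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "photon ring" image.
n = 64
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
ring = np.exp(-((np.sqrt(x**2 + y**2) - 12.0) ** 2) / 4.0)

vis = np.fft.fft2(ring)                  # "visibilities": Fourier coefficients

# Keep a random 5% of the coefficients, as if from only a few baselines.
mask = rng.random(vis.shape) < 0.05
dirty = np.fft.ifft2(vis * mask).real    # naive inverse transform with holes
full = np.fft.ifft2(vis).real            # inverse transform, full coverage

err_full = np.abs(full - ring).max()
err_sparse = np.abs(dirty - ring).max()
print(err_full, err_sparse)              # full coverage recovers the image
```

With full Fourier coverage the inverse transform recovers the ring essentially exactly; with 95% of the plane missing, the naive inverse is badly wrong, which is why many different images can agree with the sampled data.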
So even when we use computational methods to do image reconstruction, there are millions of image solutions, and all of them agree with the data. So all of the images you see here are consistent with the data we observed.
Many, many of them are ring-like, like a donut, but some of them are not. So in order to be sure that we are observing a black hole with a ring, we used an unsupervised learning method to automatically classify all of these images.
So, what we find is that there are mainly four different classes. Three of them are actually ring-like, and only very, very few images go into the fourth category that doesn't have a ring. So with this, we have an approximation of our statistics, and we know with very, very high confidence that we are observing a black hole at the center of the Milky Way. And I want to remind people that the theory of general relativity came out 100 years ago, so it actually took us 100 years to understand the theory, do the simulations, and construct the instrument to gather the data to confirm the theory. So this is a remarkable achievement.
Now, however, having the image is only the beginning, because as I mentioned, we also want to measure the properties of this hot plasma around the black hole. So what we do is spin up our supercomputers and run many, many simulations of black holes. This is one of those models with certain parameters, but we actually don't know the magnetic configuration, we don't know the spin; there are many, many unknown parameters around the black hole. So we just vary these parameters and create the simulation library you see here. Different rows here correspond to different black hole spins. And because we don't really know the plasma temperature, that is another parameter we introduce. And then we also don't know the magnetic field strength around the black hole, so we also have two types of simulations.
And with such a large simulation library, we can start comparing these simulations with the data we observe, and then we can start ruling out models. If we look at the black hole size, it turns out you only rule out a few models, because we actually started the models knowing the black hole size from the beginning. So now you take the shape of the black hole we measured with the Event Horizon Telescope to rule out more. You bring in multi-wavelength information, you rule out even more. And at the end, in this very large simulation library, only two models are consistent with our observations. So, ladies and gentlemen, this is the best theoretical model we have for the supermassive black hole at the center of the Milky Way. It is likely to be a rapidly spinning black hole with a strong magnetic field. The magnetic field is strong enough to arrest the accretion, to stop the plasma falling into it. And we also know the electron temperature is lower than the ion temperature. So these are all quantities that astronomers care about.
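The ruling-out process can be sketched like this; the parameter grid, the "predicted" observable, and the measurement below are all invented for illustration, not actual GRMHD results or EHT numbers:

```python
import itertools

# Toy simulation library scored against one observation.
def predict_ring_diameter_uas(spin, t_ratio, field):
    # Stand-in "simulation": a made-up smooth function of the parameters.
    base = 51.0 if field == "MAD" else 49.0
    return base + 2.0 * spin - 1.0 * t_ratio

observed, sigma = 51.8, 0.7    # pretend ring-diameter measurement with errors

library = itertools.product(
    [0.0, 0.5, 0.94],          # black hole spin
    [1.0, 3.0],                # ion/electron temperature ratio proxy
    ["MAD", "SANE"],           # magnetic field configuration
)
surviving = [
    (spin, t, field)
    for spin, t, field in library
    if abs(predict_ring_diameter_uas(spin, t, field) - observed) < 3 * sigma
]
print(len(surviving))          # each new constraint shrinks this list
```

Each additional observable (shape, multi-wavelength fluxes, and so on) adds another filter in the comprehension, which is how a large library gets whittled down to a couple of surviving models.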
Now, I started this journey by telling you about general relativity, but general relativity is not the only gravity theory we have. So what astronomers also do is try to work out what the other gravity theories would predict for such a mass. These are some of the predictions. We can compare these to our observation, and we can rule out many of them. So at the end of the day, after all of this work, we show that Einstein is still correct. His prediction is consistent with our data from the supermassive black hole at the center of the Milky Way.
Alright, so in this black hole story, I think I showed you how some of the brightest minds in human history extrapolated the physics we learned on Earth to the whole universe. Using their high-order thinking, their high-order intelligence, they identified a pattern: they identified that standing on Earth with gravity is the same as being in a falling or accelerating spaceship. And then they used this to extrapolate to a theory where we had no data. So it turns out these high-order concepts are capable of extrapolation. And if we just add back the data piece to these three concepts, we actually get back our scientific method. And not only that, we have a framework for coming up with new theories.
So speaking of data, I guess the coming decade is very exciting in astronomy, because there are multiple large-scale projects coming online. One of them is the Rubin Observatory. This is an actual image of the observatory in Chile. It just finished construction, and we are getting first light. This is a survey telescope, meaning that it will scan the southern sky every three nights, collecting 10 petabytes of data throughout its 10-year lifespan, and it will create a movie of the full sky. Using this data, we should be able to resolve the dark matter and dark energy problems.

And then for the Event Horizon Telescope that I'm part of, we are trying to upgrade the EHT. All these yellow and orange labels here are our existing telescopes, and we are planning to add more telescopes, which are shown in cyan here. With this new telescope capacity, we are planning to make movies of black holes. The movies I showed you earlier are only simulations; they are not real data. But with this upgrade, we will be able to take a real movie of a real black hole.

In addition to that, we are also working with NASA to put a space telescope in orbit. By combining the data from the space telescope with ground-based telescopes, we can measure very, very fine features in the black hole image. That will allow us to test gravity in the most precise way in the strong-field region.
So with this much data coming, all astronomers know that AI will play a very, very important role in astronomy. In Arizona, we are forming this National AI Research Institute for Astronomy, trying to bring AI and machine learning algorithms to all steps of astronomy research. One of them is data gathering. Telescopes, in some sense, are just giant robots with very good vision. Their operations are very simple, but the challenge is that they must be very precise. The hardware precision alone is actually not enough to point the telescope in the correct direction. You always need to calibrate your telescope, and that wastes telescope time. So what we are doing now is putting in machine learning algorithms to predict this error in the mechanical structure, so that we can point the telescope very accurately and improve our observations.
And in addition to that, there's something that astronomers will always hate. I guess people here, especially the ones with kids, know the song "Twinkle, Twinkle, Little Star." Astronomers hate that song, because the atmosphere, when it's moving, actually breaks up our image of the star. But there is a method to correct for that, called adaptive optics. The algorithm is actually very interesting. When light from the sky hits the mirror of your telescope, it bounces back and goes to a secondary mirror. Adaptive optics technology controls that secondary mirror, deforming it to correct the incoming light in real time. By doing so, we can counteract the atmospheric turbulence. As you can imagine, this needs to be done in real time. So this is a computation-intense algorithm, but at the same time, we have very, very strict low-latency requirements. We even talked to the high-frequency trading people, but their problem is easier than ours, so their methods are not useful.
So in the National AI Research Institute for Astronomy, we are trying to bring machine learning algorithms to do predictive control of adaptive optics. We want to use statistics to predict what the upcoming turbulence realization will look like and correct for it, to improve our telescope. So I have a movie here.
We are trying to image planets around a very bright star. Without adaptive optics, this is what you see. It's all messed up; you can't even see the star.
But now if you turn on adaptive optics, you will be able to see the star. And then we can also use additional instruments to remove the light from the central star.
And now we turn on adaptive optics, and you start to see this very dim source. This turns out to be a planet around the star. And if we use computational methods to subtract the background, we can actually see the planets.
So these are the types of machine learning and AI technologies we want to bring into astronomy observations.
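The predictive-control idea can be sketched in one dimension: fit a linear (autoregressive) predictor to past turbulence samples and compare it against simply reusing the last measurement. The turbulence signal below is synthetic; real AO systems predict whole wavefront maps under hard latency budgets:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "turbulence" phase: a few drifting sinusoids plus measurement noise.
t = np.arange(3000) * 1e-3
phase = (np.sin(2 * np.pi * 11 * t) + 0.5 * np.sin(2 * np.pi * 37 * t)
         + 0.05 * rng.standard_normal(t.size))

p = 20
# Regression setup: rows are sliding windows, target is the next sample.
X = np.array([phase[i:i + p] for i in range(len(phase) - p)])
y_next = phase[p:]
w, *_ = np.linalg.lstsq(X[:2000], y_next[:2000], rcond=None)  # fit predictor

pred = X[2000:] @ w                                 # predict held-out future
resid_pred = np.std(y_next[2000:] - pred)           # predictive control
resid_naive = np.std(y_next[2000:] - X[2000:, -1])  # "reuse last frame"
print(resid_pred, resid_naive)
```

The learned predictor beats the reuse-the-last-frame baseline because the turbulence has structure it can exploit; the catch in a real system is doing this within the control loop's microsecond-to-millisecond latency budget.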
Now, another very important thing is data processing. You may ask why AI is still not used everywhere in astronomy. There are a few major bottlenecks. One bottleneck is that in astronomy, we always push the sensor limit, because we want to image the faintest, farthest things. That is usually where scientific discovery comes from.
So because of that, we don't really have ground truth data. We just have these very, very noisy images that we want to correct. That is one reason we can't really use out-of-the-box machine learning.
Now, another reason is that astronomy data sets are very often non-uniform and sparse. For example, in this plot I already showed you from the EHT, we have holes in our data. This is not just a picture from your digital camera.
So again, many, many out-of-the-box machine learning algorithms wouldn't work. And then another very important thing: because we are doing science, we want to know the error bars of our measurements. Conventional machine learning algorithms do not provide that.
And then a final point: in science, reproducibility is the hallmark of the scientific method. We want to be able to reproduce our results. That also poses a challenge for machine learning.
So in NARIA, we are trying to fix many, many of these problems in our data processing pipeline. There are some interesting realizations. When I say multi-wavelength astronomy, meaning that we combine data from different sensors to improve our statistics, that's actually the same as sensor fusion in machine learning.
So we are trying to learn techniques from sensor fusion, but because we care about statistics, we are also pushing our algorithms back to industry.
Now, for these non-uniform data sets, we are trying to use graph neural networks and also transformers to process them. So these are some of our new research directions.
And on the other hand, we can use astronomy methods and tools to advance AI algorithms, for example to obtain error bars, and to improve AI safety. So these are the different directions we have in using AI in data processing.
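As one example of attaching an error bar to a fitted quantity, here is a bootstrap sketch on synthetic data; real pipelines use more careful statistics, but the refit-on-resampled-data idea is the same:

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy linear data with a true slope of 2.0.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 0.3 + 0.1 * rng.standard_normal(x.size)

def fit_slope(xs, ys):
    return np.polyfit(xs, ys, 1)[0]

# Bootstrap: refit on data resampled with replacement, report the spread.
slopes = []
for _ in range(500):
    idx = rng.integers(0, x.size, x.size)
    slopes.append(fit_slope(x[idx], y[idx]))

slope_hat = fit_slope(x, y)     # point estimate (what plain ML would give)
slope_err = np.std(slopes)      # bootstrap error bar
print(slope_hat, slope_err)
```

A point-estimate fit gives only `slope_hat`; the bootstrap spread is one generic way to quantify how much that number would wiggle under resampling of the data.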
Now, another interesting aspect is data interpretation. After we get our data, we need to figure out what's going on. The questions we care to ask are: why does A happen? What if B changed?
These are the typical questions we want to ask. Many, many decades-old problems in astronomy just have too many different potential causes for astrophysicists to sort out.
Some of these problems include what causes galaxies to stop forming stars, what causes turbulence, and what causes black holes to grow. All of these questions have many, many potential causes.
So what we are trying to do is use causal AI, which is just a causal graph with symbolic regression, to automatically identify causes from our data. I want to show you a simulation here. This is from one of the faculty members in Arizona.
This is called the UniverseMachine. In simulation, we can simulate many, many different realizations of the whole universe with statistics that are comparable to real observations.
So applying causal AI to data like this, we may be able to really answer what stops star formation, what causes dark matter and dark energy.
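A toy structural causal model makes the "what if B changed?" question concrete. The variables and coefficients below are invented (gas drives star formation, feedback suppresses it); real causal-AI pipelines must learn both the graph and the equations, for example via symbolic regression:

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate(n, do_feedback=None):
    """Toy structural causal model: gas -> feedback -> star formation."""
    gas = rng.uniform(0.5, 2.0, n)                     # root cause
    feedback = (0.8 * gas if do_feedback is None       # observational regime
                else np.full(n, do_feedback))          # do(feedback = const)
    noise = 0.05 * rng.standard_normal(n)
    sfr = 1.5 * gas - 1.0 * feedback + noise           # star formation rate
    return gas, feedback, sfr

# Observational data versus an intervention that switches feedback off:
_, _, sfr_obs = simulate(10_000)
_, _, sfr_do0 = simulate(10_000, do_feedback=0.0)
print(sfr_obs.mean(), sfr_do0.mean())  # intervention raises star formation
```

The point of the causal graph is exactly this `do(...)` operation: it answers "what if feedback were different?", which no amount of curve-fitting on the observational data alone can answer.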
Now, another thing I want to bring up is hypothesis identification. This is always the challenging part, because data processing is, in some sense, mechanical, something a machine is able to automate.

So we are collaborating with MIT to use their database technology. The idea here is that if you send a job to a database, there are actually many, many implementations that can give you the same result.

And then we are using planning tools from MIT to optimize this process. By doing so, we are able to scan through many, many papers and identify measurements and values from these papers.
And one interesting thing to point out is that we are actually using ChatGPT for processing the language in scientific papers.
So up to this point, I guess I showed you the black hole story. I also showed you how astronomers are trying to use AI to improve our astronomy observation.
But there's one thing that is always difficult, which is hypothesis generation. Even when we use AI, ChatGPT to process paper, we can only gather existing hypotheses from paper.
Generating hypotheses is always difficult. So it is true that when you ask ChatGPT, it can provide you with some hypotheses. But very often, they're actually interpolations of existing ideas.
These ideas are still very useful, because as humans, we have a limited scope of knowledge. So having an AI that knows a lot more than us and tells us some ideas, that is very, very useful.
But this is still very different from Newton discovering the law of gravity, or Einstein discovering general relativity. So one question here is: is it possible for neural networks and AI to come up with hypotheses themselves?
So my view here is: it is possible. In fact, if we remind ourselves of the concepts we introduced in the black hole story, we repeatedly saw high-order concepts that come out of pattern recognition being capable of extrapolation.
So you have these low-level tasks, low-level ideas. But once you bind different ideas together into a higher-order concept, those higher-order concepts are actually extrapolatable.
So maybe this extrapolation is not unique to humans. Maybe AI is capable of doing this. And so I actually have a framework here.
Maybe when we think about this high-order intelligence of scientific discovery, all we are seeing is just pattern recognition. And once we identify a high-order pattern and then interpolate within it, that automatically gives us extrapolation.
Maybe that will work. And of course, if your AI is not smart enough and it gives you many, many bad ideas, that is OK, because we can bring back our data to throw away the incorrect ideas.
So now we actually have a framework that can do science with machine learning. With this, I want to introduce a very small project we are experimenting with in our AI institute, which we call Small Steps.
The goal here is to use neural networks to rediscover physical laws from data. This project is actually different from the physics-informed machine learning that many of you have heard about.
In physics-informed machine learning, we put in physical laws. It could be an ODE, it could be Newton's second law, it could even be a symmetry in your image.
We put these known ideas into the neural network to improve its performance. This is not what we do in this project.
What we want is to use data to train a neural network and let the neural network rediscover the scientific discoveries we have. We only use standard architectures, so RNNs and transformers. We don't program the physics laws in. And the more interesting thing is, in our test set, we actually put in some physics that's not in the training set, and we want our model to actually learn the physics laws and give us the correct prediction.
There are many useful tools in this exercise, for example embedding space merging and alignment. We find that very useful, because you can come up with different physics systems, train your neural networks, and at the end merge their embedding spaces. We are also able to identify and mask parts of the embedding space. This means that when we have a smart neural network that has learned, for example, gravity, we can screen out part of this embedding space, make it forget about gravity, and then test its behavior.
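The masking idea can be sketched with a tiny linear stand-in for a network: a PCA "encoder" plus a least-squares readout, where zeroing one latent direction makes a learned behavior disappear. This is a toy; the talk's models are RNNs and transformers, but the probing idea is the same:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic data: two clean signals plus a small noise channel.
t = np.linspace(0.0, 4 * np.pi, 400)
signal = np.stack([np.sin(t), 0.5 * np.cos(t),
                   0.1 * rng.standard_normal(t.size)], axis=1)

# PCA "encoder": the latent space is the top-2 principal components.
Xc = signal - signal.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
latent = Xc @ Vt[:2].T

# "Decoder": least-squares readout predicting sin(t) from the latent.
w, *_ = np.linalg.lstsq(latent, np.sin(t), rcond=None)
err_full = np.abs(latent @ w - np.sin(t)).max()

# Mask the dominant latent direction and watch the behavior vanish.
masked = latent.copy()
masked[:, 0] = 0.0
err_masked = np.abs(masked @ w - np.sin(t)).max()
print(err_full, err_masked)     # masking destroys the learned behavior
```

With the full latent space the readout reproduces the behavior almost perfectly; with one direction zeroed out, it fails, which is the kind of evidence used to argue a specific capability lives in a specific part of the embedding space.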
This is only the beginning, but I do want to show you some quick results. What I show you here is synthetic data for the orbit of Mercury. Using our method, by giving it very few data points, a few initial conditions, it's able to figure out the velocity, the position, and the mass of the planet. It's actually able to predict the planet's trajectory for a very, very long time.
Earlier, I mentioned that we have this latent-space masking method. If we remove part of the latent space, this behavior disappears. So it seems our neural network is capable of learning Newton's laws and also calculus. So up to this point, with Small Steps, I think we have a few achievements.
Interpolation is very easy. This is just like ancient astronomers looking at the sky, logging the data, and then interpolating between data points. So this is something very, very easy to do; supervised learning can already do that. But in addition to that, by creating an information bottleneck, regularizing the neural network, we actually rediscover epicycles. This is a very old technique where astronomers put circles onto circles to model the orbits of planets. We can actually rediscover this in our neural network.
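Epicycles are exactly a complex Fourier series, circles on circles, so the rediscovery can be illustrated by fitting a toy elliptical orbit; the orbit below is synthetic and the fit is a plain FFT, not the talk's neural-network pipeline:

```python
import numpy as np

# A toy elliptical orbit as a curve in the complex plane.
n = 256
theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
orbit = 1.0 * np.cos(theta) + 1j * 0.6 * np.sin(theta)

# Each Fourier coefficient c_k is one epicycle: a circle of radius |c_k|
# traversed at angular frequency k, since z(t) = sum_k c_k * exp(i k t).
coeffs = np.fft.fft(orbit) / n

# An ellipse needs just two epicycles: k = +1 and k = -1.
recon = coeffs[1] * np.exp(1j * theta) + coeffs[-1] * np.exp(-1j * theta)
err = np.abs(recon - orbit).max()

big = np.sort(np.abs(coeffs))[::-1]
print(big[:3], err)     # two dominant circles reconstruct the orbit exactly
```

Two circles (radii 0.8 and 0.2, turning in opposite directions) reproduce the ellipse exactly, which is why an information bottleneck that favors few components naturally lands on epicycle-like representations.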
And then we also rediscover the heliocentric model of the solar system. By just telling our model the locations of the planets, it actually figures out that Earth is not the center of the system. So this is a coordinate rediscovery that we are able to achieve.
And then, as I mentioned earlier, by masking the latent space, there is some strong indication that maybe we are learning Newton's laws of motion and calculus. These are major discoveries and advancements in physics and astronomy. There are a few even more advanced principles: the action principle, model specification, and also general relativity.
Right now, we are not able to rediscover these yet. And to be honest, I actually don't know how to come up with data for the model to rediscover the action principle. So if people have ideas, please come talk to me. I will be very excited if this works.
Now, even though this is only a work in progress, I think there is an important idea here. Because we are now not just simulating physics. We're actually simulating scientific discovery itself.
So I want to end this talk with this graph I already showed you. I covered many, many things today, very quickly. I started with the black hole story. I told you how Newton and Einstein, some of the brightest minds, used very little information, plus their higher-order thinking, to come up with theories. And within those theories, you can interpolate between different observations, and you can extrapolate to scenarios you have never seen. In fact, that is how we designed our missions to the Moon, for example. Nobody had ever done that before, so we had no actual data. But by extrapolating the force laws, we were able to predict that the mission would be successful.
But maybe this type of skill is not unique to humans. By simulating scientific discovery, it seems machine learning and our neural networks are able to do something like this.
And now you may also ask: if artificial intelligence is able to make scientific discoveries for me in the future, then what should scientists do? This is when I want to bring back the story of Eddington's comment that there were only two people in the world who understood general relativity. After 100 years of research, understanding, and development, and after 400 years of mathematical development, nowadays our high school kids can learn and understand calculus, and our college students can learn and understand general relativity.
So if AI is capable of scientific discovery, maybe the higher-order thinking in these neural networks can actually explain each new discovery to us in a new framework, and bring knowledge that right now is beyond human reach down to a level that we can understand.
So if that happens, if that type of AGI, or even superintelligence, emerges and can explain scientific discoveries, then AI will not only be a co-pilot; it will also be a mentor for humans in discovering science.
So this is all I have for today. Thanks for your attention.

Thank you so much, CK.
We are now going to move into the questions from the audience, and we'll start with a virtual question. This is from Peter. He's watching us virtually.
CK, can you please provide suggestions or solutions about how to promote the use of AI in scientific areas, especially astronomy?
Yeah, that is a wonderful question. I think there is a very general sense that, yes, in astronomy we need AI; we want that to happen. But I think one of the major roadblocks is actually the error bars. If you use an AI algorithm to process your data, very often you don't have error bars for your result. So I guess the answer is that maybe we should really push Bayesian neural networks and techniques like that to give us the statistics of the result. With that, people should be able to accept machine learning in astronomy. Thank you.
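One lightweight route to the error bars mentioned here is Monte Carlo dropout, which is often used as an approximation to a Bayesian neural network. The sketch below is a minimal numpy illustration, not a recommendation of any particular library; the tiny network and its random weights are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def mc_dropout_predict(x, W1, W2, n_samples=200, p_drop=0.5):
    """Approximate-Bayesian error bars via Monte Carlo dropout:
    keep dropout active at inference time, run many stochastic
    forward passes, and report the mean and the spread."""
    preds = []
    for _ in range(n_samples):
        # Random dropout mask with inverted-dropout scaling.
        keep = (rng.random(W1.shape[1]) > p_drop) / (1.0 - p_drop)
        z = np.maximum(x @ W1, 0.0) * keep   # ReLU layer + dropout
        preds.append(z @ W2)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# Hypothetical untrained weights, for illustration only.
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 1))
x = rng.normal(size=(1, 3))

mean, sigma = mc_dropout_predict(x, W1, W2)
print(f"prediction = {mean[0, 0]:.2f} +/- {sigma[0, 0]:.2f}")
```

In a real pipeline, the spread of the stochastic passes serves as the error bar that astronomers can propagate into downstream analysis.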
Anybody from the in-person audience have a question?
Hi. Thank you for speaking and for doing this; what you're talking about is just incredible. I'm in computer engineering, and I know nothing about black holes beyond a very small snippet. But the last part really spoke to me, because I thought it was interesting that you didn't feed the AI the laws of physics; you let the AI learn them on its own. I was wondering, why did you make that decision? Is it better to let the AI discover things on its own to make better predictions? Have you experimented with past models where the AI already knows physics, and have they made worse predictions overall?
Yeah, so that's a wonderful question. I guess one reason I'm interested in this is that we are expecting a lot of data in the coming decade. The Rubin Observatory I showed you is going to generate 10 petabytes of data. We actually don't think there are enough astronomers to really understand the universe. So in that case, it would be super cool if AGI or a superintelligence could actually do scientific discovery itself. I think that is a strong motivation for doing this experiment. Of course, we are only at the beginning. We are not even able to really interpret the neural network yet. Right now, we're just trying different methods to see if our guesses are correct.
OK. We have one more from the virtual audience. But keep your hands up, and Caitlin will bring a microphone. This is from Leobo. Can you go into any details of how the Small Steps model discovers calculus? How does that manifest?
Yeah, so the idea is actually not too complicated. I guess this figure here can represent it. The idea is that you do want to throw a lot of synthetic data at the model so it first memorizes the orbits and all of that. But we also have a regularizer in all of our neural networks. After the neural network remembers all the orbits, we tune up the regularizer. What that does is force the neural network into a smaller space for making predictions. Then, automatically, it jumps to the next level and figures out the epicycles. If you keep repeating that, and if you are lucky, then you get calculus and Newton's laws.

Thank you for the lovely talk. My question is, how would you modify the architecture that you're using for astrophysics to engage topics with more priors, like chemical physics, advanced materials, high-energy, light-matter interactions, that sort of thing?
OK, that is, again, another wonderful question. We actually don't have a very good solution yet. But I guess, based on the black hole example I showed you, in astronomy we can do a lot of simulations. Part of the reason is that the complexity of our systems is relatively low; it's not like human activity, where you have a lot of complexity. So often, by just writing down simple physical laws, we actually have a good enough approximation to generate synthetic data. Of course, there are still a lot of limitations that we want to improve.
Yeah, I mean, I actually had a question also about synthetic data. I guess, yeah, what are the challenges that you see with using synthetic data given you have so many simulation capabilities? I feel like it could probably help address some of the problems you mentioned with using AI.
Yeah, so I actually didn't really get the question. Are you asking if the synthetic data is enough?
Yeah, what are the challenges that limit using synthetic data to solve some of the ground truth problems?
OK, well, I guess it's partly a chicken-and-egg problem. When you make your synthetic data, you actually need to make assumptions, and before you do your observation, you don't really know if your assumptions are correct. In this sense, I would say the synthetic data and training only allow you to check consistency: you can check whether the physical model you used to create your synthetic data is consistent with your observations. But we always worry that there may be an edge case where it doesn't work.
Amazing talk. What do you think would be the reason some principles are easier to discover and some are harder? Do you think that's more related to the nature of the problem or to the training data itself?
Yeah, so I think the very easy thing is to rediscover interpolation. In that sense, we don't even need to go beyond supervised learning: you just provide enough labeled data, and the model can learn the pattern. That is the easy part. I think the big jump for us is really that our model seems to be able to solve differential equations. But again, this is a very early result. Something I worry about is that transformers and recurrent neural networks can be seen as Euler integrators. So even though I say we rediscovered calculus, maybe that is actually built into this Euler integrator. There's still a lot of work to be done in this research. But I just think that, in general, trying to simulate scientific discovery itself is very interesting.
OK, we're going to take one from the virtual audience. This is from Sanjay. Is the Small Steps model open source?
Not yet. I guess we still care about papers. So hopefully we will write some papers and then make it available.
Awesome. Thank you. My question is related to the general concept of AI for science. It seems like the hypothesis is that this works really well for experimental science, where you have data and you're trying to model the data. How would it work for a theoretical science, where you actually start from axioms and then develop things downstream?
That is a great question. I can even make a comment here: for scientific fields with a lot of data, like climate and hydrology, I would say all the physical models are wrong; it's just a question of degree. In that case, I think NVIDIA actually gave us a very good demo that a machine learning model can sometimes outperform the best theoretical models. But going back to your question: for these theory-driven fields, we always need to do simulations to turn the theory into predictions that we can test, and machine learning is able to accelerate these simulations. For example, all of the simulation libraries I showed you take a lot of supercomputer time to run. Not yet for black holes, but in cosmology, we actually have researchers who train neural networks on simulation data, and the neural network is able to mimic the simulation with many orders of magnitude of increase in performance. So I think that is one very good application of machine learning in theory-driven fields.
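The emulator idea described here, train a cheap surrogate on a handful of expensive simulation runs, then query the surrogate instead of the simulator, can be sketched in a few lines. The "simulation" below is a made-up stand-in function, and the surrogate is a simple polynomial fit rather than a neural network; the structure of the workflow is the point, not the model class.

```python
import numpy as np

def expensive_simulation(theta):
    """Stand-in for a costly physics simulation (hypothetical):
    maps an input parameter to an observable."""
    return np.sin(3 * theta) + 0.5 * theta**2

# Run the expensive simulator at a small set of training parameters.
theta_train = np.linspace(-1.0, 1.0, 20)
y_train = expensive_simulation(theta_train)

# Fit a cheap surrogate (here a degree-7 polynomial) to those runs.
coeffs = np.polyfit(theta_train, y_train, deg=7)
emulator = np.poly1d(coeffs)

# The emulator now replaces the simulator at new parameter values.
theta_new = 0.3
err = abs(emulator(theta_new) - expensive_simulation(theta_new))
print(err < 1e-2)   # surrogate tracks the simulator closely in-range
```

In the cosmology use case mentioned in the answer, the surrogate is a neural network trained on full simulation outputs, which is what buys the orders-of-magnitude speedup.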
My question is about the Small Steps model. What happens when you reduce the problem space to, let's say, just two dimensions, and you say, let's just try to figure things out like Flatland? Can it find a simple Euler-Lagrange formula or something? You used Mercury, which is great, but its orbit has relativistic corrections. What if you just used strictly Newtonian space and simplified the data?
So that is a great suggestion. I guess we didn't try that. I think one reason is that we are looking at history: we know that, with real data and smart people, this field was able to make discoveries, so we are just trying to reproduce those discoveries now. But I think you make a wonderful point. In order to understand simulated scientific discovery, it may be better to come up with a very, very simple model to see what the minimum requirement is. You can probe the limitations of the computation space, too.
That's right. I think that's a wonderful suggestion.

I'm a bit curious. When you mentioned rediscovering heliocentrism, what was the coordinate system that you gave it? Was it Earth-centric? And also, how do you deal with overfitting, the size of the model, and that sort of thing?
Yes. So the data we gave it is just the sky coordinates of the planets. The simplest thing for the machine learning model to do is just to memorize these coordinates. The next simplest thing is to do a Fourier transform, so that it becomes circles within circles. But when we start to apply the regularization, so that the model doesn't have enough free parameters to fit, all of a sudden it starts to realize that the best coordinate system is the one that puts the Sun at the center. I actually forgot to mention this: I think another application of simulating scientific discovery is that you can start to ask, when a scientific discovery happens, is it really that the person is super smart, or is it the data? It also helps you figure out, when we have a scientific bottleneck nowadays, which part is the thing we should invest in to develop. So I think there are just a lot of applications in this type of work.
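The "Fourier transform gives circles within circles" step has a neat concrete form: if the apparent track of a planet is written as a complex time series, its Fourier decomposition literally is an epicycle model, with each frequency one circle riding on another. The sketch below uses a toy geocentric track with made-up radii and frequencies; it is not the Small Steps data.

```python
import numpy as np

# Toy geocentric track: Earth's circular motion superposed on a
# slower, larger planet orbit (radii and frequencies are invented).
t = np.linspace(0.0, 1.0, 256, endpoint=False)
earth = np.exp(2j * np.pi * 5 * t)            # radius 1, 5 cycles
planet = 2.0 * np.exp(2j * np.pi * 2 * t)     # radius 2, 2 cycles
apparent = planet - earth                     # path as seen from Earth

# The FFT of the complex track *is* an epicycle model: each frequency
# bin is one circle (a "deferent" or "epicycle") riding on another.
spectrum = np.fft.fft(apparent) / len(t)
radii = np.sort(np.abs(spectrum))[-2:]        # two dominant circles
print(radii)                                  # recovers radii ~1 and ~2
```

Memorizing the raw coordinates, compressing them into a few circles, and finally switching to Sun-centered coordinates mirrors the sequence of "rediscoveries" described in the answer.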
Thank you so much for this talk, Dr. Chan, it was amazing. I'll give you a choice: either what keeps you up at night, or what's on your wish list for AGI to discover?
Okay, this is a great question. I do have a simple answer for the first question. I told you that I'm working on this Event Horizon Telescope project; it's a global team. What keeps me up at night is that I worry I don't have enough funding to support the team. So that is one worry that we always have.
In terms of the discovery from AGI, I will say quantum gravity, maybe? That was a great question. Maybe that requires artificial superintelligence, but I hope it will bring that down to a level that we can understand. Thank you so much.
I was actually going to ask something very similar to that, but I was going to add, when do you think we're going to have new physics from AGI, or AI, or ASI?
Yeah, that is also a wonderful question. My hope is in a few years, because in Arizona we work very hard to bring all the AI tools into astronomy. So I hope that when we actually get our projects off the ground, AI-driven discovery will happen. But the caveat here is that, at the beginning, it will only be in the data processing. AI-automated scientific ideas, I think, will take longer. But I guess I should probably ask OpenAI to help us do that.
Everybody, thank you so much for being here, and please help me give one more round of applause to Dr. CK Chan.