
fei-fei li's latest dialogue: advances in ai technology will bring unimaginable new application scenarios

2024-09-23


recently, fei-fei li, a16z partner martin casado, and researcher justin johnson discussed the history, current state, and future direction of the ai field. the conversation covered many aspects of ai technology, especially the future potential of generative ai and spatial intelligence.
fei-fei li emphasized that generative ai already existed during her graduate studies, but the early technology was not mature. with the leap in deep learning and computing power, generative ai has made remarkable progress in recent years and has become one of the core breakthroughs in the field of ai.
she also introduced her latest startup, world labs, which focuses on "spatial intelligence," or the ability of machines to understand and interact in 3d and 4d space.
she pointed out that spatial intelligence is not only applicable to generating virtual worlds, but can also be integrated with the real world and widely applied in augmented reality (ar), virtual reality (vr), and robotics. advances in ai technology, she argued, will bring us unimaginable new application scenarios, including virtual world generation, augmented reality, and interaction with the physical world.
the following is the main content of this conversation, enjoy~

Martin Casado

in the past two years, we have seen a wave of consumer ai companies and technologies emerge, which is crazy. and you have been working in this field for decades. so we might talk about the key contributions and insights you have made in this process.

Fei-Fei Li

it's a very exciting time. i've been working in this field for more than 20 years; we came out of the last ai winter and witnessed the birth of modern ai. then we saw the rise of deep learning, which showed us all kinds of possibilities, such as playing chess.

then we started to see the technology deepen and industry adopt some of the early possibilities, like language models, and now i think we are in the middle of a cambrian explosion.

in a sense, now in addition to text, we are also seeing pixels, video, audio, etc. beginning to be combined with ai applications and models, so this is a very exciting time.

Martin Casado

i've known both of you for a long time, and many people know you too, because you're very prominent in this field. but not everyone knows about your beginnings in the field of ai, so maybe we can briefly introduce your background to help the audience establish a basic understanding.

Justin Johnson

well, i first got into ai towards the end of my undergraduate studies. i was studying math and computer science at caltech, which was a great time. during that time, a very famous paper was published: the "cat paper" by quoc le, andrew ng, and others at google brain. that was my first exposure to the concept of deep learning.

this technology was amazing to me, and it was the first time i encountered this recipe: when powerful general learning algorithms, huge computing resources, and a lot of data come together, something magical happens. i was exposed to this idea around 2011 or 2012, and i felt at the time that this was what i would do in the future.

obviously, to do this kind of work you had to go to graduate school, so i found fei-fei at stanford, who was one of the few people in the world working on this in depth at the time. it was a great time to be working on deep learning and computer vision, because it was the moment when the technology was going from its infancy to maturity and widespread application.

during that time, we saw the beginnings of language modeling and the beginnings of discriminative computer vision—you can understand what’s in a picture by looking at it. this period also saw the early development of what we call generative ai today, and the core parts of the algorithms such as generating images and generating text were also solved by academia during my doctoral studies.

every morning when i woke up, i would open arxiv to check the latest research results. it was like opening a christmas present: there were new discoveries almost every day. in the past two years, the rest of the world has also begun to realize that ai technology delivers new "christmas gifts" every day, but for those of us who have been working in this field for more than ten years, that has been the experience all along.

Fei-Fei Li

obviously, i am much older than justin. i entered the field of ai from physics because my undergraduate background is in physics. physics is a subject that teaches you to think about bold questions, such as the unsolved mysteries in the world. in physics, these questions may be related to the atomic world and the universe, but this training made me interested in another problem - intelligence. therefore, i did my doctoral research in ai and computational neuroscience at caltech. although justin and i did not overlap at caltech, we share the same alma mater.

Justin Johnson

and we had the same advisor?

Fei-Fei Li

yes, your undergraduate advisor was also my phd advisor, pietro perona. when i was doing my phd, ai was in a cold winter in the public eye, but not in mine. it was more like a hibernation before spring, when machine learning and generative models were gathering strength. i think of myself as a "native" of machine learning, while justin's generation are the "natives" of deep learning.

machine learning was the precursor to deep learning, and we experimented with a variety of models. but at the end of my phd and early in my time as an assistant professor, my students and i realized that there was an overlooked element of ai that drives generalization, one the field had not thought deeply about at the time: data. we were focusing on complex models like bayesian models and overlooked the importance of letting the data drive the model.

this is one of the reasons we bet on imagenet. at that time, the datasets in every field were very small: the standard datasets for computer vision and natural language processing contained only a few thousand or tens of thousands of examples. we realized we needed to scale up to the internet. fortunately, the internet era was also rising, and we rode that wave. it was at this time that i came to stanford.

Martin Casado

these eras are like the ones we often talk about: imagenet was obviously the era that pushed, or at least made popular and feasible, computer vision. in the field of generative ai, we usually mention two key breakthroughs: one is the transformer paper, "attention is all you need", and the other is stable diffusion, which is talked about less.

is this a reasonable way to think about these two algorithmic breakthroughs from academia (and google in particular)? or was this more of a deliberate process? or were there other major breakthroughs that are less often mentioned that also got us to where we are today?

Justin Johnson

yes, i think the biggest breakthrough is computing power. i know the ai story is often the computing story, but even though it's often mentioned, i think its impact is underestimated.

the growth we have seen in computing power over the past decade is staggering. the first paper that is considered the breakthrough moment of deep learning in computer vision was alexnet, a 2012 paper where a deep neural network performed extremely well on the imagenet challenge, far outperforming other algorithms at the time.

the algorithms you might encounter in graduate school pale in comparison to alexnet, a 60-million-parameter deep neural network that was trained for six days on two gtx 580 graphics cards, the most powerful consumer graphics cards at the time, released in 2010.

i looked up some data last night to put this into a larger context. nvidia's latest graphics card is the gb200. can you guess how big the difference in computing power is between the gtx 580 and the gb200?

the number is in the thousands. i did some calculations last night: the six days of training that ran on two gtx 580s would probably take less than five minutes on a single gb200.
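as a rough sanity check on that claim, here is a back-of-envelope sketch in python; the speedup factor is an assumed round number standing in for "in the thousands", not a measured figure.

```python
# Back-of-envelope check of the GTX 580 vs. GB200 comparison above.
# The speedup factor is a hypothetical round number, not a benchmark result.
ALEXNET_TRAINING_DAYS = 6      # roughly six days on two GTX 580s (2012 paper)
ASSUMED_SPEEDUP = 2500         # "in the thousands", assumed for illustration

minutes = ALEXNET_TRAINING_DAYS * 24 * 60 / ASSUMED_SPEEDUP
print(f"about {minutes:.1f} minutes on a GB200-class gpu")  # ~3.5 minutes
```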

if you think about it this way, there is really a good argument here: the alexnet model from the 2012 imagenet challenge paper is a very classic model, namely a convolutional neural network.

in fact, this concept has been around since the 1980s; i remember my first graduate paper was about the same kind of thing, with a six- or seven-layer network structure. the main difference between alexnet and those earlier convolutional neural networks is the gpus: two gpus and a huge amount of data were used.

so what i was going to say is that most people are now familiar with the so-called bitter lesson, which is that if you develop an algorithm, just make sure you take advantage of existing computing resources as they become available over time. so you just need a system that can keep getting better.

on the other hand, there seems to be another equally convincing argument that new data sources actually unlock deep learning. imagenet is a good example. while many people believe that the self-attention mechanism is important for the transformer model, they will also say that it is a way to take advantage of manually annotated data.

because humans provide the annotations for sentence structure, and if you look at the clip model, it actually relies on alt-text tags that people on the internet attached to images. so this is really a story about data, not about computation. so is the answer a little bit of both, or is it more one way or the other? i think it's a little bit of both, but you also mentioned another very key point.

Martin Casado

i think there are actually two distinct eras in the field of algorithms. the imagenet era was the era of supervised learning, where we had a lot of data but we didn't know how to train with just the data itself.

the expectation with imagenet and other contemporary datasets is that we have a lot of images, but we need humans to annotate each image. and all the data we train on is looked at and annotated by human annotators one by one.

the big breakthrough with algorithms is that we now know how to train on data that doesn’t rely on human annotations. to a normal person without an ai background, it seems like if you’re training on human data, humans have actually already done the annotations, but the annotations aren’t explicit.

Justin Johnson

yes, philosophically it's a very important question, though it's more true in the domain of language than in the domain of images. but i do think it's an important distinction. clip was indeed annotated by humans. and with self-attention, humans have already understood the relationships between things, and the model then learns from those relationships.

so it's still annotated by humans, but the annotations are implicit rather than explicit. the difference is that in the supervised learning era, our learning task is more constrained. we have to design an ontology of concepts that we want to discover.

for example, in imagenet, fei-fei li and her students spent a lot of time thinking about what the one thousand categories in the imagenet challenge should be. and in other datasets of the same period, such as the coco dataset for object detection, they also spent a lot of time deciding which 80 categories to put in.

Martin Casado

so let's talk about generative ai. when i was doing my phd, before you two came along, i took andrew ng's machine learning course and daphne koller's bayesian course, and it was all very complex to me.

a lot of it was predictive modeling. i remember you unlocked the whole vision space, but generative ai has only emerged in the last four years or so. it's a completely different space to me - you're no longer recognizing objects, you're not predicting anything, you're generating new things.

so maybe we can talk about what are the key factors that make generative ai possible, how it’s different from what’s come before, and whether we should think about it differently, whether it’s part of a continuum or a whole new field?

Fei-Fei Li

it's very interesting. even when i was a graduate student, generative models were already there; we were trying to do generation, even with letters and digits, though no one remembers it. geoff hinton had some papers on generation, and we were thinking about how to generate.

in fact, if you look at it from the perspective of probability distributions, generation is mathematically possible, but what was generated at the time was not amazing at all. so although the concept of generation existed mathematically, none of the generated results were satisfying.

then i would like to mention one doctoral student in particular, who came to my lab with a strong interest in deep learning. his entire phd experience can almost be said to be a microcosm of the field's development.

his first project was data, and i forced him to do it. although he didn't like it, he admitted afterwards that he learned a lot of useful things: "now i'm glad you said that." then we turned to deep learning, and the core problem was how to generate text from images. in fact, there are three clear stages in this process.

the first stage was matching images and text: we have the image and the text, and we see how they relate to each other. his first academic paper, which was also his first phd paper, was about image retrieval based on scene graphs. next, we went deeper into generating text from pixels; he and andrej did a lot of work in this area, but it is still a very lossy way of generating, and a lot of the information in the pixel world gets lost.

Justin Johnson

there was a very famous piece of work in the middle. in 2015, leon gatys and his collaborators published a paper called "a neural algorithm of artistic style", showing how to convert real-world photos into van gogh-style pictures.

we may take it for granted now, but back in 2015, when that paper popped up on arxiv, it blew my mind. i felt like a “generative ai virus” had been injected into my brain. i thought to myself, “oh my god, i need to understand this algorithm, play around with it, and try to turn my own images into van gogh-style.”

so i spent a long weekend re-implementing the algorithm and getting it to work. it's actually a very simple algorithm: my implementation was about 300 lines of code, written in lua, because pytorch didn't exist at the time and we used lua torch. but even though the algorithm was simple, it was slow: for each image you had to run an optimization loop, which took a lot of time. the images were beautiful, but i just wanted it to be faster, and in the end we did make it faster.
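for readers who want a concrete picture of that per-image loop, here is a minimal sketch in modern pytorch rather than the lua torch used at the time; the layer indices and loss weights are illustrative assumptions, not the settings from the gatys paper or from the reimplementation described here.

```python
# Minimal sketch of Gatys-style neural style transfer: optimize the pixels of
# one image so its VGG features match the content image and its feature
# correlations (Gram matrices) match the style image. Slow by design: the
# optimization loop runs anew for every image, as described above.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

def gram(feat):
    # Style is captured per layer by feature correlations (a Gram matrix).
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def stylize(content_img, style_img, steps=300, style_weight=1e5):
    # content_img, style_img: normalized float tensors of shape [1, 3, H, W]
    vgg = vgg16(weights="IMAGENET1K_V1").features.eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    content_layer, style_layers = 15, [3, 8, 15, 22]  # illustrative choice

    def extract(x):
        content, styles = None, []
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in style_layers:
                styles.append(x)
            if i == content_layer:
                content = x
        return content, styles

    with torch.no_grad():
        target_content, _ = extract(content_img)
        target_grams = [gram(f) for f in extract(style_img)[1]]

    # Optimize the output image's pixels directly.
    img = content_img.clone().requires_grad_(True)
    opt = torch.optim.Adam([img], lr=0.02)
    for _ in range(steps):
        opt.zero_grad()
        content, styles = extract(img)
        loss = F.mse_loss(content, target_content)
        for f, g in zip(styles, target_grams):
            loss = loss + style_weight * F.mse_loss(gram(f), g)
        loss.backward()
        opt.step()
    return img.detach()
```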

Fei-Fei Li

another thing i'm very proud of is that he did some very cutting-edge work in the last part of his phd, before generative ai really went global. that project was one of the earliest generative ai works to generate complete images from natural language input. we used gans, which were very difficult to work with at the time. the problem was that we were not yet ready to describe a complete image in natural language.

so he adopted a scene-graph structured input, with inputs like "sheep", "grass", and "sky", and generated a complete image that way, as in the sketch below.
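to make that input format concrete, here is a hypothetical scene-graph structure of the kind described; the exact schema used in the original work may differ.

```python
# Hypothetical scene-graph input: objects plus relationships between them,
# which a generator then turns into a complete image. Illustrative only.
scene_graph = {
    "objects": ["sheep", "grass", "sky"],
    "relationships": [
        ("sheep", "standing on", "grass"),  # (subject, predicate, object)
        ("grass", "below", "sky"),
    ],
}
```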

so we gradually saw the complete shift from matching images and text, to style transfer, to image generation. you asked whether this was a huge change; for people like us it was a continuous process, but to the general public the results did seem sudden and impactful.

Martin Casado

i read your book; it's a really great book, and i highly recommend it. and, fei-fei, i would like to mention that for a long time, a lot of your research and direction has been focused on areas like spatial intelligence and pixel processing. what you're doing now with world labs is also related to spatial intelligence. can you talk about how this is part of your long-term journey? why did you decide to do it now? was it some technological breakthrough, or a personal reason? can you take us through the transition from ai research to world labs?

Fei-Fei Li

for me, this is both a personal quest and an intellectual journey. you mentioned my book, but my entire intellectual journey has really been a search for a north star and a belief that these north stars are critical to the advancement of our field.

in the beginning, i remember after grad school, i thought my north star was "telling stories with images" because to me, that was an important part of visual intelligence, which is part of what you call ai.

but when justin and andrej finished their work, i thought, “oh my god, this is my lifelong dream, what am i going to do next?” it’s progressed much faster than i expected—i thought it would take a hundred years to achieve this.

visual intelligence has always been my passion. i firmly believe that for every intelligent being, whether human, robot, or other form of being, it is crucial to learn how to see the world, how to reason, and how to interact with the world. whether it is navigation, manipulation, manufacturing, or even building civilization, vision and spatial intelligence play a fundamental role.

it’s probably as fundamental as language, or even older and more fundamental in some ways. so the north star for world labs is to unlock spatial intelligence, and now is the right time.

like justin said, we have the resources we need — the computational power and a deeper understanding of the data. we have become more sophisticated in our understanding of the data than we were in the days of imagenet.

we also have advances in algorithms, such as the cutting-edge work our co-founders ben mildenhall and christoph lassner have been doing with nerf, and we feel now is the right time to take the initiative, focus on this area, and unlock its potential.

Martin Casado

just to make it clear, you’ve started this company, world labs, and the problem you’re solving is “spatial intelligence.” can you briefly describe what spatial intelligence is?

Fei-Fei Li

spatial intelligence refers to the ability of machines to understand, perceive, reason, and act in 3d space and time. specifically, it refers to understanding how objects and events are located in 3d space and time, and how interactions in the world affect these 3d locations.

this is not just about keeping the machine in the data center or mainframe, but about letting it go out into the real world and understand this rich 3d and 4d world.

Martin Casado

when you say "world", do you mean the real physical world or an abstract conceptual world?

Fei-Fei Li

i think it's a little bit of both. and that's what we're looking at long term. even if you're generating virtual worlds or content, there's still a lot of benefit in being able to locate in 3d. or when you're recognizing the real world, being able to apply that 3d understanding to the real world is also part of it.

Martin Casado

you have a really strong team of co-founders. why do you think now is the right time to do this?

Fei-Fei Li

this is actually a long-term evolutionary process. after graduating from my ph.d., i began to look for ways to become an independent researcher and think about big problems in the field of ai and computer vision. at that time, i came to the conclusion that the past decade was mainly about understanding existing data, and the next decade will be about understanding new data.

the data of the past was mainly images and videos that already existed on the web, and the data of the future is brand new - smartphones have appeared, and these phones have cameras, have new sensors, and can locate themselves in the 3d world. it's not just a matter of taking a bunch of pixels from the internet and trying to decide whether it's a cat or a dog.

we hope to use these images as a universal sensor of the physical world, helping us understand the 3d and 4d structure of the world, both in physical and generative space.

after graduating from my phd, i made a big transition and entered the field of 3d computer vision, working with my colleagues to study how to predict the 3d shape of objects. later, i became very interested in the idea of learning 3d structure from 2d data.

when we discuss data, we often mention that it is difficult to obtain 3d data, but in fact, 2d images are projections of the 3d world, and there are many mathematical structures that can be used. even if you have a lot of 2d data, you can deduce the structure of the 3d world through these mathematical structures.

2020 was a breakthrough moment. our co-founder ben mildenhall came up with the nerf (neural radiance fields) method. it was a very simple and clear way to infer 3d structure from 2d observations, and it ignited the entire field of 3d computer vision.
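as a very rough illustration of that idea (not ben mildenhall's actual implementation), here is a minimal nerf-style sketch: a small mlp maps a 3d point and viewing direction to color and density, and a pixel is rendered by compositing samples along its camera ray; positional encoding, hierarchical sampling, and the other details of the real method are omitted.

```python
# Minimal sketch of the NeRF idea: learn a field (x, y, z, view dir) -> (rgb,
# density), render pixels by integrating along rays, and fit the field by
# comparing rendered pixels with real photos taken from known camera poses.
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (r, g, b, density)
        )

    def forward(self, points, dirs):
        out = self.mlp(torch.cat([points, dirs], dim=-1))
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3])

def render_rays(model, origins, dirs, near=2.0, far=6.0, n_samples=64):
    # Sample points along each ray, query the field, alpha-composite to a pixel.
    t = torch.linspace(near, far, n_samples)
    pts = origins[:, None, :] + dirs[:, None, :] * t[None, :, None]
    rgb, sigma = model(pts, dirs[:, None, :].expand_as(pts))
    alpha = 1.0 - torch.exp(-sigma * (far - near) / n_samples)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=1)  # composited pixel colors

# Training (sketch): sample rays from posed photos, render them, and minimize
# the mean squared error between rendered and observed pixel colors.
```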

at the same time, llm was starting to come to the fore. a lot of the work on language modeling has actually been going on in academia for a long time. even during my phd, i did some language modeling work with andrej karpathy in 2014.

Justin Johnson

that actually happened before transformers, but by the gpt-2 era it had become difficult to build such models in academia because they required too much computing. however, it is interesting that the nerf method proposed by ben only needs a few hours of training on a single gpu.

this made many academic researchers start to refocus on these problems, because some core algorithmic problems can be solved with limited computing resources, and you can get the most advanced results on a single gpu. so many academic researchers were thinking: how can we promote the development of this field through core algorithms? fei-fei and i talked a lot, and we were very sure of this.

Fei-Fei Li

yes, we found that our research directions were converging on similar goals to some extent. i also want to tell a very interesting technical story about pixels.

many people who work on language research may not know that before the era of generative ai, those of us working in the field of computer vision actually had a long history of research called 3d reconstruction.

this goes back to the 1970s, where you could take a photograph — because humans have two eyes, you could use stereoscopic photographs to try to triangulate and construct a 3d shape. however, this is a very hard problem that has not been fully solved to this day because there are complications like matching problems.
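to make the classic version of that problem concrete, here is a minimal two-view triangulation sketch (a textbook direct linear transform, not any particular system's code); it assumes the hard part, deciding which pixels in the two images actually match, has already been solved.

```python
# Triangulate one 3D point from its projections in two images with known
# 3x4 camera projection matrices P1 and P2 (direct linear transform).
import numpy as np

def triangulate(P1, P2, x1, x2):
    """x1, x2: the matched 2D point (x, y) in image 1 and image 2."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the null vector of A (smallest singular value).
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # back from homogeneous coordinates
```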

there has been a long history of progress in this area, but when nerf was combined with generative methods, especially in the context of diffusion models, 3d reconstruction and generation suddenly began to merge. in computer vision, we suddenly discovered that if we see something, or imagine something, both can converge in the direction of generating it. this was a very important moment, but many people may not have noticed it because we don't talk about it as widely as we talk about llm.

Justin Johnson

yes, there is reconstruction in pixel space, like you reconstruct a real scene, and if you can't see that scene, you use generative techniques. these two are actually very similar. you've been talking about language and pixels throughout this conversation, maybe this is a good time to talk about spatial intelligence versus language approaches, like are they complementary, or are they completely different?

Fei-Fei Li

i think they are complementary. i'm not sure how to define "completely different", but i can try to make a comparison. nowadays, many people are talking about gpt, openai, and multimodal models, and people think these models can process both pixels and language. so can they achieve the spatial reasoning we want? to answer this question, we need to open the "black box" of these systems and see how they work underneath.

the underlying representation of language models and the multimodal language models we are seeing now is "one-dimensional". we talk about context length, transformer, sequence, attention mechanism, but at the end of the day, the representation of these models is based on one-dimensional serialized tokens.

this representation is very natural when dealing with language, since text itself consists of a one-dimensional sequence of discrete letters. this one-dimensional representation is the basis for the success of llms, as well as the multimodal llms we are seeing today, which "squeeze" other modalities (such as images) into this one-dimensional representation.

in spatial intelligence, we think in the opposite direction—we believe that the three-dimensional nature of the world should be at the core of our representation. from an algorithmic perspective, this opens up new opportunities for us to process data and get different types of outputs, helping us solve some very different problems.
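a toy illustration of that contrast, purely as an assumption and not any particular model's internals: an llm-style pipeline flattens everything, text and even image patches, into one 1d token sequence, while a spatially centred representation keeps explicit 3d structure, such as objects with world-space positions.

```python
# 1D: text and image patches end up as a single serialized token sequence.
prompt_tokens = [101, 2023, 2003, 1037, 3899, 102]   # made-up ids for a sentence
image_patch_tokens = [30522, 30711, 30004, 30987]    # made-up ids for patches
llm_input = prompt_tokens + image_patch_tokens       # one flat 1D sequence

# 3D: a scene representation that keeps geometry explicit (toy example).
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    position: tuple  # (x, y, z) in world coordinates, metres
    size: tuple      # (width, depth, height), metres

scene = [
    SceneObject("table", (0.0, 0.0, 0.0), (1.2, 0.8, 0.75)),
    SceneObject("microphone", (0.1, 0.0, 0.75), (0.05, 0.05, 0.3)),
]
```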

even at a crude level, you might say, “multimodal llms can also see images.” they can, but they process images without keeping the essence of three dimensions at the core of their approach.

Justin Johnson

i completely agree that the fundamental difference between one-dimensional and three-dimensional representations is core. there is also a slightly more philosophical point that is equally important to me: language is essentially a purely generated signal; there is no language out in the world. you don't go out into nature and see words written in the sky. whatever data you put into a language model, it can, with enough generalization, spit out data of essentially the same kind; that is the nature of language generation.

but the 3d world is different, it obeys the laws of physics, has its own structure and materials. being able to fundamentally extract this information, represent it, and generate it is a completely different problem. although we will borrow some useful ideas from language models, it is fundamentally a different philosophical problem.

Martin Casado

right, so language models are one-dimensional and probably a poor representation of the physical world because it's human-generated with losses. and the other modality for generative models is pixels, which are 2d images and videos. if you watch a video, you can see a 3d scene because the camera can pan. so how is spatial intelligence different from a 2d video?

Fei-Fei Li

there are two points worth thinking about here. one is the underlying representation, and the other is the convenience of the user experience. these two can sometimes be confused. we perceive 2d - our retina is a two-dimensional structure, but our brain sees it as a projection of the three-dimensional world.

you may want to move objects, move the camera, and in principle you can do these things with a 2d representation and model, but it is not suitable for solving the problem you are asking. a 2d projection of a dynamic 3d world may be modeled, but putting the 3d representation at the core of the model is better suited to the needs of the problem.

our goal is to incorporate more 3d representations into the core of the model to provide a better experience for users. this is also related to my "north star". why do we emphasize "spatial intelligence" instead of "flat pixel intelligence"?

because the trajectory of intelligence, if you look back at evolution, is that it ultimately allows animals and humans to move freely in the world, interact, create civilizations, and even make a sandwich. so translating this 3d nature into technology is the key to unlocking countless potential applications, even if some of them seem like superficial advances.

Martin Casado

i think this is a very subtle but important point. maybe we can take this a step further by talking about some application scenarios. when we talk about developing a technology model that can achieve spatial intelligence, what might that look like specifically? what are the potential application scenarios?

Fei-Fei Li

there are many things that our spatial intelligence model can do, but one thing i'm particularly excited about is world generation. similar to text-to-image generators, we already have text-to-video generators today: you give them a prompt or an image and they generate an amazing two-second clip. but i think we can take this experience into a 3d world.

we can imagine that spatial intelligence will help us elevate these experiences to 3d in the future, generating not just a picture or a video, but a complete, simulated, rich, interactive 3d world. perhaps it will be used for games, perhaps for virtual photography, and the application areas are incredibly wide.

Justin Johnson

i think the technology will improve over time. it's very hard to build these things, so static problems might be relatively easy, but in the long run, we want it to be fully dynamic and interactive, like everything you just described.

Fei-Fei Li

yes, that's exactly the definition of spatial intelligence. we'll start with more static questions, but everything you mentioned is in the future planning of spatial intelligence.

Justin Johnson

this is also reflected in the name of our company, "world labs": the name means it is about building and understanding worlds. when we first tell people this name, they don't always understand it, because in computer vision, reconstruction, and generation, we usually distinguish between levels of what we can do. the first level is recognizing objects, such as microphones, chairs, and other discrete things in the world; a lot of the imagenet work is about recognizing objects.

but then we moved up to the level of scenes - scenes are made up of objects. for example, right now we have a recording studio with a table, a microphone, and a person sitting in a chair, which is a combination of objects. but the "world" we imagine goes beyond scenes. a scene may be a single thing, but we want to break these boundaries and walk out the door, walk into the street, see the traffic passing by, see the leaves swaying in the wind, and be able to interact with these things.

Fei-Fei Li

another very exciting thing is about the term "new media". with this technology, the boundaries between the real world, the virtual imaginary world or the augmented world, the predicted world become blurred. the real world is 3d, so in the digital world, there must be a 3d representation to merge with the real world. you can't effectively interact with the real 3d world with only 2d or even 1d.

this capability unlocks endless application scenarios. the first application scenario justin mentioned, the generation of virtual worlds, can be used for any purpose. the second could be augmented reality. around the same time that world labs was founded, apple released the vision pro and used the term "spatial computing". we're pretty much talking about the same thing; we just emphasize "spatial intelligence". spatial computing requires spatial intelligence, there's no doubt about that.

we don’t know what the future hardware will look like — it could be goggles, glasses, or even contact lenses. but at the interface between the real and virtual worlds, whether it’s augmenting your work capabilities, helping you fix your car even if you’re not a professional mechanic, or just providing a “pokemon go++”-like experience for entertainment, this technology will become the operating system for ar/vr.

Justin Johnson

in the extreme, what ar devices need to do is to accompany you all the time, understand the world you see in real time, and help you complete tasks in your daily life. i am very excited about this, especially the fusion between virtual and reality. when you can perfectly understand the 3d of your surroundings in real time, it may even replace some things in the real world.

for example, we now have screens of various sizes - ipads, computer monitors, tvs, watches, etc., which present information in different scenarios. but if we can seamlessly merge virtual content with the physical world, these devices will no longer be necessary. the virtual world can show you the information you need at the right moment and in the most appropriate way.

another huge application is the hybridization of the digital virtual world and the 3d physical world, especially in the field of robotics. robots must act in the physical world, while their computation and brain are in the digital world. the bridge between learning and behavior must be built by spatial intelligence.

Martin Casado

you mentioned virtual worlds, augmented reality, and now you're talking about the purely physical world, like for robotics. this is a very broad direction, especially if you plan to be involved in these different areas. how do you think deep tech relates to these specific application areas?

Fei-Fei Li

we think of ourselves as a deep technology company, a platform company, providing models that can serve these different application scenarios. as for which application scenario to focus on first, i think today's devices are not yet mature enough.

i actually got my first vr headset when i was in graduate school, and when i put it on, i thought, “oh my god, this is crazy!” i’m sure a lot of people have had a similar experience when they first used vr.

i like the vision pro so much that i stayed up all night to buy one the day it was released, but it's not yet fully mature as a mass-market platform, so we as a company may choose to enter a market that is more mature.

sometimes there is simplicity in versatility. we have a vision as a deep technology company, and we believe that there are some fundamental problems that need to be solved well, and if solved well, they can be applied to many different fields. we see the long-term goal of the company as building and realizing the dream of spatial intelligence.

Justin Johnson

in fact, i think that's where the impact of what you do is. i don't think we'll ever really get there completely, because it's such a fundamental thing - the universe is essentially an evolving 4d structure, and spatial intelligence in the broad sense is about understanding the full depth of that structure and finding all the applications within it. so while we have a particular set of ideas today, i believe this journey will take us to places that we simply can't imagine right now.

Fei-Fei Li

the amazing thing about technology is that it keeps opening up more possibilities. as we keep pushing forward, those possibilities will continue to expand.