
The Curse of Variety

-
Interactive transcript
[MUSIC PLAYING]
CATHY WU: My name is Cathy Wu. I'm the Gilbert W. Winslow Assistant Professor of the Department of Civil and Environmental Engineering as well as the Institute for Data Systems and Society. And I'm also a Principal Investigator in the Laboratory for Information and Decision Systems. What brought me to MIT is to advance control and optimization in transportation.
My overall research area is what you might call learning-enabled control and optimization, especially for mobility systems. And what this means is, as we've discussed, transportation is full of these control and optimization problems, for instance, traffic control, traffic signal control, as well as routing of vehicles, routing to deliver packages, for instance.
And these are notoriously challenging problems due to nonlinear dynamics in the control problems and due to what's often called the curse of dimensionality for these logistics and operations questions. And so with a lot of advances and a lot of-- basically, with a lot of advances in machine learning in recent years, I think we have an opportunity to actually revisit a lot of the longstanding challenges in the field.
So one very interesting challenge to me that my research group here is uncovering that, in some sense, existed all along, but we're trying to bring it more to light, is something that we're calling the curse of variety. This is to describe how, when we think of, say, for instance, traffic signal control, we don't just mean a single control problem, because the control problem could be different depending on the time of day.
It could be different depending on the weather conditions. It could be different depending on if it's electric vehicles on the roads or trucks, or what kinds of technologies are there to measure what kind of information on the roadways, what kind of sensing, what kind of communication. All of these factors actually induce an exponentially large set of different control problems.
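To make the exponential growth concrete, here is a minimal sketch that counts scenario combinations for one intersection. The factor names and values are illustrative assumptions, not the research group's actual scenario model; the point is only that each added factor multiplies the number of distinct control problems.

```python
from itertools import product

# Hypothetical scenario factors for a single signalized intersection.
# These names and values are illustrative assumptions only.
FACTORS = {
    "time_of_day": ["am_peak", "midday", "pm_peak", "night"],
    "weather": ["clear", "rain", "snow"],
    "vehicle_mix": ["mostly_cars", "mixed", "truck_heavy"],
    "sensing": ["loop_detectors", "cameras", "connected_vehicles"],
}


def enumerate_scenarios(factors):
    """Return every combination of factor values as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def count_scenarios(factors):
    """Count the combinations without materializing them."""
    total = 1
    for values in factors.values():
        total *= len(values)
    return total


scenarios = enumerate_scenarios(FACTORS)
print(len(scenarios))  # 4 * 3 * 3 * 3 = 108 distinct control problems
```

With ten two-valued factors, the same count is already 2**10 = 1024, which is the sense in which the set of problems grows exponentially in the number of factors.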
And this is also true for optimization problems, for operations, for logistics problems. And so we see a lot of evidence in the literature, where we see hundreds, if not thousands, of papers on these control problems, on traffic control, on platooning of vehicles, on how to drive efficiently and so on. And it's not scalable for us to individually study each of these.
And so we're really excited, and we're modeling these problems, we're devising learning-enabled, machine learning-based methods that allow us to much more scalably solve not just a single control problem or a single optimization problem, but a family of control problems, a family of optimization problems that could be exponential in size.
The implications here are that having to solve these control problems one at a time, each problem effectively takes a grad student a few months at best, a year, two years to solve. And there's thousands of these problem variations, which is why we're calling this the curse of variety.
And so in order for us to gain the confidence that a control strategy-- an improved control strategy-- is worth, say, deploying on our streets may take years. And we actually lose out on a lot of opportunity to improve safety, improve environmental issues, improve equity if we need to wait that long. And so being able to accelerate this process is a huge potential.
[MUSIC PLAYING]
-
CATHY WU: One of our projects in the environmental direction has to do with the big question of, how do we mitigate climate change? And right now, the conversation is very heavily focused on electrification. But it is starting to come to light that it cannot be the only solution.
Electrification, fleet turnover does take time, takes on the order of decades. And it may not be accessible globally. So one of the things that we're looking at is, what are some low-cost, near-term opportunities that are sort of a really good bang for your buck?
And so we're looking specifically at intersections. It turns out that, especially in the US, we have so many intersections. And something that's very particular to intersections is you have to stop. You have to stop at intersections. You have to stop, then you have to re-accelerate in order to get through the intersection.
Every time you accelerate, you're burning fossil fuels. You're contributing to greenhouse gas emissions. And we have done some sort of preliminary analysis that indicates that potentially up to 20% of land transportation CO2 is just wasted at intersections. It's in part due to our very heavy trucking industry, but just that 40% of vehicle miles traveled are nearby to intersections.
And so what I mean by wasted is that this is the energy that's spent just re-accelerating and idling at intersections, not the energy spent doing productive work, carrying cars through the intersection. So we're looking at what can we do there. There's a very, very classic well-known strategy called eco driving at signalized intersections. And this is the idea of-- you can basically time cars such that when they get to the light, it's green.
This is about timing cars, not about timing traffic signals. You can think of traffic signals as binary, like, go or stop. And with that, you may have to stop, and then you have to re-accelerate. With timing of cars, you can potentially have a smoother trajectory. And this has the potential to save on the re-acceleration piece.
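The idea of timing a car to arrive on green can be sketched with simple arithmetic. The function below is a deliberately simplified illustration, not the method used in the research described here: it assumes a single vehicle, a known time until the light turns green, a constant approach speed, and no queue at the stop line. The function name, parameters, and the 2 m/s minimum speed are hypothetical.

```python
def eco_approach_speed(distance_m, time_to_green_s, v_max_mps, v_min_mps=2.0):
    """Constant speed that reaches the stop line as the light turns green.

    Simplifying assumptions (hypothetical): one vehicle, no queue ahead,
    and the time until green is known exactly.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    if time_to_green_s <= 0:
        # The light is already green: no reason to slow down.
        return v_max_mps
    # Speed that covers the remaining distance in exactly the remaining red time.
    target = distance_m / time_to_green_s
    # Respect the speed limit and a minimum practical cruising speed.
    return max(v_min_mps, min(v_max_mps, target))


# 200 m from the light, green in 20 s, 15 m/s speed limit.
print(eco_approach_speed(200, 20, 15))  # 10.0 m/s instead of 15 m/s and a stop
```

When the target falls below the minimum speed, the car in this sketch still arrives early and has to stop, which is one of many cases a real controller must handle.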
So there's extensive studies, thousands of papers on this topic very supportive as a low-cost, near-term intervention. And this has been studied for decades. But we don't see almost any of it in practice. And given how urgent this climate crisis is, that sort of poses an interesting question as to why.
And so we are investigating-- we're sort of breaking down this problem into its constituent parts. And basically what we're finding in the literature is that the individual studies will look at individual scenarios for eco driving, so individual intersections at this time of day, this route.
But it's not really clear if it was effective at this intersection, would it be effective for another intersection? Would it be effective for Boston? Would it be effective for New York? How effective is it at a higher scale, at a higher level for something like climate mitigation?
And so the largest project in my group right now is basically trying to model the impact. And it's still ongoing, so I can't say too much about the outcomes. But we're modeling 10 major metropolitan areas across the US, which comprises 30,000 intersections, each with different roadway topologies, like, grade of roads or elevation, different weather conditions, different travel demand, and fuel mix, and so on.
So this is one example where the curse of variety sort of kills you and has sort of-- I would claim has delayed the adoption, if indeed we are successful in confirming this to be effective, it would have delayed the impact of this very simple, low-cost, near-term intervention because we could not actually estimate the outcome. And so each of these scenarios for each intersection with different configurations, different weather conditions, and whatnot is its own multi-agent control problem that needs to be solved. And so we're devising general purpose machine learning techniques that can solve not just one, but a whole family that's comprising of tens of thousands of these scenarios.
[MUSIC PLAYING]
-
[MUSIC PLAYING]
CATHY WU: Something that's actually been of interest on the-- actually, from industry players has been we have an area of inquiry on what we call learning for combinatorial optimization. So up to now, we've discussed more about control problems. But as I mentioned earlier, this curse of variety also occurs for combinatorial operations type problems.
So one example is the very classic vehicle routing problem. So with this, you can think of Amazon delivery trucks-- every day is routing 100,000 packages for the city of Austin and has to figure out which trucks, which packages, what order of places to visit, and trying to do this as efficiently as possible with as few trucks as possible, and so on.
So this is not just one problem. This is also a family of problems that are known as vehicle routing problems. So for instance, the trucks have capacities. The trucks, as they become electric, will require charging in addition. They, in other cases-- for instance, in ride hailing-- they may need to pick up objects as well as drop off objects. During the pandemic, it was very important that there was a cold chain. There could also be cases where delivery is by bicycle. And these are all different variations on the same central problem.
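As a rough illustration of one member of this family, the sketch below applies a textbook greedy nearest-neighbor heuristic to a capacitated vehicle routing problem. It is not the approach developed by the research group, and all names and data are hypothetical. Each truck repeatedly visits the closest unserved customer whose demand still fits, then a new truck starts from the depot.

```python
import math


def nearest_neighbor_routes(depot, customers, demands, capacity):
    """Greedy capacitated routing: returns one list of customer indices per truck."""
    for i, demand in enumerate(demands):
        if demand > capacity:
            raise ValueError(f"customer {i} demand exceeds truck capacity")
    unserved = set(range(len(customers)))
    routes = []
    while unserved:
        # Start a new truck at the depot with an empty load.
        position, load, route = depot, 0, []
        while True:
            # Customers whose demand still fits on this truck.
            feasible = [i for i in unserved if load + demands[i] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda i: math.dist(position, customers[i]))
            route.append(nxt)
            load += demands[nxt]
            position = customers[nxt]
            unserved.remove(nxt)
        routes.append(route)
    return routes


# Hypothetical toy instance: four customers, each truck carries two packages.
routes = nearest_neighbor_routes(
    depot=(0, 0),
    customers=[(1, 0), (2, 0), (0, 5), (0, 6)],
    demands=[1, 1, 1, 1],
    capacity=2,
)
print(routes)  # [[0, 1], [2, 3]]
```

Supporting another variation, such as charging stops or pickups as well as drop-offs, would mean changing the feasibility check and the selection rule by hand, which hints at why each variation tends to need its own custom heuristic.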
That, by itself, is fine. What the challenge is is that every one of these variations typically requires custom heuristics, which are very expensive, may take years to develop to actually solve these problems. These are what are known as NP-hard problems. These are problems where it is unlikely that there even exist efficient algorithms to solve, let alone us be able to find them.
What we're doing is basically taking a learning-enabled approach. Our approach is essentially that, well, we have done a lot of work to solve some variations of these problems. It is very possible that these new variations-- let's say delivery by bicycle, delivery for electric vehicles-- are not that, that, that different. But right now, with classical approaches, it still takes years to develop effective solutions for these variations. Our hope is that learning-enabled approaches with modern, deep learning-based methods can actually help us automatically identify heuristics that can be effective for these problems.
And so we've actually been able to demonstrate this successfully for vehicle routing problems, where what we're able to say is that-- let's say we're able to solve a small version of a problem, a small vehicle routing problem. But many times, especially in transportation settings, problems are large. We're delivering thousands, if not hundreds of thousands, say, of these packages. What we've been able to do is contribute learning-enabled approaches that can take these existing solution methods that work on small problems and incorporate them into this learning framework that's based on a problem decomposition that is able to then scale and accelerate that existing solver for large problems and is able to do this in a way that can automatically identify heuristics for these specialized variations of the vehicle routing problem.
So what we're doing now is, actually, we're taking some of these similar ideas. And we are trying to see if maybe this works not only for vehicle routing problems, but maybe also for multi-agent robotics problems. And so we're actually applying this to the setting of automated warehouses.
We also are exploring a few directions not only for big problems, but also for small problems. And in particular, there's a very general class of solvers, known as mixed-integer program solvers, that are very effective. You can formulate whatever combinatorial problem you want. If it's small enough, this-- these tools can solve it. But once they get too large, these problems-- these solvers struggle. So we're also looking at integrating machine learning into these solvers, these very general-purpose solvers, to see to what extent we can push the boundaries on the types of problems that they can solve.
-
[MUSIC PLAYING]
CATHY WU: The first thing I want to do is I want to clarify something. Traffic congestion is an inherent property for when a lot of people want to share a resource. And in this case that resource is space.
So while we can't really eliminate traffic congestion, we can potentially eliminate traffic jams. So what's the difference? So you can think of traffic congestion as delay. So you're driving home from work and it takes longer during rush hour than at 2:00 AM.
So that's delay. We can think of traffic jams as more like variation in travel. So sometimes you're going, sometimes you're stopping. Sometimes you're going, sometimes you're stopping. It turns out that one thing that we can do is when we sort of take the average, when we smooth out that variation, we don't hurt travel time.
When we smooth out the variation, we don't hurt travel time. In fact, we might even be able to improve travel time. That's what we're finding.
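A toy calculation can illustrate the distinction between delay and variation. The two speed profiles below are synthetic, made-up numbers rather than measured traffic data: one alternates between 20 m/s and a full stop, the other holds the average speed. Both cover the same distance in the same time, but only the first involves repeated re-acceleration.

```python
from statistics import mean, pstdev


def summarize(speeds_mps, dt_s=1.0):
    """Distance covered, speed variation, and total speed regained by re-accelerating."""
    distance = sum(v * dt_s for v in speeds_mps)
    # Sum of all speed increases between consecutive samples (a crude proxy
    # for the effort spent re-accelerating).
    total_speedup = sum(max(0.0, b - a) for a, b in zip(speeds_mps, speeds_mps[1:]))
    return {
        "distance_m": distance,
        "speed_std": pstdev(speeds_mps),
        "total_speedup": total_speedup,
    }


# Synthetic profiles sampled once per second for one minute.
stop_and_go = [20.0, 0.0] * 30
smoothed = [mean(stop_and_go)] * len(stop_and_go)

jam = summarize(stop_and_go)
smooth = summarize(smoothed)
print(jam["distance_m"], smooth["distance_m"])        # 600.0 600.0
print(jam["speed_std"], smooth["speed_std"])          # 10.0 0.0
print(jam["total_speedup"], smooth["total_speedup"])  # 580.0 0
```

This only shows that smoothing need not cost travel time in an idealized case; whether travel time actually improves in real traffic depends on dynamics the sketch does not model.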
And so actually something I wanted to add a bit earlier than this is that the kind of stop-and-go and stop-and-go, like why do we care about this, rather, if there's still going to be congestion overall, it's still going to have some delay? The challenge or the importance is that this stop-and-go is associated with a lot of car crashes, is associated with environmental harms, because we're wasting fuel, and there's a lot of air pollution. So the point is that there are many benefits to managing traffic even if we cannot eliminate traffic congestion.
And another point I want to make is that we usually think of traffic congestion as occurring on the streets. But like I said, if we think about it as we're sharing a resource that's finite, this also occurs in warehouses. This also occurs in ports. This also even occurs in chips, in printed circuit boards, because we want to fit a lot of wires into the same space.
So getting back to the question of how to eliminate traffic jams, it turns out that you can reduce the variation on travel without harming travel time. So you can sort of selectively have some vehicles slow down at the right times at the right places, and by doing so we can actually eliminate traffic jams. And this is something that we and others in recent years are really starting to systematically uncover as a potential for connected and automated vehicles as a key application of these vehicles.
And we are able to do so more systematically with these learning-enabled approaches, in particular, this powerful methodology of deep reinforcement learning. So in short, you can eliminate traffic jams by slowing down in order to speed up. And that's often also true in life.
-
[MUSIC PLAYING]
CATHY WU: So we have this recent work. Title's a little bit long. It's called "The Impact of Task Underspecification in Evaluating Deep Reinforcement Learning." This one's actually a really fun work. And this actually is a little bit indicative of my style of research, where we start with a practical problem and I love it when it sort of reveals a more common problem than our application.
So this goes back to the curse of variety. We basically thought, OK, let's apply standard deep reinforcement learning methodology to control traffic signals for a variety of intersections. We were modeling traffic signals-- sorry, we were modeling traffic intersections from Salt Lake City in Utah, in collaboration with Utah Department of Transportation. And we found that, unfortunately, and also to our surprise, the existing methods, the existing deep reinforcement learning methods for traffic signal control, even though in previous papers they reported outperforming our classic control strategies, the fixed time control, the things that you see, adaptive-- actually mostly fixed time control.
In our study, when we considered this sort of broader set, with more variation in traffic intersections, that was no longer the case. The algorithms no longer did as well. And this was even if we say trained on this set, I should clarify. So it wasn't just taking that work and like generalizing it to this set.
We were actually training on this set of intersections. And the algorithms themselves were not robust to this broader set of intersections. In our case, we found that the classic currently implemented strategies outperformed these learning-based approaches. This highlighted-- I mean, this in some sense can be thought of as a negative result for machine learning. But for us it actually pointed out an issue in how we evaluate machine learning methods, because we previously concluded they are better.
But when we evaluate them slightly differently with a broader set of what we think of as a traffic intersection, that was no longer the case. So this is what we call task underspecification, where if we think of an intersection as, say, one thing or five things, where in actuality it's 1,000 things or maybe even more, 10,000 things, if we evaluate them, if we train them and evaluate them only on a few instances of what a traffic intersection is, we might draw the wrong conclusion. And we found this to not only be the case for traffic signal control, but when we went back to standard benchmarks that the community relies on for evaluating machine learning reinforcement learning methods, we found a similar conclusion.
And so in this work we revealed the problem. And we also reveal how hard it is to solve. And we have some ideas on how to overcome it. But it's actually been amazing to engage with the community and sort of highlight this. We've had a lot of folks from industry who are applying reinforcement learning and they have come to us saying, thank you for justifying what I'm seeing in my own experience using reinforcement learning. And we've had other reinforcement learning researchers engage in conversations on how to start addressing this challenge.