Review of my prediction from two months ago (Apr 25):
1. Tesla won't have genuine level 4 autonomy by the end of August.
2. To hit level 4, Tesla will have to follow Waymo's strategy: (a) Lidar/radar sensors, (b) geo-fencing with hi-def pre-mapping, and (c) the phone-a-human feature.
So far Tesla is holding firm against (a), partially/mostly doing (b), and leaning so hard on (c) that I don't believe it counts as level 4.
Also I have a new long-shot prediction today: Tesla will pause their robotaxi service on September 1, citing burdensome legislation that takes effect in Texas on that day.
More confident prediction: Tesla will either not be in compliance with the new Texas law by September 1st or will comply by being officially classified as supervised level 2 autonomy.
[Aside: I'm not sure if anyone is paying attention to these comments I'm adding. Maybe I'll repeat all this in an update in the next AGI Friday.]
Per ChatGPT, Tesla should be fine on September 1st; they already cover what's being asked: https://chatgpt.com/share/6862360f-95ec-800e-80e2-d013746ef87f
Oh man, it's yet again breaking my brain how compelling that is. Bayesian-update in progress; thank you! I really might be high on copium here. So embarrassing. (Not to mention the embarrassment of going further out on this limb without doing more research on the new Texas law. Thank you also for that.)
Review of possible outcomes:
1. Tesla is cheating and gets caught
2. Tesla is cheating and this gradually becomes clear as they fail to scale up as promised
3. Tesla has pulled this off and the supervision is just abundance of caution
4. Tesla is faking-it-till-making-it but does end up making it and there's no proof they were ever faking it
Basically, I'm either vindicated in my skepticism, proven wrong, or I'm right but sound like an idiot to those who weren't also skeptical all along.
Another possibility: Tesla has something far inferior to Waymo but also far superior to the average human driver.
The more I look into this, the more uncertain I am about how hard it will be for Tesla to comply. But it also sounds like it could be as late as March 2026 before this Texas law has teeth? So who knows. California's laws are stricter, like having to report every disengagement.
So a variant on my prediction about Tesla's compliance in Texas is that Tesla won't launch in California anytime soon. I'm not sure how to define "anytime soon" yet. To be fully vindicated it'd be "before Tesla adds lidar". If Tesla launches (legally) in California this year, I'll probably concede I was super wrong.
You say that, but comma.ai, which made the aftermarket man-in-the-middle self-driving upgrade I installed in my 2019 Toyota Corolla, is an entirely vision-based (no radar required) system that similarly took the end-to-end approach of driving entirely off of vision. I was convinced by the founder's claims that an end-to-end vision model would outperform a more heuristic system that labeled features in a separate layer before passing them to the driving controller, but here we are with Waymo actually delivering results…
(For reference, I uninstalled my Comma 2 after it died permanently, but it helped me on many late-night drives. Its MTBF was also too low for an automotive context, though apparently the Comma 3 raised that to something like a thousand hours or more? It was really neat, but I like driving now. Not road tripping, however.)
Oh, that's very cool that you've tried out Comma.ai! A thousand hours between failures is super impressive but also sounds so dangerous, in terms of lulling you into complacency about supervising it. But just one or two more 9's (or 0's if we're talking MTBF) and I think we're near average human level.
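For a rough sense of what those extra 9's buy, here's a back-of-envelope sketch; the average-speed and human-crash-rate constants are my own illustrative assumptions, not figures from this thread:

```python
# Back-of-envelope: convert device MTBF (hours) into miles between failures
# and compare against an assumed human crash rate. All constants are
# illustrative assumptions.
AVG_SPEED_MPH = 35                # assumed mixed city/highway average speed
HUMAN_MILES_PER_CRASH = 500_000   # assumed rough order of magnitude

for mtbf_hours in [1_000, 10_000, 100_000]:  # each step adds one zero/"9"
    miles_between_failures = mtbf_hours * AVG_SPEED_MPH
    ratio = miles_between_failures / HUMAN_MILES_PER_CRASH
    print(f"MTBF {mtbf_hours:>7,} h -> ~{miles_between_failures:>9,} miles "
          f"({ratio:.2f}x assumed human baseline)")
```

By this crude math, a 1,000-hour MTBF is only a few percent of the assumed human baseline, one more zero gets within shouting distance of it, and two more comfortably passes it -- which is roughly the "one or two more 9's" intuition above.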
I guess I agree that Comma.ai's progress towards level 4 is evidence that lidar and radar aren't strictly necessary. I mean, we know from humans that 2 cameras and maybe a microphone are *technically* sufficient. Probably one-eyed deaf people can drive well enough too, so technically-technically one single camera on a swivel suffices.
I still like the Comma 2, but it dropped dead twice during its lifetime while I was driving hands-off-wheel. I can see why they pushed the Comma 3 so hard, with the 2 just being a very obviously modified Chinese smartphone glued onto an external heatsink and controller.
PS: Long version of the video Markos linked to, of the autonomous Tesla delivery: https://www.youtube.com/watch?v=lRRtW16GalE
It sure does look impressive.
After sleeping on it (and discussing it with commenters on my Manifold market), here's where my current level of cynicism is at:
You know the trope where Marketing tells lies to customers and Engineering has to scramble to make them true? I believe (with, um, just barely over 50% confidence?) that Tesla is stringing us along with these controlled demos while they finish getting to actual level 4 autonomy. If they pull that off then I'll just end up looking like I was high on copium, as the kids say. So I'm hoping the cheating comes to light before then. Hopefully not via a faux-autonomous Tesla killing someone, like what happened with Uber's self-driving program.
But I guess even more than avoiding looking like an idiot, I want a freaking self-driving car. So I will begrudgingly root for Tesla actually pulling this off. Which, to say it one more time, I don't believe they have *yet*.
"Full Self-Driving (FSD) has only been at some hundreds of miles between critical disengagements"
a) citation needed for that
b) even if accurate, the number would be an average. So maybe Tesla chose an easy ~30-mile route (a ChatGPT estimate, based on the claim that the trip took 30 minutes) where FSD's disengagement rate is already pretty good. (See the quick arithmetic sketch after this list.)
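To put rough numbers on point (b): even taking a hundreds-of-miles figure at face value, one clean short demo is weak evidence. Here's a minimal sketch, assuming critical disengagements arrive at a constant per-mile rate (an exponential model); the 300-miles-between-disengagements figure is an illustrative assumption, not a sourced number:

```python
import math

# P(zero critical disengagements on one trip), assuming disengagements
# arrive at a constant per-mile rate (Poisson/exponential model).
# Both constants below are illustrative assumptions, not sourced data.
MILES_BETWEEN_CRITICAL = 300  # assumed average for the current FSD build
TRIP_MILES = 30               # rough estimate of the demo route length

p_clean_trip = math.exp(-TRIP_MILES / MILES_BETWEEN_CRITICAL)
print(f"P(clean {TRIP_MILES}-mile trip) ~ {p_clean_trip:.0%}")  # ~90%
```

Under those assumptions, a system averaging only a few hundred miles between critical disengagements still completes a 30-mile trip cleanly about nine times out of ten, so a single flawless demo barely distinguishes level 2 from level 4. And that's before any route cherry-picking.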
And here is the video of the Tesla delivery from their official account https://www.youtube.com/watch?v=GU16hXSSGKs
Ah, thanks for the video! That wasn't released yet when I published this post.
On point (a), I believe the following crowdsourced data is the best we have: https://teslafsdtracker.com/
On point (b), I would've thought so but from the video it didn't look easy! I genuinely don't know what to believe right now. Maybe the crowdsourced data is wrong? Maybe the version of FSD in the robotaxis and this autonomous delivery is a new, much better version? (Do you think it's one or both of those?)
I guess I'm predicting (with lower and lower confidence) that the crowdsourced data is correct, Tesla is still at level 2, and they're basically faking level 4. If you're feeling confident I'm wrong, please do help me understand your thinking.
This tracker thing is tricky. It's self-reported by a group of interested individuals who disengage from FSD on various occasions based on personal criteria. I am not big on statistics, but that sounds like a lot of noise to filter out :D
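For what it's worth, that noise concern is easy to illustrate with a toy simulation (everything below is made up for illustration): give every reporter the *same* true failure rate but a different personal threshold for what counts as a disengagement, and see how far the estimates spread:

```python
import random

random.seed(0)
TRUE_MILES_BETWEEN_ISSUES = 300  # assumed true rate, identical for everyone
MILES_DRIVEN = 10_000            # miles each hypothetical reporter drives

# Each reporter logs a given issue only with some personal probability,
# standing in for differing personal criteria ("was that critical?").
estimates = []
for _ in range(50):  # 50 hypothetical reporters
    report_prob = random.uniform(0.3, 1.0)  # how consistently this reporter logs issues
    miles, logged = 0.0, 0
    while True:
        miles += random.expovariate(1 / TRUE_MILES_BETWEEN_ISSUES)
        if miles > MILES_DRIVEN:
            break
        if random.random() < report_prob:
            logged += 1
    estimates.append(MILES_DRIVEN / max(logged, 1))

print(f"min ~{min(estimates):.0f}, max ~{max(estimates):.0f} miles per reported disengagement")
```

Even with an identical underlying system, the reported miles-per-disengagement spreads over a wide range purely from differing reporting criteria, so the tracker's headline number deserves wide error bars.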
I've also watched an embarrassing number of videos of people demoing FSD 13.2.9 and have seen them screw up in various ways, like running a stop sign. My vague impression is that a serious crash every thousand-ish miles -- if the humans were checked out -- seems about right.
Someone on Twitter made an interesting comment: regular FSD drivers will (on average) probably disengage more often than the safety monitor in the robotaxi. Not that the monitor is less cautious, but regular drivers know the limits of the system better.
In case that matters, people have figured out the location of the delivery, about 15 miles from the Tesla factory https://x.com/DevinOlsenn/status/1938750271766798649/photo/1
"Waymo doesn’t do highways for their commercial service yet but they’ve done so in testing for a decade and a half. I was in the back seat of one doing so in 2011! (With a human supervising in the driver’s seat back then, of course.) Is Musk aiming for the technicality that Waymos’ highway driving might never have happened to have literally no one in the car, not even the back seat? I doubt even that’s true, but can’t say for sure. It reads to me as pretty disingenuous in any case."
It seems to me that, to be completely truthful, you would also say that to the best of your knowledge this was the first fully autonomous drive on the freeway, no?
You assume that Waymo has done it, but the last mention of it on their website I could find was from a year and a half ago, when they said they would start testing autonomous rides on the freeway without giving a timeline, and there have been no updates that I can find. I found a couple of Reddit posters claiming to have seen an empty Waymo on the freeway, but that's the closest thing to confirmation I could find in ten minutes of looking.
I guess I don't know anything with certainty but I'm pretty close to certainty on what Waymo's doing. Namely, that typical Waymo driving, including on highways, is without real-time monitoring by humans. Tesla's a massive question mark. I'm definitely aiming to be maximally truthful! Even to the point of trying to convey my full probability distribution.
I agree that reports from randos on Reddit aren't reliable but the prior was high and I guess we can call that some weak Bayesian evidence? But also, to reemphasize, Musk's claim seems disingenuous even if it were technically true. It's like if Musk trumpeted "to our knowledge this is the first FULLY AUTONOMOUS drive with no other vehicles on the road". See what I mean? If Waymo did autonomous highway driving first but with passengers in the car, with no ability to intervene, that's, if anything, more impressive than with an empty car. There's just no meaningful first for Tesla to claim. (Nor a meaningless first, but that's the part I'm not totally sure of.)
To start, my prior would have been that Waymo has done tons of autonomous driving on the freeway, and I agree that a drive that is autonomous, with someone in the passenger seat monitoring but unable to intervene, would count. However, I wanted to see if Musk's claim had any truth to it, and when I looked I couldn't actually find anything beyond claims that they were going to start testing 18 months ago. I would think that if they were successfully moving along with it they would have given an update, so now I'm in the position of thinking that their silence might actually mean something.
At the very least, it would seem to me that it isn't disingenuous to claim you were the first to do something when your main competitor hasn't made any claims of doing it.
Again though, I did a Google search and a ChatGPT ask, so it's not like I did a deep dive into this, and maybe there is better information out there. Though it does seem weird to me that it would be hard to find info if they were doing this regularly.
Ah, I think it's just that Waymo hasn't taken a member of the public on a freeway without a safety driver (they've taken me personally on a freeway *with* a safety driver, in 2011). But since Tesla didn't do that either, I'm sticking to my "no meaningful first" claim, and disingenuousness from Musk.
But let me work harder to be fair: The autonomous delivery itself is a cool first. If they didn't cheat. Maybe it's a cool first even if they did cheat?
(As a Waymo fanboy I just have to add that Waymo *totally could've* done an autonomous delivery years ago, if they sold cars.)