Yes, but game engines also hold the entire world inside themselves. There’s no guessing, no estimating, no making sure that what it’s looking at is actually a human or a bush - it already knows that.
The problem with computer vision being lazy is that it can’t ignore something without understanding what it’s looking at, and it can’t understand what it’s looking at without analyzing the data. It’s a circular problem, and will be ridiculously hard to solve - the crux of the issue is that we as people are analyzing that same data, we just don’t realize it.
So then it kinda just sounds like they’re doing things they want to do