Large language models (LLMs) like GPT-4 can identify a person’s age, location, gender and income with up to 85 per cent accuracy simply by analysing their posts on social media.
But the AIs also picked up on subtler cues, like location-specific slang, and could estimate a salary range from a user’s profession and location.
Reference:
arXiv DOI: 10.48550/arXiv.2310.07298
It sounds like the reason they used Reddit was so they could easily find users who had expressly revealed the information in question, and use it to verify that the AI was accurately deducing the same info from style alone.
They used Reddit because it has corralled dumb users. Users are no longer around anywhere else on the Internet, just here on social media. And yes, what better place to find dumb users than on Reddit!