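Worldwide, every user's conversations with AI add up to a vast, messy, unstructured dataset, and large language models are very good at extracting users' micro-data from this casual small talk: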
In these cases, the user’s search strategy acts as a filter on a pre-existing corpus of information. Large language models introduce a qualitatively different dynamic: rather than selecting from existing content, they generate content on demand.
function createLineParser() {
So I trained seven separate binary classifiers instead, then used majority voting to decide if a sentence was AI-generated:
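The voting step can be sketched roughly as follows. This is a minimal illustration, not the actual training code: `majorityVote` is a hypothetical helper name, and each classifier is assumed to be a function that takes a sentence and returns `true` when it votes "AI-generated". The constant `always` and `never` voters stand in for the seven trained models.

```javascript
// Hypothetical helper: combine binary classifiers by strict majority vote.
// Each entry in `classifiers` is assumed to be a function
// (sentence) => boolean, where true means "AI-generated".
function majorityVote(classifiers, sentence) {
  if (classifiers.length === 0) {
    throw new Error("need at least one classifier");
  }
  let yesVotes = 0;
  for (const classify of classifiers) {
    if (classify(sentence)) {
      yesVotes += 1;
    }
  }
  // Strict majority: with an odd number of voters (such as seven)
  // a tie cannot occur.
  return yesVotes * 2 > classifiers.length;
}

// Placeholder voters standing in for the seven trained models.
const always = () => true;
const never = () => false;
const voters = [always, always, always, always, never, never, never];

console.log(majorityVote(voters, "Some sentence to check.")); // true (4 of 7 vote yes)
```

Using an odd number of voters keeps the decision unambiguous, and a single misfiring classifier cannot flip the result on its own.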