Epoch AI, a research outfit, estimates the well of high-quality textual data on the public internet will run dry by 2026.
English Encyclopedia
Text corpus
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed). Corpora are used for statistical analysis and hypothesis testing, such as checking occurrences or validating linguistic rules within a specific language territory.
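As a rough illustration of "checking occurrences", the sketch below counts word frequencies over a tiny in-memory corpus. The sample sentences, the `word_frequencies` helper, and the simple regular-expression tokenizer are all assumptions made for illustration; real corpora are far larger and are usually processed with dedicated tooling.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corpus would be loaded from many files.
corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
]


def word_frequencies(texts):
    """Count lowercase word occurrences across all texts."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer: runs of ASCII letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


freqs = word_frequencies(corpus)
print(freqs.most_common(3))
```

Frequency counts like these are one of the simplest statistics a corpus supports; a hypothesis about usage (for example, that one word is more common than another) can then be checked against the counts.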