The common approach to marking approximately duplicate records is to sort the records by a chosen keyword and then compare pairs of records within a fixed-length window.
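The windowed comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sorted_neighborhood`, the `window` and `threshold` parameters, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.85):
    """Return index pairs (i, j), i < j, of records judged approximately duplicate.

    Hypothetical sketch: `key` builds the sort key for a record, `window` is the
    fixed window length, and `threshold` is an assumed similarity cutoff.
    """
    # Sort record indices by the chosen key so likely duplicates end up adjacent.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        # Compare each record only with the next (window - 1) records in sorted order.
        for j in order[pos + 1 : pos + window]:
            score = SequenceMatcher(None, records[i], records[j]).ratio()
            if score >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Illustrative data only; not taken from the source.
records = [
    "John Smith, 12 Oak St",
    "Jon Smith, 12 Oak St",
    "Mary Jones, 5 Elm Rd",
    "John Smith, 12 Oak Street",
]
duplicates = sorted_neighborhood(records, key=str.lower, window=3)
```

Because only records inside the same window are compared, the result depends on both the sort key and the window length: a pair of near-duplicates that sorts more than `window - 1` positions apart is never compared.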
We put forward a new Chinese information manipulation method that combines the keyword indexing method with the whole-length indexing method, compares with the indexing...