In the process of conversion between Unicode and the old coding system, there must be some words that cannot be represented by Unicode. Unicode officials use a placeholder to represent these words, which is: U+FFFD replacement characters.
Then encode UTF-8 of U+FFFD, which is exactly''. If this'' is repeated many times, such as'', and then displayed in the environment of GBK/CP 936/GB 2312/GB18030, a Chinese character has 2 bytes, and the final result is: Fafa (0xEFBF), Jin.
Python code: 1. & gt& gt& gtu'\uFFFD '。 Encoding ('utf-8')*22. ' '3.& gt& gt& gt4.& gt& gt& gtPrintu' \ ufffd '。 Encoding ('UTF-8') * 2 Output result: "Copy".