The Byte Order Mark (BOM) is the Unicode character U+FEFF placed at the start of a text file; its encoded bytes indicate the file's encoding according to the following table:
+-------------+-----------------------+
| Bytes       | Encoding Form         |
+-------------+-----------------------+
| 00 00 FE FF | UTF-32, big-endian    |
| FF FE 00 00 | UTF-32, little-endian |
| FE FF       | UTF-16, big-endian    |
| FF FE       | UTF-16, little-endian |
| EF BB BF    | UTF-8                 |
+-------------+-----------------------+
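In code, the table amounts to a prefix check on the first bytes of the file. Here is a minimal Python sketch of that check; `BOMS` and `sniff_bom` are names made up for illustration, not part of any library:

```python
# Hypothetical lookup table mirroring the BOM table above.
# The 4-byte UTF-32 marks are listed before the 2-byte UTF-16 marks
# because FF FE 00 00 also starts with FF FE.
BOMS = [
    (b"\x00\x00\xfe\xff", "UTF-32, big-endian"),
    (b"\xff\xfe\x00\x00", "UTF-32, little-endian"),
    (b"\xfe\xff", "UTF-16, big-endian"),
    (b"\xff\xfe", "UTF-16, little-endian"),
    (b"\xef\xbb\xbf", "UTF-8"),
]


def sniff_bom(data: bytes):
    """Return the encoding form whose BOM prefixes `data`, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

For instance, `sniff_bom(b"\xef\xbb\xbf@")` matches the UTF-8 row, while data with no recognized prefix yields `None`.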
My question is: is there any combination of bytes that could cause one UTF encoding to be confused with another?
For example, if I have a UTF-16 big-endian file without a BOM that contains the characters U+EFBB and U+BF40 (bytes EF BB BF 40), can it be confused with a UTF-8 file that has a BOM followed by the ASCII character @?
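To make the example concrete, here is a minimal Python sketch that decodes the same four bytes both ways. It assumes Python's standard `utf-16-be` codec (which does no BOM handling) and `utf-8-sig` codec (which strips a leading EF BB BF); the variable names are only for illustration:

```python
# The four bytes from the example above.
data = bytes([0xEF, 0xBB, 0xBF, 0x40])

# Interpretation 1: UTF-16 big-endian without a BOM.
as_utf16_be = data.decode("utf-16-be")

# Interpretation 2: UTF-8 with a BOM, which utf-8-sig removes.
as_utf8_bom = data.decode("utf-8-sig")

print([f"U+{ord(c):04X}" for c in as_utf16_be])  # ['U+EFBB', 'U+BF40']
print(repr(as_utf8_bom))                         # '@'
```

Both decodes succeed without error, so the bytes alone do not say which reading was intended.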