To a computer, text characters are symbols. Each symbol is assigned a number (an integer) so that it can be stored in memory. Encoding is the system by which these numbers are assigned. Many different encodings arose to handle special characters and various languages. Issues can arise when a file is translated or edited using a different encoding from the one it was created with, because encodings are not always compatible or interchangeable.
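As a rough illustration of how such a mismatch appears, the short Python sketch below stores one character under two widely used encodings (Latin-1 and UTF-8, both chosen here only as examples) and then reads the bytes back with the wrong one. The variable names are illustrative and not tied to any particular application.

```python
# The same character is stored as different byte values under different encodings.
text = "é"

latin1_bytes = text.encode("latin-1")  # one byte: 0xE9
utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'

# Reading the UTF-8 bytes as if they were Latin-1 gives garbled text,
# because each byte is interpreted as a separate character.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # Ã©
```

The file's bytes never changed in this example; only the assumption about which encoding to use when reading them did.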
Incompatibilities between encoding systems were the inspiration for the Unicode standard. The goal of Unicode is to provide a unique number for each character, regardless of language, platform, or program. It does this by assigning each character a code point written as U+####, where #### is a hexadecimal number. Within Unicode there are different methods (formats) for storing these unique numbers. A discussion of those methods is outside the scope of this article, but to keep things simple, UTF-8 is an efficient way to store Unicode text and is considered the best practice for encoding.
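To make the distinction between a code point and its stored bytes concrete, here is a minimal Python sketch. The helper name `describe` is hypothetical and exists only for this example; it returns a character's code point in U+ notation along with the bytes UTF-8 uses to store it.

```python
def describe(ch):
    """Return (code point in U+ notation, UTF-8 bytes as hex) for one character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    return code_point, utf8_hex

for ch in ["A", "é", "€"]:
    code_point, utf8_hex = describe(ch)
    print(ch, code_point, utf8_hex)

# A U+0041 41
# é U+00E9 C3 A9
# € U+20AC E2 82 AC
```

Each character has exactly one code point, while the number of bytes UTF-8 needs to store it varies: one byte for a basic Latin letter, more for accented letters and other symbols.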
Software applications used for creating websites may save files in any of several character encodings. It can be helpful to know where to check, and how to change, your file's character encoding in your software. This is an important enough issue that the W3C (the organization that develops Web standards) has information about setting character encoding in the most popular web design software:
Setting encoding in web authoring applications
Character encoding becomes especially important when you are editing a file. If you open a file using a different encoding from the one it was created with, characters may display incorrectly. In our next article, we'll discuss the character encoding check in the cPanel File Manager and what to look for if you plan to edit a file there.