Monday, 9 July 2012

Previous Interview Question Next Interview Question

Use of Character Sets in Meta tag

To display an HTML page correctly, the browser must know what character-set to use.
The character-set for the early world wide web was ASCII. ASCII supports the numbers from 0-9, the uppercase and lowercase English alphabet, and some special characters.

The Unicode Consortium

The Unicode Consortium develops the Unicode Standard. Their goal is to replace the existing character-sets with its standard Unicode Transformation Format (UTF).
The Unicode Standard has become a success and is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc. The Unicode standard is also supported in many operating systems and all modern browsers.
The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.
Unicode can be implemented by different character-sets. The most commonly used encodings are UTF-8 and UTF-16:
UTF-8A character in UTF8 can be from 1 to 4 bytes long. UTF-8 can represent any character in the Unicode standard. UTF-8 is backwards compatible with ASCII. UTF-8 is the preferred encoding for e-mail and web pages
UTF-1616-bit Unicode Transformation Format is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire. UTF-16 is used in major operating systems and environments, like Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET byte code environments
Tip: The first 256 characters of Unicode character-sets correspond to the 256 characters of ISO-8859-1.
Tip: All HTML 4 processors already support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16!

Previous Interview Question Next Interview Question


Post a Comment