HTML Encyclopaedia

Special Characters

It is well known that, to include the characters <, & and > in an HTML document, the author has to write &lt;, &amp; and &gt;, since these characters have special meanings to HTML interpreters. A significant number of other special characters can be included using this sort of notation. These are all characters outside the normal ASCII character set. Including them directly as binary is unreliable because they all have the "high" bit set, and there are several different interpretations of high-bit-set characters in use.
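
The problem with high-bit-set characters can be illustrated with a small Python sketch. It takes the single byte with decimal value 231 and decodes it under three legacy 8-bit encodings. The choice of these three encodings is an assumption made purely for illustration; they are not named anywhere in this article.

```python
# One byte with the "high" bit set, decoded under three legacy 8-bit encodings.
raw = bytes([231])  # decimal 231, the value the table gives for c cedilla

# The "high" bit is the top bit of the byte (value 128, hex 0x80).
high_bit_set = bool(raw[0] & 0x80)

# Codec names are Python's own; picking these three is an assumption.
interpretations = {
    name: raw.decode(name) for name in ("latin-1", "cp437", "mac_roman")
}

print("high bit set:", high_bit_set)
for name, char in interpretations.items():
    print(name, repr(char))
```

Only the Latin-1 reading gives c cedilla; the other two give different characters, which is why the &# and entity notations are the safer route.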

In HTML there are two ways of including such special characters:

  1. Use the notation &# followed immediately by the decimal value associated with the character in the table below, then a semicolon.
  2. Use the notation & followed by a vaguely descriptive code known as an entity, then a semicolon. The entities are also listed below.

For example, the French word for French is français. [Note the small hook under the letter c; this makes it a c cedilla.] The HTML could be either

fran&ccedil;ais
     or
fran&#231;ais
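
As a rough check that the two notations are equivalent, the following Python sketch decodes both forms with the standard library's html.unescape and builds both references for a character from its decimal value. The helper name to_references is hypothetical, and Python's entity table is somewhat larger than the one listed in this article.

```python
import html
from html.entities import codepoint2name


def to_references(char):
    """Return the (numeric, named) references for a single character."""
    code = ord(char)
    numeric = f"&#{code};"
    named = f"&{codepoint2name[code]};"  # raises KeyError if no entity exists
    return numeric, named


numeric_form = "fran&#231;ais"
named_form = "fran&ccedil;ais"

# Both spellings should decode to the same text.
decoded_numeric = html.unescape(numeric_form)
decoded_named = html.unescape(named_form)

print(to_references("\u00e7"))
print(decoded_numeric == decoded_named)
```
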
There are some contexts in which HTML 3.2 distinguishes between letters and other symbols; the special characters that are treated as letters are indicated in the table below. The characters listed are the only ones for which the HTML 3.2 standard specifically lists entities.

Description             Entity  Decimal Value  &#<decimal>  &<entity>  Treated as letter
required space          nbsp    160                                    N
inverted exclamation    iexcl   161            ¡            ¡          N
cent sign               cent    162            ¢            ¢          N
pound sign              pound   163            £            £          N
currency sign           curren  164            ¤            ¤          N
yen sign                yen     165            ¥            ¥          N
broken bar              brvbar  166            ¦            ¦          N
section sign            sect    167            §            §          N
umlaut                  uml     168            ¨            ¨          N
copyright sign          copy    169            ©            ©          N
feminine ordinal        ordf    170            ª            ª          N
left guillemot          laquo   171            «            «          N
logical not sign        not     172            ¬            ¬          N
soft hyphen             shy     173            ­            ­          N
registered trademark    reg     174            ®            ®          N
spacing macron          macr    175            ¯            ¯          N
degree                  deg     176            °            °          N
plus-minus sign         plusmn  177            ±            ±          N
superscript 2           sup2    178            ²            ²          N
superscript 3           sup3    179            ³            ³          N
spacing acute           acute   180            ´            ´          N
mu                      micro   181            µ            µ          N
pilcrow                 para    182            ¶            ¶          N
middle dot              middot  183            ·            ·          N
spacing cedilla         cedil   184            ¸            ¸          N
superscript 1           sup1    185            ¹            ¹          N
masculine ordinal       ordm    186            º            º          N
right guillemot         raquo   187            »            »          N
one quarter             frac14  188            ¼            ¼          N
half                    frac12  189            ½            ½          N
three quarters          frac34  190            ¾            ¾          N
inverted question mark  iquest  191            ¿            ¿          N
A grave                 Agrave  192            À            À          Y
A acute                 Aacute  193            Á            Á          Y
A circumflex            Acirc   194            Â            Â          Y
A tilde                 Atilde  195            Ã            Ã          Y
A diaeresis             Auml    196            Ä            Ä          Y
A ring                  Aring   197            Å            Å          Y
AE ligature             AElig   198            Æ            Æ          Y
C cedilla               Ccedil  199            Ç            Ç          Y
E grave                 Egrave  200            È            È          Y
E acute                 Eacute  201            É            É          Y
E circumflex            Ecirc   202            Ê            Ê          Y
E diaeresis             Euml    203            Ë            Ë          Y
I grave                 Igrave  204            Ì            Ì          Y
I acute                 Iacute  205            Í            Í          Y
I circumflex            Icirc   206            Î            Î          Y
I diaeresis             Iuml    207            Ï            Ï          Y
ETH                     ETH     208            Ð            Ð          Y
N tilde                 Ntilde  209            Ñ            Ñ          Y
O grave                 Ograve  210            Ò            Ò          Y
O acute                 Oacute  211            Ó            Ó          Y
O circumflex            Ocirc   212            Ô            Ô          Y
O tilde                 Otilde  213            Õ            Õ          Y
O diaeresis             Ouml    214            Ö            Ö          Y
multiplication sign     times   215            ×            ×          N
O slash                 Oslash  216            Ø            Ø          Y
U grave                 Ugrave  217            Ù            Ù          Y
U acute                 Uacute  218            Ú            Ú          Y
U circumflex            Ucirc   219            Û            Û          Y
U diaeresis             Uuml    220            Ü            Ü          Y
Y acute                 Yacute  221            Ý            Ý          Y
THORN                   THORN   222            Þ            Þ          Y
sharp s                 szlig   223            ß            ß          Y
a grave                 agrave  224            à            à          Y
a acute                 aacute  225            á            á          Y
a circumflex            acirc   226            â            â          Y
a tilde                 atilde  227            ã            ã          Y
a diaeresis             auml    228            ä            ä          Y
a ring                  aring   229            å            å          Y
ae ligature             aelig   230            æ            æ          Y
c cedilla               ccedil  231            ç            ç          Y
e grave                 egrave  232            è            è          Y
e acute                 eacute  233            é            é          Y
e circumflex            ecirc   234            ê            ê          Y
e diaeresis             euml    235            ë            ë          Y
i grave                 igrave  236            ì            ì          Y
i acute                 iacute  237            í            í          Y
i circumflex            icirc   238            î            î          Y
i diaeresis             iuml    239            ï            ï          Y
eth                     eth     240            ð            ð          Y
n tilde                 ntilde  241            ñ            ñ          Y
o grave                 ograve  242            ò            ò          Y
o acute                 oacute  243            ó            ó          Y
o circumflex            ocirc   244            ô            ô          Y
o tilde                 otilde  245            õ            õ          Y
o diaeresis             ouml    246            ö            ö          Y
division sign           divide  247            ÷            ÷          N
o slash                 oslash  248            ø            ø          Y
u grave                 ugrave  249            ù            ù          Y
u acute                 uacute  250            ú            ú          Y
u circumflex            ucirc   251            û            û          Y
u diaeresis             uuml    252            ü            ü          Y
y acute                 yacute  253            ý            ý          Y
thorn                   thorn   254            þ            þ          Y
y diaeresis             yuml    255            ÿ            ÿ          Y
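
The entity names and decimal values above can be cross-checked against Python's standard html.entities module, as in the sketch below. The helper name table_row is hypothetical, and the "treated as letter" rule in it (192 to 255, excluding 215 and 247) is simply read off the table above rather than taken from the HTML 3.2 text.

```python
from html.entities import codepoint2name


def table_row(code):
    """Build (entity, decimal value, character, letter flag) for one code."""
    entity = codepoint2name[code]
    # Assumption: this rule summarises the last column of the table above.
    treated_as_letter = code >= 192 and code not in (215, 247)
    return (entity, code, chr(code), "Y" if treated_as_letter else "N")


# The table covers decimal values 160 to 255 inclusive.
rows = [table_row(code) for code in range(160, 256)]

for entity, code, char, letter in rows:
    print(f"{entity:<8}{code:<6}{char:<4}{letter}")
```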

eth and thorn are Icelandic letters.

A spacing cedilla (184) is a cedilla that appears after the character it is associated with, i.e. c cedilla would appear as c¸ rather than ç. The usefulness of the spacing marks is unclear. Non-spacing marks would have been much more useful, as any composite special character could then have been formed, such as the w circumflex used in Welsh, the u macron used in some Anglo-Saxon place names, the l slash used in zloty (the Polish currency) and the S with upside-down circumflex in Skoda (the Czech car manufacturer). There would also then have been room for more of the Greek alphabet, widely used in science and mathematics. Perhaps one day WWW browsers will be Unicode capable.
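
For comparison, here is a minimal Python sketch of how non-spacing marks behave in Unicode, where a base letter followed by a combining mark can be normalised into a single composite character. The helper name compose and the constant names are hypothetical, and this illustrates Unicode rather than anything in HTML 3.2. The l slash is left out because Unicode does not build it from a base letter plus a mark.

```python
import unicodedata

# Unicode combining (non-spacing) marks.
COMBINING_CIRCUMFLEX = "\u0302"
COMBINING_MACRON = "\u0304"
COMBINING_CARON = "\u030c"  # the "upside-down circumflex"
COMBINING_CEDILLA = "\u0327"


def compose(base, mark):
    """Join a base letter and a non-spacing mark into one character (NFC)."""
    return unicodedata.normalize("NFC", base + mark)


w_circumflex = compose("w", COMBINING_CIRCUMFLEX)
u_macron = compose("u", COMBINING_MACRON)
s_caron = compose("S", COMBINING_CARON)
c_cedilla = compose("c", COMBINING_CEDILLA)

# The spacing cedilla (184) stays a separate character after the letter.
spacing_pair = "c" + chr(184)

print(w_circumflex, u_macron, s_caron, c_cedilla, spacing_pair)
```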

Both Netscape 3.0 and Microsoft Internet Explorer 3.0 handled all of these correctly, although I'm not too sure what a masculine or feminine ordinal is supposed to look like.