Here's a PHP variable for all XHTML entities with their equivalent numeric char reference:
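// Keys are the XHTML named entities; values are their decimal numeric character references.
$entities_xhtml = array('&quot;'=>'&#34;', '&lt;'=>'&#60;', '&gt;'=>'&#62;', '&amp;'=>'&#38;', '&nbsp;'=>'&#160;', '&apos;'=>'&#39;',
'&iexcl;'=>'&#161;', '&cent;'=>'&#162;', '&pound;'=>'&#163;', '&curren;'=>'&#164;', '&yen;'=>'&#165;', '&brvbar;'=>'&#166;', '&sect;'=>'&#167;', '&uml;'=>'&#168;', '&copy;'=>'&#169;', '&ordf;'=>'&#170;', '&laquo;'=>'&#171;', '&not;'=>'&#172;', '&shy;'=>'&#173;', '&reg;'=>'&#174;', '&macr;'=>'&#175;', '&deg;'=>'&#176;', '&plusmn;'=>'&#177;', '&sup2;'=>'&#178;', '&sup3;'=>'&#179;', '&acute;'=>'&#180;', '&micro;'=>'&#181;', '&para;'=>'&#182;', '&middot;'=>'&#183;', '&cedil;'=>'&#184;', '&sup1;'=>'&#185;', '&ordm;'=>'&#186;', '&raquo;'=>'&#187;', '&frac14;'=>'&#188;', '&frac12;'=>'&#189;', '&frac34;'=>'&#190;', '&iquest;'=>'&#191;',
'&Agrave;'=>'&#192;', '&Aacute;'=>'&#193;', '&Acirc;'=>'&#194;', '&Atilde;'=>'&#195;', '&Auml;'=>'&#196;', '&Aring;'=>'&#197;', '&AElig;'=>'&#198;', '&Ccedil;'=>'&#199;', '&Egrave;'=>'&#200;', '&Eacute;'=>'&#201;', '&Ecirc;'=>'&#202;', '&Euml;'=>'&#203;', '&Igrave;'=>'&#204;', '&Iacute;'=>'&#205;', '&Icirc;'=>'&#206;', '&Iuml;'=>'&#207;', '&ETH;'=>'&#208;', '&Ntilde;'=>'&#209;', '&Ograve;'=>'&#210;', '&Oacute;'=>'&#211;', '&Ocirc;'=>'&#212;', '&Otilde;'=>'&#213;', '&Ouml;'=>'&#214;', '&times;'=>'&#215;', '&Oslash;'=>'&#216;', '&Ugrave;'=>'&#217;', '&Uacute;'=>'&#218;', '&Ucirc;'=>'&#219;', '&Uuml;'=>'&#220;', '&Yacute;'=>'&#221;', '&THORN;'=>'&#222;', '&szlig;'=>'&#223;',
'&agrave;'=>'&#224;', '&aacute;'=>'&#225;', '&acirc;'=>'&#226;', '&atilde;'=>'&#227;', '&auml;'=>'&#228;', '&aring;'=>'&#229;', '&aelig;'=>'&#230;', '&ccedil;'=>'&#231;', '&egrave;'=>'&#232;', '&eacute;'=>'&#233;', '&ecirc;'=>'&#234;', '&euml;'=>'&#235;', '&igrave;'=>'&#236;', '&iacute;'=>'&#237;', '&icirc;'=>'&#238;', '&iuml;'=>'&#239;', '&eth;'=>'&#240;', '&ntilde;'=>'&#241;', '&ograve;'=>'&#242;', '&oacute;'=>'&#243;', '&ocirc;'=>'&#244;', '&otilde;'=>'&#245;', '&ouml;'=>'&#246;', '&divide;'=>'&#247;', '&oslash;'=>'&#248;', '&ugrave;'=>'&#249;', '&uacute;'=>'&#250;', '&ucirc;'=>'&#251;', '&uuml;'=>'&#252;', '&yacute;'=>'&#253;', '&thorn;'=>'&#254;', '&yuml;'=>'&#255;',
'&minus;'=>'&#8722;', '&circ;'=>'&#710;', '&tilde;'=>'&#732;', '&Scaron;'=>'&#352;', '&lsaquo;'=>'&#8249;', '&OElig;'=>'&#338;', '&lsquo;'=>'&#8216;', '&rsquo;'=>'&#8217;', '&ldquo;'=>'&#8220;', '&rdquo;'=>'&#8221;', '&bull;'=>'&#8226;', '&ndash;'=>'&#8211;', '&mdash;'=>'&#8212;', '&trade;'=>'&#8482;', '&scaron;'=>'&#353;', '&rsaquo;'=>'&#8250;', '&oelig;'=>'&#339;', '&Yuml;'=>'&#376;', '&fnof;'=>'&#402;',
'&Alpha;'=>'&#913;', '&Beta;'=>'&#914;', '&Gamma;'=>'&#915;', '&Delta;'=>'&#916;', '&Epsilon;'=>'&#917;', '&Zeta;'=>'&#918;', '&Eta;'=>'&#919;', '&Theta;'=>'&#920;', '&Iota;'=>'&#921;', '&Kappa;'=>'&#922;', '&Lambda;'=>'&#923;', '&Mu;'=>'&#924;', '&Nu;'=>'&#925;', '&Xi;'=>'&#926;', '&Omicron;'=>'&#927;', '&Pi;'=>'&#928;', '&Rho;'=>'&#929;', '&Sigma;'=>'&#931;', '&Tau;'=>'&#932;', '&Upsilon;'=>'&#933;', '&Phi;'=>'&#934;', '&Chi;'=>'&#935;', '&Psi;'=>'&#936;', '&Omega;'=>'&#937;',
'&alpha;'=>'&#945;', '&beta;'=>'&#946;', '&gamma;'=>'&#947;', '&delta;'=>'&#948;', '&epsilon;'=>'&#949;', '&zeta;'=>'&#950;', '&eta;'=>'&#951;', '&theta;'=>'&#952;', '&iota;'=>'&#953;', '&kappa;'=>'&#954;', '&lambda;'=>'&#955;', '&mu;'=>'&#956;', '&nu;'=>'&#957;', '&xi;'=>'&#958;', '&omicron;'=>'&#959;', '&pi;'=>'&#960;', '&rho;'=>'&#961;', '&sigmaf;'=>'&#962;', '&sigma;'=>'&#963;', '&tau;'=>'&#964;', '&upsilon;'=>'&#965;', '&phi;'=>'&#966;', '&chi;'=>'&#967;', '&psi;'=>'&#968;', '&omega;'=>'&#969;', '&thetasym;'=>'&#977;', '&upsih;'=>'&#978;', '&piv;'=>'&#982;',
'&ensp;'=>'&#8194;', '&emsp;'=>'&#8195;', '&thinsp;'=>'&#8201;', '&zwnj;'=>'&#8204;', '&zwj;'=>'&#8205;', '&lrm;'=>'&#8206;', '&rlm;'=>'&#8207;', '&sbquo;'=>'&#8218;', '&bdquo;'=>'&#8222;', '&dagger;'=>'&#8224;', '&Dagger;'=>'&#8225;', '&hellip;'=>'&#8230;', '&permil;'=>'&#8240;', '&prime;'=>'&#8242;', '&Prime;'=>'&#8243;', '&oline;'=>'&#8254;', '&frasl;'=>'&#8260;', '&euro;'=>'&#8364;', '&image;'=>'&#8465;', '&weierp;'=>'&#8472;', '&real;'=>'&#8476;', '&alefsym;'=>'&#8501;',
'&larr;'=>'&#8592;', '&uarr;'=>'&#8593;', '&rarr;'=>'&#8594;', '&darr;'=>'&#8595;', '&harr;'=>'&#8596;', '&crarr;'=>'&#8629;', '&lArr;'=>'&#8656;', '&uArr;'=>'&#8657;', '&rArr;'=>'&#8658;', '&dArr;'=>'&#8659;', '&hArr;'=>'&#8660;',
'&forall;'=>'&#8704;', '&part;'=>'&#8706;', '&exist;'=>'&#8707;', '&empty;'=>'&#8709;', '&nabla;'=>'&#8711;', '&isin;'=>'&#8712;', '&notin;'=>'&#8713;', '&ni;'=>'&#8715;', '&prod;'=>'&#8719;', '&sum;'=>'&#8721;', '&lowast;'=>'&#8727;', '&radic;'=>'&#8730;', '&prop;'=>'&#8733;', '&infin;'=>'&#8734;', '&ang;'=>'&#8736;', '&and;'=>'&#8743;', '&or;'=>'&#8744;', '&cap;'=>'&#8745;', '&cup;'=>'&#8746;', '&int;'=>'&#8747;', '&there4;'=>'&#8756;', '&sim;'=>'&#8764;', '&cong;'=>'&#8773;', '&asymp;'=>'&#8776;', '&ne;'=>'&#8800;', '&equiv;'=>'&#8801;', '&le;'=>'&#8804;', '&ge;'=>'&#8805;', '&sub;'=>'&#8834;', '&sup;'=>'&#8835;', '&nsub;'=>'&#8836;', '&sube;'=>'&#8838;', '&supe;'=>'&#8839;', '&oplus;'=>'&#8853;', '&otimes;'=>'&#8855;', '&perp;'=>'&#8869;', '&sdot;'=>'&#8901;',
'&lceil;'=>'&#8968;', '&rceil;'=>'&#8969;', '&lfloor;'=>'&#8970;', '&rfloor;'=>'&#8971;', '&lang;'=>'&#9001;', '&rang;'=>'&#9002;', '&loz;'=>'&#9674;', '&spades;'=>'&#9824;', '&clubs;'=>'&#9827;', '&hearts;'=>'&#9829;', '&diams;'=>'&#9830;');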
$entities_xhtml_preserve = array('&quot;'=>'&#34;', '&lt;'=>'&#60;', '&gt;'=>'&#62;', '&amp;'=>'&#38;', '&nbsp;'=>'&#160;');
My next step was to convert all named entities to their numeric equivalents, apart from the ones in $entities_xhtml_preserve above. I want the user of the CMS to be able to enter the named entities &lt; &gt; &amp; &quot; and &nbsp; and have them kept as typed.
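For anyone following along, here is a minimal sketch of how that step could be done. It is not necessarily the code I ended up with: the helper name named_to_numeric() is made up for illustration, and the two arrays are shortened stand-ins for the full ones, assuming keys are named entities and values are decimal numeric references.

```php
<?php
// Shortened stand-ins for the full arrays (assumption: named entity => decimal reference).
$entities_xhtml = array(
    '&quot;'   => '&#34;',
    '&lt;'     => '&#60;',
    '&gt;'     => '&#62;',
    '&amp;'    => '&#38;',
    '&nbsp;'   => '&#160;',
    '&copy;'   => '&#169;',
    '&eacute;' => '&#233;',
    '&euro;'   => '&#8364;',
);
$entities_xhtml_preserve = array(
    '&quot;' => '&#34;',
    '&lt;'   => '&#60;',
    '&gt;'   => '&#62;',
    '&amp;'  => '&#38;',
    '&nbsp;' => '&#160;',
);

// Hypothetical helper: replace every named entity with its numeric reference,
// except the ones listed in $preserve.
function named_to_numeric($text, array $all, array $preserve)
{
    // Remove the preserved names from the replacement map.
    $map = array_diff_key($all, $preserve);
    // strtr() with an array replaces all keys in a single pass,
    // so already-replaced text is never scanned again.
    return strtr($text, $map);
}

$input = '&lt;p&gt;Caf&eacute; &copy; 5&nbsp;&euro;&lt;/p&gt;';
echo named_to_numeric($input, $entities_xhtml, $entities_xhtml_preserve), "\n";
// prints: &lt;p&gt;Caf&#233; &#169; 5&nbsp;&#8364;&lt;/p&gt;
```

Using strtr() rather than chained str_replace() calls avoids one replacement accidentally feeding into the next.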
The final step is to convert all decimal and hex numeric entities to their proper UTF-8 characters for saving into my DB.
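A rough sketch of one way to do this conversion is below. This is not Andrew Simpson's function, just an illustration: the name numeric_entities_to_utf8() is hypothetical, and it assumes PHP 7.2 or later with the mbstring extension so that mb_chr() is available.

```php
<?php
// Hypothetical helper: turn decimal (&#233;) and hex (&#xE9;) numeric
// references into UTF-8 characters. Named entities are left untouched.
function numeric_entities_to_utf8($text)
{
    return preg_replace_callback(
        // Group 1 captures hex digits, group 2 captures decimal digits.
        '/&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));/',
        function ($m) {
            $code = ($m[1] !== '') ? hexdec($m[1]) : (int) $m[2];
            $char = mb_chr($code, 'UTF-8');
            // Leave the reference as-is if it is not a valid code point.
            return ($char === false) ? $m[0] : $char;
        },
        $text
    );
}

echo numeric_entities_to_utf8('Caf&#233; &#xA9; &#8364; &lt;'), "\n";
// prints: Café © € &lt;
```

Note that this also decodes references such as &#60; into a literal <, so whether to run it before or after any escaping depends on how the stored text is output later.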
My thanks to Andrew Simpson for the function in his comment on the mb_decode_numericentity() man page... saved many hours!