Validation Problems - Sorry! This document can not be checked

By Mike Gifford on August 31, 2008

We use validators for our themes in order to produce standards-compliant websites (HTML, CSS, speed and accessibility). Earlier this week, though, we ran into a problem with a theme we inherited from someone else. Not all of the bytes in the file were valid UTF-8, the now-standard encoding for international character sets.

I believe that someone had just cut and pasted the header values in from an old site and included them directly in the page.tpl.php file. This isn't unusual when upgrading a site, but it certainly demonstrates the importance of validation. As a result, the W3C HTML validator produced this error:

Sorry, I am unable to validate this document because on line 22
it contained one or more bytes that I cannot interpret as utf-8
(in other words, the bytes found are not valid values in the
specified Character Encoding). Please check both the content of
the file and the character encoding indication.

The error was: utf8 "\xE9" does not map to Unicode
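The validator's check can be roughly reproduced locally by trying to decode each line of the file as UTF-8. This is only a sketch in Python, not how the W3C validator is actually implemented, and the function name find_invalid_utf8_lines is made up for illustration:

```python
def find_invalid_utf8_lines(data):
    """Return (line_number, column, byte_value) for each line that is not valid UTF-8.

    Only the first offending byte on each line is reported.
    """
    problems = []
    # Splitting on b"\n" is safe: 0x0A never occurs inside a multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the zero-based index of the first undecodable byte.
            problems.append((number, err.start + 1, line[err.start]))
    return problems
```

Calling it as find_invalid_utf8_lines(open("page.tpl.php", "rb").read()) should point at the same line the validator complained about, assuming the template is served more or less as it sits on disk.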

When I looked at this file in a text editor, line 22 looked just fine (even in vim).
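One likely explanation, though I can't confirm it for this particular file, is that the editor quietly fell back to a Latin-1 style encoding when UTF-8 decoding failed. The short Python sketch below shows why that would hide the problem: the byte 0xE9 from the error message is an ordinary "é" in ISO-8859-1, but on its own it is not valid UTF-8.

```python
raw = b"\xe9"  # the single byte named in the validator's error

# Read as ISO-8859-1 (Latin-1), the byte is a perfectly ordinary "é".
as_latin1 = raw.decode("iso-8859-1")

# Read as UTF-8, 0xE9 is the lead byte of a three-byte sequence,
# so a lone 0xE9 cannot be decoded.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same character encoded as UTF-8 takes two bytes: 0xC3 0xA9.
as_utf8_bytes = as_latin1.encode("utf-8")
```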

However, forcing the file through the GNU/Linux iconv program fixed it:

iconv -f ISO8859-1 -t UTF-8 -c page.tpl.php > page.tpl.php.new

Once page.tpl.php.new was renamed over the original page.tpl.php, the W3C validator could check the page. Loading the whole file in a decent text editor and saving it as UTF-8 would also work.
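For anyone without iconv to hand, the same conversion can be sketched in a few lines of Python. This assumes, as the iconv command does, that the whole file really is ISO-8859-1. The helper names latin1_to_utf8 and convert_file are hypothetical.

```python
from pathlib import Path


def latin1_to_utf8(data):
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    # Every byte value is defined in ISO-8859-1, so this decode cannot fail.
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(source, target):
    """Write a UTF-8 copy of an ISO-8859-1 file, leaving the source untouched."""
    Path(target).write_bytes(latin1_to_utf8(Path(source).read_bytes()))
```

convert_file("page.tpl.php", "page.tpl.php.new") mirrors the command above. One caveat applies to both approaches: running the conversion on a file that is already UTF-8 will double-encode any accented characters, so it is worth checking the file first.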

About The Author

Mike Gifford is the founder of OpenConcept Consulting Inc, which he started in 1999. Since then, he has been particularly active in developing and extending open source content management systems to allow people to get closer to their content. Before starting OpenConcept, Mike had worked for a number of national NGOs including Oxfam Canada and Friends of the Earth.