This patch matters for wikis that use a local, non-UTF-8 encoding and allow non-English page names through $NonEnglish. It fixes accessibility problems in the links generated for those pages.
UseModWiki generates nonstandard links to pages whose names contain non-ASCII characters. The problem only affects wikis that have the $NonEnglish config variable set to true.
The cause is that subs like ScriptLink write non-ASCII characters directly into the href attribute of anchors, which produces a nonstandard URI. Most browsers and other user agents interpret these characters according to the character encoding of the HTML page and produce %-encoded URIs from the resulting bytes. Exceptions include certain Internet Explorer versions and Microsoft's web crawler (msnbot.msn.com). These produce UTF-8 encoded URIs instead, which the wiki does not recognize, and the request ends in an "Invalid Page" error.
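To make the mismatch concrete, here is a minimal sketch of how the same page name turns into two different %-encoded URIs depending on the encoding a user agent assumes. The page name "Café", the ISO-8859-1 wiki encoding, and the helper name percent_encode_bytes are assumptions chosen for illustration; they do not come from wiki.pl.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode control bytes and every byte >= 0x7F,
# mirroring the character class used in the patch.
sub percent_encode_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([\x00-\x1f\x7f-\xff])/sprintf("%%%02X", ord($1))/ge;
    return $bytes;
}

# Assumed page name "Café", written as a Perl character string.
my $title = "Caf\x{e9}";

# What a user agent sends when it follows the page's own (local) encoding.
my $latin1 = percent_encode_bytes(encode("ISO-8859-1", $title));

# What a user agent sends when it always converts to UTF-8 first.
my $utf8 = percent_encode_bytes(encode("UTF-8", $title));

print "ISO-8859-1: $latin1\n";   # Caf%E9
print "UTF-8:      $utf8\n";     # Caf%C3%A9
```

A wiki that stores its pages under ISO-8859-1 names would find the page for the first URI but not for the second, which matches the "Invalid Page" behavior described above.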
The end result is that some users are denied access to the non-English pages, and some crawlers are unable to index the wiki properly. This is a serious accessibility issue.
The following patch implements proper URI encoding for page titles in page and script links. No URI encoding is done for bracket links and the like. The patch also fixes WikiBugs/AmpersandBug.
--- wiki.pl 2003-09-11 14:21:02.000000000 +0200
+++ ProperUriEscaping.pl 2005-11-18 13:54:45.000000000 +0100
@@ -1093,12 +1093,14 @@
 sub ScriptLink {
   my ($action, $text) = @_;
 
+  $action = &UriEscape($action);
   return "<a href=\"$ScriptName" . &ScriptLinkChar() . "$action\">$text</a>";
 }
 
 sub ScriptLinkClass {
   my ($action, $text, $class) = @_;
 
+  $action = &UriEscape($action);
   return "<a href=\"$ScriptName" . &ScriptLinkChar() . "$action\""
          . ' class=' . $class . ">$text</a>";
 }
@@ -1242,6 +1244,7 @@
 sub ScriptLinkTitle {
   my ($action, $text, $title) = @_;
 
+  $action = &UriEscape($action);
   if ($FreeLinks) {
     $action =~ s/ /_/g;
   }
@@ -1406,7 +1409,7 @@
     $result .= &GetPageLinkText($id, T('View current revision'));
   }
   if ($UseMetaWiki) {
-    $result .= ' | <a href="http://sunir.org/apps/meta.pl?' . $id . '">'
+    $result .= ' | <a href="http://sunir.org/apps/meta.pl?' . &UriEscape($id) . '">'
                . T('Search MetaWiki') . '</a>';
   }
   if ($Section{'revision'} > 0) {
@@ -1513,7 +1516,7 @@
 
   # Normally get URL from script, but allow override.
   $FullUrl = $q->url(-full=>1) if ($FullUrl eq "");
-  $url = $FullUrl . &ScriptLinkChar() . $newid;
+  $url = $FullUrl . &ScriptLinkChar() . &UriEscape($newid);
   $nameLink = "<a href=\"$url\">$name</a>";
   if ($RedirType < 3) {
     if ($RedirType == 1) { # Use CGI.pm
@@ -1781,7 +1784,14 @@
   }
   return $text;
 }
-
+
+sub UriEscape {
+  my ($uri) = @_;
+  $uri =~ s/([\x00-\x1f\x7f-\xff])/sprintf("%%%02X", ord($1))/ge;
+  $uri =~ s/\&/\&amp;/g;
+  return $uri;
+}
+
 sub QuoteHtml {
   my ($html) = @_;
 
@@ -4070,7 +4080,7 @@
       $address =~ s/\n//g;
       close(EMAIL);
       my $home_url = $q->url();
-      my $page_url = $home_url . &ScriptLinkChar() . "?$id";
+      my $page_url = $home_url . &ScriptLinkChar() . &UriEscape($id);
       my $editors_summary = $q->param("summary");
       if (($editors_summary eq "*") or ($editors_summary eq "")){
         $editors_summary = "";
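For experimenting with the escaping outside wiki.pl, the UriEscape sub can be copied into a standalone script. The sketch below assumes that the second substitution is meant to turn a bare ampersand into the HTML entity, as the AmpersandBug fix suggests; the sample inputs are hypothetical and assume a wiki that stores page names in ISO-8859-1.

```perl
use strict;
use warnings;

# Standalone copy of the escaping logic, for testing outside wiki.pl.
sub UriEscape {
    my ($uri) = @_;
    # Percent-encode control characters and every byte >= 0x7F.
    $uri =~ s/([\x00-\x1f\x7f-\xff])/sprintf("%%%02X", ord($1))/ge;
    # Assumption: ampersands become the HTML entity, since the result
    # is placed inside an href attribute.
    $uri =~ s/&/&amp;/g;
    return $uri;
}

# Hypothetical inputs: a Latin-1 page name and an action query string.
print UriEscape("Caf\xe9"), "\n";                 # Caf%E9
print UriEscape("action=edit&id=Caf\xe9"), "\n";  # action=edit&amp;id=Caf%E9
```

The helper combines two steps, percent-encoding of bytes and HTML escaping of the ampersand, so its output is intended for use as an HTML attribute value. ASCII letters, digits, and punctuation pass through unchanged, which is why existing English page links are not affected.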