UseMod Wiki: WikiPatches/MarkupWithinParagraphs

This patch allows the <b>, <i>, <strong>, <em> and <tt> tags to apply across line breaks, but only within the same paragraph (i.e. up to a blank line).

Note: This has nothing to do with the UseMod 2.0 parser alluded to by SunirShah - see RumorAboutUseModTwo. This is my own attempt to implement paragraph-level tags - basically because I'm impatient and can't wait for 2.0 ;-)

Background: Markup characters are traditionally limited to a single line on most Wikis because this limits the damage caused by a "runaway" opening tag (i.e. where the closing tag gets forgotten).

However, as suggested by SunirShah, if the concept of a paragraph is introduced, then the markup could be made more flexible whilst still preventing runaway tags from being too much hassle.

As an example of the difference this patch makes, consider the following markup:

<b>This is bold.</b> This is not.

<b>This is
bold</b>.  This is not.

<b>This is

not bold</b>

This produces the following:

This is bold. This is not.

This is bold. This is not.

This is

not bold

On the unpatched code, only the first of the three examples will be formatted. After patching, the first two examples will be formatted; only the third, where a blank line separates the opening and closing tags, will not.
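
The behaviour described above can be tried outside the wiki with a short script. This is only a sketch: format_paragraph_tags is a hypothetical helper name, not a function in wiki.pl, and it assumes the angle brackets have already been escaped to &lt; and &gt;, as the patterns in the diff expect. It covers only the $HtmlTags = 0 branch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of wiki.pl: applies the patched
# substitution to text whose angle brackets are already escaped.
sub format_paragraph_tags {
    my ($text) = @_;
    foreach my $t (qw/b i strong em tt/) {
        $text =~ s/
            \&lt\;$t\&gt\;                        # opening tag
            (?>(.*?)((\n\n)|(\&lt\;\/$t\&gt\;)))  # up to closing tag or blank line
            (?<!\n\n)                             # reject if we stopped at a blank line
        /<$t>$1<\/$t>/gisx;
    }
    return $text;
}

# The three examples from this page, in escaped form.
my @examples = (
    "&lt;b&gt;This is bold.&lt;/b&gt; This is not.",
    "&lt;b&gt;This is\nbold&lt;/b&gt;.  This is not.",
    "&lt;b&gt;This is\n\nnot bold&lt;/b&gt;",
);
print format_paragraph_tags($_), "\n---\n" for @examples;
```

The first two strings come back with real <b> tags, while the third is returned unchanged because a blank line sits between its opening and closing tags.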

--DavidClaughton.

P.S. I've now upgraded the patch to make it clearer and added support for $HtmlTags = 1.
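
As a rough illustration of the $HtmlTags = 1 case, the sketch below applies the same paragraph-scoped pattern to a tag that carries attributes. The names format_html_pairs and @pairs are hypothetical; @pairs is a short stand-in for the wiki's @HtmlPairs list. One deliberate departure from the patch: the optional attribute group is wrapped as ((?:...)?) so that $1 is always defined and the script stays quiet under "use warnings".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the @HtmlPairs list used by wiki.pl.
my @pairs = qw/b i font/;

# Hypothetical helper, not part of wiki.pl.
sub format_html_pairs {
    my ($text) = @_;
    foreach my $t (@pairs) {
        $text =~ s/
            \&lt\;$t((?:\s[^<>]+?)?)\&gt\;        # opening tag, optional attributes
            (?>(.*?)((\n\n)|(\&lt\;\/$t\&gt\;)))  # up to closing tag or blank line
            (?<!\n\n)                             # reject if we stopped at a blank line
        /<$t$1>$2<\/$t>/gisx;
    }
    return $text;
}

print format_html_pairs("&lt;font color=red&gt;two\nlines&lt;/font&gt;"), "\n";
print format_html_pairs("&lt;b&gt;split\n\nparagraph&lt;/b&gt;"), "\n";
```

The first call should come back with a real <font color=red> pair spanning the line break; the second is left escaped because of the blank line.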

N.B: The diff below is based on the 0.92 release of UseMod. It does work for 1.0, but you might have to apply the changes by hand, as all the line numbers are different.

--- wiki.pl.original  2001-04-22 02:44:10.000000000 +0100
+++ wiki.pl  2003-02-25 10:46:56.000000000 +0000
@@ -1153,6 +1181,7 @@
 sub WikiToHTML {
   my ($pageText) = @_;
 
   %SaveUrl = ();
   %SaveNumUrl = ();
   $SaveUrlIndex = 0;
@@ -1180,22 +1209,31 @@
     # The <pre> tag wraps the stored text with the HTML <pre> tag
     s/\&lt\;pre\&gt\;((.|\n)*?)\&lt\;\/pre\&gt\;/&StorePre($1, "pre")/ige;
     s/\&lt\;code\&gt\;((.|\n)*?)\&lt\;\/code\&gt\;/&StorePre($1, "code")/ige;
+    # Note that these tags are restricted to a single paragraph
+    my ($t);
     if ($HtmlTags) {
-      my ($t);
       foreach $t (@HtmlPairs) {
-        s/\&lt\;$t(\s[^<>]+?)?\&gt\;(.*?)\&lt\;\/$t\&gt\;/<$t$1>$2<\/$t>/gis;
+        s/
+          \&lt\;$t(\s[^<>]+?)?\&gt\;            # Match Opening tag with params.
+          (?>(.*?)((\n\n)|(\&lt\;\/$t\&gt\;)))  # Match upto Closing Tag or end para.
+          (?<!\n\n)                            # Fail if end of para.
+        /<$t$1>$2<\/$t>/gisx;                  # Replacement String.
       }
       foreach $t (@HtmlSingle) {
-        s/\&lt\;$t(\s[^<>]+?)?\&gt\;/<$t$1>/gi;
+        s/
+          \&lt\;$t(\s[^<>]+?)?\&gt\;            # Match tag with params.
+        /<$t$1>/gix;                          # Replacement String.
       }
     } else {
-      # Note that these tags are restricted to a single line
-      s/\&lt\;b\&gt\;(.*?)\&lt\;\/b\&gt\;/<b>$1<\/b>/gi;
-      s/\&lt\;i\&gt\;(.*?)\&lt\;\/i\&gt\;/<i>$1<\/i>/gi;
-      s/\&lt\;strong\&gt\;(.*?)\&lt\;\/strong\&gt\;/<strong>$1<\/strong>/gi;
-      s/\&lt\;em\&gt\;(.*?)\&lt\;\/em\&gt\;/<em>$1<\/em>/gi;
-    }
-    s/\&lt\;tt\&gt\;(.*?)\&lt\;\/tt\&gt\;/<tt>$1<\/tt>/gis;  # <tt> (MeatBall)
+      foreach $t (qw/b i strong em tt/){
+        s/
+          \&lt\;$t\&gt\;                          # Match Opening tag
+          (?>(.*?)((\n\n)|(\&lt\;\/$t\&gt\;)))    # Match upto Closing tag or end para.
+          (?<!\n\n)                              # Fail if end of para.
+        /<$t>$1<\/$t>/gisx;                      # Replacement string.
+      }
+      s/\&lt\;br\&gt\;/<br>/gi; 
+    }
     if ($HtmlLinks) {
       s/\&lt\;A(\s[^<>]+?)\&gt\;(.*?)\&lt\;\/a\&gt\;/&StoreHref($1, $2)/gise;
     }
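
The patch relies on the atomic group (?> ... ) together with the negative lookbehind (?<!\n\n). The sketch below, which is not part of the patch, compares the atomic pattern with an otherwise identical one using an ordinary (?: ... ) group, to show why the atomic form matters. The sub names bold_atomic and bold_backtracking are hypothetical, and only the b tag is handled to keep it short.

```perl
use strict;
use warnings;

# Pattern as in the patch: once the group has stopped at a blank line,
# the engine may not go back inside it and try a longer match.
sub bold_atomic {
    my ($text) = @_;
    $text =~ s/\&lt\;b\&gt\;(?>(.*?)((\n\n)|(\&lt\;\/b\&gt\;)))(?<!\n\n)/<b>$1<\/b>/gis;
    return $text;
}

# Same pattern with a plain non-capturing group: when the lookbehind
# fails, the engine backtracks and lets (.*?) run past the blank line.
sub bold_backtracking {
    my ($text) = @_;
    $text =~ s/\&lt\;b\&gt\;(?:(.*?)((\n\n)|(\&lt\;\/b\&gt\;)))(?<!\n\n)/<b>$1<\/b>/gis;
    return $text;
}

my $runaway = "&lt;b&gt;This is\n\nnot bold&lt;/b&gt;";
print "atomic:       ", bold_atomic($runaway), "\n";
print "backtracking: ", bold_backtracking($runaway), "\n";
```

With the atomic group the text is left alone; with the ordinary group the bold markup spills across the blank line, which is exactly the runaway behaviour the patch is meant to prevent.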

Comments

Is there an extra "}"?

 +      foreach $t (qw/b i strong em tt/){
 +        s/
 +          \&lt\;$t\&gt\;                          # Match Opening tag
 +          (?>(.*?)((\n\n)|(\&lt\;\/$t\&gt\;)))    # Match upto Closing tag or end para.
 +          (?<!\n\n)                              # Fail if end of para.
 +        /<$t>$1<\/$t>/gisx;                      # Replacement string.
 +      } # maybe extra?
 +      s/\&lt\;br\&gt\;/<br>/gi; 
 +    }

No, in that hunk there is also an extra "}" in the lines removed. Therefore it looks OK to me. -- MarkusLude

Ok, Thanks. -- JuanmaMP