Let's start from here:
string Fields = "{url};{displaytext}";
string InputSyntax = "[url={url}]{displaytext}[/url]";
string HtmlSyntax = "<a href=\"{url}\">{displaytext}</a>";
string input = "..."; // some input text
string output = BBCode.ConvertToHtml(input, InputSyntax, HtmlSyntax, Fields);
After BBCode.Net receives the InputSyntax (the BBCode), it starts studying the syntax structure.
It loops through all the provided fields and replaces each one with a temporary string.
From this,
[url={url}]{displaytext}[/url]
we get:
[url=^`````````````^]^`````````````^[/url]
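The field substitution above can be sketched as follows. This is a minimal illustration, not BBCode.Net's actual source: the class name `PlaceholderSketch` and the method `ReplaceFields` are hypothetical, and the placeholder is simply the temporary string shown in this article.

```csharp
using System;

public static class PlaceholderSketch
{
    // Temporary string as shown in the article (assumed to never occur in real input).
    public const string TempValue = "^`````````````^";

    // Replaces every field name (e.g. "{url}") in the syntax with the temporary string.
    // "fields" is the semicolon-separated list, e.g. "{url};{displaytext}".
    public static string ReplaceFields(string inputSyntax, string fields)
    {
        string result = inputSyntax;
        string[] names = fields.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string name in names)
        {
            result = result.Replace(name, TempValue);
        }
        return result;
    }
}
```

Calling `PlaceholderSketch.ReplaceFields("[url={url}]{displaytext}[/url]", "{url};{displaytext}")` should give the syntax with both fields swapped for the temporary string.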
Next, replace this
^`````````````^
with Regex syntax to construct a Regex search pattern that matches this search condition:
Look for any block of text that matches this pattern:
[url= {any characters} ] {any characters} [/url]
Start the Regex syntax replacement process:
// Escape the characters that have a special meaning in Regex
string tempInputSyntax = oriInputSyntax.Replace("\\", "\\\\")
    .Replace(".", "\\.")
    .Replace("{", "\\{")
    .Replace("}", "\\}")
    .Replace("[", "\\[")
    .Replace("]", "\\]")
    .Replace("+", "\\+")
    .Replace("$", "\\$")
    .Replace(" ", "\\s")
    // Translate the wildcard characters into their Regex equivalents
    .Replace("#", "[0-9]")
    .Replace("?", ".")
    .Replace("*", "\\w*")
    .Replace("%", ".*");
// Each field placeholder becomes "one or more of any character" (lazy)
string _regexValue = ".+?";
string RegexPattern = tempInputSyntax.Replace(_tempValueStr, _regexValue);
return RegexPattern;
This is the Regex search pattern for this BBCode:
"\\[url=.+?\\].+?\\[/url\\]"
The Regex symbol "." (dot) matches any character except, by default, a line break (dynamic pattern). The Regex symbol "+?" repeats the previous item once or more, but as few times as possible, so the match stops at the next fixed pattern. Brackets "[" and "]" normally define a character range. In this case, however, we are not using them as a function; they appear as fixed characters of a non-field block, so we escape them with "\", which gives "\\[" and "\\]". Why the double backslash "\\"? It is the C# escape sequence for a single backslash. If you write a single backslash, as in "\[", C# will tell you that it is not a recognized escape sequence.
However, if you enter the Regex pattern in a UI rather than in code-behind, the double backslash is not needed:
\[url=.+?\].+?\[/url\]
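As a small illustration of the two notations, the sketch below writes the same pattern once as a regular C# string literal and once as a verbatim literal (prefixed with `@`), where the backslashes are typed only once. The class and member names (`PatternLiteralSketch`, `SamePattern`, `CountMatches`) are hypothetical and exist only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternLiteralSketch
{
    // Regular string literal: every backslash is doubled for the C# compiler.
    public const string Escaped = "\\[url=.+?\\].+?\\[/url\\]";

    // Verbatim string literal: backslashes are written once, as they would be typed in a UI.
    public const string Verbatim = @"\[url=.+?\].+?\[/url\]";

    // Both literals produce the same runtime string.
    public static bool SamePattern()
    {
        return Escaped == Verbatim;
    }

    // Counts how many BBCode blocks the pattern finds in the given text.
    public static int CountMatches(string text)
    {
        return Regex.Matches(text, Verbatim).Count;
    }
}
```

Either literal can be passed to `Regex.Matches`; the verbatim form is usually easier to read when a pattern contains many escaped characters.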
Read more: Regular Expression Basic Syntax
Notice that if the value of the first field contains the closing bracket symbol "]", it will break the syntax (syntax error). There are 2 results of this:
<a href="http://www[codeproject]com">The [odeProje[t</a>
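To see how a "]" inside the first field shifts the field boundaries, the sketch below adds two capture groups to the same search pattern. The capture groups and the names `BracketSketch` and `SplitFields` are assumptions made for illustration only; BBCode.Net itself extracts the values with the string operations described later in this article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketSketch
{
    // The article's search pattern with two capture groups added (illustration only).
    private const string Pattern = @"\[url=(.+?)\](.+?)\[/url\]";

    // Returns { first field, second field } for a matching block, or an empty array.
    public static string[] SplitFields(string block)
    {
        Match m = Regex.Match(block, Pattern);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

For the block `[url=http://example.com/a]b]Link[/url]`, the lazy `.+?` stops at the first "]", so the first field comes back as `http://example.com/a` and the remainder, `b]Link`, ends up in the second field.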
Example of text:
Lots of programming tips can be obtained in search engines, example of search engines: [url=http://www.google.com]Google[/url], [url=http://www.yahoo.com]Yahoo[/url], [url=http://www.bing.com]Bing[/url], etc... Ebooks are available too.
Searching the text for InputSyntax with Regex:
using System.Text.RegularExpressions;
MatchCollection mc = Regex.Matches(text, RegexPattern);
Response.Write(mc.Count.ToString()); // Result: 3
foreach (Match m in mc)
{
    string customInsertPart = m.Value;
    Response.Write(customInsertPart);
}
3 blocks are identified and extracted:
[url=http://www.google.com]Google[/url]
[url=http://www.yahoo.com]Yahoo[/url]
[url=http://www.bing.com]Bing[/url]
The following values were retrieved from the previous steps:
string customInsertPart = "[url=http://www.google.com]Google[/url]";
string oriInputSyntax = "[url=^`````````````^]^`````````````^[/url]";
string _tempValueStr = "^`````````````^";
string[] nonFieldArray = oriInputSyntax.Split(new string[] { _tempValueStr }, StringSplitOptions.RemoveEmptyEntries);
return nonFieldArray;
The structure of the non-field blocks is identified in nonFieldArray:
Block   Text     Length
-----   ------   ------
0       [url=    5
1       ]        1
2       [/url]   6
Get Field Index:
var _idxFields = new Dictionary<int, string>();
for (int i = 0; i < nonFieldArray.Length; i++)
{
    // Remove non Field Block
    inputSyntax = inputSyntax.Substring(nonFieldArray[i].Length, inputSyntax.Length - nonFieldArray[i].Length);
    // Get Field index
    foreach (string s in _fields)
    {
        if (inputSyntax.Length < s.Length)
            break;
        // Calculate the Field's Length
        string b = inputSyntax.Substring(0, s.Length);
        // Check, if the current field's name
        // If match
        if (b == s)
        {
            // Add the field and index into dictionary
            _idxFields[i] = b;
            // Remove field from inputSyntax
            inputSyntax = inputSyntax.Substring(s.Length, inputSyntax.Length - s.Length);
            break;
        }
    }
}
return _idxFields;
The structure of the InputSyntax is studied.
Result:
_idxFields
Count = 2
[0]: {[0, {url}]}
[1]: {[1, {displaytext}]}
Get values:
var _idxValues = new Dictionary<int, string>();
for (int i = 0; i < nonFieldArray.Length; i++)
{
    // Remove non field block
    customPart = customPart.Substring(nonFieldArray[i].Length, customPart.Length - nonFieldArray[i].Length);
    // Current non-field block is the last block
    // Terminate the loop.
    // No more value block should exist after last block
    if (i + 1 >= nonFieldArray.Length)
        break;
    // Detect next non-field block and calculate value length
    int v = customPart.IndexOf(nonFieldArray[i+1]);
    // Get the index and value into dictionary
    _idxValues[i] = customPart.Substring(0, v);
    // Remove the added value from input text
    customPart = customPart.Substring(v, customPart.Length - v);
}
The values are obtained and stored inside _idxValues.
_idxValues
Count = 2
[0]: {[0, http://www.google.com]}
[1]: {[1, Google]}
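A self-contained version of the value extraction can be sketched as below, assuming the block has already matched the search pattern. `ValueExtractionSketch` and `GetValues` are hypothetical names, and the check for a missing non-field block is an addition that the original loop does not have.

```csharp
using System;
using System.Collections.Generic;

public static class ValueExtractionSketch
{
    // Walks through the matched block, alternately skipping a non-field block
    // and collecting the value that sits before the next non-field block.
    public static Dictionary<int, string> GetValues(string block, string[] nonFieldBlocks)
    {
        var values = new Dictionary<int, string>();
        string rest = block;
        for (int i = 0; i < nonFieldBlocks.Length; i++)
        {
            // Drop the current non-field block from the front.
            rest = rest.Substring(nonFieldBlocks[i].Length);

            // Nothing follows the last non-field block.
            if (i + 1 >= nonFieldBlocks.Length)
                break;

            // The value runs up to the next non-field block.
            int end = rest.IndexOf(nonFieldBlocks[i + 1], StringComparison.Ordinal);
            if (end < 0)
                break; // added safeguard: stop on a malformed block instead of throwing

            values[i] = rest.Substring(0, end);
            rest = rest.Substring(end);
        }
        return values;
    }
}
```

With the block `[url=http://www.google.com]Google[/url]` and the non-field blocks `[url=`, `]` and `[/url]`, this yields index 0 for `http://www.google.com` and index 1 for `Google`, matching the result listed above.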
// Loop through all values
foreach (KeyValuePair<int, string> kv in _idxValues)
{
    bool potentialScriptExists = false;
    // Find out whether the value contains "<" or its HTML entity
    if (kv.Value.Contains("<") || kv.Value.Contains("&lt;"))
    {
        _idxValues[kv.Key] = "";
        potentialScriptExists = true;
    }
    if (potentialScriptExists)
    {
        StringBuilder sb = new StringBuilder();
        // Recombine the non-Fields with original values
        for (int n = 0; n < nonFieldArray.Length; n++)
        {
            sb.Append(nonFieldArray[n]);
            if (_idxFields.ContainsKey(n))
                sb.Append(_idxValues[n]);
        }
        // Return the filtered value, the Html Conversion is skipped.
        return sb.ToString();
    }
}
Fill all the extracted values into the HtmlSyntax's fields:
foreach (KeyValuePair<int, string> kv in _idxFields)
{
    HtmlSyntax = HtmlSyntax.Replace(kv.Value, _idxValues[kv.Key].Replace("<", "&lt;"));
}
Final step: replace the InputSyntax in the text with the HtmlSyntax:
text = text.Replace(InputSyntax, HtmlSyntax);
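The last two steps can be put together in a short sketch. It rests on two assumptions that are not confirmed by the listings: that "<" is meant to be replaced with the HTML entity `&lt;`, and that the string replaced in the text is the matched block (customInsertPart in the earlier steps). The names `FillSketch`, `Fill` and `ReplaceBlock` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FillSketch
{
    // Puts each extracted value into the HTML template at the position of its field name.
    // Assumption: "<" is encoded as "&lt;" so a value cannot open an HTML tag.
    public static string Fill(string htmlSyntax,
                              Dictionary<int, string> fields,
                              Dictionary<int, string> values)
    {
        string html = htmlSyntax;
        foreach (KeyValuePair<int, string> kv in fields)
        {
            html = html.Replace(kv.Value, values[kv.Key].Replace("<", "&lt;"));
        }
        return html;
    }

    // Assumption: the matched BBCode block is what gets swapped for the generated HTML.
    public static string ReplaceBlock(string text, string matchedBlock, string html)
    {
        return text.Replace(matchedBlock, html);
    }
}
```

With the fields `{url}` and `{displaytext}` and the values `http://www.google.com` and `Google`, `Fill` turns the HtmlSyntax from the beginning of this article into a complete anchor tag, which `ReplaceBlock` then puts in place of the original BBCode block.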