A VBScript program that parses the HTML of a TechNet Wiki article for problems, many of which affect the Table of Contents feature.
When you publish or edit an article in the TechNet Wiki, you can include the [toc] tag in the body of the article to have a Table of Contents automatically created. The Table of Contents is created from the headings in the article.
When a TechNet Wiki article with the [toc] tag is saved, a script analyzes all of the headings. Headings are enclosed by <h1> and </h1> tags in the HTML, where the digit "1" can be any heading level (standard HTML defines levels 1 through 6). For example, a heading line in the HTML might look like this before the article is saved:
<h3>This is an Example</h3>
If a heading does not have an anchor tag (as in the example above), one is added when the article is saved. The name in the anchor tag is the text of the heading, with spaces and special characters replaced by underscores. For example, the heading above is modified as follows when the Wiki article is saved:
<h3><a name="This_is_an_Example"></a>This is an Example</h3>
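The naming rule described above can be sketched in a few lines of Python. This is only an illustration, not the Wiki's actual save script: the function names anchor_name and add_anchor are hypothetical, and the sketch assumes that a "special character" is anything other than an ASCII letter or digit.

```python
import re


def anchor_name(heading_text: str) -> str:
    """Build an anchor name from heading text.

    Assumption: every character that is not an ASCII letter or digit
    (spaces included) is replaced by an underscore.
    """
    return re.sub(r"[^A-Za-z0-9]", "_", heading_text)


def add_anchor(heading_html: str) -> str:
    """Add an <a name="..."></a> tag to a heading that has none."""
    match = re.fullmatch(r"(<h\d>)(.*)(</h\d>)", heading_html,
                         flags=re.IGNORECASE | re.DOTALL)
    # Leave the line alone if it is not a simple heading,
    # or if it already carries an anchor name tag.
    if match is None or "<a name=" in match.group(2).lower():
        return heading_html
    open_tag, text, close_tag = match.groups()
    return f'{open_tag}<a name="{anchor_name(text)}"></a>{text}{close_tag}'


print(add_anchor("<h3>This is an Example</h3>"))
# prints: <h3><a name="This_is_an_Example"></a>This is an Example</h3>
```

Under the same assumption, the heading text "Windows Server 2008 R2" yields the name "Windows_Server_2008_R2".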
Each anchor consists of an <a name> tag followed by a closing </a> tag. When a Wiki article with the [toc] tag is displayed in a browser, a script creates the Table of Contents from all headings in the article. When you click an entry in the Table of Contents, the browser jumps to the corresponding heading line by referencing its anchor name. For this reason the anchor names must be unique within the article. If two anchors have the same name, the Table of Contents will only jump to the first instance of the name.
Several problems in the HTML of TechNet Wiki articles can make the TOC (Table of Contents) feature not work properly. The VBScript program linked on this page is designed to parse the HTML to find these problems. The script finds the following problems:
<h3><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a>Windows Server 2008 R2</h3>
In the heading above, the anchor name contains the "0" character. The article was saved 4 times, and each time the existing anchor tag was not recognized, so a new one was created from the text of the heading. This heading line will not appear in the Table of Contents, because the anchor name is still not recognized by the script that creates the TOC. The fix is to remove the extra anchor name tags (so there is only one) and replace the "0" character with something else, such as the string "Oh", in the remaining anchor tag. For example, the heading above could be fixed as follows:
<h3><a name="Windows_Server_TwentyOhEight_R2"></a>Windows Server 2008 R2</h3>
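As a rough illustration of this check, the Python sketch below counts the anchor name tags in a single heading and flags any name that contains the "0" character. It is not taken from the VBScript program. The function name check_heading and the wording of the messages are hypothetical, and the regular expression assumes anchors are written as <a name="..."></a> with double quotes.

```python
import re

# Assumption: anchors look like <a name="..."></a> with double-quoted names.
ANCHOR = re.compile(r'<a\s+name="([^"]*)"\s*>\s*</a>', re.IGNORECASE)


def check_heading(heading_html: str) -> list[str]:
    """Return a list of problem descriptions for one heading line."""
    names = ANCHOR.findall(heading_html)
    problems = []
    if len(names) > 1:
        problems.append(f"{len(names)} anchor name tags in one heading")
    if any("0" in name for name in names):
        problems.append('embedded "0" in name tag')
    return problems


broken = ('<h3><a name="Windows_Server_2008_R2"></a>'
          '<a name="Windows_Server_2008_R2"></a>Windows Server 2008 R2</h3>')
fixed = '<h3><a name="Windows_Server_TwentyOhEight_R2"></a>Windows Server 2008 R2</h3>'

print(check_heading(broken))  # reports both problems
print(check_heading(fixed))   # prints: []
```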
In addition, the VBScript program finds all duplicate anchor name tags in the TechNet Wiki article, whether in headings or not.
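A duplicate search of this kind might look like the following Python sketch. It is a sketch under stated assumptions, not the program's own logic: the function name duplicate_anchor_names is hypothetical, names are compared case-sensitively, and only double-quoted name attributes are matched.

```python
import re
from collections import Counter


def duplicate_anchor_names(html: str) -> list[str]:
    """Return every anchor name that occurs more than once in the HTML.

    Assumptions: names are double-quoted and compared case-sensitively.
    Names are returned in the order they first appear.
    """
    names = re.findall(r'<a\s+name="([^"]*)"', html, flags=re.IGNORECASE)
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


sample = """
<h1><a name="see_also"></a>see also</h1>
<p><a name="footnote_1"></a>Some text</p>
<h3><a name="see_also"></a></h3>
"""
print(duplicate_anchor_names(sample))  # prints: ['see_also']
```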
Before running the script, open the TechNet Wiki article and click the "Edit" tab. In the Wiki editor, select "HTML" near the bottom (the default selection is "Design"). Copy the entire HTML of the article into a text file and save it. Then run the script, and either specify the file containing the HTML as a parameter or let the script prompt you for the file.
The output of the script will display the headings where problems were found. However, the script does not fix the problems. You will need to edit the Wiki article, select HTML in the editor, find the problem lines, and fix them yourself. You can search on the string "<h" (without the quotes) to find all heading lines in the HTML.
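If you would rather locate the heading lines programmatically in the saved text file, a short Python sketch like the one below can list them with their line numbers. The function name heading_lines is hypothetical. Unlike a plain search for "<h", it matches "<h" followed by a digit, so tags such as <hr> are skipped.

```python
import re

# "<h" followed by a digit, in either case, so <hr>, <head> and <html> are ignored.
HEADING_START = re.compile(r"<h\d", re.IGNORECASE)


def heading_lines(html: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line that opens a heading."""
    return [
        (number, line.strip())
        for number, line in enumerate(html.splitlines(), start=1)
        if HEADING_START.search(line)
    ]


page = '<p>intro</p>\n<h3>One</h3>\n<hr>\n  <H2><a name="Two"></a>Two</H2>\n'
for number, line in heading_lines(page):
    print(number, line)
```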
The script output indicates the problem and the heading. If headings have duplicate anchor name tags, the script lists the duplicates twice: once as duplicate headings, then again as duplicate anchor tags. If a heading has embedded zero characters, only the first anchor name tag is displayed, which avoids the very long line that results when the anchor tag is repeated many times. When decimal RGB color values are found, the script also outputs the closest standard color name in square brackets. Typical output looks like this:
----- Analyze Headings
## Embedded "0" in name tag: <h3><a name="Server 2008"></a>Server 2008</a></h3>
## Blank heading: <h4></h4>
## Blank heading: <h3><a name="see_also"></a></h3>
## Duplicate anchor name tag: <h1><a name="see_also"></a>see also</h1>
Number of problems in headings: 4
----- Analyze Anchor Name Tags
## Duplicate anchor name tag: <a name="see_also"></a>
Number of duplicate anchor name tags: 1
----- Search for color values
## Found rgb color: rgb(192, 0, 0) [best standard color: FireBrick]
## Found rgb color: rgb(42, 42, 42) [best standard color: DarkSlateGray]
## Found rgb color: rgb(255, 0, 0) [best standard color: Red]
Total color values found: 3
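The page does not say how the script picks the "best standard color". One plausible approach, shown here purely as an assumption, is to choose the named color with the smallest squared Euclidean distance in RGB space. The Python sketch below uses a small illustrative subset of the standard color names. The names STANDARD_COLORS, find_rgb_colors and closest_color are hypothetical.

```python
import re

# Small illustrative subset of the standard (CSS/X11) color names.
STANDARD_COLORS = {
    "Black": (0, 0, 0),
    "DarkSlateGray": (47, 79, 79),
    "DimGray": (105, 105, 105),
    "DarkRed": (139, 0, 0),
    "FireBrick": (178, 34, 34),
    "Red": (255, 0, 0),
    "White": (255, 255, 255),
}

RGB_VALUE = re.compile(
    r"rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)", re.IGNORECASE)


def find_rgb_colors(html: str) -> list[tuple[int, ...]]:
    """Return every decimal rgb(r, g, b) value found in the HTML."""
    return [tuple(int(v) for v in m.groups()) for m in RGB_VALUE.finditer(html)]


def closest_color(rgb: tuple[int, ...]) -> str:
    """Assumption: 'closest' means smallest squared distance in RGB space."""
    def distance(name: str) -> int:
        return sum((a - b) ** 2 for a, b in zip(rgb, STANDARD_COLORS[name]))
    return min(STANDARD_COLORS, key=distance)


html = '<span style="color: rgb(192, 0, 0)">x</span><p style="color:rgb(42,42,42)">y</p>'
for rgb in find_rgb_colors(html):
    print(rgb, closest_color(rgb))
```

With this subset and distance measure, the three values in the sample output above map to FireBrick, DarkSlateGray, and Red respectively. A fuller color table or a different distance measure could give different matches.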
ParseWiki.txt <<-- Click here to view or download the program