When you publish or edit an article in the TechNet Wiki, you can include the [toc] tag in the body of the article to have a Table of Contents automatically created. The Table of Contents is created from the headings in the article.

When a TechNet Wiki article with the [toc] tag is saved, a script analyzes all of the headings. In the HTML, a heading is enclosed by tags such as <h1> and </h1>, where the digit "1" can be any heading level from 1 to 6. For example, before the article is saved, a heading line in the HTML might look like this:

<h3>This is an Example</h3>

If a heading does not have an anchor tag (as in the example above), one is added when the article is saved. The name in the anchor tag is the text of the heading, with spaces and special characters replaced by underscores. For example, the heading above is modified as follows when the Wiki article is saved:

<h3><a name="This_is_an_Example"></a>This is an Example</h3>
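
The naming rule can be illustrated with a short Python sketch. This is not the Wiki's own script, whose exact rules are not published here. The function names make_anchor_name and add_anchor are hypothetical, and the sketch assumes that every character other than a letter or digit becomes exactly one underscore. It handles only heading levels h1 through h6 and only the replacement rule described above.

```python
import re

def make_anchor_name(heading_text):
    """Approximate the anchor name for a heading.

    Assumption: every character that is not a letter or digit is
    replaced by one underscore. The real Wiki script may differ.
    """
    return re.sub(r"[^A-Za-z0-9]", "_", heading_text.strip())

def add_anchor(heading_html):
    """Add an anchor tag to a simple heading element that lacks one."""
    match = re.fullmatch(r"<(h[1-6])>(.*)</\1>", heading_html.strip(), re.S)
    if match is None or "<a name=" in match.group(2):
        # Not a simple heading, or it already has an anchor: leave it alone.
        return heading_html
    tag, text = match.groups()
    return '<{0}><a name="{1}"></a>{2}</{0}>'.format(
        tag, make_anchor_name(text), text)

print(add_anchor("<h3>This is an Example</h3>"))
# prints: <h3><a name="This_is_an_Example"></a>This is an Example</h3>
```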

Each anchor consists of an <a name="..."> tag followed by a closing </a> tag. When a Wiki article with the [toc] tag is displayed in a browser, a script creates the Table of Contents from all headings in the article. When you click an entry in the Table of Contents, the browser jumps to the corresponding heading line by referencing its anchor name. This makes it important that the anchor names be unique within the article. If two anchors have the same name, the Table of Contents entries for both will jump to the first instance of the name.

Several problems in the HTML of TechNet Wiki articles can prevent the Table of Contents (TOC) feature from working properly. The VBScript program linked on this page parses the HTML to find these problems. The script finds the following problems:

  1. Embedded "0" characters in headings. The character causes no problem in the text of the heading itself. However, the script that runs when the article is saved includes the "0" in the anchor name, and this causes the heading not to appear in the TOC at all. It is hoped that this bug will be fixed in the future, but for now the workaround is to replace every "0" character in the anchor name with something else, such as the string "Oh". In addition, each time the article is saved, the existing anchor tag with the "0" characters is not recognized, so the script creates another one with the same problem. This adds to the "trash" in the HTML and makes it harder to read. For example, HTML similar to the following can be found in a Wiki article when a heading contains the zero character (this is a single line, although it may be word wrapped):

    <h3><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a><a name="Windows_Server_2008_R2"></a>Windows Server 2008 R2</h3>

    This article was saved four times, and each time the anchor tag was not recognized and was re-created from the text of the heading. This heading line will not appear in the Table of Contents because the anchor name is still not recognized by the script that creates the TOC. The fix is to remove the extra anchor tags so that only one remains, and to replace each "0" character in the remaining anchor name with something else, such as the string "Oh". For example, the heading above could be fixed as follows:

    <h3><a name="Windows_Server_TwentyOhEight_R2"></a>Windows Server 2008 R2</h3>

  2. Duplicate anchor name tags in headings. This can cause the Table of Contents feature to link to the wrong location in the article.
  3. Blank headings (no text to be displayed). This can result in blanks in the Table of Contents, and the TOC may link to the blank line.
  4. Leading digits in headings. This is usually not a problem. However, any leading digits in the heading are not included in the anchor name created when the article is saved. In some cases this can make the anchor name a duplicate, if two headings differ only by their leading digits.
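
The four heading checks listed above could be sketched in Python roughly as follows. This is an illustration only, not the VBScript program linked on this page. The function name check_headings, the regular expressions, and the problem labels are all assumptions made for the example. It treats a heading as blank when nothing remains after removing tags, so a heading containing only an entity such as &nbsp; would not be caught.

```python
import re

# Assumed patterns for headings (h1 to h6), anchor name tags, and any tag.
HEADING_RE = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1>", re.I | re.S)
ANCHOR_RE = re.compile(r'<a\s+name="([^"]*)"\s*>\s*</a>', re.I)
TAG_RE = re.compile(r"<[^>]+>")

def check_headings(html):
    """Return a list of (problem, heading_html) tuples for the four checks."""
    problems = []
    seen_names = set()
    for match in HEADING_RE.finditer(html):
        heading = match.group(0)
        inner = match.group(2)
        names = set(ANCHOR_RE.findall(inner))
        text = TAG_RE.sub("", inner).strip()
        if any("0" in name for name in names):
            problems.append(('Embedded "0" in name tag', heading))
        if not text:
            problems.append(("Blank heading", heading))
        elif text[0].isdigit():
            problems.append(("Leading digit in heading", heading))
        # A name already used by an earlier heading is a duplicate.
        if names & seen_names:
            problems.append(("Duplicate anchor name tag", heading))
        seen_names |= names
    return problems

sample = ('<h3><a name="see_also"></a></h3>\n'
          '<h1><a name="see_also"></a>see also</h1>')
for problem, heading in check_headings(sample):
    print("## {}: {}".format(problem, heading))
```

Repeated copies of the same anchor inside one heading are counted once here, so the embedded-zero case is reported as a zero problem rather than as a duplicate.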

In addition, the VBScript program finds all duplicate anchor name tags in the TechNet Wiki article, whether in headings or not.
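
A check for duplicate anchor names anywhere in the article, whether in headings or not, might look like the Python sketch below. It is only an approximation of what the VBScript program does. The function name duplicate_anchor_names and the pattern used to find name attributes are assumptions.

```python
import re
from collections import Counter

# Assumed pattern: an <a ...> tag with a double-quoted name attribute.
ANCHOR_NAME_RE = re.compile(r'<a\s+[^>]*\bname\s*=\s*"([^"]*)"', re.I)

def duplicate_anchor_names(html):
    """Return the anchor names that occur more than once, in first-seen order."""
    counts = Counter(ANCHOR_NAME_RE.findall(html))
    return [name for name, count in counts.items() if count > 1]

sample = ('<h1><a name="see_also"></a>see also</h1>'
          '<p><a name="top"></a>Some text</p>'
          '<h3><a name="see_also"></a></h3>')
print(duplicate_anchor_names(sample))
# prints: ['see_also']
```

Unlike a per-heading check, this sketch also reports an anchor that is repeated several times inside a single heading.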

ParseWiki.txt <<-- Click here to view or download the program

Before running the script, open the TechNet Wiki article and click the "Edit" tab. In the Wiki editor, select "HTML" near the bottom (the default selection is "Design"). Copy the entire HTML of the article into a text file and save it. Then run the script and either specify the file containing the HTML as a parameter or let the script prompt you for the file.

The output of the script will display the headings where problems were found. However, the script does not fix the problems. You will need to edit the Wiki article, select HTML in the editor, find the problem lines, and fix them yourself. You can search on the string "<h" (without the quotes) to find all heading lines in the HTML.
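
As an aid to the manual fix for the embedded zero problem, the following Python sketch keeps only the first anchor in a heading and replaces each "0" in its name. It is a hypothetical helper, not part of the VBScript program. The function name fix_zero_heading and the mechanical use of "Oh" as the replacement are assumptions. Review the result before pasting it back into the Wiki editor, because the new name could collide with another anchor, and any links that pointed to the old name will need to be updated.

```python
import re

# Assumed pattern for an empty anchor name tag such as <a name="x"></a>.
ANCHOR_RE = re.compile(r'<a\s+name="([^"]*)"\s*>\s*</a>', re.I)

def fix_zero_heading(heading_html, replacement="Oh"):
    """Keep a single anchor in the heading and remove "0" from its name."""
    names = ANCHOR_RE.findall(heading_html)
    if not names:
        return heading_html  # nothing to fix
    fixed_anchor = '<a name="{}"></a>'.format(
        names[0].replace("0", replacement))
    # Remove every anchor, then put the fixed one after the opening tag.
    stripped = ANCHOR_RE.sub("", heading_html)
    return re.sub(r"(<h[1-6]\b[^>]*>)",
                  lambda m: m.group(1) + fixed_anchor,
                  stripped, count=1, flags=re.I)

broken = ('<h3><a name="Windows_Server_2008_R2"></a>'
          '<a name="Windows_Server_2008_R2"></a>Windows Server 2008 R2</h3>')
print(fix_zero_heading(broken))
# prints: <h3><a name="Windows_Server_2OhOh8_R2"></a>Windows Server 2008 R2</h3>
```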

The script output indicates the problem and the heading. If headings have duplicate anchor name tags, the script lists the duplicates twice: once as duplicate headings and again as duplicate anchor tags. If a heading has embedded zero characters, only the first anchor name tag is displayed, to avoid the very long line that results when the anchor tag is repeated many times. Typical output looks similar to the following:

----- Analyze Headings
## Embedded "0" in name tag: <h3><a name="Server 2008"></a>Server 2008</a></h3>
## Blank heading: <h4></h4>
## Blank heading: <h3><a name="see_also"></a></h3>
## Duplicate anchor name tag: <h1><a name="see_also"></a>see also</h1>
Number of problems in headings: 4
----- Analyze Anchor Name Tags
## Duplicate anchor name tag: <a name="see_also"></a>
Number of duplicate anchor name tags: 1