Creating an Extraction Rule for VSTS 2008 Web Tests

Extraction rules are essentially for finding data in the HTTP response and placing it in the output context of the web test. There are a few built-in tests, but they mostly focus on the HTML tags themselves and the attributes. In my case I really needed the data between span tags. I think this could probably be done with the existing rules and some regular expressions, but I couldn’t resist the chance to write some code and learn something new.

All you need to do is place the class file in the Test Project and compile it. The rule automatically becomes available to the tests in the project.  Here is my class that finds a span for a given ClientId. It overrides the Execute method of the ExtractionRule base class and attempts to find a span for the given ID. If the span is found, it parses the HTML string to find the content of the span tag.

namespace WebTestDemo

{

    [System.ComponentModel.DisplayName("Span Extractor")]

    public class ExtractSpan : ExtractionRule

    {

        // The name of the desired input field

        private string nameValue;

        public string ClientId

        {

            get { return nameValue; }

            set { nameValue = value; }

        }

 

        public override void Extract(object sender, ExtractionEventArgs e)

        {

            string[] tagTypeFilter = new string[] { "span" };

 

            //Fail the test if nothing is found (this may need to be modified)

            e.Success = false;

 

            if (e.Response.HtmlDocument != null && e.Response.IsHtml)

            {

                try

                {

                    //Find the span tag based on ID. Exception if none found

                    HtmlTag result = e.Response.HtmlDocument.GetFilteredHtmlTags(tagTypeFilter).First(t => string.Equals(t.GetAttributeValueAsString("ID"), this.nameValue, StringComparison.OrdinalIgnoreCase));

 

                    //The span was found (no exception), now find the data

 

                    //Get the location of the ID in the span tag

                    int startPosition = e.Response.BodyString.IndexOf(this.nameValue);

 

                    //Find the position of the data immediately following the closing angle bracket of the span,

                    //  accounting for the > character as well

                    startPosition = e.Response.BodyString.IndexOf(">", startPosition) + 1;

 

                    //Get the position of the closing tag for the span

                    int endPosition = e.Response.BodyString.IndexOf("</span>", startPosition);

 

                    //Fetch the content

                    string content = e.Response.BodyString.Substring(startPosition, endPosition - startPosition);

 

                    //Add the value to the context output. This could just as easily go to a file or a DB

                    // this step is not necessary for the extraction to succeed

                    e.WebTest.Context.Add(this.ContextParameterName, content);

 

                    //Mark the extraction as successful

                    e.Success = true;

                }

                catch (Exception)

                {

                    e.Success = false;

                    e.WebTest.Context.Add(this.ContextParameterName, string.Format("span tag id={0} not found", this.nameValue));

                }

            }

        }

    }

}

Now that I have the class, I need to wire it up to a URL in a web test. Right-click the URL and choose Add Extraction Rule…

AddRuleToTest[1]

I need to set two properties:

  1. Context Parameter Name. This comes from the base ExtractionRule class, and is the name for the data that ends up in the output.
  2. ClientId. This is a custom property from my class. It is the ClientId of the rendered control in the HTML output. The class finds the span with this name and returns the data.

Now when I run the test, if the ClientId I specified was found, it shows up in the Context output after running the test. The Context Parameter Name was “SpanData” in this case:

ExtractionRuleResults[1]

This could be made more robust by not coding specifically for spans. Certainly there could be issues if the tag ID is used more than once or if there is significant nesting of spans within the tag you are trying to find. This code is intended to prove out the concept, it could certainly be made stronger.

One thing I want to mention is that all the code samples I have run across (MSDN included) show the RuleName property as the way to display the extraction rule name in the Visual Studio UI. But under compilation this property comes up as obsolete. I found the answer on Ed Glas’s blog. The obsolete message mentioned using attributes, but this info was not discoverable, so I was quite grateful for that posting to get the syntax correct.

Helpful Links

Must Read VSTS - Testing Related Blogs and Introductory Articles

How to: Create a Custom Extraction Rule

Custom Extraction Rule and Generating a Code Test from VSTS

Posted: Wednesday, August 26, 2009 8:29:00 PM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
.Net 3.5 | Performance | Visual Studio | VSTS

Useful LogParser Queries

If you deal with production servers and you don’t use LogParser, you should. It gives SQL-like abilities to query web server logs an other log types (like Event Logs). Coding Horror has a great article about the tool as well, and a link to a good article about forensic log parsing. Here are my most used web queries:

Find Pages with 500 errors
logparser "SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM C:\ProdLogFiles\ex*.log WHERE (sc-status >= 500) GROUP BY cs-uri-stem, code ORDER BY Hits DESC" -o:CSV >> C:\Data\ErrorPages.csv

Find 404 Requests
logparser "SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM C:\ProdLogFiles\ex*.log WHERE (sc-status = 404) GROUP BY cs-uri-stem, code ORDER BY Hits DESC" -o:CSV >> C:\Data\404Pages.csv

Find the Slowest Pages
logparser "SELECT TOP 100 cs-uri-stem AS Url, MIN(time-taken) as [Min], AVG(time-taken) AS [Avg], max(time-taken) AS [Max], count(time-taken) AS Hits FROM C:\ProdLogFiles\ex*.log GROUP BY Url ORDER BY [Avg] DESC" -o:CSV >> C:\Data\SlowPages.csv

Posted: Thursday, August 06, 2009 7:41:00 PM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
ASP.NET | IIS | Performance | Tools

DataSets and Calculated Columns

Ran into a performance issue in a .Net remoting situation. A Winforms app is calling an application server asking for data. A relatively large DataSet (>10,000 rows, <6 columns) being passed over the wire was causing a performance problem. The database and application servers processed it quickly. Examining the transfer with WireShark showed that the transfer wasn't so bad either. There was a flurry of data passed, and then a bunch of waiting on the client-side, with the client CPU usage around 50% the entire duration of the wait. Turns out there is a calculated column in one of the data tables. The column is not calculated on the application server-side, so as not to pass a bunch of data across the wire that would be unnecessary. The calc happens on the client. That was the source of the slowdown and CPU usage. In the end the solution to the problem was not using the calculated column, we found a different solution to fix the business problem. I suppose you could perform the calculation in the SQL statement that was ultimately filling the DataSet. That might take longer to transfer, but won't slow down the client app.

Posted: Wednesday, June 11, 2008 4:06:26 AM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
.Net 2.0 | Performance | Tools | Winforms

Caching Images in IIS 6.0

Yahoo posted a list of rules for improving the performance of your web site, along with a new FireFox-based tool for diagnosing your site's performance, called YSlow.

Their number one rule is to reduce the number of HTTP requests, and this only makes sense. I'll bet most of us ASP.NET developers are well aware of output caching, and how to do this in code. But what about those static files, like images and scripts? Well, there is an IIS setting for that. It's easy to do, and the payoff can be big if you have a very graphic-intense site. Here's what you do for IIS 6:

  1. Open the IIS Management console
  2. Find the directory containing your images (static content only)
  3. Right click the directory, and choose Properties.
  4. Click the HTTP Headers tab.
  5. Check the Enable Content Expiration check box.
  6. Click the Expire After radio button, and choose an interval.
  7. Click the OK button. Done!

The downside is that you won't get the payoff for the first time a user visits the site, but other pages using the same resources will be much snappier. Be aware that caching dynamically created content this way can cause some strange issues, so take care as to what you cache. As always, test it well before you release it and you will be rewarded.

Posted: Monday, August 20, 2007 4:11:05 PM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
IIS | Performance

Read and Learn

An awesome post by Jessica Fosler about finding memory leaks.
Posted: Tuesday, May 09, 2006 12:19:34 AM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
Performance

List Databinding Performance With DisplayMembers and ValueMembers

My copy of MSDN magazine arrived last night, and I read the article Practical Tips For Boosting The Performance Of Windows Forms Apps. Good read. Anyway, I was shocked to find out that I have been databinding lists improperly ever since I have been using .Net. I frequently wrote my code like this:

//Bad Code
combobox.DataSource = datatable;
combobox.DisplayMember = "State";
combobox.ValueMember = "Id";

//Good Code
combobox.DisplayMember = "State";
combobox.ValueMember = "Id";
combobox.DataSource = datatable;

Apparenly, order matters very much. In the first example, the combobox binds using the DisplayMember, then rebinds when updated with the ValueMember. In the second example, the binding only happens once.

In our current app, we have two lists that contain thousands of items that need to be bound, so we are binding them during the startup process so the user won't wait when requesting that data. The startup time was reduced by just under 40% by changing the order of the code for binding.

Posted: Wednesday, February 15, 2006 4:36:53 PM (Eastern Standard Time, UTC-05:00)  #    Comments - Trackback
.Net 2.0 | Performance | Winforms