Performance Recovery in Maven Projects Using HtmlUnit


Introduction

Test-driven development, continuous integration, and Maven have drastically improved web application development, particularly in the area of integration testing. Combined with tools such as Selenium 2/WebDriver, Sauce Labs, Sonar, QAPlug, Mockito, and HtmlUnit, they make producing thoroughly covered applications a straightforward process.

Gargoyle Software’s HtmlUnit is a headless browser/acceptance testing tool that performs web page validation in a browser-agnostic fashion. In other words, it tests the behavior of the application, not the browser (although browser designation is an option). Web support (protocols, cookies, headers...) as well as application support (expected Exceptions, interaction with JUnit, Mockito, TestNG...) are included. Under the covers, HtmlUnit uses Rhino (http://www.mozilla.org/rhino/) to process JavaScript.

In development, we use HtmlUnit/JUnit 4/Mockito with the Maven Failsafe plugin to carry out our unit and integration tests. This approach allows us to easily track down test failures, and to receive IDE/CI/Logback reporting of their causes.

That said, one issue that has cropped up on a few occasions is performance. HtmlUnit’s JavaScript processing features, while powerful, are a work in progress. The HtmlUnit API makes this clear in WebClient.getJavaScriptEngine(): “This method is intended for testing only - use at your own risk.”

When It Breaks, Investigate

Recently, while working on one of several web applications, I discovered that the build time of one of the apps had slowed considerably. What took about 10 minutes to build had suddenly jumped to 45 minutes (and climbing)! This was unacceptable because both apps were slated for launch in just a few weeks.

An unnecessarily long build time impedes the tightly-coupled effort between DEV and QA that often emerges when a deliverable nears launch.

When I looked at the code commits leading up to the performance hit, nothing jumped out as an obvious cause. Both apps shared a library undergoing heavy development, but my investigation revealed nothing with regard to degradation. The next step was to check out the build log.

In the build log, I discovered one of our tests had leapt from a few seconds to a few minutes! No backend changes could have caused this, but the frontend was in the process of changing scripts, adding web tracking, and making calls to third party APIs.
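Scanning a long build log by eye is tedious, so a small helper can surface the slow tests. The following is a minimal sketch, assuming the log contains Surefire/Failsafe-style lines (a "Running <class>" line followed by a summary line containing "Time elapsed: <n> sec"). The class name SlowTestFinder and the threshold value are hypothetical and were not part of the original project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Scans Surefire/Failsafe-style console output for test classes slower than a threshold. */
final class SlowTestFinder {

    /** A test class name paired with its reported elapsed time in seconds. */
    static final class Timing {
        final String testClass;
        final double seconds;

        Timing(String testClass, double seconds) {
            this.testClass = testClass;
            this.seconds = seconds;
        }
    }

    // Assumed log shape: "Running <class>", then a line containing "Time elapsed: <n> s...".
    private static final Pattern RUNNING = Pattern.compile("Running (\\S+)\\s*$");
    private static final Pattern ELAPSED = Pattern.compile("Time elapsed: ([0-9]+(?:\\.[0-9]+)?) s");

    /** Returns every test class whose elapsed time is at or above thresholdSeconds. */
    static List<Timing> findSlow(List<String> logLines, double thresholdSeconds) {
        List<Timing> slow = new ArrayList<>();
        String current = null; // class named by the most recent "Running" line
        for (String line : logLines) {
            Matcher running = RUNNING.matcher(line);
            if (running.find()) {
                current = running.group(1);
                continue;
            }
            Matcher elapsed = ELAPSED.matcher(line);
            if (current != null && elapsed.find()) {
                double seconds = Double.parseDouble(elapsed.group(1));
                if (seconds >= thresholdSeconds) {
                    slow.add(new Timing(current, seconds));
                }
                current = null; // wait for the next "Running" line
            }
        }
        return slow;
    }
}
```

Feeding it the lines of the build log (for example via Files.readAllLines) with a threshold of 60 seconds would list only the tests that take a minute or more, which is where a jump from seconds to minutes shows up.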

Preparing for the Breakage

In order to prove this actually happens, we are going to reproduce the situation.

Ingredients:

  • One Maven project including the maven-failsafe-plugin
  • HtmlUnit
  • Servlet container of your choice

Create a simple project with a greeting page containing a form that posts and renders a result.

I’m not including the steps to do this because the focus here is on HtmlUnit, not on setting up an application or environment. Whether you use Spring WebMVC/Struts, Tomcat/Jetty, or Eclipse/IntelliJ is irrelevant.

Create tests for the simple webapp and verify that they work as expected:

[Figure: My fully tested webapp. Life is good.]

Your test should be set up such that it starts and stops the servlet container, running the integration test against the deployed server (i.e., a [test]IT running in goals clean verify). Along with the unit tests, there should be one happy-path test that submits the home page and validates the returned success page. This test will become the source of performance issues.

My test looks like this:

@Test
public void enterAName() throws IOException {
   HtmlPage homePage = webClient.getPage(HOME_URL);
   String homePageTitle = homePage.getTitleText();
   assertThat(homePageTitle, is(HOME_PAGE_TITLE));
   final HtmlForm userDetailsForm = (HtmlForm)homePage.getElementById(FORM_NAME);
   ((HtmlTextInput) userDetailsForm.getElementById(NAME_ELEMENT)).setValueAttribute(NAME_VALUE);
   HtmlPage successPage = userDetailsForm.getElementById(SUBMIT_ELEMENT).click();
   String successPageTitle = successPage.getTitleText();
   assertThat(successPageTitle, is(SUCCESS_PAGE_TITLE));
}

If this were an actual project, then I’d likely move the webClient into an abstract superclass and fetch the initial page in the setup method.

The end-to-end clean verify process is currently taking about 20 seconds on my machine. Let’s see if we can mess it up.

Messing with the Front End

Let’s add some latency to the front end. We’ll start with some jQuery and a call to Google Maps. Using a quick iframe for the maps call, an absolute reference to the jQuery script, and a simple animation, the JSP becomes:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>

<html>
<head>
<title>Home Page</title>
<script type="text/javascript" src="http://code.jquery.com/jquery-1.6.4.min.js"></script>
<style type="text/css">
.mover{
   border-width:2em;
}
</style>
</head>

<body>
<h1>Welcome Home</h1>
<form id="userDetailsForm" action="processName" method="post">
Name:<input type="text" id="userName">
<input id="submitForm" type="submit" value="Submit">
</form>

<iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=ROFL&amp;aq=&amp;sll=40.880295,-117.46582&amp;sspn=22.780606,24.389648&amp;vpsrc=0&amp;ie=UTF8&amp;hq=ROFL&amp;hnear=&amp;ll=35.373292,-119.018712&amp;spn=0.096161,0.095272&amp;t=m&amp;z=13&amp;iwloc=A&amp;cid=2347641919807782017&amp;output=embed"></iframe><br /><small><a href="http://maps.google.com/maps?f=q&amp;source=embed&amp;hl=en&amp;geocode=&amp;q=ROFL&amp;aq=&amp;sll=40.880295,-117.46582&amp;sspn=22.780606,24.389648&amp;vpsrc=0&amp;ie=UTF8&amp;hq=ROFL&amp;hnear=&amp;ll=35.373292,-119.018712&amp;spn=0.096161,0.095272&amp;t=m&amp;z=13&amp;iwloc=A&amp;cid=2347641919807782017" style="color:#0000FF;text-align:left">View Larger Map</a></small>

<div class="mover">Moving</div>
</body>

<script type="text/javascript">
$(move);
function move(){
   $('.mover').animate(
      {fontSize: '+=40'}, 1000, 'linear',
      function(){
         $(this).after(stop());
    });
}

function stop(){
   $('.mover').html("Done!!!");
}
</script>
</html>
  • I don’t recommend using the iframe approach to embed a Google map into a web application. Building the map with the Google Maps API offers greater customization, control, and browser integration (particularly with older versions of IE on Windows XP).

What happened to the integration tests? They’re now taking about one minute to run! The test hasn’t changed: all it does is open a page, enter a name, submit a form, and verify the result.

Disabling Javascript

Since this test doesn’t depend on the scripts running in the front end, the performance situation can be improved by simply disabling JavaScript in HtmlUnit’s browser. Adding the following line to the setup() method knocks the test time back down to roughly 20 seconds:

webClient.setJavaScriptEnabled(false);

  •  If the webClient is provided by an abstract superclass, then JavaScript disablement/enablement should either occur in setup() and teardown(), or should be toggled off by default and on as needed.
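The "off by default, on as needed" option can be sketched without tying it to a particular HtmlUnit version. In the hypothetical sketch below, ScriptSwitch is a stand-in interface for the single call the test needs (in a real test it would delegate to webClient.setJavaScriptEnabled), and ScriptScope is an illustrative helper name, not an HtmlUnit class.

```java
/** Hypothetical stand-in for the one browser capability this sketch needs. */
interface ScriptSwitch {
    void setJavaScriptEnabled(boolean enabled);
}

/** Keeps JavaScript off by default and enables it only around code that needs it. */
final class ScriptScope {
    private final ScriptSwitch client;

    ScriptScope(ScriptSwitch client) {
        this.client = client;
        client.setJavaScriptEnabled(false); // default for every test: scripts off
    }

    /** Runs the action with JavaScript on, then turns it back off even if the action throws. */
    void withJavaScript(Runnable action) {
        client.setJavaScriptEnabled(true);
        try {
            action.run();
        } finally {
            client.setJavaScriptEnabled(false);
        }
    }
}
```

A superclass could create the scope in setup() and let only the tests that exercise dynamic page behavior wrap their steps in withJavaScript(...), so one script-heavy test does not slow down the rest of the suite.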

Adding Elements

After some more development, my test starts failing again. This time, it’s complaining about a missing form element:

[Figure: My broken test.]

HtmlUnit can’t find this element because the application supports multiple ages, and it dynamically adds them to the DOM via Javascript as the user indicates their need. Enabling Javascript will allow interaction with the dynamic parts of the page, but will also cause the build performance issues to return.

If build performance is top priority, then the unavailable-on-load element can be injected into the DOM during testing with something like:

HtmlElement ageElement = homePage.createElement("input");
ageElement.setAttribute("name", "age");
ageElement.setAttribute("value", "80");
userDetailsForm.appendChild(ageElement);

This will keep the build time low, but will also introduce a hole in the test coverage. Should the element’s identifier change in the actual script, this test will still pass, so an additional test will need to be written for that. It’s not the cleanest solution, but neither is scripted addition of required DOM elements.
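One inexpensive way to narrow that hole is a check that reads the front-end script as plain text and fails when it no longer mentions the field name the test injects. This is only a sketch under assumptions: the class and method names are hypothetical, the regular expression covers a few common ways of setting a name attribute (name="age" in markup and jQuery-style attr('name', 'age')), and it does not execute the script.

```java
import java.util.regex.Pattern;

/** Guards against the injected test element drifting from what the real script creates. */
final class ScriptFieldGuard {

    /**
     * Returns true if the script source appears to set a name attribute equal to fieldName,
     * for example name="age" inside an HTML string or attr('name', 'age') in jQuery.
     */
    static boolean scriptDeclaresField(String scriptSource, String fieldName) {
        String quoted = Pattern.quote(fieldName);
        // "name", an optional closing quote, then = , or :, then the quoted field name.
        // The optional backslashes allow for escaped quotes inside a JavaScript string.
        Pattern declaration = Pattern.compile(
                "name\\s*['\"]?\\s*[=,:]\\s*\\\\?['\"]" + quoted + "\\\\?['\"]");
        return declaration.matcher(scriptSource).find();
    }
}
```

In a unit test, the script source would be read from wherever the project keeps it, and the assertion would use the same constant the integration test uses for the injected element, so renaming the field in the script breaks the build quickly without enabling JavaScript in HtmlUnit.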

Additional Notes

A useful way to determine what’s available and happening on a page as the test runs is to log its source code. This can be done by calling asXml() on the HtmlPage object.