Refactoring by pattern using Regular Expressions and ReSharper search by pattern

by Alice Waddicor

This blog describes two tools I used last year to take some of the work out of a repetitive refactor.

The tools were Visual Studio’s Find tool with a regex, and ReSharper’s Search by Pattern feature. I compare my experiences with the two here.

The refactor – adding a class to certain html elements

As part of some work updating one of our applications to use Bootstrap 3, I wanted to search for instances of a particular pattern, and alter those instances.

The aim was to add the class ‘form-control’ to any HTML input elements which had the type attribute ‘text’ but didn’t already have that class. (In Bootstrap 3, textual input fields with the class ‘form-control’ get 100% width.) These elements were scattered liberally throughout the solution. It seemed like there should be a better way to alter them than searching for all input elements and manually checking each one, or even finding all input elements matching this pattern and manually updating each one.

I couldn’t assume that the class and type attributes would always be in the same order, or that there would be a class attribute at all, or that there would be just one class. From the following examples, only the first four elements should be selected:

If you’re a fan of regex puzzles, you may want to work that one out before looking down!

Visual Studio Find with Regex

I initially tried searching for these elements with Visual Studio’s Find tool, applying a regular expression. Since Visual Studio 2012, Find uses .NET regular expressions, replacing the custom regex syntax of Visual Studio 2010 and earlier. To search with a regex, open the full search dialog (Ctrl + Shift + F, or Ctrl + Shift + H to replace) and check the ‘Use Regular Expressions’ box in the Find Options area.

Visual Studio search menu

As usual I used the regexr site to write and test the regex.

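I used a negative lookahead, written (?!...), to fill in the gaps between the pieces of text I wanted to match on. A negative lookahead asserts that a sub-pattern does not match at the current position, without consuming any characters. Paired with a single-character match, as in ((?!form-control).)*, it matches a run of characters that stops before any occurrence of ‘form-control’.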
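As a minimal sketch of that construct, here it is in Python's re module. Python is used only for illustration (Visual Studio uses .NET regexes, which share this lookahead syntax), and the sample strings and the name `no_form_control` are made up.

```python
import re

# Consume one character at a time, but only while the text at the
# current position does not start with "form-control".
# The ^ and $ anchors force the run to cover the whole string.
no_form_control = re.compile(r'^((?!form-control).)*$')

def is_free_of_form_control(text: str) -> bool:
    """Return True when 'form-control' appears nowhere in text."""
    return no_form_control.match(text) is not None

print(is_free_of_form_control('class="wide" type="text"'))
print(is_free_of_form_control('class="form-control" type="text"'))
```

The first call prints True and the second prints False: in the second string the lookahead fails at the `f` of `form-control`, so the run cannot reach the end of the string.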

Here’s the process I went through to build the regex:

<input((?!form-control).)*.type="text"((?!form-control).)*>

  • Excludes anything with form-control (whatever the relative order of class/type)
  • Class not captured

<input((?!form-control|class).)*(class="((?!form-control|").)*")?((?!form-control|class).)*>

  • Excludes anything with form-control
  • Captures classes where they exist
  • Doesn’t check type

<input((?!form-control|class).)*(class="((?!form-control|").)*")*((?!form-control|class).)*type="text"((?!form-control).)*>

  • Excludes anything with form-control
  • Checks type
  • Captures class when it’s before type

<input((?!form-control|class).)*(class="((?!form-control|").)*")*((?!form-control|class).)*type="text"((?!form-control|class).)*(class="((?!form-control|").)*")*((?!form-control).)*>

  • Excludes anything with form-control
  • Checks type
  • Captures class either before or after type

It’s not pretty. If you are a regex fan you can probably think of a better way of doing that. It also assumes that attribute values will always be wrapped in double quotes.
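To sanity-check an expression like the final one outside Visual Studio, it can be run against a handful of elements. The sketch below does this in Python, whose re module accepts the same syntax for this expression. The sample elements are hypothetical stand-ins, not the examples from the original solution.

```python
import re

# The final expression from above, split across lines for readability.
FINAL_PATTERN = re.compile(
    r'<input((?!form-control|class).)*(class="((?!form-control|").)*")*'
    r'((?!form-control|class).)*type="text"'
    r'((?!form-control|class).)*(class="((?!form-control|").)*")*'
    r'((?!form-control).)*>'
)

def needs_form_control(element: str) -> bool:
    """True when the element is a text input lacking 'form-control'."""
    return FINAL_PATTERN.search(element) is not None

# Hypothetical sample elements (assumed for illustration).
should_match = [
    '<input type="text" />',
    '<input type="text" class="wide" />',
    '<input class="wide" type="text" />',
    '<input class="wide tall" name="q" type="text" />',
]
should_not_match = [
    '<input class="form-control" type="text" />',
    '<input type="text" class="wide form-control" />',
    '<input type="checkbox" class="wide" />',
]

for element in should_match + should_not_match:
    print(needs_form_control(element), element)
```

For a match, group 2 holds a class attribute found before type and group 6 one found after it, which is what the replacement step relies on.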

With the match working, I was able to find all the elements in the solution that needed updating. The next step was to use the replace option to update these elements automatically, inserting the class ‘form-control’ where it wasn’t present. I needed to check capture groups 2 and 6 (the class attribute before or after type), and if one of these was populated, retain the group but add ‘form-control’ to its value. If neither was populated, I needed to insert a class attribute with the value ‘form-control’.

Visual Studio provides some pre-set options for substituting text based on a regex:

[Image: Visual Studio’s pre-set regex substitution options]

… but none of these seemed like exactly what I needed.

At this point, I’m afraid I decided to move on and look at other ways of carrying out the substitution. If you know a way, using a regex substitution, please let me know in the comments!
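For comparison, the conditional logic described above (extend a captured class if group 2 or 6 is populated, otherwise insert a new class attribute) is straightforward when the replacement is computed by a function rather than a substitution string. The following is a sketch in Python, not a Visual Studio Replace expression; the function names are hypothetical, and it assumes the same double-quoted attributes as the regex.

```python
import re

# Same expression as the final search regex above.
FINAL_PATTERN = re.compile(
    r'<input((?!form-control|class).)*(class="((?!form-control|").)*")*'
    r'((?!form-control|class).)*type="text"'
    r'((?!form-control|class).)*(class="((?!form-control|").)*")*'
    r'((?!form-control).)*>'
)

def _with_form_control(match: re.Match) -> str:
    """Build the replacement text for one matched input element."""
    tag = match.group(0)
    offset = match.start()
    for index in (2, 6):  # class attribute before / after type
        if match.group(index):
            # Position of the closing quote of the captured class attribute.
            quote = match.end(index) - offset - 1
            return tag[:quote] + ' form-control' + tag[quote:]
    # No class attribute was captured: add one straight after the tag name.
    return tag.replace('<input', '<input class="form-control"', 1)

def add_form_control_class(html: str) -> str:
    """Add 'form-control' to every text input in html that lacks it."""
    return FINAL_PATTERN.sub(_with_form_control, html)

print(add_form_control_class('<input type="text" />'))
print(add_form_control_class('<input class="wide" type="text" />'))
```

The first call adds a new class attribute and the second appends to the existing one. Elements that already carry ‘form-control’ never match, so they pass through unchanged.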

ReSharper Search by Pattern

Next, I tried ReSharper’s Search by Pattern function, also known as Structural Search and Replace. (ReSharper > Find > Search by Pattern).

I’m using ReSharper 9, although this feature (with some differences in behaviour) is also present in earlier versions going back to ReSharper 5.

Search by Pattern provides a language-aware, higher-level abstraction than a regex for identifying text. It understands the types of entities that you might be searching for in different types of files in a solution.

Rather than writing one large regex for your entire desired match, you define the ‘placeholders’ (language entities) that you want to check, and then write smaller regexes for the content of these placeholders. In the case of HTML files, placeholders can represent tags, attributes, attribute values or content.

This means you don’t have to worry about writing a regex to pick out the entities such as attributes whose values you want to check.

It also makes searching for strings where a particular string is not present a bit easier – there is a ‘Should not’ option.

I set up an attribute value placeholder which would match any attribute value not containing ‘form-control’.

Once you’ve set up placeholders, you add them to a search canvas using a $placeholderName$ syntax.

I started off with this pattern:

[Image: the initial search pattern]

I also needed to capture all the other attributes, so they could be added back in at the replace stage.

For this, I used an attribute placeholder, without specifying any regex for the attributes it should match.

To let the placeholder capture all of the remaining unmatched attributes, leave the ‘maximal number of elements to match’ option unchecked. To make entities matching the placeholder optional, set the minimum number of elements to match to 0. This lets you use placeholders as capture groups, without affecting what is matched.

Here’s the pattern I ended up with:

[Image: the final search pattern]

I was worried that the order in which the placeholders were added would affect the results, but this appeared not to be the case. The search offers a ‘match similar constructs’ option, which according to ReSharper’s introductory blog about the search feature, searches for “constructs that are semantically identical”. This option is checked by default, but even when this was unchecked, my search pattern matched elements with both type first, and class first.

The $attributes$ placeholder will mop up any remaining attributes without you having to worry about excluding ones you’ve matched already, or the relative order of the attributes.
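This order-independence is what you would expect from matching on parsed attributes rather than raw text. As a rough analogy only (this is not how ReSharper is implemented, and the class and function names are made up for illustration), the Python sketch below uses the standard library's html.parser to find the same elements by inspecting attributes after parsing.

```python
from html.parser import HTMLParser

class TextInputFinder(HTMLParser):
    """Collect text inputs whose class list lacks 'form-control'."""

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != 'input':
            return
        attributes = dict(attrs)  # attribute order no longer matters
        if attributes.get('type') != 'text':
            return
        # A missing or valueless class attribute yields an empty class list.
        classes = (attributes.get('class') or '').split()
        if 'form-control' not in classes:
            self.matches.append(self.get_starttag_text())

def find_text_inputs(html: str) -> list:
    """Return the source text of each matching input element."""
    finder = TextInputFinder()
    finder.feed(html)
    finder.close()
    return finder.matches

page = (
    '<input class="wide" type="text">'
    '<input type="text" class="form-control wide">'
    '<input type="text" />'
    '<input type="checkbox">'
)
print(find_text_inputs(page))
```

Because the check runs on a dictionary of attributes, one condition covers class-before-type, type-before-class and no class at all, which is the part the single regex had to spell out by hand.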

The pattern above requires the presence of a class attribute. I couldn’t find a way to match both elements with a class that matched our regex, and elements with no class at all. Because it was relatively quick and easy to search using patterns, compared to using a regex, I decided to just carry out the search and replacement twice, once for elements with a class that didn’t contain the value ‘form-control’, and once for elements without a class at all.

So next, the replacement.

According to ReSharper’s introductory blog about the search feature, “the replace pattern can be any text that is valid for the language you’re using, plus placeholders if you need them”.

For elements which contained a class attribute, I reused the placeholders from the original pattern to add back the element, and its original class and other attributes, adding a class value of ‘form-control’.

ReSharper search by pattern with class

For elements without a class attribute at all, I used the following search and replace patterns, where the $attribute$ placeholder matched any attribute without the name ‘class’:

ReSharper Attribute placeholder should not match class

[Image: search and replace patterns for elements without a class attribute]

There’s a video of the refactor being carried out using these two patterns below.

Re-using patterns

A key feature of ReSharper’s search by pattern function is that you can save patterns and re-use them at later dates, on the same or other code bases.

Once a pattern has been saved as a custom pattern, you can set it to show as a suggestion if you come across a match while working on the solution at a later date, letting you apply the refactor as a quick fix (Alt + Enter). You can also specify the suggestion text and the quick fix text (the default text is the pattern, which might not make sense out of context).

Saved patterns

There’s a video showing how to save a pattern, set it to a suggestion, and apply a quick fix below.

Resources

Resources for using search with pattern on the JetBrains site:

Introducing ReSharper structural search and replace

Reference – Search with pattern

Many thanks to Matt Ellis (@citizenmatt) and Hadi Hariri (@hhariri) from ReSharper for their helpful advice on using Search by Pattern.

About the author

Alice is a 3rd year apprentice at endjin, providing engineering services using the Microsoft Cloud. She comes from a writing background, and re-trained because of an interest in technology, particularly data processing, information extraction, and automation.