
Did you ever modify dozens or even hundreds of classes of your program? It doesn't have to be something complicated. Just imagine a minor change of the corporate design. Please make the headers blue. Oh, also remove the rounded borders and the shadows from the input fields. And while you're at it, the buttons should have that fancy ripple effect of Material Design, but only the "OK" buttons.

I guess you're horrified? I'm not. I'll write a tool for that.

Large-scale refactorings

You've probably noticed it already: this article goes a bit beyond refactoring. You can use my idea to restructure code. Time and again, I've used it to add a feature to each of the 70+ components of BootsFaces. Sometimes I've even used it to implement a breaking change of the API of the corporate framework. In other words: I modified half a dozen applications simultaneously without anybody noticing. After committing and pushing the change, the applications still worked. The only difference was the fancy new feature my co-workers could use.

You need a program for that!

If you're about to modify your source code at large scale, write a program for that. You'll see that this is easier than trying to modify everything manually. Plus, you can test your changes. You can run the program as many times as it takes. It's easy to roll back the changes and to try again, this time with slightly different parameters.

That takes the sting out of refactoring. Or restructuring, or adding features to your code.

Pedantic section: Adding features != refactoring != restructuring

My peer group uses the word "refactoring" in a generalized way, meaning "modifying the program to make it better." When I wrote the title of this article, I speculated you're using the term in the same broad sense.

Nonetheless, I like the definition of refactoring given by Martin Fowler. Refactoring is applying an algorithm that modifies your source code without changing the observable behavior of your program.

Of course, most people use refactoring to improve the code. But that's not part of the definition. The bottom line is to provide an algorithm for modifying your code without breaking it.

That, in turn, makes refactoring such a versatile tool. Two decades ago, nobody dared to rename a variable. As things go, the role of a variable in an application often changes over time. In programming, it's not unusual that a variable starts as a "traffic-light" and becomes a "roundabout" after a couple of releases. I've often seen variables claiming to do one thing while playing a completely different role in reality. In the old days, people added a comment informing their coworkers that the "traffic-light" had been a roundabout since last year's release, but that it was too risky to change the name.

Nowadays, almost every IDE offers a decent "rename" refactoring. So developers change the name of variables all the time without thinking twice.

Putting the idea into practice

Come to think of it, what your IDE does is transform your source code by an algorithm. In other words, the IDE's developers have written a program for it.

If you're reading this article, you're almost certainly a programmer. You can write a program, too.

Mind you: the source code of your class is just a string. Almost every programming language is very proficient in manipulating strings.

For instance, once upon a time, I wanted to add a feature to every BootsFaces component. Simplifying things a bit, what I had to do was

  • take each of the 70+ component classes,
  • add an attribute at the beginning of the class,
  • and add a getter and a setter at the end of the class.

In reality, it was more than one attribute, but even so, the task seems scary if done manually. It's almost impossible to apply the same modification 70 times manually without errors.

Maybe you can do it using a global search and replace with a smart regular expression, but honestly: that's almost the same as writing a program.

How to write a program for that

On the other hand, writing a program for that isn't difficult. You just have to

  • scan the directory recursively for Java files,
  • filter the "*Component.java" files,
  • read them,
  • add the new variable after the constructor,
  • add the getter and setter at the end of the class,
  • and write the new version of the file back to disk.

That's the general pattern: find the files you want to manipulate, read them, manipulate them, and write them. Easy does it!
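To make the pattern tangible, here's a minimal sketch in JavaScript for node.js. It is not the tool I used for BootsFaces. The function names, the attribute name "tooltip", and its type are made up for illustration. The sketch also assumes that the first "{" of a file opens the class body and the last "}" closes it, so it puts the attribute at the beginning of the class.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: adds a private attribute plus getter and setter
// to the source code of a single Java class.
function addAttribute(source, type, name) {
  const capitalized = name.charAt(0).toUpperCase() + name.slice(1);
  const field = `\n  private ${type} ${name};\n`;
  const accessors =
    `\n  public ${type} get${capitalized}() {\n    return ${name};\n  }\n` +
    `\n  public void set${capitalized}(${type} ${name}) {\n    this.${name} = ${name};\n  }\n`;

  // Assumption: the first "{" opens the class body, the last "}" closes it.
  const open = source.indexOf('{');
  const close = source.lastIndexOf('}');
  if (open < 0 || close < open) {
    return source; // not a class we understand, so leave it untouched
  }
  return source.slice(0, open + 1) + field +
         source.slice(open + 1, close) + accessors + source.slice(close);
}

// Scan the directory recursively and rewrite every *Component.java file.
function refactorDir(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const name = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      refactorDir(name);
    } else if (entry.name.endsWith('Component.java')) {
      const source = fs.readFileSync(name, 'utf8');
      fs.writeFileSync(name, addAttribute(source, 'String', 'tooltip'));
    }
  }
}
```

Calling refactorDir() with the root folder of your sources rewrites the files in place, so commit your work before you run something like this. A class-level annotation containing curly braces would break the assumption above. That's exactly the kind of special case you'll want to check for before trusting the result.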

What if it's not that simple?

OK, sometimes it's not that simple. You start to manipulate the code, and after a successful beginning, you encounter one special case after another. There's no point in pressing on. Time-boxing is a good strategy to avoid such scenarios. Give yourself an hour, a day, a week to solve the problem. After that time limit, it's time for plan B.

Today, I ran into exactly this scenario. As you may or may not know, I'm moving my blog from WordPress to Angular. The first version of the relaunch simply used the original CSS files. Of course, I wanted to optimize that. The original CSS files support many features I'll never use, so let's get rid of them.

In theory, it's easy to find out which CSS rules are used and which are not. Just examine the class="whatever" snippets in your Angular source code files. Collect all these whatever class names in a list.

After that, read the CSS file and examine it rule by rule. If every CSS class name in the selector of the rule is on your list, such as .whatever, keep the rule. If the selector contains any CSS class name which is not on your list, drop it.

Euphoria

It's such a simple plan. Nothing can go wrong! The first CSS file I examined was easy going. I ran the algorithm over it, and I was rewarded with a much smaller CSS file.

I'm so proud of this success I'll show you the source code in a minute. But first, let me show you my Waterloo - and what I made of it.

Plan B

Thing is, the second CSS file I examined was a tad more complicated. It contained media queries, which make use of nested curly braces. My original algorithm wasn't up to that.
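To illustrate the problem, here's a small sketch. It is not the algorithm I ended up with, just one possible way to deal with nesting: instead of splitting at every closing brace, it tracks the brace depth and cuts the stylesheet into top-level blocks. The function name is made up, and the sketch assumes that curly braces never show up inside strings or comments.

```javascript
// Split a stylesheet into its top-level blocks by tracking the brace depth.
// Assumption: braces never occur inside strings or comments.
function splitTopLevelBlocks(css) {
  const blocks = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < css.length; i++) {
    if (css[i] === '{') {
      depth++;
    } else if (css[i] === '}') {
      depth--;
      if (depth === 0) {
        // the block that was opened at depth 0 is closed now
        blocks.push(css.substring(start, i + 1).trim());
        start = i + 1;
      }
    }
  }
  return blocks;
}

const css = '.a { color: red; } @media (max-width: 600px) { .b { color: blue; } }';
console.log(css.split('}').length);           // 4 chunks, the media query is torn apart
console.log(splitTopLevelBlocks(css).length); // 2 blocks: the rule and the media query
```

The media query survives as a single block. To filter the rules inside it, you'd have to run the same analysis again on the content between its outer braces.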

That's a valuable lesson: Do write a program to refactor code, but only do it as long as you can wrap your head around it. In my experience, adding special cases soon results in unmaintainable code. That's precisely where we do not want to go. The idea of writing a program for refactoring - or restructuring - is to make the transition manageable. If things get difficult, be quick to abort the project.

That's what I should have done when I encountered media queries and "keyframes." The latter are animations, which are also expressed using curly braces - in other words, another level of nested curly braces. Instead, I ignored the "time box" rule. I invested enough time to implement a working algorithm eliminating unused CSS rules.

Only I'd lost my self-confidence by that time. I did not want to rely on my algorithm.

Writing a program to analyze the code and to provide hints on what to do

Even so, the program I'd written was useful. It gave me suggestions as to which parts of the code I could safely delete and which I had to retain.

That's what I did. I examined the recommendations of my refactoring algorithm, double-checked them, and removed those CSS rules that still seemed superfluous.
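Such an advisory mode is easy to build. The following sketch only shows the idea, it's not my original program. It returns a list of suggestions instead of rewriting the file. The function name and the shape of the result are assumptions, and like the simple algorithm, it doesn't cope with nested curly braces.

```javascript
// Report selectors that use CSS classes missing from usedClasses.
// Nothing is modified. The result is a list of hints for a human reviewer.
function suggestRemovals(css, usedClasses) {
  const suggestions = [];
  css.split('}').forEach(chunk => {
    const parts = chunk.split('{');
    if (parts.length > 1) {
      parts[0].split(',').forEach(rawSelector => {
        const selector = rawSelector.trim();
        const unused = selector
          .split('.')
          .slice(1)                             // skip the text before the first dot
          .map(c => c.split(/[\s:>+~\[]/)[0])   // cut off whatever follows the class name
          .filter(c => !usedClasses[c]);
        if (unused.length > 0) {
          suggestions.push({ selector, unused });
        }
      });
    }
  });
  return suggestions;
}

const hints = suggestRemovals('.a { color: red; }\n.b .c { margin: 0; }', { a: true, b: true });
console.log(hints); // [ { selector: '.b .c', unused: [ 'c' ] } ]
```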

How it's done

I've simplified my CSS example considerably, but still, it's 86 lines long. That's a lot for a blog post. On the other hand, that's something you jot down in one or two hours. Compare that to the challenge of finding out manually which CSS classes are used and which are not.

Let's begin with the simple part. The following code snippet extracts the CSS classes from the HTML and Angular code. It's a node.js program written in JavaScript, using the split() function to find the class="whatever" declarations in the source files. For example, content.split(/class="/) yields an array. The first entry is irrelevant: it contains the source code before the first class definition. Every other entry starts with the class names, because split() cuts off the separator, that is, the class=" bit.
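If you've never used split() this way, here's a tiny stand-alone example. The HTML string is made up for illustration.

```javascript
const content = '<div class="card shadow"><p class="lead">Hi</p></div>';
const chunks = content.split(/class="/);

// chunks[0] is the markup before the first class attribute. Every other
// chunk starts with the class names because split() drops the separator.
const classNames = chunks
  .slice(1)
  .map(chunk => chunk.substring(0, chunk.indexOf('"')));

console.log(classNames); // [ 'card shadow', 'lead' ]
```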

const fs = require('fs');

// all CSS class names found in the source code (class name -> true)
const usedClasses = {};

// collect the class names of every class="..." attribute of a file
function findCSS(filename) {
  const content = fs.readFileSync(filename, 'utf8');
  const chunks = content.split(/class="/);
  chunks.forEach((c, index) => {
    if (index > 0) { // chunk 0 is the text before the first class attribute
      const pos = c.indexOf('"');
      const classes = c.substring(0, pos).split(' ');
      classes.forEach(clazz => {
        if (clazz.length > 0) {
          usedClasses[clazz] = true;
        }
      });
    }
  });
}

// visit every file below dir, skipping hidden files and unit tests
function scanDirRecursive(dir) {
  const files = fs.readdirSync(dir);
  files.forEach(f => {
    if (!(f.startsWith('.') || f.endsWith('.spec.ts'))) {
      const name = dir + '/' + f;
      if (fs.lstatSync(name).isDirectory()) {
        scanDirRecursive(name);
      } else {
        findCSS(name);
      }
    }
  });
}

scanDirRecursive('../src/app');
scanDirRecursive('../src/assets/articles/');
tidy('my-favority-css-file.css');

Anatomy of the program

This program follows a standard pattern:

  • Scan the directory recursively,
  • read every source code file,
  • analyze it and collect relevant information,
  • modify the file based on this information,
  • and write it back to the disk.

Writing the modified version back to disk

As I've mentioned above, the beauty of this approach is that you can always go back in time. If you're using a tool like Git or Subversion, that is. Just revert the changes, correct the algorithm modifying the source code, and start over.

The algorithm modifying the source code resembles the previous algorithm. It uses the split() function extensively. If you're more familiar with regular expressions, you can use them to achieve the same goal in a more readable way.

function tidy(filename) {
  const content = fs.readFileSync(filename, 'utf8');
  let remainingCSS = '';
  // every chunk is a rule without its closing brace
  const chunks = content.split('}');
  chunks.forEach(chunk => {
    const rule = chunk.split('{');
    if (rule.length > 1) {
      const identifiers = rule[0];
      const css = rule[1].trim();
      // a rule may have several comma-separated selectors
      const ids = identifiers.split(',');
      ids.forEach(id => {
        id = id.trim();
        const classes = id.split('.');
        let used = true;
        classes.forEach((part, j) => {
          if (j > 0) { // part 0 is the text before the first dot
            // cut off everything following the class name after a space
            const clazz = part.split(' ')[0];
            if (!usedClasses[clazz]) {
              used = false;
            }
          }
        });
        if (used) {
          remainingCSS += id + ' {\n' + css + '\n}\n\n';
        }
      });
    }
  });
  fs.writeFileSync(filename.replace('.css', '.short.css'), remainingCSS);
}
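For comparison, here's how the check of a single selector might look with a regular expression. Take it as a sketch under assumptions rather than a drop-in replacement: the function name is made up, the list of used classes is passed as a parameter, and the behavior differs slightly from the split() version because a pseudo-class like :hover is not treated as part of the class name.

```javascript
// Returns true if every class name mentioned in the selector is in use.
// usedClasses is an object mapping class names to true.
function selectorIsUsed(selector, usedClasses) {
  // a class name: a dot, followed by letters, digits, hyphens or underscores
  const classNames = selector.match(/\.[A-Za-z_-][\w-]*/g) || [];
  return classNames.every(name => usedClasses[name.substring(1)] === true);
}

const used = { btn: true, 'btn-primary': true };
console.log(selectorIsUsed('.btn.btn-primary:hover', used)); // true
console.log(selectorIsUsed('.btn .dropdown', used));         // false
console.log(selectorIsUsed('body', used));                   // true
```

A selector without any class name counts as used, just like in the split() version. Attribute selectors containing a dot, such as a[href$=".pdf"], would confuse this regular expression, so it's still wise to double-check the result.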

Wrapping it up

Many a time I've been puzzled why nobody but me does large-scale refactorings or restructurings. I'll never know for sure. Maybe I'm just bold and lightheaded, bordering on being irresponsible. Be that as it may, I prefer to believe that the key to success is not pure luck, but using and creating tools manipulating source code.

Give it a try. You'll notice it's easier than expected!

