Pick the Right Tool for the Task at Hand!

We’ve been told many times in our lives that each tool is useful for certain tasks, and that using a tool for something unrelated, or even only somewhat related, is not a very smart decision. For instance, you do NOT use a hammer to cut a piece of wood! Maybe after a lot of struggling you will succeed (perhaps even breaking it at the spot you wanted! :) ), but when you say it out loud (like I just did) it sounds even more ridiculous! Sometimes, though, the mismatch is not as obvious as the hammer situation. You can achieve what you’re looking for, or get to where you want, but with more effort, time, cost and energy than is necessary, all of which can be avoided! Imagine you want to get from point A to point B in the following picture:

[Figure: Traveling from point A to B]

There are different ways to do that. One is the blue path and another is the red path (and there are many more). Both do the job and get you to the destination, but I’m sure everyone agrees that the blue path is better and more effective (in terms of time, cost, energy, etc.). It’s the same idea as using the appropriate tool for the task at hand. When you do not use the appropriate tool, you’re taking the red path: you will probably get to your destination, but with much more struggling than is necessary!

On my machine, I have a directory in which I store the episodes of a show I’m watching, and each episode’s files are stored in a subdirectory named with a four-digit number (####). For some weird reason I need to know which are the last 3 episodes of the show that I have. There are two different tools that I can use to get this small task done. One is a simple program in whatever language I want. Let’s say Ruby as an example, and the code can be something like the following:

class LatestEpisodesFetcher
  def initialize(entries)
    @entries = entries
  end

  def fetch
    # Drop non-numeric entries such as "." and ".." (to_i turns them into 0)
    # before picking the last three, so they cannot crowd out real episodes.
    numbers = @entries.map(&:to_i).reject(&:zero?)
    numbers.sort.last(3)
  end
end

print "#{LatestEpisodesFetcher.new(Dir.entries(".")).fetch}\n"

And the other is a simple Bash script which can look like the following:

#!/bin/bash

set -e
ls | grep -v "whatever you don't want" | 
     sort -n | 
     tail -3 | 
     xargs

It’s super clear which is the red path and which is the blue path. You can tell EVEN by just looking at them, without going into more detailed technical reasons (e.g. the Bash script takes advantage of simple, efficient, fast UNIX commands that work with the file system directly, while the Ruby code relies, in the best case, on thin wrappers around the appropriate system calls plus some extra data processing on the results). Of course you can write the Ruby code more succinctly, probably even as a one-liner in more idiomatic Ruby, but the code I put here makes my point more explicitly, so I’m gonna leave it that way.

UNIX commands are great for quick analysis of files and directories, and you can extract interesting information (not just the weird one I showed here) very efficiently to get some perspective. And the great philosophy of UNIX (every program does ONLY one thing and does it WELL!) gives you the ability to mix and compose all these small programs to achieve great stuff.

The point I’m trying to make is pretty simple: pick the right tool for the task at hand! To be honest, it did not need me to preach this much about it and bore you! So, sorry for that and happy hacking :)

Single Level of Abstraction (Don’t mix things in the wrong place)

This is going to be a short post on a very interesting and important programming concept that IMHO is really helpful and makes a nice difference in the code! The technique is called Single Level of Abstraction (not surprising really), by Kent Beck, and it’s about not mixing statements that sit at different levels of abstraction and detail in the same method: the statements inside a method should all be at the same level of abstraction. It’s one of those somewhat fuzzy techniques/principles in programming, so I think an example would be good right now. To make the point, let’s imagine a VERY simple and almost-unreal piece of code! Let’s say we store a matrix in a file and we want to compare two matrices that are sitting in two files and check whether they’re equal or not. At one point, our compare method looks like the following:

class MatrixFileComparer:
  …
  def compare(self):
    with open(self.file1) as f1:
      lines1 = f1.readlines()
    with open(self.file2) as f2:
      lines2 = f2.readlines()
    return self.compare_lines(lines1, lines2)
  …

As I said, it’s a very simple piece of code: it reads the lines of each file and compares those lines one by one! If you pay attention to the content of the compare method, it’s more like an integration and delegation point of the MatrixFileComparer class! That means it delegates different responsibilities to the appropriate methods inside/outside of this class, each of which does one thing and does it well, and then integrates their results/effects! Such a method should not contain any duplication, low-level task or implementation detail; it should mostly invoke/delegate to other methods, as I mentioned before! But if you look at our compare method, it’s also doing a lower-level task, which is opening a file and reading all of its lines via the file handle! Right there you can notice the different levels of abstraction in this method! One task is reading the content of a file, which is obviously more fine-grained and lower-level than the other one, which is JUST delegating to the compare_lines method!

Now let’s fix this mixture of abstraction levels and see what it looks like:

…

def compare(self):
  lines1 = self.read_all_lines(self.file1)
  lines2 = self.read_all_lines(self.file2)
  return self.compare_lines(lines1, lines2)

def read_all_lines(self, file_name):
  with open(file_name) as f:
    return f.readlines()
…

Now if you look at the compare method, ALL it does is invoke methods: it delegates each job to the appropriate method and uses the result in the next step! The interesting thing is that all the steps in the compare method are at the SAME LEVEL of ABSTRACTION now, unlike before, when one step was doing something lower-level and the other was working at a higher level of abstraction! The change happened to eliminate the duplication that we had in our previous version of the compare method as well, but that’s just a bonus, and it could just as well have happened the other way around!
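To make the refactored version concrete, here is a minimal, self-contained sketch. Only compare and read_all_lines mirror the code above; the constructor, the body of compare_lines and the write_temp_matrix helper are assumptions made up for illustration, since the original code elides them.

```python
import os
import tempfile


class MatrixFileComparer:
    """Sketch only: the constructor and compare_lines are assumed, not from the post."""

    def __init__(self, file1, file2):
        self.file1 = file1
        self.file2 = file2

    def compare(self):
        # Every step here is a delegation, so all of them sit at the same level.
        lines1 = self.read_all_lines(self.file1)
        lines2 = self.read_all_lines(self.file2)
        return self.compare_lines(lines1, lines2)

    def read_all_lines(self, file_name):
        # The lower-level detail (file handling) lives in its own method.
        with open(file_name) as f:
            return f.readlines()

    def compare_lines(self, lines1, lines2):
        # Hypothetical rule: two matrices are equal when every row holds the
        # same whitespace-separated values.
        rows1 = [line.split() for line in lines1]
        rows2 = [line.split() for line in lines2]
        return rows1 == rows2


def write_temp_matrix(text):
    # Demo helper: writes text to a temporary file and returns its path.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(text)
        return f.name


path_a = write_temp_matrix("1 2\n3 4\n")
path_b = write_temp_matrix("1  2\n3 4\n")  # same values, different spacing
path_c = write_temp_matrix("1 2\n3 5\n")   # one value differs

same = MatrixFileComparer(path_a, path_b).compare()
different = MatrixFileComparer(path_a, path_c).compare()

for path in (path_a, path_b, path_c):
    os.remove(path)
```

Notice that changing how a matrix row is parsed would only touch compare_lines, and changing how a file is read would only touch read_all_lines; compare itself stays a list of steps.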

IMHO it’s a really interesting principle and helps to improve the code further! The nice thing about it is that you don’t have to jump between different levels of abstraction and detail in your mind while reading the code inside a method! When a method all of a sudden goes into details and lower-level implementation tasks, it distracts you from understanding WHAT the method is doing in each step by showing you HOW it’s doing something in the middle of your higher-level view! That HOW is at a different level of abstraction and belongs in another method of the class!

Anyway, I should stop preaching now! I hope you find it interesting and helpful as well!

Happy Hacking!

Wishful Thinking & Test-Driven-Development

In the past few months I’ve been doing something a little bit different from the approach that I usually take while programming/developing software. If you’ve read the GREAT book SICP, you should be familiar with the term “Wishful Thinking”! (If you haven’t, I HIGHLY recommend reading it and also watching the talks by Professors Abelson and Sussman on these topics, which you can find here)!

There are tons of places where you can see/hear that term, but that book is probably one of the first that talks about this idea. It’s basically about not thinking about too many levels of abstraction at the same time and not jumping around in your head while solving a problem. That means you lay down the steps that you need to take to accomplish something, and you don’t think about HOW to do each of those steps at that very moment. It’s like a 1000-foot view of the task at hand. Once you have the map of the steps, once you know WHAT those steps are, you try to figure out HOW to attack each of them. Then you focus in! Think of your whole application as a map: what you do here is take one piece out and focus in on it in a few steps, first laying down the work that needs to be done to accomplish that task, and then figuring out HOW to do each of those steps! Hopefully the following picture will give you some idea.

[Figure: map_focus_bubble_better]

This sounds a lot like up-front design, and frankly, if the level of abstraction that you are dealing with is a whole application, or even something really big in your application, you ARE doing up-front design. Everybody knows that up-front design is not a good idea in most cases, and I’m not going to talk about that; thousands of people have said it way better than I ever can. But if we use this technique (Wishful Thinking) at the correct level of abstraction, or at the right point, it can be really helpful and powerful. Imagine we’re writing a Twitter client* application, and the feature we’re trying to implement (the task at hand) is getting the last tweet of all the people you are following (what a useful feature BTW)! The naïve (for blog purposes) implementation steps for this feature are going to be something like the following:

  • Get the list of people who the current user is following.
  • Iterate through each of those people.
  • Get the last tweet for them.
  • Pack those tweets and Tada!

That’s like a 1000-foot view of this specific feature. And if you look at those steps, they can be exactly the methods that we have in our code for this feature:

 def following_last_tweets
   following = get_following_of(current_user)
   last_tweets = []
   following.each do |person|
     last_tweets << get_last_tweet_for(person)
   end
   last_tweets
 end
 

And again, this is not up-front design or anything like that. We’re completely focused on a granular feature of the application at this point and have laid down the steps needed to implement it. So the level of granularity matters, and you should be careful not to fly too high! Now it’s time for the question: HOW are we going to attack each of those steps? If you try to answer that question completely right away, the resulting code probably won’t be very elegant. What do we do then? The thing that we usually do! We answer that question for each of those steps via TDD. Actually, we even comment out the code that we just wrote, because it does not work yet; it’s just a map of the steps that we need to take to accomplish that feature. Think of it as a TODO list in your code instead of having it in a separate text file or so!

The nice thing about this approach which is basically a combination of “Wishful Thinking” and “Test-Driven-Development”, is that we get some guidelines from our wishful thinking and then try to come up with a nice solution for each of those steps in a bottom-up and incremental fashion using our unit-tests and letting them DRIVE us toward that solution and a good design.
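Here is a rough sketch of where this combination can end up. It is written in Python rather than Ruby only to keep the added examples in one language. The function names follow the map above, but everything else is an assumption for illustration: the user is passed in explicitly instead of using current_user, and follow_graph and timelines are hypothetical in-memory dictionaries standing in for the real Twitter calls.

```python
def get_following_of(user, follow_graph):
    # Step 1 of the map. follow_graph is a hypothetical dict: user -> followed people.
    return follow_graph.get(user, [])


def get_last_tweet_for(person, timelines):
    # Step 2 of the map, assuming each timeline is ordered oldest first.
    tweets = timelines.get(person, [])
    return tweets[-1] if tweets else None


def following_last_tweets(user, follow_graph, timelines):
    # The map itself: it only names the steps and never says HOW they work.
    following = get_following_of(user, follow_graph)
    return [get_last_tweet_for(person, timelines) for person in following]


# In a TDD flow each of these tests would be written before the step it drives.
def test_get_following_of():
    assert get_following_of("me", {"me": ["alice", "bob"]}) == ["alice", "bob"]
    assert get_following_of("nobody", {}) == []


def test_get_last_tweet_for():
    assert get_last_tweet_for("alice", {"alice": ["a1", "a2"]}) == "a2"
    assert get_last_tweet_for("bob", {}) is None


def test_following_last_tweets():
    follow_graph = {"me": ["alice", "bob"]}
    timelines = {"alice": ["a1", "a2"], "bob": ["b1"]}
    assert following_last_tweets("me", follow_graph, timelines) == ["a2", "b1"]


for test in (test_get_following_of, test_get_last_tweet_for, test_following_last_tweets):
    test()
```

The top-level function still reads like the commented-out map; the tests only had to drive out the two small steps underneath it.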

We don’t do any over-design or over-engineering, and this is not against YAGNI either. If we are going through one of the steps and realize “this is not the way to go”, it’s just a matter of reverting a couple of minutes of work (and thank God for version control, which makes that as easy as it can get). Even if we realize that our map/guideline itself has problems, we haven’t gone too far with this approach, and we can easily think of new steps and start over. It costs only a few minutes, because our map wasn’t even functional; it was just a few lines of comments directing our overall approach for that specific piece of functionality.

That’s why it’s VERY important to use this technique, or this combination of techniques, at an appropriate level of abstraction and granularity in the application! Doing it at a VERY high level is just up-front design: it can cost you hours of effort that you then have to revert because of over-engineering/over-design and because of aspects you did not consider and only find later on, and you KNOW that HAPPENS! Doing it at a VERY low level provides (almost) no value, and you might as well take the complete bottom-up approach without any map or guideline, because your steps become too primitive (e.g. print a username or the like!!!).

It’s similar to Kent Beck’s rule of Single Level of Abstraction. We can only realize what that really means, and how to get it right, by practicing A LOT, watching the code and thinking about it deeply for a while, to see what’s going on at each point and whether there is a nice harmony between the lines of code in each method or module. For instance, if all you’re doing in a method is calling some other methods and all of a sudden there’s a very primitive line of code like a = 2; in there, there’s a good chance that line does NOT belong there!

I’ve been trying this approach for a while and it’s been serving me well. I ended up writing more interesting and cleaner and better-designed code because of that. Maybe a lot of people write their code exactly like this or maybe we’ve been always doing this but we just lay down the steps in our mind unconsciously instead of writing them down IN THE CODE and attacking them one by one!

Anyway, I think it’s a good time for me to stop preaching, I found this approach interesting and useful and I thought maybe it is useful for someone else as well!

Hope you find it interesting as well and happy hacking ☺

* The reason that I recently use this kind of application as an example, is because I’m writing one just for fun and it’s full of interesting points and examples. You can find the code for it here!

Nested Stubbing => Shouts for Refactoring

A lot of programmers write unit tests during the development and also a lot of programmers do Test Driven Development. One thing that we usually forget while programming is Listening to our Tests. If we listen carefully to our tests they will give us a lot of interesting hints and information and frankly that’s why a lot of people call it Test-Driven-Design because we can find useful points in our tests that will help us to have a better design.

I’m going to talk about one of those points that we can find out very easily by listening to our tests and will help us to have a better design and having a piece of information in its right place in our program. That certain point is Nested Stubbing which BTW happens a lot in our tests.

Let’s say we are writing a Twitter client app, and all the communication through the network, the OAuth-related parts, calling the Twitter API, fetching and storing tweets, etc. is done in separate modules (Separation of Concerns, Single Responsibility Principle, etc.).

We also have different kinds of presentations (views, if you will) for this application. One of them is a console-based presentation for looking at the tweets in a terminal. One part of this presentation takes the latest 10 tweets and renders them (in whatever way you like) for the user. Obviously the rendering-related code lives in its own place, separated from the logic, data storage, etc. Imagine we have a Console class which gets those latest tweets and gives them to the renderer object in order to show them to the user (for brevity I’ve removed some parts):

def show_latest_tweets
    last_ten = Tweet.order('created_at DESC').limit(10)
    renderer.render_tweets(last_ten)
end

and let’s say we have a test for this piece to make sure that this method is calling the right things:

it "fetches last 10 tweets and renders them" do
    latest_tweets = [t1, …, t10]
    ordered_tweets = double("ordered tweets")
    ordered_tweets.stub(:limit).with(10) { latest_tweets }
    Tweet.stub(:order).with('created_at DESC') { ordered_tweets }
    renderer.should_receive(:render_tweets).with(latest_tweets)
    console.show_latest_tweets
end

**

If you look at this code, there is an obvious violation of the Law of Demeter. We know that what is returned when we call order on Tweet has a limit method that we can call to limit the results. How can you easily detect this? BECAUSE WE’RE DOING NESTED STUBBING IN OUR TEST. We are setting up a nested stub because we set up ordered_tweets, which is a stub itself, as the result of calling order on Tweet. This tells us that at this level (the Console class) we know TOO MUCH about the inside of the Tweet class and its details, which we should NOT!

Right now we’re using something like ActiveRecord as the ORM for storage part of the application but what if we change that later to something else? There’s a good chance in that new ORM the mechanism for doing the same thing (getting the latest 10 tweets) will be different and we need to change our code appropriately. But with this code that we wrote here we have to change Console class for changing our storage mechanism which does not make any sense. Console SHOULD NOT know anything about the details of storage and storage should NOT be a REASON of change for Console.

We need a layer which hides this information from Console and gives it what it wants, instead of Console reaching for that information through method chains (Tell, Don’t Ask). As you can see, Tweet is like a model (if you will) in this application, and it is the one that should know about the storage mechanism in this app (how to be stored, retrieved, etc.). [Taking it even further, there can be a TweetRepository class which specifically handles storage-related stuff for Tweet, kind of like a Façade Pattern between Tweet and the DB/file/etc.]

We can add a method to Tweet class like the following:

def self.last_n_tweets(n)
    Tweet.order('created_at DESC').limit(n)
end

So now let’s rewrite our test based on this change:

it "fetches last 10 tweets and renders them" do
    latest_tweets = [t1, …, t10]
    Tweet.stub(:last_n_tweets).with(10) { latest_tweets }
    renderer.should_receive(:render_tweets).with(latest_tweets)
    console.show_latest_tweets
end

And now the code for it:

def show_latest_tweets
    last_ten = Tweet.last_n_tweets(10)
    renderer.render_tweets(last_ten)
end

First of all, we no longer have that nested stubbing in our test, but that’s not the main point. If you pay attention, you’ll see that we eliminated Console’s coupling to, and dependency on, the storage mechanism; it no longer has any knowledge about that part, which is a big advantage. If we decide to change the data storage part of this application, or change our ORM, that change will be hidden from Console and from any other client of this functionality (Tweet retrieval etc.)! Console will still call Tweet.last_n_tweets, and how that is implemented is none of its business.
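The same smell shows up in any mocking library. As a rough illustration, here is the before and after sketched with Python’s unittest.mock. The Console class below is a hypothetical stand-in for the Ruby one (it takes the tweet model and the renderer as constructor arguments, which is an assumption), and the tweets are placeholder strings.

```python
from unittest.mock import Mock


class Console:
    """Hypothetical Python stand-in for the Console class discussed above."""

    def __init__(self, tweet_model, renderer):
        self.tweet_model = tweet_model
        self.renderer = renderer

    def show_latest_tweets_chained(self):
        # Before: Console knows that order() returns something with limit().
        last_ten = self.tweet_model.order("created_at DESC").limit(10)
        self.renderer.render_tweets(last_ten)

    def show_latest_tweets(self):
        # After: Console asks for what it wants in a single call.
        last_ten = self.tweet_model.last_n_tweets(10)
        self.renderer.render_tweets(last_ten)


latest_tweets = ["t1", "t2", "t3"]  # placeholder data

# Nested stubbing: one stub (ordered_tweets) is returned by another (chained_model).
ordered_tweets = Mock()
ordered_tweets.limit.return_value = latest_tweets
chained_model = Mock()
chained_model.order.return_value = ordered_tweets
chained_renderer = Mock()

Console(chained_model, chained_renderer).show_latest_tweets_chained()
chained_model.order.assert_called_once_with("created_at DESC")
ordered_tweets.limit.assert_called_once_with(10)
chained_renderer.render_tweets.assert_called_once_with(latest_tweets)

# Flat stubbing: a single stub, and no storage details leak into the test.
flat_model = Mock()
flat_model.last_n_tweets.return_value = latest_tweets
flat_renderer = Mock()

Console(flat_model, flat_renderer).show_latest_tweets()
flat_model.last_n_tweets.assert_called_once_with(10)
flat_renderer.render_tweets.assert_called_once_with(latest_tweets)
```

The first test has to name order, the sort string and limit just to get a list of tweets; the second names only the one thing Console actually asks for.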

As you saw, listening to our tests can have interesting results and give nice design hints. Whenever you feel that a test is more work than it should be, or it doesn’t seem right, or it’s hard to write the test for the target piece of code, that means there’s a design problem in the code (MOST OF THE TIME). So we should fix the problem in the right place, not struggle to make that test happen anyway. It’s time to stop preaching.

Hope that helps and happy hacking.

** This test contains more than one assertion which is usually not a good idea (there are exceptions depending on the situation like anything else in programming)!  I wrote those here to show how things are supposed to work in that piece of code!

Clean is not only for the code or tests!

TL; DR
Branching -> Cleaning-up commits -> Merging -> Pulling -> Pushing;

Working in a clean and neat environment is not ONLY for the production code and the tests of the software; it also applies to other aspects of the software development process. One of these is the version control that you’re using! How clean are your history/branches/check-ins/commits/etc.? It’s very important to keep the environment that we’re working in SUPER clean, because it makes the development process much smoother, faster and more efficient. Always remember that the only way to go fast is to go clean! And clean is not only for code!

I use Git for version control most of the time (I believe that at this moment EVERYONE is using some sort of version control software, and if someone isn’t, then God help them). After different kinds of experiments and ways of working with it (both as a solo developer and as a team member), I found a clean, neat, smooth and headache-less (that’s not even a word) workflow with Git. I’m going to share it with you here, so hopefully someone finds it useful. There are hundreds of articles and blog posts on how to work with Git in a better way; this is only my personal preferred way of working, which happens to work very well for me, and it may well work for someone else too.

Here’s how it works.

Imagine we’re working on a software program and we have 2 remote branches for it! One is “master” (which we want to be as clean as possible) and the other is “foo-feature”, which we branched off master in order to implement feature “foo” (whatever that is)!

Obviously we have local branches corresponding to these two remote ones on our own machine, and we work locally on whichever branch we’re on at the moment. Now if I want to work on something related to the “foo” feature, this is my workflow for doing it:

First I pull the latest changes from the remote foo-feature branch to my local foo-feature branch:

git pull --rebase origin foo-feature

I’m pretty sure a lot of people are against using --rebase, for a lot of reasons which I’m not gonna mention here. The reason that I like using it most of the time is that it lets me resolve merge conflicts one commit at a time and continue the process, and at the end I won’t have the extra merge commit that Git otherwise generates when you pull and the branches have diverged. It also puts my latest changes on top of the latest changes in the remote branch’s history.

Anyway, then I create a new local branch on my machine from the latest version of foo-feature (I’d rather not mess with the local foo-feature branch during my experiments, and keep it clean). So here’s how it’s gonna work (imagine I want to do some refactoring on the code):

git checkout -b foo-feature-refactoring

After that, I start making my changes and commit as often as I need during this refactoring. When I get a log of my foo-feature-refactoring branch, imagine this is the result:

git log --oneline
e329shf commit message 1
e329shf commit message 2
e329shf commit message 3
e329shf commit message 4

At this point I’m done with the refactoring and I want to merge it back into the foo-feature branch and push it to the remote foo-feature branch so other people on the team can see it as well! But since I was experimenting a lot during this refactoring and made some mistakes, I ended up with some commits that I don’t want in the clean history of foo-feature! I’m sure this happens to everyone during development. (Some of these commits provide no value by being in the history; they’re just a bunch of noise there.)

So I try to turn all these commits into a few nice and meaningful commits that make sense and provide some value in the foo-feature branch. Beautiful Git lets me do it like the following:

git rebase -i HEAD~4

This will give me the last 4 commits that I made! It will open an interactive environment (an editor) listing my last 4 commits (which I showed above as the result of git log --oneline), and I see something like the following:

pick e329shf commit message 1
pick e329shf commit message 2
pick e329shf commit message 3
pick e329shf commit message 4
# Commands:
# p, pick = …
# r, reword = …
# …
# s, squash = use commit, but meld into previous commit
# …

As you can see, it gives the list of commits in order (the oldest at the top) and a set of commands with which we can manipulate them!
The one that I’m interested in for this use case is “s, squash”, which will meld the commit into the previous one! I have refactored the code in my local branch named foo-feature-refactoring and I want to have only 1 commit with a clear message about what I did, so I’ll edit the text that Git generated for me to the following:

pick e329shf commit message 1
squash e329shf commit message 2
squash e329shf commit message 3
squash e329shf commit message 4
# …

After doing that, all 4 commits will be melded into one commit, and Git will give me another editor with the following structure so I can write the commit message for the whole thing:

# This is a combination of 4 commits.
# The first commit's message is:
commit message 1

# This is the 2nd commit message:
commit message 2

# This is the 3rd commit message:
commit message 3

# This is the 4th commit message:
commit message 4

--> Refactored the FooFeature class, got rid of some duplication (DRY).
# Please enter the commit message for your changes. …
# …

Now after save/quit of this editor I have only one commit with the message “Refactored the FooFeature class, got rid of some duplication (DRY).” which is succinct and clear. I got rid of all those noisy commits (message 1, message 2, etc.)

Now I need to merge this thing back to my local foo-feature branch which is a 2 step process:

git checkout foo-feature # switched to foo-feature branch
git merge foo-feature-refactoring # merged those 2 branches

Now I can delete the foo-feature-refactoring or if I need it, I’ll keep it there in the collection of local branches! (depends on the scenario obviously)

Now I need to push this change to the remote branch origin/foo-feature so others can see what’s going on! I just do a pull first (after I committed anything I have in my working tree of course):

git pull --rebase origin foo-feature # and resolve conflicts, if there are any

Then I push my local changes to the remote branch:

git push origin foo-feature

Now if someone else pulls the origin/foo-feature they will see my commit:

e329shf Refactored the FooFeature class, got rid of some duplication (DRY).

Instead of 4 commits with confusing messages which are just messing up the history and log of the repository!

Also for having better messages in the commits I highly recommend reading these great points by Tim Pope on his blog here!

Maybe a lot of you are already working this way! I just wanted to share this approach with you since I found it super useful and clean during the development process, ESPECIALLY in a team.

Worth a try! Hope that helps.

* Special thanks to Phil Corliss for telling me some interesting points about having a nice workflow in Git and its benefits over time!

* If you haven’t already, definitely read this part of “git” book/documentation on Rebasing and why it’s powerful & when you should not use it!