21

I've seen an increasing trend in the programming world saying that it is good practice to separate code blocks into their own functions. Obviously, if that code block is reusable, you should do that. What I do not understand is this trend of using a function call as essentially a comment that you hide your code behind if the code is not reusable. That's what code folding is for.

Personally, I also hate reading code like this because it feels like it has the same problem as the GOTO statement - it becomes spaghetti code where, if I'm trying to follow the program's flow, I'm constantly jumping around and can't logically follow the code. It is much easier for me to follow code that is linear but has a single comment over each section of code labeling what it does. With code folding, this is essentially the exact same thing, except the code stays in a nice linear fashion. When I try to explain this to my colleagues, they say comments are evil and clutter - how is a comment on top of a block of folded code any different from a function call that will never get called more than once? How is overusing functions different from overusing comments? How is frequent use of functions different from the problems with GOTO statements? Can someone please explain the value of this programming paradigm to me?

dallin
  • 412

11 Answers

31

Code organization is all about displaying enough information to convey a single idea. The sweet spot is getting your code pared down enough that a single idea can fit in a single unit of code. Your unit of code can be a function, a class, etc. These are merely tools of organization. As with any tool, they can be overused or used incorrectly.

Having a one line function makes no sense unless the function conveys a meaningful idea. Having a large imperative function that conveys many ideas is hard to digest and reuse. It's all about striking the right balance, and even that is subjective.

mortalapeman
  • 1,613
  • 4
    Indeed. And on the other side, if your lengthy method cannot be broken down into distinct ideas I suggest that you have a poor design or understanding of the requirements. – Telastyn Sep 04 '13 at 02:57
  • Beautiful answer. Really gets to the heart of the issue. What I'm still trying to grasp, though, is whether, with small blocks of non-reusable code, there is really some sort of serious advantage to separating it into its own function as opposed to separating it by putting a comment above it and collapsing it. Or is this one of those things where in reality, despite strong opinions, it's just a personal opinion as long as you are breaking down your code to convey singular ideas? – dallin Sep 04 '13 at 03:01
  • Having smaller units of code means they are easier to quickly read, mentally process, test, and refactor. If you can easily hold the entire idea that a block of code conveys in your head, you don't have to ask yourself, "what if that logical test 10 lines ago was false?" – mortalapeman Sep 04 '13 at 04:18
  • 2
    Of course, having smaller functions does little good for your mental capacity if you are modifying state outside the scope of your functions. Then you have to hold the ENTIRE program in your head. – mortalapeman Sep 04 '13 at 04:24
  • @dallin: comments add to cognitive load, because when using them, you have to maintain 2 things: your code and your comments. This is also one of the reasons why comments have a tendency to become stale. People often change code, but don't update the comments. – Stefan Billiet Sep 04 '13 at 10:31
  • @dallin Folding the code is a per-developer/per-editor/per-IDE thing - it doesn't get committed into version control along with the code. As soon as another developer checks it out, they won't have the same folds as you. And as soon as you update with their changes, your folds are now in the wrong place. – Izkata Sep 04 '13 at 11:37
  • @StefanBilliet I guess my experience says a million tiny functions add to the cognitive load. When a project gets huge, I end up struggling to know where to look and to remember where I put the code piece I need. I literally have to run through a million functions to find the code piece I need to edit. No matter how organized I make it, it becomes a case of "now where did I put that?" What I would really like to see is an actual codebase that's programmed in this fashion that I could look at and say, "yup, now I see how this works". – dallin Sep 04 '13 at 19:21
  • @StefanBilliet - Inappropriate use of comments does that. But appropriate use of comments does not. – Stephen C Sep 05 '13 at 04:04
  • @dallin: neither comments nor short methods will help in that case; I think you have a larger issue with how that code base is structured and organized. – Stefan Billiet Sep 05 '13 at 06:43
  • @Stephen C: in my opinion, the only valid reason for comments is to explain some sort of technical step (e.g. why you're disposing an object in one place and not another). If you want to explain the normal flow of code, you should let the code itself do the talking. – Stefan Billiet Sep 05 '13 at 06:44
21

I've seen an increasing trend in the programming world saying that it is good practice to separate code blocks into their own functions.

I wouldn't have called this an "increasing trend". I was taught that splitting overly large methods into smaller methods improved readability ... ummm ... nearly 40 years ago. And I was taught the design-time equivalent ... functional decomposition.

What I do not understand is this trend of using a function call as essentially a comment that you hide your code behind if the code is not reusable. That's what code folding is for.

No. Functional decomposition is not primarily about creating reusable components / functions / methods / whatever. What it is actually about is making the codebase easier to understand by reducing it to "bite sized chunks".

IMO, you can't achieve the same thing with IDE code folding. Code folding typically does not take account of the effects of folded code on other code. For example:

    int a = 1;
    if (something()) {
        a = a + 1;
    }
    print(a);

If the IDE decided to fold the body of the if statement, then the programmer is liable not to notice that a may change from its initial value. If it was up to the programmer to decide what to fold, then they have to understand the code in order to decide ... which makes the process circular.

By contrast, if the code was written like this:

    int a = 1;
    a = someMethod(a);
    print(a);

where

    int someMethod(int a) {
        return something() ? a + 1 : a;
    }

you don't have the same "surprise".

(Obviously this is a highly unrealistic example ... but it illustrates the problem with code folding.)
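For anyone who wants to run the contrast, here is a minimal Python sketch of the same idea. The names `something`, `some_method`, `inline_version` and `extracted_version` are hypothetical, chosen only for illustration; `something` is a stand-in that always returns True.

```python
def something() -> bool:
    """Hypothetical stand-in for the condition in the example above."""
    return True


def some_method(a: int) -> int:
    """Return the possibly incremented value instead of mutating it in a nested block."""
    return a + 1 if something() else a


def inline_version() -> int:
    a = 1
    if something():  # a folded editor view could hide this body
        a = a + 1
    return a


def extracted_version() -> int:
    a = 1
    a = some_method(a)  # the reassignment of a stays visible at the call site
    return a


print(inline_version(), extracted_version())
```

Both versions compute the same value. The difference is only in what a reader sees at the top level: in the extracted version the assignment to `a` cannot be folded away.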

Stephen C
  • 25,178
  • I think you misconstrued my question to mean that I don't favor writing short functions or functional decomposition. If you were to read some of my code, you'd see it's rare for me to have a single function that you can't view on a single page. What I'm talking about is the more extreme view of breaking every small logical block of code (3 to 6 lines) out into a separate function so that many of your functions read as a line of function calls. – dallin Sep 04 '13 at 19:34
  • My experience with this is it makes larger programs a mess. It's hard to follow program flow & organize large codebases with so many functions, & as the codebase grows, I struggle to remember where things are & where I need to go to edit them. As a result, I follow program flow to find the code snippet I need to edit, which is a disorienting experience jumping from one place to another. It's also a pain constantly creating new functions and organizing where to put them. This is my personal real world experience. Do you have an example of a codebase that does this well I could see? – dallin Sep 04 '13 at 19:44
  • If that is what you are talking about, I have not observed this "trend". Do you have some specific examples of publicly viewable code that illustrates this? (And I don't mean individual functions, methods, classes ... I'm talking about large scale examples of this "trend".) – Stephen C Sep 05 '13 at 04:02
10

I don't see the need to read every single line of code to understand what a program does. If the functions are named appropriately, why even look at the contents unless it does not give you the result you expect?

Another advantage of writing smaller functions is unit testing. Write a small function that passes its test(s) and forget about the details. The key to debugging and troubleshooting is to review and change as few lines of code as possible.
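As a rough illustration of the testing point, here is a small Python sketch. `normalize_whitespace` is a hypothetical helper, imagined as a few lines lifted out of a larger function; the test is a plain function with assertions, which a runner such as pytest would pick up, but it can also be called directly.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return " ".join(text.split())


def test_normalize_whitespace() -> None:
    # Each assertion pins down one behaviour of the helper in isolation.
    assert normalize_whitespace("  a   b \n c ") == "a b c"
    assert normalize_whitespace("") == ""
    assert normalize_whitespace("single") == "single"


test_normalize_whitespace()
```

Once the helper passes, a caller can treat it as a black box. Had the same lines stayed inline in a long function, the only way to exercise them would be to run the whole function.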

The GOTO analogy may explain the "jump to" part of your problem, but you know exactly where a function is going to return. That is a huge difference.

It's very easy to get accustomed to a lot of things in programming. Designing with a GUI is no help when you eventually have to look at the underlying code and are completely lost.

JeffO
  • 36,816
  • Your goto explanation and unit tests make good points. As for why I would look at every function, I don't all the time, but there are times when I do, like when I'm trying to run through execution flow to grasp an entire code base I'm unfamiliar with. I have had some nightmares following other people's code in these circumstances, jumping from location to location, unable to picture how the whole program works together. I still think a collapsed block of code with a single comment over it is a better alternative, but I'm willing to be convinced otherwise. – dallin Sep 04 '13 at 02:53
  • 1
    +1 for the first sentence alone - naming things naturally and having good granularity make code easy to read. – Daniel B Sep 04 '13 at 10:25
7

Writing short functions has a few direct benefits and a lot of indirect benefits, as explained at length in Robert C. Martin's book Clean Code. From memory:

  • Improves readability, because short functions are easier to read at a glance
  • Makes it easier to reuse those smaller functions later (I know you eliminated this possibility as a presupposition, but...)

The indirect benefits are legion, and can be rephrased as "what you don't get if you are unable to refactor a large function into smaller functions":

  • Minimises side effects (if you're in a long function, it's entirely possible for a variable to be modified hundreds of lines later - very difficult to track in your head)
  • Makes it obvious which variables are used, and for what - input, output - which makes it easy to track dependencies
  • By using meaningful function names, it encourages clear code (if you can't summarise what a function does in a few words...)
  • Hopefully, when you've refactored your big function into small functions that do very little, you will notice common patterns, so that functions you thought couldn't be reused can be, or can at least be replaced with generic algorithms.
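The last point can be sketched in Python. Both function names below are hypothetical: the first imagines a hand-written loop as it might look right after being extracted from a larger function, and the second shows the same result obtained from the built-in `max` with a `key` function.

```python
def longest_word_loop(words):
    """Helper as first extracted: a hand-written loop lifted out of a larger function."""
    longest = words[0]
    for word in words[1:]:
        if len(word) > len(longest):  # strict comparison keeps the first of equal lengths
            longest = word
    return longest


def longest_word(words):
    """Same result using the generic built-in: max returns the first maximal item."""
    return max(words, key=len)


sample = ["a", "ccc", "bb", "ddd"]
print(longest_word_loop(sample), longest_word(sample))
```

While the loop sat in the middle of a long function it was hard to see that it was only a "maximum by some key". Giving it a name and a signature is what made the pattern recognisable.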

These reasons are well-understood and you can read all about them through the books or related answers. You are on to something with the second paragraph though, which I'll highlight here:

Short functions are great, but our IDEs are designed to make them harder!

You hinted at this with your mention of code folding. Why is it that we can do this:

    SomeLongFunction(doohickey) {
        [+] /// Frobnicate the doohickey
        foreach (d : doohickeys) {
            [+] /// Reticulate the doohickey
        }
    }

... and expand to see the actual code, in-line, by unfolding the [+]s, but we can't have this:

    SomeLongFunction(doohickey) {
        splines = FrobnicateTheDoohickey(doohickey);
        foreach (d : doohickeys) {
            ReticulateTheDoohickey(d);
        }
    }

... and "expand" the function calls? We could ask the IDE to take us to those function definitions, but depending on where they are, we could be taken to a new class in a new file, or a new location in the same file, and lose the calling context! Sure, there are tools that let you easily jump between contexts, but why can't we view all the code in-line, like we would have if we used code folding? This is entirely backwards; IDEs should help us write clean code by making things easiest when we do, rather than encouraging bad code with features such as code folding.

Here are the advantages of being able to view code in-line (whether that's folding or calling other functions):

  • Gives me the choice of viewing the code at any granularity - I can traverse it BFS, DFS, or any combination in between.
  • With the code in one context, I can easily find code that is executed earlier, or later, by (gasp) scrolling up or down
  • By expanding certain folded sections (again, at my choice) I have a better sense of the complexity of the code, so it's easier to grok it. Being limited to viewing one function at a time hinders this by hiding the complexity.

Another way of looking at it is that it's analogous to having larger or multiple monitors - you are able to see more at a time, so you can understand better, even though it's often not "necessary", because good code should have short lines and short functions. By the way, multiple monitors have been shown to improve productivity. By a lot.

So why don't our editors or IDEs allow for this kind of in-line function call expansion? (If there is one, please, PLEASE let me know!) I have no idea. It's clear to me that short functions have tremendous benefits. It's also clear to me that being able to view lots of code together is beneficial. Why can't we have both?

congusbongus
  • 1,354
5

First, as @Stephen C said, this isn't a new idea. It's old, and the reason it's been around so long is that it works.

Secondly, the reason it works is the same reason functional programming is becoming a bit of a hot topic -- it makes the program easier to reason about. It may make it harder to physically trace (sounds like an opportunity for a tool), but it makes the individual pieces easier to grok.

Take a function that takes a dozen parameters and returns a boolean -- right off the bat, you know that it will return one of two values. You don't need to trace it out and see that it calls sixteen methods on 7 objects plus 4 utility functions; all of that is irrelevant, as it returns either true or false.

Now, you may be asking how return values correspond to breaking up functions, but really they are the same thing -- a function that takes a dozen parameters and returns a boolean is just another way of saying that you have a series of calculations that culminate in a single result that is then used elsewhere. A void function is simply a series of operations that logically hang together and can be treated as one.

Breaking up a large function, even if none of the pieces are reusable, means that you can hold the pieces in mind better at the appropriate level of abstraction. A function named F which consists of nothing but 10 calls to F1, F2 and so forth is hard to reason about only while you are trying to figure out what F does -- which a good name would reduce to a time best measured in fractions of a second. But even a bad naming convention will make it easier to ignore the rest of the program while you focus on the step that follows 4 and precedes 6. And being broken down into steps, either the state of an object is being changed, globals are being abused, or any information that F1 wants to provide to F10 is returned and then kept in a local variable... In all cases it's going to be easier to keep track of where things are changed in smaller parts.
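A minimal Python sketch of that shape, assuming a made-up task of averaging sensor readings. All four function names and the `limit` parameter are hypothetical. The point is only that whatever one step wants to hand to a later step is returned and held in a local variable of the top-level function.

```python
def read_readings(raw: str) -> list:
    """Step 1: parse comma-separated text into numbers, skipping empty parts."""
    return [float(part) for part in raw.split(",") if part.strip()]


def drop_outliers(values: list, limit: float) -> list:
    """Step 2: keep only readings whose magnitude is within the limit."""
    return [v for v in values if abs(v) <= limit]


def average(values: list) -> float:
    """Step 3: arithmetic mean, or 0.0 when nothing is left."""
    return sum(values) / len(values) if values else 0.0


def summarize(raw: str, limit: float = 100.0) -> float:
    """The 'F' of the text: nothing but calls, with data passed along in locals."""
    readings = read_readings(raw)
    kept = drop_outliers(readings, limit)
    return average(kept)


print(summarize("1, 2, 3, 1000"))
```

When reading `drop_outliers`, nothing outside its two parameters matters, and in `summarize` the three locals show exactly what flows from one step to the next.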

jmoreno
  • 10,853
  • 1
  • 31
  • 48