Why are we supposed to use short functions to sectionalize our code?

Question

I've seen an increasing trend in the programming world saying that it is good practice to separate code blocks into their own functions. Obviously, if that code block is reusable, you should do that. What I do not understand is this trend of using a function call as essentially a comment that you hide your code behind if the code is not reusable. That's what code folding is for.

Personally, I also hate reading code like this because it feels like it has the same problem as the GOTO statement - it becomes spaghetti code where, if I'm trying to follow the program's flow, I'm constantly jumping around and can't logically follow the code. It is much easier for me to follow code that is linear but has a single comment over each section of code labeling what it does. With code folding, this is essentially the same exact thing, except the code stays in a nice linear fashion. When I try to explain this to my colleagues, they say comments are evil and clutter - how is a comment on top of a block of folded code any different from a function call that will never get called more than once? How is overusing functions different from overusing comments? How is frequent use of functions different from the problems with GOTO statements? Can someone please explain the value of this programming paradigm to me?

It depends on the language, in Haskell I quite regularly create tiny functions in a where clause, they never get called more than 3 times, but I find this very readable. In Haskell though, I feel guilty if a function is longer than 10 lines — daniel gratzer, Sep 04 '13 at 01:23
Related: Is it OK to split long functions and methods into smaller ones even though they won't be called by anything else?, Should I extract specific functionality into a function and why?, and One-line functions that are called only once — , Sep 04 '13 at 01:33
Should be more famously last words: "If I break every last little thing into its smallest component parts, this will be the most re-usable and modular codebase ever!" — Erik Reppen, Sep 04 '13 at 03:05
"...I've seen an increasing trend in the programming world..." - post three links (or more). — Den, Sep 04 '13 at 09:03
@ErikReppen - if it's very flat then I'd take it over a monolithic spaghetti every time. Also Ctrl+[Refactor]+[Inline] — Den, Sep 04 '13 at 09:04
I am dealing with legacy code that is in an ancient IDE. I don't have any form of line folding. To make it more fun the previous developer thought that having a 7000 line single function state machine was a good idea. But that's not the best of it, because the function was so large he cut parts of it out and #includes it directly into the code. If I could ever hunt him down I hate to think what I would do. I wouldn't say comments are evil or clutter either, once again don't overuse them but having none can be worse. — Firedragon, Sep 04 '13 at 10:33
I wonder why nobody mentions 'Literate Programming' (https://en.wikipedia.org/wiki/Literate_programming) - it's a perfect example of limiting the code fragment to "one idea", without necessarily using functions/methods as the delimiting scope. — topskip, Sep 04 '13 at 11:20
Also related: http://programmers.stackexchange.com/questions/160787/hiding-away-complexity-with-sub-functions — Tomás, Sep 04 '13 at 12:56

mortalapeman · Accepted Answer · 2013-09-04T01:30:50.063

31

Code organization is all about displaying enough information to convey a single idea. The sweet spot is getting your code pared down enough that a single idea can fit in a single unit of code. Your unit of code can be a function, a class, etc. These are merely tools of organization. As with any tool, they can be overused or used incorrectly.

Having a one line function makes no sense unless the function conveys a meaningful idea. Having a large imperative function that conveys many ideas is hard to digest and reuse. It's all about striking the right balance, and even that is subjective.

edited Sep 04 '13 at 01:30

answered Sep 04 '13 at 01:24

mortalapeman

Indeed. And on the other side, if your lengthy method cannot be broken down into distinct ideas I suggest that you have a poor design or understanding of the requirements. – Telastyn Sep 04 '13 at 02:57
Beautiful answer. Really gets to the heart of the issue. What I'm still trying to grasp, though, is whether with small blocks of non-reusable code there is really some sort of serious advantage to separating it into its own function as opposed to putting a comment above it and collapsing it. Or is this one of those things where in reality, despite strong opinions, it's just a personal opinion as long as you are breaking down your code to convey singular ideas? – dallin Sep 04 '13 at 03:01
Having smaller units of code means they are easier to quickly read, mentally process, test, and refactor. If you can easily hold the entire idea that a block of code conveys in your head, you don't have to ask yourself, "what if that logical test 10 lines ago was false?" – mortalapeman Sep 04 '13 at 04:18
Of course, having smaller functions does little good for your mental capacity if you are modifying state outside the scope of your functions. Then you have to hold the ENTIRE program in your head. – mortalapeman Sep 04 '13 at 04:24
@dallin: comments add to cognitive load, because when using them, you have to maintain 2 things: your code and your comments. This is also one of the reasons why comments have a tendency to become stale. People often change code, but don't update the comments. – Stefan Billiet Sep 04 '13 at 10:31
@dallin Folding the code is a per-developer/per-editor/per-IDE thing - it doesn't get committed into version control along with the code. As soon as another developer checks it out, they won't have the same folds as you. And as soon as you update with their changes, your folds are now in the wrong place. – Izkata Sep 04 '13 at 11:37
@StefanBilliet I guess my experience says a million tiny functions add to the cognitive load. When a project gets huge, I end up struggling to know where to look and remembering where I put the code piece I need. I literally have to run through a million functions to find the code piece I need to edit. No matter how organized I make it, it becomes a case of "now where did I put that?" What I would really like to see is an actual codebase that's programmed in this fashion that I could look at and say, "yup, now I see how this works". – dallin Sep 04 '13 at 19:21
@StefanBilliet - Inappropriate use of comments does that. But appropriate use of comments does not. – Stephen C Sep 05 '13 at 04:04
@dallin: neither comments nor short methods will help in that case; I think you have a larger issue with how that code base is structured and organized. – Stefan Billiet Sep 05 '13 at 06:43
@Stephen C: in my opinion, the only valid reason for comments is to explain some sort of technical step (e.g. why you're disposing an object in one place and not another). If you want to explain the normal flow of code, you should let the code itself do the talking. – Stefan Billiet Sep 05 '13 at 06:44

score 21 · Answer 2 · answered Sep 04 '13 at 04:37

I've seen an increasing trend in the programming world saying that it is good practice to separate code blocks into their own functions.

I wouldn't have called this an "increasing trend". I was taught that splitting overly large methods into smaller methods improved readability ... ummm ... nearly 40 years ago. And I was taught the design-time equivalent ... functional decomposition.

What I do not understand is this trend of using a function call as essentially a comment that you hide your code behind if the code is not reusable. That's what code folding is for.

No. Functional decomposition is not primarily about creating reusable components / functions / methods / whatever. What it is actually about is making the codebase easier to understand by reducing it to "bite-sized chunks".

IMO, you can't achieve the same thing with IDE code folding. Code folding typically does not take account of the effects of folded code on other code. For example:

    int a = 1;
    if (something()) {
        a = a + 1;
    }
    print(a);

If the IDE decided to fold the body of the if statement then the programmer is liable to not notice that a may change from its initial value. If it was up to the programmer to decide what to fold, then he / she has to understand the code in order to decide ... which makes the process circular.

By contrast, if the code was written like this:

    int a = 1;
    a = someMethod(a);
    print(a);

where

    function someMethod(a):
        return something() ? a + 1 : a;

you don't have the same "surprise".

(Obviously this is a highly unrealistic example ... but it illustrates the problem with code folding.)

I think you misconstrued my question to think I meant I do not prefer and write short functions and functional decomposition. If you were to read some of my code, you'll see it's rare for me to have a single function that you can't view on a single page. What I'm talking about is the more extreme view of breaking every small logical block of code (3 to 6 lines) out into a separate function so that many of your functions read as a line of function calls. — dallin, Sep 04 '13 at 19:34
My experience with this is it makes larger programs a mess. It's hard to follow program flow & organize large codebases with so many functions, & as the codebase grows, I struggle to remember where things are & where I need to go to edit them. As a result, I follow program flow to find the code snippet I need to edit, which is a disorienting experience jumping from one place to another. It's also a pain constantly creating new functions and organizing where to put them. This is my personal real world experience. Do you have an example of a codebase that does this well I could see? — dallin, Sep 04 '13 at 19:44
If that is what you are talking about, I have not observed this "trend". Do you have some specific examples of publicly viewable code that illustrates this? (And I don't mean individual functions, methods, classes ... I'm talking about large scale examples of this "trend".) — Stephen C, Sep 05 '13 at 04:02

score 10 · Answer 3 · answered Sep 04 '13 at 02:45

I don't see the need to read every single line of code to understand what a program does. If the functions are named appropriately, why even look at the contents unless it does not give you the result you expect?

Another advantage of writing smaller functions is for unit testing. Write a small function that passes its test(s) and forget about the details. The key to debugging and troubleshooting is to review and change as few lines of code as possible.
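As a rough illustration of the testing point, here is a minimal Python sketch. The names `normalize_email` and `register_user` are made up for the example and do not come from any real codebase. Pulling even a one-line step into its own function lets a test pin that step down in isolation:

```python
def normalize_email(raw):
    """Trim surrounding whitespace and lower-case an email address."""
    return raw.strip().lower()


def register_user(raw_email, registry):
    """Add the address to `registry` unless its normalized form is already there.

    Returns True if the user was added, False if it was a duplicate.
    """
    email = normalize_email(raw_email)
    if email in registry:
        return False
    registry.add(email)
    return True
```

Once `normalize_email` has its own passing test, a bug report about duplicate registrations only requires re-reading the few lines of `register_user`.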

The GOTO analogy may explain the "jump to" part of your problem, but you know exactly where a function is going to return. That is a huge difference.

It's very easy to get accustomed to a lot of things in programming. Designing with a GUI is no help when you eventually have to look at the underlying code and are completely lost.

JeffO

Your goto explanation and unit tests make good points. As for why I would look at every function, I don't all the time, but there are times when I do, like when I'm trying to run through execution flow to grasp an entire code base I'm unfamiliar with. I have had some nightmares following other people's code in these circumstances, jumping from location to location, unable to picture how the whole program works together. I still think a collapsed block of code with a single comment over it is a better alternative, but I'm willing to be convinced otherwise. – dallin Sep 04 '13 at 02:53
+1 for the first sentence alone - naming things naturally and having good granularity make code easy to read. – Daniel B Sep 04 '13 at 10:25

score 7 · Answer 4 · answered Sep 04 '13 at 08:53

Writing short functions has a few direct benefits and a lot of indirect benefits, as explained at length in Robert C. Martin's book Clean Code. From memory:

Improves readability, because short functions are easier to read at a glance
Makes it easier to reuse those smaller functions later (I know you eliminated this possibility as a presupposition, but...)

The indirect benefits are legion, and can be rephrased as "what you don't get if you are unable to refactor a large function into smaller functions":

Minimises side effects (if you're in a long function, it's entirely possible for a variable to be modified hundreds of lines later - very difficult to track in your head)
Makes it obvious which variables are used, and for what - input, output - which makes it easy to track dependencies
By using meaningful function names, it encourages clear code (if you can't summarise what a function does in a few words...)
Hopefully, when you've refactored your big function into small functions that do very little, you will notice common patterns, such that those functions that you thought couldn't be reused, can. Or at least replaced with generic algorithms.
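The points about side effects and visible inputs/outputs can be sketched in a few lines of Python. This is only an illustration under assumed names (`invoice_total_inline`, `subtotal`, `add_tax` are hypothetical), using integer cents to keep the arithmetic exact:

```python
def invoice_total_inline(prices_cents, tax_percent):
    """Long-function style: `total` is reassigned far from where it is first set."""
    total = 0
    for price in prices_cents:
        total += price
    # ... imagine a hundred unrelated lines here ...
    total = total + total * tax_percent // 100  # easy to overlook when skimming
    return total


def subtotal(prices_cents):
    """Input: a list of prices. Output: their sum. Nothing else is touched."""
    return sum(prices_cents)


def add_tax(amount_cents, tax_percent):
    """Input: an amount and a rate. Output: the amount including tax."""
    return amount_cents + amount_cents * tax_percent // 100


def invoice_total(prices_cents, tax_percent):
    """Decomposed style: each intermediate value is assigned exactly once."""
    before_tax = subtotal(prices_cents)
    return add_tax(before_tax, tax_percent)
```

Both versions compute the same number. In the second, the signature of each helper states what it depends on and what it produces, so there is no variable whose value can change out of sight.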

These reasons are well-understood and you can read all about them through the books or related answers. You are on to something with the second paragraph though, which I'll highlight here:

Short functions are great, but our IDEs are designed to make them harder!

You hinted at this with your mention of code folding. Why is it that we can do this:

SomeLongFunction(doohickey) {
    [+] /// Frobnicate the doohickey
    foreach (d : doohickeys) {
        [+] /// Reticulate the doohickey
    }
}

... and expand to see the actual code, in-line, by unfolding the [+]s, but we can't have this:

SomeLongFunction(doohickey) {
    splines = FrobnicateTheDoohickey(doohickey);
    foreach (d : doohickeys) {
        ReticulateTheDoohickey(d);
    }
}

... and "expand" the function calls? We could ask the IDE to take us to those function definitions, but depending on where they are, we could be taken to a new class in a new file, or a new location in the same file, and lose the calling context! Sure, there are tools that let you easily jump between contexts, but why can't we view all the code in-line, like we would have if we used code folding? This is entirely backwards; IDEs should help us write clean code by making things easiest when we do, and not encourage us to write bad code with features such as code folding.

Here are the advantages of being able to view code in-line (whether that's folding or calling other functions):

Gives me the choice of viewing the code at any granularity - I can traverse it BFS, DFS, or any combination in between.
With the code in one context, I can easily find code that is executed earlier, or later, by (gasp) scrolling up or down
By expanding certain folded sections (again, at my choice) I have a better sense of the complexity of the code, so it's easier to grok it. Being limited to viewing one function at a time hinders this by hiding the complexity.

Another way of looking at it is that it's analogous to having larger or multiple monitors - you are able to see more at a time, so you can understand better, even though it's often not "necessary", because good code should have short lines and short functions. By the way, multiple monitors have been proven to improve productivity. By a lot.

So why don't our editors or IDEs allow for this kind of in-line function call expansion? (If there is one, please, PLEASE let me know!) I have no idea. It's clear to me that short functions have tremendous benefits. It's also clear to me that being able to view lots of code together is beneficial. Why can't we have both?

VS2013 does something like this, called Peek Definition. – jk. Sep 04 '13 at 12:37

score 5 · Answer 5 · answered Sep 04 '13 at 07:08

First, as @Stephen C said, this isn't a new idea. It's old, and the reason it's been around so long is that it works.

Secondly, the reason it works is the same reason functional programming is becoming a bit of a hot topic -- it makes the program easier to reason about. It may make it harder to physically trace (sounds like an opportunity for a tool), but it makes the individual pieces easier to grok.

Take a function that takes a dozen parameters and returns a boolean -- right off the bat, you know that it will return one of two values. You don't need to trace it out, see that it calls sixteen methods on 7 objects plus 4 utility functions, all of that is irrelevant as it returns either true or false.

Now, you may be asking how return values correspond to breaking up functions, but really they are the same thing -- a function that takes a dozen parameters and returns a boolean is just another way of saying that you have a series of calculations that culminate in a single result that is then used elsewhere. A void function is simply a series of operations that logically hang together and can be treated as one.

Breaking up a large function, even if none of the pieces are reusable, means that you can hold the pieces in mind better at the appropriate level of abstraction. A function named F which consists of nothing but 10 calls to F1, F2 and so forth is hard to reason about only while you are trying to figure out what F does -- which a good name would reduce to a time best measured in fractions of a second. But even a bad naming convention will make it easier to ignore the rest of the program while you focus on the step that follows 4 and precedes 6. And being broken down into steps, either the state of an object is being changed, globals are being abused, or any information that F1 wants to provide to F10 is returned and then kept in a local variable... In all cases it's going to be easier to keep track of where things are changed in smaller parts.
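A small Python sketch of the "F calls F1, F2, ..." shape, assuming a toy task (finding the most frequent word in a non-empty text) and hypothetical function names. Each step returns its result, and the top-level function keeps that result in a local variable before handing it to the next step:

```python
from collections import Counter


def tokenize(text):
    """Step 1: split the text into lower-case words."""
    return text.lower().split()


def count_words(words):
    """Step 2: count how often each word occurs."""
    return Counter(words)


def most_common_word(counts):
    """Step 3: pick the most frequent word; ties go to the alphabetically first."""
    return max(sorted(counts), key=lambda word: counts[word])


def top_word(text):
    """The top-level function reads as a list of steps joined by local variables."""
    words = tokenize(text)
    counts = count_words(words)
    return most_common_word(counts)
```

While reading `count_words`, nothing about tokenizing or tie-breaking needs to be held in mind; the only way information moves between steps is through the three local names in `top_word`.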

score 2 · Answer 6 · answered Sep 04 '13 at 01:19

I believe this approach is done primarily to make code easier to read. And yes, I agree, it is possible to go too far down this road, where every function only contains 2-3 lines and the result is that the main function is no easier to read. Doing this well is more an art than anything else.

My personal guide is that I only split code out to a separate and (most likely) single-use function if it will reduce the length of the main function by 10 lines or more, if the lines being separated out are all logically part of the same task, and if that task can be given an easy-to-understand name (because otherwise the main function won't be any more readable).

score 2 · Answer 7 · answered Sep 04 '13 at 04:52

The act of splitting up your functions and methods into smaller ones is not just for re-usability. It is a good general design practice because it encourages de-coupled code. De-coupled code is easier to read and maintain.

What do I mean by this?

Suppose I have a 100-line function that handles what happens when the user makes a change to a domain object. This function is long and is doing 100 discrete things.

If I split that into two functions, one inside the other, the first function is now doing 51 discrete things and the second function is doing 50 discrete things. However, the second function is probably doing only one thing that can cause side-effects for the other 50 things. If there is a bug in the main function, I can probably isolate it to either my new method call or one of the other 50 lines of code. In one simple refactoring, I've cut the amount of code that I have to grok to find my bug in half.

As an industry, we have worked harder and harder to remove things like global variables and system-wide dependencies out of our languages and code. To some extent, we can consider every chunk of procedural code as its own global scope. When framed this way, it becomes very easy to see that we wish to cut down the size of this scope as much as is possible. Using small methods which do one thing helps with this a lot. It's not just a readability issue.

score 1 · Answer 8 · answered Sep 04 '13 at 03:10

It is easier to bust things out and re-use them as-needed than it is to read and understand a cacophony of completely pointlessly busted out micro-ideas that will never get re-used and represent very little on their own merit.

Always err on the side of clarity but when it helps nothing, leave the un-reused stuff inside the one func. You Ain't Going to Need It in that state.

score 1 · Answer 9 · answered Sep 04 '13 at 12:28

I work with young programmers who have been taught in school that functions should be short but lack the experience to factor their code correctly.

I'm quite often doing exactly the opposite: inlining at call sites those functions or methods to make the code readable.

It is true that well-structured code should usually be concise, probably no larger than a screen for a given function or method, but which functions you create matters.

What I do is follow code smells rules to organize my code.

Long function is a well known code smell, but it's not the only one, and certainly not the most important.

For instance, if you randomly take code blocks and change them into methods, as many inexperienced programmers do, you just exchange long functions for a large class. Even worse, this can lead to methods receiving parts of their data through method parameters and parts from the object instance. Avoiding that smell (Temporary Field... an object-oriented way of programming with globals) is much more important to me than short methods.
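A minimal Python sketch of the Temporary Field smell and one way around it. Both classes and all their method names are hypothetical, written only to illustrate the point:

```python
class ReportWithTemporaryField:
    """Smell: `_total` lives on the instance but is only meaningful mid-render."""

    def __init__(self, values):
        self.values = values
        self._total = None  # valid only between _compute_total() and _format()

    def render(self):
        self._compute_total()
        return self._format()

    def _compute_total(self):
        self._total = sum(self.values)

    def _format(self):
        # Breaks silently (prints "total=None") if called before _compute_total().
        return f"total={self._total}"


class Report:
    """Same behaviour, but the intermediate value travels as a return value."""

    def __init__(self, values):
        self.values = values

    def render(self):
        total = self._compute_total()
        return self._format(total)

    def _compute_total(self):
        return sum(self.values)

    @staticmethod
    def _format(total):
        return f"total={total}"
```

In the second version the methods are just as short, but `_format` can no longer be called in the wrong order, because it has no hidden dependency on instance state.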

Also, a function or method should perform some clearly identifiable task, and only one such task. I have very often seen a function perform half a task and return some parameter that is passed to another function a few lines later to perform the other half of the task (this may be a case of the Data Clumps smell). Or a function whose only task is to call another one, doing basically nothing (Lazy Method). This is also horrible. You can usually detect that easily by looking at the use of local variables (if these local variables haven't been put in the object instance as in the previous case, the Temporary Field smell).

Another common smell is passing a boolean to a function: in many cases it hides a function performing one task or another depending on the provided boolean. It would probably be better to split that in two functions (I call that schizophrenic method smell).
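A short Python sketch of the boolean-flag case, with hypothetical names. The flag version hides two tasks behind one signature; the split version gives each task its own name:

```python
def format_name(first, last, formal):
    """Flag version: callers must remember what True and False mean."""
    if formal:
        return f"{last}, {first}"
    return f"{first} {last}"


def format_name_formal(first, last):
    """Split version, task one: 'Last, First'."""
    return f"{last}, {first}"


def format_name_casual(first, last):
    """Split version, task two: 'First Last'."""
    return f"{first} {last}"
```

A call such as `format_name("Ada", "Lovelace", True)` says nothing about what `True` selects, whereas `format_name_formal("Ada", "Lovelace")` documents itself at the call site.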

What is interesting is that the choice of functions is not always so bad at the start, but code evolution over time often leads to the problems above.

My explanation of why it happens lies in programmers' psychology and is probably the main drawback of splitting the code between many methods or functions.

You can easily blind yourself with functions/methods and basically forget there is code inside these methods. Some (inexperienced) programmers, once they have created a method, take it for granted as if it were a new library call. They won't allow themselves to change it any more than they would a third-party library call. (The same problem also occurs with user-defined classes.)

Astonishingly enough, it also works the other way around: some inexperienced programmers won't create a logically necessary method or class if the initial code is short (a typical case is currying a function by setting some parameters to initial values)!
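For the "short but logically necessary" case, a minimal Python sketch using the standard library's `functools.partial`. Strictly speaking this is partial application rather than currying, and `connect` and `connect_local` are hypothetical names for the example:

```python
from functools import partial


def connect(host, port, timeout):
    """Stand-in for a general-purpose call with several parameters."""
    return f"{host}:{port} (timeout={timeout}s)"


# One line long, yet it names a real concept: "a connection to the local server".
connect_local = partial(connect, "localhost", 8080)
```

Calling `connect_local(timeout=5)` is then equivalent to `connect("localhost", 8080, 5)`. The definition is tiny, but without it the same two arguments would be repeated at every call site.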

This can become really bad if you are not aware of it and cautious.

In summary, what I'm saying is that splitting code between many methods/functions/classes can be a really good thing if you know what you are doing, or a real problem if you don't.

Because of this I put this particular smell quite low on my personal list.

Too bad it's often near the top of code smell lists on the net. I guess it's because it is one of the easier smells to detect, both by humans and automated tools.

score 0 · Answer 10 · answered Sep 04 '13 at 04:11

A well-defined identifier for a function should tell you a great deal about the function without going into it. Reusable or not, a function is intended to follow the divide and conquer principle, and in most cases a programmer starts by designing/implementing/coding a non-main function, and only then is a call necessary.

score 0 · Answer 11 · answered Sep 04 '13 at 10:07

If the code is "good code" otherwise, is there a real difference in cognition between the action of expanding a code block with a purposeful comment on the one hand, and following the debugger into a well-named function on the other?

You may lose context when following the debugger into a function call - but if that makes a significant difference to the understanding of the function you are looking at then that indicates that the context is important - which implies that the purpose of the function being called isn't clear, so the code isn't "good code". In other words, you don't really trust what the function does from its name, which is a separate issue.

A code block might be using any variables or parameters from the current function, and you will not know for sure without expanding the code block. The comment will explain what the block does, but a function call's parameters also show the dependencies from the current context that are passed to the function. This is actually more context while looking at less code, which is a good thing (still assuming the code is actually good, of course).

The context you are working in may force the use of a non-folding editor, in which case larger code blocks may obscure the overall structure of a function - for example if the function would fit onto a single screen with all blocks collapsed, but covers many screens with them expanded.

A nice middle ground would be to have IDEs / editors that can "inline" the code of small functions being called, so that you are looking at the context, even if that context is spread over multiple files.
