3

I've been watching a presentation by Jonathan Blow on software quality. He makes the point that adding more and more layers of abstraction not only makes your code harder to manage, but also wastes resources and makes the software slow.

In it he also shows a giant call stack from some Tomcat server. I don't know what work Tomcat simplifies, but IMHO no server code has to be abstracted this much.

Also, in The Art of Unix Programming the author argues that flat representations are better, and that complexity should be moved out of the algorithmic parts and into the data parts.

With this in mind, I was thinking of a tool, or a performance-checking approach, that would track the size of the call stack. This could serve as a combined abstraction and performance metric. Of course some recursive algorithms have to exceed any fixed limit, but the overall call stack size should be held under some threshold.
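A minimal sketch of what such a tracking tool could look like in Java (the class name, threshold, and reporting are all made up for illustration); `Thread.currentThread().getStackTrace()` gives the current depth, including JVM bookkeeping frames:

```java
public class StackDepthMonitor {
    // Assumed, arbitrary limit; tune per project.
    private static final int THRESHOLD = 64;

    // Current call depth of this thread (includes JVM bookkeeping frames).
    public static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Call this at interesting points; warns when the depth crosses the limit.
    public static void checkDepth() {
        int depth = currentDepth();
        if (depth > THRESHOLD) {
            System.err.println("call stack depth " + depth + " exceeds " + THRESHOLD);
        }
    }

    public static void main(String[] args) {
        System.out.println("depth in main: " + currentDepth());
        checkDepth();
    }
}
```

In practice a profiler or an instrumentation agent could sample this automatically instead of requiring explicit calls.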

What do you think? Would limiting/tracking your call stack bring some advantages, or not? Do you have any advice or experience? Or is this just a stupid idea?

microo8
    "imho no server code has to be this much abstracted" - so you have no idea what Tomcat is good for, but come to this conclusion? Sounds like you are a victim of the famous Dunning-Kruger effect. – Doc Brown Aug 17 '17 at 13:34
  • 1
    @DocBrown: It wouldn't be the first time a system has been over-engineered. I don't need Dunning Kruger to know that a stack trace from a thrown exception that is 100 nested calls deep is probably a good indication that some Architecture Astronaut was involved. – Robert Harvey Aug 17 '17 at 15:56
  • @RobertHarvey: honestly, such a call stack is surely an indication that many levels of abstraction are involved, but if I do not know the system at stake, I would not draw any premature conclusions from it. Maybe it is overdesigned, maybe the abstraction stack is perfectly sensible, I don't know, but I think it is a weakness of character to make judgements without having any idea of what one is talking about. – Doc Brown Aug 18 '17 at 06:32
  • 1
    @DocBrown: maybe if you look at other systems, there isn't so much abstraction. It's better to have it flatter. And you don't have to insult someone to prove your point. We're here for constructive answers. If you don't like my question, downvote it or something. – microo8 Aug 18 '17 at 07:18
  • Sorry if you feel insulted; please don't take my criticism personally. This was only about your wording in this particular question, which reads to me like "I have no idea what XYZ does or how it is designed, I see only one isolated symptom, but that leads me to the conclusion that XYZ is total crap". Such a kind of judgement has a name: it is called the Dunning-Kruger effect. You may take this as a constructive hint to choose your words more carefully. – Doc Brown Aug 18 '17 at 07:29
  • ... besides that, I think you asked a reasonable question. I would probably upvote it without this issue. – Doc Brown Aug 18 '17 at 08:21
  • What about recursion? Or lack of tail call optimization? – Frank Hileman Aug 18 '17 at 20:56
  • "Abstractions" is the wrong term. These can be anything, not just classes or data types. Abstractions don't have to be represented in code at all -- that is why they are abstractions. I think you mean data types instead. – Frank Hileman Aug 18 '17 at 20:59
  • 1
    Code can be refactored to reduce call stack depth. In general, this requires separating the task of "determining which class/method should be handling the subsequent processing" and the task of "actually passing control to the subsequent class/method". To do this, the code responsible for the first task should return the delegate back to its caller, which means it will be exiting from the scope of one or more functions. Once the delegate reaches a suitable level, the delegate is invoked. This is more powerful than compiler-based tail call optimization. – rwong Aug 19 '17 at 08:22
  • @rwong: I think this is the answer :) the opposite of a deep call stack is a pipeline. Divide and conquer! The task is separated into smaller/flatter subprocesses. Every subprocess can be better understood. It's like the Unix philosophy: 1. do one thing and do it well, 2. write programs that work together, 3. something about text streams. – microo8 Aug 20 '17 at 07:18
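The refactoring rwong describes in the comments above (return a delegate to the caller and invoke it once it has bubbled up to a suitable level) is essentially a trampoline. A minimal Java sketch, with all names invented for illustration:

```java
import java.util.function.Supplier;

public class Trampoline {
    // A step either holds a final value or a thunk producing the next step.
    record Step<T>(T value, Supplier<Step<T>> next) {
        boolean done() { return next == null; }
    }

    // The driver loop invokes each delegate iteratively: the stack stays flat
    // no matter how many logical "recursive" steps are taken.
    static <T> T run(Step<T> step) {
        while (!step.done()) {
            step = step.next().get();
        }
        return step.value();
    }

    // A tail-recursive sum rewritten as trampolined steps.
    static Step<Long> sum(long n, long acc) {
        if (n == 0) return new Step<>(acc, null);
        return new Step<>(null, () -> sum(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // A plain recursive sum over a million calls would overflow the stack;
        // the trampoline runs it in constant stack space.
        System.out.println(run(sum(1_000_000, 0)));
    }
}
```

As rwong notes, this is more powerful than compiler-based tail-call optimization, because the decision of "who runs next" is made as ordinary data rather than by the call instruction.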

2 Answers

11

I agree that abstractions influence the call stack in the way you described. I also agree that limiting the call stack can give you fewer abstractions. But there are two major reasons for me not to limit the call stack:

Less abstraction is not necessarily good

By reducing abstractions you also risk losing many useful abstractions we built over the years. That's the very problem with abstractions - some of them are extremely valuable to us, but at the same time, others can get in the way and complicate things unnecessarily.

Unfortunately, it is not clear which abstraction belongs on which side of this coin. It also differs based on your project and team. What one team finds to be a complicated abstraction is just ordinary everyday programming to another team. What is a really useful abstraction for one project may be overkill for another. It's just not clear-cut enough to warrant any sort of measure aimed at reducing abstractions.

A smaller call stack is not necessarily good

A similar problem appears when you look at call stacks. True, a call stack spanning multiple screen pages is not a pleasant thing. But every developer needs to learn how to read a call stack anyway, since even basic programs easily reach a depth of a dozen calls or more. The interesting question, though, is how hard it is to interpret the call stack. If it is full of abstractions that make it complicated to find what you are looking for, then that is certainly bad.

On the other hand, not all lengthy call stacks are bad. Especially when it comes to readability of code, many developers advise limiting your method sizes. Since you still need to implement the same complicated logic, you then have to resort to splitting the code into more methods. This in turn increases the call stack size, but does not add much complexity, since the only abstraction involved is a straightforward method call.

Conversely, if you limit the call stack size, a developer can no longer split her code into easily understandable methods. Instead you need to resort to bunching all the code into a single method, which dramatically raises complexity, complicates maintenance and reduces readability.

Is it a good idea?

I don't think so. As argued above, there does not appear to be a direct relationship between complexity and call stack sizes. In fact, there are standard cases in which the opposite holds true. Given that your desired effect is not guaranteed to be present, I wouldn't do it.

Frank
  • Of course, most languages allow tail-calls and other optimizations which flatten the call-stack. Java is a prominent example of a language which does not. – Deduplicator Aug 17 '17 at 13:26
  • Tail-call optimization exactly highlights my point: you can write very concise and understandable recursive methods. How this abstraction relates to the call stack size though is not clear enough to argue in favor of such a limit. – Frank Aug 17 '17 at 13:30
  • By following that reasoning you should be writing your programs in Assembler, or even better, directly program via logic gates. -- That's a slippery-slope logical fallacy, a straw man fallacy, and probably a few others. – Robert Harvey Aug 17 '17 at 15:58
  • @RobertHarvey I was just intending an exaggeration, but you are right of course. Hope it's better now. – Frank Aug 18 '17 at 05:28
  • Maybe this limit can be relative, not absolute. Not a constant number; some problems are complex and it's better to break them down into smaller functions. When the problem is complex, the code is complex and the call stack is bigger too, and that's OK. Or maybe we should limit just the depth of inheritance, and also avoid having factories of factories of etc. Thanks for your answer :) – microo8 Aug 18 '17 at 07:26
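Deduplicator's point about Java is easy to demonstrate: a method that is tail-recursive in form still consumes a stack frame per call on the JVM, while the equivalent loop runs in constant stack space. A small self-contained sketch (names are illustrative):

```java
public class TailCallDemo {
    // Tail-recursive in form, but neither javac nor the JVM eliminates
    // the tail call, so a large n overflows the stack.
    static long sumRec(long n, long acc) {
        if (n == 0) return acc;
        return sumRec(n - 1, acc + n);
    }

    // The equivalent loop uses constant stack space.
    static long sumLoop(long n) {
        long acc = 0;
        for (long i = 1; i <= n; i++) acc += i;
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumLoop(10_000_000)); // fine
        try {
            System.out.println(sumRec(10_000_000, 0));
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError: the tail call was not optimized");
        }
    }
}
```

This is exactly why the relationship between call-stack depth and code quality is language-dependent: in a language with guaranteed tail calls, the concise recursive form would run as flat as the loop.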
2

Abstraction vs runtime performance

Nowadays Java provides lots of abstractions that let you develop much faster than in old C, at the cost of a bit of performance, yes, but that usually doesn't matter for what you're doing.

Frameworks like Spring and servers like Tomcat are products that have been developed for years by teams of developers who are way more competent than the average developer. Spring's bean and transaction management produces quite ugly stacks, since it has to create proxies of your classes, yet it is not slow.

If you use the servlet filters from the JSR and chain several of them, your stack will look like this (from top to bottom!):

at [.apache..].doFilter(ApplicationFilterChain.java:207)
 [.apache..].internalDoFilter(ApplicationFilterChain.java:240)
 [...your filter..].doFilter(DelegatingFilterProxy.java:262) 
 [.apache..].doFilter(ApplicationFilterChain.java:207)
 [.apache..].internalDoFilter(ApplicationFilterChain.java:240)
 [...your 2nd filter..].doFilter(DelegatingFilterProxy.java:262) 
 ...

So if you chain filters for:

  • Logging stuff
  • Authentication
  • Authorization
  • Starting a database transaction
  • Some framework filter stuff

You can already have 20 lines of that in your stack, but most of those filters usually perform very few operations. Even if it adds up to 1000 operations (Java translates to more instructions in assembly, but it still doesn't matter), it's nothing nowadays.

Furthermore, a shallower call stack doesn't mean you're getting faster; you may just have a flatter hierarchy of functions that executes the same number of instructions.

Worrying about the length of your call stacks is useless even in C; unless you're doing something extremely specific (a kernel?), you're wasting your energy by worrying about that. Reducing the number of parameters of a function by using a struct/class is more useful, yet it's an unnecessary optimisation most of the time (maybe if you loop through that function 1,000,000,000 times?).
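The parameter-object refactoring mentioned above can be sketched like this (all names and the cost formula are invented for illustration); note that it is mostly a readability change, not a performance one:

```java
public class ParameterObjectDemo {
    // Before: many loose parameters (the formula is made up for illustration).
    static double shippingCost(String country, double weightKg,
                               double widthCm, double heightCm, double depthCm) {
        double volumeCm3 = widthCm * heightCm * depthCm;
        double base = "US".equals(country) ? 5.0 : 9.0;
        return base + 0.5 * weightKg + 0.001 * volumeCm3;
    }

    // After: one parameter object (a record) carrying the same data.
    record Parcel(String country, double weightKg,
                  double widthCm, double heightCm, double depthCm) {}

    static double shippingCost(Parcel p) {
        return shippingCost(p.country(), p.weightKg(),
                            p.widthCm(), p.heightCm(), p.depthCm());
    }

    public static void main(String[] args) {
        double loose = shippingCost("US", 2.0, 10, 20, 30);
        double bundled = shippingCost(new Parcel("US", 2.0, 10, 20, 30));
        System.out.println(loose == bundled); // same computation either way
    }
}
```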

Having a hard time reading a Java stack? 99% of the time when you need to read it, your stack will look like this:

Caused by : bla bla bla
 at bla bla bla
 at bla bla
 at bla bla bla 
...
Caused by : bla bla bla 2
 at bla bla bla
 at bla bla
 at bla bla bla 
 ...
Caused by : bla bla bla 3
 at bla bla bla
 at bla bla
 at bla bla bla 
 ...

Where each [...] is about 10-40 lines of stack. Those lines are usually useless; in each "Caused by" you need to dig down to the first call into one of your own classes, and most of the time that shows you easily what is wrong. Even when the error is not in the first "Caused by", I hardly read more than 10 lines out of 200+ to find the problem.

I have only done web development so far, and I have yet to see my code run slow because of Tomcat/Spring rather than because of my own code.

Aside from that :

You still think that Tomcat is slow, and have proven by benchmarking that you need to gain some time back from its layer? There is the native APR connector, which switches the Java connector implementation to a native one.

You can also tune other parameters, and the JVM itself. This requires knowledge, yes, but isn't it the same when you use specific gcc/g++/VS compiler options?

Abstraction vs development time

One could easily say the following: more abstractions = more classes = slower development.

That is easy to say, and I will try to address it through a few points.

Abstractions* aim to hide unnecessary things and force you to use them properly without worrying about implementation details, which means that when coding, your code will tend to be more stable. It takes more time to think through how to write your code with those layers of abstraction, but because there are a lot fewer details to worry about, you will write shorter, more stable code. So yes, you will write fewer lines of code per hour, but in the end you will end up with fewer lines of code and less time spent debugging.

If there is a trap with abstractions, it is not raw development time; it is the learning curve needed to use them properly: each new framework/library steepens the learning curve and makes turnover harder to handle, which can end up costing more development time because you end up with novices who don't stay long enough.

Because when you don't know how an abstraction is supposed to work, you will spend a lot of time writing those 5 lines of code and debugging them at the start.

For me the best way to handle that learning curve is :

  • Avoid exotic/"clever" usage of your tools; keep to standard usage.
  • Keep it as simple as possible. Don't add unnecessary frameworks, and isolate the components that use each of them in different layers.
  • Establish a standard way of doing common things throughout the app. It must be short, so people will refer to it more easily. Moreover, they will remember, when they are about to do something, that it is documented.
  • Be available (or make the person with the knowledge available) to answer questions; don't only answer "how to make it work" but explain why it has to be done like this.

  • *: This is of course assuming the job has been done well enough, i.e. that the abstractions are neither poor nor over-engineered.

Walfrat
  • Your answer is ok but I just wanted to point out that in the talk by Blow linked in the question he actually talks about how more abstractions lead to slower development time (more bugs, difficulty to debug, difficulty to reason about the code etc), not only execution time. Your answer doesn't address that. – jhyot Aug 18 '17 at 07:46
  • I couldn't check the video, so I read only the post; I will edit. – Walfrat Aug 18 '17 at 07:54