9

Why do we need the main function and can't we execute the code without it? Can't we just execute our code outside the main function by making our own function?

Kakashi San
  • 14
    OCaml and in general the ML family of languages do not have main. They just execute everything given to them. Python is also like that. – Andrej Bauer May 06 '23 at 20:02
  • 6
    I think this question belongs on StackOverflow rather than here. When considering programming abstractly, you can write whatever you like, e.g. write a function or construct an object and consider the complexity or other features of it. – einpoklum May 06 '23 at 21:46
  • 10
    What do you mean by "almost all the programming languages"? Of all the programming languages that I am using, none needs a main function. This includes, for example, Ruby, Python, Perl, PHP, ECMAScript, TypeScript, Smalltalk, Self, Newspeak, Scala, C#, Raku, Lua, Dart, Scheme, Racket, and many others. In this repository: https://github.com/JoergWMittag/lambdaconscarcdr I have implemented the same code in 65 different programming languages, and none of them require a main function. – Jörg W Mittag May 06 '23 at 23:54
  • I suppose that in C++ at least, you could (in principle) avoid the need for a main() function by instead declaring a static object and running all of your code from within that object's constructor. You wouldn't have access to argc and argv though, so you would either have to do without command line arguments, or find some other way to access them. – Jeremy Friesner May 07 '23 at 05:41
  • @JeremyFriesner: True. You'd have no guarantee of which other static objects were already initialized and which weren't, but you could just avoid all other static constructors if you wanted to abuse the language this way. (And somehow convince a toolchain to link an executable without a main, perhaps with custom CRT startup code that's simpler.) – Peter Cordes May 07 '23 at 06:16
  • @JeremyFriesner I believe you would still need an empty main() to satisfy the linker, though. – Ken Y-N May 08 '23 at 01:32
  • @AndrejBauer That is not true, Python does have a main function, it just hides it. If you execute a script, everything is implicitly in an "environment" called "__main__", which is for all intents and purposes a function. https://docs.python.org/3/library/main.html – kutschkem May 08 '23 at 05:52
  • 5
    For all intents and purposes __main__ is an environment, not a function. And also, stuff that is defined in class definitions is executed. Actually, everything is just executed. – Andrej Bauer May 08 '23 at 08:46
  • 5
    You don't need a main function, you need an agreed starting point. It can be the first line of the file, it can be a function with a specific name (main) or it can be whatever you like. Those are different conventions and different languages use different conventions. – infinitezero May 08 '23 at 09:42
  • OCaml and Python don't require a main function, yet if you read code written in these languages, you'll often find that developers choose to have a function called main which will be the starting point. – Stef May 09 '23 at 08:06
  • Effectively, { void main() } is the universal interface for an OS to call an executable program. Executable code without that interface isn't a program it's a library. – RBarryYoung May 09 '23 at 16:15
  • @RBarryYoung -- no, not at all. Each OS defines its own format for executable files, and that format defines how the entry point into the application is identified when the program is loaded and run. Compilers for high-level languages link to runtime code that has the OS-required entry point and supporting data. In C++ that runtime code typically does a bunch of initialization before calling the function named main. – Pete Becker May 09 '23 at 16:44
  • 1
    @RBarryYoung void main() is a non-standard, implementation-defined interface from some company based in Redmond WA. The C and C++ standards both call for int main(void) or int main(int argc, char** argv). main should return an int. – David Hammen May 12 '23 at 12:09
  • Many mature languages like Basic, Pascal or even Python don't have such requirements. It's mostly a thing in C-family languages, many of the others don't require it. Are C-family languages your background? – Mast May 12 '23 at 17:45

7 Answers

16

Well, your code needs to start executing somewhere (apart from the necessary initialisation and cleanup, which isn't readily visible to the programmer in the form of code).

Let's take the C language, which influenced almost every other programming language, as a reference. The documentation for main() in C states:

Every C program coded to run in a hosted execution environment contains the definition (not the prototype) of a function named main, which is the designated start of the program.

The main function is called at program startup, after all objects with static storage duration are initialized. It is the designated entry point to a program that is executed in a hosted environment (that is, with an operating system).

Now, if we are talking about assembly language, here is what a program might look like:

    org  0x100
    mov  dx, msg
    mov  ah, 9
    int  0x21

    mov  ah, 0x4c
    int  0x21

    msg  db 'Where is my main() here?', 0x0d, 0x0a, '$'

And now Python:

print("No main!")

As you can see, main() isn't mandated; it's a specification and a convention that has been followed by the languages that do mention it (the likes of C, C++, and Java), and it does serve a purpose. And yes, you can have a program that doesn't start from main(), at least in C to my knowledge (with gcc, via the -nostartfiles option).

Rinkesh P
  • 2
    Note that the standard specifically says main is not a function, it is an entry point. I.e., you aren't allowed to call main. Yes, most Unixy C implementations call a main function the way any function is called, but there might be (probably are) ones that don't. – vonbrand May 06 '23 at 18:14
  • That's 16-bit x86 code for a DOS .com executable. In executable file formats with metadata (like DOS/Windows PE .exe), you typically would define a symbol in the asm source, perhaps WinMain@16. Or under Linux, like you say with gcc -nostartfiles, you'd define a _start: label as the actual ELF entry point, instead of linking in the CRT startup code which provides a _start that (indirectly) calls main. – Peter Cordes May 06 '23 at 20:19
  • 8
    @vonbrand: You're mixing up C and C++. In C, main is a function you're allowed to call from elsewhere in the program. (calling main() in main() in c). In C++ it's not, letting the compiler put constructor hooks at the top of main instead of in code that runs before main. (Cygwin or MinGW gcc actually does this for some reason.) – Peter Cordes May 06 '23 at 20:21
  • 1
    In Java (simplifying slightly), every class can have a main() method and you have to say which one you want to execute, by naming the class either on the command line itself or by establishing a default entry point in the JAR file manifest. – Michael Kay May 08 '23 at 06:46
  • As for the last part of this answer, as an embedded developer I've often written perfectly valid freestanding C code without a main(), where according to the C standard "the name and type of the function called at program startup are implementation-defined." – pipe May 08 '23 at 10:12
  • @MichaelKay You can have main as a c++ class member too. But you know very well that is not what the OP meant. The entry main(...) function is not the same thing as a class member function named main(...). – stackoverblown May 09 '23 at 10:11
10

This is a design decision in the C language. Besides making main a reserved name, this is quite harmless, and you can always implement the main function as a mere call to your favorite function. Also note that main has a fixed prototype, which allows it to interact with the OS.

Without this convention, you would need to tell the linker which function is the entry point of the executable by some external method or a special C declaration.

Other languages have different conventions. For instance Pascal and FORTRAN demand a Program declaration. Python works like an implicit function at file level.

  • 2
    main isn't usually the OS-level entry point. On GNU/Linux systems, the ELF entry point is normally called _start, which passes the address of main to __libc_start_main (in libc.so.6), which inits library data structures then calls that function pointer. That _start code comes from crt1.o etc. which compilers also link in to each executable. If you were writing by hand in assembly, with your own entry point code, that's when you'd use ld -e my_entry_point -o foo foo.o if you didn't want to name it _start. – Peter Cordes May 06 '23 at 20:12
  • 2
    If you use gcc -c hello.c / ld -e main hello.o, you won't get a working executable (even if you link -lc). See also What is the use of _start() in C? / ld : _start not found defaulting to , and Assembling 32-bit binaries on a 64-bit system (GNU toolchain) for some x86 Linux main vs. _start stuff. – Peter Cordes May 06 '23 at 20:14
  • @PeterCordes: it's the job of the C compiler to tie the main function to the proper entry point, whatever it is (plus the handling of the argc/argv arguments). –  May 06 '23 at 20:17
  • 2
    Correct. And as you say, the CRT startup code also handles the difference between the process startup environment (e.g. in the System V ABI with the stack pointer initially pointing at argc then argv[0..n] values in stack memory) vs. the function calling convention. (Args in registers, or on the stack with a return address below them, something the ELF entry point doesn't have.) – Peter Cordes May 06 '23 at 20:25
  • Your middle paragraph isn't strictly wrong; if you didn't have a separate _start, you would need to specify one of your functions as the entry point, but that could only work on a system designed to work that way, where the function-calling convention matched the process startup environment. Or where the entry point metadata was used by the dynamic linker, so there can still be other code that runs first, it just doesn't live in each executable. (IIRC, modern macOS works something like this, where ld -e main is correct for linking compiler output, and _start is in libc or something.) – Peter Cordes May 06 '23 at 20:28
  • @PeterCordes: right, control must be passed to the C run-time before launching the user function. I mean that the C user function could be specified via a compiler flag. –  May 06 '23 at 20:43
  • Ok, that could I guess be possible, with a linker flag that overrode the symbol name some relocation in crt1.o was looking for. (Or however you want to go about building a toolchain that could work this way, unlike current linkers.) At the end of the previous paragraph, you say that main's fixed prototype allows it to "interact with the OS". That and the linker-flag discussion sound a lot like you're saying that main can be the process entry point itself. (Which of course isn't the case; besides ABI, some non-Unix OSes like DOS/Windows pass one flat string, not C's argv[] array.) – Peter Cordes May 06 '23 at 20:55
  • @PeterCordes: in any case, my answer is generic, technical details are irrelevant. –  May 07 '23 at 05:54
  • Often people asking questions like this are curious about the technical challenges that implementations have to solve, and/or some low-level details of how they actually work, like what main really becomes in a compiled executable. Your answer doesn't do this, hence my comments. Saying when something is an over-simplification is a good policy, as a rule, even if the simplification is useful as part of a high-level picture. – Peter Cordes May 07 '23 at 06:07
  • @PeterCordes: you are on the wrong site (and so is the OP). This is CS, not StackOverflow. –  May 07 '23 at 08:00
  • main is not a reserved identifier. It's true that you can't have anything else called main in the global namespace, but you can use the main identifier freely everywhere else. – Aykhan Hagverdili May 08 '23 at 07:07
  • 1
    @AyxanHaqverdili: main is technically not a reserved identifier. I don't want to clutter my answer with such details, this is CS. Though the habit originated in the C language, the answer is not limited to that one. Next time I'll put quotes around. –  May 08 '23 at 07:17
  • @YvesDaoust No need to clutter it. You may remove the "Besides making main a reserved name" part and your answer should be more accurate. – Aykhan Hagverdili May 08 '23 at 07:19
  • @AyxanHaqverdili: I disagree. Main has a special meaning in the language. –  May 08 '23 at 07:20
9

This is partly to do with the grammatical structure of the language. Consider Java for example:

  • At the top level, each file is a compilation unit,
  • A compilation unit contains type declarations (classes, interfaces or enums),
  • A type declaration contains member declarations (fields, methods, constructors...),
  • A method or constructor declaration contains statements.

Ultimately, statements are what get executed. The language's grammar is simpler by not having extra rules to allow statements at the top level, only declarations. The semantics of the language are also simpler when you consider programs whose declarations span multiple files, since there is no need to define what order statements at the top levels of different files should be executed in. The implementation is also potentially simpler because declarations don't need to be ordered.

Other languages such as Python allow statements at the top level because all declarations (e.g. def or class) are a kind of statement ─ in such languages, even import is a statement that is executed, so the order of execution of top-level statements in multiple files is determined by the order of import statements in the files which import them. This tends to be the case for more dynamic languages, which will load modules and resolve names to declarations at runtime, so that there is no need for the compiler to find all of the declarations before it compiles statements, because the compiler doesn't need to know the meanings of the names which occur in those statements.
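
As a rough illustration of the point that def and class are themselves executed statements, here is a minimal sketch in Python. The names events, greet and Config are made up for this example; the only claim is that top-level statements run in source order and that a name does not exist until its def has been executed.

```python
# Record what happens, in the order the interpreter reaches each statement.
events = []

events.append("top-level statement 1")

try:
    greet()  # the def below has not been executed yet, so the name is unbound
except NameError:
    events.append("greet is not defined yet")


def greet():  # executing this statement binds the name greet
    return "hello"


events.append(greet())


class Config:
    # The class body is ordinary code that runs when the class statement runs.
    events.append("class body runs at definition time")
    level = 1


for line in events:
    print(line)
```

Moving the def above the first call makes the NameError disappear, which is exactly the ordering question that a declarations-only top level avoids.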

kaya3
  • Comments have been moved to chat; please do not continue the discussion here. Before posting a comment below this one, please review the purposes of comments. Comments that do not request clarification or suggest improvements usually belong as an answer, on [meta], or in [chat]. Comments continuing discussion may be removed. – D.W. May 08 '23 at 07:31
5

An important philosophical point you might want to consider: Is there really a fundamental difference between declaring a place where the program starts, and simply having the program start at the top of a specific file? main, in all the languages that have such a thing, is really just a marker that says "start here." Some languages have a main marker, and some appear not to, but really, for those languages that don't seem to have a declared main, they still have a place to start — it's just implicitly chosen!

So another way to look at this thought experiment: In languages like C# and Pascal we might explicitly write main() to tell the program where to begin. But is that really truly different from the compiler invisibly inserting main() { at the start of the file and } at the end if we didn't write such a thing ourselves? Looked at this way, there's always a main() — but you just may not have written it yourself.
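
To make the thought experiment concrete, here is a small sketch in Python that literally wraps a script's top-level statements in a generated main function and compares the result with running the script directly. The helper wrap_in_main and the sample script are hypothetical, and this is not how any real interpreter works internally; in particular, variables become locals of main instead of module globals, so the two forms are not equivalent in every case.

```python
import textwrap

# A tiny "script" with only top-level statements (illustrative content).
script = 'greeting = "No main!"\nresult = greeting.upper()\n'


def wrap_in_main(source):
    """Return source text that places the given statements inside def main()."""
    body = textwrap.indent(source, "    ")
    # Return the local variables so the caller can inspect what the body computed.
    return "def main():\n" + body + "    return locals()\n"


# Variant 1: run the statements directly at the top level of a fresh namespace.
explicit = {}
exec(script, explicit)

# Variant 2: define the generated main() and then call it.
namespace = {}
exec(wrap_in_main(script), namespace)
implicit = namespace["main"]()

print(explicit["result"])
print(implicit["result"])
```

Both variants compute the same result here; the difference is only whether the starting point has a name.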

So that's the real difference:

  • Some languages require you to say where the program begins. (C, C++, Pascal, Java, Rust.)
  • Some languages allow you to say where the program begins, and they have a rule for where it begins if you don't explicitly state otherwise. (Modern C#.)
  • Some languages only have a rule for where the program begins, and it always begins there. (JavaScript, Python, Ruby, Lisp.)

The program always starts somewhere. The only real question is whether your language lets you or forces you to choose that somewhere, or whether it chooses that somewhere by itself.


(Addendum, just for completeness: There's also another category in the bullet points above for declarative languages like SQL and Prolog, where there's really no main at all: The language implementation decides what order to run the pieces of your program in, and whether A is run before B is really up to the computer. Then there's arguably also a category for languages like HTML and CSS and SVG, where it's arguably still a "program," but how it even runs at all doesn't matter as long as the output is correct. There are lots of ways to describe a program, and lots of different ways to have or not have a main; one size does not even come remotely close to fitting all!)

Sean Werkema
  • This is all true (+1), although I think it is worth making the point that your third category of languages, where the program begins executing "at the start of the source file", are all languages where the source files exist at runtime and the program is invoked by executing a particular source file. Therefore this design isn't really an option when a language is compiled and linked from multiple source files into one binary ─ you would still need to explicitly declare which source file is the entry point, at least. – kaya3 May 08 '23 at 17:20
  • @kaya3 There are plenty of oddball categories out there, like assembly ROM code that starts at a fixed address, or Smalltalk and Lisp environments that never truly shut down; but once you get the idea that main is just a way to tell the machine where your program begins, you can usually come up with all sorts of clever answers about ways that could be made to work. – Sean Werkema May 08 '23 at 18:06
1

An application or program has a start. That's the "main" function. Many languages do not call it "main", and in many it is implicit (e.g. the function defined by the script as a whole).

On the other hand, libraries are collections of functions (and code, data, resources, etc.) and do not have a starting point. You choose your own "starting point" every time you call one. (Some libraries have initialization, and some have state.)

Pablo H
1

Why do we need the main function and can't we execute the code without it?

It depends entirely on how the language is defined. Programming languages are man-made constructs anyway, so we can make them as we like. With that in mind, we don't "need" a main function, and we can well have a language that can read from outside any function.

Now, I'm not sure if you wanted to discount them for some reason, but many interpreted "scripting" languages, e.g. Perl, Python, and the various shell languages, are good examples of languages where the tooling is happy with code at the top level, outside functions. But for non-trivial programs, it's often useful for us humans to structure the code using functions, including a main function (even if you have to call it explicitly). Since larger programs will use functions anyway, and for smaller programs it's only a few extra lines of boilerplate, it's not a big leap for a language designer to decide that all code must reside in functions. It might make the language syntax slightly more straightforward to define, too.

ilkkachu
-1

A lot of great answers have already been given, but I would like to name some additional aspects I find quite important as well: not having to define a main function as the entry point encourages less encapsulation and thus less clean, in general uglier (and sometimes also even unusable) code.

This is because high indentation levels are almost always regarded as something to avoid, and so, in languages that permit it, code that should have been separated into different classes, methods, functions or even modules is just written down sequentially. In addition, requiring a main function means that callers do not have to execute arbitrary code at runtime when importing a module - there is a reason why idioms like if __name__ == "__main__" are always recommended, even in smaller projects, in languages like Python.
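
For readers who have not seen the idiom, a minimal sketch follows. The function names greeting and main, and the argument handling, are illustrative choices and not a prescribed layout; the only fixed part is the comparison of __name__ with "__main__", which is true when the file is run as a script and false when it is imported.

```python
import sys


def greeting(name):
    """Pure helper that other modules can import and reuse."""
    return f"Hello, {name}!"


def main(argv):
    """Explicit entry point: argv[0] is the program name, as in sys.argv."""
    name = argv[1] if len(argv) > 1 else "world"
    print(greeting(name))
    return 0  # conventional "success" status


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main(sys.argv)
```

Importing this file from another module only defines greeting and main; nothing is printed until the importer decides to call main itself.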

leo848