Why doesn't C++ allow you to take the address of a constructor?

Question

Is there a specific reason that this would break the language conceptually or a specific reason that this is technically infeasible in some cases?

~~The usage would be with new operator.~~

Edit: I'm going to give up hope on getting my "new operator" and "operator new" straight and be direct.

The point of the question is: why are constructors special? Keep in mind, of course, that language specifications tell us what is legal, but not necessarily moral. What is legal is typically informed by what is logically consistent with the rest of the language, what is simple and concise, and what is feasible for compilers to implement. The standards committee's possible rationale in weighing these factors is deliberate and interesting -- hence the question.

The problem would not be taking the address of a constructor, but being able to pass around a type. Templates can do that. — Euphoric, Jun 20 '14 at 19:33
What if you have a function template that you want to construct an object using a constructor that will be specified as an argument to the function? — Praxeolitic, Jun 20 '14 at 19:37
Why not encapsulate it in a different function? Or use Factory Pattern. — Euphoric, Jun 20 '14 at 19:39
There will be alternatives for any example I can think up, but still, why should constructors be special? There are plenty of things that you likely won't use in most programming languages but special cases like this usually come with a justification. — Praxeolitic, Jun 20 '14 at 19:47
What would the address of a constructor be, exactly? By definition a constructor creates an object. No object, no address. No constructor call, no object. — Robert Harvey, Jun 20 '14 at 19:58
@RobertHarvey: are you confusing pointers to functions with pointers to objects? — Doc Brown, Jun 20 '14 at 20:41
@DocBrown: Constructors don't return anything, so unless you're relying on side effects to accomplish whatever this is, factory methods seem like a better choice. Finding the memory address of a constructor doesn't seem particularly relevant to me. — Robert Harvey, Jun 20 '14 at 20:44
@RobertHarvey: but the question was not "are there any better alternatives", but "why doesn't the language provide it"? — Doc Brown, Jun 20 '14 at 20:49
@RobertHarvey: that's not really a satisfactory explanation, don't you think so? I think there must be better reason for this. I have an idea about this, maybe I can make an answer of it ... — Doc Brown, Jun 20 '14 at 20:50
@RobertHarvey The question occurred to me when I was about to type up a factory class. — Praxeolitic, Jun 20 '14 at 21:05
operator new() doesn't call constructor - it just allocates memory, and constructor is called by new operator, not operator new() function (more directly, the compiler) — ikh, Jun 21 '14 at 02:54
@Praxeolitic That's more strange.. It is the compiler that implements a new operator, and compiler knows which constructor should be used. — ikh, Jun 21 '14 at 10:01
I’ve never thought about this problem in 25 years of C++ programming. However, it is an interesting question. The answer lies in the implementation of the “new operator”, which cannot be overridden. The closest choice is placement new. — Bill Door, Jun 21 '14 at 17:15
@ikh It turns out what I originally had in mind was placement new (which I think is a specific form of "new operator"?). Now that I think about it more, I'm undecided if that example makes sense. — Praxeolitic, Jun 21 '14 at 21:14
I wonder if std::make_shared (C++11) and std::make_unique (C++14) can adequately solve the underlying practical motivation for this question. These are function templates, which means one needs to capture the input arguments to the constructor and then forward them to the actual constructor. — rwong, Sep 09 '16 at 04:49
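
The forwarding approach mentioned in the last comment can be sketched roughly as follows. This is only an illustration: `construct` and `repeated_x` are hypothetical names that do not appear in the thread, and the helper mimics the general shape of std::make_unique rather than reproducing any library implementation. The argument types passed in determine which constructor overload resolution selects.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper (not a standard function): builds a T from whatever
// arguments it receives. Overload resolution on the forwarded arguments
// decides which constructor of T runs.
template <typename T, typename... Args>
std::unique_ptr<T> construct(Args&&... args) {
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

// Usage: (count, char) arguments select std::string(size_type, char).
inline std::string repeated_x() {
    return *construct<std::string>(3, 'x');
}
```

The caller never names a constructor; it names a type and supplies arguments, which is the sense in which templates "pass around a type".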

Doc Brown · Accepted Answer · 2014-06-20T22:12:55.120

13

Pointers to member functions only make sense if you have more than one member function with the same signature; otherwise there would be only one possible value for your pointer. But that is not possible for constructors, since in C++ different constructors of the same class must have different signatures.

The alternative for Stroustrup would have been to choose a syntax for C++ where constructors could have a name different from the class name, but that would have prevented some very elegant aspects of the existing ctor syntax and would have made the language more complicated. To me, that looks like a high price just to allow a seldom-needed feature which can easily be simulated by "outsourcing" the initialization of an object from the ctor to a different init function (a normal member function, for which pointers to members can be created).
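
A minimal sketch of the "outsourcing" idea, under the assumption that the class can be default-constructed cheaply. The names `Greeting`, `init_formal`, `init_casual`, and `make_greeting` are invented for illustration and are not from the answer.

```cpp
#include <string>

// Hypothetical class: the constructor does the minimum, and two named
// initialisers share one signature so that a pointer can choose between them.
class Greeting {
public:
    void init_formal(const std::string& name) { text_ = "Dear " + name; }
    void init_casual(const std::string& name) { text_ = "Hi " + name; }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Pointer-to-member-function type that matches both initialisers.
using GreetingInit = void (Greeting::*)(const std::string&);

// Default-constructs the object, then runs the initialiser chosen at runtime.
inline Greeting make_greeting(GreetingInit init, const std::string& name) {
    Greeting g;
    (g.*init)(name);
    return g;
}
```

Here `make_greeting(&Greeting::init_casual, "Ada")` switches between two functions with the same signature, which is the use case the answer describes for ordinary member functions.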

edited Jun 20 '14 at 22:12

answered Jun 20 '14 at 20:59

Doc Brown

206,877

2

Still, why prevent memcpy(buffer, (&std::string)(int, char), size)? (Probably extremely un-kosher, but this is C++ after all.) – Thomas Eding Jun 21 '14 at 09:19
4

Sorry, but what you wrote makes no sense. I see nothing wrong with having a pointer to member pointing to a constructor. Also, it sounds like you quoted something without a link to the source. – BЈовић Jun 21 '14 at 09:23
1

@ThomasEding: what exactly do you expect that statement to do? Copy the assembly code of the string ctor somewhere? How will "size" be determined (even if you try something equivalent for a standard member function)? – Doc Brown Jun 21 '14 at 09:35
I'd expect it to do the same thing it would do when given the address of a free function, as in memcpy(buffer, strlen, size). Presumably it would copy the assembly, but who knows. Whether or not the code could be invoked without crashing would require knowledge about the compiler you use. The same goes for determining size. It would be highly platform dependent, but lots of non-portable C++ constructs are used in production code. I see no reason to outlaw it. – Thomas Eding Jun 21 '14 at 10:07
@ThomasEding: A conforming C++ compiler is expected to give a diagnostic when trying to access a function pointer as if it were a data pointer. A non-conforming C++ compiler could do anything, but they can also provide a non-c++ way of accessing a constructor as well. That is not a reason to add a feature to C++ that has no uses in conforming code. – Bart van Ingen Schenau Jun 21 '14 at 16:17
@ThomasEding: for regular pointer-to-member functions, the typical use case is clear: to switch calls between different functions with the same signature. My post explains why this is not possible for ctors. For a hypothetical pointer-to-ctor, the only use case we know of is a horrible, obscure hack whose semantics are not clear for any compiler I know of (and this is true for regular member functions as well). That does not seem to me a very convincing reason to introduce such a feature into the language. – Doc Brown Jun 21 '14 at 18:33
IMHO, your answer doesn't satisfy my understanding. Methods of a class can be overloaded just like free-standing functions, hence if you wanted to get a pointer of a specific one you have to help the compiler with overload-resolution. Also, constructors participate in the type-system of the C++ language and you can refer to them, e.g. in derived classes with the using base::base construct. – klaus triendl Jan 17 '20 at 16:01
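
The two points in the last comment can be illustrated with a short sketch. The types `Base` and `Derived` and the function `scale_via_pointer` are hypothetical and exist only for this example. It shows how the target pointer type resolves an overloaded member function, and how `using Base::Base;` (inheriting constructors, C++11) refers to constructors by name even though `&Base::Base` is not accepted.

```cpp
struct Base {
    explicit Base(int v) : value(v) {}

    // Two overloads of the same member function name.
    int scale(int k) const { return value * k; }
    double scale(double k) const { return value * k; }

    int value;
};

// Inheriting constructors: Derived gains a constructor taking int.
struct Derived : Base {
    using Base::Base;
};

// The declared pointer type tells the compiler which overload is meant.
inline int scale_via_pointer(const Base& b, int k) {
    int (Base::*int_scale)(int) const = &Base::scale;
    return (b.*int_scale)(k);
}
```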

score 8 · Answer 2 · answered Jun 21 '14 at 07:45

A constructor is a function that you call when the object does not yet exist, so it could not be a member function. It could be static.

A constructor actually gets called with a this pointer, after the memory has been allocated but before it has been completely initialised. As a consequence a constructor has a number of privileged features.
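
The two-step sequence described above (allocate first, then construct into that storage) can be written out by hand. This is a sketch for illustration only: `Widget` and `build_and_destroy` are invented names, and real code would normally just write `new Widget(42)` or use a smart pointer.

```cpp
#include <new>
#include <string>

// Hypothetical type used only for this illustration.
struct Widget {
    explicit Widget(int n) : label(std::to_string(n)) {}
    std::string label;
};

inline std::string build_and_destroy() {
    // Step 1: operator new only allocates raw storage; no Widget exists yet.
    void* raw = ::operator new(sizeof(Widget));

    // Step 2: placement new runs the constructor with `this` pointing at raw.
    Widget* w = new (raw) Widget(42);
    std::string copy = w->label;

    // Tear-down mirrors construction: destructor first, then deallocation.
    w->~Widget();
    ::operator delete(raw);
    return copy;
}
```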

If you had a pointer to a constructor it would either have to be a static pointer, something like a factory function, or a special pointer to something that would be called immediately after memory allocation. It could not be an ordinary member function and still work as a constructor.
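
The "factory function" alternative might look like the sketch below. `Point`, `from_xy`, `mirrored`, `PointFactory`, and `build` are hypothetical names chosen for the example. Static member functions are ordinary functions, so an ordinary function pointer is enough to select between them.

```cpp
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Static factory functions: these have addresses like any free function.
    static Point from_xy(int x, int y) { return Point(x, y); }
    static Point mirrored(int x, int y) { return Point(y, x); }

    int x() const { return x_; }
    int y() const { return y_; }

private:
    int x_;
    int y_;
};

// Plain function-pointer type; no special "pointer to constructor" is needed.
using PointFactory = Point (*)(int, int);

// Builds a Point through whichever factory the caller passes in.
inline Point build(PointFactory make, int a, int b) { return make(a, b); }
```

Unlike constructors, the factories have distinct names, so several of them can share one signature and still be told apart.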

The only useful purpose that comes to mind is a special kind of pointer that could be passed to the new operator to allow it to indirect on which constructor to use. I guess that could be handy, but it would require significant new syntax and presumably the answer is: they thought about it and it wasn't worth the effort.

If you just want to refactor out common initialisation code then an ordinary member function is usually a sufficient answer, and you can get a pointer to one of those.

answered Jun 21 '14 at 07:45

david.pfx

8,125

This seems like the most correct answer. I recall an article from many (many) years ago concerning operator new and the internal workings of “the new operator”. operator new() allocates space. The new operator calls the constructor with that allocated space. Taking the address of a constructor is “special” because calling the constructor requires space. The access for calling a constructor like this is with the placement new. – Bill Door Jun 21 '14 at 17:11
1

The word "exist" obscures the detail that an object can have an address and have allocated memory but not be initialized. On member function or not, I think getting the this pointer makes a function a member function because it's clearly associated with an object instance (even if uninitialized). That said, the answer raises a good point: the constructor is the only member function that can be called on an uninitialized object. – Praxeolitic Jun 21 '14 at 21:31
Nevermind, apparently they have the designation of "special member functions". Clause 12 of the C++11 standard: "The default constructor (12.1), copy constructor and copy assignment operator (12.8), move constructor and move assignment operator (12.8), and destructor (12.4) are special member functions." – Praxeolitic Jun 21 '14 at 22:07
1

And 12.1: "A constructor shall not be virtual (10.3) or static (9.4)." (my emphasis) – Praxeolitic Jun 21 '14 at 22:12
@Praxeolitic: Of course. If it was not clear, I'm saying that this is what would be needed and the consequences would have major impacts on the language. – david.pfx Jun 22 '14 at 04:10
1

The fact is that if you compile with debug symbols and look at a stack trace, there actually is a pointer to the constructor. What I was never able to find is the syntax to get this pointer (&A::A doesn't work in any of the compilers I tried). – alfC May 05 '16 at 08:37
@Praxeolitic: Non-constructor member functions can be called on an uninitialized or partially-initialized object. This is what allows MyClass::MyClass() { init(); } to work. – dan04 Sep 09 '16 at 21:16
1

@Praxeolitic: in your example, the object is fully initialised before init() is called. Base class constructors have been called and member initialisation is complete. The contents of init() are equivalent to statements within the constructor. – david.pfx Sep 11 '16 at 00:39
@david.pfx I think you meant to ping @dan04? – Praxeolitic Sep 11 '16 at 00:42
@Praxeolitic: oops, sorry about that. Bit late to edit now. – david.pfx Sep 11 '16 at 07:19
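
The init() exchange in the comments above can be made concrete with a small sketch. `Logger` and its members are hypothetical names. The point it illustrates is that the member initialiser list has finished before the constructor body starts, so an ordinary member function called from the body sees constructed members.

```cpp
#include <string>

class Logger {
public:
    Logger() : prefix_("log: ") {
        // By the time the body runs, prefix_ and banner_ have already been
        // constructed, so calling an ordinary member function is well defined.
        init();
    }

    const std::string& banner() const { return banner_; }

private:
    void init() { banner_ = prefix_ + "ready"; }

    std::string prefix_;
    std::string banner_;
};
```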

score -3 · Answer 3 · answered Oct 10 '15 at 10:17

This is because there is no return type for a constructor, and you are not reserving any space for the constructor in memory like u do for a variable during declaration. For example: if u simply write X, the compiler will generate an error because it will not understand what this means. But when you write int x; the compiler knows that it is a variable of type int, so it will reserve some space for the variable.

Conclusion: because the return type is excluded, the constructor does not get an address in memory.

answered Oct 10 '15 at 10:17

lovish Goyal

1

1

The code in the constructor has to have an address in memory because it has to be somewhere. There's no need to reserve space for it on the stack but it must be somewhere in memory. You can take the address of functions that don't return values. void (*fptr)(); declares a pointer to a function with no return value. – Praxeolitic Oct 10 '15 at 10:22
2

You missed the point of the question - the original post asked about taking the address of the code for the constructor, not the result that the constructor provided. In addition, on this board, please use full words: "u" is not an acceptable replacement for "you". – BobDalgleish Oct 10 '15 at 12:20
Mr praxeolitic, I think if we don't mention any return type then compiler will not set a particular memory location for ctor and it location is set internally.... Can we fetch the address of any thing in c++ which is not given by compiler? If am wrong then please correct me with correct answer – lovish Goyal Jun 07 '16 at 18:21
And also tell me about reference variable. Can we fetch the address of reference variable? If no then what address printf("%u",&(&(j))); is printing if &j=x where x=10? Because address printed by printf and address of x are not same – lovish Goyal Jun 07 '16 at 18:34

Dmytro · Answer 4 · 2016-09-08T23:04:36.510

-4

I'll take a wild guess:

C++ constructor and destructor are not functions at all: they are macros. They get inlined into the scope where the object is created, and the scope where the object is destroyed. In turn, there is no constructor nor destructor, the object just IS.

Actually, I think the other functions in the class are not functions either, but inline functions that DON'T get inlined because you take the address of them (the compiler realizes you're onto it and doesn't inline, or inlines the code into the function and optimizes that function), and in turn the function seems to "still be there", even though it would not be if you hadn't taken its address.

The virtual table of the C++ "object" is not like a JavaScript object, where you can get its constructor and create objects from it at runtime via new XMLHttpRequest.constructor, but rather a collection of pointers to anonymous functions that act as a means to interface with this object, excluding the ability to create the object. And it doesn't even make sense to "delete" the object, because it's like trying to delete a struct; you can't: it's just a stack label, so just write to it as you please under another label. You are free to use a class as 4 integers:

/* i imagine this string gets compiled into a struct, one of which's members happens to be a const char * which is initialized to exactly your string: no function calls are made during construction. */
std::string a = "hello, world";
int *myInt = (int *)(*((void **)&a));
myInt[0] = 3;
myInt[1] = 9;
myInt[2] = 20;
myInt[3] = 300;

There is no memory leak, there is no issues, except you effectively wasted a bunch of stack space that's reserved for the object interfacing and the string, but it's not going to destroy your program(as long as you don't try to use it as a string ever again).

Actually, if my earlier assumptions are correct, the complete cost of the string is just the cost of storing these 32 bytes and the constant string space: the functions are only used at compile time, and may as well get inlined and tossed away after the object is created and used (as if you were working with a struct and only referred to it directly without any function calls; sure, there are duplicate calls instead of function jumps, but this is usually faster and uses less space). In essence, whenever you call any function, the compiler just replaces that call with the instructions to literally do it, with exceptions that the language designers have set.

Summary: C++ objects have no idea what they are; all the tools for interfacing with them are inlined statically, and lost at runtime. This makes working with classes as efficient as filling structs with data, and directly working with that data without calling any functions at all(these functions are inlined).

This is completely different from the approaches of COM/ObjectiveC as well as javascript, which retain the type information dynamically, at the cost of runtime overhead, memory management, calls of constructions, as the compiler can't throw this information away: it's necessary for dynamic dispatch. This in turn gives us the ability to "Talk" to our program at runtime, and develop it while it is running by having reflectable components.

edited Sep 08 '16 at 23:04

answered Sep 08 '16 at 22:52

Dmytro

123

2

sorry, but some parts of this "answer" are either wrong or dangerously misleading. Sadly, the comment space is far too small to list them all (most methods won't get inlined, this would prevent virtual dispatch and bloat the binary; even if inlined, there might be an addressable copy somewhere accessible; irrelevant code example that in the worst case corrupts your stack and in the best case doesn't fit your assumptions; ...) – hoffmale Sep 09 '16 at 03:18
The answer is naive, I just wanted to express my guess as to why constructor/destructor cannot be referenced. I agree that in the case of virtual classes, the vtable must persist, and addressable code must be in memory so that the vtable can reference it. However, classes that do not implement a virtual class, seem to be inlined, as in the case of std::string. Not everything gets inlined, but things that do not seem to be minimally put into an "anonymous" code block somewhere in memory. Also, how does the code corrupt the stack? Sure we lost the string, but otherwise all we did is reinterpret. – Dmytro Sep 09 '16 at 04:15
Memory corruption occurs in a computer program when the contents of a memory location are unintentionally modified. This program does it intentionally and does not try to use that string anymore, so there is no corruption, just wasted stack space. But yes, string's invariant is no longer maintained, it does clutter the scope(at the end of which, the stack is recovered). – Dmytro Sep 09 '16 at 04:19
depending on the string implementation you might write over bytes you don't want to. If string is something like struct { int size; const char * data; }; (like you seem to assume) you write 4 * 4 Bytes = 16 bytes on a memory address where you only reserved 8 bytes on a x86 machine, so 8 bytes are written over other data (which can corrupt your stack). Fortunately, std::string normally has some in-place optimisation for short strings, so it should be large enough for your example when using some major std implementation. – hoffmale Sep 09 '16 at 05:23
@hoffmale you are absolutely right, it could be 4 bytes it could be 8, or even 1 byte. However once you know the size of the string, you also know that that memory is on stack in the current scope, and you can use it as you please. My point was that if you do know the structure, it is packed in a way independant of any information about the class, unlike COM objects that have a uuid identifying their class as part of IUnknown's vtable. In turn, the compiler accesses this data directly via inlining, or mangled static functions. – Dmytro Sep 09 '16 at 05:26
That said you are right in that it is misleading to say that it's always 4 bytes. That's implementation specific, but each string supports sizeof to check how much stack space they take up, as long as it was created without dynamic allocation functions. – Dmytro Sep 09 '16 at 05:29
my point is: the actual string data (those 13 chars) are not on the stack! Most compilers will put those into a static section of the executable and all you will get is a pointer to that, so you only get 4 bytes for the pointer (not 13 bytes for the whole string data), which means your calculations for what is "safe" to overwrite are still off. Even if the compiler doesn't do this, std::string might still place those chars on the heap... – hoffmale Sep 09 '16 at 05:30
Let us continue this discussion in chat. – hoffmale Sep 09 '16 at 05:32
@hoffmale no, it's in text section so that they throw an exception if you try to alter them(const). It is unused data, but it's still there untouched. I am not advocating for doing this, as it is bad practice, but I am merely emphasizing the structure/properties of this static construct. In the case of dynamic constructs, overriding this data would lose references to dynamic memory, which is at least a memory leak, and can even be a system resource leak. That said, If the class has a destructor that references this structure, doing this can crash the program. – Dmytro Sep 09 '16 at 05:46
