22

I have just realized something disturbing. Every time I have written a method that accepts a std::string as a paramater, I have opened up myself to undefined behaviour.

For example, this...

void myMethod(const std::string& s) { 
    /* Do something with s. */
}

...can be called like this...

char* s = 0;
myMethod(s);

...and there's nothing I can do to prevent it (that I am aware of).

So my question is: How does someone defend themself from this?

The only approach that comes to mind is to always write two versions of any method that accepts an std::string as a parameter, like this:

void myMethod(const std::string& s) {
    /* Do something. */
}

void myMethod(char* s) {
    if (s == 0) {
        throw std::exception("Null passed.");
    } else {
        myMethod(string(s));
    }
}

Is this a common and/or acceptable solution?

EDIT: Some have pointed out that I should accept const std::string& s instead of std::string s as a parameter. I agree. I modified the post. I don't think that changes the answer though.

  • 1
    Hooray for leaky abstractions! I'm not a C++ developer, but is there any reason why you couldn't check the string object's c_str property? – Mason Wheeler Dec 08 '13 at 19:05
  • 1
    I think the problem would start even before I could call c_str(). I read (but don't have handy) that the initialization of std::string with 0 leads to undefined behaviour. – John Fitzpatrick Dec 08 '13 at 19:08
  • 6
    just assigning 0 into the char* constructor is undefined behavior, so it's the callers fault really – ratchet freak Dec 08 '13 at 19:10
  • 4
    @ratchetfreak I didn't know that char* s = 0 is undefined. I've seen it at least a few hundred times in my life (usually in the form of char* s = NULL). Do you have a reference to back that up? – John Fitzpatrick Dec 08 '13 at 19:14
  • 3
    I meant to the std:string::string(char*) constructor – ratchet freak Dec 08 '13 at 19:15
  • 1
    @ratchetfreak Sorry I misunderstood. – John Fitzpatrick Dec 08 '13 at 19:20
  • 1
    (We have nullptr now) – James Dec 08 '13 at 23:15
  • 2
    I think your solution is fine, but you should consider not doing anything at all. :-) Your method quite clearly takes a string, in no way is passing a null pointer when calling it a valid action - in the event that a caller is accidentally bunging nulls about to methods like this the sooner it blows up on them (rather than being reported in a log file, for example) the better. If there were a way to prevent something this at compile time then you should do it, otherwise I'd leave it. IMHO. (BTW, are you sure you couldn't be taking a const std::string& for that parameter...?) – Grimm The Opiner Dec 09 '13 at 15:46
  • @user1158692 You're right I should take a const string& there unless there's a reason not to, which this snippet doesn't have. – John Fitzpatrick Dec 09 '13 at 18:00
  • @JohnFitzpatrick Hardly fixes your issue though. That said, taking a non const reference would as an implicit conversion to a temporary isn't allowed with a non const ref. Not much of a solution though. – Grimm The Opiner Dec 09 '13 at 18:13
  • Side note: taking string as a parameter isn’t ideal because passing a literal implies an extra allocation and copy. The best solution is boost::string_ref, which will be standardised in C++14 as std::string_ref. – Jon Purdy Dec 13 '13 at 12:00
  • The pointer being 0 (or NULL or nullptr), is nothing particular here, I think. Any pointer can potentially produce an error if there is no terminating character \0 somewhere after. So anything that was not generated with char w[] = "string" or manipulated carefully is potentially problematic for the std::string constructor. – alfC Jan 21 '15 at 05:13

7 Answers7

21

I don't think you should protect yourself. It's undefined behavior on the caller's side. It's not you, it's the caller who calls std::string::string(nullptr), which is what is not allowed. The compiler lets it be compiled, but it lets other undefined behaviors be compiled as well.

The same way would be getting "null reference":

int* p = nullptr;
f(*p);
void f(int& x) { x = 0; /* bang! */ }

The one who dereferences the null pointer is making UB and is responsible for it.

Moreover, you cannot protect yourself after undefined behavior has happened, because the optimizer is in its full right to assume that the undefined behavior had never happened, so the checks if c_str() is null can be optimized out.

Domenic
  • 684
  • 7
  • 15
Vlad
  • 375
  • There's a beautifully written comment which says much the same thing, so you must be right. ;-) – Grimm The Opiner Dec 10 '13 at 10:14
  • 2
    The phrase here is protect against Murphy, not Machiavelli.

    A well versed malicious programmer can find ways to send ill formed objects to even create null references if they try hard enough, but that's their own doing and you might as well let them shoot themselves in the foot if they really want to.

    The best you can be expected to do is prevent accidental errors. In this case it is rare indeed that someone would accidentally pass in a char * s = 0; to a function that asks for a well formed string.

    – YoungJohn Dec 16 '13 at 20:34
2

I also encountered this problem a few years ago and I found it to be a very scary thing indeed. It can happen by passing a nullptr or by accidentally passing an int with value 0. It's really absurd:

std::string s(1); // compile error
std::string s(0); // runtime error

However, in the end this only bugged me a couple of times. And each time it caused an immediate crash when testing my code. So no night-long sessions will be required to fix it.

I think overloading the function with const char* is a good idea.

void foo(std::string s)
{
    // work
}

void foo(const char* s) // use const char* rather than char* (binds to string literals)
{
    assert(s); // I would use assert or std::abort in case of nullptr . 
    foo(std::string(s));
}

I wish a nicer solution was possible. However, there isn't.

2

The code below gives a compile failure for an explicit pass of 0, and a run-time failure for a char* with value 0.

Note that I do not imply that one should normally do this, but no doubt there may be cases where protection from the caller error is justified.

struct Test
{
    template<class T> void myMethod(T s);
};

template<> inline void Test::myMethod(const std::string& s)
{
    std::cout << "Cool " << std::endl;
}

template<> inline void Test::myMethod(const char* s)
{
    if (s != 0)
        myMethod(std::string(s));
    else
    {
        throw "Bad bad bad";
    }
}

template<class T> inline void Test::myMethod(T  s)
{
    myMethod(std::string(s));
    const bool ok = !std::is_same<T,int>::value;
    static_assert(ok, "oops");
}

int main()
{
    Test t;
    std::string s ("a");
    t.myMethod("b");
    const char* c = "c";
    t.myMethod(c);
    const char* d = 0;
    t.myMethod(d); // run time exception
    t.myMethod(0); // Compile failure
}
Keith
  • 473
  • 2
  • 5
1

How about changing the signature of your method to:

void myMethod(std::string& s) // maybe throw const in there too.

That way the caller has to create a string before they call it, and the sloppiness you are worried about will cause problems before it gets to your method, making obvious, what others have pointed out, that it is the caller's error not yours.

Justsalt
  • 141
  • yes I should have made it const string& s, I forgot actually. But even so, aren't I still vulnerable to undefined behaviour? The caller can still pass a 0, right? – John Fitzpatrick Dec 13 '13 at 13:03
  • 2
    If you use a non-const reference, then the caller can no longer pass 0, because temporary objects won't be allowed. However, using your library would become much more annoying (because temporary objects won't be allowed) and you're giving up on const correctness. – Josh Kelley Dec 13 '13 at 20:13
  • I once used a non-const reference parameter to a class where I had to be able to store it and ensured therefore that there was no conversion or temporary being passed in. – CashCow Oct 30 '14 at 11:38
1

How about you provide on overload the takes an int parameter?

public:
    void myMethod(const std::string& s)
    { 
        /* Do something with s. */
    }    

private:
    void myMethod(int);

You don't even have to define the overload. Trying to call myMethod(0) will trigger a linker error.

fredoverflow
  • 6,874
0

Your method in the first code block will never be called if you try to call it with a (char *)0. C++ will simply try to create a string and it will throw the exception for you. Did you try it?

#include <cstdlib>
#include <iostream>

void myMethod(std::string s) {
    std::cout << "s=" << s << "\n";
}

int main(int argc,char **argv) {
    char *s = 0;
    myMethod(s);
    return(0);
}


$ g++ -g -o x x.cpp 
$ lldb x 
(lldb) run
Process 2137 launched: '/Users/simsong/x' (x86_64)
Process 2137 stopped
* thread #1: tid = 0x49b8, 0x00007fff99bf9812 libsystem_c.dylib`strlen + 18, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00007fff99bf9812 libsystem_c.dylib`strlen + 18
libsystem_c.dylib`strlen + 18:
-> 0x7fff99bf9812:  pcmpeqb (%rdi), %xmm0
   0x7fff99bf9816:  pmovmskb %xmm0, %esi
   0x7fff99bf981a:  andq   $15, %rcx
   0x7fff99bf981e:  orq    $-1, %rax
(lldb) bt
* thread #1: tid = 0x49b8, 0x00007fff99bf9812 libsystem_c.dylib`strlen + 18, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00007fff99bf9812 libsystem_c.dylib`strlen + 18
    frame #1: 0x000000010000077a x`main [inlined] std::__1::char_traits<char>::length(__s=0x0000000000000000) + 122 at string:644
    frame #2: 0x000000010000075d x`main [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(this=0x00007fff5fbff548, __s=0x0000000000000000) + 8 at string:1856
    frame #3: 0x0000000100000755 x`main [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(this=0x00007fff5fbff548, __s=0x0000000000000000) at string:1857
    frame #4: 0x0000000100000755 x`main(argc=1, argv=0x00007fff5fbff5e0) + 85 at x.cpp:10
    frame #5: 0x00007fff92ea25fd libdyld.dylib`start + 1
(lldb) 

See? You have nothing to worry about.

Now if you want to catch this a bit more gracefully, you should simply not be using char *, then the problem won't arise.

vy32
  • 632
  • 4
    constructing a std::string with a nullpointer will be undefined behavior – ratchet freak Dec 08 '13 at 19:17
  • 2
    the EXC_BAD_ACCESS exception sounds like a null dereference which will crash your program with a segfault on a good day – ratchet freak Dec 08 '13 at 19:20
  • @vy32 Some of the code I write that accepts std::string goes into libraries used by other projects where I am not the caller. I am looking for a way to gracefully handle the situation and inform the caller (maybe with an exception) that they passed an incorrect argument without crashing the program. (OK, granted, the caller might not handle an exception I throw and the program will crash anyway.) – John Fitzpatrick Dec 08 '13 at 19:24
  • 2
    @JohnFitzpatrick you are not going to be aable to protect yourself from a nullpointer passed into std::string unless you can convince the standards comity to have it be a exception instead of undefined behavior – ratchet freak Dec 08 '13 at 19:41
  • @ratchetfreak In a way I think that's the answer I was looking for. So basically I have to protect myself. – John Fitzpatrick Dec 08 '13 at 20:50
  • Can this exception be caught? I'm pretty sure it's a crash and to make matters worse it only crashes when the string is used, not when it's created. – James Dec 08 '13 at 23:18
  • @JohnFitzpatrick, yes, you can catch a null pointer on most systems with an appropriate signal handler. However, that is the wrong approach. You are correct that the code that you originally posted will catch the attempt to call your library with a null pointer, but that is only because you have used overloading and are checking the char * before the std::string object is created. Another approach would be to use a subclass of std::string which does not allow for implicit conversion. – vy32 Dec 09 '13 at 04:55
0

If you are worried about char* being potentially null pointers (e.g. returns from external C APIs) the answer is to use a const char* version of the function instead of the std::string one. i.e.

void myMethod(const char* c) { 
    std::string s(c ? c : "");
    /* Do something with s. */
}

Of course you will need the std::string version as well if you want to allow those to be used.

In general it's best to isolate calls to external APIs and marshal arguments into valid values.