Providing more information about incorrect widget retrieving by name

GetterSetter · 07 April 2024, 17:03:41

If there is an error in the name of a widget and we try to access it, we will get a segmentation fault without any explanation. Sometimes, it can be difficult to determine whether the problem is actually in the naming, e.g.:

Code Select

gui.add(picture, "picture");
//...
func(gui.get<...>("picturW")); //(1)

//////////
func(Picture::Ptr pic):
     pic->getRenderer()->setTexture(texture); //(2) We will get a segmentation fault only there! 
//And in my case I truly believed that the problem was with the setTexture() function

So, my proposal is to throw an exception like std::out_of_range at the step marked as (1).

texus · 07 April 2024, 17:14:39

The "get" function isn't guaranteed to succeed, the documentation states that it will return a "nullptr" if it can't find the widget.

The only thing that I could do to improve this situation is to create an alternative getter that would throw an exception instead of returning a nullptr.

Does the debugger give you an address when the segmentation fault occurs? It should, and the address should be very close to 0 for this crash. If you see this, then the first thing to check is if you are calling a function on a nullptr. This can be helpful to more easily find what is causing the crash.

Another useful thing to look at if you get a crash is the call stack. If you have the debug symbols for TGUI then it might even literally contain that the this pointer is 0x0 inside the Picture function.

GetterSetter · 07 April 2024, 17:32:39

Actually, the debugger didn't give me the exact address but there was a variable with 0x0 address (pic->get()->0x0), but I needed to look through the local variables (or start gdb in a terminal) to see it. It was not quite obvious until you pointed it out. I admit that it works as it should, but I suppose that the more practical approach would be to throw an exception in order to save time in the debugger.
Moreover it's illogical to check whether get() returns nullptr if I definitely know that I've put the Widget into the gui

texus · 07 April 2024, 19:09:00

The thing that I'm a bit worried about is that it might not always be as useful to throw there. If you have a try-catch in your main function to print the exception and then return from main (which I sometimes even recommend that people do), then the debugger won't tell you where the exception occurred. Although if I include the widget name in the exception error then you would probably still be able to locate the faulty "get" call.

The usability of detecting an issue by throwing an exception really depends on the platform. On my linux system, any assertion or uncaught exceptions will be printed out in the terminal with the error message, even without a debugger attached. On other platforms, it just crashes without printing a message. I've had multiple people ask me in the past why there code was crashing where it turned out to be an uncaught exception, which is the reason why I often recommend the try-catch in the main function: so that these people would be able to see what is going wrong. The exact same advice is however bad for people who do use a debugger properly, as they get a better output without catching the exception.

The "get" function isn't going to change for backwards compatibility and because I'm not convinced that it should throw (because I try to only throw things that might be called during initialization). So the only thing that I can do is add a new "tryGet" function. The problem with that is that you need to change your code to always use the new function everywhere, and I'm not sure if it really helps much. The debugger will stop inside the "tryGet" function on failure, so you would still need to get a call stack to figure out where the function was called. The exact same "bt" command in gdb would probably also tell you that there is a nullptr in "pic->getRenderer()->setTexture(texture);" though. Although I admit that the call stack with the exception would be slightly more useful if you don't know what to look for.

GetterSetter · 07 April 2024, 20:54:56

Well, I understand your worries about backwards compatibility, but maybe it would be a good idea to print out something with cerr/clog, for example "Widget <name> not found" to inform the developer? SFML does that and sometimes such messages clarify the details.
IMHO, some feedback is needed in these cases. Since we can't catch these errors during compilation, we must be able to detect them at runtime without causing an abnormal termination (or at least we should know why this happened).

texus · 07 April 2024, 22:51:12

The issue is that calling the get function with an incorrect name isn't technically an error, it could be intentional. Otherwise it would be simple: I would just add an assert in the get function, that would immediately alert the developer that he's doing something invalid.

Maybe I could consider some global flag that you could set to cause a get() function to crash when used with an invalid name, but I would first need to verify that TGUI isn't relying on a nullptr being returned internally.

In TGUI 0.6 I had my own smart pointer type instead of using std::shared_ptr. That actually had one big advantage: I could control what happened when attempting to dereference a nullptr. I don't think it's worth going back to something like that, but it would allow better debug information as well. This is however something that I feel the debugger should catch for you. Similar to how "operator[]" asserts in debug mode but not in release mode, I feel like a nullptr dereference should be something that the debugger very clearly tells you about when running the code in debug mode. Some debuggers do a better job than others at showing where the real issue lies in these cases.

GetterSetter · 08 April 2024, 10:53:16

I agree that it is not actually an error, but get() is almost always used to retrieve a known widget, so I have already mentioned that it's illogical to do a nullptr check and, consequently, we may not notice this error. However, if there is an intention to call get() using a name given by the user, for example through user input, then I will perform that check regardless. Moreover, in the context of get(), Gui behaves like a container (similar to std::map), and some behavior is expected when accessing out-of-range values. Additionally, using a debugger to locate such errors makes development harder. Therefore, my opinion is that get() should, at least print something to the terminal. I appreciate your suggestion about using a global flag, and I believe it is a good solution. Throwing an exception seems like a more modern approach to using C++. But anyway the final decision rests with you. Would you be so kind as to notify me later about your findings here?

cg923 · 08 April 2024, 18:50:18

As much as I understand the desire for explicit information about why the function "failed," throwing an error here prevents us from handling the situation as we like if get() does return nullptr.

For example, I might want to add a widget if get() returns nullptr, or something like that. So, I definitely don't think that TGUI should throw an error, but perhaps just printing one to the console, although even that might be annoying if I'm expecting the behavior.

GetterSetter · 09 April 2024, 16:17:37

Quote from: cg923 on 08 April 2024, 18:50:18For example, I might want to add a widget if get() returns nullptr, or something like that.

I can also do this in case of an exception being thrown, can't I? Anyway we have already discussed that using exceptions might break backwards compatibility, so the only thing I would be glad to see is some information in the terminal.

Quote from: cg923 on 08 April 2024, 18:50:18might be annoying

I admit that it might be, but isn't it annoying to spend time using a debugger only to figure out that there was a mistake in the naming? As Texus has suggested, we can use a global flag to disable these messages (or we may not print them in the release version of the library, but do it in the debug one).

texus · 09 April 2024, 18:42:13

I've given it some more thought and asked for some extra opinions about this issue on discord yesterday, and my decision is that I'm not going to change the API for this. I don't really want to add another function and extra code just to provide slightly better debug information in the case where the "get" function is used without proper error checking (i.e. testing if returned value is nullptr or not).

QuoteI can also do this in case of an exception being thrown, can't I?

Yes, you could catch the exception to find out that that the widget didn't exist. So no functionality would be lost. The code wouldn't look as clean as a simple if check though.

GetterSetter · 09 April 2024, 19:14:29

Thank you for your reply, texus. I assume that in my case the best way to help myself is to write some wrapper functions, e.g.

Code Select

        template <class WidgetType>
        typename WidgetType::Ptr get(tgui::Gui &gui,const String& widgetName) const
        {
            auto ptr=gui.get<WidgetType>(widgetName);
            if(ptr==nullptr) throw std::runtime_error("reason+widgetName");
            return ptr;
        }

+ the same function which is overloaded for tgui::Container* instead of tgui::Gui.

texus · 09 April 2024, 19:25:22

Yes, that would be the solution.
You would indeed need a second function with tgui::Container::Ptr, unless you make the first parameter a template as well (but then you will need to pass "&gui" as argument when calling the function).

cg923 · 09 April 2024, 19:31:13

Quote from: GetterSetter on 09 April 2024, 16:17:37
Quote from: cg923 on 08 April 2024, 18:50:18For example, I might want to add a widget if get() returns nullptr, or something like that.
I can also do this in case of an exception being thrown, can't I? Anyway we have already discussed that using exceptions might break backwards compatibility, so the only thing I would be glad to see is some information in the terminal.
Quote from: cg923 on 08 April 2024, 18:50:18might be annoying
I admit that it might be, but isn't it annoying to spend time using a debugger only to figure out that there was a mistake in the naming? As Texus has suggested, we can use a global flag to disable these messages (or we may not print them in the release version of the library, but do it in the debug one).

You definitely can handle the errors that the library would throw, of course. But then, that would require that you'd be validating the results of the get() function already anyway, in which case, doesn't your original problem disappear? Validating that way is just a different way of doing that validation that would have prevented your segfault in the first place.

Looking at your previous post, the wrapper function seems like a good solution!

GetterSetter · 09 April 2024, 19:34:55

Quote from: texus on 09 April 2024, 19:25:22make the first parameter a template

Oh, that's even better. Thank you for your advice!

GetterSetter · 09 April 2024, 19:44:32

Quote from: cg923 on 09 April 2024, 19:31:13doesn't your original problem disappear

Actually, it doesn't. If get() throws the exception that isn't caught, the program will terminate and in the terminal I will be able to see what happened (what was the reason to terminate) whereas a nullptr dereference will cause a segfault without any explanation.

cg923 · 09 April 2024, 20:09:24

Quote from: GetterSetter on 09 April 2024, 19:44:32
Quote from: cg923 on 09 April 2024, 19:31:13doesn't your original problem disappear
Actually, it doesn't. If get() throws the exception that isn't caught, the program will terminate and in the terminal I will be able to see what happened (what was the reason to terminate) whereas a nullptr dereference will cause a segfault without any explanation.

Ahh. Gotcha. Now I'm on the same page. I suppose my only reservation toward throwing an error there (I know Texus has already decided) is that I can imagine a case where returning nullptr is useful, and having an error get thrown there means one would need to catch that error and deal with it... which would be a simple rewrite of existing code but could be a breaking change. Wouldn't be the end of the world.