Menu

#1314 Lifecycle of pointers to 'concrete' sub-objects (C#-binding patch included)

None
closed
5
2024-10-23
2013-04-09
Greg Knight
No

When retaining a reference to a 'concrete' object (as opposed to a pointer or reference) within another C++ class, the high-level wrapper of the container class may go out of scope while the high-level wrapper of the concrete remains within scope; the destruction of the high-level wrapper of the container class may destruct the low-level C++ object, causing subsequent accesses to the concrete wrapper to be corrupt. I originally observed this problem while iterating through a vector of rather more complex C++ objects in a C# foreach loop, while attempting to retain a small bit of data from one of them, leading to random crashes and corruption.

To illustrate, I've built an example:

  • dog.h/dog.cxx: Groucho is a concrete inside of a Dog. Groucho knows the Answer.
  • Program.cs: We instantiate a Dog so that we can ask Groucho for the Answer.
  • dog.swig: Binds Groucho + Dog to C#

We query for the answer in two places: First, while we know the dog is whole and hale, and second, after we know the garbage collector has taken him. In the latter case, we reliably observe corruption of Groucho's answer. This corruption occurs because no valid reference to the original Dog remains in the CLR's heap, and the poor beast has been destructed.

I've attached a patch for the C# bindings (against swig-2.0.4, so needs refreshing) that make slightly different use of the HandleRef's Wrapper field, retaining the original Dog as Groucho's Wrapper in the Dog.Groucho property getter. With the attached patch, since C#-Groucho secretly retains a reference to the C#-Dog he's residing within, we don't observe the corruption/crash.

I've used the thus-patched binary extensively, but I'd like to know the SWIG team's perspective on this use case. The patch involves a memory-use risk, in that a given property getter causes the retrieved property to retain a hidden reference to the property owner, unless the typemap for the gotten property says otherwise. To illustrate this, a pointer to an unrelated dog, Lassie, is retained by the Dog class - observe the generated Lassie getter given the above patch.

It's sort of a question of whether to bloat or corrupt; however, ideally, I think we could detect whether we have a concrete (such as Groucho) and need to keep the parent object alive, or if we have a pointer/reference (such as Lassie) for whom Wrapper should just be 'this'. I was not immediately clear on how to achieve this based on cursory inspection of the SWIG code, but may have missed something.

Thoughts? Shall I modernize and test the patch for submission, or do you have recommendations as to how I could fix this in a way less likely to risk bloat in some use cases?

1 Attachments

Discussion

  • Olly Betts

    Olly Betts - 2022-03-21

    This issue still seems to be relevant as it doesn't seem to have been addressed since reporting AFAICS.

    Whether this is something SWIG should attempt to address seems like a question for @wsfulton.

     
  • Olly Betts

    Olly Betts - 2023-08-09
    • labels: --> csharp, patch
    • Group: -->
     
  • William Fulton

    William Fulton - 2024-10-23

    The solution was documented in the C# docs last year, but was only detailed for accessor methods to member variables. The same technique can be used for direct member access and I've just enhanced the docs with this, see the version in master: https://htmlpreview.github.io/?https://github.com/swig/swig/blob/master/Doc/Manual/CSharp.html#CSharp_memory_management_member_variables .

    I can't see the swig-2.0.4-csharp-concrete.patch being accepted as is, but maybe with improvements as some sort of special feature. Instead, using the approach in the newly updated docs, you'd add the following into dog.swig before the final %include:

    %typemap(cscode) Groucho %{
      // Ensure that the GC doesn't collect any Dog instance set from C#
      private Dog dogReference;
      internal void addReference(Dog dog) {
        dogReference = dog;
      }
    %}
    
    // Add a C# reference to prevent premature garbage collection and resulting use
    // of dangling C++ pointer. Intended for methods that return pointers or
    // references to a member variable.
    %typemap(csvarout, excode=SWIGEXCODE2) Groucho* Dog::Marx %{
        get {
          global::System.IntPtr cPtr = $imcall;
          $csclassname ret = null;
          if (cPtr != global::System.IntPtr.Zero) {
            ret = new $csclassname(cPtr, $owner);
            ret.addReference(this);
          }
          $excode
          return ret;
        } %}
    

    This can of course be tweaked and made available in a preprocessor macro if needed for multiple member variables.

     
  • William Fulton

    William Fulton - 2024-10-23
    • status: open --> closed
    • assigned_to: William Fulton
     

Log in to post a comment.

MongoDB Logo MongoDB