Friday, May 20, 2011

How to deal with EXC_BAD_ACCESS triggered by _CFAutoreleasePoolPop

Getting crashes due to EXC_BAD_ACCESS errors means that you're trying to access memory through an invalid pointer. While there can be many ways of messing things up like that, a common one in Objective-C is to over-release an object allocated on the heap. That means that the object is released more times than it is retained.

What makes this an insidious problem is that sometimes those crashes happen at random, are not always reproducible, and the stack trace you get doesn't offer clear clues of what portion of the code is at fault. As you can see from the picture below, all code in the stack trace is system code, none in your control:

Sometimes the stack trace is slightly different, here's another one:

The presence of _CFAutoreleasePoolPop in the trace is a strong indication of the fact that an object that's already been deallocated, as a result of over-release, is now attempted to be released again, during the draining of an autorelease pool.

In such cases you should be looking for 2 pieces of information. What's the address of the object being released, its invalid pointer, and where did that pointer come from in the first place, that is, where in your code it was first allocated. Getting to know the variable name in your code that holds that offending pointer, as it appears at the point of allocation, helps tremendously in solving the issue. All you need to do is search in your code where that variable is being used, where it gets released, and so on. Often times you'll find that you're releasing an object that's already autoreleased (it hasn't been created with a call to a method that starts with "alloc" or "new").

You should be able to somehow reproduce the problem, if you can do that, you're good to go. If reproducing the issue is hard, you might want to try placing breakpoints that stop the execution for a while. In some instances that alters the timing of execution in the system and helps reproducing the issue. Sometimes your only recourse is to run the same code repeatedly until it crashes. You can also try running on the simulator and device. Update: I used this method only on simulator. When running on device you'll get an error when running info malloc-history in GDB. If you can only reproduce the crash on a device you'll need to investigate a way of analyzing the allocation logs, which presumably get stored on the device (e.g. malloc: stack logs being written into /private/var/mobile/Applications/DF97FB1F-546A-4B9A-92AB-B6979B4E28A9/tmp/stack-logs.6413.PubPit.index).

Once you can reproduce the crash, the first thing you need to do is set some environment variables to aid in debugging. In the left pane of Xcode, locate the executable group, and double-click the name of the executable for your project:

Choose the "Arguments" tab in the window that opens and set the NSZombieEnabled, NSDebugEnabled, NSAutoreleaseFreedObjectCheckEnabled and MallocStackLoggingNoCompact like so:

Then run your code, if it crashes, look in the debugger console and you should see something like this:

[Session started at 2011-05-20 11:14:57 +0300.]
GNU gdb 6.3.50-20050815 (Apple version gdb-1510) (Wed Sep 22 02:45:02 UTC 2010)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin".sharedlibrary apply-load-rules all
Attaching to process 40114.
PubPit(40114) malloc: recording malloc stacks to disk using standard recorder
PubPit(40114) malloc: stack logging compaction turned off; size of log files on disk can increase rapidly
PubPit(40114) malloc: process 40076 no longer exists, stack logs deleted from /tmp/stack-logs.40076.PubPit.p5tTh9.index
PubPit(40114) malloc: stack logs being written into /tmp/stack-logs.40114.PubPit.Vpmlum.index
2011-05-20 11:14:59.890 PubPit[40114:207] *** -[CFDictionary release]: message sent to deallocated instance 0x652e280
Data Formatters temporarily unavailable, will re-try after a 'continue'. (Not safe to call dlopen at this time.)

The key info here is:
*** -[CFDictionary release]: message sent to deallocated instance 0x652e280

In my case I have an instance of CFDictionary that was initially allocated at address 0x652e280, which now is deallocated once too many times. The question is where did I ever allocate that object. To find that I do:

(gdb) info malloc-history 0x652e280
Alloc: Block address: 0x0652e280 length: 48
Stack - pthread: 0xa057c540 number of frames: 42
    0: 0x949d617c in malloc_zone_malloc
    1: 0x132080d in _CFRuntimeCreateInstance
    2: 0x132040a in CFBasicHashCreate
    3: 0x1321ebb in __CFDictionaryCreateGeneric
    4: 0x1321bd1 in CFDictionaryCreateMutable
    5: 0x13462ac in parseXMLElement
    6: 0x13469fe in parsePListTag
    7: 0x1345891 in parseXMLElement
    8: 0x1344a94 in _CFPropertyListCreateFromXMLStringError
    9: 0x13435a8 in _CFPropertyListCreateWithData
   10: 0x1399957 in _CFPropertyListCreateFromXMLData
   11: 0xf92ef in +[NSPropertyListSerialization propertyListFromData:mutabilityOption:format:errorDescription:]
   12: 0xfa9c in -[Model loadData] at /Users/bhapca/WORK/Cocoa/Pubpit/Classes/Model.m:171
   13: 0x102b8 in -[Model init] at /Users/bhapca/WORK/Cocoa/Pubpit/Classes/Model.m:251
   14: 0x103fe in +[Model sharedModel] at /Users/bhapca/WORK/Cocoa/Pubpit/Classes/Model.m:268
   15: 0x6c54 in -[PagingViewController loadView] at /Users/bhapca/WORK/Cocoa/Pubpit/Classes/PagingViewController.m:716
   16: 0x4275e3 in -[UIViewController view]
   17: 0x425a57 in -[UIViewController contentScrollView]
   18: 0x436201 in -[UINavigationController _computeAndApplyScrollContentInsetDeltaForViewController:]
   19: 0x434831 in -[UINavigationController _layoutViewController:]
   20: 0x435b4c in -[UINavigationController _startTransition:fromViewController:toViewController:]
   21: 0x430606 in -[UINavigationController _startDeferredTransitionIfNeeded]
   22: 0x548e01 in -[UILayoutContainerView layoutSubviews]
   23: 0xd6c451 in -[CALayer layoutSublayers]
   24: 0xd6c17c in CALayerLayoutIfNeeded
   25: 0xd6537c in _ZN2CA7Context18commit_transactionEPNS_11TransactionE
   26: 0xd650d0 in _ZN2CA11Transaction6commitEv
   27: 0x37c19f in -[UIApplication _reportAppLaunchFinished]
   28: 0x37c659 in -[UIApplication _runWithURL:payload:launchOrientation:statusBarStyle:statusBarHidden:]
   29: 0x386db2 in -[UIApplication handleEvent:withNewEvent:]
   30: 0x37f202 in -[UIApplication sendEvent:]
   31: 0x384732 in _UIApplicationHandleEvent
   32: 0x1cdda36 in PurpleEventCallback
   34: 0x13526f7 in __CFRunLoopDoSource1
   35: 0x134f983 in __CFRunLoopRun
   36: 0x134f240 in CFRunLoopRunSpecific
   37: 0x134f161 in CFRunLoopRunInMode
   38: 0x37bfa8 in -[UIApplication _run]
   39: 0x38842e in UIApplicationMain
   40: 0x1e74 in main at /Users/bhapca/WORK/Cocoa/Pubpit/main.m:14
   41: 0x1e05 in start

BINGO ! My code that allocates that object and the variable used to point to that object is in: -[Model loadData] at /Users/bhapca/WORK/Cocoa/Pubpit/Classes/Model.m:171

Now all I need is too look at that code and figure out where that object gets released more than needed.

It's important to remove those NSZombieEnabled, NSDebugEnabled, NSAutoreleaseFreedObjectCheckEnabled and MallocStackLoggingNoCompact environment variables, once you're done with debugging, to prevent them to affect the released version of your app.

Credit goes to this site, where I first learned about MallocStackLoggingNoCompact.

Update: Apparently the same investigation can be performed using the Instruments tool that comes with Xcode 4, by choosing the Zombies trace template. I believe that is the method now recommended by Apple for dealing with this type of issues.

1 comment:

Anonymous said...

Thanks a lot for this. This was a big help!