Super Source

A couple of days ago I read an interesting article by Pierre-Yves Ricau, an Android Software Engineer for Square, where he listed reasons for why Square has moved away from using Fragments in their Android products. In it he mentioned the ridiculous life cycle of Fragments, how it makes them at times hard to handle and even harder to test against. And I agree with him. Fragments are easy to use in smaller projects, but in my experience don’t scale linearly in difficulty as the project grows. And if you start bundling many of them together at the same time and even throw in some animation you usually have a recipe for a minor disaster.

Now a short while after reading that article I had a situation where some freshly made code came back from testing with a bug ticket attached to it. A couple of Listviews (contained inside a Fragment) weren’t being updated as new data was being stored in the local database.

Alright, I thought. Something must have gone wrong with storing the new data into the database. Maybe I just forgot to update the database with the schema. Huh, no wasn’t that. Alright, did the values actually get stored? I pulled the database and had a look.

To make a long story short, the data was there, getting stored correctly and the result was handled in the manner designed. It was time to see what was happening under the hood.

A little side note, pulling a whole database can easily be done using ADB with the following command:

adb -d 'run-as <your-package-name> cat /data/data/<your-package-name>/databases/<your-database-name> > /sdcard/<your-database-name>'

And then simply do a pull operation.

adb pull /sdcard/<your-database-name> <destination>

Breakpoint time! Link to heading

The Fragment uses Loaders, and LoaderCallbacks, to access the database and display the results to the user. Those callbacks have two overridden methods. onCreateLoader(id, args) gets called when the Loader is created through the LoaderManager, and it initializes a custom Loader with parameters telling it what data to load. This method was called, and it executed a database operation as instructed.

So far so good.

The second method is onLoadFinished(Loader, data) and it is invoked when the loader has finished (duh!). Except.. it didn’t. That didn’t make sense! Looking at the change logs I could see that nothing had changed in the implementation of the loaders, or the callbacks. Everything should have worked!

At this point I started pouring over the new code, trying to find a single reference to something that might shed a light on what was happening. After finding nothing, I checked out the latest stable build and started manually adding in the new code, testing between every addition. Finally I had copied everything over and the bug was not present! At this point I was astonished about what was happening. I could have decided that it had just been glitch in the matrix, or I could go back to my branch and find out what was actually happening.

Introducing the Rubber duck Link to heading

I sighed loudly enough that a coworker noticed and asked what was going on. I went into rubber-duck-debugging mode during which he noticed that the onStart() method was calling super.onResume() instead of super.onStart(). I casually fixed it and then resumed the debugging session.

@Override
public void onStart(){
  super.onResume();
  // Code ...
}

@Override
public void onStart(){
  super.onStart();
  // Code ...
}

Turns out that this was it! It is not hard to miss this slight difference, as can be seen in the code above. And this wasn’t even part of the new code. I had no idea that there had been a change there, even though later I noticed it had shown up in the change logs (guess I focused to intently on more meatier changes).

But why had this caused my Loader to initialize, run, but not receive a response back?

Always go full source Link to heading

It has been suggested that your first response to someone’s complaints about problems in the stack is to point them to the source code. And so that is what I did.

Looking at the source code for Fragments, and the onStart() method in particular, you can see that there is a flag ‘mCalled’ that gets flipped, as well as some LoaderManager initialization. If you have ever used the Android SDK you may have had an SuperNotCalledException raised at one point. And in the stacktrace you get a reference to android.app.Fragment.performStart(Fragment.java:1721. Looking at it you can see that same flag ‘mCalled’ is used to verify, and force, that the call to onStart() passes into the default implementation at some point. If not, then a exception is raised.

Alright, now lets look at onResume(). Yes, your eyes do not deceive you. The only thing this method seems to do is to flip ‘mCalled’ to true. Why? Oh only so that performResume() may verify that you called super and throw an exception if you didn’t….. And since I did a wrong kind of super, there was no initialization of the LoaderManager and none of the Loaders got started, but by calling super.onResume() I was able to avoid getting an exception. Thus completely obscuring what had actually broken, and with no reasonable connection between the cause and the issue.

Further more if you look at that source code I’ve linked above you can see how Loaders are automatically started by the fragment as soon as onStart() is invoked. This was a problem in this same product of ours where it caused the Fragment’s entry animations to lag since the Loaders were trying to load data into the Listviews before the animation was even finished. Our way of dealing with that problem was to register an animationListener on the Fragment, and attempt to initialize the Loaders when the animation had finished. And we then had to unregister the loaders when the fragment was moved onto the backstack, since next time the fragment would be displayed the loaders would get fired up during the animation cycle and everything would lag.

Conclusion Link to heading

So what is the point I’m trying to make here? Well, by making a slight error at one point I was able to throw a wrench into the gears in such a way that the engine worked perfectly, just not in 2nd or 5th gear. And there is no reasonable way that anybody would make a unit test that could discover this case, because all the crucial logic required for this bug to surface is ‘hidden’ in the source of the Android SDK.

So yeah Fragments are painfully unstable and fickle beasts that will become mad at you if you as much as look at them in a wrong way. I may just start mirroring what Pierre and Square have decided to do, and do away with Fragments all together in future projects, just to see if doing away with them all together is easier then living with them.