September 06, 2005
Setting pointers to null is useful
This discussion on TheServerSide is showing people up in arms about the practice of setting pointers to null:
Setting pointers to null does help the garbage collector. Regardless of the type of garbage collector you are using (generational, mark and sweep, etc.), setting a pointer to null allows the algorithm to skip a step in its implementation. If you don't set the pointer to null, the virtual machine will try to follow this pointer and will, later, realize that the pointed object only has one reference, and is therefore up for garbage collection. If you set the pointer to null, you save one step. It's as simple as that.
Now, while this practice helps the garbage collector, it doesn't help it by much, and I would argue that for most applications (JSE and JEE), you should never bother using it. It's a bit different for JME, though, where every byte counts and where I recommend (and personally use) this technique with lazy initialization to limit the amount of heap used by my applications.
Finally, one last comment:
This strikes me as a code smell. If you are keeping a pointer alive just in case some other part of your code uses it, you have a bigger problem than memory management. Regardless of your implementation, you should always be able to answer the question: "Is this object still needed after this point or not?". Sometimes, the answer is "maybe" (multithreaded code for example), but in most cases, the answer to this question should be a clear "yes" or "no".
Posted by cedric at September 6, 2005 10:33 AM

Comments

"Setting references to NULL to "help" GC is the easiest way to sign up for a NPE in other parts of the code."
I tend to think that it's one of the nicest benefits of setting pointers to null. That way you're really sure that when an object shouldn't be used anymore, it's not referenced. Better to get an NPE than to modify something that shouldn't be modified.
Posted by: Geert Bevin at September 6, 2005 11:18 AM

Hi Cedric,
I am definitely not an expert on GC, but is your argument that "nulling" helps the garbage collector *no matter what GC algorithm* really correct? If I remember correctly from my reading of Jones and Lins "Garbage Collection" ( http://www.amazon.co.uk/exec/obidos/ASIN/0471941484/qid=1126036524/sr=8-2/ref=sr_8_xs_ap_i2_xgl/202-4837466-3399064 ), this is not true for e.g. mark-and-sweep, as you will never try to follow links in a non-referenced object.
I don't have the book at hand right now, so I just did a quick Google search on mark-and-sweep and took a peek at a totally unvalidated site: http://lambda.uta.edu/cse5317/notes/node47.html
It explains the mark-and-sweep algorithm as I also remember it. An example from the picture on that page: no matter whether the reference from object 2 -> 4 is "nulled" or not, the garbage collector will never follow this reference, as object 2 is not reachable from the root objects when the garbage collector runs.
Please feel free to correct me if I misunderstood your explanation.
Best Regards

No Morten, you are right. My remarks only apply to certain implementations of garbage collectors. Thanks for pointing out my mistake.
Posted by: Cedric at September 6, 2005 01:16 PM

IIRC, Microsoft's .Net GC can consider a reference to be no longer used mid-way through a method. What's the point in setting a reference to null if you have a GC this capable?
Posted by: RichB at September 6, 2005 01:36 PM

Any mark and sweep collector will skip over object references that themselves are dead.
A lingering object reference in a leaf method will have no material effect on the memory usage of the app. Whether the object graph can be collected now or in another 200 clock cycles makes very little difference to how much memory I have available, on average, during program execution.
On the other hand, a lingering object reference in a top-level method, like a main loop (message pump, work queue, scheduler, etc.), may have a dramatic impact on available memory, because the reference will linger for millions of clock cycles longer than strictly necessary. Add to this the fact that one tends to see larger object graphs closer to the 'center' of the app, and this problem is compounded.
I think where opinions differ is that some, like me, consider these to be merely an important exception to a very good rule.
The typical antipattern that many are skirting around here is keeping one large object graph erroneously alive while calculating a replacement for it. If the graph is the single largest memory use in the app, you've virtually doubled your app's memory needs:
BigGraph foo;
instead of
BigGraph foo; while (true)
or, where applicable, the much preferred solution of narrowing your variable scope:
while (true)
For every time I've come across this situation, I've come across hundreds of situations where someone was needlessly nulling an object, and hundreds of thousands of situations where someone could have done so.
Posted by: Jason Marshall at September 6, 2005 02:17 PM

There are instances where you must set a pointer to null in order to avoid a memory leak - read Effective Java for an explicit example. I'd give you the chapter and page, but my copy is at work.
Posted by: at September 6, 2005 03:23 PM

Cedric, take a look at the following article by Brian Goetz: http://www-128.ibm.com/developerworks/java/library/j-jtp01274.html
"For most applications, explicit nulling, object pooling, and explicit garbage collection will harm the throughput of your application, not improve it -- not to mention the intrusiveness of these techniques on your program design. In certain situations, it may be acceptable to trade throughput for predictability -- such as real-time or embedded applications. But for many Java applications, including most server-side applications, you probably would rather have the throughput."
There are some nice examples in the article as well (including cases when nulling is really needed).
Posted by: at September 6, 2005 05:52 PM

TSS has definitely jumped the shark.
Posted by: Patrick Calahan at September 6, 2005 09:31 PM

I blogged (http://kirk.blog-city.com/code_smell_x__null.htm) on this subject a while back after receiving a hideous performance tuning article that suggested that the cure to their performance ills was to null out variables. The technique is about the same as calling a destructor on a C++ object, which leads me to believe that we are entering into a completely different set of questions.
GC mark&sweep works in blocks. All objects are visited in the mark phase regardless of the state of any application reference to them. The next step is a transitive closure on all reachable objects. IOW, if the root of an object is not reachable, then the child nodes will not be visited. So, nulling has no effect on this phase either. The third step is to free memory, and that phase is once again based on the oop table, and once again nulling has no effect on this operation.
Now, where nulling does have an effect is if the placeholder is improperly scoped. If an object survives too long, it will end up in old space. GC in old space is MUCH more expensive than GC in young.
Plus, young is a copy collector, which means that extra objects hanging about mean more work moving them from eden to s1, s2 and back. The cost of GC is mostly in what is left behind, not in what is collected!
> IIRC, Microsoft's .Net GC can consider a
JRockit does this as well.
I'd say that this is what block scoping was invented for. Although it is rarely used, at least intentionally, that's why you can always enclose things with {} and get a scope. First off, it basically negates the need to set to null. Second off, this is the ultimate trick for efficiency nuts, since it clues the VM/compiler in to the fact that this is a nice place to do direct register assignment.
Of course, I bet the first IO call you make negates all of these tricks... If you're using web services or RMI or whatever, then this is the least of your worries.
Posted by: Andy at September 7, 2005 12:54 PM
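
The loop shapes debated in the comments above (holding the old graph while computing its replacement, nulling the reference first, and narrowing the variable's scope) can be sketched roughly as follows. This is only an illustration under stated assumptions: the snippets originally posted in the thread are not reproduced here, and `BigGraph`, its `size` argument, the `GraphRefresh` class and all method names are hypothetical stand-ins. Whether the old graph really becomes collectable at the marked points depends on the VM and on how the method is compiled, so the comments describe reachability at the source level only.

```java
import java.util.ArrayList;
import java.util.List;

class GraphRefresh {
    /** Hypothetical stand-in for a large object graph. */
    static final class BigGraph {
        final List<int[]> nodes = new ArrayList<>();

        BigGraph(int size) {
            for (int i = 0; i < size; i++) {
                nodes.add(new int[16]);
            }
        }

        int size() {
            return nodes.size();
        }
    }

    // Shape 1: foo still refers to the old graph while the new one is built,
    // so both graphs are reachable during construction.
    static int holdWhileRebuilding(int rounds, int size) {
        BigGraph foo = new BigGraph(size);
        for (int i = 0; i < rounds; i++) {
            foo = new BigGraph(size);
        }
        return foo.size();
    }

    // Shape 2: drop the reference before building the replacement,
    // so the old graph is unreachable from foo during construction.
    static int nullBeforeRebuilding(int rounds, int size) {
        BigGraph foo = new BigGraph(size);
        for (int i = 0; i < rounds; i++) {
            foo = null;
            foo = new BigGraph(size);
        }
        return foo.size();
    }

    // Shape 3: declare the variable inside the loop body so that it goes
    // out of scope at the end of every iteration; no explicit nulling.
    static int narrowScope(int rounds, int size) {
        int lastSize = 0;
        for (int i = 0; i < rounds; i++) {
            BigGraph foo = new BigGraph(size);
            lastSize = foo.size();
        }
        return lastSize;
    }
}
```

All three methods compute the same result; they differ only in how long the previous graph stays reachable from the local variable, which is the point the commenters are arguing about.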