OSDev.org

The Place to Start for Operating System Developers
It is currently Mon Apr 29, 2024 12:42 pm

All times are UTC - 6 hours




Post new topic Reply to topic  [ 36 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 2:46 am 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7614
Location: Germany
Colonel Kernel wrote:
Your gut needs to learn about more GC algorithms. :)


My head, perhaps, my gut, no. Give me some time to brush up my Java skills (been busy with C++ the last four years), and I show you a memory leak in about 20-30 lines of Java. If I introduce complexity (GC) to solve a problem (memory management being bug-prone), it better solve the problem for sure or it isn't worth the effort in my book.

But that's just me.

As for the Wiki page "Garbage Collection", I have a serious complaint with this paragraph:

Quote:
One of the major advantages of a compacting collector is that it makes allocating from the heap extremely fast and scalable to multiprocessor systems. All you have to do to allocate a new object is to increment a "top-of-heap" pointer and initialize a few fields, which can be done in constant time. Compare this to a traditional implementation of malloc(), which may need to scan many entries in the free list in order to find a block large enough to satisfy the request. In the worst case, this takes linear time based on the length of the free list. Consider that in a multiprocessor system, holding a global heap lock while scanning the free list is not a very good idea.


What's described here - keeping free blocks in a single list and scanning that on an allocation - is what I've done as a malloc() placeholder in PDCLib with 113 lines of code. Comparing a compacting GC with a 100-line malloc() placeholder is hardly fair.

I don't know about the author's definition of "traditional", but "normal" malloc() implementations hold lists of similar-sized free blocks, and satisfy larger requests using [tt]mmap()[/tt], which means the memory can be returned to the system without any memory-copying.

I am willing to admit that GC has its uses, but I don't agree to biased... well, "propaganda" for lack of a better word.

And then there's the issue of C/C++ allowing pointers, and pointer arithmetics. Unless you want to write your kernel in C#, VB, Java or Python, you simply replace the issue of remembering to do proper memory cleanups with the issue of remembering not to do The Bad Thing (tm) with pointers - which is probably rather hard to avoid in kernel space.

Bottom line, yes, GC in kernel space is most likely possible. But whether it is worth the effort is in the eyes of the beholder. Just like writing an OS in Pascal or going to lengths to make "pure" microkernels. Let's provide NPOV information, instead of trying to make [tt]malloc()[/tt] look worse than it is.

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 3:13 am 
Offline
Member
Member

Joined: Wed Oct 18, 2006 11:59 am
Posts: 1600
Location: Vienna/Austria
Just my 2 c: I wouldn't want a gc thread halt my system critical tasks in kernel land just because of some garbage collecting stuff. I wouldn't find it very funny if a gc tiptoed over the borders of virtual memory management and messed up all the objects allocated for vmm stuff. And I wouldn't want to introduce an additional lock on that.

Further, despite gc sounding quite cool for development it also generates an increasing hankering for lazyness among the not so trained and experienced developers. Omitting a free here and there, ah, the gc will take care of it, I don't mess with it - that sounds most prolly inviting to someone who doesn'
t really want to be busied with programming nitty gritty like freeing allocated objects as soon as they aren't used.

To bring it to a Point: I don't want to rely on gc for freeing my kernel objects. I want to know what happens at each tick in the kernel, what happens upon sending/receiving a message or a sw interrupt other than send/receive. Absolutely. No gc interfering with system calls and irq's. In Kernel Land - that's my requirement - it is necessary to know the absolute reliability of each call path/function. No place for gc messing around at any time.

Well - that's been a big 2 cent, eh?

Stay safe :-)

_________________
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 6:15 am 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7614
Location: Germany
There's the Boehm garbage collector for C/C++, with accompaning documentation, at http://www.hpl.hp.com/personal/Hans_Boehm/gc/. The docs also carry a nicely balanced view on the pro's and con's of GC vs. malloc / new in general.

And, returning to the initial question, we're talking about a novice (no offense intended), who prefers garbage collection due to ease of use. While I accept the usefulness of GC especially for novices / mediocre programmers, implementing such a beast is certainly not a task for someone who wants to use it because he feels uneasy around pointers...

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 8:48 am 
It was less to do with being uncomfortable with Pointers and more that I was interested in looking at the alternatives, after all there is little point re-inventing the wheel if its just going to be a big round thing suspiciously similar to all the other wheels out there. As it is I was hoping to try a few things in my Kernel or supporting modules that may not be common practise, if for no other reason than to find out why they are not common practise :)

However having looked at various GC implementations and considering the requirements of a GC within Kernel space I'm left in no doubt that, to write such a system which will not block standard Kernel operation while it is in use and will still be able to accurately detect (even in a conservative scheme) what memory should or should not be collected is beyond me. In fact I think this would probably make a very nice and sizable side project for someone.

I think for now I'll stick with Malloc, but I would be very interested to hear if anyone does decide to take this further.


Top
  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 9:26 am 
Offline
Member
Member
User avatar

Joined: Wed Oct 18, 2006 2:31 am
Posts: 5964
Location: In a galaxy, far, far away
that makes http://www.osdev.org/osfaq2/index.php/GarbageCollection grow up ;) ... which should please Eleanore

_________________
Image May the source be with you.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 9:46 am 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
Solar wrote:
Give me some time to brush up my Java skills (been busy with C++ the last four years), and I show you a memory leak in about 20-30 lines of Java.


I don't buy that. What you call a "leak" is either "unintentional object retention" (which I admit sounds a bit like a cop-out but is really a different problem), or a bug in your JVM. I look forward to seeing your example code. :)

Solar wrote:
As for the Wiki page "Garbage Collection", I have a serious complaint with this paragraph


So fix it (that's why it's a wiki). It was written by someone who has admittedly never implemented malloc() before. :-[ Based on your description however, it still sounds like it has linear time complexity in the worst case, which is the point I was trying to make. The data point I have to go by is the history of the Windows heap manager, which had to undergo some big changes between NT4 and Win2k because it didn't scale well on multiprocessor systems.

Quote:
I am willing to admit that GC has its uses, but I don't agree to biased... well, "propaganda" for lack of a better word.


I'd suggest finding a better word. What I'm seeing in this thread is a lot of opinion based on "gut feelings" and not much research into how GCs actually work and when to use them, and when not to (which I have already covered several times).

Quote:
And then there's the issue of C/C++ allowing pointers, and pointer arithmetics. Unless you want to write your kernel in C#, VB, Java or Python, you simply replace the issue of remembering to do proper memory cleanups with the issue of remembering not to do The Bad Thing (tm) with pointers - which is probably rather hard to avoid in kernel space.


Agreed. I think several people on this thread (including me) have already said that GC and "unmanaged" languages do not mix.

beyond infinity wrote:
Just my 2 c: I wouldn't want a gc thread halt my system critical tasks in kernel land just because of some garbage collecting stuff.


This is why it's been said a few times that a compacting GC in the kernel is a really bad idea. Note that there are non-compacting GCs that don't have to halt anything (e.g. -- concurrent mark-sweep).

Quote:
I wouldn't find it very funny if a gc tiptoed over the borders of virtual memory management and messed up all the objects allocated for vmm stuff.


Just because there is a GC in the kernel doesn't mean everything has to be managed by it. Having the GC manage VMM structures is like implementing your physical memory manager with kmalloc(). ;)

Quote:
Further, despite gc sounding quite cool for development it also generates an increasing hankering for lazyness among the not so trained and experienced developers. Omitting a free here and there, ah, the gc will take care of it, I don't mess with it - that sounds most prolly inviting to someone who doesn'
t really want to be busied with programming nitty gritty like freeing allocated objects as soon as they aren't used.


I've heard this argument many times, and I think what overrules it for me are the advantages of GC (again, not necessarily an in-kernel GC, but GC in general). In a nutshell, explicit deallocation introduces the possibility of dangling pointers and double-deletion bugs, which makes having a strictly type-safe language nearly impossible. It's great that developers like us (with our strong work ethic :P ) are careful enough to do great things with "unsafe" languages like C and C++, but we're a very small minority. The world needs more software than what we alone can write. :) IMO it's better if the programming language can help to avoid such bugs related to type-safety.

In terms of laziness, I've thought of a better way to characterize that kind of laziness in a GC language -- being careless about using "new" too much. What bugs me about Java is that it gives you very little choice in the matter. At least C# has structs that can live on the stack or inline (although you allocate them with "new" as well -- a syntax I find baffling coming from a C++ background). Still, I would rather have someone write code that generates too much garbage (which affects performance) rather than code which accidentally deletes an object twice and causes a crash in a completely unrelated part of the code (which affects correctness and is a major pain to debug).

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 10:34 am 
Offline
Member
Member
User avatar

Joined: Wed Oct 18, 2006 2:31 am
Posts: 5964
Location: In a galaxy, far, far away
Solar wrote:
As for the Wiki page "Garbage Collection", I have a serious complaint with this paragraph:

Quote:
One of the major advantages of a compacting collector is that it makes allocating from the heap extremely fast and scalable (...) .
Compare this to a traditional implementation of malloc(), ...


What's described here - keeping free blocks in a single list and scanning that on an allocation - is what I've done as a malloc() placeholder in PDCLib with 113 lines of code. Comparing a compacting GC with a 100-line malloc() placeholder is hardly fair.


I have to admit i took paragraphs of the previous discussions to build that one ... Certainly a decent implementation of [tt]malloc()[/tt] would be trying to shorten lists -- and when possible, to allocate the first item of the list rather than the best item ever

My own in-kernel implementation scans only a sublist that can span over 64K (and yes, it first has to scan a list of boxes that is N/64K entries long to know what box should be scanned).

Another thing to take into consideration is that often, you even have portions of your code where you _cannot_ have malloc's (e.g. because you're in an interrupt handler which might have interrupted malloc!)

_________________
Image May the source be with you.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 10:46 am 
The idea with a compacting GC is that memory allocation is faster than "traditional" methods, but memory reclaims are of course a lot slower.

It might change if you use different GC algorithms or perhaps different malloc routines in "traditional" systems. As always, it all depends on your system and the way you like it to work.

For example, in Singularity they have a lot of other advantages and speed improvements (no ring switches for example) that may offset some performance degradations from a garbage collector.

You can't just look at a small section of an OS / kernel, you have to look at the big picture.

Personally a GC for me has nothing to do with "lousy" coding (I've done professional work in asm, C(++), Visual Basic, C# etc.) but with more stable systems. That is a big plus. That's of course not to say the other way yields unsafe code, but I think we can all agree that working with pointers, strings and memory buffers in asm / C introduces subtle errors fast.

But that's just my opinion ;)


Top
  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 12:14 pm 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7614
Location: Germany
Colonel Kernel wrote:
I don't buy that. What you call a "leak" is either "unintentional object retention" (which I admit sounds a bit like a cop-out but is really a different problem), or a bug in your JVM. I look forward to seeing your example code. :)


The rough cookbook is: Set up a widget, register an event handler for that widget, release the widget without releasing the event handler first. The event manager still holds a reference to the event handler to the non-existing widget, but you no longer have the widget with which to unregister the event handler.

That's how someone whose coding skills I trust deeply told it.

Quote:
Solar wrote:
As for the Wiki page "Garbage Collection", I have a serious complaint with this paragraph


So fix it (that's why it's a wiki).


I loathe edit wars, so I wanted some talkback first.

Quote:
It was written by someone who has admittedly never implemented malloc() before. :-[ Based on your description however, it still sounds like it has linear time complexity in the worst case, which is the point I was trying to make.


Hehe... don't get me started on worst-case GC behaviour. Everything is an 80-20 approach. You know an O(1) algorithm can statistically take longer on average than an O(n^n)? ;-)

Quote:
What I'm seeing in this thread is a lot of opinion based on "gut feelings" and not much research into how GCs actually work and when to use them, and when not to (which I have already covered several times).


I admit my knowledge on GC's internals is limited to what other people wrote about them, but I read quite some of those. And, let's admit it, more often than not we are payed for our "gut feelings" because there isn't time and/or budget for in-deep research.

But I think the point is mood, since we basically agree.

Quote:
It's great that developers like us (with our strong work ethic :P ) are careful enough to do great things with "unsafe" languages like C and C++, but we're a very small minority. The world needs more software than what we alone can write. :) IMO it's better if the programming language can help to avoid such bugs related to type-safety.


On the other hand, many people who never learned about pointers are what I consider a project risk. Not to say there aren't great Java or Perl coders, but that's usually the camp where those people creating those weird memory problems do come from.

Quote:
In terms of laziness, I've thought of a better way to characterize that kind of laziness in a GC language -- being careless about using "new" too much. What bugs me about Java is that it gives you very little choice in the matter.


ACK.

The one example I always quote in a discussion like that: I am maintenance coder for a big banking application. We have several MB's of source code, all in C++. Latest count of "new"'s in all that code: 6. (Used in wrapper code for an underlying C API.) All the rest are using strictly automatic memory management.

Quote:
Still, I would rather have someone write code that generates too much garbage (which affects performance) rather than code which accidentally deletes an object twice and causes a crash in a completely unrelated part of the code (which affects correctness and is a major pain to debug).


Well, there are projects where you can bear with neither. Like, kernel code. Or when working for a customer where performance penalty equates hardware cost equates less of a business case for employing you instead of the competition. ;)

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 3:18 pm 
Offline
Member
Member

Joined: Wed Oct 18, 2006 11:59 am
Posts: 1600
Location: Vienna/Austria
@solar: automatic memory management == local declaration of objects on stack and sorta?

_________________
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 4:15 pm 
solar: I'm not into Java, but that sounds like a problem with the releasing of the widget instead of the GC. In other words, if a widget gets destroyed and event handlers attached to it should be gone as well, of course.

As long as something has a reference to some other thing a GC can't do anything about it. However, in a managed environment like Java (and .NET) the system should even in that case be able to tell that the object holding the reference is collected so the other object actually has no live references and can be collected as well.

Now I don't want to say that the .NET platform is great or anything, but I haven't heard of any such problems or memory leaks. There have been memory leaks on that platform, but as far as I know they where always located in the layer between the managed world and the Operating System (ie the Windows or COM API). Which was written in a non-managed language, of course.

Again I don't know much about Java, so perhaps things just work differently there. I can easily create a memory leak in .NET if I write unsafe code blocks (specially marked blocks that allow direct memory access through pointers etc. Which won't run on the Singularity OS, in case anyone was wondering).


Top
  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 8:20 pm 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
IIRC, event handlers in Java are just objects that implement event handling interfaces, so the sequence would go something like this:
  • Create an object A that will fire events.
  • Create an object B that will handle events.
  • Register B with A. At this point, your code refers to A and B, and A refers to B as well.
  • Release A (and probably B). This really means giving up your reference to it, so it is no longer reachable by any root.
  • At this point, A refers to B, but nothing else refers to A, and nothing else refers to B.
I see no reason why A and B would not be collected in this case. Even if A and B point at each other, GCs have no trouble dealing with cycles (unless you're using reference counting).

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 11:02 pm 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7614
Location: Germany
beyond infinity wrote:
@solar: automatic memory management == local declaration of objects on stack and sorta?


Correct.

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Wed Mar 01, 2006 11:42 pm 
@Solar: I suspect that this 'memory leak' is actually caused by _native_ code which implements the widget library (there has to be native code somewhere down there), it doesn't have much to do with a broken GC. As soon as native (unsafe, in .NET) code is involved, things start to get complicated for any managed language, IMO.

cheers Joe


Top
  
 
 Post subject: Re:Newbie Memory Management
PostPosted: Thu Mar 02, 2006 1:06 am 
Offline
Member
Member

Joined: Wed Oct 18, 2006 11:59 am
Posts: 1600
Location: Vienna/Austria
Well, the thing in java is, that you have to *new* nearby every object you want to use - except some "autoinstantiated" (power builder speak) ones which are present per se, so to say. the Math class is an example.

You are also well advised to release the objects with destroy() or delete() after use, because then, your object's destructors are called and can clean up the aftermath.

This makes work easier for the gc. Really.

I don't know if java locally instantiated objects are destroyed per se if going outta scope - and if the destructors are called. If dependency lies in the order of how things are to happen, I'd say destroy() ere leaving the function.

The event handler stuff is basically as solar says. I have to take a look into it again.

Nevertheless, sometime in the future (next few months between working and studying) I gonna take a look at that boehm gc. Might come in handy.

@solar: thx for confirmation. & agree, that's the by far most effective form of MM. Put it on the stack and forget about it afterwards. Are constructors/destructors called as is due? I *should* take a look.

_________________
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 36 posts ]  Go to page Previous  1, 2, 3  Next

All times are UTC - 6 hours


Who is online

Users browsing this forum: Google [Bot] and 9 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group