Java, diyGC

TDC

Hi,

I'm not a professional programmer, so please don't hesitate to point out any WTFs ;)

Assume a (large) 3D space with positions x,y,z and objects moving in it. Since I needed an easy way to answer the question "which objects are on a specific position?" I created a Position class with the coordinates x,y,z and an arraylist with the objects that currently are on this position.

So now I can do things like myObject.getPosition().getObjects() and find out whether there's anything else in same spot - great. Of course I needed to prevent duplicate objects of the same position, so I created a PositionPool that has a HashMap which maps position strings "x|y|z" to Position objects (and creates them if they are not there already).

My problem now is, that this HashMap will grow larger and larger with time, as the objects visit more and more positions since the GC will not remove them (because they are in the HashMap). So I need a mechanism to remove unused positions. A position with no objects in it might still be used (for example as a destination for an object) - so I can't just remove all 'empty' positions.

So since I don't want to reinvent the wheel my question is: Is there a way to tell the java GC to remove all objects of a HashMap that are not used anywhere outside the HashMap? And if not: what could I do, to do that myself without turning everything into a pile of WTFery?

Thanks for any input :)

carfield

You can take a look of this http://weblogs.java.net/blog/enicholas/archive/2006/05/understanding_w.html

asuffield

@TDC said:

Assume a (large) 3D space with positions x,y,z and objects moving in it. Since I needed an easy way to answer the question "which objects are on a specific position?"

Heh... this is a huge problem that comes up all the time (notably games and physics simulations) and none of the thousands of people who have worked on it have found any really good solutions yet. You're going to have fun with this one.

TDC

@carfield said:

You can take a look of this http://weblogs.java.net/blog/enicholas/archive/2006/05/understanding_w.html

Thanks, interesting read, although I'm not sure I understood everything :)

I looked at the WeakHashMap and see it's not the solution for my problem since it uses weak referecences for the keys, while I need them for the values.

So what I basically have to do is keep my current HashMap, but insert WeakReference<Position>s instead of Positions? (Although I guess someone might've written a WeakValueHashMap or something like that?). And I'll need a small clean up routine to remove the keys of the removed Positions (I can use the ReferenceQueue for that, as I understood it) - Is it really that simple? :-)

edit: I found one and it really has the name I suggested.. nice ^^ http://dcl.mathcs.emory.edu/util/dist/latest/doc/api/edu/emory/mathcs/util/collections/WeakValueHashMap.html

@asuffield said:

Heh... this is a huge problem that comes up all the time (notably games and physics simulations) and none of the thousands of people who have worked on it have found any really good solutions yet. You're going to have fun with this one.

Yeah, I thought it was a common problem, good to hear there are no easy solutions for it :)

Nether

@asuffield said:

@TDC said:
Assume a (large) 3D space with positions x,y,z and objects moving in it. Since I needed an easy way to answer the question "which objects are on a specific position?"
Heh... this is a huge problem that comes up all the time (notably games and physics simulations) and none of the thousands of people who have worked on it have found any really good solutions yet. You're going to have fun with this one.

I've written a couple amateur games in which I had to solve this problem. If you know the set of objects in your world isn't going to be arbitrarily large, you can do direct comparisons between the position coordinates in your objects. If the resolution of your logical space is integer-valued, this can go pretty fast. Professional solutions seem to take the same basic approach but try a lot of clever tricks to speed it up. You can dynamically size your logical space into buckets along each dimension, or use static buckets along each dimension. You can group your objects into logical sets if not every object has to worry about colliding with every other object. You can keep ordered lists of your object positions along each dimension and update those lists on every tick of the universe.

In a logical 3d universe in a game I wrote in which the resolution was double-valued, my collisions were determined by one object being within a minimum distance of another object. Since this was floating-point arithmetic, I just calculated the distance of a vector from object's location to another and compared it to the minimum distance. If it was less than the minimum collision distance, the objects had collided. Despite there being a lot going on in the game, profiling showed this approach to be rather inexpensive in my universe limited to 1000 objects. I think this has N^2 - n = O(N^2) complexity though.

The OP's approach seems very WTFey to me. Based on his (limited) description thus far, I'd completely ditch this weird hashmap and array thing he's got going, and simply do a direct coordinate comparisons between objects, and let objects track their own positions. His description doesn't indicate that performance and scalability are serious issues.

if(a.x == b.x && a.y == b.y && a.z == b.z) { //collision a<->b }

If anyone else knows some clever ways to approach the problem I'd love to hear them. My search for whitepapers back when I first worked on this yielded nothing fruitful.

TDC

A few facts then: Space is limited (although rather large) and x,y,z are integers. Objects however are not limited (which is why I never even thought of trying to compare them directly, since that would scale terribly?). So yes, performance and especially scalability are issues :)

Also this is not only used for collision detection, but also to display partial maps etc - having to loop through _all_ objects to find out which of them are 'in scope' seems rather odd?

Nether

@TDC said:

A few facts then: Space is limited (although rather large) and x,y,z are integers. Objects however are not limited (which is why I never even thought of trying to compare them directly, since that would scale terribly?). So yes, performance and especially scalability are issues :)
Also this is not only used for collision detection, but also to display partial maps etc - having to loop through all objects to find out which of them are 'in scope' seems rather odd?

Ok that explains a lot, but there's still a couple things that don't make sense to me.

Why can't you remove Positions with empty objects lists from the hashmap? I don't understand your statement about them possibly being destinations. Are your object movements not atomic operations? Is there a concurrency problem? You say you don't want them removed but you also seem to be asking for a way to remove them via the GC. You say the Position entries are recreated if they don't exist, so what's the problem with removing empty Positions from the hashmap?

TDC

Well, the moving objects do actually have a 'mission' that might be 'fly to x,y,z' or 'follow that object' etc. They do not 'just move'.

The purpose of the Pool was to ensure no Positions would be duplicated (because then one might be empty while the other has 3 objects on it.. not cool), so removing empty positions that are still used anywhere else is not good and will lead to bugs. (If not now, then later). (*)

So my thought here is this: I do not care which objects still use the Position and for what (destination, waypoints, whatever), I only want to know which objects are on the position (as I said for collision detection, map drawings, etc) and I want to ensure that Positions are unique. So why not let the GC handle it?

(*) Imagine an object reaching it's destination - if I know that the myObject.getDestination() Position is exactly the one in the HashMap, I can just say getDestination().add(this). Otherwise I'd have to do something like PositionPool.getPosition(getDestination().toString()).add(this), and if I ever forget that I have a bug.

asuffield

@Nether said:

@asuffield said:
@TDC said:
Assume a (large) 3D space with positions x,y,z and objects moving in it. Since I needed an easy way to answer the question "which objects are on a specific position?"
Heh... this is a huge problem that comes up all the time (notably games and physics simulations) and none of the thousands of people who have worked on it have found any really good solutions yet. You're going to have fun with this one.
If you know the set of objects in your world isn't going to be arbitrarily large, you can do direct comparisons between the position coordinates in your objects. If the resolution of your logical space is integer-valued, this can go pretty fast. Professional solutions seem to take the same basic approach but try a lot of clever tricks to speed it up. You can dynamically size your logical space into buckets along each dimension, or use static buckets along each dimension. You can group your objects into logical sets if not every object has to worry about colliding with every other object. You can keep ordered lists of your object positions along each dimension and update those lists on every tick of the universe.
...
If anyone else knows some clever ways to approach the problem I'd love to hear them. My search for whitepapers back when I first worked on this yielded nothing fruitful.

A popular approach is the use of an octree to
logarithmically reduce the number of comparisons you need to find
objects which are "near" another object. Ultimately, nobody's found any
better method than picking a bunch of tricks like this and combining
them in some way that makes sense for your problem - and every variation has to be hand-adjusted to suit the specific system you're currently trying to simulate. Many books have been written on the subject of how to approach this sort of thing, particularly for the specific variant used in collision detection. Lots of different approaches are known, but none of them are both generic and efficient - there aren't any good methods which will work on every problem of this type. It seems like one should exist, but it hasn't been discovered yet.

Given the scope, significance, and relatively recent nature of the problem (in academic timescales, anything that wasn't a problem 50 years ago is too new for anybody to have a good understanding of it yet), it is a subject of current research, and more in the realm of academic papers than whitepapers. Here's a research project that has done a lot of significant work in the field; they have a list of papers linked from there, and you can find the rest of the field via citeseer. It's enough to spend a lifetime just studying this problem.

Nether

@asuffield said:

@Nether said:
@asuffield said:
@TDC said:
Assume a (large) 3D space with positions x,y,z and objects moving in it. Since I needed an easy way to answer the question "which objects are on a specific position?"
Heh... this is a huge problem that comes up all the time (notably games and physics simulations) and none of the thousands of people who have worked on it have found any really good solutions yet. You're going to have fun with this one.
If you know the set of objects in your world isn't going to be arbitrarily large, you can do direct comparisons between the position coordinates in your objects. If the resolution of your logical space is integer-valued, this can go pretty fast. Professional solutions seem to take the same basic approach but try a lot of clever tricks to speed it up. You can dynamically size your logical space into buckets along each dimension, or use static buckets along each dimension. You can group your objects into logical sets if not every object has to worry about colliding with every other object. You can keep ordered lists of your object positions along each dimension and update those lists on every tick of the universe.
...
If anyone else knows some clever ways to approach the problem I'd love to hear them. My search for whitepapers back when I first worked on this yielded nothing fruitful.
A popular approach is the use of an octree to
logarithmically reduce the number of comparisons you need to find
objects which are "near" another object. Ultimately, nobody's found any
better method than picking a bunch of tricks like this and combining
them in some way that makes sense for your problem - and every variation has to be hand-adjusted to suit the specific system you're currently trying to simulate. Many books have been written on the subject of how to approach this sort of thing, particularly for the specific variant used in collision detection. Lots of different approaches are known, but none of them are both generic and efficient - there aren't any good methods which will work on every problem of this type. It seems like one should exist, but it hasn't been discovered yet.
Given the scope, significance, and relatively recent nature of the problem (in academic timescales, anything that wasn't a problem 50 years ago is too new for anybody to have a good understanding of it yet), it is a subject of current research, and more in the realm of academic papers than whitepapers. Here's a research project that has done a lot of significant work in the field; they have a list of papers linked from there, and you can find the rest of the field via citeseer. It's enough to spend a lifetime just studying this problem.

Thanks for the link asuffield. Now I have some bedtime reading =P

I still oppose trying to alter the standard behavior of the GC. What happens when your altered behavior turns out to be detrimental to other parts of your program? I fear the law of unintended consequences in this situation.

If the list being empty isn't sufficient evidence to tell whether or not the position is used, can't you just maintain a flag about whether that position object is used or not and maintain it over the course of your program?

TDC

@Nether said:

I still oppose trying to alter the standard behavior of the GC. What happens when your altered behavior turns out to be detrimental to other parts of your program? I fear the law of unintended consequences in this situation.
If the list being empty isn't sufficient evidence to tell whether or not the position is used, can't you just maintain a flag about whether that position object is used or not and maintain it over the course of your program?

I can't see how this could have any impact on other parts of the program - it just ensures that positions which are really not used anywhere outside the HashMap are removed (as long as I don't use WeakReferences anywhere else as well, of course) - trying to do that myself (as was my original plan because I didn't know about the WeakReferences) appears to be both reinventing the wheel and a possible source of bugs. What kind of consequences are you thinking of?

RaspenJho

Is there a LINQ for Java?