Best Linked List Ever

pacohope

I'm a consultant and I review code at a variety of customers. This one came from an embedded Linux system. They had an event queue in the system that checked a sorted linked list to see if it was time to fire an event. Events would get put on the queue with their time and a callback to invoke at that time. Each interval, the queue manager would wake up, see if it was time to do the top element(s) of the queue, and then go back to sleep. The joy was in the implementation of the linked list that served as their queue.

In an effort to save space (I suppose), it was a singly linked list. The insert() code did the intuitive thing. It put the event in the right part of the queue, based on its timestamp. Perhaps a little ASCII art would help (it's certainly appropriate to the genre):

  +---+      +---+      +---+
  | A |+|--->| B |+|--->| C |+|--+
  +---+      +---+      +---+    v

The search routine, however, had to take into account the fact that it wasn't a doubly-linked list (i.e. with back pointers from each node). If you wanted to delete the event that you found (the most common reason for searching the linked list), you would need to have the thing that pointed at the thing you wanted to delete. Follow me? If I A links to B, and I want to delete B, I need a pointer to A. Thus, the search routine returned "the thing that points at the thing that matches your search"! search(C) would return B. There are a couple of caveats. If the thing you want isn't found, search() returns NULL. If the thing you want was actually the first element of the list, it returned the first element of the list; search(A) returned A. All the event code was aware of this unusual way of doing things.

The delete() routine, naturally, was aware of the shenanigans going on in search(). If you passed a pointer to an event to delete(), it would actually delete the object pointed to by the object you passed. delete(B) would have the effect of removing C from the list. Likewise, if you passed it A, and it knew that A was the head of the list, it deleted A. Silly, but it matches the search() behavior. Figured it out yet?

What if the thing you wanted to delete was the second thing in the list? You would do search(B), which returned A. Then you pass A to delete(). Delete() sees that you've passed A, which is the head of the list, so it deletes A. This has the glorious effect of:

Deleting the next event to fire, which you probably DID want
Leaving the second event on the list, which is the one you DIDN'T want

I looked through the CVS tags at the top of the file and asked about who the commiter was. The response was "Oh, him? He's one of our Linux kernel gurus." Sigh.

Devi

You know, in a way that's kind of clever... He can get the 4 byte per node memory saving, and still have the ability to delete nodes with a O(1) operation. Of course, the fact you can't delete the second object in the list is kind of a bummer... Though I guess if you had a unique const to identify the head of the list, you could get around that. For example, search would return NULL to mean the first thing (because nothing's before the first thing), otherwise a pointer to the node before the search point and FILE_NOT_FOUND if it's not there. Of course, if he were really clever (like me) he would hidden the result in an object so you didn't have to keep repeating the result->node pointer translation:

#define FILE_NOT_FOUND ((Node*)0xffffffff)

class NodeSearchResult

{

private:

Node *mNodePrevious; // Node before the one we want

List *mList; // So we can find the front of the list

public:

Node * operator -> ()

{

if (mNodePrevious==NULL) return mList->GetFirst();

if (mNodePrevious==FILE_NOT_FOUND) return NULL;

return mNodePrevious->GetNext();

}

} ;

Of course, all this assumes that you're programming for a system which will actually benefit from the 40k per 1024 nodes saving that you'll get...

Ancient_Hacker

This is triply silly. If the Guru had read just a teensy bit of linked list literature, surely he would have seen Nick Wirths way to link both ways with just one pointer.

It's only been around since, like 1972

bstorer

The best part is that with a simple refinement of what the return value means, the problem with search(B) could have been avoided. I would have done it thusly:

search(A) returns NULL
search(B) or search(C) returns the predecessor node (i.e., A or B respectively)
search(D) returns C

So my delete code could work like this:

delete (Node * node)
{
    Node * nodeToDelete = head;
    if (node == NULL)
        head = head -> next;
    else if (node -> next == NULL)
        return;
    else
    {
        nodeToDelete = node -> next;
        node -> next = nodeToDelete -> next;
    }
    delete nodeToDelete;
    return;
}

Pardon the crappy formatting. I'm sick of fighting with the editor. But you get the idea.

Kerin

this makes me gloriously happy, in a perverse way.

luke727

Yes, but did he remember to unroll his loops?!?!

fennec

Returning the prior ELEMENT is ridiculous and silly. Returning a pointer to the 'next' field of the previous element (or the head item in the list), on the other hand, would make perfect sense. Pointers-to-pointers for the win once more!

ActionMan

maybe he didnt want to infringe on the double-linked-list patent ;) (which was "invented" last year)

viraptor

@Ancient_Hacker said:

This is triply silly. If the Guru had read just a teensy bit of linked list literature, surely he would have seen Nick Wirths way to link both ways with just one pointer.
It's only been around since, like 1972

??? Link please....

asuffield

@viraptor said:

@Ancient_Hacker said:
This is triply silly. If the Guru had read just a teensy bit of linked list literature, surely he would have seen Nick Wirths way to link both ways with just one pointer.

It's only been around since, like 1972

??? Link please....

I don't know which variant was Wirth's, but the standard trick is the delta-linked list:

struct node {
  void *obj;
  int delta;
};

If you have struct node* a, b, c, d, and the list order is a, b, c, d, then:

a->delta = NULL

b->delta = c - a

c->delta = d - b

d->delta = NULL

To walk the list, you need pointers to the first two elements, the next element is x + y->delta, and the previous element is y - x->delta.

The neat trick about this structure is that the whole thing is relocatable - you can copy it to any point in memory, and the pointers are still valid, since they only store the distance between nodes.

Other variants include addition and xor, but those aren't relocatable.

GettinSadda

I don't know which variant was Wirth's, but the standard trick is the delta-linked list:
struct node {
void *obj;
int delta;
};

Now that's just silly - we know this project is using embedded Linux, so an int should be 4 bytes, the same as a pointer. All you have saved is 4 - 4 = 0 bytes. But the code is much less obvious.

tchize

@asuffield said:

I don't know which variant was Wirth's, but the standard trick is the delta-linked list:
struct node {
  void *obj;
  int delta;
};
If you have struct node* a, b, c, d, and the list order is a, b, c, d, then:

a->delta = NULL

b->delta = c - a

c->delta = d - b

d->delta = NULL

To walk the list, you need pointers to the first two elements, the next element is x + y->delta, and the previous element is y - x->delta.

The neat trick about this structure is that the whole thing is relocatable - you can copy it to any point in memory, and the pointers are still valid, since they only store the distance between nodes.
Other variants include addition and xor, but those aren't relocatable.

Not lot more usefull, you always need a pair of pointer if you want to locate the next or previous node. With node56 alone you can not find node55 nor node57, you have to get your hands on node56 and node57 or node55 and node 56 to be walkeable from that point . This mean you need to complicate search results structure and list modification calls to always accept a pair of node.

However +1 for the nice trick that save memory. And yes, despite what says GettinSadda, this save memory. You remove 2*4 bits pointer and use instead a 4 bit int
(2*4-4=4). This might be interesting for long lists in limited memory environments

zip

@GettinSadda said:

I don't know which variant was Wirth's, but the standard trick is the delta-linked list:
struct node {
void *obj;
int delta;
};

Now that's just silly - we know this project is using embedded Linux, so an int should be 4 bytes, the same as a pointer. All you have saved is 4 - 4 = 0 bytes. But the code is much less obvious.

Um... assuming you replied to the right post, that's an example of a doubly linked list taking up the space of a singly linked list. It's not about saving space.

GettinSadda

@zip said:

@GettinSadda said:
I don't know which variant was Wirth's, but the standard trick is the delta-linked list:
struct node {
void *obj;
int delta;
};

Now that's just silly - we know this project is using embedded Linux, so an int should be 4 bytes, the same as a pointer. All you have saved is 4 - 4 = 0 bytes. But the code is much less obvious.
Um... assuming you replied to the right post, that's an example of a doubly linked list taking up the space of a singly linked list. It's not about saving space.

What? You say two things:

a) "...that's an example of a doubly linked list taking up the space of a singly linked list."
b) " It's not about saving space."

Which one did you mean?

GettinSadda

@tchize said:

However +1 for the nice trick that save memory. And yes, despite what says GettinSadda, this save memory. You remove 24 bits pointer and use instead a 4 bit int
(24-4=4). This might be interesting for long lists in limited memory environments

Firstly, the pointers and the int are 4 [b]byte[/b] variables, not 4 bits
Secondly, the code has both a pointer and an integer (i.e. 8 bytes).

It is possible that the pointer in the example does not come into the calculations (the examples are not clear). If this is the case I don't see how this is supposed to work!

Edit: I think I have figured out now how it works - and it works, but its yucky! Might be worth it for the memory saving if you can rely on always having pairs, but if you could rely on that most times a singly linked list would be fine.

viraptor

@GettinSadda said:

I don't know which variant was Wirth's, but the standard trick is the delta-linked list:
struct node {
void *obj;
int delta;
};

Now that's just silly - we know this project is using embedded Linux, so an int should be 4 bytes, the same as a pointer. All you have saved is 4 - 4 = 0 bytes. But the code is much less obvious.

obj is not a link pointer - it's data pointer.

single: void *obj; node *next;
double: void *obj; node *next; node *prev;
delta: void *obj; int delta;

4 bytes are saved with delta - Ok now?

pacohope

that's an example of a doubly linked list taking up the space of a singly linked list. It's not about saving space.

It's actually an example of something more complex than a doubly linked list taking up exactly the same space as a doubly linked list, and taking more space than a singly linked list. The thread is about saving space. This "delta" trick saves no space at all. It might save some computation, since you don't have to walk the list in order to jump to a node, but I don't think that helps us here because of the nature of the problem, as I described it.

First off, searching is still O(n) because you have to linearly walk the list looking for what you want (i.e., a node whose data elements are an exact match). The ability to use a delta value and calculate a destination's address doesn't improve searching. Secondly, an "int" delta takes up the same room as a "previous" pointer. You could theoretically save space by using a single byte for delta, but that would limit the depth of the list. Whatever RAM savings you had in chopping down a few bytes from the "delta" storage would be consumed by the extra instructions you would write (which have to be stored in RAM in this embedded system) that check for overflow of the 255 node limit.

Lest anyone forget the size of integers and pointers in C on Intel hardware and a Linux-ish operating system, I wrote the following test program. Sadly, I have no Linux box available to me, but my MacBook Pro and my FreeBSD server both produce the same output. Note that I added a data element to the node struct just to give it that extra feeling of realism. :)

Code Snippet

struct deltanode {
  int   data;
  void *next;
  int delta;
};

struct doublelist {
  int   data;
  void *next;
  void *prev;
};

int main( int argc, char* argv[] ) {
  int deltasize;
  int doublesize;
  struct deltanode dn;
  struct doublelist dl;
  dn.data = 42;
  dl.data = 42;
  deltasize = sizeof( dn );
  doublesize = sizeof( dl );
  printf( "Delta list size: %d.  Double list size: %d.\n",
           deltasize, doublesize );
  return 0;
}

Compile with gcc -o sizeit test.c

Output on MacOS X

Delta list size: 12. Double list size: 12.

Output on FreeBSD

Delta list size: 12. Double list size: 12.

Bottom Line

The delta trick saves you nothing at all in storage. In fact, for the most common cases cited in my background discussion, it will execute more instructions than a doubly linked list, since you have to do pointer arithmetic to find the previous node. Rather than just saying prev = cur->prev, you have to do prev=cur - cur->delta or something similar. I know this sounds like making a mountain out of a molehill, but in the embedded world you have to be vigilant against gratuitous consumption of data space or computation.

DaveK1

Jesus H.... can absolutely *nobody* here read or count? Or program?

@pacohope said:

that's an example of a doubly linked list taking up the space of a singly linked list. It's not about saving space.

It's actually an example of something more complex than a doubly linked list taking up exactly the same space as a doubly linked list, and taking more space than a singly linked list. The thread is about saving space. This "delta" trick saves no space at all.
Secondly, an "int" delta takes up the same room as a "previous" pointer.

Yes, or the same as a "next" pointer. The difference is YOU ONLY HAVE ONE with a delta list. FCOL!

@pacohope said:

Lest anyone forget the size of integers and pointers in C on Intel hardware and a Linux-ish operating system, I wrote the following test program.

You REALLY screwed up.

@pacohope said:

Note that I added a data element to the node struct just to give it that extra feeling of realism. :)

Cool. May I also note that you added a completely bogus "next" pointer in order to falsify your result? I may? Oh joy!

@pacohope said:

struct deltanode {

  int   data;
  void *next;
  int delta;
};

THERE'S NO BLOODY 'NEXT' POINTER IN A DELTANODE YOU IDIOT. THE DELTA IS ALL IT NEEDS APART FROM THE DATA ITEM.

Did you even READ the description of what a delta list is and how it works?

@pacohope said:

Bottom Line

... is you're a moron.

Otterdam

@pacohope said:

Code Snippet

struct deltanode {
  int   data;
  void *next;
  int delta;
};

struct doublelist {
  int   data;
  void *next;
  void *prev;
};

<--- point of the delta trick

You

fist_poster

There is no pointer to the next node in this case, the node contains data and a delta value.

However, I fail to see how this solves the problem. To find the previous node, I need to know where this and the next node are. Given only this node, however, I can't find the next node.

Therefore, if search and delete are to be done as separate steps, search needs to return a pair of nodes (this and next). My question is, why can't search return a pair of pointers to this and previous, using regular linked list, in the first place?

@pacohope said:

that's an example of a doubly linked list taking up the space of a singly linked list. It's not about saving space.

It's actually an example of something more complex than a doubly linked list taking up exactly the same space as a doubly linked list, and taking more space than a singly linked list. The thread is about saving space. This "delta" trick saves no space at all. It might save some computation, since you don't have to walk the list in order to jump to a node, but I don't think that helps us here because of the nature of the problem, as I described it.

First off, searching is still O(n) because you have to linearly walk the list looking for what you want (i.e., a node whose data elements are an exact match). The ability to use a delta value and calculate a destination's address doesn't improve searching. Secondly, an "int" delta takes up the same room as a "previous" pointer. You could theoretically save space by using a single byte for delta, but that would limit the depth of the list. Whatever RAM savings you had in chopping down a few bytes from the "delta" storage would be consumed by the extra instructions you would write (which have to be stored in RAM in this embedded system) that check for overflow of the 255 node limit.

Lest anyone forget the size of integers and pointers in C on Intel hardware and a Linux-ish operating system, I wrote the following test program. Sadly, I have no Linux box available to me, but my MacBook Pro and my FreeBSD server both produce the same output. Note that I added a data element to the node struct just to give it that extra feeling of realism. :)

Code Snippet
struct deltanode {
  int   data;
  void *next;
  int delta;
};

struct doublelist {
  int   data;
  void *next;
  void *prev;
};

int main( int argc, char* argv[] ) {
  int deltasize;
  int doublesize;
  struct deltanode dn;
  struct doublelist dl;
  dn.data = 42;
  dl.data = 42;
  deltasize = sizeof( dn );
  doublesize = sizeof( dl );
  printf( "Delta list size: %d.  Double list size: %d.\n",
           deltasize, doublesize );
  return 0;
}
Compile with gcc -o sizeit test.c

Output on MacOS X

Delta list size: 12. Double list size: 12.

Output on FreeBSD

Delta list size: 12. Double list size: 12.

Bottom Line

The delta trick saves you nothing at all in storage. In fact, for the most common cases cited in my background discussion, it will execute more instructions than a doubly linked list, since you have to do pointer arithmetic to find the previous node. Rather than just saying prev = cur->prev, you have to do prev=cur - cur->delta or something similar. I know this sounds like making a mountain out of a molehill, but in the embedded world you have to be vigilant against gratuitous consumption of data space or computation.

wow, people are really having a hard time understanding this concept.

The pointer in the deltaNode is to the user data. the structure for deltaNode is:

typedef struct {

void * userData;

int delta;

} deltaNode;

notice how that is 8 bytes? A doubly linked list is 12 bytes as you showed. however, the "data" member of the DLL should be a void * to user defined data.

Furthermore you can always (ALWAYS) count on there being 2 things in the delta linked list by just instantiating two head nodes instead of 1 to act as your starting point.

One thing I think of though. I don't know of any languages that allow this. But if you have a garbage collected language that you can get the value of pointers in (which is a stupid idea and I don't know of any languages that allow it) this will break garbage collection and you will loose your nodes.

fist_poster

Yes, this construct is like a doubly-linked list in that it lets you traverse it forwards and backwards, given any two consecutive nodes.

Unlike a doubly linked list, it doesn't let you do anything, given just one node. The problem with the original code idea was that search returned an inappropriate result. However, which-ever technique you use, if you want to do it with 8 bytes, you'll have to return two pointers anyway to find your way around. If you don't want to search backwards, you might just as well use a regular singly linked list, which is probably easier, faster and safer.

(I'm not sure, but is pointer arithmetics guaranteed to give defined results in C and C++, if used with non-sequential memory?)

@fist-poster said:

(I'm not sure, but is pointer arithmetics guaranteed to give defined results in C and C++, if used with non-sequential memory?)

yes. There is no paging inside of the processes memory space. The process sees a contiguous block of memory for code, stack and heap. This doesn't mean that the memory is ACTUALLY contiguous, but the operating system abstracts the physical memory locations.

Faxmachinen

@DaveK said:

Jesus H.... can absolutely *nobody* here read or count? Or program?

My thoughts exactly.

fist_poster

@tster said:

@fist-poster said:
(I'm not sure, but is pointer arithmetics guaranteed to give defined results in C and C++, if used with non-sequential memory?)

yes. There is no paging inside of the processes memory space. The process sees a contiguous block of memory for code, stack and heap. This doesn't mean that the memory is ACTUALLY contiguous, but the operating system abstracts the physical memory locations.

This seems to disagree:

Unless both pointers point to elements of the
  same array object, or one past the last element of the  array  object,
  the behavior is undefined.

From C++ standards draft 1997, 5.7.6
(concerning subtracting two pointers, same thing about adding an int to a pointer)

Not only are there only a few programmers out there who can understand the neat trick, I wouldn't dare to use it in practice :(

jimlangrunner

Oh, my. As I recall from CS201, a linked list was a nice exercise, but (as the example shows) not very useful. And the deletion routine mentioned just sucked.

The doubly-linked list would solve most of the problems. The "classic" doubly-linked list, with *next and *prev.

But the "super-efficient" doubly-linked list looks suspiciously like a classic "array". You remember arrays, don't you? This can be moved from one place in mem to another, but deleting items can be costly (unless you blank the element). If you remove item 56, you must either mark 56 as "removed" or move 57 down to 56, 58 down to 57, and so on.

I think that covers it.

Happy 1/2*mv^2!

scooby_doo

John Tucker must Die

Kemp

@fist-poster said:

Unless both pointers point to elements of the
  same array object, or one past the last element of the  array  object,
  the behavior is undefined.
From C++ standards draft 1997, 5.7.6
(concerning subtracting two pointers, same thing about adding an int to a pointer) 
Not only are there only a few programmers out there who can understand the neat trick, I wouldn't dare to use it in practice :(

That is just a reference to the fact that they can't tell you what a pointer is going to point to if you just start randomly subtracting different objects from each other. In any normal usage scenario it's fine.

MichaelWojcik

@tster said:

@fist-poster said:
(I'm not sure, but is pointer arithmetics guaranteed to give defined results in C and C++, if used with non-sequential memory?)

yes.

No. Why do people answer questions about the language standard without reading the language standard?

The result of pointer arithmetic outside a single object is undefined. Using the result causes undefined behavior.

And there are conforming C implementations where these tricks don't work. Which is fine, since they serve no useful purpose in general-purpose computing. They may be useful in resource-constrained embedded environments, but there implementation-specific code is fine.

It might be worth noting that the problem that started this thread can trivially be fixed by putting a dummy node at the head of the queue, but of course that has the disadvantage of preserving that ghastly design.

scooby_doo

@MichaelWojcik said:

@tster said:
@fist-poster said:
(I'm not sure, but is pointer arithmetics guaranteed to give defined results in C and C++, if used with non-sequential memory?)

yes.

No. Why do people answer questions about the language standard without reading the language standard?

The result of pointer arithmetic outside a single object is undefined. Using the result causes undefined behavior.

And there are conforming C implementations where these tricks don't work. Which is fine, since they serve no useful purpose in general-purpose computing. They may be useful in resource-constrained embedded environments, but there implementation-specific code is fine.

It might be worth noting that the problem that started this thread can trivially be fixed by putting a dummy node at the head of the queue, but of course that has the disadvantage of preserving that ghastly design.

What's this got to do with John Tucker?

asuffield

@tster said:

One thing I think of though. I don't know of any languages that allow this. But if you have a garbage collected language that you can get the value of pointers in (which is a stupid idea and I don't know of any languages that allow it) this will break garbage collection and you will loose your nodes.

It's a historical trick, from back when memory was measured in kilobytes and rented by the month from IBM. You can assume that if you have enough memory to operate a garbage collector, then you have enough memory to use a double-linked list like any sane person.

I can't imagine there being a use for it in modern software.

(There are a couple of garbage collectors for C, and obviously you can manipulate the integer value of pointers there - and they do break in the presence of code which uses tricks like this)

scooby_doo

@asuffield said:

@tster said:
One thing I think of though. I don't know of any languages that allow this. But if you have a garbage collected language that you can get the value of pointers in (which is a stupid idea and I don't know of any languages that allow it) this will break garbage collection and you will loose your nodes.

It's a historical trick, from back when memory was measured in kilobytes and rented by the month from IBM. You can assume that if you have enough memory to operate a garbage collector, then you have enough memory to use a double-linked list like any sane person.

I can't imagine there being a use for it in modern software.

(There are a couple of garbage collectors for C, and obviously you can manipulate the integer value of pointers there - and they do break in the presence of code which uses tricks like this)

I do not agree

DONT HACK MY ACCOUNT SCARY SHEEP-MOUSE

so EAT IT. EAT IT NOW!!!

obediah

real WTF = pacohope jumping all over delta with that special mix of ignorance, hubris, and denigration we all love and then not even having the guts to come back and eat his crow.

tchize

@tster said:

One thing I think of though. I don't know of any languages that allow this. But if you have a garbage collected language that you can get the value of pointers in (which is a stupid idea and I don't know of any languages that allow it) this will break garbage collection and you will loose your nodes.

Thanks the Sun God, java has no pointer arithmetics :D

tchize

@GettinSadda said:

@tchize said:
However +1 for the nice trick that save memory. And yes, despite what says GettinSadda, this save memory. You remove 2*4 bits pointer and use instead a 4 bit int
(2*4-4=4). This might be interesting for long lists in limited memory environments

Firstly, the pointers and the int are 4 [b]byte[/b] variables, not 4 bits

and am [b]sure[/b] you can understand that a non english speaker mixes two words when redacting. Of course i was speaking of octets (er i mean byte). [i]For background information, the letter I and Y are pronounced the same in french, that's where my brain shortcircuit probably came from.[/i]

qbolec

Why would anyone need so many cases?

list * * find(list * * head_pointer, key_type key){
  while(*head_pointer){
    if((*head_pointer)->key==key)
      return head_pointer;
    head_pointer=&((*head_pointer)->next;
  }
  return 0;
}
void kill(list * * result_of_find){
  if(result_of_find)
    *result_of_find=(*result_of_find)->next);  
}

Kain0_0

Funny thing but you don't need a two nodes to start the list off. If that is the head node then the link before it is null. If it's the last node, the link after it is null. I don't think many algorithms start thier search along a linked list in the middle.

Skurry

Here's what I would have done:
- Rename search() to findPredecessor(). Make it return null if you pass A or an unknown node. This makes its behavior consistent and reusable.

- Modify delete() to check if A is to be deleted BEFORE calling search(). If it is, delete it and set the head to B. If any other node is to be deleted, call findPredecessor(), set the pointer of the returned node to the pointer of its successor and delete the successor.

Clean, and doesn't change the data structure.

pacohope

@obediah said:

real WTF = pacohope jumping all over delta with that special mix of ignorance, hubris, and denigration we all love and then not even having the guts to come back and eat his crow.

Not that at all. It was that damn day job thing. I did a rather dumb "fire and forget" sort of thing. I posted my response, thought "what could possibly be wrong with my assessment of the situation?" and wandered off to do work. Although the amount of flack thrown my way for simply being wrong (and arrogant) was pretty heavy, I'll take my lumps. I was totally wrong. I absolutely totally failed to get the delta thing the first time through. No bones about it. I was wrong.

For those of you looking for the crow to be eaten. That was it. You can move along now. If any of you were asking yourselves "What was he thinking!?" the answer is below.

Here's how I blew it, mainly because I was thinking in the context of the actual code, not the example code for the delta list. That is: the example on the delta list had a pointer to an object (where all the actual data would be stored), and a delta value. In the actual code (and, thus in my brain) there's no extra indirection like this. The node struct has data directly in it (c.f., my code example, the int data) and a pointer to the next element. Thus, I read the void* obj in the delta node as a next pointer, and then there was this delta value. That's when I thought "huh, that doesn't save anything. It's still got a pointer and an int..."

DaveK1

@Kain0_0 said:

Funny thing but you don't need a two nodes to start the list off. If that is the head node then the link before it is null. If it's the last node, the link after it is null. I don't think many algorithms start thier search along a linked list in the middle.

Actually it's a fairly common technique, particularly if you know the list is in some kind of sorted order: cache the position where the last search returned a match, then next time, if the value you're searching for is 'less' than the value at the cached node pointer, start searching at the beginning, else start searching at the cached node pointer.

This is a *huge* win over a naive search starting at the head of the list every time if the items you are searching for come in sorted order too. It's a more modest gain otherwise.

pacohope

@asuffield said:

@tster said:
One thing I think of though. I don't know of any languages that allow this....snip...
It's a historical trick, from back when memory was measured in kilobytes and rented by the month from IBM. You can assume that if you have enough memory to operate a garbage collector, then you have enough memory to use a double-linked list like any sane person.
I can't imagine there being a use for it in modern software.

Again another sad detail from the actual situation. This ran on a 900MHz Pentium III with 512M of RAM. I don't think the memory situation was that dire. This queue, on the other hand, was a core part of their event loop and needed to execute fast. I'm not implying that some of the various suggestions here don't run fast enough. I'm just saying that if you had to pick space versus time on this system, time would win. Which is all the more reason we wanted to dope slap the developer who shaved a few bytes off the queue and got it wrong. He could have done a simple doubly-linked list and it would have been exactly what was called for.

danielpitts

@pacohope said:

@asuffield said:
@tster said:
One thing I think of though. I don't know of any languages that allow this....snip...
It's a historical trick, from back when memory was measured in kilobytes and rented by the month from IBM. You can assume that if you have enough memory to operate a garbage collector, then you have enough memory to use a double-linked list like any sane person.
I can't imagine there being a use for it in modern software.
Again another sad detail from the actual situation. This ran on a 900MHz Pentium III with 512M of RAM. I don't think the memory situation was that dire. This queue, on the other hand, was a core part of their event loop and needed to execute fast. I'm not implying that some of the various suggestions here don't run fast enough. I'm just saying that if you had to pick space versus time on this system, time would win. Which is all the more reason we wanted to dope slap the developer who shaved a few bytes off the queue and got it wrong. He could have done a simple doubly-linked list and it would have been exactly what was called for.

Hmm, priority queue implemented as a linked list... Anyone heard of this new-fangled "Heap" data-structure. Perhaps I should patent it, since I know its more advanced than the recently patented Linked List.

If its NOT a priority queue, then an array list certainly would be more effecient than a linked list.

Mark Peschel

I know this is necroposting, but I'm sure other people are bothered by stumbling across references that nobody follows up on. I believe the following is correct.
According to Google, the Nick Wirth referenced by @Ancient_Hacker is Niklaus Wirth, who released a book in 1984 called Algorithms and Data Structures. The relevant section is pages 116 and 117. He does not describe how to traverse a singly linked list backwards; instead, he gives a technique for inserting or removing an element without needing a reference to the element before it.

PleegWat

I'd definitely go for the double-indirect pointer solution. I don't use it for performance since it takes way too much CPU, but I find this easier to grok than double linked lists.

Oh, and both of these I don't generally use helper functions, I write them in place.

item_t ** search( item_t *list, int (*match)(item_t*) )
{
    item_t ** pitem;

    for( pitem = &list ; *pitem ; pitem = &((*pitem)->next) ) {
        if( match( *pitem ) ) {
            return pitem;
        }
    }
    return NULL;
}

void delete( item_t ** item )
{
    item_t *tmp = *item;
    *item = tmp->next;
    free(tmp);
}

anotherusername

Returning and passing the preceding element instead of the element itself (to avoid needing a previous pointer) would've worked just fine, with only very slight modification. Assuming a list of [A, B, C]:

findPrecedingElement(A) => null
findPrecedingElement(B) => A
findPrecedingElement(Z) => C

Then:

If findPrecedingElement(e) is null, the element you searched for is the very first element;
otherwise, if findPrecedingElement(e)->next is null, the element you searched for wasn't found;
otherwise, the element you searched for exists, and findPrecedingElement(e)->next points to it.

There's no longer an ambiguous case.

wharrgarbl

10 year on @anotherusername

@Skurry said in Best Linked List Ever:

Here's what I would have done:
- Rename search() to findPredecessor(). Make it return null if you pass A or an unknown node. This makes its behavior consistent and reusable.

- Modify delete() to check if A is to be deleted BEFORE calling search(). If it is, delete it and set the head to B. If any other node is to be deleted, call findPredecessor(), set the pointer of the returned node to the pointer of its successor and delete the successor.

Clean, and doesn't change the data structure.

wharrgarbl

Guest said in Best Linked List Ever:

wow, people are really having a hard time understanding this concept. The pointer in the deltaNode is to the user data. the structure for deltaNode is:typedef struct {void * userData;int delta;} deltaNode;

Either this one is a bigger WTF than OP, or I don't see how this delta linked list thing would be useful.

asdf

@wharrgarbl said in Best Linked List Ever:

I don't see how this delta linked list thing would be useful.

It includes nasal demons for free!

anotherusername

@wharrgarbl said in Best Linked List Ever:

10 year on @anotherusername

@Skurry said in Best Linked List Ever:

Here's what I would have done:
- Rename search() to findPredecessor(). Make it return null if you pass A or an unknown node. This makes its behavior consistent and reusable.

- Modify delete() to check if A is to be deleted BEFORE calling search(). If it is, delete it and set the head to B. If any other node is to be deleted, call findPredecessor(), set the pointer of the returned node to the pointer of its successor and delete the successor.

Clean, and doesn't change the data structure.

I saw that post, and it's similar but not exactly the same as what I was saying; the key difference is that as @Skurry described it, search still doesn't differentiate between "node at beginning" and "node not found"; it treats them both as cases where there's no predecessor and it returns null in both cases. I think it would be cleaner to just have search return a distinct value in all possible scenarios.

anotherusername

@wharrgarbl said in Best Linked List Ever:

Guest said in Best Linked List Ever:

wow, people are really having a hard time understanding this concept. The pointer in the deltaNode is to the user data. the structure for deltaNode is:typedef struct {void * userData;int delta;} deltaNode;

Either this one is a bigger WTF than OP, or I don't see how this delta linked list thing would be useful.

It lets you store a linked list that can be traversed in both the forward and the reverse direction like a doubly-linked list, while storing a single relative pointer instead of two absolute pointers for the adjacent elements forward and back. The delta pointer of each element is equal to the value that would be next - previous, if the list had next and previous pointers. As a result, you need two pointers to start, pointing to two adjacent nodes; but then you can navigate the list in either direction.

Suppose you have two adjacent elements (which are neither at the very beginning of the list, nor at the very end of it; that's a special case that I won't go into). We'll call them e1 and e2. Further suppose that e1->previous ought to refer to e0, and e2->next ought to refer to e3, but they don't actually have previous and next pointers; just a delta value. Remember, it equals the value of next - previous: it's the address of the next element minus the address of the previous element.

e1->delta holds the value of e2 - e0
e2->delta holds the value of e3 - e1

Initially, all we have is two pointers, e1 and e2; we know their order, and that they are adjacent elements. To traverse the list, we have to be able to calculate the location of e0 and e3 based on the two delta values, and the addresses e1 and e2 themselves.

To get e0, we take e2 - e1->delta; this simplifies: e2 - (e2 - e0) = e2 - e2 + e0 = e0
To get e3, we take e2->delta + e1; this simplifies: (e3 - e1) + e1 = e3 - e1 + e1 = e3

So, from any two adjacent elements, the next or previous element can be found; the list can be traversed in either direction by repeatedly performing these steps.