Best Linked List Ever
-
@PleegWat The true doubly-linked list has the best performance with modern memory managers. (Well, using an array is even better if you can. Arrays are unreasonably good.)
-
@anotherusername Sounds finnicky. Also would only work if all list elements are allocated from the same element pool, since
next - previous
is undefined in the general case.
-
Statistically, he's more fbmac than fbmac is.
-
@PleegWat said in Best Linked List Ever:
@anotherusername Sounds finnicky. Also would only work if all list elements are allocated from the same element pool, since
next - previous
is undefined in the general case.Pendants out! You mean the same array.
-
@anotherusername said in Best Linked List Ever:
Returning and passing the preceding element instead of the element itself (to avoid needing a
previous
pointer) would've worked just fine, with only very slight modification. Assuming a list of [A, B, C]:- findPrecedingElement(A) => null
- findPrecedingElement(B) => A
- findPrecedingElement(Z) => C
Then:
- If
findPrecedingElement(e)
is null, the element you searched for is the very first element; - otherwise, if
findPrecedingElement(e)->next
is null, the element you searched for wasn't found; - otherwise, the element you searched for exists, and
findPrecedingElement(e)->next
points to it.
There's no longer an ambiguous case.
The problem here is pretty clear: we need to return an element of the list or any of two special cases.
Weird how despite how incredibly common this pattern is, most languages lack a straightforward way to implement it. I suppose the proper way in C would be to return a struct, with an enum (FOUND|NOT_FOUND|FIRST_ELEMENT) and a pointer to the result if it is found.
-
Gotta love the imported "replies"
-
@PleegWat said in Best Linked List Ever:
but I find this easier to grok than double linked lists.
When you are writing for Linux kernel, there is no excuse for not using the standard doubly-linked lists defined in include/linux/list.h and use all over the place there. It is readable (for C), efficient, and already implemented.
In fact, if I had to do anything in plain C again and needed a list, I would just copy that, because for C it is excellent.
-
@anonymous234 said in Best Linked List Ever:
he problem here is pretty clear: we need to return an element of the list or any of two special cases.
Weird how despite how incredibly common this pattern is, most languages lack a straightforward way to implement it. I suppose the proper way in C would be to return a struct, with an enum (FOUND|NOT_FOUND|FIRST_ELEMENT) and a pointer to the result if it is found.I just don't call the search function from delete and duplicate those 2 lines of code. Beats complicating my search function IMO.
-
@PleegWat Yup. For a flat memory space where pointer operations map sanely to unsigned integer ops, you might have something that helps you.
If you're working with segmented memory - good look getting any sense from the pointer difference. (Segmented memory is given as the reason certain pointer ops between different objects are not allowed, in the Rationale of the C standard)
-
@PleegWat said in Best Linked List Ever:
Also would only work if all list elements are allocated from the same element pool, since next - previous is undefined in the general case.
And if they are, you have to pre-allocate all elements and thus have a limited maximum size, losing one of the interesting properties of a linked list; once I'm there, I think I'd just use a data array plus an indexing array, which IMHO is less mind-bending to work with than a linked list.
However the first "single-pointer double linked list" variant I'd found when I googled it was this one: https://en.wikipedia.org/wiki/XOR_linked_list -- similar concept but using XOR rather than subtraction, and I think it could be implemented without relying on undefined behavior; though it does not provide relocatability.
Though now that I think about it: pointer-to-int conversion (and back) is implementation-defined rather than undefined, no? So couldn't the 'delta' method be implemented in a standard-compliant way by first casting the pointers to ints and then doing the difference?
-
@dkf said in Best Linked List Ever:
The true doubly-linked list has the best performance with modern memory managers.
And why would that be (assuming you never have to traverse it in reverse)?
-
@Steve_The_Cynic said in Best Linked List Ever:
@PleegWat said in Best Linked List Ever:
@anotherusername Sounds finnicky. Also would only work if all list elements are allocated from the same element pool, since
next - previous
is undefined in the general case.Pendants out! You mean the same array.
What he means is that their offset has to be aligned to an integer multiple of the element size, which is something an array naturally guarantees.
Pointer math is defined such that it operates by element sizes, not bytes.
*(int + 1)
refers to the second element inint[]
, not to 1 byte past its beginning. Effectively it means*(int + 1 * sizeof(int))
.The implication of this is that if you have two pointers,
e1
ande2
, and you want to do pointer addition or subtraction on them, the distance between them needs to be an integer multiple ofsizeof(elementType)
or you could get nasal demons... uh, I mean, undefined behavior.
-
@anonymous234 said in Best Linked List Ever:
@anotherusername said in Best Linked List Ever:
Returning and passing the preceding element instead of the element itself (to avoid needing a
previous
pointer) would've worked just fine, with only very slight modification. Assuming a list of [A, B, C]:- findPrecedingElement(A) => null
- findPrecedingElement(B) => A
- findPrecedingElement(Z) => C
Then:
- If
findPrecedingElement(e)
is null, the element you searched for is the very first element; - otherwise, if
findPrecedingElement(e)->next
is null, the element you searched for wasn't found; - otherwise, the element you searched for exists, and
findPrecedingElement(e)->next
points to it.
There's no longer an ambiguous case.
The problem here is pretty clear: we need to return an element of the list or any of two special cases.
Weird how despite how incredibly common this pattern is, most languages lack a straightforward way to implement it. I suppose the proper way in C would be to return a struct, with an enum (FOUND|NOT_FOUND|FIRST_ELEMENT) and a pointer to the result if it is found.
I really think the most logical solution would be a search function that returns the found element, and optionally can remove it before it returns. I know, I know... then search isn't a pure function anymore...
-
@ixvedeusi said in Best Linked List Ever:
And why would that be (assuming you never have to traverse it in reverse)?
Faster deletions when you have only the pointer to the element you're deleting.
-
@wharrgarbl if that's all you want, then a
bool isDeleted
might be simpler and better. You can iterate the whole list and do "real" deletions of those items periodically if you need to keep the list short and have cycles to spare.
-
@anotherusername when I delete something I want it deleted, not a "hidden bit" to be set
-
@ixvedeusi said in Best Linked List Ever:
And why would that be (assuming you never have to traverse it in reverse)?
The information flow inside the processor results in better handling of memory, whereas complex tricks with address arithmetic force the main ALU to be used to generate the address, so you tend to get a longer pipeline stall. By contrast, array accesses have been extremely well optimised for you already, as they're utterly critical to so much performance. (Structure accesses are also really quick.)
-
@dkf Double linked lists suck for memory localization but are great for deleting. Arrays have the best possible memory localization and suck at deleting.
-
@wharrgarbl said in Best Linked List Ever:
@anotherusername when I delete something I want it deleted, not a "hidden bit" to be set
hmm.... i remember someone else who was obsessed with deleting things.... man that was annoying.
-
@anotherusername said in Best Linked List Ever:
I know, I know... then search isn't a pure function anymore...
I don't think any C "kernel guru" programmers give a shit about pure functions.
-
@wharrgarbl said in Best Linked List Ever:
@anotherusername when I delete something I want it deleted, not a "hidden bit" to be set
Someone should enlighten you on how deleted posts work here...
-
@accalia I'm not one of my alts, if that's what you're implying
-
@wharrgarbl said in Best Linked List Ever:
@accalia I'm not one of my alts, if that's what you're implying
Does that mean you're one of your controls?
-
@RaceProUK said in Best Linked List Ever:
@wharrgarbl said in Best Linked List Ever:
@accalia I'm not one of my alts, if that's what you're implying
Does that mean you're one of your controls?
Ask him about one of his deletes.
-
@anotherusername said in Best Linked List Ever:
@Steve_The_Cynic said in Best Linked List Ever:
@PleegWat said in Best Linked List Ever:
@anotherusername Sounds finnicky. Also would only work if all list elements are allocated from the same element pool, since
next - previous
is undefined in the general case.Pendants out! You mean the same array.
What he means is that their offset has to be aligned to an integer multiple of the element size, which is something an array naturally guarantees.
Pointer math is defined such that it operates by element sizes, not bytes.
*(int + 1)
refers to the second element inint[]
, not to 1 byte past its beginning. Effectively it means*(int + 1 * sizeof(int))
.The implication of this is that if you have two pointers,
e1
ande2
, and you want to do pointer addition or subtraction on them, the distance between them needs to be an integer multiple ofsizeof(elementType)
or you could get nasal demons... uh, I mean, undefined behavior.The standards say that the difference between two pointers can only be definedly calculated if they point to members of the same array (including the non-member that would be - but isn't - immediately after the last real member of the array). Two arbitrary pointers might be in different zones of memory (aka segments), which will produce even wonkier results than differences that aren't an exact multiple of the pointer type apart.
But for C++, they also say that you can use
std::less<void *>
to impose a strict weak ordering suitable for inclusion in astd::map
or similar, regardless of where the pointers point to. The ordering will depend on details of the environment, and therefore shouldn't be relied on in any way except that it will be stable during a particular run of the program.
-
@Steve_The_Cynic something something embedded system with no filesystem something.
-
@anotherusername said in Best Linked List Ever:
@Steve_The_Cynic something something embedded system with no filesystem something.
Yeah, many years ago I worked in C on a embedded system with no filesystem, and although it didn't *exploit* segmentation in any way, it was 8088-based, and had segmented memory.
I also recall a colleague's scepticism about "well, here's a bug that will cause the problem you're seeing" - his response was, "yeah, but the window for that bug happening is only three machine instructions long" which was true, but those three instructions were executed reasonably often, so I told him, "OK, but you have to fix it because it will cause exactly the symptom you're seeing." He grumbled, but he did fix the thing I said was a bug, and the symptom that took about 20 minutes to happen never happened again. The bug was code that wasn't protected against interrupts but modified DS, while the interrupt handlers relied on DS being correct at all times (). It took, normally, about 20 minutes for the interrupts to end up aligned with the three-instruction window, and after that happened, a printer output string was corrupted.
(Later events that caused us to port to a different compiler lead me to rewrite the entry code to the interrupt handlers so that they set up DS correctly, and the problem disappeared forever.)
-
@anotherusername C is used for:
- OS development
- heavy CPU bound stuff
- WTF
- embedded
If you're doing C and you don't know what of these bullets apply, you're on WTF.
-
@wharrgarbl said in Best Linked List Ever:
@accalia I'm not one of my alts, if that's what you're implying
imply something? why would i ever associate @fbmac with @wharrgarbl? they're like completely separate people. they in no way share any similarities what so ever.
-
@wharrgarbl said in Best Linked List Ever:
@anotherusername C is used for:
- OS development
- heavy CPU bound stuff
- WTF
- embedded
If you're doing C and you don't know what of these bullets apply, you're on WTF.
- Plenty of OS development in C - ex: Linux, FreeBSD, etc.
- No shortage of CPU-bound stuff that happens in C, even if it isn't the best-suited to some of it.
- Plenty of developed in C.
- Almost all the embedded stuff I've ever done(1) has been in C.
(1) Except one thing we built in 2000 or so that was C++, and a gas-flow corrector I built on a TI Sensor Signal Processor(2) in 1992-1993.
(2) The most memory-deficient CPU I've ever seen. The entire writable memory of the CPU was on board (although it could interface to serial-access EEPROM devices using a seriously wonky protocol) and amounted to just 576 bits.
-
@wharrgarbl @Steve_The_Cynic I'm going to go with on this one...
-
@wharrgarbl said in Best Linked List Ever:
@anotherusername C is used for:
- OS development
- heavy CPU bound stuff
- WTF
- embedded
- games
If you're doing C and you don't know what of these bullets apply, you're on WTF.
FTFY ;)
-
@anotherusername said in Best Linked List Ever:
@wharrgarbl @Steve_The_Cynic I'm going to go with on this one...
Would be nice if the dumbox put the article title and the word 'Comments' in the link
-
@RaceProUK it would, if the page's HTML put the article title and the word 'Comments' in the title...
I'll edit a link text in, though, so it's clearer...
-
@RaceProUK C games that don't fit into the heavy cpu bound stuff goes into the WTF bucket
-
@anotherusername Wow, Alex really blew it with that article.
-
@wharrgarbl said in Best Linked List Ever:
@anotherusername when I delete something I want it deleted, not a "hidden bit" to be set
We already have .net for that...
(or I should say managed languages...)
-
@dcon C# List is an array of references. That is quick to delete, but C# references will have the same memory locality problems of linked lists.
I dunno if there is any way to ensure objects are together in memory in .NET or any other memory managed language.
-
-
@dcon I know, @anotherusername idea is kind of like mark-and-sweep garbage collection
-
@wharrgarbl exactly like that.
-
@anotherusername Except with GC I don't need to keep checking for the flags in my code.
But where soft deletes get really ugly is in databases.
SELECT ...
FROM A, B, C, D, E, F
WHERE ...
AND A.DELETED = 0
AND B.DELETED = 0
AND C.DELETED = 0
AND D.DELETED = 0
AND E.DELETED = 0
AND F.DELETED = 0
-
@wharrgarbl in databases you just create a view that only includes records that aren't marked deleted.
-
@anotherusername look at you and your fancy views
-
@anotherusername said in Best Linked List Ever:
@wharrgarbl @Steve_The_Cynic I'm going to go with on this one...
Nope. I'm well aware of the reference. Most of those embedded systems I worked on didn't have filesystems, nor operating systems as such.
-
@wharrgarbl said in Best Linked List Ever:
@dcon C# List is an array of references. That is quick to delete, but C# references will have the same memory locality problems of linked lists.
I dunno if there is any way to ensure objects are together in memory in .NET or any other memory managed language.
Use a list of
struct
s, since they're value types they'd be all together in the array.
-
@Steve_The_Cynic yeah, well, I was pretty sure that @wharrgarbl at least didn't make the connection, and you pretended not to so I went ahead and whooshed you both...
-
@Steve_The_Cynic said in Best Linked List Ever:
Most of those embedded systems I worked on didn't have filesystems, nor operating systems as such.
The one I'm working has a filesystem, but you need to access with special snowflake functions instead of the usual stdio ones. It sucks for porting code.
-
@Steve_The_Cynic said in Best Linked List Ever:
Almost all the embedded stuff I've ever done(1) has been in C.
You can do C++ on the serious embedded platforms just fine, but you have to profile the language pretty hard. The big thing that absolutely has to be removed from the base language is exceptions, as their cost in terms of memory is pretty huge. That in turn means you lose the whole of the C++ standard library and Boost, but that's less of a problem. ;-) Virtual inheritance can stay in limited circumstances, but that's usually only required where you would otherwise have to have other types of indirect call if you were doing it in C, so the actual cost is largely moot.
RTTI is another no-no, but you probably haven't got all that many objects in the first place; hard-coding something instead if required is actually practical.
-
@dkf said in Best Linked List Ever:
@Steve_The_Cynic said in Best Linked List Ever:
Almost all the embedded stuff I've ever done(1) has been in C.
You can do C++ on the serious embedded platforms just fine, but you have to profile the language pretty hard. The big thing that absolutely has to be removed from the base language is exceptions, as their cost in terms of memory is pretty huge. That in turn means you lose the whole of the C++ standard library and Boost, but that's less of a problem. ;-) Virtual inheritance can stay in limited circumstances, but that's usually only required where you would otherwise have to have other types of indirect call if you were doing it in C, so the actual cost is largely moot.
RTTI is another no-no, but you probably haven't got all that many objects in the first place; hard-coding something instead if required is actually practical.
Oh, yeah, I know all that. When the embedded system is a 68HC11E9 with 18K of RAM, and the year is 1992, C++ is a non-starter.