Graphs / Trees and C#



  • I already stubbed my toe on the problem that Azure Mobile Services does not provide Join for their LINQ queries. But thanks to the help from some guys here, that particular problem could be circumvented (albeit by brute force).

    However, for the next part I needed a new data type - namely a Tree like data structure. As I'm fairly new to C# I thought to myself: "Surely this is a problem someone else has run into before? And actually solved?"

    As it turns out, sort of.

    First I ran across a Graph library from Microsoft itself - alas, this library doesn't do Store Apps.
    Then there's a General Data Types library in NuGet which provides trees/graphs. It's from 2011, though, and doesn't do .Net 4.5
    There's an article from 2006 on MSDN which actually constructs a Graph Class. Doesn't compile though, due to the inherited interface requiring some methods which obviously weren't needed back then.
    There's another blog entry which is chockfull of spelling errors (doesn't compile as well), from 2013.

    But the kicker? A data type library named "C5" which tells me that it provides Trees but for the actual documentation I first have to build the solution, then run an external program (named Sandcastle) to build the documentation which then is a bit on the unhelpful side as to how to actually use the Tree data structure. Not to mention that the idioms are strange - I guess a "Red-Black-Tree" is a binary tree? And what are "Treebags" supposed to be?

    Gah. Seems as if I really have to roll my own.


  • Java Dev

    Yeah, a Red-Black-Tree is a type of self-balancing binary tree. Can't help you on treebags or C#, I'm afraid.

    What kind of tree are you looking for? Something like a XML/DOM tree?



  • No, a simple tree structure with a root which has directed edges towards the vertices. No loops and no edges at the same "level".


  • FoxDev

    is this a sorted collection of things, or does the position in the tree carry semantic values?



  • Basically, it's like this:

    I want to store grades for my pupils. As such, the structure is like this: Every node can contain one or several grades, possibly overriding the grades below. But the grades itself would not reside in this tree.

    As such, the tree contains a root which would be the school year. Every school year would then contain 2 semesters, 3 trimesters or other blocks of time as nodes. These nodes then contained oral and written grades. And the oral grades contain collaboration, tests, homework as nodes.

    I.e. something like this:

    Schoolyear => 1. Semester => Written Grades
                              => Oral Grades => Collaboration
                                             => Homework
                                             => Tests
               => 2. Semester => Written Grades
                              => Oral Grades => Collaboration
                                             => Homework
                                             => Tests
    

    This would be one example of a possible layout. I also want to include weights (Homework doesn't count as much as collaboration) and special rounding rules (like, if the average of 2 exams results in a .5, then the 2nd exam determines whether you round up or down).


  • FoxDev

    ah. i don't think a tree is what you want. i think a collection of strongly typed objects would be more what you are looking for.

    have a GRADE object with a SCORE and a WEIGHT
    and a CLASS object with a list of GRADE objects
    and a SEMESTER object that had CLASS objects

    and so on.



  • That layout was just one example, though.


  • FoxDev

    right, but based on your description a arbitrary tree does not sound what you want.

    i'm sure with some careful planning you can come up with a proper schema for object hierarchy and achieve the same effect.

    if not then.... i dunno. i'd say look at working with an XML document or something...?



  • @accalia said:

    right, but based on your description a arbitrary tree does not sound what you want.

    i'm sure with some careful planning you can come up with a proper schema for object hierarchy and achieve the same effect.

    if not then.... i dunno. i'd say look at working with an XML document or something...?

    Thing is, as I said, that each layer can have an override value - i.e. I don't have to accept the calculated value from the layer below (makes sense in case you have a pupil right on the edge between two grades, for example).


  • FoxDev

    @Rhywden said:

    Thing is, as I said, that each layer can have an override value

    hmm... that complicates things a bit..... let me think....



  • Finally found a fairly recent implementation which largely seems to do what I want.

    Save for serialization but that shouldn't be too hard.


  • Discourse touched me in a no-no place

    @Rhywden said:

    Thing is, as I said, that each layer can have an override value - i.e. I don't have to accept the calculated value from the layer below (makes sense in case you have a pupil right on the edge between two grades, for example).

    Doesn't mean you want a generic tree though. The main use of a generic tree is for implementing a map where you want to keep the keys ordered and want bounds on the insert and remove complexity. That's not your case: you're assigning meaning to each level of your tree.

    You might think in terms of having each node in the tree be a member of a subclass of some class that provides the general summing/summarisation capability. Or maybe you'd implement that separately at each level if that makes more sense. Even with a few hundred people in a class, I wouldn't expect you to need anything other than very basic built-in datatypes to implement the tree nodes; structural changes aren't going to be needed very often after all.



  • I won't be the only one to use this, though. Thus it doesn't make sense to tailor it specifically to only my use case.


  • Discourse touched me in a no-no place

    @Rhywden said:

    I won't be the only one to use this, though. Thus it doesn't make sense to tailor it specifically to only my use case.

    Don't get too hung up on making it generic. There's a fairly good chance that you will be the only person ever to do a deployment of this (because life's cruel like that). Might as well focus first on making it work for you, since then it will at least work for one person instead of being yet another lump of code in the pile marked “no use to anyone”.



  • It sounds like an ideal problem to solve using bog-standard object inheritance. The tree's just going to complicate the shit out of things. (Remember, an object can contain a List of other objects.)



  • @Rhywden said:

    I won't be the only one to use this, though. Thus it doesn't make sense to tailor it specifically to only my use case.

    KISS, keep it simple, stupid. Until you need new functionality, leave it out. You might go 30 years and never end up needing it. Read Joel Spolsky's article on Architecture Astronauts and internalize it.

    (You're going to need to build all those classes/interfaces anyway. Might as well just codify the relationship between them in the type system where the compiler will enforce it for you rather than a generic BalancedTree<Object>(). The type system is your best friend in C#.)

    Moreover, frequently, it's better to gently guide users to working the way you anticipated than providing a product that can be configured in a million ways. (Think of, say, what Apple does compared to what, say, Lotus Notes does. Which products are more popular? Which are more loved by their users?)

    EDIT: what dkf said. Although he's still an idiot for what he posted in the StackOverflow thread, he's right on this point.



  • @blakeyrat said:

    it's better to gently guide users to working the way you anticipated than providing a product that can be configured in a million ways.

    I hope the TDWTF is required reading for Linux developers.



  • @Gurth said:

    I hope the TDWTF is required reading for Linux developers.

    I don't think Linux developers read anything except perhaps Hop on Pop.



  • I had to Google that, but now I have, it doesn’t seem to be overly complicated, so I doubt many consider it worth reading.



  • In addition to what everyone else said: If you manage to make this work using a tree, you will probably have only succeeded in creating a weakly-type implementation of something that should be strongly typed. Each type of TreeNode will be expected to behave according to the rules of its type, not the generic rules of a TreeNode. It sounds like you will either end up with a buch of classes that inherit from TreeNode (ehat everyone else is already suggesting, only more complicated) or a bunch of switch statements (which will deserve its own article on the front page).


  • Discourse touched me in a no-no place

    @blakeyrat said:

    Moreover, frequently, it's better to gently guide users to working the way you anticipated than providing a product that can be configured in a million ways.

    Whether or not a piece of software can be configured in a million ways, try to have a really good guess at getting things configured correctly right out of the box without the user having to say anything. Sometimes that takes a bit of effort (either in the first-startup code or in the installer) but it is totally worthwhile; it means that when someone just tries it out, it works. Maybe it doesn't work as perfectly efficiently as it might with tuning, but actually working first time is a powerful incentive to users to keep using the software.

    It's one of the “that sounds dumb and obvious” that so many programmers (especially academic programmers 😦) totally fuck up on.


  • Java Dev

    As my area lead from when I stared would say: You cannot build a generic solution based on a single usecase.



  • As a Bag datastructure is generally a Set where each element stores an additional count, I assume a TreeBag is a Bag datastructure that uses a balanced tree (like a 2-3-4 tree or a red-black tree) to achieve O(log n) complexity for determining how often a key (of a sortable data type) is in the Bag.

    So probably not what you want to do with your tree :)


  • I survived the hour long Uno hand

    This is why formal education in data structures is useful to programmers.



  • @PleegWat said:

    As my area lead from when I stared would say: You cannot build a generic solution based on a single usecase.

    I myself have four other cases. My colleagues do as well 😄


  • Discourse touched me in a no-no place

    @Rhywden said:

    I myself have four other cases. My colleagues do as well 😄

    It's missing the point. Solve a specific use case first. Then make things more generic if necessary. It's easier to do it that way round, and you spend less time working on making stuff generic that doesn't need it.

    For example, you could theoretically make your scores be any type that supports an appropriate number algebra, but that would be a dumb thing to do as it would be quite a lot of work and nobody will actually need anything like that. Or at least to as good an approximation of “nobody” as makes no difference.



  • Stop thinking about the trees. It's not the 80s. A tree is now an implementation detail of a collection, just like a hash array or a regular array. And what you actually need is an object that can hold a collection of other objects, which can in turn hold collections of other objects.

    In short, you basically need a bog-standard object, perhaps with a side of interfaces and/or generic methods to operate on its contents.



  • Wait, was the problem that you wanted to be able to display the data as a tree? Because that's a whole different question.



  • Guys, language X doesn't have a GenericDataStructureGeneratorGeneratorFactory which means I have to write ten lines of code by myself! Whatever shall I do?



  • This post is deleted!


  • @ben_lubar said:

    Guys, language X doesn't have a GenericDataStructureGeneratorGeneratorFactory which means I have to write ten lines of code by myself! Whatever shall I do?

    Use PHP. They have generators.

    Catch: you still have to write those 10 lines by yourself.


  • Java Dev

    @ben_lubar said:

    Guys, language X doesn't have a GenericDataStructureGeneratorGeneratorFactory which means I have to write ten lines of code by myself! Whatever shall I do?

    That was actually my first thought, but I typically work in C where building my data structures from scratch is the preferred solution, rather than the last resort.



  • Some advanced-level Linux developers have progressed to the Cat and the Hat. None of them are remotely close to The Five Hundred Hats of Bartholomew Cubbins.



  • Nice sentiment right there. And so helpful!

    Let's reinvent the wheel ALL THE TIME! Silly me for thinking that such a basic problem already had some readymade solutions.

    I'll remind you guys of this sentiment the next time you laugh at a reimplementation of the Date()-Object or something similar.

    And I want to see those 10 lines. And not 10 lines of "30 commands per line". Traversal, children/parent seeking and counting, insertion/moving/removal, sanity checks (e.g. no orphaned nodes) and events. Oh, and Data Binding, of course.


  • Discourse touched me in a no-no place

    @Rhywden said:

    Let's reinvent the wheel ALL THE TIME! Silly me for thinking that such a basic problem already had some readymade solutions.

    The basic structural stuff is there. You want a mapping? Hashtable (or one of the other Dictionary types). But you want to attach semantic meaning to the branches? You have to write some of that yourself, and that's the point where you are justified in doing it. You don't want to model that sort of thing with a balanced tree (well, not directly) because those will change where things are stored in the tree during the rebalancing process.

    You might use a balanced tree within one level of conceptual tree to store the children of that conceptual tree node, if justified (probably not justifiable though), but that's an implementation detail. The clients of the conceptual tree shouldn't care and shouldn't have to care.

    (What is this, Software Engineering 101?)



  • You've mentioned you want a recursive bag-like structure where arbitrary properties are calculated by aggregating the same property on the bag's contents, but only if the same property isn't set locally. And the aggregation needs to be an instance-specific user-defined lambda, to accommodate weights and your super-special rounding rules.

    That sounds like quite the basic problem that everyone always has. </sarcasm>

    If you have a domain, create domain-specific objects that work within that domain. class Year, class GradingPeriod, class GradeType, and perhaps even class GradeCategory are crying out to be written, and you can even use actual god-damn words and types for property names like public char LetterGrade.

    If you don't have a domain, and are just going "I wanna have a tree because I wanna tree and this one domain problem kinda looks like a tree", then you should calm down and reconsider why exactly you want this. Sure, it can be written up in ~20-30 minutes with a DynamicObject subclass or two, but it wouldn't be fit for any purpose.

    Also, Ben's complaint is apt, considering the unholy mess "meaningful" Java EE development has turned into. Thanks to GenericDataStructionGeneratorGeneratorFactoryBrokers, the first half of such Java projects is writing not-Java. We have the full power of our language to express what we want to happen, and the knowledge to express how we want that to happen. Why cripple ourselves?



  • @Rhywden said:

    Let's reinvent the wheel ALL THE TIME! Silly me for thinking that such a basic problem already had some readymade solutions.

    You're kind of missing the point of our advice.

    The basic problem has been solved, it just hasn't been solved in the manner you were expecting.

    @TwelveBaud said:

    If you have a domain, create domain-specific objects that work within that domain. class Year, class GradingPeriod, class GradeType, and perhaps even class GradeCategory are crying out to be written, and you can even use actual god-damn words and types for property names like public char LetterGrade.

    Exactly. Implementing rules for weighting grades are a bit tougher, but not even remotely close to unsolvable, and probably much easier to solve in the type system than outside it.

    @TwelveBaud said:

    Also, Ben's complaint is apt, considering the unholy mess "meaningful" Java EE development has turned into. Thanks to GenericDataStructionGeneratorGeneratorFactoryBrokers, the first half of such Java projects is writing not-Java. We have the full power of our language to express what we want to happen, and the knowledge to express how we want that to happen. Why cripple ourselves?

    I would just like to point out that this is a people/culture problem in Java. There's nothing inherent to Java that requires bullshit like Ben's joke any more than C#; the only difference is C# developers are much more practical-minded. (Generally-speaking.)


  • BINNED

    @Arantor said:

    Catch: you still have to write those 10 lines by yourself.

    1. You'll have to write that unset() twice :P


  • They fixed that in 5.1.4.

    Or was it 5.1.5. Anyway, fixed years ago.



  • @TwelveBaud said:

    "I wanna have a tree because I wanna tree and this one domain problem kinda looks like a tree"

    I think that's pretty much the whole thread right there.

    @dkf said:

    (What is this, Software Engineering 101?)

    Touché.


  • BINNED

    @Arantor said:

    They fixed that in 5.1.4.

    Or was it 5.1.5. Anyway, fixed years ago.

    I know.

    you killjoy


  • Discourse touched me in a no-no place

    @Onyx said:

    you killjoy

    As it's still in my clipboard....


  • Discourse touched me in a no-no place

    @blakeyrat said:

    There's nothing inherent to Java that requires bullshit like Ben's joke any more than C#; the only difference is C# developers are much more practical-minded.

    Sensible Java programmers avoid that short of thing too, or at least corral it off in somewhere like Spring so that they don't have to think about it. When someone starts waffling about AbstractFactoryBuilderStrategies and so on, you know they're a bunch of know-nothing Architecture Astronaut wannabes. (Well, assuming that they're serious about it. 😃) The same would apply to someone doing the same sort of shit anywhere else.

    I ought to look up whether any of the SE courses at work teach the principles of DI or IoC. The specific methods for doing this sort of thing might (inevitably) be going to change, but the principles are good ones that are reusable across many domains.



  • @blakeyrat said:

    The basic problem has been solved, it just hasn't been solved in the manner you were expecting.

    QFBWR



  • @dkf said:

    (What is this, Software Engineering 101?)

    http://www.cs.uwm.edu/~cs101/



  • @ben_lubar said:

    @http://www.cs.uwm.edu/~cs101/ said:
    Word Chapter 1 Notes

    They've sunk really deep…


Log in to reply