Help me not define a programming language in a database



  • I have been tasked until such time as I fail, with helping come up with an ideal architecture for an application. It needs to be configurable and reliable and non-programmers need to be able to configure it.

    I'm not going to go into much detail about the system as a whole, just one individual problem that I reckon is going to crop up about 10 million times during this process of designing it.

    I have a set of things. At some point I am going to have some data come in, that would cause me to consider adding thing X to list A. List A has an arbitrary set of conditions (imagine a list of possible conditions Q, and A has selected a subset Q') that must be evaluated to decide whether thing X belongs in list A.each condition basically translated to a method in c# which returns true/false.

    Supposing Q' contains items Q1, Q2, and Q3. If I force these to be evaluated in the form Q1 AND Q2 AND Q3, everything is fine, however, we might find we want List B to have Q1 AND (Q2 OR Q3) for example.

    I was reasonably happy translating this into a database schema of:

    Table of Lists
    Table of Conditions

    Table of ListConditions linking Lists and Conditions as *:*

    until I got to the point of thinking about different combinations of conditions. I could add an expression column to List, but this means I either duplicate data by specifying the conditions twice (once in ListConditions, once in the expression) or I break the relationship in SQL and have to find my conditions in code through multiple queries.

    An alternative would be to specify that expression in another condition (e.g. Q' = Q1, Q4. In code Q4 is:

    return Q2() || Q3();
    

    ) but then I feel like I will end up with a list of conditions that is unmanageably long, unusable and unmaintainable, and will be changed for every new customer (I can't find the 4 conditions I need, so I define Q5632 which accidentally duplicates Q123 Q469, Q912 and Q893), which is exactly the problem we are trying to eliminate.

    I feel like I am getting into the whole enterprise platform within a platform area of shittyness, which I am obviously keen on avoiding.

    This problem basically boils down to:

    Input

    Arbitrary set of conditions evaluated in an arbitrary way to validate that data (or decide if it means we should do something)

    boolean output

    Is there an established pattern for implementing any of this? I've seen this in 2 companies previous to this one, implemented in different ways (one had programmers specify the evaluation, one had manual evaluation). I feel like there must be something that handles this in a reasonably clean and clear way, I can't accept that essentially everywhere I have worked has had a variation on this problem and there isn't a generalised form with a corresponding development pattern to make it easy to resolve.

    does anyone have any thoughts?


  • ♿ (Parody)

    @algorythmics said:

    I have been tasked until such time as I fail, with helping come up with an ideal architecture for an application. It needs to be configurable and reliable and non-programmers need to be able to configure it.

    @algorythmics said:

    Is there an established pattern for implementing any of this?

    Sounds like you're going for an inner platform. More of an anti-pattern, really. You were really close:

    @algorythmics said:

    I feel like I am getting into the whole enterprise platform within a platform area of shittyness, which I am obviously keen on avoiding.

    More seriously, this is the goal of lots of things that go by names like business rules engines, where you give a relatively simple (implementations vary) DSL that's heavy with plain language wrappers.



  • I used a data structure like this:

    [
    	{
    		field: "email",
    		operation: "EQ",
    		operand: "cartman@gmail.com"
    	},
    	"OR",
    	[
    		{
    			field: "suspended",
    			operation: "EQ",
    			operand: false
    		},
    		"AND",
    		{
    			field: "schema",
    			operation: "IN",
    			operand: ["administrator", "supervisor"]
    		}
    	]
    ]
    

    Everything in an array is "expression". Keywords are "OR" or "AND". Things inside {} are expressions in the form "a <op> b". You can nest it as deep as necessary.

    I store it in database as a single JSON field. I fond there was no need to ever do any kind of deep search inside the expression itself. I just pluck it out and edit / use, as needed.

    I have two processors that turn this into something usable. One converts it into an SQL where condition. The other one creates a javascript code snippet (email === 'blah' || (...)). Both work pretty well.

    We also did some work on a visual editor so that users can create their own expressions. It worked well enough in a sandbox, although we never got around to actually plugging it into the interface (still on the docket).

    So, in your case I'd use a data structure like this, and create a formatter that turns this into a C# expression.


  • BINNED

    I was thinking about JSON myself, but I got stuck on trying to figure out how to do the operators. I kept trying to nest things instead of keeping it relatively flat like you did.

    Looks as sane as something like this can be to me. And if you ever need to make anything more complicated than AND and OR (hopefully you won't) you could always change the operators from strings to objects, I guess.



  • @Onyx said:

    Looks as sane as something like this can be to me. And if you ever need to make anything more complicated than AND and OR (hopefully you won't) you could always change the operators from strings to objects, I guess.

    Actually, keeping them as strings makes it super easy to parse this "storage format" into actual object tree with print() functions and everything.

    String -> always a keyword
    Array -> always an expression
    Object -> always a statement

    Very simple and clear.


  • BINNED

    Oh, I agree, just saying that if you need something more complex there are options.



  • The more I think about this, the more I feel like I'm doing it wrong.

    Basically a set of conditions is being mapped to a use case (e.g. Thing 1 is destined for Machine A, so it must be X Y and Z or it isn't ready) while supporting that massive configurability of allowing multiple use cases to arbitrarily check X Y and Z could be a good thing, it requires really experienced users, and encourages copy pasting of the X Y and Z check with minor edits (e.g. X Y and Z or W) which is obviously bad.

    I am thinking that maybe I should stop at the use case level, since that's what the stakeholder wants anyway. He wants a shopping list of "you have machine A so you want the checklist for machine A" or machine B etc) That requires more coding when a job comes in that doesn't match a current use case, but we have already committed to charging exorbitant fees in this case anyway, so I don't really think that's an issue.

    So yeah, I guess what I am thinking now is that each list actually just has a single condition, and the mapping just happens in carefully designing the structure of my C# code to support reuse.



  • I didn't analyze your posts fully, but it seems it is a case of



  • That's pretty much exactly it, I still think I might not need to go to that level, but I have bookmarked it for future reference if I do.

    looking to see if there is a library for it, there are several, mostly written in foreign languages, or having no documentation (I'd be happy with even a minimal example of using it) so I guess I'm rolling my own for now if I go down this path, still hoping to avoid it though!



  • That's basically MongoDB's query language.



  • Wouldn't this be a job for a BRMS? I didn't bother to read the whole thing but I've used some BRMS before and, although there's some friction with the lusers, they usually work fine. There are some that even have a GUI with boxes and arrows and stuff.



  • I spent the afternoon mapping this out and have figured out that I am going to go with just one condition for each list.I ran through a couple of examples and everything holds together and looks good. There may be a problem in terms of list dependencies (Updating list 1 can change list 2) But I can just re-evaluate the lists on a change, or if that is expensive, enforce through coding standards that lists cannot depend on each other.



  • I'd say that that's a mighty elegant approach -- basically an AST-in-JSON, almost like a bastardized S-expr. Nice!

    (I bet it also makes the codegens and the visual editor fairly nice, too)

    @Eldelshell said:

    Wouldn't this be a job for a BRMS? I didn't bother to read the whole thing but I've used some BRMS before and, although there's some friction with the lusers, they usually work fine. There are some that even have a GUI with boxes and arrows and stuff.

    My thing with the "business rules management system" notion is that it's treated as a blackbox with a proprietary and non-extensible input language all too often, rather than what it really is, which is a DSL for scripting some facet of your business logic.


  • Discourse touched me in a no-no place

    @algorythmics said:

    I have a set of things. At some point I am going to have some data come in, that would cause me to consider adding thing X to list A. List A has an arbitrary set of conditions (imagine a list of possible conditions Q, and A has selected a subset Q') that must be evaluated to decide whether thing X belongs in list A.each condition basically translated to a method in c# which returns true/false.

    I was told there would be no math.


Log in to reply