A critical look at Marvel vs. Capcom....

Jaime

@Benjamin-Hall said in A critical look at Marvel vs. Capcom....:

get that all in one go seems a lot more efficient

Is this still true when you look at the larger scheme of things? If you get users and permissions all in one go, what do you do when you only need a list of users? Do you:

Maintain two blocks of code for getting users, one with permissions and one without
Always get permissions even if you don't need them

Both of these are different types of inefficiencies, and it may be that both are worse than the original inefficiency that you described. Or, maybe not. The point is that the people who try to make the task in front of them as easy as possible are usually the ones that make the entire application incredibly difficult to maintain. Those of us who have looked at this problem as a whole usually favor solutions prioritize consistency and compile time binding over fewer lines of code and micro optimizations.

If you do it well, performance problems can often be solved by batch loading and caching rather than by introducing duplication.

Benjamin Hall

@Jaime said in A critical look at Marvel vs. Capcom....:

@Benjamin-Hall said in A critical look at Marvel vs. Capcom....:

get that all in one go seems a lot more efficient

Is this still true when you look at the larger scheme of things? If you get users and permissions all in one go, what do you do when you only need a list of users? Do you:

Maintain two blocks of code for getting users, one with permissions and one without

Always get permissions even if you don't need them

Both of these are different types of inefficiencies, and it may be that both are worse than the original inefficiency that you described. Or, maybe not. The point is that the people who try to make the task in front of them as easy as possible are usually the ones that make the entire application incredibly difficult to maintain. Those of us who have looked at this problem as a whole usually favor solutions prioritize consistency and compile time binding over fewer lines of code and micro optimizations.

If you do it well, performance problems can often be solved by batch loading and caching rather than by introducing duplication.

There is that. But overall, I find that there's a balance to be struck and that hard, ideological stands either direction are generally bad. So generally having mostly generic calls but having a few more complex ones for commonly-used, well-defined complex queries.

Zenith

@Groaner said in A critical look at Marvel vs. Capcom....:

The most challenging job I ever had was also the one that paid the lowest. Really gets the noggin turning.

Because really bad developers can really fuck up a ton of stuff if they're not stopped. But it's this self-reinforcing problem. Management pays peanuts, gets these monkeys, decides all programmers are monkeys, and uses that to justify keeping the low pay. I see it every single day in government. The difference is that, in government, if you learn to be helpless, you'll probably be alright. Private sector? Same mountain of fuckery, same low pay, same unwillingness to even attempt solutions. But you're on borrowed time, a little more if you visibly spin your wheels (be seen "doing something") but borrowed time nonetheless. I guess the corollary to "stay in a good job, grass isn't greener" is "rocket, don't run, to the exit if you smell failure."

Jaime

@Benjamin-Hall

You don't seem to be aware that there are already solutions to these problems... and they are in better programming languages already. For example, .Net and Linq make the choice completely go away - you simply bind to the database schema by creating an Entity Model and write your fetches using a very fluent style that generates efficient and correct queries. The original inefficiency that you described is gone and it isn't replaced with another one.

The real takeaway here is that these language-based features are used in place of anything at the DB layer like stored procedures or views.... those are the 1990s solutions to these problems. There are cases where they are still the best solution, but the number of those is dwindling. If you are living in one of these scenarios, you should look into modernizing how you build software.

Benjamin Hall

@Jaime said in A critical look at Marvel vs. Capcom....:

@Benjamin-Hall

You don't seem to be aware that there are already solutions to these problems... and they are in better programming languages already. For example, .Net and Linq make the choice completely go away - you simply bind to the database schema by creating an Entity Model and write your fetches using a very fluent style that generates efficient and correct queries. The original inefficiency that you described is gone and it isn't replaced with another one.

The real takeaway here is that these language-based features are used in place of anything at the DB layer like stored procedures or views.... those are the 1990s solutions to these problems. There are cases where they are still the best solution, but the number of those is dwindling. If you are living in one of these scenarios, you should look into modernizing how you build software.

Oh absolutely. My point was to point out how we aren't using the right tools currently. IE "don't be like us, instead". Hence my "Or really, don't do this" statement.

And yeah...well...we're using Node, PHP (without an ORM), Typescript, and a bunch of other stuff, including Perl. None of it anywhere near the current versions. So while it would be nice to just have something like that, we don't. And there's enough legacy here that we likely never will.

Heck, we've got 10k lines of javascript/HTML in a component.vue.php file (note the ending) containing all of our front-end web components that gets included on every load of the main web interface. Confuses the everliving heck out of browsers, by the by.

Bulb

@Jaime said in A critical look at Marvel vs. Capcom....:

You don't seem to be aware that there are already solutions to these problems... and they are in better programming languages already. For example, .Net and Linq make the choice completely go away - you simply bind to the database schema by creating an Entity Model and write your fetches using a very fluent style that generates efficient and correct queries. The original inefficiency that you described is gone and it isn't replaced with another one.

Yeah, LINQ managed to find a pretty good balance. It is integrated in the language, but it is effectively a different syntax for SQL, so it has most of the power of the database, while providing all the type-safety and integration with the language embedding strings just can't give you. And the language does evolve faster than the database engines.

MrL

@Bulb said in A critical look at Marvel vs. Capcom....:

Yeah, LINQ managed to find a pretty good balance. It is integrated in the language, but it is effectively a different syntax for SQL

It's even two syntaxes. The SQL-like, thankfully not widely used, and the nice fluent one.

Groaner

@MrL said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

Yes, you are forced to do searches, because business logic in SPs is a fragmented mess that you have no control of.

What?

If one doesn't even have control over the stored procedures that one wrote by oneself, one probably has bigger problems than business logic concerning the database being in the database.

You don't have control over BL in SPs as a whole, because for anything nontrivial it becomes a convoluted fragmented mess. So you have to do searches, hoping you'll find all relevant places, instead of being immediately shown by your tools.

As long as you have all the source code, you always have control. And I'm sure there are cases where such tools would fail as well.

What runtime errors?

Key/constraint violations, type conversion errors, etc.? Please tell me you don't just hit Build and pray.

ORM takes care of those.

Including...

check constraints defined with a complex expression or scalar function?
computed column constraints to enforce "only one row per foo_id can be null" patterns and the like?
single-row table pattern constraints?
indexed views to enforce integrity across multiple rows or multiple tables?

I'm pretty sure I could come up with something that would defeat this automagical resolution, unless ORMs are now capable of solving the halting problem.

INB4, yes, you can violate a constraint if you have no idea what you're doing.

Then maybe relying on compile-time checks isn't good enough?

Can this magical ORM do table/index/query hints?

Depends on ORM.

Does it know how to aggregate rows into a temp table when needed?

Doesn't really make sense when using ORM.

So ORM's aren't the be-all-end-all, then? I mean, it's been my experience that ORMs are very good at CRUD with one Widget at a time, and a pain with anything more complicated than that.

Because SQL is a bad choice for writing business logic

Why?

It's a query language, with bad readability, verbose inconsistent syntax, terrible flow control and subpar tooling. It's great for, well, querying data.

Sounds pretty subjective. I can read SQL just fine, the syntax is plenty clear and often more compact than in an imperative programming language.

SQL Server has had try/catch since at least 2005, and if/while/case/recursion for even longer. What flow control are you missing?

What kind of tooling would be "par" or "above par?"

Hint: if your application uses a database, it deals directly with a database AT SOME LEVEL. There's no escaping it.

If you add a new column with a constraint to dbo.Orders and want to use it in your app, you need to update all your business objects to reflect that. If you add a new table, you need to add corresponding business objects to reflect that. If you change the relationship between two tables, you have to update all your business objects to reflect that.

Yes, you update your mapping and immediately see where it makes a breaking difference. You fix it, run tests and immediately you know what pieces of logic were impacted.

The process is pretty similar with stored procs, with the exception of not having compile-time breakage which may or may not be a reliable indicator of actual breakage. If there's something wrong with your stored procedure, it won't pass your tests. If there's something subtly wrong that goes undetected, that can also happen in application code.

Also, stored procs can be mapped to methods, which is quite nice and eliminates some of the problems mentioned here.

Groaner

@izzion said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

Can this magical ORM do table/index/query hints?

In general, no, and if you're working with things that are such a mess that they need those hints either (1) your database people need to get off their asses and fix the indexing so the query optimizer can build decent plans to begin with

The DB people may have optimized the schema to be highly efficient with certain kinds of operations. However, there are almost always tradeoffs and you can't optimize for every type of access. Such is life.

I'll give you a real-life example: once upon a time, I helped build a survey app from the ground up. Each Survey would have one or more Sections, each of which would have one or more Questions. Sounds simple, right? Well, over the next few months, after this schema was built, you got the following additional requirements:

Our users are stupid and don't appreciate that deletes are forever, so instead of deleting surveys, sections, or questions, can we add a soft delete flag to all of them?
Also, only certain users should be able to see certain questions based on their assigned user type(s).
But admins and super-admins should see all the questions in Admin Mode™.
Oh yeah, can we make it so users only see certain questions if they answered a previous question a certain way?
Can we make it so that users only see certain questions if they answered several previous questions in certain ways?
Can we change what questions a user sees based on their answers in other surveys?
Can we hide certain questions for users at Institution X and only show others at Institutions Y and Z?

Imagine what a query would look like to retrieve all the questions visible to a user, or even question n+1 if the user is currently viewing question n. If you're imagining correctly, it would be about a page tall and very complicated, no matter whether you do it in SQL, LINQ or whatever. The complexity is not indicative of poor design, it's indicative of reality being complicated.

And while you might be able to optimize such complex logic for that particular case, when you're asked to do a different complex operation (like come up with pie charts of response distribution to each question), there's no guarantee it'll automatically be performant with the right ~~haggis~~indexing.

or (2) you're doing reporting that is sufficiently complex that you should use a purpose built tool (e.g. SSIS or whatever the $$$ORAORAORA equivalent is).

The elders told me when I was a padawan never to rely on SSIS for anything more than simple bulk loads as:

such integration tools are notoriously buggy, and
ask someone who's built out their entire workflow in SQL Server 2000 DTS packages (and are thus stuck on SQL Server 2000) how well that's worked out for them

Does it know how to aggregate rows into a temp table when needed?

Generally the same as above. Though if you have a really grungy query that needs to do a lot of aggregation and temp table work, with the ORM solution, you have the option of explicitly pulling all of the data across to the application tier (read as: web server) and do the aggregation/calculation work there. Where, to paraphrase Brent Ozar, "you aren't paying $7,000 a core" to do that work, so it's probably more cost efficient to let the ORM pull everything down and crunch numbers in CLR code anyway!

That sounds like a good idea until you're retrieving millions of rows and pushing gigabytes over the wire for every page load.

Does it retain compiled execution plans and allow users to address parameter sniffing?

The SQL server will retain compiled execution plans for standard queries just as easily as it does for stored procedures (don't believe me? Install a SQL 2016+ dev box to play with somewhere, turn Query Store on, and throw a bunch of the same plain old query at it over and over again).

Yes, I know how sp_executesql works.

I'll concede that if you do have a parameterized query that is susceptible to parameter sniffing, the ORM isn't going to ever really know about it or be able to do the stupid ~~human~~ stored procedure tricks that exist to break parameter sniffing. But (again, assuming modern SQL Server here, since that's my domain of expertise) you still have access to Query Store's tooling to strongly encourage the good plan, or you could look at tuning the query to actually fix the small plan / big plan problem or just opt into pulling everything down and doing the data processing that causes the parameter sniffing spills in the application tier's cheaper CPU/RAM.

Yes, you can do things in memory in the application side when you can fit things into memory. But to get to that point, you're often going to have to do some heavy lifting on the SQL side, and the conditions may be nontrivial.

Groaner

@Zenith said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

The most challenging job I ever had was also the one that paid the lowest. Really gets the noggin turning.

Because really bad developers can really fuck up a ton of stuff if they're not stopped. But it's this self-reinforcing problem. Management pays peanuts, gets these monkeys, decides all programmers are monkeys, and uses that to justify keeping the low pay. I see it every single day in government. The difference is that, in government, if you learn to be helpless, you'll probably be alright. Private sector? Same mountain of fuckery, same low pay, same unwillingness to even attempt solutions. But you're on borrowed time, a little more if you visibly spin your wheels (be seen "doing something") but borrowed time nonetheless. I guess the corollary to "stay in a good job, grass isn't greener" is "rocket, don't run, to the exit if you smell failure."

A lot depends on what your end goals are. If you want to be challenged and forced to grow and rise to the occasion (and underpaid), go work for a startup. If you want a large salary and the ability to shrug off things ever so slightly outside your bailiwick as Somebody Else's Problem™, look into government and/or mature companies.

Groaner

@Jaime said in A critical look at Marvel vs. Capcom....:

If you are living in one of these scenarios, you should look into modernizing how you build software.

So.... using a newfangled query language that compiles to Javascript?

Zenith

@Groaner said in A critical look at Marvel vs. Capcom....:

A lot depends on what your end goals are. If you want to be challenged and forced to grow and rise to the occasion (and underpaid), go work for a startup. If you want a large salary and the ability to shrug off things ever so slightly outside your bailiwick as Somebody Else's Problem™, look into government and/or mature companies.

Not in PA it doesn't. Government is trying to find senior developers with 10+ years of .NET Core (and Oracle and Django and Azure and OnBase and...) for $60K and no benefits. They had trouble finding people for half market rates so now they pay $300/hr to firms to find $10/hr "talent." And it shows in what's produced, believe me. Not exactly a great area for startups either. There's no entry level stuff where you can learn. No, you have to already have 10+ years with Azure IoT Hadoop CRM Angular microservices. For a 3-month contract with no benefits (but don't you dare show a pile of 3-month contracts in your experience).

My end goals are apparently weird. I like to learn and be challenged but in a useful context. Let me solve a hard problem or improve performance or build more features? Great. Have me tear the same app(s) apart every single day because a new JavaScript framework on Github is having its 15 minutes of fame? Fuck right off.

Gąska

@Zenith said in A critical look at Marvel vs. Capcom....:

Government

no benefits

dcon

@Gąska said in A critical look at Marvel vs. Capcom....:

@Zenith said in A critical look at Marvel vs. Capcom....:

Government

no benefits

Because they're funding the contracting firm's CEO's mega yacht.

@Zenith said in A critical look at Marvel vs. Capcom....:

now they pay $300/hr to firms to find $10/hr

dkf

@Zenith said in A critical look at Marvel vs. Capcom....:

For a 3-month contract

Even without the peanuts, that's only going to get them monkeys.

with no benefits

Not surprising with a 3-month, but that's still dumb as heck.

(but don't you dare show a pile of 3-month contracts in your experience)

PA has legalized mind-altering drugs, yes? It must have, given the amount that must've been required to think that collection of requirements is a good plan.

Zenith

@dkf People need to have a mind before a drug can alter it.

Unperverted Vixen

@Jaime said in A critical look at Marvel vs. Capcom....:

You don't seem to be aware that there are already solutions to these problems... and they are in better programming languages already. For example, .Net and Linq make the choice completely go away - you simply bind to the database schema by creating an Entity Model and write your fetches using a very fluent style that generates efficient and correct queries. The original inefficiency that you described is gone and it isn't replaced with another one.

It’s replaced with either significant startup time (on EF 6) or lots of developer time trying to keep entities and schema in sync (EF Core).

Until EF Core gets better database-first tooling I don’t think I’ll go back.

Groaner

@Unperverted-Vixen said in A critical look at Marvel vs. Capcom....:

Until EF Core gets better database-first tooling I don’t think I’ll go back.

Database-first really needs to be the first priority because people generally already have a database to work with.

It defeats the purpose of an ORM to have to hand-write classes to map to all your tables à la Code-first.

Jaime

@Unperverted-Vixen said in A critical look at Marvel vs. Capcom....:

It’s replaced with either significant startup time (on EF 6) or lots of developer time trying to keep entities and schema in sync (EF Core).

It was a perfectly cromulent example that there are alternatives that break the false-dichotomy of "use stored procedures" or "embed complex SQL".

I too have never built anything that matters with any version of Entity Framework since every version seems to have some Achilles heel. There are plenty of alternative out there that strike whatever balance you are looking for. Like @dkf above, I usually roll my own database layer. I didn't want to propose that alternative because many would see it as too much work.

Jaime

@Groaner said in A critical look at Marvel vs. Capcom....:

I'll give you a real-life example: once upon a time, I helped build a survey app from the ground up. Each Survey would have one or more Sections, each of which would have one or more Questions. Sounds simple, right? Well, over the next few months, after this schema was built, you got the following additional requirements:

Our users are stupid and don't appreciate that deletes are forever, so instead of deleting surveys, sections, or questions, can we add a soft delete flag to all of them?

Also, only certain users should be able to see certain questions based on their assigned user type(s).

But admins and super-admins should see all the questions in Admin Mode™.

Oh yeah, can we make it so users only see certain questions if they answered a previous question a certain way?

Can we make it so that users only see certain questions if they answered several previous questions in certain ways?

Can we change what questions a user sees based on their answers in other surveys?

Can we hide certain questions for users at Institution X and only show others at Institutions Y and Z?

Imagine what a query would look like to retrieve all the questions visible to a user, or even question n+1 if the user is currently viewing question n. If you're imagining correctly, it would be about a page tall and very complicated, no matter whether you do it in SQL, LINQ or whatever. The complexity is not indicative of poor design, it's indicative of reality being complicated.

It really wouldn't be that hard. Just make the database layer fetch all the questions for all sections. I doubt this would ever become a performance problem. Then, have the business layer handle the visibility rules.

Soft delete is a column and a line of code in the visibility rules
Showing questions based on type would be a simple section of visibility code. It would have the advantage of being distinct from the question fetching code, and therefore you would be less likely to introduce security bugs.
Admins and super admins see all would be a single if.
Question dependency would be quite easy since you have all the questions loaded - making dependent visibility wouldn't be a big deal.
Ditto
Sure

Groaner

@Jaime said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

I'll give you a real-life example: once upon a time, I helped build a survey app from the ground up. Each Survey would have one or more Sections, each of which would have one or more Questions. Sounds simple, right? Well, over the next few months, after this schema was built, you got the following additional requirements:

Our users are stupid and don't appreciate that deletes are forever, so instead of deleting surveys, sections, or questions, can we add a soft delete flag to all of them?

Also, only certain users should be able to see certain questions based on their assigned user type(s).

But admins and super-admins should see all the questions in Admin Mode™.

Oh yeah, can we make it so users only see certain questions if they answered a previous question a certain way?

Can we make it so that users only see certain questions if they answered several previous questions in certain ways?

Can we change what questions a user sees based on their answers in other surveys?

Can we hide certain questions for users at Institution X and only show others at Institutions Y and Z?

Imagine what a query would look like to retrieve all the questions visible to a user, or even question n+1 if the user is currently viewing question n. If you're imagining correctly, it would be about a page tall and very complicated, no matter whether you do it in SQL, LINQ or whatever. The complexity is not indicative of poor design, it's indicative of reality being complicated.

It really wouldn't be that hard. Just make the database layer fetch all the questions for all sections. I doubt this would ever become a performance problem. Then, have the business layer handle the visibility rules.

Depending on how it's done, you may or may not be able to get away with that. If you're using a slimmed-down viewmodel that only holds skeletal information about each question, sure, but each question might have walls-o-text to display to the user, along with embedded images, etc.

Also, it's not enough to just get questions. You also need to fetch all that user's answers not just for the current survey, but potentially previous surveys, since those all can affect the visible questions.

Soft delete is a column and a line of code in the visibility rules

One column and one where condition per table. And likely each new table thereafter will have one.

Showing questions based on type would be a simple section of visibility code. It would have the advantage of being distinct from the question fetching code, and therefore you would be less likely to introduce security bugs.

Think of "type" as one or more roles. You'd need to not only fetch all the questions, but all the user types for that user and the user types for each of the questions. Oh, and user types might change over time and across surveys.

Admins and super admins see all would be a single if.

Pretty much the same in SQL or in the app.

Question dependency would be quite easy since you have all the questions loaded - making dependent visibility wouldn't be a big deal.

To do that, again, you have to load all the user answers every time to compute visibility. Loading the "questions" under such a system entails loading the whole question object graph each time plus looking up a lot of user data.

It might help to cache as much of these structures as possible, but then you have all the usual caching dilemmas - if something changes, how soon is it acceptable for that change to be visible? I've seen a lot of applications that query user permissions on every page load rather than risk a demoted user being able to do damage while the system hangs on to their cached previous credentials.

Jaime

@Groaner Whatever... all of your responses sound like "I don't understand how to do it that way, so I'm going to declare that it will be difficult to implement and maintain".

Here's an important thing to consider: I have done it the "DB first" way, and I've done it in a middle layer. I know the challenges and benefits of both. I'm confident that the solution will be easier to write and maintain if the logic is kept out of the database layer.

I'm also not going to pay for more database licenses to run application code. I'm going to fetch data in reasonable bite-sized consistent chunk and cache the ever living heck out of them. I understand that cache invalidation is a hard problem... but it's the cost of scalability and the key to the whole thing.

dkf

@Groaner said in A critical look at Marvel vs. Capcom....:

It might help to cache as much of these structures as possible, but then you have all the usual caching dilemmas - if something changes, how soon is it acceptable for that change to be visible? I've seen a lot of applications that query user permissions on every page load rather than risk a demoted user being able to do damage while the system hangs on to their cached previous credentials.

That depends on how frequently you check versus how frequently you invalidate, and what the risks of various operations are. (Would caching for 5 minutes be unacceptable, at least for read access? People expect writes to take longer.) If actual invalidations are infrequent, a good approach is to have an epoch counter that is incremented on every invalidation. Software can then just check if the epoch of the last check matches to make a decision on whether its cache is good.

error

@Groaner said in A critical look at Marvel vs. Capcom....:

it's all really just a rebranding of the age-old concept of "separation of concerns."

Separation of concerns is a design principle, which is still adhered to. MVC is an implementation of that principle. (n-tier is another.)

Groaner

@Jaime said in A critical look at Marvel vs. Capcom....:

@Groaner Whatever... all of your responses sound like "I don't understand how to do it that way, so I'm going to declare that it will be difficult to implement and maintain".

I'd say being able to identify pitfalls of alternate implementations is pretty good evidence for being able to understand them.

Here's an important thing to consider: I have done it the "DB first" way, and I've done it in a middle layer. I know the challenges and benefits of both. I'm confident that the solution will be easier to write and maintain if the logic is kept out of the database layer.

And I've seen applications where all the logic that ought to have been at the DB layer was in the application layer, and the DB schema could never change because there were hundreds of different variations on accessing the same tables/objects. Had all access gone through something like views or SPs, maybe something could have been done about the schema.

I'm also not going to pay for more database licenses to run application code. I'm going to fetch data in reasonable bite-sized consistent chunk and cache the ever living heck out of them. I understand that cache invalidation is a hard problem... but it's the cost of scalability and the key to the whole thing.

That's the thing, though. To get to those bite-sized chunks, you need to be able to filter through the millions or billions of rows in the underlying tables. So at least some logic needs to be done at the DB. We are free to disagree on how much. A complex query may add CPU load to a DB server, but so does pulling back megabytes of data for trivial operations.

I will say that I have never seen any "query," done in SQL or in an ORM, to retrieve a list of "active" Orders, or Questions, or Widgets in a mature system that was not uncomplicated, because given sufficiently long contact with reality, no design escapes complication.

Groaner

@dkf said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

It might help to cache as much of these structures as possible, but then you have all the usual caching dilemmas - if something changes, how soon is it acceptable for that change to be visible? I've seen a lot of applications that query user permissions on every page load rather than risk a demoted user being able to do damage while the system hangs on to their cached previous credentials.

That depends on how frequently you check versus how frequently you invalidate, and what the risks of various operations are. (Would caching for 5 minutes be unacceptable, at least for read access? People expect writes to take longer.) If actual invalidations are infrequent, a good approach is to have an epoch counter that is incremented on every invalidation. Software can then just check if the epoch of the last check matches to make a decision on whether its cache is good.

There's also the limiting factor of "how many active user sessions can you comfortably keep in memory?" A smaller user base lets you get away with caching a lot more, but a larger userbase will impose more stringent requirements. Guess which situation will also be more demanding on a database.

Groaner

@error said in A critical look at Marvel vs. Capcom....:

@Groaner said in A critical look at Marvel vs. Capcom....:

it's all really just a rebranding of the age-old concept of "separation of concerns."

Separation of concerns is a design principle, which is still adhered to. MVC is an implementation of that principle. (n-tier is another.)

Perhaps I was too broad in my interpretation, and also treating MVC as I've seen it implemented rather than how it might appear in textbooks.

What's a good name, then, for a design where you not only have models, views, and controllers, but viewmodels and mappers and services and repositories, not just on the backend, but the frontend as well? It's getting pretty annoying to see, for example, in the Angular community, a rigid adherence to these patterns (where there's almost an obligation for all communication between controllers to be done through a service). Then, there's a similar enthusiasm about services and repositories on the ASP.NET side as well.

Whatever happened to YAGNI?

Unperverted Vixen

@Groaner said in A critical look at Marvel vs. Capcom....:

What's a good name, then, for a design where you not only have models, views, and controllers, but viewmodels and mappers and services and repositories, not just on the backend, but the frontend as well?

I can't speak for your Angular example, but if by "frontend" you mean "ASP.NET MVC or ASP.NET Core code"? I'd just call that "moderately complex."

Most of it's optional, though - we don't use mappers, for example.

Whatever happened to YAGNI?

We do need it, although we may be an odd case.

We use the same ASP.NET MVC front-end code with multiple versions of our software. Repositories let us have repository implementations for each possible schema, wiring up the correct one at runtime without the front-end code needing to know any of the particulars.
We use services as our main extensibility point. Client A needs non-standard logic for something (say, changing the sort order on a page), so they'll override the service to customize the behavior for populating the view model (in this example, they can just call base and then tweak the results).

If we left that logic in the controller, the only way to customize it would be to copy/paste the whole thing into the client-specific implementation. That provides a lot of space for the dev to screw up, and would also mean that changes from future releases would never make their way to the client.

Kamil Podlesak

@Groaner said in A critical look at Marvel vs. Capcom....:

What's a good name, then, for a design where you not only have models, views, and controllers, but viewmodels and mappers and services and repositories, not just on the backend, but the frontend as well?

Generally accepted term is overengineering.

Bulb

@Jaime said in A critical look at Marvel vs. Capcom....:

Here's an important thing to consider: I have done it the "DB first" way, and I've done it in a middle layer. I know the challenges and benefits of both. I'm confident that the solution will be easier to write and maintain if the logic is kept out of the database layer.

Manipulating data is usually easier with queries than procedural logic. It does not matter whether the logic is stored in the database as views and stored procedures, in the application as embedded SQL or built with query builder – which of that is better depends more on the project organization than the problem – but it should be executed in the database as query. Because a query is declarative. You don't have to tell the database how, you only tell it what, and it sorts it out.

The database might need some help optimizing the query in the form of creating indices and index use hints, but that can generally be done without restructuring the queries. Optimizing procedural code that uses cubic algorithm usually requires a lot of restructuring.

@Jaime said in A critical look at Marvel vs. Capcom....:

I'm also not going to pay for more database licenses to run application code.

Or you use PostgreSQL and then it does not matter too much whether you spend the CPU time on the database servers or some other ones.

@MrL said in A critical look at Marvel vs. Capcom....:

It's a query language, with bad readability, verbose inconsistent syntax, terrible flow control and subpar tooling.

You can use Java in stored procedures in Oracle. I didn't check, but I believe you can use .NET in stored procedures in SQLServer. And you can use Tcl, Perl, Python, Java, Lua, R, shell, and JavaScript (and probably some more) in PostgreSQL.

I agree tooling is worse. But then, the tooling I have in my current project (C++, Lua) is, well, essentially nonexistent.

@MrL said in A critical look at Marvel vs. Capcom....:

It's great for, well, querying data.

In many applications, major part of the business logic is querying the data the right way.

Regarding project organization. Back in the day I was doing some data imports. It was a huge conglomerate of systems done by different companies.

The components done by our company generally preferred putting more logic into the database. So when I was to import the data there, I got a database connection, opened a transaction, inserted the data in some table and called a stored procedure that checked constraints (the imported data had to be paired with the data already in the database) and created another table of unmatching entries. If it failed midway, it didn't get committed, so the batch would just get restarted later and all was easy. And simple.

There was another component done by different company, where the database was behind an application server. There for each record I had to download a piece of XML with timestamp, create a new piece of XML, send it there, handle transaction failures, remember where in the batch it got in case it crashed midway… And there were all the extra conversions between the components.

I am sure it would have been easy inside the database server. It wouldn't make much difference whether the pairing logic was in a stored procedure or if it was in an application library. But it did matter whether the import could work with a transaction. If there is one application server, you can put the logic in it or in the database and it does not matter much. But if you have many separate components, sometimes putting common logic in the database helps a lot.

dkf

@Groaner said in A critical look at Marvel vs. Capcom....:

There's also the limiting factor of "how many active user sessions can you comfortably keep in memory?" A smaller user base lets you get away with caching a lot more, but a larger userbase will impose more stringent requirements. Guess which situation will also be more demanding on a database.

The only way to truly scale is to have ways to split the database so you can have several servers on that part of the task. That's been known for a long time. But it doesn't change my point: caching strategy needs you to understand invalidation frequency and impact.

Jaime

@Bulb said in A critical look at Marvel vs. Capcom....:

Manipulating data is usually easier with queries than procedural logic.

var total = things.Sum();

No.

@Bulb said in A critical look at Marvel vs. Capcom....:

Or you use PostgreSQL and then it does not matter too much whether you spend the CPU time on the database servers or some other ones.

You still limit scalability. The database layer is the hardest to scale and anything you do to reduce the capacity of a node (such as adding functionality that did not need to be added), will make that scaling harder.

MrL

@Bulb said in A critical look at Marvel vs. Capcom....:

You can use Java in stored procedures in Oracle. I didn't check, but I believe you can use .NET in stored procedures in SQLServer. And you can use Tcl, Perl, Python, Java, Lua, R, shell, and JavaScript (and probably some more) in PostgreSQL.

OMG what a terrible idea. Making BL in DB even worse. Just wow.

boomzilla

@MrL said in A critical look at Marvel vs. Capcom....:

@Bulb said in A critical look at Marvel vs. Capcom....:

You can use Java in stored procedures in Oracle. I didn't check, but I believe you can use .NET in stored procedures in SQLServer. And you can use Tcl, Perl, Python, Java, Lua, R, shell, and JavaScript (and probably some more) in PostgreSQL.

OMG what a terrible idea. Making BL in DB even worse. Just wow.

It's good for stuff like reports or overnight jobs that need to do some complex calculations. Basically, same reason you'd need a stored function, but in a nicer language. Also, you can have the same code in your application and your DB, available to be called in either place as required.

nerd4sale

@Bulb said in A critical look at Marvel vs. Capcom....:

You can use Java in stored procedures in Oracle.

Yes you can, I believe since Oracle 8.1 in the late 90s.
And IIRC Oracle discourages using it, since Oracle 10g (maybe earlier), in 2003.

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

It's good for stuff like reports or overnight jobs that need to do some complex calculations.

Nearly everywhere I've worked that had sane DBAs has banned this.

Imagine being responsible for securing a database where any opaque assembly could be packaging up data and SFTP'ing it to China. Imagine being responsible for performance where the same opaque assemblies could be stuck in an infinite loop. Imagine migrating a database to another node when any assembly might have some obscure dependency on the server name (e.g. SELECT * FROM server.database.schema.table).

Good DBAs understand DB stuff. Putting non-DBA stuff in the middle of their assigned territory makes for a terrible mess.

Also, our security analysis product comes with a ton of templates for common software configurations. One of the rules in all of the Microsoft SQL Server rule packs is "Ensure 'CLR Enabled' Server Configuration Option is set to '0'". We can take this as an indication that even enabling CLR code in Microsoft SQL Server is seen as problematic from a security standpoint.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

It's good for stuff like reports or overnight jobs that need to do some complex calculations.

Nearly everywhere I've worked that had sane DBAs has banned this.

Imagine being responsible for securing a database where any opaque assembly could be packaging up data and SFTP'ing it to China.

Well, we have things to prevent that, like code reviews and deployment history / audits and firewalls and other network monitoring.

Imagine being responsible for performance where the same opaque assemblies could be stuck in an infinite loop.

They'd tell us (developers) to fix our shit.

Imagine migrating a database to another node when any assembly might have some obscure dependency on the server name (e.g. SELECT * FROM server.database.schema.table).

Blah, blah, blah. They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server) so that's on them in any case, because the app needs that.

Good DBAs understand DB stuff. Putting non-DBA stuff in the middle of their assigned territory makes for a terrible mess.

Yeah, let's coddle the DBAs instead of getting shit done. Good strategy.

loopback0

@Jaime said in A critical look at Marvel vs. Capcom....:

Imagine being responsible for securing a database where any opaque assembly could be packaging up data and SFTP'ing it to China.

There should be more robust security on servers anyway, so that database servers can't just randomly connect outside of the network.

boomzilla

@loopback0 said in A critical look at Marvel vs. Capcom....:

@Jaime said in A critical look at Marvel vs. Capcom....:

Imagine being responsible for securing a database where any opaque assembly could be packaging up data and SFTP'ing it to China.

There should be more robust security on servers anyway, so that database servers can't just randomly connect outside of the network.

Yeah, we have several layers of network with the database servers in the innermost layer.

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server)

You don't have to do it. I've re-written plenty of things to remove linked servers and never have I run into a case where it wasn't possible (or even one where it was difficult). Linked servers are generally worse than any other alternative implementation if you consider the whole application development lifecycle... assuming you aren't a cowboy coder that does everything in live.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server)

You don't have to do it. I've re-written plenty of things to remove linked servers and never have I run into a case where it wasn't possible (or even one where it was difficult). Linked servers are generally worse than any other alternative implementation if you consider the whole application development lifecycle... assuming you aren't a cowboy coder that does everything in live.

Yeah...not seeing it. What's the big deal?

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

They'd tell us (developers) to fix our shit.

So tell me, do you would you simply tell an auditor (or a particularly demanding customer) "trust me, I don't make mistakes" when this rule is triggered: https://www.stigviewer.com/stig/ms_sql_server_2016_instance/2019-01-03/finding/V-79179 ?

They don't explicitly say you can't use CLR in SQL... but they do say that you should only allow signed CLR Assemblies, and that implies you should have a proper process of key control, code review, and a secure deployment pipeline (the exact thing that SolarWinds just screwed up).

The truth is... it's simply not worth it. Just put your damn code in the application where it belongs and make your security posture much stronger and much easier to confirm. It's 2021... there is no longer an excuse for poor security operations. The twenty minutes you save isn't worth implementing all of the compensating controls that you should be (and almost certainly aren't) putting in.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

They'd tell us (developers) to fix our shit.

So tell me, do you would you simply tell an auditor (or a particularly demanding customer) "trust me, I don't make mistakes" when this rule is triggered: https://www.stigviewer.com/stig/ms_sql_server_2016_instance/2019-01-03/finding/V-79179 ?

They don't explicitly say you can't use CLR in SQL... but they do say that you should only allow signed CLR Assemblies, and that implies you should have a proper process of key control, code review, and a secure deployment pipeline (the exact thing that SolarWinds just screwed up).

Dunno. We don't run CLR in MS SQL, but we haven't had a problem running java in Oracle. Or if we did it was addressed and approved. More likely it's not able to do all the stuff that the CLR can so it's not the same risk at all.

The truth is... it's simply not worth it. Just put your damn code in the application where it belongs and make your security posture much stronger and much easier to confirm. It's 2021... there is no longer an excuse for poor security operations. The twenty minutes you save isn't worth implementing all of the compensating controls that you should be (and almost certainly aren't) putting in.

LOL, 20 minutes and imagined security problems.

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server)

You don't have to do it. I've re-written plenty of things to remove linked servers and never have I run into a case where it wasn't possible (or even one where it was difficult). Linked servers are generally worse than any other alternative implementation if you consider the whole application development lifecycle... assuming you aren't a cowboy coder that does everything in live.

Yeah...not seeing it. What's the big deal?

I've had arguments with people who thinks it's OK to hardcode sa in a thousand places. I've had arguments with people who think it's not necessary to parameterize SQL calls. Those people thought they were right, too. Enjoy your ignorance-born bliss.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server)

You don't have to do it. I've re-written plenty of things to remove linked servers and never have I run into a case where it wasn't possible (or even one where it was difficult). Linked servers are generally worse than any other alternative implementation if you consider the whole application development lifecycle... assuming you aren't a cowboy coder that does everything in live.

Yeah...not seeing it. What's the big deal?

I've had arguments with people who thinks it's OK to hardcode sa in a thousand places. I've had arguments with people who think it's not necessary to parameterize SQL calls. Those people thought they were right, too. Enjoy your ignorance-born bliss.

I do indeed claim ignorance! Why do you think I asked the question? I can't see how this post has anything to do with the other post.

What does having a link to another DB have to do with parameterizing queries or "hardcode sa" (I have no clue what this means at all)?

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

I do indeed claim ignorance ... have to do with parameterizing queries ...

Ignorance indeed.

Query Parameterization - OWASP Cheat Sheet Series

From the linked page:

SQL Injection is one of the most dangerous web vulnerabilities. So much so that it's the #1 item in the OWASP Top 10.

I listed some of the most common and egregious security mistakes that are made repeatedly by developers. Since you don't have familiarity with Microsoft SQL Server, you do deserve an explanation of the "hardcode sa" reference. "sa" is a built-in system administrator account that used to be created and activated by default in early versions of Microsoft SQL Server. Lazy developer had a habit of simply using that account for all database access and doing it in the laziest possible way, by hardcoding "uid=sa;pwd=xyzpdq;" (or similar) in every connection string. This creates several problems:

sa is a known high-privilege user name. Attackers would simply brute-force the password.
The fact that the password was hardcoded in a lot of places, it would be very difficult to change the password and go and update all of the connection strings.
Later versions of SQL server allowed renaming and disabling of the account. The above habit of hard-coding made it hard to avail one's self of this security enhancement.

This has been a known unacceptable practice for a very long time. Yet, I could easily find ten people that would argue that's it isn't a problem and should be allowed.

Not only do I find that ignorance of security guidance common, I find entrenched, strong opinions of wrong headed security is terribly common.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

I do indeed claim ignorance ... have to do with parameterizing queries ...

Ignorance indeed.

Good lord. You're not reading my post.

I listed some of the most common and egregious security mistakes that are made repeatedly by developers. Since you don't have familiarity with Microsoft SQL Server, you do deserve an explanation of the "hardcode sa" reference. "sa" is a built-in system administrator account that used to be created and activated by default in early versions of Microsoft SQL Server. Lazy developer had a habit of simply using that account for all database access and doing it in the laziest possible way, by hardcoding "uid=sa;pwd=xyzpdq;" (or similar) in every connection string. This creates several problems:

sa is a known high-privilege user name. Attackers would simply brute-force the password.

The fact that the password was hardcoded in a lot of places, it would be very difficult to change the password and go and update all of the connection strings.

Later versions of SQL server allowed renaming and disabling of the account. The above habit of hard-coding made it hard to avail one's self of this security enhancement.

This has been a known unacceptable practice for a very long time. Yet, I could easily find ten people that would argue that's it isn't a problem and should be allowed.

Not only do I find that ignorance of security guidance common, I find entrenched, strong opinions of wrong headed security is terribly common.

Yay. None of which is any kind of an answer to my question about linking to other databases.

Jaime

@boomzilla said in A critical look at Marvel vs. Capcom....:

Yay. None of which is any kind of an answer to my question about linking to other databases.

That's simply yet another security thing that you don't understand. There's no point telling you the specifics since my experience is that nearly everyone will simply refuse to comply, call the security people idiots, and generally whine and cry. I get paid to say this kindly at work, but not here. I gave the two examples to show that these guidances always have good reasons and it's exhausting to explain it over and over.

boomzilla

@Jaime said in A critical look at Marvel vs. Capcom....:

@boomzilla said in A critical look at Marvel vs. Capcom....:

Yay. None of which is any kind of an answer to my question about linking to other databases.

That's simply yet another security thing that you don't understand.

But that you can't explain. Gotcha. And instead of actually explaining what you were talking about you went on to list other stuff that had no relation except that you're convinced no one understands but you.

There's no point telling you the specifics since my experience is that nearly everyone will simply refuse to comply, call the security people idiots, and generally whine and cry. I get paid to say this kindly at work, but not here. I gave the two examples to show that these guidances always have good reasons and it's exhausting to explain it over and over.

So why not just over the one that I asked about instead of bringing up other stuff? Seems like that would have been easier and less work, so I have difficulty in believing you about how exhausting it is. Or, at least that this is anyone's fault but your own.

izzion

@boomzilla said in A critical look at Marvel vs. Capcom....:

They set up the external DB links (which we do have since we integrated with a COTS product that uses SQL Server) so that's on them in any case, because the app needs that.

Caveat: This deals with specifics of MS SQL land, and probably doesn't have the same limitations / considerations in $ORA$ORA$ORA$$$ land

At this point in MS SQL land, there's an almost blanket recommendation of "Friends Don't Let Friends Use Linked Servers" due to some limitations in how Microsoft has implemented them, and the resulting problems those limitations cause:

If you choose to "pass through" credentials from Server A to Server B, you need to have Kerberos set up correctly between the two servers, and between Server A and the client, and have Kerberos delegation set up correctly, otherwise it just fails in weird ways
If you don't want to deal with that headache, you have the option to "impersonate" a specific login, which stores that username & password in the SQL server's master database (in a retrievable way, if someone has sufficient permissions to see the table), and then all connections from Server A to Server B are in the context of the login being impersonated. Which then means permissions on the database in Server B have to be based on the impersonating login (and maybe the user login too if users directly connect to Server B in other cases, happy joy)
In either of the above cases, any cross-server queries using a linked query don't provide full statistics information to the consuming server (Server A), except if the account being used for the query has sa permissions
In light of all of the above, the most common use case in the wild for linked servers is that Server A is making the connection to Server B as the SQL-integrated, default sa account... which is a security nightmare for all of the usual reasons.

As such, the feature has basically gotten a bad rap and the generic advice is "Not Even Once". It's possible to set it up in a way that's more secure, but if you're pulling queries that need good statistics estimates (e.g. heavily filtered queries out of large tables) to not have memory grant problems, then setting up a more secure linked server connection blows up your query performance on Server A.