Redesigning my retail software/database - part 1

Zenith

After the tax issues last year and this year, I decided it was probably time to redesign my software from the ground up before 2020 is over. The system as it stands is basically seven tables:

MacroInventory - orders and fixed expenses
MicroInventory - inventory items
MacroSales - would've been order level stuff, like shipping and tracking, but never ended up being used due to rare multiple item sales
MicroSales - sales items, plus higher-level stuff like shipping and tracking
eBayFees - fee breakdown (manually carried over to MicroSales)
eCraterFees - fee breakdown (manually carried over to MicroSales)
Rules - basically notes for how some calculations work due to weird vendor behavior

So there are a few issues.

There are seven dates in MacroInventory and they still don't cover all of the relevant dates. I originally dealt with vendors that took all payment entirely up front or entirely when orders came in stock. That didn't really work so well when I started dealing with vendors that took deposits or charged and delivered partial orders (for example, I ordered four cases and was charged individually for each at different times).
The same sort of applies to money amounts. I kept adding fields as stuff like sales tax and foreign transaction fees and handling fees were added. There are enough money fields to look cluttered but not enough to cover everything.
There is also a bug in the software that I've been lucky not to run into yet. There's a complex calculation where items that did not have an individual cost are calculated based on what's left over in the "item price" money amount. Every other fee, for backwards compatibility, is distributed evenly over all items in the order. The problem is that refunds are taken out of the non-item amounts, so, if the refund ever exceeded that sum, I don't know what'd happen.
I started to keep track of stuff like vendor invoices, foreign exchange rates (when I ordered in Yen), whether an order was paid out of store money or my money, payment methods, notes, etc. There are some other fields I'd like to add. I have 34 currently but could easily bump that up to 40. It seems like alot of fields for a single table.
I don't know if it was a great idea to put inventory and fixed expenses in the same table. It's helpful not having to UNION tables together but there are some considerations that have to be made as a result.
I know I'll get grief for this but SQL 2000 is what I have to work with right now. One of the limitations that I've run into is that computed columns can use functions but those functions cannot be updated as long as they're referenced by a table. The only alternative is to use views which have some different behavioral quirks.

So I've been wrestling with the idea of breaking MacroInventory down into several smaller tables.

Pro: Smaller tables will fit on screen better.
Pro: Smaller tables won't be so overwhelming.
Con: There will have to be a big view or every query will have to join multiple tables.
Con: It will be a pain to update multiple tables because a view linking them isn't writable.

Does it make sense for me to just clean this up and still keep it as one table?

Is there some way I can keep functions without resorting to views?

Keep in mind, a UI is still a long ways off so this is going to be Enterprise Manager all the way for some time.

boomzilla

@Zenith said in Redesigning my retail software/database - part 1:

Con: There will have to be a big view or every query will have to join multiple tables.
Con: It will be a pain to update multiple tables because a view linking them isn't writable.
Does it make sense for me to just clean this up and still keep it as one table?

I don't understand your reluctance to join tables. It's one of the fundamental things about a relational database that makes it worth using.

I don't claim to understand all the stuff you wrote about what your software is doing but this talk about joins is residing red flags for me.

Zenith

@boomzilla said in Redesigning my retail software/database - part 1:

I don't understand your reluctance to join tables. It's one of the fundamental things about a relational database that makes it worth using.

Because absent a custom UI, it's a pain in the ass to enter data into multiple tables that conceptually represent one object.

I'm also not sure where the line between "core table" and "ancillary tables" is. If I include just the order key, description, dates, and amounts, that's 20 columns. I absolutely don't want what I've seen Indians do, where they turn the whole concept of a table on its head and have every "table" row be 20 rows in a gigantic key/value table.

Edit: I do join the tables that I have for various purposes. I just don't want to hypernormalize to the point where it's difficult to see the object represented.

Groaner

@Zenith said in Redesigning my retail software/database - part 1:

Con: It will be a pain to update multiple tables because a view linking them isn't writable.

INSTEAD OF triggers?

Zenith

@Groaner said in Redesigning my retail software/database - part 1:

@Zenith said in Redesigning my retail software/database - part 1:

Con: It will be a pain to update multiple tables because a view linking them isn't writable.

INSTEAD OF triggers?

So instead of MacroInventory, I have MacroInventoryBase, MacroInventoryDates, MacroInventoryAmounts, MacroInventoryOther, and MacroInventoryView that joins on column Key. A trigger might create blank rows in three other tables from MacroInventoryBase but MacroInventoryView is readonly because even SQL 2012 barked at me over views deriving from more than one table.

Now that I typed this all out, I think the best course is probably to bite the bullet and keep one table. SQL doesn't seem to bitch about the functions if the one referenced in the computed column is a placeholder function that simply points to a different real function. That seems stupid and I was hoping there was some sort of setting like how SSMS defaults to making tables non-alterable for some reason.

Edit: I also sort of want to add some error checking into the tables. Part of the current iteration of the software calculates stuff from two different POVs and flags where they don't add up. The idea of fighting the interface every time I need to tweak those calculations is not appealing though.

Groaner

@Zenith said in Redesigning my retail software/database - part 1:

So instead of MacroInventory, I have MacroInventoryBase, MacroInventoryDates, MacroInventoryAmounts, MacroInventoryOther, and MacroInventoryView that joins on column Key. A trigger might create blank rows in three other tables from MacroInventoryBase but MacroInventoryView is readonly because even SQL 2012 barked at me over views deriving from more than one table.

I'm pretty sure you can have a modifiable view that consists of multiple joins, even in 2000, as long as you have proper INSTEAD OF triggers. You will of course need to be able to properly direct the changes to each of the underlying tables in your trigger. I worked on something that split out ten different entities stored in the same table back in the day and I seem to remember updates against the compatibility view working swimmingly. There is, of course, a performance hit, but nothing is free.

Now that I typed this all out, I think the best course is probably to bite the bullet and keep one table. SQL doesn't seem to bitch about the functions if the one referenced in the computer column is a placeholder function that simply points to a different real function. That seems stupid and I was hoping there was some sort of setting like how SSMS defaults to making tables non-alterable for some reason.

SQL Server 2000 has lots of quirks with determinism requirements for functions. You are also stuck with a DBMS that's 20 years old:

Edit: I also sort of want to add some error checking into the tables. Part of the current iteration of the software calculates stuff from two different POVs and flags where they don't add up. The idea of fighting the interface every time I need to tweak those calculations is not appealing though.

I would recommend indexed views as a solution to catch complex data integrity requirements (e.g. when requirements are violated, the view gains a duplicate row and a key violation), but I believe that would require 2008. On 2000, it's rules and check constraints.

loopback0

@Zenith said in Redesigning my retail software/database - part 1:

I know I'll get grief for this but SQL 2000 is what I have to work with right now.

Why? SQL Server Express 2019 is free, allbeit with some limits.

Zenith

@loopback0

A) Because the eventual requirement is to work without an enterprise RDMS.
B) Also I don't want to put new software on this PC and can't find my Windows CD to set up the new PC.

loopback0

@Zenith Fair enough.

Zecc

@Zenith said in Redesigning my retail software/database - part 1:

A) Because the eventual requirement is to work without an enterprise RDMS.

SQLite then?

Zenith

@Zecc Actually I originally had it set up with MSSQL Compact. SQLite support would've been trivial assuming their SQL dialect wasn't too weird.

Zenith

I decided to just keep the general structure that I have right now.

It's easier to enter from Enterprise Manager.
I don't have to fight triggers or data integrity lapses not caused by me.
By the time I decided on core columns, that was half of them anyway.
Most of what I could normalize away was so small (bytewise) that it made no sense.
Compatibility with non-enterprise RDMS.

Now it's time to transfer the old data and see how that shakes out. I would also like to add a related table for scanned/saved files.

dkf

@Zenith said in Redesigning my retail software/database - part 1:

SQLite support would've been trivial assuming their SQL dialect wasn't too weird.

The main weird things with SQLite are:

Almost no date and time support.
The way that types are suggestions and are actually per value not per column. (This actually makes sense when with dealing with real-world form entry, alas.)

Apart from that, there's no stored procedures, limited concurrent access and no access control.

Stored procedures: you've usually got a host programming language in the same process so just use that.
Concurrent access: it's not completely impossible (depending on the filesystem the DB is on) but concurrent access isn't really what it was designed to do.
Access control: well duh; it's not a database server, and it's not going to be a database server.

The syntax used by SQLite is mostly derived straight from the ANSI spec, and is intentionally fairly close to PostgreSQL. MySQL is rather more distantly related (because their dialect is b•••••ming weird). I don't know MSSQL's syntax enough to say much there.

And if you want to set up full text search, SQLite does have good support for that. You can build tables where a column is configured to be a FTS column (with whatever metadata in the other columns), pitch the relevant document blobs into the column, and search for relevance in there. It's really quite fast indeed, enough that I know people who are using it to power the search function for public websites. It's a hell of a lot easier than setting up a Solr instance, for example.

Zenith

Stupid SPA JavaScript monstrosity confusing post drafts.

Zenith

@Zenith I have been moving over macro level data though. Some of the new requirements are causing me to divide more than a few old orders. I had been doing that in extreme cases where months of delays (AKA solicitor lies) put half an order in one year and the other half in another. That way tax calculations run against both years are, and remain, correct.

Order[E-0001] Invoice[XYZ] OrderDate[1-1-2017] Items[20] Amount[$500] Shipping[$30] TaxDate[?]

vs

Order[E-0001] Invoice[XYZ] OrderDate[1-1-2017] Items[10] Amount[$250] Shipping[$25] TaxDate[2-1-2017]
Order[E-0002] Invoice[XYZ] OrderDate[1-1-2017] Items[10] Amount[$250] Shipping[$25] TaxDate[3-1-2018]

On the downside, it's looking to create more duplication at the micro level. Since I changed sourcing from distribution to clearance (way better margins), there has been an explosion in one-off listings.

Item[T-0001] Order[E-0001] Label[Marvel Legends Wolverine] Amount[20] Cost[$21]
Item[T-0002] Order[E-0002] Label[ML Wolverine] Amount[10] Cost[$19]

vs

Item[T-0001] Order[E-0001] Label[Marvel Legends Wolverine] Amount[1] Cost[$21]
Item[T-0002] Order[E-0002] Label[Marvel Legends Wolverine] Amount[7] Cost[$4]
Item[T-0003] Order[E-0003] Label[ML Wolverine] Amount[1] Cost[$19]
Item[T-0004] Order[E-0004] Label[Hasbro Wolverine] Amount[3] Cost[$12]
Item[T-0005] Order[E-0005] Label[Legends Wolverine] Amount[1] Cost[$16]
Item[T-0006] Order[E-0006] Label[ML Wolverine] Amount[2] Cost[$8]

One of the hurdles to syncing with any sort of online storefront has always been a lack of a unified catalog view. There's thus far been no easy way to present all 6 of these items as 1 purchasable item. Sure, I can do it manually, reading down a hand-written list at the end of the day and subtracting from inventory, but I mean in an automated way as it happens. Working in PHP to boot didn't help (text editor is great for tweaking an app but not ground-up dev). Maybe once I get around to setting up an IIS/ASP project it'll be different. Not a top priority though.

Fortunately, it doesn't fuck up any other calculations (albeit making them a little more expensive because there are more smaller batches to calculate through).

Zenith

@Zenith said in Redesigning my retail software/database - part 1:

Fortunately, it doesn't fuck up any other calculations (albeit making them a little more expensive because there are more smaller batches to calculate through).

Spoke too soon.

It turns out I divided all non-purchase charges evenly across the order. Historically, this was because all of the charges early on were fixed/flat. Well, then foreign transaction fees and sales tax came along and it was to distribute them proportionally so they just piled on top.

(In my defense, the item part is scary with what it can figure out. It really was to keep going.)

Dividing orders up by shipment brought it a little closer...but at the cost of retroactively goofing up past sales records.

Damn. I have to rethink this. I think I can work around everything individually but I don't know about all of them...

Zenith

@Zenith So here's the way it stands.

I have 4 orders:

Order[1] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2017]
Order[2] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-1-2017]
Order[3] Vendor[store] Invoice[null] TaxDate[1-1-2017]
Order[4] Vendor[store] Invoice[null] TaxDate[1-1-2017]

So in the legacy system:

Order[1] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2017]
Order[2] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2018]
Order[3] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-2-2017]
Order[4] Vendor[store] Invoice[null] TaxDate[1-1-2017]
Order[5] Vendor[store] Invoice[null] TaxDate[1-1-2017]

And in the modern system possibly:

Order[1] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2017]
Order[2] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2018]
Order[3] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-1-2017]
Order[4] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-3-2017]
Order[5] Vendor[store] Invoice[null] TaxDate[1-1-2017]
Order[6] Vendor[store] Invoice[null] TaxDate[1-1-2017]

Problems to solve:

Make sure both split orders can be recognized as one original order in certain views.
Keep old division rules in place while fixing new division rules going forward.
Hard separation between major/willful splits and minor/incidental splits.

Hmm...

Normally, the invoice number would be of some help. SELECT OrderDate, SUM(Money) GROUP BY Invoice or something similar. The issue is the nulls where they were clearance sales without a proper invoice. MSSQL usually can't do shit with NULL. We'll see.
I could put an order-level flag for calculation type. F for fixed and P for proportional. It should be possible to SUM up money/weight. Possible but ugly.
I typically make TaxDate synonymous with BillDate. Generally speaking, I've been charged right before shipping and transit time is relatively short. I know this isn't a universal truth, hence it's set manually. I have a separate StockDate field that's based on the ShipDate. TaxDate is for "counts on taxes as of now" and StockDate is "available on shelf as of now" so they're well enough separated.

So I think the structure that make the most sense might be:

Order[1] Box[0] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2017]
Order[2] Box[0] Vendor[asshole splitter] Invoice[abc] TaxDate[1-1-2018]
Order[3] Box[1] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-2-2017]
Order[3] Box[2] Vendor[accidental splitter] Invoice[xyz] TaxDate[1-2-2017]
Order[4] Box[0] Vendor[store] Invoice[null] TaxDate[1-1-2017]
Order[5] Box[0] Vendor[store] Invoice[null] TaxDate[1-1-2017]

I think this sort of query should work to get order info:
SELECT Vendor, Invoice, SUM(Money) FROM Orders GROUP BY Invoice WHERE YEAR(TaxDate) = 2017;
SELECT Vendor, COUNT(0), SUM(Money) FROM Orders GROUP BY Invoice WHERE YEAR(TaxDate) = 2017;

And I think this (partial) query should work to calculate non-item costs:
SELECT * FROM Items JOIN Orders ON Order = Order /*--AND Box = Box;*/ WHERE Year(TaxDate) = 2017;

Assuming this works, I'm not sure how to explain this to a potential customer or make the interface itself structure the two sorts of split orders differently. The real difference is that some of these orders are split by a day and some of them are split by 6 months. You wouldn't be able to make the numbers work for the latter because you'd potentially have to file taxes before you were billed and knew what the other half of the order cost. So that's something I think the user would have to just know to do when entering inventory. But that halfway defeats the purpose or automating some portions of this process. I mean if you're just going to enter items as you get them, Excel might work. Ugh, my brain hurts. I need to take a nap.