Best DB design for table containing counters



  • The application I am working on maintains a lot of counters to track the number of requests for each option a user selects. These counts are used to generate reports showing how many users used option 1, how many used option 2, and so on. The options span several menus in the application.

    Currently, the information is stored in a single table in which each column represents one counter. There is one row per day, not per user. An update query sums the individual counter values into that day's row. As more options are added to the application, more counters are required, and so more columns are added to the table. The table usually contains around 25-30 counters. Unused columns are never removed because doing so might break the reports.
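
    Roughly, the current layout looks something like this (a made-up sketch; the real column names differ):

    CREATE TABLE daily_counters (
        counter_date   DATE PRIMARY KEY,  -- one row per day
        option1_count  INTEGER DEFAULT 0,
        option2_count  INTEGER DEFAULT 0,
        -- ... one column per option, 25-30 in total ...
        option30_count INTEGER DEFAULT 0
    );

    -- Every request bumps one column in the current day's row:
    UPDATE daily_counters
    SET option1_count = option1_count + 1
    WHERE counter_date = CURRENT_DATE;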

    Something tells me there is a better table design for this. The extent of my knowledge is normalizing tables, but I am wondering whether there is any "best practice" approach to designing tables for reports.

    Is there a better design for this particular situation, or a better design for reports in general?


  • Yich.

    Your best bet is to create an Action Log table: {User_Num, Request_Type, Action_Date}. This simplifies the recording logic. From there, you can easily generate the reports with views on the back-end and cross-tabbing in the front-end.
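
    A minimal sketch of that table plus a daily-count view; only the three column names come from the suggestion above, everything else (types, view name) is an assumption, and exact syntax varies by DBMS:

    CREATE TABLE Action_Log (
        User_Num     INTEGER   NOT NULL,
        Request_Type INTEGER   NOT NULL,
        Action_Date  TIMESTAMP NOT NULL
    );

    -- Recording a request becomes a single insert
    -- (example values: user 42 picked option 7):
    INSERT INTO Action_Log (User_Num, Request_Type, Action_Date)
    VALUES (42, 7, CURRENT_TIMESTAMP);

    -- Back-end view with one count per day and option; the
    -- front-end can cross-tab this result set:
    CREATE VIEW Daily_Option_Counts AS
    SELECT CAST(Action_Date AS DATE) AS Action_Day,
           Request_Type,
           COUNT(*) AS Hit_Count
    FROM Action_Log
    GROUP BY CAST(Action_Date AS DATE), Request_Type;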



  • I would definitely not recommend storing every counter for a particular day in a single row. It seems like row locking and resource contention could really bite you in the ass should the demand for scalability increase.



  • I agree with Alex on designing a more normalized log table. I'd then create a job that aggregates the data on a nightly basis for reporting. That will speed up reporting considerably. If you leave the log table alone, you will still be able to generate more finely detailed reports (hourly, quarter-day, etc.) if needed in the future.
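
    A minimal sketch of such a nightly job, assuming the Action_Log table sketched above and a made-up daily_counts summary table (date arithmetic syntax varies by DBMS):

    CREATE TABLE daily_counts (
        action_day   DATE    NOT NULL,
        request_type INTEGER NOT NULL,
        hit_count    INTEGER NOT NULL,
        PRIMARY KEY (action_day, request_type)
    );

    -- Nightly: roll yesterday's log entries up into one row per option.
    INSERT INTO daily_counts (action_day, request_type, hit_count)
    SELECT CAST(Action_Date AS DATE), Request_Type, COUNT(*)
    FROM Action_Log
    WHERE Action_Date >= CURRENT_DATE - INTERVAL '1' DAY
      AND Action_Date <  CURRENT_DATE
    GROUP BY CAST(Action_Date AS DATE), Request_Type;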



  • Thanks for the comments. I had the same idea that Alex Papadimoulis presented, but the guy who "designed" those tables thinks I am just trying to do things my way and that my way is no better :)

    Good to know I am not crazy. :)



  • [quote user="ataulhaq"]

    Thanks for the comments. I had the same idea that Alex Papadimoulis presented, but the guy who "designed" those tables thinks I am just trying to do things my way and that my way is no better :)

    Good to know I am not crazy. :)

    [/quote]

    The database design described in the original post is definitely stupid. Even if you don't want to implement Alex's idea because hard disk space is too expensive ;-), a simple table

    action_date request_type hit_count  (primary key: action_date+request_type)

    would be much better, IMO. On the other hand, chances are that if your application generates so many hits that log space is a problem, concurrent access to those counters could also cause problems. (Still fewer problems than with the original design, anyway.)
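
    A minimal sketch of that table, assuming integer request types; the increment shown is the naive two-step version, and most DBMSs offer an atomic MERGE/upsert that avoids the race between the two statements:

    CREATE TABLE option_counts (
        action_date  DATE    NOT NULL,
        request_type INTEGER NOT NULL,
        hit_count    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (action_date, request_type)
    );

    -- Bump today's counter for option 7...
    UPDATE option_counts
    SET hit_count = hit_count + 1
    WHERE action_date = CURRENT_DATE AND request_type = 7;

    -- ...and seed the row if it did not exist yet.
    INSERT INTO option_counts (action_date, request_type, hit_count)
    SELECT CURRENT_DATE, 7, 1
    WHERE NOT EXISTS (SELECT 1 FROM option_counts
                      WHERE action_date = CURRENT_DATE
                        AND request_type = 7);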



  • ammoQ definitely has the right idea here. The idea of a log of individual 'events' is interesting, but costly and a maintenance headache; think hard before implementing it, especially if you really only need day-level granularity for your reports: there's no point in knowing when someone hit menu option X to the nearest microsecond if you are just trying to get rough counts of what happened on a given day. Regards.



  • [quote user="vr602"]

    ammoQ definitely has the right idea here. The idea of a log of individual 'events' is interesting, but costly and a maintenance headache; think hard before implementing it, especially if you really only need day-level granularity for your reports: there's no point in knowing when someone hit menu option X to the nearest microsecond if you are just trying to get rough counts of what happened on a given day. Regards.

    [/quote]

    In my experience, event logging is not that much of a problem. We have activated event logging in several installations, from ~50 to several hundred concurrent users; no, it doesn't crash the database. The real gain of this is that when some error happens, it's easy to examine what exactly the user did and whether it was his fault (wrong data entered) or the fault of the program. The performance impact of our event logging is hardly noticeable at all.



  • Yeah, I agree. If the system managed to hold up with that bad design, I think it will still work with this new design. Anyhow, the idea is (eventually) going to be load-tested with a simulation program running several hundred thousand concurrent users, so we will measure the performance hit and other related issues.

    Besides, it's really important to have that information for debugging purposes, so I agree that such a detailed event log is really helpful for tracking down issues. The log table will be archived regularly, but for reporting purposes we will have background jobs that aggregate results for each day, week, month, and year. With so much information going into the event tables, we don't want to risk running reports off the same table; the reports are going to be aggregated anyway, and caching is part of the framework.
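
    For example, the monthly rollup could be a sketch like this, built from the daily_counts table sketched earlier (all names made up; the weekly and yearly jobs would follow the same pattern, and string/date functions vary by DBMS):

    CREATE TABLE monthly_counts (
        action_month CHAR(7) NOT NULL,   -- 'YYYY-MM'
        request_type INTEGER NOT NULL,
        hit_count    INTEGER NOT NULL,
        PRIMARY KEY (action_month, request_type)
    );

    -- Background job: sum the daily rows into one row per month and option.
    INSERT INTO monthly_counts (action_month, request_type, hit_count)
    SELECT SUBSTRING(CAST(action_day AS CHAR(10)) FROM 1 FOR 7),
           request_type,
           SUM(hit_count)
    FROM daily_counts
    GROUP BY SUBSTRING(CAST(action_day AS CHAR(10)) FROM 1 FOR 7),
             request_type;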

