SQL help - rows into columns?
-
I have here the potential to create a truly epic WTF-laden view in SQL Server 2005, and I'd like some help to avoid that. I know there has to be a better way, but I'm still pretty new to all this and I can't find it.
The story:
I need to retrieve data from 1 table. There's 2 columns of interest, "Variable" and "Value". For the most part, they do not contain unique values. "Variable" references a question box on a web form, and "Value" references the answer to said variable. Some boxes are free text, some are combo boxes, and some are radio selects. Every time somebody fills out a form and hits "Submit" a new row is added to this table with the data for Variable and Value. There's 700 unique Variables, 5000 unique Values but over 130000 total. The reporting user wants a view that will use every distinct "Variable" value as a column, and also have the original Variables and Values within the rows.
So here's a sample from the original table:
Variable Value
LastName Whiskey
RandomLabel3 A
LastName Tango
aafdehrExH8 Foxtrot
RandomLabel3 A
And here's the proposed view:
LastName RandomLabel3 aafdehrExH8
Whiskey NULL NULL
NULL A NULL
Tango NULL NULL
NULL NULL Foxtrot
NULL A NULL
Additional difficulty: I have to give the user what she's asking for, not what she needs.
Proposed Solution:
Yes, this hurts my soul. I don't even know if it will work, if it's actually the best/only way to go about this, or if it will actually cause the universe to implode when I hit "Execute".
Start out with your typical SELECT statement, but instead of giving it a column list, give it sp_executesql. Then pass it a dynamic SQL block, which uses a cursor to find unique Variables, sort them into a list, and create an nvarchar for all 700 of them. It will concatenate these nvarchars into a massive list of columns and send it back to the main query.
When you get to the FROM part of the query, it will start with the main table, then join it to a subquery on their common Variables. This subquery will use the same list of columns, and only 1 row. That row will also have the Variables in it for data.
The resulting view will have all the rows of the original table, and all the columns of the original table plus the 700 from the view. Then she should be able to throw some filters on it and see...whatever it is she's trying to see.
I looked at PIVOT but it only seems to deal with aggregate data.
-
I would SELECT the table into a program, then do all your column generation in the program.
Maybe start with SELECTing DISTINCT Variables, then iterating over those and SELECTing all the values for that variable, adding a column in (I assume) your excel doc output for each distinct Variable. You would end up with cascading values rather than the haphazard order in which the rows were originally inserted.
-
How fun! I've lived this exact problem with an EAV-based database....the numbers weren't as high, but there were many instances of some modules that looked for "Agent_id", others for "AgentId", and still others for AgenttId"
First and foremost, you can't do this in a view as (1) the # of columns are unknown and (2) you can't call SP from views.
But the good news (Yay!) is that you can do it with a SP pretty easily. Idea is:
SET @sql = 'SELECT ID '
SELECT @sql = @sql + ', ' + Variable + ' = (SELECT _C.Value FROM Clusterfuck _C WHERE _C.ID = C.ID AND _C.Variable = '+ Variable +' )'
SET @sql = ' FROM Clusterfuck C'EXEC sp_executesql @sql
If this absolutely has to be a View, then the only way to do that would be to have a SP that generates the View using the above method.
Depending on how often the data is queried, your best bet would be to have a SP that runs periodically that dumps it to an indexed table ("ClusterfuckNormalized").
-
@Alex Papadimoulis said:
If this absolutely has to be a View, then the only way to do that would be to have a SP that generates the View using the above method.
or a view that selects * from a function returning a table generated with the above method
-
I guess there's no reason I can't make a table instead of a view. This might be the kind of report they run ad-hoc several times in as many minutes, so precalculating everything would make it a lot more responsive. They don't need up-to-the-minute data so I could just refresh it every week or so. The more I think about it, dumping it into a table seems like the best route.
Thanks for the advice! This whole thing still feels...wrong...somehow though. ;)
-
Am I missing something or do you just need to look up the PIVOT command?
TRWTF: Random <p> tags that the community server kept adding mid-word while typing this reply.
-
@josh26 said:
Am I missing something or do you just need to look up the PIVOT command?
TRWTF: Random <p> tags that the community server kept adding mid-word while typing this reply.
I looked into PIVOT, but it won't work here for 2 reasons:
1. I need to select a dynamic column list and PIVOT needs a static list
2. PIVOT must contain an aggregate function, and I can't find a way to do that and still get a useful column list back.