This is a cross-post from StackOverflow, I thought I'd ask here as well in case anyone had any insight.
I'm trying to query an SQL Server view with Entity Framework and only return rows that are distinct based on multiple columns.
I've tried a solution based on this answer (GroupBy
and then Select(g => g.FirstOrDefault())
, but I'm still getting duplicate rows.
Table structure (this is a fairly complex view in the real database, but the final output is similar in structure to this example):
CREATE TABLE Example (
ID_A BIGINT,
ID_B BIGINT,
ID_C BIGINT,
Type_A NVARCHAR(50),
Type_B NVARCHAR(50),
ID_Z BIGINT,
Foo NVARCHAR(200),
Bar NVARCHAR(200)
)
Example data:
INSERT INTO Example (ID_A, ID_B, ID_C, Type_A, Type_B, ID_Z, Foo, Bar)
VALUES (1, 1, 1, 'TypeA1', 'TypeB1', 1, 'foo1', 'bar1'), -- This row and the next one represent the same main record (1) joined to different "Z" records (1 and 2)
(1, 1, 1, 'TypeA1', 'TypeB1', 2, 'foo1', 'bar1'),
(2, 1, 2, 'TypeA2', 'TypeA2', 1, 'foo2', 'bar2'), -- This row and the next two represent the same main record (2) joined to different "Z" records (1, 2 and 3)
(2, 1, 2, 'TypeA2', 'TypeA2', 2, 'foo2', 'bar2'),
(2, 1, 2, 'TypeA2', 'TypeA2', 3, 'foo2', 'bar2')
ID_A
, ID_B
, ID_C
, Type_A
and Type_B
identify the "main" entity and ID_Z
identifies the joined "Z" entity. Foo
and Bar
are non-unique data columns that need to be included in the final results.
For each combination of main ID/Type values, there can be multiple ID_Z
values. I need to filter by the ID_Z
values and then return the distinct main entity values (based on the ID/Type values).
I've tried a query like the following:
// The `ID_Z` values to filter on
var zIDs = new List<long> { 1, 2 };
var result = context.Set<ExampleEntity>()
.Where(e => zIDs.Contains(e.ID_Z))
.GroupBy(e => new { e.ID_A, e.ID_B, e.ID_C, e.Type_A, e.Type_B })
.Select(g => g.FirstOrDefault())
.ToList();
But this seems to return all of the rows matching the Z_ID
filter (resulting in duplicate "main" values) instead of only returning the first row for each set of "main" ID/Type values.
If I materialise (ToList
) the query directly after GroupBy
, I seem to get the correct groupings; but I'd like to run this all in the DB and avoid using LINQ to Objects queries.
How can I create this query?