Naming things is hard
-
About a year ago I found a problem with the Oracle ODP.net managed driver, wherein the sql generator for migrations would ignore provided index names and instead always calculated a name based on the used columns (which then are truncated because it exceeds 30 characters). That in turn made "down" migrations fail because the default is to just drop the index by name, without providing the column definitions, hence trying to remove an object with a different name.
Now enter Devart dotConnect for Oracle, this piece of database technology does not have the previously mentioned defect, but in turn has a very
cleverstupid implementation of the truncate to 30 chars logic.For foreign keys (I didn't look further after checking that part) it truncates like this
int hashCode = calculatedName.GetHashCode(); if(hashCode >= 0) return $"FK_{hashCode.ToString(CultureInfo.InvariantCulture)}"; else return $"FK_N{Math.Abs(hashCode).ToString(CultureInfo.InvariantCulture)}";
Now what have the docs to say about string.GetHashCode();
Remarks
The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.
Notes to Callers
The value returned by GetHashCode() is platform-dependent. It differs on the 32-bit and 64-bit versions of the .NET Framework. It also can differ between versions of the .NET Framework and .NET Core.
You probably guessed it, this makes database migrations on long running projects fail in all kinds of ways.
-
@OloEopia said in Naming things is hard:
About a year ago I found a problem with the Oracle
You're slow.
-
@OloEopia seems to me that the real is the "truncating identifiers to 30 characters" part. But I'm no expert.
-
@Benjamin-Hall said in Naming things is hard:
@OloEopia seems to me that the real is the "truncating identifiers to 30 characters" part. But I'm no expert.
Up to 12c Relase 2 (march 2017) identifiers in Oracle can only be 30 bytes, since then it's 128 but the 30 limit is still enforced by various tools
-
@OloEopia said in Naming things is hard:
@Benjamin-Hall said in Naming things is hard:
@OloEopia seems to me that the real is the "truncating identifiers to 30 characters" part. But I'm no expert.
Up to 12c Relase 2 (march 2017) identifiers in Oracle can only be 30 bytes, since then it's 128 but the 30 limit is still enforced by various tools
So yeah, there's your -in-a-can. How long are your identifiers? And does that limit apply to fully-qualified identifiers (ie
table.column
) or just to the pieces?
-
@Benjamin-Hall said in Naming things is hard:
So yeah, there's your -in-a-can. How long are your identifiers? And does that limit apply to fully-qualified identifiers (ie
table.column
) or just to the pieces?The limit applies to database objects, so both table and column in your example can be up to 30 bytes, making 61 bytes in total including the .
But the identifiers that run into problems aren't explicitly named, but rather auto generated. Stuff like foreign keys where the auto generated name is something likeFK_DependentTable_PrincipalTable_Column1_Column2
...
-
@OloEopia More fun: GetHashCode is meant to provide a value unique-enough to go into a HashTable with an acceptable rate of collisions - not to avoid collisions altogether.
It prioritizes speed over uniqueness.
-
@error That's just the possibility of collisions, on the other hand the UseRandomizedStringHashAlgorithm config option gives you new hash codes on each subsequent run of the application (the default in .net core)
The common language runtime computes hash codes for strings on a per application domain basis. Identical strings in different application domains and in different processes will have different hash codes.
-
@OloEopia Wait, that's the default?
Is it a security thing? Or a let's-intentionally-make-this-not-work-so-no-idiot-will-try-to-use-it-as-a-key thing?
Obviously, the latter didn't work.
-
@error said in Naming things is hard:
@OloEopia Wait, that's the default?
Is it a security thing?
Actually yes, because if you know the hash algorithm you have a nice denial of service attack for anything that puts your user submitted data into a hashset.
-
@OloEopia said in Naming things is hard:
Actually yes, because if you know the hash algorithm you have a nice denial of service attack for anything that puts your user submitted data into a hashset.
The better fix at the application level is not to let an entirely untrusted source specify enough keys for it to really matter.