Dis-tract-ed
-
We have a number of old Access-based "applications" (I use that term loosely). Typically these consist of a database, often with just one gigantic table, with a crude interface to edit the data and view reports. We were having problems with one of these, so I was asked to investigate. I noticed a field called "tract", which often contained a four-digit number, but sometimes held a decimal-like value. (Its data type was set to "text", naturally.) I did a
SELECT DISTINCT tract ORDER BY tract
to see what people had been entering in this field, and I got this:1011.01
1011.02
1012
1013
1014
1015
1016
1017
1017.00
1018
1019
1021
1021.01
1021.02
1022
1023
1023.00
1024.00
1024.01
1024.02
1025
1026
1027
1028
1029
1031
1031.00
1032
1033
1034
1035
1036
1036.00
1037
1038
1038.00
1039
1040.00
1041
1042
1043
1044.00
1045
1046
1047.02
1048
1048.00
1049
1049.00
1051
1051.00
1053
1053.00
1054
1055
1055.00
1056.01
1056.02
1057
1057.00
1058
1059
1059.00
1061
1061.00
1062
1062.00
1063
1063.00
1064
1065
1066
1068
1068.00
1069
1069.00
1070
1072
1075
1076
1077
1078
1079
1083
1084
1086
1087
1087.00
1088
1089
1093
1096
1097
1098
1104.00
1105
1105.00
1106
1107
1108
1109
1109.00
1111
1112
1113
1114.01
1114.02
1115
1115.00
1116
1117
1117.00
1118
1119.01
1119.02
1121
1121.00
1121.01
1122
1123
1124
1125
1125.00
1126
1126.00
1128
1129
1131
1131.00
1132
1133
1134
1135
1135.00
1136
1136.00
1137
1137.00
1138
1138.00
1139.00
1141
1141.00
1142
1143
1143.00
1144
1145
1146
1146.00
1147
1148
1149
1149.00
1151.00
1152
1152.00
1153
1153.00
1154
1154.00
1155
1157
1157.00
1158
1158.00
1159
1159.00
1161
1162
1162.00
1163
1163,00
1163.
1163.00
1164
1164.00
1165
1165.00
1166
1166.00
1167
1167.00
1168
1168.00
1169
1169.00
1171.01
1171.02
1172.01
1172.02
1172.07
1172.1
1173
1173.00
1174
1175
1175.00
1176
1176.00
1177
1177.02
1178
1178.00
1179
1179.00
1181
1181.00
1182
1182.00
1183
1183.00
1184
1184.00
1185
1186.01
1186.02
1187
1187.00
1188
1188.00
1189
119.01
1191
1192.02
1193
1193.00
1194.01
1194.02
1195.01
1195.02
1196
1197.01
1197.02
1198
1198.00
1199
1199.00
12.18
120
1201
1202
1204
1204.00
1205
1205.00
1206
1206.00
1207.01
1207.02
120702
1208.01
1208.02
1208.2
1208.22
1211
1211.00
12115
1212
1212.00
1213
1214.0
1214.01
1214.02
1214.04
1215
1215.00
1216
1216.00
1217
1217.00
1218
1218.00
1219
1219.00
1221
1221.00
1222
1223
1223.00
1231
1231.00
1232
1232.00
1233
1234
1234.00
1235
1235.01
1235.02
1236.00
1236.01
1236.02
1236.03
1237
1237.00
1238
1238.00
1239
1239.00
1241
1241.00
1241.01
1242.01
1242.02
1243
1243.00
1244
1244.00
1245
1245.00
1246
1246.00
1251.00
12603.0
1261
1261.00
1264
1275
1351.06
1381.09
1405
1407.01
1413
1503
1518
1543
1544
159.00
1606.01
1612
1618
17305
1731.03
1773.03
1835
1861.05
1891.04
1913
1915
1941
2221
30343
3215
44102
4599
8970
PeacockI'm not sure what to do with this information.
-
...Peacock
-
Could have been worse. They could have been entering [url="http://www.chick.com/reading/tracts/0046/0046_01.ASP"]Chick Tracts[/url]
-
@OldBrooklyn said:
We have a number of old Access-based "applications" (I use that term loosely). Typically these consist of a database, often with just one gigantic table, with a crude interface to edit the data and view reports. We were having problems with one of these, so I was asked to investigate. I noticed a field called "tract", which often contained a four-digit number, but sometimes held a decimal-like value. (Its data type was set to "text", naturally.) I did a
SELECT DISTINCT tract ORDER BY tract
to see what people had been entering in this field, and I got this:
[...]I'm not sure what to do with this information.
I'd try
SELECT * GROUP BY tract HAVING COUNT(*) = 1
. It might be enlightening.
-
@OldBrooklyn said:
I'm not sure what to do with this information.
You discover who put the peacock at the database, and make him take his peacock to some other place.
-
At my last job we dealt with mapping data. One client provided us with their data and they had never heard of domain constraints. There was a manhole table with a column for location, which is typically a domain-coded value with 10 - 20 possible selections. They had it set up as a straight text field and as you can imagine there were so many variations of each possibility. I'm pulling this from memory (actually my rectum) but I remember DISTINCTing the data and seeing something like this.
- Middle of street
- Street
- street
- STREET
- The street
- THE STREET
- THE STREET
- street
- Alley
- Aley
- alley
- Sidewalk
- Side Walk
- side walk
- sidewalk
- The sidewalk
- Teh sidewalk
- Tree
IIRC there was approximately 900 distinct values. One of the bosses put the data into a pie chart and used it in his presentations on the importance of well-structured data. As you can imagine it was a useless pie chart and almost brought the Almighty Excel to its knees trying to render it. (We never did figure out how a manhole could be in a tree.)
-
@Fjp said:
I'd try
SELECT * GROUP BY tract HAVING COUNT(*) = 1
. It might be enlightening.Not exactly enlightening. Some values were entered many times, some were entered only once. No rhyme or reason.
Inexplicably (or perhaps it was to be expected?), "Peacock" was entered exactly three times.
Perhaps if I say it out loud I will summon a creature...?
-
@mott555 said:
(We never did figure out how a manhole could be in a tree.)
Probably means that it was under or right next to the tree. Either the street was closed and a field took over the place later, or they put the manhole in a field to begin with. I've seen manholes like that before.
-
@Snooder said:
Probably means that it was under or right next to the tree. Either the street was closed and a field took over the place later, or they put the manhole in a field to begin with. I've seen manholes like that before.
One guy mentioned that and we decided that was the most likely possibility. However these guys were professional land surveyors with decades worth of combined experience and even they weren't sure what it meant.
-
@mott555 said:
We never did figure out how a manhole could be in a tree.
I would assume it was one of the varieties of trees whose branches very nearly touch the ground. If you've got to burrow through branches to get to the manhole, it's more "in" the tree than "under" the tree. And I can quite picture someone wasting a full day muttering "where the F*CK is that manhole, it should be right around here somewhere" before finding it and wanting to save the headache for the next person who needed to find it...
-
Eight ways of representing "street"? Pfft! My colleagues and I used to compare how many ways the cities in our county were entered into our various systems (until we got both bored and horrified). I won. No less than 24 ways of representing "South San Francisco". No shitting 21 ways of representing "Redwood City" AND, sure as fuck, 11 ways of representing "Daly City". Daly City?!? Really?!? Eleven ways of typing in Daly fucking City?!? Eight fucking letters and a space. You stoopit cows really can't manage that? They even managed to spell Colma four different ways! I mean ... W.T.F.???
(l)users. Fuck 'em.
(/rant)I like the tree one, though.
-
@rudraigh said:
They even managed to spell Colma four different ways!
Who cares? There's no one alive there.
-
@mott555 said:
At my last job we dealt with mapping data. One client provided us with their data and they had never heard of domain constraints. There was a manhole table with a column for location, which is typically a domain-coded value with 10 - 20 possible selections. They had it set up as a straight text field and as you can imagine there were so many variations of each possibility. I'm pulling this from memory (actually my rectum) but I remember DISTINCTing the data and seeing something like this.
- Middle of street
- Street
- street
- STREET
- The street
- THE STREET
- THE STREET
- street
- Alley
- Aley
- alley
- Sidewalk
- Side Walk
- side walk
- sidewalk
- The sidewalk
- Teh sidewalk
- Tree
IIRC there was approximately 900 distinct values. One of the bosses put the data into a pie chart and used it in his presentations on the importance of well-structured data. As you can imagine it was a useless pie chart and almost brought the Almighty Excel to its knees trying to render it. (We never did figure out how a manhole could be in a tree.)
A few years ago I was working for a big telco and they were rolling out a new product on their website: ringtones. There was a specific team in charge of the search engine for the ringtone catalog and every single person in that team was dishevelled and neurotic. One day I was riding the elevator with one of them and he was mumbling something. I asked him to repeat, he looked at me with crazy eyes that I will never forget and said: "you have no idea how some people spell 'Justin Timberlake'" and he kept shaking his head silently until he left the elevator.
-
@mott555 said:
We never did figure out how a manhole could be in a tree.
Didn't the database have any other information that specifically located the manhole? I mean, "Teh sidewalk" is only useful if you know which sidewalk...
This would enable you to solve the mystery by visiting the manhole and finding out where the tree comes into it.
-
@RTapeLoadingError said:
@mott555 said:
Which sidewalk? TEH SIDEWALK!!We never did figure out how a manhole could be in a tree.
Didn't the database have any other information that specifically located the manhole? I mean, "Teh sidewalk" is only useful if you know which sidewalk...
-
@RTapeLoadingError said:
This would enable you to solve the mystery by visiting the manhole and finding out where the tree comes into it.
Down! Down mind! Dirty mind. sorry, it's just all this talk of visiting man-holes and coming into them...
-
@El_Heffe said:
@RTapeLoadingError said:
@mott555 said:
Which sidewalk? TEH SIDEWALK!!We never did figure out how a manhole could be in a tree.
Didn't the database have any other information that specifically located the manhole? I mean, "Teh sidewalk" is only useful if you know which sidewalk...
Ah...TEH sidewalk, not TAHT sidewalk.
Now that I understand how the location scheme works I'll visit the tree manhole and report back tomorrow
-
@RTapeLoadingError said:
I'll visit the tree-man hole
Ah.... yes.... the ent-moet... is in... a cave...
-
@dhromed said:
@RTapeLoadingError said:
I'll visit the tree-man hole
Ah.... yes.... the ent-moet... is in... a cave...
I have returned from the site of the manhole and it was more Evil Dead than Ent. I think the tree itself must have added that description to lure inquisitive data miners.