City State Country Best Practices



  • I'm trying to find a solution for this that doesn't include a horrific workaround or a transactional record.

    I am doing up a db that includes the city-state-country tablesets, but I will need to include non north american data, which will not have a state. Ideally I would like to have only one link in the transaction - city, but so far this would not work with non-na data.

     Things I've considered :

    put all three in, autofill those that are north american

    place flag in city table to denote non-na

    duplicate country data in province, linking the countries 

     

    So far the only viable option is the first one, as the others have much larger chances of ballooning out of control.

     My question is - what is the best practice for this? Has anyone come up with an elegant solution that doesn't require horrible workarounds or increase maintenence?

     Thanks


  • ♿ (Parody)

    Most databases that have to represent this data look like:

    Countries: {Country_Code, Country_Name}
    Regions: {Country_Code, Region_Code, Region_Name}  // states, provinces, etc
    Cities: {Country_Code, Region_Code, City_Name}

    CountryGroups: {Group_Code, Country_Code} // NAM, SAM, EUR, etc

    CountryGroups are rare, and may not be necessary if you only care about US and CA.

    Either way, you could easily make a Cities_Extended view that joins the necessary tables and adds a column [NAM_Indicator] or something.



  • Hi Alex,

    I won't be  requiring country groupings, but I will need non-NA data.

     I don't seem to 'get' your solution, unfortunately. Without adding either a redundant 'country_code' field to city (for those that don't have a region), or putting in a flag the table to denote the region_code points to country instead, how do I set the join up for the view?

    In case I was not clear enough, I need to record a building in its location around the world. I'd like the data to resemble something like {address Data, City, More Data}, but that  won't work out for non-NA cities unless I figure a way around the city->region->country link. The other alternative is {address data, city, region, country, more data}, but I want to avoid that if possible.

    Currently the structure I have is what you posted above, but there is no direct city->country link, but I will need to represent one somehow.

     Thanks again



  • @chadsexington said:

    In case I was not clear enough, I need to record a building in its location around the world. I'd like the data to resemble something like {address Data, City, More Data}, but that  won't work out for non-NA cities unless I figure a way around the city->region->country link.
    You shouldn't have city->region->country links.  You should have building->city, building->region (may be null), and building->country references.


  • ♿ (Parody)

    @chadsexington said:

    there is no direct city->country link, but I will need to represent one somehow

    There shouldn't be such a link; that's not how 99.9% of the cities around the world work. There are only a handful of exceptions (Vatican, for eg, which is a City+State+Country), and in those cases, you can just create a region that is named the same as the country. But, it's a bad idea to model your data around the few exceptions.

    To get a city -> Country link, you need a virtual table (view); however, if your data is designed right, you can already tell what countries a city is in, as the PK of a city is COUNTRY+REGION, so, "SELECT * FROM CITIES WHERE COUNTRY_CODE <> 'US'" gets you all non-US Cities.



  •  Hi Again,

     

    The majority of the data WILL have regions associated with them (either province or state), as it will be mostly NA data. The execption is the non-na data with no region associated.

    So your suggestion is to have the city table containing both FK for country and province and not have a country link in the province table (or have one to filter lists).

    What I need to represent is something like Ottawa/Ontario/Canada, or Berlin/Germany. I didn't want to include city/province/country in the transaction records, hopefully just the city ID.

     

    And as for creating a region that is the same as the country, I did not want to do that, as I would have to duplicate all the non-NA data in the province (region) table to link it to their country. 

     

    Thanks again


  • ♿ (Parody)

    You seem to have a fundamental misunderstanding of how cities are represented: all cities are in regions, some regions may countain other regions, and all regions are in countries. The only exception to this rule are sovergn city-states like the Vatican, and, as I mentioned, there is an appropriate way to deal with those.

    To use your example, Berlin is both a Region and a City. Some choose to use Berlin's buroughs as cities, others choose to represent it as Berlin, BE. An example closer to home is Washington, DC. DC is not a state and Washington is not a city. Washington, DC is both a Region and a City.

    If your data is missing region and/or city data, than your data is incorrect. You should not be designing a system based on incorrect data. Instead, try correcting the data. If you need dropdowns for future selections, then use one of the countless free lists (ISO, etc) that provide every country/region/city. If it's for existing data, then correct it either by hand or with a script.

    If you choose to go with some bizarre, non-standard design, then it doesn't matter how you do it, because you will be doing it wrong.



  • The error was in the nomenclature. The data is not to accurately depict the globe, or specific areas, but to pin down a location with a workable address. The 'region' will be represented by either a province or a state if it is North America, but would not be necessary if it is in England, Italy, Germany, etc. 

     I am trying to avoid some non-standard bizarre design, which is why I am here.  The reason that we will not be recording any of the misc sub regions of non-north american cities is the data is not needed, and would be missing from the physical requests.

    Again, if I am not clear, here is what is being represented in the transaction -

    Request1 - {some data} , Ottawa, Ontario, Canada

    Request2 - {some data}, Berlin, Germany

     Both of these are 'complete' records. We don't need to know which burrough in germany it is (or districts, I can't remember which they have), but we do need to record that Ottawa is in Ontario. I am hoping to find a solution that does not require the transactional records to contain cityID, provinceID, countryID, but instead only have cityID and have the relationship within the city/province/country tables. This would be easy to do if it either did not require non-North American data, or if each Non-North American country had a concise province/state breakdown that we could use. Even if we included the 'burrough ' in the province table for germany, since our requests would not contain the 'burrough', only the city and country, it would be difficult and time consuming to find the correct one (or throw code at - neither of which I want).

     Anyway, thanks again for your help, and I hope I have been clear in regards to what I am looking for

     



  • @Alex Papadimoulis said:

    If you choose to go with some bizarre, non-standard design, then it doesn't matter how you do it, because you will be doing it wrong.

     

    And don't forget to send Alex your code and DB design when you're done :)


  • ♿ (Parody)

    I think you're being very clear, but there's a gap in understanding that a region is a critical part of the data structure. If you do not have data for the region (such as Berlin, Germany), then you should use a placeholder value ("N/A", "-", "", NULL, etc) to represent the unavaiable region. In other words,  "Berlin, Germany" becomes "Berlin, -, Germany".

    City: { CityID=4828, CityName="Berlin", RegionCode='-' }
    Region: { CountryCode='DE',RegionCode='-', RegionName=NULL}
    Country: { ContryCode='DE', CountryName='Germany' }

    Note how I used "Code" instead of "ID". Don't make your own identification system for countries; ISO has already done a good job of that.



  •  Hi Alex,

     So this method would essentially take all the non-North American countries and put a placeholder for each country in the regions table, correct? I'm not experienced with 'best practices' (and my employer is definitely not), but is this the 'standard' way to depict what I'm after?

     

    Thanks again

    Ninja Edit : I can't use any free list unless it also appears in french. Hooray for bilingualism.


  • ♿ (Parody)

    @chadsexington said:

    So this method would essentially take all the non-North American countries and put a placeholder for each country in the regions table, correct?

    Exactly. Then, in your view, it's just a matter of doing NULLIF('-',Region_Code) AS Region_Name.

    @chadsexington said:

    but is this the 'standard' way to depict what I'm after?

    It's a very common problem, and this model best represents it.



  •  Excellent.

     Thank you very much. I owe you one piece of WTF code.



  • @chadsexington said:

    Thank you very much. I owe you one piece of WTF code.
    Alex seems awfully nice until you remember he's subjecting us all to Community Server. :)



  • @chadsexington said:

    So this method would essentially take all the non-North American countries and put a placeholder for each country in the regions table, correct?

    Except that you're forgetting about places like Brisbane, Queensland, Australia.  Not *all* non-North American places lack a Province/State designation - just like not all places *in* North America do (see above note regarding Washington DC.)


Log in to reply