No need for a stinkin' search function



  • SELECT DISTINCT CATTYPE FROM RCINVPORTF WHERE CATTYPE LIKE '%TOP%'
    
     LAPTOP
    ADOBE PHOTOPSHOP
    ADOBE photopshop3
    computer (laptop)
    COMPUTER LAPTOP
    DELL Laptop
    Dell latitude computer laptop
    Dell latitude laptop
    Dell notebook laptop
    DELL notebook laptop
    LAOTOP
    LAPTOP
    Laptop
    laptop
    Lap-top
    LAPTOP COMPUTER
    Lap-top computer
    laptop computer 
    LAPTOP DELL
    LAPTOP PC
    Laptop PC DELL
    Latop
    LATOP COMPUTER
    NETBOOK DELL LAPTOP
    NOTEBOOK COMPUTER LAPTOP
    notebook laptop
    PC laptop
    PHOTOPSHOP 
    PORTABLE COMPUTER (LAPTOP?)
    TOPBRAID
    TopBraid
    Topbraid
    TOPBRAID COMPOSER
    TOPQUADRANT
    TOPQUADRANT TOPBRAID
    [...] etc, you get the picture
    

    There are currently 16k+ different categories in that inventory (which is supposed to be just for hardware but also contains quite a bit of software for some reason). I don't know how much money was saved by writing a lousy data model with a binary collation and/or by not providing a decent search function so people could select an existing category when feeding the inventory, but since I had to run a clustering algorithm just to display the trends in the daily maintenance of the inventory, I would say this was a bad business decision.
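
A minimal sketch of the collation point, using Python's built-in sqlite3 module. The table and column names come from the query above; the in-memory database, the handful of sample rows and the `normalize` helper are assumptions for illustration only, not the ERP's schema and not the clustering algorithm that was actually run.

```python
import re
import sqlite3
from collections import defaultdict

# A few values copied from the result list above (note the trailing space).
SAMPLE = ["LAPTOP", "Laptop", "laptop", "Lap-top", "LAPTOP COMPUTER",
          "Lap-top computer", "laptop computer "]

def normalize(category):
    """Hypothetical grouping key: case-folded, punctuation dropped, spaces collapsed."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", category.casefold())
    return " ".join(cleaned.split())

def cluster(values):
    """Group raw category strings by their normalized key."""
    groups = defaultdict(list)
    for value in values:
        groups[normalize(value)].append(value)
    return dict(groups)

def distinct_counts(values):
    """Count distinct categories under binary and under case-insensitive collation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE RCINVPORTF (CATTYPE TEXT)")
    conn.executemany("INSERT INTO RCINVPORTF VALUES (?)", [(v,) for v in values])
    binary = conn.execute(
        "SELECT COUNT(DISTINCT CATTYPE) FROM RCINVPORTF").fetchone()[0]
    nocase = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT 1 FROM RCINVPORTF GROUP BY CATTYPE COLLATE NOCASE)").fetchone()[0]
    conn.close()
    return binary, nocase
```

A case-insensitive collation alone only merges the pure case variants; hyphens and stray spaces still need some normalization step like the hypothetical one here.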



    I don't know why but this appears to be typical in ERPs. I've seen the same kind of WTF twice in the last few years.



  • @thistooshallpass said:

    SELECT DISTINCT CATTYPE FROM RCINVPORTF WHERE CATTYPE LIKE '%TOP%'

    LAPTOP
    ADOBE PHOTOPSHOP
    ADOBE photopshop3
    computer (laptop)
    COMPUTER LAPTOP
    DELL Laptop
    Dell latitude computer laptop
    Dell latitude laptop
    Dell notebook laptop
    DELL notebook laptop
    LAOTOP
    LAPTOP
    Laptop
    laptop
    Lap-top
    LAPTOP COMPUTER
    Lap-top computer
    laptop computer
    LAPTOP DELL
    LAPTOP PC
    Laptop PC DELL
    Latop
    LATOP COMPUTER
    NETBOOK DELL LAPTOP
    NOTEBOOK COMPUTER LAPTOP
    notebook laptop
    PC laptop
    PHOTOPSHOP
    PORTABLE COMPUTER (LAPTOP?)
    TOPBRAID
    TopBraid
    Topbraid
    TOPBRAID COMPOSER
    TOPQUADRANT
    TOPQUADRANT TOPBRAID
    [...] etc, you get the picture

    There are currently 16k+ different categories in that inventory (which is supposed to be just for hardware but also contains quite a bit of software for some reason). I don't know how much money was saved by writing a lousy data model with a binary collation and/or by not providing a decent search function so people could select an existing category when feeding the inventory, but since I had to run a clustering algorithm just to display the trends in the daily maintenance of the inventory, I would say this was a bad business decision.



    I don't know why but this appears to be typical in ERPs. I've seen the same kind of WTF twice in the last few years.

    I wonder what my tag script has to say about that.



  • @thistooshallpass said:

    <Snip many variations on the word "Laptop">
     

    You always see this sort of thing when people get to enter free text.  I guess options to this issue in general are:

    1. No free text entry, everything managed by lookups.
      Pros: No unexpected variations on a theme.
      Cons: Maintenance of lists required; required options may be unavailable; items classified incorrectly with 'best fit'.

    2. Free text entry
      Pros: Items can be classified as accurately as the user likes (may not be a pro)
      Cons: See OP

    3. Free text entry with ability to easily search for previous categories
      Pros: Items can be classified as accurately as the user likes; ability to use previously entered categories
      Cons: Users may choose to not search or search incorrectly -> see OP

    4. Free text entry with some sort of autocomplete / autosuggest

      Pros: Items can be classified as accurately as the user likes; ability to use previously entered categories

      Cons: Implementation may be annoying or broken (see Community Server tags)*
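
Option 4 could be sketched roughly as follows, assuming the existing categories fit in memory and that fuzzy matching from Python's standard `difflib` is good enough. The `suggest` function, its parameters and the cutoff value are made up for illustration; a real autosuggest would more likely run against the database.

```python
import difflib

def suggest(entry, existing, limit=3, cutoff=0.6):
    """Return existing categories that resemble `entry`, ignoring case.

    `existing` is any iterable of category names; when several spellings
    differ only by case, the first one seen is the one suggested.
    """
    by_key = {}
    for name in existing:
        by_key.setdefault(name.casefold(), name)
    hits = difflib.get_close_matches(
        entry.casefold(), list(by_key), n=limit, cutoff=cutoff)
    return [by_key[key] for key in hits]
```

Calling `suggest("Laptot", ["LAPTOP", "TOPBRAID", "PHOTOPSHOP"])` offers "LAPTOP" before the user gets a chance to create yet another variant.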

     

    The apps we deal with use lookups and this works well in general as long as sufficient planning and modelling occurs when the system is set up.  The environments also lend themselves to fairly static 'categories' so there's not a massive admin overhead.

    Serious questions

    • Are you suggesting that a "free text with search" option would be a good fit for your environment? 
    • If so, do you think that there would be a barrier to people searching for an existing category that fits because it's easier to just type "Laptot" and be done with it?  Are there positive/negative implications for the end-user for miscategorising items?
    • How would you go about making it at least as easy for the user to do the right thing as what they currently do?

     As someone who has been part of many data migrations from free text systems to lookup table systems I'd be interested to hear your ideas. 



  • @RTapeLoadingError said:

    The apps we deal with use lookups and this works well in general as long as sufficient planning and modelling occurs when the system is set up.  The environments also lend themselves to fairly static 'categories' so there's not a massive admin overhead.

    In my experience, planning is overrated. Ask any cook, and he'll tell you that the best chef is not the guy with the best recipes, but the guy who can make something tasty with whatever is available in the fridge. Huge ERP vendors spent zillions of dollars in planning and design, but with most of those products, the users have to change their habits and adapt to the system or the implementation will fail. And guess what is the last thing a user wants? Yeah, change.

    Now ask any ERP vendor and they will tell you that their system can be completely customized. And this is true. The catch is that customization is not something that can be easily applied to a single system used by five thousand users. So while each customer can definitely get something custom-made, they still have to cater to the needs of many groups and departments with conflicting priorities. Back to square one.

    As for having just a few static categories, this is always the case at first. Then someone needs to filter some stuff, and they start to shove metadata in whatever textbox they can find (just look at SRV records in a Windows Domain DNS). And pretty soon, they use that textbox as a subcategory, or a type, or a class, or a label, or whatever - and they do that without consulting other departments first. So before you notice it, there are tons of data already polluted with multiple schemes of subclassification, and it's a different beast when you cannot go back to a blank drawing board. It's like becoming mayor of Detroit or president of Haiti - you inherit a lot of problems that could have been avoided but it's too late and you have to deal with it.

    My advice: always fight the urge to see categories as a property of something (1:N). Categories are relationships (N:N). As long as this basic design is respected, the sky is the limit, patch-wise; you can always correlate, filter, join your way back to sanity.
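
One way to picture the N:N advice is a junction table, sketched here with sqlite3. Every table name, column name and sample row below is hypothetical; the point is only that an item can carry several categories and that two categories can be merged later without rewriting the items.

```python
import sqlite3

SCHEMA = """
CREATE TABLE item     (item_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE category (category_id INTEGER PRIMARY KEY,
                       name TEXT NOT NULL UNIQUE COLLATE NOCASE);
CREATE TABLE item_category (
    item_id     INTEGER NOT NULL REFERENCES item(item_id),
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    PRIMARY KEY (item_id, category_id)
);
"""

def build_demo():
    """Create an in-memory database with one item tagged three ways."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO item VALUES (1, 'Dell Latitude')")
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "Laptop"), (2, "Dell"), (3, "Notebook")])
    conn.executemany("INSERT INTO item_category VALUES (1, ?)",
                     [(1,), (2,), (3,)])
    return conn

def merge_category(conn, duplicate_id, keep_id):
    """Fold one category into another; the item rows are never touched."""
    # Re-point links; links that would collide with an existing one are skipped...
    conn.execute("UPDATE OR IGNORE item_category SET category_id = ? "
                 "WHERE category_id = ?", (keep_id, duplicate_id))
    # ...and whatever still points at the duplicate is removed with it.
    conn.execute("DELETE FROM item_category WHERE category_id = ?", (duplicate_id,))
    conn.execute("DELETE FROM category WHERE category_id = ?", (duplicate_id,))

def categories_of(conn, item_id):
    """List the category names attached to one item, alphabetically."""
    rows = conn.execute(
        "SELECT c.name FROM category c "
        "JOIN item_category ic ON ic.category_id = c.category_id "
        "WHERE ic.item_id = ? ORDER BY c.name", (item_id,))
    return [row[0] for row in rows]
```

With this shape, "Dell notebook laptop" stops being one free-text value and becomes three links that can be filtered, joined or merged independently.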

    @RTapeLoadingError said:

    Serious questions

    • Are you suggesting that a "free text with search" option would be a good fit for your environment? 
    • If so, do you think that there would be a barrier to people searching for an existing category that fits because it's easier to just type "Laptot" and be done with it?  Are there positive/negative implications for the end-user for miscategorising items?
    • How would you go about making it at least as easy for the user to do the right thing as what they currently do?

     As someone who has been part of many data migrations from free text systems to lookup table systems I'd be interested to hear your ideas. 

    The situation I described occurs in a big ERP where countless people log on to the system to update the inventory. Unfortunately, there is no dropdown list for them to select a category, only a "search" textbox which returns exact matches only (no wildcard, binary collation so case sensitive). When there is no match, the user can click on "Create" to add a category. This leads to the quality of data shown in my original post.

    This situation is out of control, and many workarounds have been tried over the last few years, such as maintaining an Excel file with categories, or having just a few "power users" do the data entry. None of this worked. What I came up with was a simple scorecard showing the popularity of the various categories in the system. Now the users have to look up a category on a web page before creating it, and if their category has a low score, a crude soundex match is done to show similar but more popular categories. This won't really fix the mess, but at least it will slow down the progression while the power users use another web page to improve data quality. On this web page a matcher uses the same strategy (show the least popular categories that have a soundex match with a more popular category) and the power users can merge categories visually, fixing hundreds of records at a time. Unfortunately, like any batch solution, this tool will face a hockey-stick kind of ROI graph, making it more and more time-consuming to fix less and less data, but that's the nature of the beast. 80/20 and all that.
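
The scorecard-plus-soundex lookup could look something like this. It is a sketch under assumptions: the post does not say which soundex variant is used, so this is plain American Soundex over the letters of the whole string, and the `suggest_popular` function and the usage counts are invented for the example.

```python
# Letter groups of American Soundex; vowels, H, W and Y carry no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

def soundex(text):
    """Four-character American Soundex of the ASCII letters in `text`."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    digits = []
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        if ch not in "HW":          # H and W do not separate equal codes
            previous = code
    return (letters[0] + "".join(digits) + "000")[:4]

def suggest_popular(entry, usage_counts):
    """Return more popular categories that share `entry`'s soundex code.

    `usage_counts` maps category name -> number of inventory rows using it
    (the "score" on the scorecard). Most popular suggestions come first.
    """
    own_score = usage_counts.get(entry, 0)
    code = soundex(entry)
    matches = [(count, name) for name, count in usage_counts.items()
               if name != entry and count > own_score and soundex(name) == code]
    return [name for count, name in sorted(matches, key=lambda m: -m[0])]
```

"Lap-top" and "LAPTOP" both code to L131 and get matched, while "LAOTOP" codes to L310 and slips through, which is a fair illustration of why the match is only crude.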

    As for the user incentive, at first that was an issue because all users have the same logon in the ERP (another WTF), making it difficult to find who did what. The solution I found was to pull data quality metrics and slice them by date/hour based on the audit field. This allows managers to view on a daily basis how the inventory is doing, and some of them have been using this info to create performance dashboards. These metrics can also track what proportion of the data entry has been previously validated via the web page lookup (where individual logons are used).
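
Slicing a quality metric by date/hour can be sketched as below. The post does not define the metric, so this assumes the simplest one: the share of entries whose category is in an approved set. The row layout, the `approved` set and the function name are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def quality_by_hour(rows, approved):
    """Share of entries with an approved category, per (date, hour) bucket.

    `rows` is an iterable of (audit_timestamp, category) pairs, where the
    timestamp is an ISO-style string such as "2011-03-01 09:15:00".
    """
    totals = defaultdict(int)
    good = defaultdict(int)
    for stamp, category in rows:
        ts = datetime.fromisoformat(stamp)
        bucket = (ts.date().isoformat(), ts.hour)
        totals[bucket] += 1
        if category in approved:
            good[bucket] += 1
    # Sorted so that a dashboard can read the buckets chronologically.
    return {bucket: good[bucket] / totals[bucket] for bucket in sorted(totals)}
```

Since everyone shares one logon, the hour bucket stands in for "who was on shift", which is about as close to individual accountability as the audit field allows.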

    So to answer your questions: in an organization where there is no taxonomy authority, yes, a free text with search would be a good solution, as long as the results can be quantified. A free-form entry with a good matcher will at least let the user see if his entry is mainstream or completely unique. It's like using Google to do a quick spell-check - most of the time the number of results (and the auto-complete) will provide you with the correct spelling. And what is great with that kind of solution is that it allows the artificial building of peer pressure - users are lazy and will always try to find the easy way but at the office nobody wants to stick out of the crowd.

    Now a disclaimer: a lot of this jibes with the theory behind NoSQL products, such as Cassandra. But IRL there are huge problems associated with this kind of technology (e.g. Digg, Reddit) so it's not a silver bullet.



  • @thistooshallpass said:

    ADOBE PHOTOPSHOP

    ADOBE photopshop3

     

     

    mi likez adobe illustratopr moar


  • ♿ (Parody)

    @thistooshallpass said:

    Now ask any ERP vendor and they will tell you that their system can be completely customized. And this is true. The catch is that customization is not something that can be easily applied to a single system used by five thousand users. So while each customer can definitely get something custom-made, they still have to cater to the needs of many groups and departments with conflicting priorities. Back to square one.

    TRWTF is that management sees a bespoke system developed for and around a particular organization and says, "Hey! We've got an off the shelf product here! Let's sell it to everybody!"



  • @thistooshallpass said:

    PHOTOPSHOP
     

    Best pho in town!



  • @thistooshallpass said:

    @RTapeLoadingError said:

    The apps we deal with use lookups and this works well in general as long as sufficient planning and modelling occurs when the system is set up.  The environments also lend themselves to fairly static 'categories' so there's not a massive admin overhead.

    In my experience, planning is overrated. Ask any cook, and he'll tell you that the best chef is not the guy with the best recipes, but the guy who can make something tasty with whatever is available in the fridge. Huge ERP vendors spent zillions of dollars in planning and design, but with most of those products, the users have to change their habits and adapt to the system or the implementation will fail. And guess what is the last thing a user wants? Yeah, change.

     

    I can't say that I agree entirely with your sentiment here.  Certainly it's possible to over-plan but we have been called in to work on plenty of systems where insufficient planning has taken place and the implementation has evolved into a mess.  To use your analogy, the best chef can make an awesome meal from the ingredients at hand but all too often we turn up and get served "beef trifle".

    With complex off-the-shelf products there's always got to be a trade off between what it does as standard and what it can be configured / made to do.  There's no sense in hanging on to practices that are inefficient but as you rightly say, people hate change especially if they cannot see it as change for the better.  People also fear being optimised out of a job.

     

     @thistooshallpass said:

    <Snip details of ERP operation>

      I like the idea of the crowdsourcing solution whereby the most popular choices are flagged as such.  Makes sense for a system where free-text entry is in use for whatever reason.

     



  •  When the PDF is sitting on the webserver itself, and not the Samba mount, everything works fine, regardless of the browser.  This makes me very certain it's not MS-specific code.  The entire site is built in PHP, and our developer is majorly into web standards.


  • :belt_onion:

    @RTapeLoadingError said:

    I can't say that I agree entirely with your sentiment here.  Certainly it's possible to over-plan but we have been called in to work on plenty of systems where insufficient planning has taken place and the implementation has evolved into a mess.  To use your analogy, the best chef can make an awesome meal from the ingredients at hand but all too often we turn up and get served "beef trifle".

     

    Agreed. To use your analogy further, when I'm lazy, I can throw whatever meat and veggies I want into the Crock Pot for dinner later, but (though good) it's never going to be as good as when I use a family recipe and cook an excellent Italian delicacy from scratch. Lots of planning went into that, and boy, does it show. (Now I'm hungry.)

    Why do you think ERP systems spend "zillions of dollars" planning? I think it has something to do with the fact that they have a ton of customers because of it (recouping their cost) and 90%+ of their customers don't require any customizations because of how well-designed the system is.

     



  • @dhromed said:

    @thistooshallpass said:

    PHOTOPSHOP
     

    Best pho in town!

    No matter how good it is, I can't see it making decent women's clothing.

    Lady Gaga might wear a pho dress, I suppose, but it's hardly mass-market.




  • @RTapeLoadingError said:

    You always see this sort of thing when people get to enter free text.  I guess options to this issue in general are:

    1. No free text entry, everything managed by lookups.
      Pros: No unexpected variations on a theme.
      Cons: Maintenance of lists required; required options may be unavailable; items classified incorrectly with 'best fit'.

    2. Free text entry
      Pros: Items can be classified as accurately as the user likes (may not be a pro)
      Cons: See OP

    3. Free text entry with ability to easily search for previous categories
      Pros: Items can be classified as accurately as the user likes; ability to use previously entered categories
      Cons: Users may choose to not search or search incorrectly -> see OP

    4. Free text entry with some sort of autocomplete / autosuggest

      Pros: Items can be classified as accurately as the user likes; ability to use previously entered categories

      Cons: Implementation may be annoying or broken (see Community Server tags)*

     

    1b. No free text entry, everything managed by lookups. "Add cathegory" button next to the listbox.

    Pros: Minimum of unexpected variations on a theme; users, discouraged by the extra click needed, will tend to look for a cathegory first and only add a new one if there is no close match in the list

    Cons: Might not work in reality

     



  • @SEMI-HYBRID code said:

    "Add cathegory" button next to the listbox.
    I think I've spotted a problem already.



  • @El_Heffe said:

    @SEMI-HYBRID code said:
    "Add cathegory" button next to the listbox.
    I think I've spotted a problem already.
     

    Adding a button usually isn't such a big problem, if you don't use the native Windows API or MFC.

