Internet is down, please reboot it



  • One of my clients has an intranet portal called internet, available at the address http://internet/ (and don't you dare forget the trailing slash). I had hopes that the IT people were just being highly sarcastic, because on the first conference call I heard the lead developer talk about implementing some "web-over-IP" feature. Turns out they are just plain stupid:

    • The head of QA usually starts testing applications (including web-over-IP ones) by setting the focus in a textbox and leaving his finger on the space bar for a long time to see if the application is a candidate for buffer overrun.
    • The senior DBA told me that he was an expert in Server Query Language.
    • The daily backup takes 32 hours to complete, and that's the incremental one (the full one runs once a month but fails most of the time).
    • They lost the source code for their custom accounting software, a "client-server" application that has both the client and the server on the same machine (the server is a GUI too), so they can't delete an old user account which is used in the hard-coded DB connection string.

    And this is no mom & pop shop, they even have a PCI certification.



  • @Speakerphone Dude said:

    The daily backup takes 32 hours to complete...

    I've seen this before. Usually what happens is that two backups running at the same time slow things down enough that each backup takes longer and longer, until you end up with so many running simultaneously that the whole server goes down.

    Good times.



  •  @Speakerphone Dude said:

    some "web-over-IP" feature.

    I run "web-over-ICMP", you insensitive clod!


  • ♿ (Parody)

    @Speakerphone Dude said:

    The head of QA usually starts testing applications (including web-over-IP ones) by setting the focus in a textbox and leaving his finger on the space bar for a long time to see if the application is a candidate for buffer overrun.

    How many buffer overflows has he found, then?



  • @boomzilla said:

    How many buffer overflows has he found, then?

    His case started beeping from holding space too long.


  • ♿ (Parody)

    @dhromed said:

    @boomzilla said:
    How many buffer overflows has he found, then?

    His case started beeping from holding space too long.

    If that isn't a candidate for further investigation, then I don't know what is!!!!11



  • @Speakerphone Dude said:

    And this is no mom & pop shop, they even have a PCI certification.

    If this is true, then perhaps PCI compliance no longer means anything at all. Just another way for some committee to milk money out of companies without providing any real service.


  • Trolleybus Mechanic

    @KattMan said:

    If this is true, then perhaps PCI compliance *no longer* means anything at all. Just another way for some committee to milk money out of companies without providing any real service.

    Was going to FTFY, but I thought a Socratic lesson would be better.

    When was PCI compliance first useful?



  • @Lorne Kates said:

    @KattMan said:

    If this is true, then perhaps PCI compliance no longer means anything at all. Just another way for some committee to milk money out of companies without providing any real service.

    Was going to FTFY, but I thought a Socratic lesson would be better.

    When was PCI compliance first useful?

    Around 1993, when it began replacing ISA. I don't think anyone important cares about it anymore.



  • @Speakerphone Dude said:


    And this is no mom & pop shop, they even have a PCI certification.

    That's several certifications out of date. Do they have SCSI, AGP or PCI-E certifications?



  • Are you working for my old employer, or is this "best practices", perhaps even a set of qualification requirements?

    The pain, the pain. I remember once asking about DB backups on one of the systems, and the IT administrators told us they made one. It even completed. However, it turned out that it couldn't actually be used; if we were very lucky, it might work for a full restore, but nobody had ever tried that.

    Why, oh why?



  • @Speakerphone Dude said:

    The senior DBA told me that he was an expert in Server Query Language.

      Sounds like a good opportunity to teach them something. If I were you I'd do an analysis of their infrastructure and charge them $300/hr to fix the issues. What's their largest database, and do they even do DB backups?



    • Isn't the web ALREADY over IP? (And TCP, and HTTP - is the "big deal" that they managed to do the web over IP and ONLY IP?)



    • I bet you $50 that those 32-hour incremental backups have never been tested and don't work.


    • Trolleybus Mechanic

      @ekolis said:

      Isn't the web ALREADY over IP? (And TCP, and HTTP - is the "big deal" that they managed to do the web over IP and ONLY IP?)

      No, the web is over my phone line, but I see young people today with the web on their eye phones, so I guess the wire part isn't used any more.

      All I know is that Facebook was installed on my computer, though I'm not sure why they used a blue "E" instead of a blue "F". Especially since they used a different shade of blue. It's very confusing.



    • @galgorah said:

      @Speakerphone Dude said:

      The senior DBA told me that he was an expert in Server Query Language.

        Sounds like a good opportunity to teach them something. If I were you I'd do an analysis of their infrastructure and charge them $300/hr to fix the issues. What's their largest database, and do they even do DB backups?

        They refused to switch to the Full Recovery model because they tried it in the past and "the transaction logs become too big and fill the whole disk". According to them, database clusters are useless because they have redundant disks and power supplies in their high-end hardware, and anyway a cluster means a maintenance window too big for the patch deployment process (!!!). That tells you the general skill set on that team.

        I was already able to convince them to disable their "optimization" maintenance plans during business hours, because the databases were awfully slow; they were shrinking databases, rebuilding logs, etc. every 20 minutes. They had arrived at that approach after having the optimization run once a day, when the backups were taking forever: they gradually increased the frequency of the plans as performance degraded (due to "database fragmentation"). I was able to show them that most of the time the maintenance jobs were failing because the previous instance was still running, and the only way to get them to agree to this was to help them start an investigation into I/O performance. Which sucks anyway, because their SAN administrator does not believe in allocating multiple volumes to the same machine (it's all the same underlying disks!), so all the heavy workloads (data, logs, backups) run on the same I/O path and all the databases are on the same disk pool (one pool for databases, one for filers and one for VMs).

        This type of situation is very tricky because there are so many problems that trying to address all of them at the same time is actually more dangerous than the status quo. The only viable approach is an actual analysis of the business requirements, so you can address the most urgent items first, then build a looooooooooooooooooong roadmap to cover everything else.
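        For what it's worth, the standard cure for "the transaction logs become too big and fill the whole disk" under the Full Recovery model is not to avoid the model, it's to back the log up frequently so the inactive portion can be reused. A minimal sketch (the database name and path are made up; the schedule would live in a SQL Agent job firing every 15-30 minutes):

          -- Hypothetical database and path; schedule via SQL Agent.
          -- Frequent log backups mark the inactive part of the log as
          -- reusable, which is what stops the file growing without bound.
          BACKUP LOG [Accounting]
          TO DISK = N'E:\Backups\Accounting_log.trn'
          WITH CHECKSUM;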



      • @boomzilla said:

        @Speakerphone Dude said:
        The head of QA usually starts testing applications (including web-over-IP ones) by setting the focus in a textbox and leaving his finger on the space bar for a long time to see if the application is a candidate for buffer overrun.

        How many buffer overflows has he found, then?

        He told me this was a trick that his CS teacher taught him 20 years ago, and the guy was a genius, so it was a "proven method". If I were doing development in that company I would put an event listener on the key_down event and pop up a message box saying "Stop pressing that key!" once the input text reaches 50 characters or so.



      • @ubersoldat said:

        @Speakerphone Dude said:

        some "web-over-IP" feature.

        I run "web-over-ICMP", you insensitive clod!

        ICMP runs over IP, so that's still web-over-IP.



      • Maybe they are not in the business of software.

        You may be surprised at how back offices in banks currently operate, if you take a close look.

        @Speakerphone Dude said:

        One of my clients has an intranet portal called internet, available at the address http://internet/ (and don't you dare forget the trailing slash). I had hopes that the IT people were just being highly sarcastic, because on the first conference call I heard the lead developer talk about implementing some "web-over-IP" feature. Turns out they are just plain stupid:

        • The head of QA usually starts testing applications (including web-over-IP ones) by setting the focus in a textbox and leaving his finger on the space bar for a long time to see if the application is a candidate for buffer overrun.
        • The senior DBA told me that he was an expert in Server Query Language.
        • The daily backup takes 32 hours to complete, and that's the incremental one (the full one runs once a month but fails most of the time).
        • They lost the source code for their custom accounting software, a "client-server" application that has both the client and the server on the same machine (the server is a GUI too), so they can't delete an old user account which is used in the hard-coded DB connection string.

        And this is no mom & pop shop, they even have a PCI certification.



      • @Nagesh said:

        Maybe they are not in the business of software.

        UPS isn't in the business of building trucks, but if their trucks keep bursting into flames because of inept maintenance, that's a problem.



      • @Speakerphone Dude said:

        @galgorah said:

        @Speakerphone Dude said:

        The senior DBA told me that he was an expert in Server Query Language.

          Sounds like a good opportunity to teach them something. If I were you I'd do an analysis of their infrastructure and charge them $300/hr to fix the issues. What's their largest database, and do they even do DB backups?

          They refused to switch to the Full Recovery model because they tried it in the past and "the transaction logs become too big and fill the whole disk". According to them, database clusters are useless because they have redundant disks and power supplies in their high-end hardware, and anyway a cluster means a maintenance window too big for the patch deployment process (!!!). That tells you the general skill set on that team.

          I was already able to convince them to disable their "optimization" maintenance plans during business hours, because the databases were awfully slow; they were shrinking databases, rebuilding logs, etc. every 20 minutes. They had arrived at that approach after having the optimization run once a day, when the backups were taking forever: they gradually increased the frequency of the plans as performance degraded (due to "database fragmentation"). I was able to show them that most of the time the maintenance jobs were failing because the previous instance was still running, and the only way to get them to agree to this was to help them start an investigation into I/O performance. Which sucks anyway, because their SAN administrator does not believe in allocating multiple volumes to the same machine (it's all the same underlying disks!), so all the heavy workloads (data, logs, backups) run on the same I/O path and all the databases are on the same disk pool (one pool for databases, one for filers and one for VMs).

          This type of situation is very tricky because there are so many problems that trying to address all of them at the same time is actually more dangerous than the status quo. The only viable approach is an actual analysis of the business requirements, so you can address the most urgent items first, then build a looooooooooooooooooong roadmap to cover everything else.

          Sounds like they know just enough about their systems to be dangerous. I agree you are going to have to roadmap this, most likely over several projects. Education is going to have to be a big part of it too.

          It sounds like you're going to need to look at storage, up through the DB server configuration, and on into queries, etc. It sounds like this is SQL Server, so I might suggest running Brent Ozar's Blitz script (http://www.brentozar.com/blitz/). I've personally found it invaluable in getting a quick overview of a server's health at a glance. I've got some updates I sent over to Brent, which will hopefully be merged in the next version. One of the nice things about the script is that it outputs a URL discussing each of the issues it finds and the best practices surrounding those issues. It sounds like you are pretty knowledgeable yourself, but it never hurts to be able to hand a client a reading list for follow-up after some training.
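          For reference, once the script is loaded, the whole check is a single call (assuming it's deployed as a stored procedure named sp_Blitz, which is how the copy I've used ships):

            -- Run the full health check; each finding comes back as a row
            -- with a link explaining the issue, as mentioned above.
            EXEC dbo.sp_Blitz;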




        • @galgorah said:

          It sounds like this is SQL Server, so I might suggest running Brent Ozar's Blitz script (http://www.brentozar.com/blitz/). I've personally found it invaluable in getting a quick overview of a server's health at a glance. I've got some updates I sent over to Brent, which will hopefully be merged in the next version. One of the nice things about the script is that it outputs a URL discussing each of the issues it finds and the best practices surrounding those issues.

          They use SQL Server, so thanks for this link; it sounds promising.

          In recent years I've seen a lot of improvement in Oracle (automated storage management, automated tuning, grid computing, etc.) and I've come to realize that when a client has no skilled DBA, things may get to a point where Oracle is the more sensible solution - which was a completely absurd scenario 10 years ago. Oracle RAC makes a SQL Server failover cluster look obsolete and hard to maintain. Granted, it takes a specialist to do the initial setup with Oracle, but in the long run it may be worth it. As for the price, now that Microsoft is moving to per-core licensing (instead of per-socket), the gap between the two products is closing rapidly.

          BTW, your link is now bookmarked on all my Google-haunted devices (that star in the Chrome address bar is a mighty icon). OK, this is off-topic, but over the last year I've found myself relying more and more on Google stuff. I have a Nexus phone and I use Chrome on my PC and laptops, so all my contacts, bookmarks, search history and emails are in sync automatically, and now with Google Drive even my files are available on all my devices automatically. I know there is something similar with Apple, but it's not as well integrated and it does not cover all the features (maps, etc.). I've been using an iPod Touch for a while and it was a big improvement over my previous mp3 players, but now I use only the Nexus, and the iPod is just an expensive bluetooth storage device for music hidden in my car's glove compartment.



        • @Speakerphone Dude said:

          They use SQL Server, so thanks for this link; it sounds promising.

          In recent years I've seen a lot of improvement in Oracle (automated storage management, automated tuning, grid computing, etc.) and I've come to realize that when a client has no skilled DBA, things may get to a point where Oracle is the more sensible solution - which was a completely absurd scenario 10 years ago. Oracle RAC makes a SQL Server failover cluster look obsolete and hard to maintain. Granted, it takes a specialist to do the initial setup with Oracle, but in the long run it may be worth it. As for the price, now that Microsoft is moving to per-core licensing (instead of per-socket), the gap between the two products is closing rapidly.

          BTW, your link is now bookmarked on all my Google-haunted devices (that star in the Chrome address bar is a mighty icon). OK, this is off-topic, but over the last year I've found myself relying more and more on Google stuff. I have a Nexus phone and I use Chrome on my PC and laptops, so all my contacts, bookmarks, search history and emails are in sync automatically, and now with Google Drive even my files are available on all my devices automatically. I know there is something similar with Apple, but it's not as well integrated and it does not cover all the features (maps, etc.). I've been using an iPod Touch for a while and it was a big improvement over my previous mp3 players, but now I use only the Nexus, and the iPod is just an expensive bluetooth storage device for music hidden in my car's glove compartment.

          With SQL Server 2012 you're not paying the same price per core that you were paying per socket. Per core is $6,874, as opposed to the per-socket cost of $27k. You do need to license a minimum of 4 cores per socket, however, so if you get a dual-core processor, you're still paying for 4 licensed cores. The real expense, though, comes from processors with more than 4 cores.

          Brent also has several articles and videos that discuss the intersection of SQL Server, SAN technologies, and virtualization. These all include scripts, suggested perfmon counters, etc. to help make your case.

          Also check out Adam Machanic's sp_WhoIsActive (http://sqlblog.com/blogs/adam_machanic/archive/2012/03/22/released-who-is-active-v11-11.aspx). It's another great script that makes performance analysis so much easier.
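          Typical usage once the procedure is installed, for anyone following along (@get_plans and @get_locks are documented options of the script; the defaults are sensible):

            -- One row per running request, with wait info and query text.
            EXEC dbo.sp_WhoIsActive;

            -- The same snapshot, but also pull query plans and lock details.
            EXEC dbo.sp_WhoIsActive @get_plans = 1, @get_locks = 1;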

          Most of my maintenance is automated through multi-server administration with SQL Agent jobs. I have one server that functions as a master where I load my jobs, with the other servers listed as targets. This way, during a maintenance window the job can kick off on multiple servers, but it only ever needs to be modified in one place. I'm personally not a big fan of SSIS-based maintenance plans, since they are not really all that flexible. I prefer to selectively rebuild or reorganize my indexes based on fragmentation level, for example; see the sketch below.
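          A rough sketch of that selective approach, in case anyone wants the idea (the 5%/30% thresholds are the usual rules of thumb, and the whole loop is illustrative rather than my production job):

            -- Build one ALTER INDEX per fragmented index: REORGANIZE for light
            -- fragmentation, REBUILD for heavy. Run in a maintenance window.
            DECLARE @sql nvarchar(max);
            DECLARE c CURSOR FAST_FORWARD FOR
                SELECT N'ALTER INDEX ' + QUOTENAME(i.name)
                     + N' ON ' + QUOTENAME(SCHEMA_NAME(o.schema_id))
                     + N'.' + QUOTENAME(o.name)
                     + CASE WHEN s.avg_fragmentation_in_percent >= 30
                            THEN N' REBUILD;' ELSE N' REORGANIZE;' END
                FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') s
                JOIN sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id
                JOIN sys.objects o ON o.object_id = s.object_id
                WHERE s.avg_fragmentation_in_percent >= 5
                  AND s.page_count > 100        -- ignore trivially small indexes
                  AND i.name IS NOT NULL        -- skip heaps
                  AND o.is_ms_shipped = 0;
            OPEN c;
            FETCH NEXT FROM c INTO @sql;
            WHILE @@FETCH_STATUS = 0
            BEGIN
                EXEC sp_executesql @sql;
                FETCH NEXT FROM c INTO @sql;
            END
            CLOSE c;
            DEALLOCATE c;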




        • @galgorah said:

          Per core is $6,874, as opposed to the per-socket cost of $27k. You do need to license a minimum of 4 cores per socket, however, so if you get a dual-core processor, you're still paying for 4 licensed cores. The real expense, though, comes from processors with more than 4 cores.

          So basically, you're paying the same for 4 cores or less (4 × $6,874 = $27,496, almost exactly the old $27k socket price), but more for over 4 cores. So it does represent a pretty significant price increase: a 12-core socket would cost about 3x as much under the new regime (12 × $6,874 = $82,488).



        • @morbiuswilters said:

          @galgorah said:
          Per core is $6,874, as opposed to the per-socket cost of $27k. You do need to license a minimum of 4 cores per socket, however, so if you get a dual-core processor, you're still paying for 4 licensed cores. The real expense, though, comes from processors with more than 4 cores.

          So basically, you're paying the same for 4 cores or less (4 × $6,874 = $27,496, almost exactly the old $27k socket price), but more for over 4 cores. So it does represent a pretty significant price increase: a 12-core socket would cost about 3x as much under the new regime (12 × $6,874 = $82,488).

          Correct. Also keep in mind that those with Software Assurance and processors over 4 cores will still have to pay extra. Each processor under SA only converts to a 4-core license, so if you have a six-core processor, you still need to license 2 more cores. The shops that will see the big increase, however, are generally going to be much larger shops with much larger budgets.



        • @galgorah said:

          price per core

          what the fuck is this.

          Edit
          I have a hypothesis that makes this non-bizarre, but I'm going to wait for a proper explanation from you guys of why they want to charge me for running software faster.



        • @dhromed said:

          @galgorah said:

          price per core

          what the fuck is this.

          Edit
          I have a hypothesis that makes this non-bizarre, but I'm going to wait for a proper explanation from you guys of why they want to charge me for running software faster.

          When you license server products (like most RDBMSes) you can either license per user or per server CPU (or both). Until recently, Microsoft's CPU licensing was per physical CPU socket on the server motherboard; it is now per core within each CPU. Since a typical server has 2 or 4 CPUs with 4 to 8 cores each, the price goes up quickly when you get a bigger machine.
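          To put numbers on it, using the list prices quoted upthread (the 2-socket, 8-core-per-socket box is hypothetical):

            -- Old model: 2 sockets x $27k; new model: 16 cores x $6,874.
            SELECT 2 * 27000 AS old_per_socket_usd,   -- 54,000
                   16 * 6874 AS new_per_core_usd;     -- 109,984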



        • My hypothesis (note that I really don't set up server hardware at all, so I'm out of my depth here; I just type var and int and make it go) was that X cores could, for any practical purpose, facilitate X virtual machines, thus requiring X OS licenses and software instance licenses.

          Your explanation kind of helps me see how it got from there to here, but here is still a really bizarre place. So if anyone can tell me why this is fair, let me know. It looks like Adobe upping their price for Photoshop 400% if you have a quad-core CPU. That is beyond ridiculous, so why is it normal for server software? What is the difference?

          Do they want a piece of the action of your business scale?



        • With SQL 2008 R2 it was more economical to buy a single socket license if you had more than around 35 users, and if you ran the server in a VM, you had to have enough socket licenses to cover all the sockets on the host, regardless of the virtual cores assigned to the VM. 2012 changes this by requiring you to buy licenses only for the number of cores assigned to the VM (licenses are sold in 2-core packs, minimum 2 licenses/4 cores; from what I've seen, the price for 4 cores is nearly the same as the old price for a single socket). You still need to buy a socket/core license if you intend to use the database for any web-accessible application, since otherwise you'd need a CAL for each individual user that accesses the website.


          Speaking of OS licenses, Windows Server Enterprise lets you run up to 4 virtual instances per licensed physical host, and Datacenter allows unlimited virtual instances.



        • @dhromed said:

          My hypothesis (note that I really don't set up server hardware at all, so I'm out of my depth here; I just type var and int and make it go) was that X cores could, for any practical purpose, facilitate X virtual machines, thus requiring X OS licenses and software instance licenses.

          Your explanation kind of helps me see how it got from there to here, but here is still a really bizarre place. So if anyone can tell me why this is fair, let me know. It looks like Adobe upping their price for Photoshop 400% if you have a quad-core CPU. That is beyond ridiculous, so why is it normal for server software? What is the difference?

          Do they want a piece of the action of your business scale?

          I guess that the logic behind per-CPU or per-core licensing is that companies that need bigger servers can afford bigger licensing fees. Depending on the specific product, Oracle licensing can be insane, with formulas that take into account not only the CPU count and the core count, but also the frequency of the CPUs.

          A license per server (versus per user) is typical in large organizations, or in situations where a server is internet-facing (in which case you don't control how many users will use the server). And most licensing schemes are very carefully designed to defeat "clever" setups like a 3-tier architecture where one could argue that the database server is only accessed by the web server's account (this is called multiplexing and usually leads to a situation where a per-CPU license is needed).

          The math is funny. A SQL Server client access license is around $150 (per user). A SQL Server CPU license is about $25k (per CPU). If you have 4 CPUs on a server designed to support 1000 users, the CPU license is cheaper (4 × $25k = $100k versus 1000 × $150 = $150k), but if you have only 50 users then the CALs are cheaper... unless you have more than one physical server; having a per-CPU (or per-core) license for SQL Server means that you can install as many instances of SQL Server in as many VMs as you wish on the same physical server, and a user that has a CAL can access as many SQL Servers as he wants. Lots of fun doing what-if scenarios at budget time.

          Virtualization is a slightly different topic because the licensing changes depending on the product and edition. As an example, a single license of Windows Server Standard edition (the cheapest) is required for each VM, but the Enterprise edition covers 4 virtual machines on the same host (plus the hypervisor if it's Hyper-V), and the Datacenter edition covers an unlimited number of VMs on the same host. But recently EMC/VMware has started to play greedy, and RAM has come into play for licensing VMware products, which means that while it was a no-brainer a few years ago to buy very advanced hardware and run as many VMs as possible on each host, it is becoming muddier. No surprise that a lot of companies are switching to Hyper-V (Microsoft's virtualization product), which is pretty good and way cheaper.



        • The answer is: Microsoft licensing for SQL Server is fucking retarded as shit and doesn't take into account cloud-based services (which have zero actual CPUs; the big providers have signed deals now, but it's still fucking expensive to run it in the cloud).

          The reason for this is that it's still 20 times saner than DB2 or Oracle licensing. So they still come out ahead. Barely.



        • @Speakerphone Dude said:

          A SQL Server CPU license is about $25k (per CPU).
          Interesting, the price I've been given for a 2-core SQL 2012 license is ~3600€+VAT.



        • @ender said:

          @Speakerphone Dude said:
          A SQL Server CPU license is about $25k (per CPU).
          Interesting, the price I've been given for a 2-core SQL 2012 license is ~3600€+VAT.

          That is probably SQL Server Standard Edition. The figure I gave is for Enterprise.



        • Yeah, I'm going to make a toasted peanut butter sandwich.

          At least the ROI there is 100% and no funny business (literally).

