FoxPro is ten times faster than MSSql



  • @RaceProUK said:

    You'd be happy with 9%?

    It's an upgrade from .99%!



  • @swayde said:

    I only do cartesian joins

    May you never cross paths with a violent DBA.



  • @SeriousEion said:

    The Cloud

    https://what.thedailywtf.com/t/give-all-your-data-to-the-clown/54340

    @SeriousEion said:

    cursors we were using

    Cursors are generally slow.

    @SeriousEion said:

    mostly to set up some 'real world' benchmarking

    Sensible!



  • @SeriousEion said:

    The main one is that the cursors we were using were being ine

    Cursors? No wonder.



  • @SeriousEion said:

    The main one is that the cursors we were using were being inefficient (something about running sp_cursor on the server a lot), by switching to forward-only cursors we were now running the queries on the server rather than one cursor lookup per row - maybe that makes more sense to MSSQL people

    Yes, this is why you are getting such horrible performance. If a trace shows the client issuing anything other than sp_exececutesql, the raw statement you fed to ODBC, or sp_reset_connection, then whatever you are doing is probably fundamentally broken.



  • @SeriousEion said:

    So switching to forward-only cursors gives us about a 20% improvement on MSSQL from what we've seen so far, but it's at the penalty of rewriting all our queries to make sure that all TEXT columns are the last columns in the SELECT, a not-small undertaking - and also means we can't count up the number of results using ODBC functions, we have to do a separate SELECT COUNT(*) query. So we are testing speeds with forward-only cursors too, but because there's too many errors the numbers aren't very reliable.

    Maybe I'm not understanding how you're using cursors, but I encourage you to take a look at the CURSOR doc for SQL Server. FAST_FORWARD should definitely be used if you can. Still, you should really be looking for set-based solutions that eliminate the need for procedural loops like cursors. The only thing I've ever written that had to use cursors was a stored procedure that uses dynamic SQL to create a server login, database user, schema and set of views based on active directory membership. DDL often doesn't allow variables in creation statements, so it had to be that way.

    Also, you should be aware that TEXT is a deprecated data type (along with NTEXT and IMAGE). You should be working to migrate away from that to VARCHAR(MAX) (along with NVARCHAR(MAX) and VARBINARY(MAX)).

    You shouldn't need a separate SELECT COUNT(*) query if it's that simple. COUNT() is allowed to be a window function, so you can specify the OVER() clause: COUNT(*) OVER() should work if you just want a field with an overall row count. That may not be ideal since it's got to transfer that extra data every row, but it may perform better than a second query. Note that the OVER() clause is ANSI SQL, and everybody except MySQL supports it, AFAIK. For MySQL, you'd probably have to CROSS JOIN (SELECT COUNT(*) ...).



  • @BaconBits said:

    Maybe I'm not understanding how you're using cursors

    The word "cursor" is used in several different contexts.

    The cursor that is created when you do DECLARE CURSOR as referenced in the MSDN link you provided is a horrible thing that should only be used when there is no alternative. "Cursor" is also used to describe how the client handles data received from the server, sometimes with the server's help, and sometimes completely independently of the server. ODBC cursors are the latter definition.



  • @Jaime said:

    The word "cursor" is used in several different contexts.

    Whoops, yes. I forgot that there's two different definitions of cursors.

    We're not using MSSQL cursors, just using raw SELECT statements with ODBC which has has the choice of four different cursors to connect to a database (static, dynamic, forward, keyset). Someone tried to explain it as client-side versus server-side cursors, but that kind of confused me more than it helped :)



  • @SeriousEion said:

    Someone tried to explain it as client-side versus server-side cursors, but that kind of confused me more than it helped 😄

    It's worse than that. ODBC cursors can be either client-side or server-side. Something like a keyset driven cursor can be implemented by ODBC on top of a back end that doesn't support cursors, but a dynamic cursor requires the server's help. It's even theoretically possible for an ODBC cursor to be implemented with the DECLARE CURSOR statement, but I've never seen it.



  • Gross.

    Conceptually it sounds like deciding to use either System.Data.SqlDataAdapter and .Fill() a DataTable, closing the connection, and then iterate through the rows of the DataTable as a "client cursor", or using System.Data.SqlDataReader and iterating through the rows one at a time directly as a "server-side cursor" since SqlDataReader leaves the connection open until you've finished iterating through all the data.

    Why you'd want the database driver of all things to handle that, I'm not sure. Probably made more sense before well defined data structures like DataTables.

    My guess then is that this is what he's probably running into. Again: might be worthwhile to migrate away from TEXT. No idea if or whether your ancient ODBC driver would support VARCHAR(MAX), though.


  • Discourse touched me in a no-no place

    @BaconBits said:

    Again: might be worthwhile to migrate away from TEXT. No idea if or whether your ancient ODBC driver would support VARCHAR(MAX), though.

    I find it very hard to see what the difference between them is (other than “we don't want to support TEXT any more”). Can anyone explain?



  • Text is harder to interact with. varchar(max) acts just like a varchar for everything. text data needs helper function like TEXTPTR and UPDATETEXT to work with the content of long strings.



  • TEXT and NTEXT can't be searched with a LIKE expression, and most string manipulation functions don't work with them. I think can't do full text search on them, either. It used to be that you couldn't read or write to them directly, either. Originally you had to use TEXTPTR, READTEXT, WRITETEXT, and UPDATETEXT to access the columns because they were LOBs (large objects) where the only thing stored with the row is a small (16 byte) pointer. I think this dates back to when the largest string allowed was a VARCHAR(255), so we're talking like SQL Server 6 or 7 or earlier. In my experience with TEXT fields SQL Server 2008+ you don't need those special functions anymore to read, write, or update, but it may be that there's some implicit type casting going on that the query engine doesn't tell you about. IMAGE is basically like TEXT with the same weird access functions, but it's a binary field.

    Now you're allowed up to a VARCHAR(8000) or an NVARCHAR(4000) which keep the fields as row data as much as it can, or you can use VARCHAR(MAX) or NVARCHAR(MAX) which gives you 2GB but you'll get LOBs (with 24 byte pointers). I've been told conflicting information about whether or not the MAX versions will store date on-row if it fits, but I've never investigated it myself. Also, no matter what you do, a row in SQL Server is never allowed to be larger than one page (8K less overhead, so 8060 bytes) so if you have a large row then some of the data will be stored off-row like a LOB. The system will move data off-row as best it can, but eventually you'll hit that limit. If you create a table with 350 VARCHAR(MAX) fields, for example, you'll get a warning that your table definition allows for values that exceed the limit.



  • @BaconBits said:

    I've been told conflicting information about whether or not the MAX versions will store date on-row if it fits

    That's because it's an option.



  • @BaconBits said:

    Conceptually it sounds like deciding to use either System.Data.SqlDataAdapter and .Fill() a DataTable, closing the connection, and then iterate through the rows of the DataTable as a "client cursor", or using System.Data.SqlDataReader and iterating through the rows one at a time directly as a "server-side cursor" since SqlDataReader leaves the connection open until you've finished iterating through all the data.

    ADO.Net was designed to avoid cursor problems. A better example is RDO, an ancient COM library that is a wrapper over ODBC.

    Sample RDO code to update data that uses a cursor implementation to do its thing:

    Set rs = cn.OpenResultset("SELECT * FROM pubs", rdOpenKeyset, rdConcurRowVer)
    While Not rs.EOF
      rs.Edit
      rs("pub_id") = 4
      rs.Update
      rs.MoveNext
    Wend
    

    All of the stuff that RDO uses to track which record is being updated and how to transport the data back to the server and update the correct data would be the "cursor".


  • Discourse touched me in a no-no place

    @BaconBits said:

    TEXT and NTEXT can't be searched with a LIKE expression, and most string manipulation functions don't work with them.

    Thanks. Don't know why they didn't make TEXT be an alias for VARCHAR(MAX), at least for new databases, but there's probably some technical reason of which I'm blissfully unaware. And keen to stay that way. 😄


  • FoxDev

    @dkf said:

    Don't know why they didn't make TEXT be an alias for VARCHAR(MAX)

    Because TEXT and NTEXT are not stored in the table itself, so more rows of the actual table can fit in memory at a given time.

    it's an optimization thing. if you have a lot of text and you don't need to search it ever using TEXT lets the database do things more efficiently. It also is a hint to full text indexing stuff to not bother with the field because it's non searchable.

    although in the general case i would argue that both VARCHAR(MAX) and TEXT are :doing_it_wrong: as there is almost certainly a better architecture.*

    * i said in the general case! yes i'm sure you can show me an use case for which both VARCHAR(MAX) and TEXT are correct solutions, those would be specific examples.


  • Discourse touched me in a no-no place

    @accalia said:

    i would argue that both VARCHAR(MAX) and TEXT are :doing_it_wrong: as there is almost certainly a better architecture.

    They solve the same general problem: wanting to keep a text document (of a priori unknown size) in the database. Presumably there are other fields that provide useful metadata about the document. The alternative is to just keep filenames in the database and require the client to figure out all that stuff (plus figuring out how to mount the device holding the file, of course) and that's at least as bad.

    The problem comes when some maroon decides to make all their fields be VARCHAR(MAX) (or TEXT or whatever) but some people are just dumbasses.



  • @dkf said:

    Don't know why they didn't make TEXT be an alias for VARCHAR(MAX)

    You can't create a synonym for data types, so the "for new databases" idea isn't feasible.


  • Discourse touched me in a no-no place

    @Jaime said:

    You can't create a synonym for data types

    Well we can't.


  • FoxDev

    @accalia said:

    Because TEXT and NTEXT are not stored in the table itself, so more rows of the actual table can fit in memory at a given time.

    VARCHAR(MAX) and NVARCHAR(MAX) are also stored outside the table ;)



  • @RaceProUK said:

    VARCHAR(MAX) and NVARCHAR(MAX) are also stored outside the table

    Sometimes. TEXT also is sometimes stored on the row page.



  • @dkf said:

    Thanks. Don't know why they didn't make TEXT be an alias for VARCHAR(MAX), at least for new databases, but there's probably some technical reason of which I'm blissfully unaware. And keen to stay that way. 😄

    Not sure if this really counts, but Microsoft has deprecated TEXT and NTEXT.

    ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.



  • @dkf said:

    Thanks. Don't know why they didn't make TEXT be an alias for VARCHAR(MAX), at least for new databases, but there's probably some technical reason of which I'm blissfully unaware. And keen to stay that way. 😄

    The same reason PowerShell.exe is and will always be stored at C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe, even on Windows 10 which ships with PowerShell v5.0 (and, no, there's no soft link). Because Microsoft has built their empire on nearly unbreakable backwards compatibility. It wasn't until 64-bit operating systems came about that you couldn't run 16-bit applications. Indeed, you can still run 16-bit applications on Window 10 32-bit. You just have to turn it on.


  • Discourse touched me in a no-no place

    @abarker said:

    Not sure if this really counts

    Well, it was what we were talking about.

    I know, I know, reading the whole thread is a :barrier: to posting…


Log in to reply