Python - My take



  • @topspin said in Python - My take:

    LAPACK even being 4 orders of magnitude slower than their hand rolled C++ code

    Other way around, and it's not my hand rolled C++ code. Someone else's.

    They were calculating cosine similarity by just looping over everything.
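
    To make the gap concrete, here is a rough sketch of the two styles in Python. It is not the code in question (that was C++); the function names are made up for illustration, and it assumes the input is a matrix of non-zero row vectors. The second version leans on NumPy's matrix product, which is normally handed off to an optimized BLAS library.

```python
import numpy as np

def cosine_similarity_loops(rows):
    """Pairwise cosine similarity with explicit nested loops (the slow way)."""
    n = len(rows)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = norm_i = norm_j = 0.0
            for a, b in zip(rows[i], rows[j]):
                dot += a * b
                norm_i += a * a
                norm_j += b * b
            out[i][j] = dot / ((norm_i ** 0.5) * (norm_j ** 0.5))
    return out

def cosine_similarity_matmul(rows):
    """Same result from one matrix product over unit-length rows."""
    x = np.asarray(rows, dtype=float)
    # Assumes no all-zero rows; those would divide by zero here.
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T
```

    Both return the same matrix; the loop version also recomputes every norm for every pair, which is part of why "just looping over everything" hurts.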



  • @stillwater said in Python - My take:

    @mott555 said in Python - My take:

    @hannibalrex said in Python - My take:

    I'll say it: R

    Ah, yes. R. The bane of my stats class in college. Where we weren't really expected to understand statistics, but just use R to solve the problems, and yet the professor wouldn't teach us how to use R.

    I seem to be sitting in a giant pile of my own vomit now.

    I'm not sure when this stats class happened. If it was anytime before the last couple years then yeah R used to be just vomit. Without the tidyverse packages, it is super easy to write vomit code in R. If you still have to deal with all the vomit then I suggest you take a look at packages like tidyr and dplyr and replace just a few lines of code with code from these packages one code block at a time. That's how I used to clean R vomit :/

    I took that class around ten years ago.


  • BINNED

    @aygeeplus said in Python - My take:

    @topspin said in Python - My take:

    LAPACK even being 4 orders of magnitude slower than their hand rolled C++ code

    Other way around,

    That's what I meant, sorry. But the comparison was to hand-rolled C++, not to Python, for which @dkf explained the huge performance difference would be expected. For C++ you'd expect a difference, but not that large.

    and it's not my hand rolled C++ code. Someone else's.

    I know, their is plural. ;)



  • @topspin said in Python - My take:

    For C++ you'd expect a difference, but not that large.

    I think it really depends. Four orders is a lot, but ... if you or the compiler fail to vectorize loops, you lose almost one order (AVX = 8x for floats, and you have two or more EUs capable of doing vector add/mul/fma simultaneously). Estimating the penalty from not caring about caches is a bit harder, and depends a lot on the size of the inputs. But if you really go in for doing it wrong (i.e., read a single element from each cache line and throw away the rest), that's easily another one, if not two or more, orders of magnitude. If it's a high-end CPU, you have a bunch of cores, which a simple loop wouldn't be using.
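
    As a back-of-the-envelope sketch of how those factors stack up: the SIMD width and the two execution units come from the numbers above, while the cache penalty and core count are assumed values picked for illustration. Real slowdowns rarely multiply this cleanly.

```python
import math

simd_lanes = 8       # AVX: 8 single-precision floats per 256-bit register
vector_units = 2     # two execution units issuing vector ops at once
cache_penalty = 16   # assumed: using 1 of the 16 floats in a 64-byte cache line
cores = 8            # assumed core count for a high-end CPU

# Naive upper bound: treat the factors as independent and multiply them.
slowdown = simd_lanes * vector_units * cache_penalty * cores
orders = math.log10(slowdown)
print(f"{slowdown}x, about {orders:.1f} orders of magnitude")
```

    With these assumptions the total comes to 2048x, a bit over three orders of magnitude, so four is not out of reach if the cache behaviour is worse than assumed here.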



  • @cvi said in Python - My take:

    you have a bunch of cores

    Their first instinct was just to build a mondo server to do number crunching which sat at 5% cpu utilization because their analysis was single-threaded.



  • @aygeeplus said in Python - My take:

    mondo server

    Cloud provider? (Their homepage is singularly unhelpful.)



  • @cvi said in Python - My take:

    @aygeeplus said in Python - My take:

    mondo server

    Cloud provider? (Their homepage is singularly unhelpful.)

    I understood that as "mondo" == "very big". It's an archaic slang usage.


  • Discourse touched me in a no-no place

    @cvi said in Python - My take:

    But if you really go in for doing it wrong (i.e., read a single element from each cache line and throw away the rest), that's easily another if not two or more orders of magnitude.

    You'll get at least an order of magnitude just from going from a list of lists of boxed Python floats to an actual efficient array. The overhead of Python's datatypes is painfully high (enough that even Tcl, the language famed for making everything be a string, is quite a bit faster). A matrix structured in the way that Fortran nested arrays work is vastly more efficient, so much so it isn't funny. For all that though, I don't regard this case as being a crippling problem with Python, but rather a usage that is explicitly not optimised for; using numpy instead of doing it all yourself is the official way to fix it here. (Also, I've dug down into the comparative performance of various languages quite a bit, and Python is surprisingly bad if you don't use numpy.)

    Doesn't stop other languages from using the same tricks. ;-)
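
    A small sketch of the difference in storage, assuming 64-bit CPython (where a pointer is 8 bytes); the variable names are only for illustration. It compares the per-element footprint of a list of lists with a NumPy array, and shows the column-major layout that Fortran uses.

```python
import sys
import numpy as np

n = 200
# Every element is a separate heap-allocated float object, reached via a pointer.
nested = [[float(i * n + j) for j in range(n)] for i in range(n)]
# One contiguous block of 8-byte doubles, no per-element objects.
array = np.array(nested)

# Rough per-element cost: the float object plus the pointer stored in the row list.
bytes_per_elem_list = sys.getsizeof(nested[0][0]) + 8
bytes_per_elem_array = array.itemsize

# Same values, stored column-major as Fortran would lay them out.
fortran = np.asfortranarray(array)

print(bytes_per_elem_list, "bytes per element (list of lists)")
print(bytes_per_elem_array, "bytes per element (NumPy array)")
```

    The memory overhead is only part of the story: the array also lets a reduction such as `array.sum()` run as one compiled loop instead of unboxing each float in the interpreter.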



  • @sockpuppet7 said in Python - My take:

    Is there any sort of C# interactive on Linux?

    Probably?

    @sockpuppet7 said in Python - My take:

    Monodevelop sucks when you're used to Visual Studio (the real one, not this vs code thing)

    It also hasn't been named that in like 5 years afaik.



  • @magus What isn't named that, and what is it named now?



  • @sockpuppet7 Xamarin took over development of Monodevelop a long time ago, and renamed it Xamarin Develop or something, and then purged the old name from everywhere they could. I think it might be somewhat okay now, but when I tried it last it was much worse than Monodevelop had been.



  • @magus said in Python - My take:

    @sockpuppet7 said in Python - My take:

    Is there any sort of C# interactive on Linux?

    Probably?

    @sockpuppet7 said in Python - My take:

    Monodevelop sucks when you're used to Visual Studio (the real one, not this vs code thing)

    It also hasn't been named that in like 5 years afaik.

    Not sure what happened in the interim, but https://www.monodevelop.com/

    Edit: It looks like Xamarin Studio is a fork of MonoDevelop.



  • @tharpa Basically, for a while, the download link just went to Xamarin Studio... now the Windows version just goes to source, lol

    I liked it back before I got a job. It was one of the better IDEs I've used. I bet it's bad now.

