Python - My take


  • Banned

    @aygeeplus said in Python - My take:

    @gąska A very good question. If you know of one, please let me know.



  • @mott555 said in Python - My take:

    @magus Fascinating. I really need to move on from Visual Studio 2008.

    Yes, but migrate to VS 2017. It will eat all your hard-disk space, and then you can ask your manager to buy you a new laptop.



  • Python has one or two good features, which I wish someone would steal and implement in a better language. Like javascript. :trollface:



  • @cartman82 I did find myself wishing I had settimeout in python the other day. :/
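
For what it's worth, the closest stdlib analog is `threading.Timer`; a minimal sketch (note the difference from JS: the callback runs on its own thread, not on an event loop):

```python
import threading

results = []

def set_timeout(callback, delay_seconds):
    """Rough setTimeout analog: run callback after a delay on a fresh thread."""
    timer = threading.Timer(delay_seconds, callback)
    timer.start()
    return timer  # timer.cancel() works as a clearTimeout equivalent

t = set_timeout(lambda: results.append("fired"), 0.1)
t.join()  # only so this demo is deterministic; normally you wouldn't block
print(results)  # ['fired']
```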



  • @cartman82 JavaScript needs some threading, maybe they can take it from Python 🚎



  • @aygeeplus said in Python - My take:

    @cartman82 I did find myself wishing I had settimeout in python the other day.

    You don't need threads when you have setTimeout.



  • @cartman82 I hate it when you realize you produced a real wtf a couple days ago.

    I could have defined a small lambda that did time.sleep() and then the thing and called it in a different thread. :(

    EDIT: DAMN now I don't even remember why I wanted to do that. WTF averted, technically?



  • @cartman82 said in Python - My take:

    Python has one or two good features, which I wish someone would steal and implement in a better language. Like javascript. :trollface:

    This post is the winner today! ⚕ 🤑 🤑 🤑



  • @nagesh I find the opposite to be true. 2010, 2012, 2013, 2015... all of those did horrible things to my machine, but 2017 seems fine.



  • @magus said in Python - My take:

    @nagesh I find the opposite to be true. 2010, 2012, 2013, 2015... all of those did horrible things to my machine, but 2017 seems fine.

    How many components have you installed? Enterprise or Professional? I found my solid-state disk completely full after installing VS 2017 Enterprise Edition.


  • Banned

    @nagesh the better question is, how big is your SSD.



  • @gąska Mine is 2.5". Is that big enough?


  • Notification Spam Recipient

    @hannibalrex Welcome to the forums, lurker!



  • @mott555 said in Python - My take:

    @gąska Mine is 2.5". Is that big enough?

    No, but don't feel bad. You can always buy a Big SUV.



  • @cabrito said in Python - My take:

    No, but don't feel bad. You can always buy a Big SUV.

    I already have a crew cab diesel pickup. I guess I could put a monster lift kit, giant mud tires, and some truck nuts on it.


  • Discourse touched me in a no-no place

    @blakeyrat said in Python - My take:

    Imagine if those libraries were .NET assemblies and researchers could use any damned language they wanted.

    I don't have anything to do with the choices that led to the researchers wanting to use Python (it was like that when I got there), but people doing AI and neuromorphic computing tend to work with Python more than most other languages.

    There are good features of Python. I like how package installation is nicely low-fuss for users. That's something they've made work very well; it's a remarkably tricky problem and someone's seriously obsessed over doing the right thing (well, majority-case right thing). It's not perfect, but it's very good. Having heard people here grumbling about weirdnesses of NuGet, I thank my lucky stars I'm professionally dealing with languages which get that stuff even more right.

    But the Python implementation itself is made of flaming psycho rhino turds. It's awful and it's surprisingly slow too.


  • Discourse touched me in a no-no place

    @onyx said in Python - My take:

    I haven't figured out how to properly write a long running script in Python without it just murdering the CPU.

    At least on Unix, you can use selectors so that you can do event-based processing.

    A pity that that API is still horribly low level. And nobody seems to have properly bitten the bullet to get rid of that ridiculous only-sockets-on-Windows restriction. (It's because nobody's written a good interface to WaitForMultipleObjects or I/O completion ports, the latter probably because Python's threads are so fucking awful.)

    The stupid bit is that I know how to fix this. But my fix starts with throwing away Python entirely, and probably involves burning the Python source code to the ground, salting the earth from whence it arose, and then stampeding wild horses over the area to ensure that nobody will ever know where all that occurred. It probably goes downhill after that…
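
For reference, a minimal `selectors` sketch, using a local socket pair so it's self-contained (sockets also work on Windows; registering pipes or regular files is where the Unix-only restriction bites):

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()  # local connected pair, so no network setup needed

# Register one end for readability; `data` can carry an arbitrary payload.
sel.register(b, selectors.EVENT_READ, data="reader")

a.sendall(b"ping")
received = []
for key, events in sel.select(timeout=1):
    if events & selectors.EVENT_READ:  # `events` is the bitmask being griped about
        received.append(key.fileobj.recv(4))

sel.unregister(b)
a.close()
b.close()
print(received)  # [b'ping']
```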


  • Discourse touched me in a no-no place

    @magus said in Python - My take:

    C# Interactive has been in since VS2015...

    There was an interactive Java implementation called BeanShell; it wasn't very popular, and it was pretty slow. I guess a better implementation could have been written, but that didn't happen…



  • @nagesh Certainly not all of them. Why would I? The new installer makes it much easier to add things on when I need them, and installing anything to do with Android or C++ seems dangerous, considering how long they take.

    nagesh: "ZOMG ITS HUGE!"
    magus: "Have you tried not installing absolutely everything, especially things you don't need?"



  • @dkf

    "In the following, events is a bitwise mask indicating which I/O events should be waited for on a given file object."

    Shouldn't we have something better than bitwise masks by now? This is supposed to be a high-level abstraction over select??
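
Concretely, the mask is just an int with two defined bits, combined and tested by hand, which is exactly the low-level feel being complained about:

```python
import selectors

# Ask to be woken for either kind of readiness; the constants are plain ints.
mask = selectors.EVENT_READ | selectors.EVENT_WRITE

wants_read = bool(mask & selectors.EVENT_READ)
wants_write = bool(mask & selectors.EVENT_WRITE)
print(wants_read, wants_write)  # True True
```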



  • @nagesh why is that called a solid state disk? it's not like normal hard drives are liquid or something



  • @magus said in Python - My take:

    Do you prefer Python 2, Python 3, Jython, or Iron Python?

    No matter your answer, someone will be along shortly to tell you why you're wrong!

    Downvoted for excluding PyPy



  • @createdtodislikethis The prediction came true!


  • Impossible Mission - B

    @blakeyrat said in Python - My take:

    @dfdub said in Python - My take:

    From what I've heard about JNI, it's neither easy nor fast. I've never used it myself, but my colleagues who did mentioned that function calls through JNI have quite a lot of overhead.

    Well I've never used it either, but pinvoke in .NET is damned close to native speed.

    Yeah, JNI is awful compared to p/invoke. I've heard it said that this was done by Sun deliberately, to make it painful enough that people would think twice about using JNI instead of just doing it in Java, though I don't know if that's true or just a rumor. JNI is certainly bad enough that it's plausible, though...



  • @masonwheeler If Oracle were trying to make something awful, it'd probably come out extremely good.


  • BINNED

    @blakeyrat said in Python - My take:

    I can't actually think of a worse language for data science.

    That's pretty easy to explain, actually: it's used only as a very easy way to script the driver logic, not to do the actual data processing. You write a few lines and then NumPy calls fast code, or TensorFlow puts your NN on the GPU, or things like that.
    Comparable to how game engines are written in C++ but game logic is often scripted in Lua. The actual performance critical stuff is fast, for everything else developer time is more valuable.

    @blakeyrat said in Python - My take:

    It's (comparatively) super-easy to interface with native code from Python, so you have a nice scripting language to interact with native code optimized for performance.

    That's true of literally every interpreted language I know of. (Except perhaps Ruby; I haven't used it enough.)

    Last year I wrote an interface for one of our C++ libraries in both Python and Matlab (yes, fuck, I know). Python was done in half a day and Matlab took forever to get right.


  • BINNED

    @dkf said in Python - My take:

    I wish it had threading that wasn't done by a fucking moron. And which is now so deeply grafted into things that making it unfucked will break a lot of code. Seriously, everything you say that's bad about Open Source (and which I usually disagree with you about, FWIW) is entirely true about Python's threading.

    That's as good as having no threading at all.
    Several times I've read that you should just use the multiprocessing module because that's "the UNIX way". I'm unsure if that's a good reason to hold a grudge against UNIX* or just a retarded statement.

    (* I do that anyway because, after all these years, I can't be convinced fork was a good idea just because it was only a few lines to implement decades ago)



  • @topspin Even if you have cheap forking, threading is really better in a lot of ways. For OSes that don't have cheap forking, then obviously threading should be available.



  • I've only done basic scripting in Python, but if it can call C DLL's is it possible to call the Win32 CreateThread function?


  • BINNED

    @blakeyrat Yes, I know that; those are basically two separate complaints. Threading is usually much easier to do than multi-processing because of the shared address space. They're very much not the same thing, even if you can use them to achieve the same result. So the attitude of "you don't need threading" is weird at best.

    The fork remark was just an aside: my simple mind can't fathom why the fork+exec model makes any kind of sense compared to, e.g., CreateProcess.


  • Considered Harmful

    @blakeyrat said in Python - My take:

    Mono's been around forever. The .NET spec has been open source forever. I honestly am mystified when I see this all over the place-- why do so many people think that only just now it's been cross-platform?

    Third-party cooties. Businesses like guarantees and support.


  • Considered Harmful

    @dkf said in Python - My take:

    I guess a better implementation could have been written, but that didn't happen…

    Yes it did, and it's called jshell, and it's in Java 9.



  • @topspin said in Python - My take:

    You write a few lines and then NumPy calls fast code, or TensorFlow puts your NN on the GPU, or things like that.

    I talked to a guy who wrote code for Blue Waters a few years back. We needed him to tell us why our C++ was slow, and he said 'do you have any for loops?' and we said yes and he said 'get rid of those' and that was his whole advice.

    We eventually figured out what he actually meant, which was 'if you can make it a matrix multiplication, do so. LAPACK is faster than you.' A six hour runtime (no threading, no caching, no source control, you get the idea) went to a little under six seconds.
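
That advice shows up in miniature with NumPy, whose `@` operator typically hands the product to an optimized BLAS routine (the same family of libraries as LAPACK). A sketch comparing a hypothetical naive triple loop against the library call:

```python
import numpy as np

def naive_matmul(A, B):
    """Textbook triple loop: the kind of code 'get rid of those' was aimed at."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

rng = np.random.default_rng(0)
A = rng.random((30, 30))
B = rng.random((30, 30))

slow = naive_matmul(A, B)
fast = A @ B  # one call into the optimized routine
print(np.allclose(slow, fast))  # True
```

Timings are omitted since they vary by machine, but the gap grows rapidly with matrix size.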



  • @gąska said in Python - My take:

    @nagesh the better question is, how big is your SSD.
    I am having 256 GB.


  • ♿ (Parody)

    @nagesh said in Python - My take:

    @dkf said in Python - My take:

    @stillwater said in Python - My take:

    The thread is taking a very wrong turn.

    It started from a bad position of doing unspeakable things with a snake. What did you expect?

    Is that story from the Bible?

    Welcome back!



  • @aygeeplus The math code written for "serious" applications (LAPACK and friends) is scary efficient.

    As part of my dissertation, I wrote some simulation code (doing Schrödinger wave-packet propagation on potential energy surfaces) in Python. This involved two FFT steps (one forward and one back) per time step, using whatever library SciPy calls out to. By far, that wasn't the limiting step (my own slow code was).
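
The per-timestep FFT round trip is easy to sketch; this uses numpy.fft (SciPy's FFT interface is similar, and both call into compiled code). The physics is elided; the point is just that forward plus inverse recovers the array to floating-point precision:

```python
import numpy as np

rng = np.random.default_rng(42)
psi = rng.random(256) + 1j * rng.random(256)  # stand-in for a wave packet

# Skeleton of one kinetic step: transform, (apply a phase factor here), invert.
psi_k = np.fft.fft(psi)
psi_back = np.fft.ifft(psi_k)

print(np.allclose(psi, psi_back))  # True
```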



  • @blakeyrat said in Python - My take:

    @masonwheeler If Oracle were trying to make something awful, it'd probably come out ~~extremely good~~ like the Star Wars Holiday Special: so bad it blows right past "so bad it's good" territory and into the cesspit.

    (as per xkcd)



  • @dfdub said in Python - My take:

    What would you choose instead? Please don't say R or I'll have to bang my head against a wall.

    I actually prefer R for anything that involves data analysis. It was designed with statisticians and data analysis in mind, and it shows. It's not that fun for general programming, but when it comes to data analysis (not the complex machine-learning fuckery), R is much better than Python any day, IMO.



  • @mott555 said in Python - My take:

    @hannibalrex said in Python - My take:

    I'll say it: R

    Ah, yes. R. The bane of my stats class in college. Where we weren't really expected to understand statistics, but just use R to solve the problems, and yet the professor wouldn't teach us how to use R.

    I seem to be sitting in a giant pile of my own vomit now.

    I'm not sure when this stats class happened. If it was any time before the last couple of years, then yeah, R used to be just vomit. Without the tidyverse packages, it is super easy to write vomit code in R. If you still have to deal with all the vomit, then I suggest you take a look at packages like tidyr and dplyr and replace just a few lines of code with code from these packages, one code block at a time. That's how I used to clean up R vomit :/


  • BINNED

    @aygeeplus said in Python - My take:

    We eventually figured out what he actually meant, which was 'if you can make it a matrix multiplication, do so. LAPACK is faster than you.' A six hour runtime (no threading, no caching, no source control, you get the idea) went to a little under six seconds.

    We use Intel's MKL as our LAPACK implementation too (supposedly among the best for their hardware), but I doubt a handwritten dgemm would make our code 4 orders of magnitude slower.
    That must have been some impressively inefficient code.

    EDIT: The magic of LAPACK, afaiu, is mostly in picking the right block sizes to optimize cache usage and keep the CPU at maximum throughput. The actual number of floating-point operations should be the same compared to your hand-written loop. There are some advanced algorithms that beat O(n^3) for matrix multiplication, but those are for huge sizes you probably haven't hit if you can do it in 6 seconds. Not sure if LAPACK implements those. And of course there's parallelization, which won't explain the described difference.


  • Discourse touched me in a no-no place

    @aygeeplus said in Python - My take:

    Shouldn't we have something better than bitwise masks by now?

    We do. In other languages. The Python developers apparently think that sane APIs are for weaklings.


  • Discourse touched me in a no-no place

    @topspin said in Python - My take:

    That's as good as having no threading at all.

    No, it's worse because it fucks things up enough that nobody else can fix it. If there simply was no threading at all, doing the job right would be much easier.


  • Discourse touched me in a no-no place

    @mott555 said in Python - My take:

    I've only done basic scripting in Python, but if it can call C DLL's is it possible to call the Win32 CreateThread function?

    Yes. It doesn't help. The problem isn't creating the thread; the problem is that there's a global interpreter lock which guards every access to the Python memory space, and the way locking is handled is dumb as hell once you've got more than one CPU core, like you have on every modern processor. It's not good enough to just unlock and immediately relock periodically; you've got to have some sort of scheduling queue so that you're not starving threads of access. As Python stands, a single CPU-bound thread can totally lock out all the IO-bound threads you might have, and if you have multiple CPU-bound threads, they can tussle back and forth over the lock in quite the worst way possible.

    I believe that Jython and IronPython are much better (because they've got basic runtimes that were not written by GvR, being the JVM and the CLR respectively) but porting code to work on those is a non-trivial prospect.
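
A sketch of the contention: CPU-bound threads stay correct under the GIL, they just can't run Python bytecode in parallel on CPython (timings are omitted because they vary, but the two loops below roughly serialize):

```python
import threading

def count(n, out, idx):
    """Pure-Python CPU work; the GIL lets only one such loop progress at a time."""
    total = 0
    for i in range(n):
        total += i
    out[idx] = total

sums = [None, None]
threads = [threading.Thread(target=count, args=(200_000, sums, i)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sums[0] == sums[1])  # True; both finish, but they fought over the GIL
```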


  • Discourse touched me in a no-no place

    @topspin said in Python - My take:

    @aygeeplus said in Python - My take:

    We eventually figured out what he actually meant, which was 'if you can make it a matrix multiplication, do so. LAPACK is faster than you.' A six hour runtime (no threading, no caching, no source control, you get the idea) went to a little under six seconds.

    We use Intel's MKL as our LAPACK implementation too (supposedly among the best for their hardware), but I doubt a handwritten dgemm would make our code 4 orders of magnitude slower.
    That must have been some impressively inefficient code.

    EDIT: The magic of LAPACK, afaiu, is mostly in picking the right block sizes to optimize cache usage and keep the CPU at maximum throughput. The actual number of floating-point operations should be the same compared to your hand-written loop. There are some advanced algorithms that beat O(n^3) for matrix multiplication, but those are for huge sizes you probably haven't hit if you can do it in 6 seconds. Not sure if LAPACK implements those. And of course there's parallelization, which won't explain the described difference.

    One of the big advantages over basic Python is that LAPACK et al are using true arrays of floating point numbers with careful packing so that memory accesses are efficient, instead of multiple levels of boxed value. Value boxing is really expensive, both in terms of actual computation time to handle and in terms of memory overhead.

    Another is that LAPACK will have had an actual numerical analyst work on optimising the code. It turns out that that's a remarkably good idea. ;)

    (Where we want our code to go fast, we hoist carefully into numpy or review exactly how much work is being done to compute a value. It makes a huge difference. Measure, don't assume or guess.)
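
The boxing cost is visible from Python itself: a list holds pointers to individually boxed float objects, while a NumPy array packs raw 8-byte doubles into one buffer (exact box sizes vary by CPython build, hence the hedged comparison):

```python
import sys
import numpy as np

n = 1000
boxed = [float(i) for i in range(n)]     # list of pointers to float objects
packed = np.arange(n, dtype=np.float64)  # one contiguous buffer of doubles

per_box = sys.getsizeof(boxed[0])  # a single boxed float, object header included
print(per_box > 8, packed.nbytes == 8 * n)  # each box alone exceeds 8 bytes
```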



  • @blakeyrat said in Python - My take:

    @dkf said in Python - My take:

    That's seriously fucked.

    What's even more fucked is that this language without real threading and with extremely weak typing was picked by data scientists as their hot shit language. For... reasons?

    I can't actually think of a worse language for data science.

    Because it was successfully designed and marketed as a "super easy to get started with! Non-programmers will love it! (at least as much as they can love a programming language)".

    You could argue that C# would be better, and you might be right, but data scientists never got C# sold to them properly. In fact I'd say Microsoft has done a pretty bad job of selling C# and .NET.

    The only decent C# IDE is Visual Studio. Visual Studio requires a 5 GB download and a potential ~1 hour install just to get started; that alone is enough for most people to reconsider Python. Its interface is also clearly designed for experienced programmers.

    The language is also not designed for scripting: it's much more verbose (it even requires some boilerplate just to do anything) and strongly typed, which is a good thing for "real programming" but feels annoying when you're just getting started.


  • BINNED

    @dkf said in Python - My take:

    One of the big advantages over basic Python is that LAPACK et al are using true arrays of floating point numbers with careful packing so that memory accesses are efficient, instead of multiple levels of boxed value. Value boxing is really expensive, both in terms of actual computation time to handle and in terms of memory overhead.

    Yeah, that's awful, but NumPy has real arrays. @AyGeePlus, however, was talking about LAPACK being 4 orders of magnitude slower than their hand-rolled C++ code, which hopefully doesn't have quite all of these issues like boxing.



  • @anonymous234 said in Python - My take:

    The only decent C# IDE is Visual Studio. Visual Studio requires a 5 GB download and a potential ~1 hour install just to get started; that alone is enough for most people to reconsider Python.

    I put off installing VS forever for this exact reason. In the meantime I installed Python to do my stuff, and I don't even like Python.

    @anonymous234 said in Python - My take:

    In fact I'd say Microsoft has done a pretty bad job of selling C# and .NET.

    Agreed. The only publicity that C# gets is that it sucks and that it's bad because of its association with MS.



  • @magus said in Python - My take:

    C# Interactive has been in since VS2015... And F# interactive from the start of the language.

    Is there any sort of C# interactive on Linux?

    @blakeyrat said in Python - My take:

    Mono's been around forever.

    MonoDevelop sucks when you're used to Visual Studio (the real one, not this VS Code thing)


  • Discourse touched me in a no-no place

    @topspin said in Python - My take:

    Yeah, that's awful, but NumPy has real arrays. @AyGeePlus, however, was talking about LAPACK being 4 orders of magnitude slower than their hand-rolled C++ code, which hopefully doesn't have quite all of these issues like boxing.

    I read what he said as being the other way round, which is much less surprising, especially given some of the more subtle ways that Python is expensive internally. A parallelised LAPACK should stomp doing everything by hand in basic Python; numpy would be much closer (especially as it uses sensible data structures and could actually be delegating to LAPACK behind the scenes and nobody'd grumble; I'd applaud it because the numerical stability would be more likely to be correct).



  • @dkf said in Python - My take:

    @topspin said in Python - My take:

    Yeah, that's awful, but NumPy has real arrays. @AyGeePlus, however, was talking about LAPACK being 4 orders of magnitude slower than their hand-rolled C++ code, which hopefully doesn't have quite all of these issues like boxing.

    I read what he said as being the other way round, which is much less surprising, especially given some of the more subtle ways that Python is expensive internally. A parallelised LAPACK should stomp doing everything by hand in basic Python; numpy would be much closer (especially as it uses sensible data structures and could actually be delegating to LAPACK behind the scenes and nobody'd grumble; I'd applaud it because the numerical stability would be more likely to be correct).

    Numpy (if properly built) does delegate to LAPACK or another such math library, depending on the system and the available libraries.

    And the difference between LAPACK and hand-rolled C++ seemed to be in loop structures: parallelizing those makes a huge difference for a lot of trivially parallelizable math (like matrix multiplication, whose rows are independent, so no memory barriers are needed). Modern CPUs have built-in vector math operations, so you can do a whole row in a single gulp instead of doing 2N-1 operations (N multiplications, N-1 additions) per output element.
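
The row-independence claim can be checked directly: row i of A @ B depends only on row i of A, which is what makes the work embarrassingly parallel (a NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((4, 5))
B = rng.random((5, 3))

full = A @ B
# Build each output row from its own input row, as independent tasks could.
by_rows = np.vstack([A[i] @ B for i in range(A.shape[0])])

print(np.allclose(full, by_rows))  # True
```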

