To library or not to library, that is the question



  • I wasn't really sure whether to put this in Coding Help or not, since I wasn't sure where it would otherwise go.

    We have a dev that is a strong evangelist of Composer and Packagist. For those not familiar with the PHP ecosystem (thank fuck, you're all thinking), these are the principal package manager and the major repository, respectively, for third-party libraries of PHP code.

    Unlike NPM, packages might have dependencies but are generally sane about it - no left-pad here (mostly because we have that shit built in) - so we have actual packages with actual uses and actual sanity. For the most part.

    Now my question comes back to this development. I'm generally fairly against bringing in swathes of third party code into an otherwise bespoke platform. I don't object so much to very specialised dependencies that anything I write is going to be a poor imitation of (e.g. PHPExcel), but I have objections about introducing random dependencies for handling tasks.

    For example I discovered recently that a library was brought in to handle CSV files. Which hilariously broke on actual CSV files that had a backslash in the data because this library is just a huge wrapper around PHP's own built in CSV functions that assume conventions like backslash to escape things rather than actual CSV spec. (Having stupid defaults is not the problem here) So we have a fairly huge - I think 1500 lines - library that wraps around core functionality and provides nothing of use that I can see unless you wanted to abstract it away for mock purposes, like say unit testing your CSV code. But you don't need 1500 lines for that, you really don't.
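    To make the backslash problem concrete, here is a minimal sketch in Python (not PHP, purely as an illustration). It parses the same line twice: once with spec-style rules, where a backslash is ordinary data, and once with a backslash escape convention switched on. The sample line is made up, and Python's `escapechar` is not claimed to behave exactly like the escape handling in PHP's CSV functions; the point is only that an escape convention from outside the CSV spec silently changes the data.

```python
import csv

# Hypothetical sample line: a Windows-style path containing a backslash.
line = 'C:\\temp,x'

# Spec-style parsing: the backslash is just another character.
rfc_row = next(csv.reader([line]))

# Backslash treated as an escape character: it is consumed during parsing.
escaped_row = next(csv.reader([line], escapechar='\\'))

print(rfc_row)      # ['C:\\temp', 'x']
print(escaped_row)  # ['C:temp', 'x']
```

    Both calls succeed without any warning, which is roughly why this kind of bug only shows up once real data arrives.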

    And then we get to the fun that was Slim. For those who don't know it, it's a URL routing/front-side controller tool. The fact that you can easily end up with a stack trace 100 layers deep when anything actually occurs is just hilarious. The fact that in our development environments we've had to up the limit in Xdebug because it actually exceeds Xdebug's default stack limit is just... something else. But should we be bringing in such libraries when we could do it ourselves in a fraction of the code/complexity?

    So in general... is bringing libraries in a good thing? Is bringing libraries whose relevant functionality can be replicated with core functions a good thing? Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

    At what point is bringing in third party libraries a good thing? Should it be your first port of call before trying to write it yourself (as some people here seem to think)?


  • Discourse touched me in a no-no place

    @Arantor said in To library or not to library, that is the question:

    The fact that you can easily end up with a stack trace 100 layers deep when anything actually occurs is just hilarious.

    Java Web Tooling Developers: “Hold my beer…”


  • FoxDev

    @Arantor said in To library or not to library, that is the question:

    But should we be bringing in such libraries when we could do it ourselves in a fraction of the code/complexity?

    Probably not.

    @Arantor said in To library or not to library, that is the question:

    So in general... is bringing libraries in a good thing?

    In general, yes, but only if the libraries are good quality. I find it especially useful when dealing with things like database access, parsing complex file formats, or handling file transfers and other streams, as the libraries hide away a lot of the boilerplate and faffing about, so I can get on with focussing on the stuff I need.

    If the library is bad however, you may as well just write the damn thing yourself ;)

    @Arantor said in To library or not to library, that is the question:

    Is bringing libraries whose relevant functionality can be replicated with core functions a good thing?

    Depends on the complexity. For example, that left-pad NPM package is utterly ridiculous, as any competent coder can write their own in just a half-dozen lines. However, an FTP package is worth using, as you can avoid writing hundreds of lines of boilerplate stream writing and reading, and connection management.

    TL;DR: Depends on the task.
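    For a sense of scale, a left-pad of the kind mentioned above could look roughly like the sketch below. It is written in Python for illustration; the function name and parameters are made up here and are not the NPM package's API.

```python
def left_pad(value, width, fill=' '):
    """Pad `value` on the left with `fill` until it is `width` characters long."""
    text = str(value)
    if len(fill) != 1:
        raise ValueError('fill must be a single character')
    missing = max(0, width - len(text))
    return fill * missing + text


print(left_pad(42, 5, '0'))  # 00042
```

    In practice even this is redundant in languages that ship the operation: Python has `str.rjust`, and PHP has `str_pad`.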

    @Arantor said in To library or not to library, that is the question:

    Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

    I'd work out if you even need a framework in the first place, then evaluate the options. Too many people immediately reach for the flavour of the week without working out if they actually need it or not, and they end up with bloated unmaintainable crap.

    @Arantor said in To library or not to library, that is the question:

    At what point is bringing in third party libraries a good thing?

    When it represents a significant and worthy reduction in time and effort required to implement the solution.

    @Arantor said in To library or not to library, that is the question:

    Should it be your first port of call before trying to write it yourself (as some people here seem to think)?

    Again, depends on the complexity of the task.


  • Discourse touched me in a no-no place

    @Arantor said in To library or not to library, that is the question:

    At what point is bringing in third party libraries a good thing?

    When it lets you do something complicated more easily and more correctly with less effort. For example, writing a CSV parser that is capable of handling all the annoying edge cases is quite hard, so people are usually recommended to use a library for that. People are definitely recommended to use a library instead of writing their own security code (because that's hilariously easy to fuck up in non-obvious ways). Obviously, the simpler the functionality, the less of a benefit there is in using a library to achieve it.
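    As a rough illustration of those edge cases (a Python sketch with a made-up record, not anyone's production code): splitting on line breaks and commas falls over as soon as a quoted field contains a comma, a doubled quote, or a line break, while a proper parser handles all three.

```python
import csv
import io

# Hypothetical data: a comma, an escaped quote, and a line break inside quoted fields.
raw = 'id,comment\r\n1,"said ""hi"", then left"\r\n2,"line one\nline two"\r\n'

# Naive approach: split on line breaks, then on commas.
naive = [line.split(',') for line in raw.splitlines()]

# Parser approach: newline='' lets the csv module handle line endings itself.
parsed = list(csv.reader(io.StringIO(raw, newline='')))

print(len(naive), naive[1])  # four "rows", and the quoted comma is split too
print(len(parsed), parsed[1])
```

    The naive version reports four rows and three fields in the second one; the parser reports three rows of two fields each.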

    OTOH, if you've already solved the hard stuff then you're into the space of exchanging the potential to have someone else fix bugs so you don't have to, against the fact that switching to something external might actually introduce bugs in things that you've already solved. No easy answer there.



  • @Arantor said in To library or not to library, that is the question:

    For example I discovered recently that a library was brought in to handle CSV files.

    It's not like PHP supports that natively.



  • @RaceProUK said in To library or not to library, that is the question:

    In general, yes, but only if the libraries are good quality. I find it especially useful when dealing with things like database access, parsing complex file formats, or handling file transfers and other streams, as the libraries hide away a lot of the boilerplate and faffing about, so I can get on with focussing on the stuff I need.

    How does one judge what is good without really using it in the first place? Especially given that the language has a number of these things built in?

    @RaceProUK said in To library or not to library, that is the question:

    Depends on the complexity.

    The examples I'm thinking of that are questionable are libraries to handle CSV files, libraries to abstract the file system away (despite the fact that we don't have any use beyond the regular physical file system, we're not abstracting away to S3 for example), libraries to handle URL routing + middleware.

    Especially when we're talking about things that the abstracted form brings more complexity than we use.

    @RaceProUK said in To library or not to library, that is the question:

    When it represents a significant and worthy reduction in time and effort required to implement the solution.

    This is kind of the crux of my issue. The amount of time the developer in question is using to implement these libraries is actually about the time it would take to do it without the libraries - except we're now on the library treadmill and now need to think about updating the libraries or not.



  • @ben_lubar said in To library or not to library, that is the question:

    @Arantor said in To library or not to library, that is the question:

    For example I discovered recently that a library was brought in to handle CSV files.

    It's not like PHP supports that natively.

    The library we use is really just a fairly big wrapper around the native PHP functions.


  • FoxDev

    @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place?

    That is the £64,000 question, and I'm afraid I don't have a good answer for it.

    @Arantor said in To library or not to library, that is the question:

    libraries to handle CSV files

    Yes for complex CSV files with lots of edge cases, no for simple CSV files.

    @Arantor said in To library or not to library, that is the question:

    libraries to abstract the file system away

    Unless you're multiplatform e.g. Windows and Linux, don't bother.

    @Arantor said in To library or not to library, that is the question:

    libraries to handle URL routing + middleware

    As a fan of ASP.NET MVC, I'm tempted to say this is a good use case for libraries, but then, is ASP.NET MVC a library or a framework? I think some would favour calling it the latter. I guess really it depends on whether you favour being able to add new routes easily with minimal effort, or having a well-defined set of routes that's tightly managed.



  • @RaceProUK said in To library or not to library, that is the question:

    Unless you're multiplatform e.g. Windows and Linux, don't bother.

    Even on Windows and Linux, I still wouldn't bother. Both accept forward slashes :D

    @RaceProUK said in To library or not to library, that is the question:

    As a fan of ASP.NET MVC, I'm tempted to say this is a good use case for libraries, but then, is ASP.NET MVC a library or a framework? I think some would favour calling it the latter.

    ASP.NET MVC is probably nearer to a framework but you're doing something fairly chunky with that. Slim just handles routing, it doesn't do anything more structured than that.

    Especially given a platform where the rest of it treats Apache itself as the router... new route? Make a new file!


  • Considered Harmful

    If you do use third-party libraries, which I'm generally in favor of if the task is sufficiently complex, you have to consider the extra effort to keep a curated repository. You want reproducible packages, and few things are more annoying than randomly breaking shit because someone introduced a bug in something you depend on and what worked perfectly yesterday goes down in flames today just because you pulled the current package from a public mirror instead of a known-good one. That curation can be a significant amount of work, too.



  • Bringing in a third-party library, especially one that is open source* and might need frequent updates (think: something that interacts with Facebook's API, which changes frequently and usually without warning) should always be an absolute last resort.

    It doesn't take long at all to get to the point where you've spent more time dicking around to get the library to work than it would have taken you to write the functionality you need (probably 10% of what the library does) in-house.

    If you do end up using a third-party library, for God's sake, keep your own copy of it so you have reproducible builds. Do not have your build system pull the newest shiny version off NPM or whatever each time it builds the product.
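    One cheap way to enforce the "own copy" rule is to record a checksum when the library is reviewed and have the build refuse anything else. A minimal Python sketch, assuming the library is kept as an archive file in the repository; the file name and helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_vendored(path, expected_sha256):
    """Raise if the archive on disk is not the copy that was reviewed."""
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise RuntimeError(f'{path}: expected {expected_sha256}, got {actual}')
    return True


# Demo with a stand-in file; a real build would point at the vendored archive.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / 'some-lib-1.2.3.zip'  # hypothetical file name
    archive.write_bytes(b'pretend this is the library archive')
    pinned = sha256_of(archive)                 # recorded once, at review time
    print(verify_vendored(archive, pinned))     # True
```

    The pinned digest would live in version control next to the archive, so changing the library becomes a deliberate, reviewable commit.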


    * I should clarify here I mean "open source development methodology" not "the code is open source". C++'s STL was carefully designed for years by experts to be efficient and high-quality, the fact that the current implementations are open source is incidental. Ditto that with the various frameworks in .net. They could not contrast more with the "Facebook API Doodz" library on GitHub which spends like 80% of its time broken due to "release early, release often". We really need better language here.



  • @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place?

    Most people don't bother. They just hitch their horse to the new shiny thing and off they go without thinking 6 months down the road. There's zero planning, high-level design, or any kind of foresight in modern software development.



  • @blakeyrat said in To library or not to library, that is the question:

    @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place?

    Most people don't bother. They just hitch their horse to the new shiny thing and off they go without thinking 6 months down the road. There's zero planning, high-level design, or any kind of foresight in modern software development.

    That's certainly what happens in my workplace.



  • @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place?

    You look at how many stars it has on the APPAPI store.

    Just joking... or not? A "reviews" feature on github might be interesting.



  • @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place?

    Popularity is a good sign that it is reasonable. Unpopular things that appear good always have a hidden reason to be unpopular that you'll only find after wasting a lot of time.



  • @wharrgarbl The number of contributors is also something I'm looking at. A one-man-show, while not an immediate show stopper, definitely warrants a second look (if only for the bus factor).



  • @Arantor said in To library or not to library, that is the question:

    And then we get to the fun that was Slim. For those who don't know it, it's a URL routing/front-side controller tool. The fact that you can easily end up with a stack trace 100 layers deep when anything actually occurs is just hilarious. The fact that in our development environments we've had to up the limit in Xdebug because it actually exceeds Xdebug's default stack limit is just... something else. But should we be bringing in such libraries when we could do it ourselves in a fraction of the code/complexity?

    I did some work with Slim, and AFAIR, it was a pretty bare-bones little library. I wouldn't expect to see more than 4-5 function calls before it reaches your controller.

    Are you sure you didn't end up with some endless redirect loop or something?

    So in general... is bringing libraries in a good thing? Is bringing libraries whose relevant functionality can be replicated with core functions a good thing? Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

    At what point is bringing in third party libraries a good thing? Should it be your first port of call before trying to write it yourself (as some people here seem to think)?

    As always, it depends. Some OSS libraries are great, they cover all sorts of edge cases you'd learn about the hard way and can generally make things a lot easier. Others are like your CSV library.

    I also download libraries from npm, examine the code and usually get disappointed. I feel like everything is written by monkeys and I could do better myself. And sometimes I do just that.

    But here are a few important points I try to keep in mind.

    • Unlike 3rd party libs, my code doesn't get improved or fixed on its own. A year down the line, that ugly lib has all sorts of new features and edge case fixes, while my once superior lib is now a pile of crap in comparison.

    • Coding paradigms change. My code meshes well with my other code, so it all calcifies into one hodge-podge framework that is hard to tear apart. OSS code is by design modular and interchangeable.

    • Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine? A 3rd party lib will generally have well established patterns, so there will not be problems like that.

    • Finally, will other coders even want to work on my artisanally made in-house framework? I mean, even if my thing is superior, once the time comes to update their CV, I bet they'd rather have "2 years of express+mongo+slim+whatevers-hip" there, than "2 years of cartman's internal framework no one's ever heard of before".

    Fight the instinct and let the modules in. Unless they absolutely suck and you have no other option.



  • @Arantor said in To library or not to library, that is the question:

    How does one judge what is good without really using it in the first place? Especially given that the language has a number of these things built in?

    My metrics:

    • Github stars
    • Last commit + activity of project
    • Number of maintainers + how active they are (the more the better)
    • Number of open/resolved issues, pending pull requests (is there a huge backlog?)


  • @dkf said in To library or not to library, that is the question:

    OTOH, if you've already solved the hard stuff then you're into the space of exchanging the potential to have someone else fix bugs so you don't have to, against the fact that switching to something external might actually introduce bugs in things that you've already solved. No easy answer there.

    QFT



  • @cartman82 said in To library or not to library, that is the question:

    Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine?

    Oooh, I know the answer to this one!



  • The real shame is that some of the coolest parts of our tech stack aren't frameworks or libraries and couldn't really be made libraries without compromising what they were made for.

    I guess I'm just pissy about people just adding more and more crap on top of the crap we already have especially when we have enough crap that we don't really need. I honestly don't feel that '3rd party library' should be the first suggestion to any problem that we encounter.


  • :belt_onion:

    @Arantor said in To library or not to library, that is the question:

    The real shame is that some of the coolest parts of our tech stack aren't frameworks or libraries and couldn't really be made libraries without compromising what they were made for.

    I guess I'm just pissy about people just adding more and more crap on top of the crap we already have especially when we have enough crap that we don't really need. I honestly don't feel that '3rd party library' should be the first suggestion to any problem that we encounter.

    But if you guys want to bring the hellish library community of npm to the toxic hellstew of PHP, feel free :P



  • @sloosecannon Packagist is vastly more sane than npm. And I'm not the one in our company who is "is there a library for that", because I could easily be the person writing a library rather than consuming. It's not like I can't write it myself in just about every case we currently use a third party dependency - the exception is PHPExcel, and I'd rather have my sanity first.

    The fact we are now in a treadmill where almost every library on it has upgrades we currently cannot deploy because our platform does not yet work on PHP 7 and our libraries all now mandate PHP 7 is just a wonderful place.


  • :belt_onion:

    @Arantor said in To library or not to library, that is the question:

    @sloosecannon Packagist is vastly more sane than npm. And I'm not the one in our company who is "is there a library for that", because I could easily be the person writing a library rather than consuming. It's not like I can't write it myself in just about every case we currently use a third party dependency - the exception is PHPExcel, and I'd rather have my sanity first.

    The fact we are now in a treadmill where almost every library on it has upgrades we currently cannot deploy because our platform does not yet work on PHP 7 and our libraries all now mandate PHP 7 is just a wonderful place.

    Ow...

    It's like you took a post from the thread and compressed all the ow in it into one sentence...



  • @sloosecannon fun, isn't it?



  • @cartman82 said in To library or not to library, that is the question:

    My metrics:

    • Github stars
    • Last commit + activity of project
    • Number of maintainers + how active they are (the more the better)
    • Number of open/resolved issues, pending pull requests (is there a huge backlog?)

    so basically this? http://packagequality.com/





  • People have a very odd idea of what package quality looks like: If a package is actually good, I would hope there isn't much churn.

    I know people prefer the JS model of, 'if updating the package doesn't break your code, it's five years too old even if it was created a month ago' but every single one of those people is wrong.


  • FoxDev

    @Magus You got me curious, so I'm digging into how they measure quality, and right off the bat, they admit it's a flawed premise:

    Any objective measurements of quality are going to be flawed one way or another. package-quality only attempts to give some indications about quality, not be an absolute rating on which to bet your farm. If you don't agree with our ratings, please help us improve them!

    They also have some odd measures:

    • More versions means higher quality.
      I can imagine that being pulled apart quite spectacularly here.
    • More downloads means higher quality.
      Same again. Just because a package is popular, doesn't automatically mean it's good. Still, it's a better metric than version count.
    • Repo quality.
      This is a complex one. There's three measures:
      • Total Factor 1-1/total_number_of_issues
      • Open Factor 1.2-open_issues/total_number_of_issues ('healthy' is 20% or less of issues open)
      • Long Open Factor 1-long_open_issues/total_number_of_issues ('long' means 'over a year')
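    To see what those three formulas produce for a concrete repository, here is a small Python sketch that computes them exactly as quoted. The function name and the sample numbers are made up, and how package-quality clamps or combines the factors into a final score is not covered in this thread, so none of that is modelled.

```python
def repo_factors(total_issues, open_issues, long_open_issues):
    """Compute the three repo-quality factors as quoted above."""
    if total_issues <= 0:
        raise ValueError('the quoted formulas need at least one issue')
    return {
        'total': 1 - 1 / total_issues,
        'open': 1.2 - open_issues / total_issues,
        'long_open': 1 - long_open_issues / total_issues,
    }


# Hypothetical repository: 100 issues, 20 still open, 5 open for over a year.
factors = repo_factors(total_issues=100, open_issues=20, long_open_issues=5)
print(factors)
```

    With these numbers the factors come out at roughly 0.99, 1.0 and 0.95. Note that a repository with no issues at all cannot be scored by these formulas as written, because every one of them divides by the total.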

    I'll be honest: I'm not exactly sure I understand the reasoning behind that last measure.

    Then there's this:

    New packages will always appear with low stars, until they get enough momentum. Also, packages that "just work" and get no issues will be underrated by our system.

    Which just throws more doubt over their measurements.


    I just PR'd adding shields for this to both SockBot and SockMafia. Now I'm wondering if we got a bit carried away.


  • ♿ (Parody)


  • Java Dev

    @boomzilla Hm, maybe module size is a good criterion? But above some limit that becomes a negative...

    Is there a good complexity metric? Though that has the same problem as size.



  • @boomzilla said in To library or not to library, that is the question:

    I'm sold!

    I actually used left-pad in one of my code dumps in the mettle threads. I'm disappointed no one spotted it.

    It did the job, no complaints here.


  • ♿ (Parody)

    @PleegWat I doubt there's a single metric that doesn't have counter examples. The best approach for a quick look is to have a diversity of metrics. But a quick look is only ever going to be that.


  • kills Dumbledore

    @PleegWat said in To library or not to library, that is the question:

    Is there a good complexity metric?

    cyclomatic complexity?
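    For anyone who hasn't met it: cyclomatic complexity is roughly one plus the number of decision points in a piece of code. A simplified Python sketch of the idea follows; it counts only a handful of constructs, so treat it as an approximation and not a replacement for a real analysis tool. The function and sample names are made up.

```python
import ast

# Node types that each add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


sample = '''
def classify(items):
    for item in items:
        if item > 10 and item % 2 == 0:
            return "big even"
    return "other"
'''
print(cyclomatic_complexity(sample))  # 4
```

    Like module size, it says how tangled the code is, not whether it is any good.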



  • @boomzilla I looked at the source. There are lots of performance optimizations I wouldn't care to write if I were doing my own thing, so it's actually worth something.

    But if I were to use it, I would copy it instead of referencing it through npm. And that would probably only happen if I were doing something where the performance of a left-pad operation was meaningful.


  • Discourse touched me in a no-no place

    @RaceProUK said in To library or not to library, that is the question:

    Long Open Factor 1-long_open_issues/total_number_of_issues ('long' means 'over a year')

    The “Not-Jeff Factor” should be the number of issues that have been opened without being closed as fixed in a way that the submitter finds acceptable, as that's one which is fairly hard to game. Except nobody collects the critical metrics in the first place, so tricks like deleting everything after 10 days that hasn't been fixed let you get a hugely high acceptability metric while not making anyone actually happy.

    Also, things go wonky if package-quality can't find where the issue DB actually is. ;)


  • kills Dumbledore

    @dkf said in To library or not to library, that is the question:

    Also, things go wonky if package-quality can't find where the issue DB actually is.

    Like if it's in a forum that's invulnerable to bots?


  • I survived the hour long Uno hand

    @RaceProUK said in To library or not to library, that is the question:

    I'm not exactly sure I understand the reasoning behind that last measure.

    Issues that remain open for over a year are a good sign the package has been abandoned by the creators, and thus, if you ever find a bug, good luck getting it fixed.


  • FoxDev

    @Yamikuronue I now understand that measure 🙂



  • @cartman82 said in To library or not to library, that is the question:

    Coding paradigms change. My code meshes well with my other code, so it all calcifies into one hodge-podge framework that is hard to tear apart. OSS code is by design modular and interchangeable.

    Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine? A 3rd party lib will generally have well established patterns, so there will not be problems like that.

    Finally, will other coders even want to work on my artisanally made in-house framework? I mean, even if my thing is superior, once the time comes to update their CV, I bet they'd rather have "2 years of express+mongo+slim+whatevers-hip" there, than "2 years of cartman's internal framework no one's ever heard of before".

    Pretty much sums up my feelings on it.

    If it is a decent open source library that is actively maintained, used, popular and has unit tests, then use it.

    At the moment another dev and I are making a .NET web app. I need to do some image processing. I could use System.Drawing and write a load of code for cropping, rotation and a few other things we have to do with incoming images.

    Or I could just use this http://imageprocessor.org/ and call crop, rotate etc.

    I don't particularly like frameworks. I like micro-frameworks as I can just get the bits I need.

