Wtfbuild: A build system for the next decade.



  • I wanted to automate some of the effort of building my hobby projects at home, integrating testing and source control into project management. A quick search turned up a bunch of possibilities, but since it would have required learning, I decided that it would be more fun to make my own.

    And so wtfbuild was born!

    The goal is to be able to trigger a build simply by committing to a staging branch in a repo, and let the system run tests, compile different targets, etc., and reject the commit if any step fails, that way guaranteeing that the build is unbreakable. Eventually I want to have it work as a service managing several slave machines to make the best use of resources.

    Some of the awesome design decisions:

    • Building it in C++, since that's the best language to build something like this in.
    • Optimized for building C++ projects, since why would you want to build anything else?
    • Integrated with git, mostly because it's the source control system I'm most familiar with. It should eventually be extensible and capable of working with SVN or Mercurial or whatever.
    • Will first run on Windows, but eventually be able to run on and manage a mix of Linux and Windows machines.

    The expected development experience should be: dev writes code, code passes code review, goes into staging branch, build system runs tests and tries to build, if successful change makes it to the real branch. At this point other devs can then sync and receive the change.

    Not sure how to link it with a ticketing system yet, though. Maybe enforce that every change has a ticket assigned before code review? Maybe with an ID in the commit comment? Let the system then update the ticket with the message that a change was applied referencing that ticket?
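
    For the ticket link, a minimal sketch of what scanning the commit message could look like (the "WTF-123" ticket format and the helper name are made up for illustration, nothing the system actually defines yet):

        #include <regex>
        #include <string>

        // Hypothetical helper: pull a ticket ID like "WTF-123" out of a commit
        // message so the build system can update that ticket after the merge.
        // Returns an empty string when no ticket is referenced.
        std::string extract_ticket_id(const std::string& commit_message)
        {
            static const std::regex ticket_pattern(R"(\b([A-Z]+-\d+)\b)");
            std::smatch match;
            if (std::regex_search(commit_message, match, ticket_pattern))
                return match[1].str();
            return "";
        }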



  • What was wrong with Travis, AppVeyor, etc. + GitHub integration? Just enjoy reinventing the wheel? I guess it would be cool to have my own continuous integration, but that's a lot of effort for something I already have access to.


  • Winner of the 2016 Presidential Election

    @Kian said:

    Building it in C++

    Only :wtf: I have found so far.

    1/10, misleading clickbait title


  • Discourse touched me in a no-no place

    @asdf said:

    Only :wtf: I have found so far.

    I saw one around

    @Kian said:

    hobby projects at home, integrating testing and source control into project management

    But that might be because I'm assuming @Kian does that stuff at work and I don't get home and think "oh, that stuff I do at work, I'll do that at home too".



  • @LB_ said:

    What was wrong with Travis, AppVeyor, etc. + GitHub integration? Just enjoy reinventing the wheel?

    @Kian said:

    it would have required learning, I decided that it would be more fun to make my own.

    Reinventing wheels is fun! Also, it lets me play with things I don't usually do, like creating services, launching child processes and reading their output and return codes, network communications and the like. It's a learning project and I end up with something kind of useful.

    @loopback0 said:

    "oh, that stuff I do at work, I'll do that at home too".
    But programming is fun! And good processes can automate all the boring parts so I don't have to do them.


  • Discourse touched me in a no-no place

    @Kian said:

    But programming is fun!

    To a degree. The rest - not so much.



  • @loopback0 said:

    "oh, that stuff I do at work, I'll do that at home too".

    I do some of the stuff at home, because it works. Source control is obvious, work items because it's handy to keep track of tasks and bugs somewhere, and CI/build on a different box because F5 is not a build process.



  • @Kian said:

    The expected development experience should be: dev writes code, code passes code review,

    OK, the obvious problem here is that you should do your CI stuff simultaneously with your code review, and here's why:

    @Kian said:

    goes into staging branch, build system runs tests and tries to build, if successful change makes it to the real branch.

    What happens if there's a merge conflict?

    Your humans are already hands-off because it's gone through code review, so now there's nobody to solve the merge conflict except this dumb machine which doesn't know how.

    Again, I hate to talk up any Atlassian product, but the Stash/Bamboo combo does this right. The CI stuff happens while it's in code review, and if the CI fails (or hasn't finished yet, or if there's a merge conflict), you're not allowed to "approve" the review. Once approved, the PR goes directly into the branch it's intended for.

    Otherwise, you're building a system with TWO code reviews. One for the human-caught stuff, and another for the CI-caught stuff. That's dumb. Save everybody's time and just have one code review. Plus it's less UI for you to write.



  • @blakeyrat said:

    What happens if there's a merge conflict?

    OK, running the tests at the same time and blocking code review approval is a reasonable change and makes sense. Point taken. However, the issue with merge conflicts isn't solved by it.

    Let's say there are two devs, and they both make changes within a short time of each other. Both changes are based off the same repository state and unaware of each other. The system tests each separately, and decides they're fine. They both also pass code review. The system then tries to apply them. Whichever goes first succeeds. However, if it had something that is incompatible with the second change, like changing a function signature, it would break the second change. So the system has to recognize that the second change is being applied to a repository that is different from the one it was tested against, and rerun the tests.

    If there's a merge conflict, or tests fail, it still has to tell the dev "your change that had passed the tests and was approved could not be applied because someone entered a change ahead of you and broke your change!".

    I don't think this is a solvable problem. The alternative is to enforce an ordering as soon as a change is submitted, but then subsequent changes would be blocked until the changes ahead of them pass.
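
    To make that concrete, here's a rough sketch of what the apply step implies; the Change struct and the stubs are hypothetical placeholders, just to show where the re-test happens:

        #include <string>

        // Hypothetical types and stubs, just to make the ordering problem concrete.
        struct Change {
            std::string id;
            std::string tested_against;  // head of the target branch when the tests last ran
        };

        std::string current_main_head()     { return "abc123"; }  // stub
        bool run_tests(const Change&)       { return true; }      // stub
        bool merge_into_main(const Change&) { return true; }      // stub

        // Apply approved changes one at a time. If the branch moved since a change
        // was tested, re-run the tests against the new head before merging; a failure
        // here becomes the "someone got in ahead of you and broke your change" message.
        bool try_apply(Change& change)
        {
            const std::string head = current_main_head();
            if (change.tested_against != head) {
                change.tested_against = head;
                if (!run_tests(change))
                    return false;  // bounce it back to the author
            }
            return merge_into_main(change);
        }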



  • @Kian said:

    The system then tries to apply them.

    Right; but it's not going to do this simultaneously.

    @Kian said:

    Whichever goes first succeeds.

    Correct.

    @Kian said:

    However, if it had something that is incompatible with the second change, like changing a function signature, it would break the second change.

    Right; so after each merge, you have to look for open pull requests against the branch you merged into and re-check them for merge conflicts. (Generally there won't be more than one or two, if any.) Again, the Stash/Bamboo combo gets this correct.

    @Kian said:

    If there's a merge conflict, or tests fail, it still has to tell the dev "your change that had passed the tests and was approved could not be applied because someone entered a change ahead of you and broke your change!".

    Right; which is why you can only do one merge at a time. If two people hit the button simultaneously, one of them is just gonna have to wait and if he's the "loser" he might have to fix the merge conflict. Sucks to be him, but oh well.

    @Kian said:

    I don't think this is a solvable problem.

    It's not only solvable, it is solved in at least one product.

    @Kian said:

    The alternative is to enforce an ordering as soon as a change is submitted,

    Not when it's submitted (then a long-running code review would fuck everything up.) When it's merged.

    @Kian said:

    but then subsequent changes would be blocked until the changes ahead of them pass.

    OK, it's simple. There are basically two states:

    • Approved
    • Merge-able

    Just keep in mind that a pull request can be approved (by humans, CI, etc.) but not Merge-able (because merging it would cause a merge conflict.)

    In this state, you do what Stash does: disable the "merge" button and put up a big ol' banner that says, "you can't merge this PR until you fix the merge conflicts, sorry bud". You don't have to go through all the approvals again, because they're all still green.
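
    In code, the gate amounts to something like this (names invented; the point is that approvals and merge-ability are tracked separately, so a conflict doesn't throw the green approvals away):

        // Hypothetical PR state: approval and merge-ability are separate flags.
        struct PullRequest {
            bool human_approved     = false;
            bool ci_passed          = false;
            bool has_merge_conflict = false;
        };

        // The "merge" button is only enabled when the PR is both approved and
        // merge-able; a conflict disables the button but keeps the approvals.
        bool merge_button_enabled(const PullRequest& pr)
        {
            return pr.human_approved && pr.ci_passed && !pr.has_merge_conflict;
        }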



  • @blakeyrat said:

    Otherwise, you're building a system with TWO code reviews. One for the human-caught stuff, and another for the CI-caught stuff. That's dumb. Save everybody's time and just have one code review. Plus it's less UI for you to write.

    IMO redundancy is better in this case.



  • @blakeyrat said:

    It's not only solvable, it is solved in at least one product.

    Sorry, I meant solvable in the sense that you can't avoid having "a loser" when there's a merge conflict. Managing that situation gracefully is solvable, of course, as you described.



  • @LB_ said:

    IMO redundancy is better in this case.

    No, it's irritating. Nobody's going to like using it if you thought everything was all done, then OOPS! Suddenly you get a new email that's like, "fuck you I'm a useless stupid machine here to waste your time, I bet you thought you were done, huh? Nope! Sucker!"

    And again, I cannot believe I am defending the usability of an Atlassian product here. Even those morons got this right.



  • I mean, have the machine give the OKAY before anyone even begins to look at the code manually.



  • That doesn't work, because then PRs can only be merged in the order they were opened.

    You want them to be merged in the order they were closed.

    Otherwise, Bill's HUGE rewrite of the billing system that's going to take like 4 weeks to do a code review of is holding up Ted's minor typo fix where you put "poot" instead of "pot".



  • Um, what? I'm not sure how that has anything to do with what I am talking about. In my experience the order of PRs doesn't matter for continuous integration and manual review.



  • @blakeyrat said:

    You want them to be merged in the order they were closed.

    I think he meant that the machine first tests the change, gives the okay, and only then a reviewer can even look at it. Instead of the tests being triggered together with the code review.



  • @Kian said:

    * Building it in C++, since that's the best language to build something like this in.

    Sorry, no, it isn't. I work in C++ and I like it, but for something that is mostly executing other processes and a (probably web) user interface I'd go with python, perl or ruby.

    @Kian said:

    * Optimized for building C++ projects, since why would you want to build anything else?

    C++ projects are the one thing that you can't really optimize for, because it does not have any standard build system. Each project uses its own quirky pile of fragile build scripts tied together with duct tape and spit.

    So you have to specify the build commands and run them with the shell, and the general part is just a glorified cron with a fancy user interface.

    @Kian said:

    Will first run on Windows, but eventually be able to run on and manage a mix of Linux and Windows machines.

    A word of warning: the Windows API is the odd man out. Portability layers mostly work like POSIX, so don't tie yourself to the Windows API. Things are a bit better since C++11 added thread support, but neither filesystem access nor process execution has a portable interface in the standard library.

    Which leads us back to whether C++ is a good choice. Every other language does have those in its standard library. And a decent web application framework, which C++ lacks too.

    @Kian said:

    I think he meant that the machine first tests the change, gives the okay, and only then a reviewer can even look at it. Instead of the tests being triggered together with the code review.

    Especially since you are talking about C++, this is no good. You see, I work on a C++ project. Our "continuous" integration never really deserved to be called continuous, but now that a build typically takes two FUCKING HOURS it definitely does not. At times there is no way to even build all revisions. Having to wait for this before starting a review would be a pain in the arse.



  • @Kian said:

    I think he meant that the machine first tests the change, gives the okay, and only then a reviewer can even look at it. Instead of the tests being triggered together with the code review.

    Oh, that makes more sense. I still wouldn't recommend it, because CI (after a few years of product development) can take a significant amount of time, and I don't see any reason to not do it in parallel with the human code review.

    If you have an "emergency, this will kill the business" bug and you're still waiting an hour for the 463264 unit tests to complete before your dev lead can even LOOK at it, that's gonna suck.

    Which brings up another point: if you want to turn this into a product instead of just dicking around, please test it on very mature and very LARGE projects. Ideally, something like Photoshop or Windows. Don't test it on your 500-line CS-101 project and say, "well works fine there, it'll work for Photoshop."


  • Winner of the 2016 Presidential Election

    @Bulb said:

    python

    😃

    @Bulb said:

    perl or ruby

    😷

    @Bulb said:

    C++ projects are the one thing that you can't really optimize for, because it does not have any standard build system.

    CMake is pretty much the standard nowadays, at least for cross-platform builds.

    @Bulb said:

    Each project uses its own quirky pile of fragile build scripts tied together with duct tape and spit.

    Isn't that true for every major project ever written in any language? Especially for cross-platform builds?

    @Bulb said:

    Our "continuous" integration never really deserved to be called continuous, but now that a build typically takes two FUCKING HOURS it definitely does not.

    We run our "CI" tests on a cluster and they still take about 50 minutes to run.



  • @asdf said:

    Isn't that true for every major project ever written in any language? Especially for cross-platform builds?

    In the C# world, it's just MSBuild and your Solution or Project XML file.

    I mean, you could make it more fragile, but there's no need to. The project XML file can specify advanced or weird build directions by itself, if needed.


  • Winner of the 2016 Presidential Election

    @blakeyrat said:

    In the C# world, it's just MSBuild and your Solution or Project XML file.

    Even if you have native libraries as dependencies? Because that's usually what makes builds ugly.



  • @Bulb said:

    Sorry, no, it isn't. I work in C++ and I like it, but for something that is mostly executing other processes and a (probably web) user interface I'd go with python, perl or ruby.

    Web interface? Tsk. Rich client is where it's at! Been meaning to do a Qt project....

    @Bulb said:

    C++ projects are the one thing that you can't really optimize for, because it does not have any standard build system. Each project uses its own quirky pile of fragile build scripts tied together with duct tape and spit.

    Right. So if you can get various C++ projects built, you are already flexible enough to build everything else!

    @Bulb said:

    Things are a bit better since C++11 added thread support, but neither filesystem access nor process execution has a portable interface in the standard library.

    Yeah, I'll need an abstraction layer to call the correct system functions on each platform. But the concepts themselves translate well enough, so it's not going to be too leaky an abstraction. It just means I have to make my own "MakeDirectory", "OpenFile" or "CreateProcess" functions that call down to the OS properly, or look for someone else's.
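
    A minimal sketch of what one such wrapper might look like, assuming plain Win32 on one side and POSIX on the other (none of this is final API, just the shape of it):

        #include <cerrno>
        #include <string>

        #ifdef _WIN32
        #include <windows.h>
        #else
        #include <sys/stat.h>
        #include <sys/types.h>
        #endif

        // Sketch of a "MakeDirectory" wrapper: the concept is identical on both
        // platforms, only the underlying call differs.
        bool MakeDirectory(const std::string& path)
        {
        #ifdef _WIN32
            return CreateDirectoryA(path.c_str(), nullptr) != 0
                   || GetLastError() == ERROR_ALREADY_EXISTS;
        #else
            return mkdir(path.c_str(), 0755) == 0 || errno == EEXIST;
        #endif
        }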

    @Bulb said:

    And a decent web application framework, which C++ lacks too.

    I have ports and sockets! What else do you need?

    @Bulb said:

    You see, I work on a C++ project. Our "continuous" integration never really deserved to be called continuous, but now that a build typically takes two FUCKING HOURS it definitely does not.

    It sounds as though you need to optimize your build process. Out of curiosity, is that time spent running tests, compiling, or linking?

    @blakeyrat said:

    if you want to turn this into a product instead of just dicking around, please test it on very mature and very LARGE projects. Ideally, something like Photoshop or Windows.

    I don't think they'll give me access to their source, but good point. I will try to find some big open source project to test building once I have it working. And of course, it's going to be building itself too. The more complex it becomes, the more it tests itself.

    @blakeyrat said:

    I mean, you could make it more fragile, but there's no need to. The project XML file can specify advanced or weird build directions by itself, if needed.

    C++ projects don't need to be retardedly complex to build, but they end up that way because instead of doing sane refactoring, some devs decide that sprinkling #ifdefs around and the like is easier.



  • @asdf said:

    Even if you have native libraries as dependencies?

    Well, again, you could do it the stupid way. But you have all the tools you need to make it work between MSBuild and the .csproj XML file.


  • Winner of the 2016 Presidential Election

    @Kian said:

    Right. So if you can get various C++ projects built, you are already flexible enough to build everything else!

    I wouldn't be so sure. Python and Java builds may be complicated in different ways than the average C++ build.

    @Kian said:

    It sounds as though you need to optimize your build process.

    I can only speak for myself, but I already optimized the hell out of our build (parallelization etc.) and it still takes almost an hour to run on our cluster. And we don't even run all of our tests all the time; some of them are only run once per day, during the night. Sometimes, a project is just that complex.



  • Seems like "made in C++" is the only requirement not fulfilled by Jenkins with the Gerrit Trigger plugin.



  • Even with caching? Usually you can just cache the entire project directory and let CMake decide which files need to be rebuilt.


  • Winner of the 2016 Presidential Election

    @LB_ said:

    Even with caching?

    Yes, the compilation itself is pretty fast anyway (2-8 min), despite the fact that we build quite a few different variants. The unit tests (~10 min.) and integration tests (~40 min.) are the slow part.


  • I survived the hour long Uno hand

    GitHub + Travis seem to integrate CI and merging well, also:

    [screenshot: a pull request page with the CI status checks shown above the merge button]

    There's no code review tool enabled in that screenshot, but presumably you'd wait until those checks are green, then hold your code review on the same screen, where you can easily see the change by toggling the PR's tabs:

    [screenshot: the pull request's "Conversation" / "Commits" / "Files changed" tabs]

    ("Conversation" includes commit messages and any comments placed since the PR was opened)



  • @Bulb said:

    but now that a build typically takes two FUCKING HOURS it

    Luxury! Three-and-a-half hours, over here...



  • @asdf said:

    CMake is pretty much the standard nowadays, at least for cross-platform builds.

    CMake is an Eldritch Abomination crossed with the Flying Spaghetti Monster. The fact that it is also the best tool for cross-platform C++ projects is telling.

    Besides, you still need to set the correct generator, and there are differences like whether the build type is passed at configuration time or at build time, so in the end it is just a build command that needs to be configured per project anyway.

    @asdf said:

    Isn't that true for every major project ever written in any language? Especially for cross-platform builds?

    No. Java is already quite a bit simpler, Python's setuptools usually do the right thing out of the box, and Cargo totally kicks ass. On the other hand, the worst offenders in the C++ ecosystem are the extra build steps for mobile applications, a problem other languages have barely ever attacked.

    @asdf said:

    We run our "CI" tests on a cluster and they still take about 50 minutes to run.

    Yes. Then you should understand why the reviewer may not want to wait for it.

    @Kian said:

    Web interface? Tsk. Rich client is where it's at! Been meaning to do a Qt project....

    I am not sure how sensible that is. It sounds like it would be harder to administer and not really give much benefit anyway.

    @Kian said:

    It just means I have to make my own "MakeDirectory", "OpenFile" or "CreateProcess" functions that call down to the OS properly, or look for someone else's.

    Well, if you want to use Qt, you can just go with QtCore.

    @Kian said:

    Out of curiosity, is that time spent running tests, compiling, or linking?

    Tests? You are joking, right?

    It is spent compiling. And then again, and again, and again, because there are something like 9 targets to be made on 3 computers and the third can only do one of them.

    There are some tests. They take quite a long time considering they're totally useless (this is a GUI application and a lot of the logic can't be tested without the GUI).

    @Circuitsoft said:

    Jenkins

    Jenkins is another close relative of the Eldritch Abomination. But I think it is mainly Java's fault. Java can be surprisingly memory-hungry.

    Actually, it demonstrates another reason why C++ might not be the right language. The core of Jenkins is pretty simple, but then there are many plugins and Groovy scripts and whatnot, which any non-trivial setup needs to use.

    @asdf said:

    Yes, the compilation itself is pretty fast anyway (2-8 min), despite the fact that we build quite a few different variants. The unit tests (~10 min.) and integration tests (~40 min.) are the slow part.

    Here it is over 30 min of just compilation for most platforms. Though I suspect it's some combination of a crappy operating system (Windows), crappy build drivers (make or Visual Studio) and a crappy old machine, because I can do the same compilation in 5 min with ninja on my Linux box (but only one of the platforms can be built on Linux, so the build servers have to remain Windows).


  • Winner of the 2016 Presidential Election

    @Bulb said:

    > CMake is pretty much the standard nowadays, at least for cross-platform builds.

    CMake is an Eldritch Abomination crossed with the Flying Spaghetti Monster.

    Those two statements don't contradict each other. ;) In fact, I ranted about CMake in a few other threads.

    @Bulb said:

    No. Java is already quite a bit simpler,

    You should see our Java build. It might change your mind. :D

    @Bulb said:

    Cargo totally kicks ass

    Totally agree. Cargo is awesome.

    @Bulb said:

    Jenkins is another close relative of the Eldritch Abomination.

    Yeah, Jenkins sucks.

    @Bulb said:

    Here it is over 30 min of just compilation for most platforms.

    OK, you should definitely fix your build and/or get a new server.



  • @tar said:

    Luxury! Three-and-a-half hours, over here...

    Yuck. I don't know what you are building, but I can't imagine that it's justified. Granted, it's not easy to optimize C++ build times, but there are ways of doing it! Thankfully, once modules make it into the standard they should restore some sanity.
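
    (For the curious, the proposal looks roughly like this; the file name and exact syntax are still up in the air, but the idea is that the module is compiled once instead of being re-parsed as a header by every consumer:)

        // wtf_math.cppm -- a module interface unit (name and extension illustrative)
        export module wtf_math;

        export int add(int a, int b) { return a + b; }

        // consumer.cpp -- imports the compiled module instead of re-including a header
        import wtf_math;

        int main() { return add(2, 2) == 4 ? 0 : 1; }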

    @Bulb said:

    It is spent compiling. And then again, and again, and again, because there are something like 9 targets to be made on 3 computers and the third can only do one of them.

    It's hardly the fault of the language or tools if your bosses are too dumb to realize that the cost of a decent server (or servers) is more than made up for by the increase in productivity from not having to wait hours for a build. If your computer can run a build in 5 minutes, 3 computers should be able to go through 9 targets in a little over 15 minutes.


  • Winner of the 2016 Presidential Election

    @tar said:

    Three-and-a-half hours

    Holy mother of…

    And I thought our build times were bad. How can you be productive if it takes 3.5 hours to get feedback on whether your changes work correctly?



  • I imagine that 3.5 hours is the time a full rebuild takes in the CI system. I hope he can run a local build, which would only require recompiling whatever files he touched, in a fraction of the time.


  • Winner of the 2016 Presidential Election

    After making major changes, you still have to check whether something unrelated broke (i.e., whether your integration tests still pass). And that takes 3.5h, apparently. Not even a full rebuild should take that long. If it does, buy more servers and parallelize your build!



  • @tar said:

    @Bulb said:
    but now that a build typically takes two FUCKING HOURS it

    Luxury! Three-and-a-half hours, over here...

    When I was at VMware, I seem to remember a complete build of Workstation was something like 6 1/2 hours. (That was for WS and all the tools that could go into the various VMs.) But luckily many things were "cached", so a typical build was closer to 3.



  • So what I'm hearing is that a build system that could improve C++ build times would be quite valuable to many big companies :D


  • Winner of the 2016 Presidential Election

    First you have to know what your bottleneck even is. Many companies have magic build scripts glued together by more magic scripts and nobody really understands the whole thing. Rewriting such a thing in another build system is next to impossible.



  • @Kian said:

    I hope he can run a local build, which would only require recompiling whatever files he touched, in a fraction of the time.

    Yeah, as long as you don't have to touch certain headers, it's usually not so bad building locally. You have to be a bit strategic about syncing though -- that might knock you out for 5 or 6 hours if a lot's changed (if there was an integration) although it can be under an hour if you're lucky...


  • Garbage Person

    The package step of our build causes subsequent incremental builds to produce non-working binaries. Thank you, release engineering guy who no longer works here.



  • @blakeyrat said:

    Otherwise, you're building a system with TWO code reviews. One for the human-caught stuff, and another for the CI-caught stuff. That's dumb.

    Yeah, nobody should ever look at code if they have a compiler or try to compile code that someone has looked at. That's just a waste.



  • @Kian said:

    It's hardly the fault of the language or tools

    In part it is.

    • Visual Studio (2008, the last one with a WinCE target) is not able to build in parallel. For desktop, it at least builds separate projects in parallel, but since (for histerical raisins) a big chunk of the thing is in one big library, that does not help much. And for the WinCE target even that does not seem to work.
    • There is a huge difference between make and ninja. Windows sucks at disk access, and it especially hurts make as it tries all its implicit rules; plus make also can't be told to build in parallel, because it occasionally fails when doing that with what appears to be a sharing problem (Windows suckage again).
    • However, the ninja build was not reliable with older cmakes, because it handles dependencies between projects somewhat differently and there are some tricks involved around the calls to ant that build the Android packages. It should now work since cmake 3.2 or 3.3 with the BYPRODUCTS specification, but I'll have to go through the build script with a fine-tooth comb before I trust it enough to switch the server to ninja.

    The servers somewhat make up for the lack of parallelism by running 2 or 3 builds in parallel, but then the disk becomes a bottleneck. Actually, the disk becomes a bottleneck anyway, which is made worse by the amount of file copying that goes on. And did I say Windows sucks at disk access?

    @Kian said:

    So what I'm hearing is that a build system that could improve C++ build times would be quite valuable to many big companies 😃

    CMake with the ninja generator is a huge improvement. However, Microsoft has painted themselves into a corner with the PDB abomination. When you are trying to write debug information for all modules to a single file, the ability to parallelize decreases significantly. Plus you can't integrate things like distcc/ccache.

    And of course the other part is the language itself. The template code must be in headers and they get recompiled again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, and again, … … … … and again, … … … … … … … … … … (yes, there are precompiled headers; but whenever we tried to use them, it was a maintenance problem, so we don't use them).
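
    (One partial mitigation besides precompiled headers, for whatever it's worth, is explicit instantiation: declare the common instantiations extern in the header and instantiate them once in a single .cpp, so the rest of the translation units stop re-instantiating them. A sketch, with made-up names:)

        // widget.h -- the template still lives in the header...
        #include <vector>

        template <typename T>
        struct Widget {
            std::vector<T> items;
            void add(const T& t) { items.push_back(t); }
        };

        // ...but the common instantiations are declared extern, so translation
        // units that include this header don't instantiate them yet again.
        extern template struct Widget<int>;
        extern template struct Widget<double>;

        // widget_instantiations.cpp -- instantiated exactly once, here.
        template struct Widget<int>;
        template struct Widget<double>;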


    That does not mean there would not be things to optimize. But it would require splitting the project into sensible modules, which we started, but the amount of effort that can be dedicated to that kind of thing is rather limited.



  • @LB_ said:

    Just enjoy reinventing the wheel?

    [image of an omni wheel]



  • @xaade said:

    I've worked with omni wheels before; I was the lead programmer for my high school robotics team for 4 years. They're cool things; we mainly used them to allow the robots to strafe.



  • Well, after becoming more familiar with the Windows API, and messing about with this during the weekend, I finally managed to get the project to build itself.

    Sure, everything is hardcoded right now, and instead of caching it rebuilds everything every time. And it currently only does the debug build. But that's all easy to fix. Designing a sane way to describe the build rules is going to be a bigger challenge than actually getting it to work.

    I basically have it creating one thread for each file to be compiled. Each thread passes the required command line arguments to a child process and then waits for the child process to finish. The main thread joins all the threads, and once they have all returned, finally calls the linker.
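
    In rough code, the scheme is something like the sketch below; std::system stands in for the real CreateProcess-based launcher that captures output and exit codes, and the file list and compiler/linker flags are hardcoded placeholders:

        #include <atomic>
        #include <cstdlib>
        #include <string>
        #include <thread>
        #include <vector>

        int main()
        {
            const std::vector<std::string> sources = {"main.cpp", "builder.cpp"};  // hardcoded for now
            std::atomic<bool> all_ok{true};

            // One thread per source file; each one launches the compiler as a child
            // process and waits for it to finish.
            std::vector<std::thread> workers;
            for (const auto& src : sources) {
                workers.emplace_back([&all_ok, src] {
                    const std::string cmd = "cl /c /EHsc /Zi " + src;  // placeholder debug-build flags
                    if (std::system(cmd.c_str()) != 0)
                        all_ok = false;
                });
            }
            for (auto& t : workers)
                t.join();  // the main thread waits for every compile

            if (!all_ok)
                return 1;  // a compile failed, so don't bother linking

            // Once everything compiled, invoke the linker.
            return std::system("link /DEBUG main.obj builder.obj /OUT:wtfbuild.exe");
        }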



  • @Bulb said:

    Windows sucks at disk access, and it especially hurts make as it tries all its implicit rules,

    Yeah, I've found that make runs significantly faster on Windows with implicit rules disabled (which is either the -R or -r option, I can never remember which....)



  • Well, this weekend was a waste. Got bit by git's awfulness.

    Turns out there is no straightforward way of exporting the files from a specific point in time into a separate directory. Every alternative has some annoyance or another. The thing that comes closest to what I want is git archive, except that it creates a zip (or tar.gz) that I then have to extract. So I had to add a dependency on a zip library (I settled on miniz in the end). A bunch of time wasted on a tangent to be able to have reproducible builds of any point in time.

    The other alternatives were to clone into a specific directory (although you can only clone the top of a branch, from what I understood of the docs, so you'd have to then check out the correct commit in the new repo), or to check out the current index into another folder, which gives you the files by themselves but you have to modify your current repo to point to the right commit first (at least from what I understood of the docs).
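
    For reference, the git archive route boils down to something like this (paths and commit are placeholders; extracting the resulting zip is the part that needs miniz):

        #include <cstdlib>
        #include <string>

        // Ask git for a zip of the tree as it was at a given commit. The zip then
        // gets extracted into the build workspace with miniz.
        bool export_commit(const std::string& repo_path, const std::string& commit,
                           const std::string& zip_path)
        {
            const std::string cmd = "git -C " + repo_path +
                                    " archive --format=zip --output=" + zip_path +
                                    " " + commit;
            return std::system(cmd.c_str()) == 0;
        }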


  • FoxDev

    @Kian said:

    Turns out there is no straightforward way of exporting the files from a specific point in time into a separate directory.

    cd ~/build
    git clone https://github.com/SockDrawer/SockBot.git
    git checkout f82bc5e
    

    is insufficient?



