More stupid Git errors THIS TIME IN FIRST-PERSON!



  • @boomzilla said:

    I can't wrap my brain around a branch being a pure point in time to begin with

    A branch is a sequence of commits where each is a point in time. If two commits get "smushed" together then a point in time is lost.

    If information is transferred from one branch to another as an atomic action, then there would be no point in time where one of the commits existed in the target branch but the other did not.

    I really don't understand why people are having a hard time with the concept....


  • Banned

    I'll repeat once again: git checkout A; git merge --no-ff B. Also, it's a minority case - most of the time when you're merging in git, it's due to push/pull, and I really doubt you want to have those 10 upstream commits squashed into one thing, or even have an explicit merge commit between B and B.
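
    A minimal sketch of the two behaviours being contrasted here; A and B are just placeholder branch names:

        # A and B stand in for any target and feature branch.
        git checkout A

        # Default: if A has not diverged, Git fast-forwards and records no merge
        # commit - A's history simply becomes B's commits.
        git merge B

        # With --no-ff an explicit merge commit is always created, so the fact
        # that B was a separate line of work stays visible in A's history.
        git merge --no-ff B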


  • ♿ (Parody)

    @TheCPUWizard said:

    A branch is a sequence of commits where each is a point in time. If two commits get "smushed" together then a point in time is lost.

    Ah, right.

    @TheCPUWizard said:

    I really don't understand why people are having a hard time with the concept....

    Sorry, wasn't integrating it all, and I don't work with a VCS that allows you to combine commits. And the way you said it referred to a branch as a point, which it isn't (unless nothing ever happens with it).

    But then...if that stuff is a problem for you...don't do it?



  • @Gaska said:

    I'll repeat once again: git checkout A; git merge --no-ff B. Also, it's minority case

    1. I don't disagree that you can do it...I am stating that to the best of my knowledge you cannot prevent someone from doing a ff....

    2. Not a minority case for the companies I work with or those that my colleagues at various companies work with. Traceability is king in many fields.
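
    For what it's worth, the closest thing to a built-in control is a per-clone default, which is exactly why it doesn't enforce anything:

        # Local configuration only - it changes the default for this clone, it does
        # not stop anyone else from fast-forwarding.
        git config merge.ff false   # "git merge" will now always create a merge commit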



  • @boomzilla said:

    But then...if that stuff is a problem for you...don't do it?

    When a tool has the ability (and especially if it is the default) to do this type of stuff then "Don't do it" is really not the greatest of options.

    My key point is that there are [IMPO] considerable advantages to centralized VCS systems for these purposes, and that DVCS is not always the answer...To me this leads to the conclusion that if DVCS is not the best option, then a tool based on DVCS that is coerced to approximate a CVCS is almost certainly not the best option.



  • That sounds an awful lot like DARCS. The main reason Linus created Git instead of using DARCS was performance, and I imagine that a semantic merge is not something that is truly necessary very often.



  • @blakeyrat said:

    @boomzilla said:
    And then you get on the plane and get arrested after ranting about how confusing the seat belt buckle is.

    They actually explain in great detail how the seatbelt works.

    Except that you have earplugs in, your hands over your ears, and are going, "La la la la. I don't want to listen," while they are explaining, and then complain about how confusing it is.



  • @TheCPUWizard said:

    We start at time 1 with a single "branch" (everything linear up to this point)....[call this A]

    Then a branch is made for some purpose [call this B]. On B, changes 2 and 3 are made as distinct commits.

    Some time later the latest version of B is brought over to A....

    If a "FF" is done then it will appear that A has an intermediate state where 2 existed, but 3 did not.

    Ok I see what you mean now.

    But it shouldn't "appear" to you that A ever pointed to 2 because, well, nothing says it did. Git says 3 was created from 2 and A points at 3 now. And that's it.

    The fact that on someone's machine somewhere A pointed to 2 at some point seems like useless information to me. Unless you're developing in production or something.



  • @superjer said:

    But it shouldn't "appear" to you that A ever pointed to 2 because, well, nothing says it did. Git says 3 was created from 2 and A points at 3 now. And that's it.

    If I look at the history of B, I want to see 3, 2, 1 (newest to oldest)
    If I look at the history of A, I want to see 4, 1 (where 4 is the merge operation)
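
    Assuming the 1/2/3/4 numbering above, and that 4 is a --no-ff merge commit, those two views fall out of stock git log:

        git log --oneline B                  # 3, 2, 1 (newest to oldest)
        git log --oneline --first-parent A   # 4, 1 - merges appear as single steps,
                                             # the side-branch commits stay hidden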


  • Banned

    @TheCPUWizard said:

    1) I don't disagree that you can do it...I am stating that to the best of my knowledge you cannot prevent someone from doing a ff....

    Yep, that's a problem. But it's a people problem. And you could probably kinda-sorta enforce it by making a hook that rejects all commits that don't start with some stupid prefix on this particular branch - this way, you cannot fast-forward, since the commit message will be invalid if you do (see the sketch after this post).

    @TheCPUWizard said:

    2) Not a minority case for the companies I work with or those that my colleagues at various companies work with. Traceability is king in many fields.

    It is always a minority case since you always push/pull more often than you merge branches. It's theoretically possible for it to be otherwise, but let's be serious - in real life, you'd never achieve it.
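
    A rough sketch of the hook idea mentioned above, written as a server-side update hook; the protected branch name and the prefix are made-up examples:

        #!/bin/sh
        # update hook: refuse to add commits to the protected branch unless their
        # subject starts with the agreed prefix, so a plain fast-forward push of
        # ordinary feature-branch commits gets rejected.
        refname="$1" oldrev="$2" newrev="$3"
        if [ "$refname" = "refs/heads/master" ]; then
            for rev in $(git rev-list "$oldrev..$newrev"); do
                subject=$(git log -1 --format=%s "$rev")
                case "$subject" in
                    MERGE:*) ;;   # carries the prefix, allow it
                    *) echo "rejected: '$subject' lacks the MERGE: prefix" >&2
                       exit 1 ;;
                esac
            done
        fi
        exit 0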



  • @Circuitsoft said:

    I imagine that a semantic merge is not something that is truly necessary very often.

    The specific use-case I gave before has a frequency that depends heavily on a number of factors. For some teams it can be quite frequent (multiple times per sprint), in others very rare...

    But...

    There are so many other things that semantic merge does that the benefits are huge.



  • Why? 3 and 2 are in A, with all the information about who created them, when, and from what!



  • @Gaska said:

    Yep, that's a problem. But it's people problem.

    Technically I agree. But when "people problems" are frequent enough and costly enough - something needs to be done about them.



  • @superjer said:

    Why? 3 and 2 are in A, with all the information about who created them, when, and from what!

    Not really, not at the timestamps of the commits, and especially not independently



  • @Gaska said:

    It is always a minority case since you always push/pull more often than you merge branches

    Depends on how branches are being used (we have been through this part before)


  • Banned

    @TheCPUWizard said:

    But when "people problems" are frequent enough and costly enough - something needs to be done about them.

    https://upload.wikimedia.org/wikipedia/commons/9/93/Police-sniper_600.jpg



  • Illegal in most locations [but sometimes desirable]...

    Seriously, everyone makes mistakes; putting things in place to safeguard against them to a reasonable degree (this is where the cost of the mistake comes into play) is quite important.


  • Banned

    @TheCPUWizard said:

    Depends on how branches are being used (we have been through this part before)

    Before every merge, people usually do a pull to be sure they have the newest revisions, and when they're done, they push their merge upstream. That's already a 2-to-1 ratio. The lowest you can get is a 1-to-1 ratio between ff-merges and non-ff-merges, which is when you never had to pull any remote changes whatsoever. Getting a lower ratio means not publishing your changes.

    @TheCPUWizard said:

    Seriously, everyone makes mistakes; putting things in place to safeguard against them to a reasonable degree (this is where the cost of the mistake comes into play) is quite important.

    What about the message hook solution?



  • @TheCPUWizard said:

    For these, having a solid and atomic view of the history of each branch is critical, without there ever being the appearance of an intermediate state which did not really exist.

    Aren't tags the correct tool for designating a point in time? (Considering Git lets you cryptographically sign tags, and even individual commits...you'd figure that'd be an ironclad enough guarantee for anyone.)
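
    For reference, the signing features mentioned here look like this (the tag name and messages are placeholders, and a configured GPG key is assumed):

        git tag -s v1.0 -m "Release 1.0"    # GPG-signed annotated tag marking a point in time
        git tag -v v1.0                     # verify the signature
        git commit -S -m "signed change"    # sign an individual commit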



  • @TheCPUWizard said:

    Not really, not at the timestamps of the commits, and especially not independently

    You're making me very worried. How is it important to know at what time during development a commit was referenced under a particular branch, rather than just when the change was created? This smells very strongly of developmestuction.



  • @superjer said:

    This smells very strongly of developmestuction.

    Actually quite far from it. There are often 5 distinct environments something goes through before it hits production.



  • @blakeyrat said:

    What was that site that reduced all the Linux arguments into little "trademarks"? Ah. Here: http://tmrepository.com/trademarks/

    A site with class. I see no 🚎 there.



  • @TheCPUWizard said:

    Have you ever worked with large teams?

    I haven't. I think it's a blessing on its own that I work with such small teams; I might know from a ticket name alone what someone is going to do at the code level, since I know our product's codebase inside out.

    I understand it's not so trivial with larger teams, but I tried to emphasize the importance of communication to make your and your co-workers' day easier if it's at all possible.

    In your example of doing a large refactor, just letting people know you are going to do that might just put them off from touching that part of the codebase until you have checked in, so they can do their work on top of yours rather than merge. Sound reasonable even at a larger scale?



  • @gleemonk said:

    A site with class. I see no 🚎 there.

    And of course, this site is served by Nginx on Linux, no less :rolleyes:


  • area_deu

    Why shoot from a helicopter while it's on the ground?
    And why is he grabbing one leg of the bipod like that?


  • Banned

    @ChrisH said:

    Why shoot from a helicopter while it's on the ground?

    Because it's the first photo I found that seemed okay enough.

    @ChrisH said:

    And why is he grabbing one leg of the bipod like that?

    It doubles as a front grip. I guess police snipers are underfunded.



  • @hifi said:

    In your example of doing a large refactor, just letting people know you are going to do that might just put them off from touching that part of the codebase until you have checked in, so they can do their work on top of yours rather than merge. Sound reasonable even at a larger scale?

    The key word there is "might". Yes, communication is key, but it is not perfect.

    I am really baffled by the people who are so dead set against using a tool (in this case a CVCS) that meets the requirements for a team perfectly, and who instead insist on using a tool (in this case a DVCS) conceived for a completely different type of environment. Seems like the reverse of "When all you have is a hammer, everything looks like a nail"....


  • Discourse touched me in a no-no place

    @Circuitsoft said:

    semantic merge

    The main problem with semantic merging is that you have to actually understand the semantics of what you're merging. Computers are quite good at fucking that up, and you don't always have the tools available. (A lot of projects are in multiple languages, even if one of those “languages” is something like HTML or JSON.)

    @TheCPUWizard said:

    The "Test Case" that I use

    If you're hitting that case a lot, someone needs a quick COMPLAIN-slapping in a code review for not following conventions.



  • When your argument hinges on the argumentative failures of others, you shan't be picky.


  • Discourse touched me in a no-no place

    @TheCPUWizard said:

    I regularly work with clients who have a single TFS Team Project Collection (Source Control Repo) that is measured in multiple terabytes.

    If you're dealing with a single repository for everything that a corporation does, that's quite achievable. It's nothing like as likely once you do the decomposition down to having a repository-per-project (whether that's stored on a central server or not). One of the things that the various DVCS systems are all pretty opinionated about is that having a single repository for everything is a poor idea. After all, why would the state of Product ABC have much impact on Product XYZ or vice versa? Or were you assuming that the corporation only does one thing? 🚎

    A core feature of all DVCSs is that it is the overall directory tree that is the actual versioned unit. Once you use that model (and it's very difficult to create a distributed VCS without it) then megarepositories cease to be sensible. Like they were particularly sensible in the first place; the key problem I've noticed is that it becomes progressively more difficult to actually find anything in them. (We used to use one SVN repo for all our group projects and it was really difficult to actually see what was what as code would get copied all over the place “just because”. We also had tags within branches within tags. Because Raisins!)

    @TheCPUWizard said:

    the desire to have the ability to do a comprehensive audit across the entire base.

    That's going to be an absolute horror in reality. Just checking out terabytes of stuff is going to be significantly annoying, and then auditing will actually require reading all that stuff. Even with strong code standards that have been religiously enforced (heh!) that's going to be an awfully difficult task.



  • dkf - I am not disagreeing with you regarding the use of a DVCS - my point is the exact opposite. There are many situations where I completely agree that a DVCS is a great approach. My argument is that there are also many places where a CVCS is a great approach, and one should use the right tool to achieve their goals.

    As far as the audit part, it really is quite easy with certain CVCS systems. TFVC [the centralized version control that is available with TFS - note: TFS also fully supports using Git for those organizations wanting a distributed model] has everything stored in an OLTP database. As we should all know, querying a properly indexed database with even a billion rows can be quite fast - but that is not all...Since the OLTP store is SQL Server, there is also an OLAP cube which can be configured with all of the dimensions, measures, and facts to provide even faster reporting capabilities.

    In the above situation there is ZERO need to "checkout" (i.e. retrieve) any of the actual content [though that could be a part of the final analysis after the items have been determined, and will be an extremely time-consuming task no matter what if there turns out to be a huge number of files].
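
    As a rough illustration of auditing change history without retrieving any content, the TFVC command-line client can report it directly (the server path and collection URL below are placeholders; the warehouse/OLAP cube described above is the other route):

        tf history "$/Product" /recursive /format:detailed /noprompt /collection:https://tfs.example.com/tfs/DefaultCollection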



  • @TheCPUWizard said:

    I am really baffled by the people who are so dead set against using a tool (in this case a CVCS) that meets the requirements for a team perfectly, and who instead insist on using a tool (in this case a DVCS) conceived for a completely different type of environment. Seems like the reverse of "When all you have is a hammer, everything looks like a nail"....

    Might be the hammer talking, but git-svn is kind of nice and it's basically how a lot of people use git anyway, except they don't use svn as the remote but an actual git one. DVCS doesn't mean you need to stop thinking about a centralized repo; like I pointed out earlier, that mystical single point needs to exist where everyone pushes their stuff in the end. With git-svn that is the svn branch you are working on, usually trunk, because people hate branching and merging in svn.

    In the case of Linux, Torvalds' repo is the one your work is eventually supposed to end up in. There can be many levels of patches, repositories and rewrites before some code gets there, and that's where git, and DVCS in general, is great: you can have a long chain and still maintain your authorship and code integrity.

    On the other hand, one misconception is that DVCS somehow makes it a lot harder to work centralized with one remote repo. I can't see the issue, since it just makes your local work easier to manage when you have more options to handle your commits and unfinished features.

    Also, I'm not really against Subversion or CVCS; it's the client-side tools that suck. I could use git-svn with it and ignore the fact it exists on the other end, because that's how I use a remote git repository anyway. It's the local jumping around with many unfinished or unrelated things that is a lot harder with the Subversion client. If it had git-like features like local private branching and stashing, it would be quite nice actually. But who needs that when git-svn exists? Maybe TFS has these kinds of features, or they are planned for svn.
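
    A rough git-svn round trip of the kind described (the repository URL is a placeholder):

        git svn clone -s https://svn.example.com/repo project   # -s assumes the standard trunk/branches/tags layout
        cd project
        # ...local commits, private branches, stashing, as usual...
        git svn rebase    # fetch new trunk revisions and replay local work on top of them
        git svn dcommit   # push each local commit upstream as an individual svn revision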



  • @hifi said:

    DVCS doesn't mean you need to stop thinking about a centralized repo; like I pointed out earlier, that mystical single point needs to exist where everyone pushes their stuff in the end.

    The key is that it gets there (maybe!!!) eventually... Consider a team working in 45-minute bursts [every 45 minutes, on average, each developer commits new, working, tested code]. Now if there are 10 developers on the team, this equates to an average check-in every 5 minutes (I rounded up).

    To maintain this pace the team needs to have detailed information about what is going on. Given that every commit should be associated with a Task (multiple commits are typically required to complete a Task) and contain ONLY those elements specific to that task, we have a good solid basis...but only if the material is in a repository that is accessible by all of the potentially involved people.

    Getting a notification when a person starts working on a file [non-exclusive checkout from the central repository] is a great trigger (which can be set up via dynamic subscriptions to be extremely well targeted). This can lead to communication [often chat, at least to start] or many other courses of action.

    But when people are clueless as to exactly what other people are doing, this type of coordination becomes impossible.

    Again, I am not saying that this approach is "globally right" or even "better" in the general sense. But there are a large number of cases where it is the most effective, and using tooling that supports it by design, instead of attempting to find workarounds for a tool that was deliberately designed for the antithesis is almost certainly a poor choice.



  • @TheCPUWizard said:

    Again, I am not saying that this approach is "globally right" or even "better" in the general sense. But there are a large number of cases where it is the most effective, and using tooling that supports it by design, instead of attempting to find workarounds for a tool that was deliberately designed for the antithesis is almost certainly a poor choice.

    Using the right tool for the right job is always the better choice. If you can and want to automate it at the VCS level, fine; there's no reason why it wouldn't work or why it should be avoided if your tooling and workflow are built around that.

    The notification about someone starting work on some file is nice if you also get the info what task it is related to so you have an idea of the implications. I don't see how this could be sanely done with git, because there's no single point in time or branch or whatever at which you can determine that some particular work has been started, unless you create tooling for that around git. I don't know if that's a good idea at all though.


  • Banned

    @hifi said:

    git-svn

    No. Just, no.


  • Discourse touched me in a no-no place

    @TheCPUWizard said:

    Getting a notification when a person starts working on a file [non-exclusive checkout from the central repository] is a great trigger (which can be set up via dynamic subscriptions to be extremely well targeted). This can lead to communication [often chat, at least to start] or many other courses of action.

    So… you're getting a notification every 5 minutes and you're expecting to get some reading of WTDWTF work done too…


  • ♿ (Parody)

    @hifi said:

    The notification about someone starting work on some file is nice if you also get the info what task it is related to so you have an idea of the implications.

    I would turn that shit off pretty damn fast. STFU and let me work!



  • @boomzilla said:

    @hifi said:
    The notification about someone starting work on some file is nice if you also get the info what task it is related to so you have an idea of the implications.

    I would turn that shit off pretty damn fast. STFU and let me work!

    Might only work with small teams? I prefer talking instead of notifications.


  • ♿ (Parody)

    Even on a small team. I work on a small team. I don't want to know that Ed Juan opened up a file and started typing. Odds are it won't conflict with whatever I'm doing. It's just noise. I don't need that noise. I suppose there are teams with the problems that @TheCPUWizard is talking about here, but I can't recall anything like that ever biting me.


  • kills Dumbledore

    There's a difference between getting a notification and seeing that Bill has the file checked out for edit when you go to check it out yourself



  • @boomzilla said:

    I don't want to know that Ed Juan opened up a file and started typing. Odds are it won't conflict with whatever I'm doing

    This is why targeted subscriptions for notifications are key for the times when they are necessary - such as a case I had last week...

    I was going to re-org some files by splitting class definitions from single files into partial classes in multiple files; I was also going to be moving where these files lived in the directory structure. Based on knowing what tasks others had active, and the files previously associated with delivering related functionality (mapped via user stories), I also knew that there was a high potential that some of the other developers might need to work on these files.

    Their work was higher priority than mine [if the inverse was true, I would have simply locked the files]. I could have repeatedly looked to see if they had a checkout (because of CVCS), but it was much simpler to set up a notification.

    About 30 minutes into my work, one of the other developers did indeed check out one of the files. I contacted him, and found out that his changes were going to be significant. So, I put that file back to its original condition, finished the other files, suspended my work [ShelveSet; see the sketch after this post], and worked on another task until I got the commit notification. As soon as I got that notification, I suspended the other work, resumed the original effort, changed the file, and committed (queued a gated check-in) within about 10 minutes [of which >5 were spent running the impacted tests locally].

    I later found out that a 3rd developer had re-ordered his day's work because of notifications based on what I was doing.

    The real benefit was that all three of us ended up in the same Rolling CI [which runs many more tests than the Gated] without there being any files which resulted in any type of merge (even automatic merges which result in code which compiles have the risk of introducing side effects), meaning the code made it to the next stage of the pipeline in that push rather than in multiple updates (which would have incurred additional costs).
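
    Roughly what the suspend/resume dance above looks like with the TFVC command-line client (the shelveset name and check-in comment are made up):

        tf shelve "refactor-partial-classes" /move   # park the in-progress refactor and clear the workspace
        # ...work on the other task until the colleague's check-in notification arrives...
        tf unshelve "refactor-partial-classes"       # resume the refactor where it was left off
        tf checkin /comment:"split classes into partial files" /recursive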


  • I survived the hour long Uno hand

    @boomzilla said:

    I don't want to know that Ed Juan opened up a file and started typing

    I want to know whenever Intern Billy decides he needs to edit the Grunt script I spent six hours working out, though.


  • ♿ (Parody)

    To be fair, that's a supervision of junior employees issue, not a source control issue.


  • I survived the hour long Uno hand

    If your entire team consists of intelligent, senior-level devs, and you never take on interns or train new associates, then can I work for you? And how many virgins would I get?


  • ♿ (Parody)

    @Yamikuronue said:

    If your entire team consists of intelligent, senior-level devs, and you never take on interns or train new associates, then can I work for you?

    My team has a "senior" person who needs to have her hand held like an intern.


  • kills Dumbledore

    Sounds like a sexual harassment suit waiting to happen


  • I survived the hour long Uno hand

    See adjective one in that description :)


  • ♿ (Parody)

    Yes, I was providing a counter-example to prove that the entire team wasn't that.



  • MAN GIT IS SURE SUCKY SHlT SAUCE CRAPY DUNG!!!



  • @blakeyrat said:

    MAN GIT IS SURE SUCKY SHlT SAUCE CRAPY DUNG!!!

    Ah yes, there's our dose of Blakeyrat.

