Do You Git It? A Version Control System Developers Love

  1. Git is the tool development teams should be using for version control. Developed for developers by developers, it makes doing version control pretty darn easy. There is a learning curve associated with Git, but you can beat it by following the steps outlined in the following article:

    Introductory Guide to Git Version Control System


    Advantages of Using Git

    • Git is super easy to install: I will take you through the installation process – it’s a breeze.
    • Git is easier to learn compared to other systems: by the end of this guide, you will have enough knowledge to get going with Git.
    • Git is fast: So much so that it doesn’t become one of those things you have to force yourself to remember to do and you can integrate it seamlessly with your current workflow.
    • Git is decentralized: If many people are working on a project, they each can have their own copy and not save over each other.
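
    To make "super easy" concrete, a first session might look like the sketch below. The project name, file, and commit message are made up purely for illustration:

```shell
# A minimal first session -- repo name and file are hypothetical.
mkdir myproject && cd myproject
git init                                  # create a new local repository
git config user.name "Your Name"          # identify yourself to git (per repo,
git config user.email "you@example.com"   # or use --global for all repos)
echo "hello" > readme.txt
git add readme.txt                        # stage the file
git commit -m "First commit"              # snapshot it locally -- no server involved
git checkout -b experiment                # branching is cheap and instant
```

    Everything above happens on your own machine; sharing with a team only enters the picture once you add a remote and push.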

    Disadvantages of Using Git

    • Git has a learning curve: Whilst I did say that it’s one of the easier version control systems to use, any new thing you introduce to your workflow will need some learning time. Learning Git will be similar to learning a new software application such as Word or Excel.

    Introductory Guide to Git Version Control System

    Threaded Messages (19)

  2. Cameron's commentary

    I use Git and only use Git, but Cameron's claim that "Git is the tool development teams should be using for version control" is pretty strong.  There are plenty of reasons that other teams "should" be using other VCSs.

  3. Last year I tried using Git and found that the Windows clients weren't quite there yet. I remember trying several and feeling unsatisfied. The Eclipse Git integration is also lacking. Has that improved? Anyone care to share their experience?

  4. Last year I tried using Git and found that the Windows clients weren't quite there yet. I remember trying several and feeling unsatisfied. The Eclipse Git integration is also lacking. Has that improved? Anyone care to share their experience?

     

    I use it on Windows and Linux, but I'm typically using it via the command line, which works fine for me on Windows.  There is Git Extensions, which has a decent GUI for Windows.  IIRC, the Eclipse integration was pretty much there, but not everything.

  5. Last year I tried using Git and found that the Windows clients weren't quite there yet. I remember trying several and feeling unsatisfied. The Eclipse Git integration is also lacking. Has that improved? Anyone care to share their experience?

    My team is using git bash + gitk + git gui from msysgit, plus some use tortoisegit on top of that.

    Haven't tried the latest Eclipse Git plugin. I've also heard that IntelliJ IDEA has very good support for Git.

  6. A version control system some developers love.

    * Continuous integration doesn't fit well into git. I've seen people claiming the opposite - they redefine either "continuous" or "integration". Or both.

    * Git doesn't work well with binary content

    * The article basically says nothing on working with shared code. That's where the "easy to learn" claim becomes questionable.

    It would be nice if the article mentioned http://code.google.com/p/tortoisegit/ . This tool can help many svn/cvs developers to get started.

    ---

    I've been working with git for the past 2 years, and I'd be very careful in VCS selection.

  7. Hmm.. having worked extensively with Git I find some of your comments confusing:

    • Continuous integration certainly can work well with Git.  Certainly there are no technical reasons it can't. Obviously where you run it depends how you do your releases and branching.  Most companies that I have seen utilize remote branches which individual developers push their changes to.  A natural place for continuous integration. Certainly nothing stopping you using continuous integration on your local branches either.
    • Git works great with binary content.  Some of the earlier versions might have had trouble, but that hasn't been the case for quite a while now.
    • How does it not work well with shared code?  One of Git's (and other DVCSs') main strengths is its branching model and its merging capabilities.
  8. Well, I'll try to explain.

    * Continuous integration certainly can work well with Git.  Certainly there are no technical reasons it can't. Obviously where you run it depends how you do your releases and branching.  Most companies that I have seen utilize remote branches which individual developers push their changes to.  A natural place for continuous integration. Certainly nothing stopping you using continuous integration on your local branches either.

    Well, 2 technical reasons. One is the topic branches model. People work in/push to separate branches that aren't integrated until someone decides to merge or rebase. One can set up a CI tool to build those branches, but that's certainly not integration, even though formally a CI tool is in use.

    Another one is local branches/"offline" commits. One can (and typically would) make several commits to a local branch, and then push it. No one is aware of the commits until the branch is pushed. And then fetched by other developers. That's where "continuous" is undermined.

    If anyone can share a specific recipe for truly continuous integration with git (or any other dvcs for that matter), that would be just awesome.

     

    * Git works great with binary content.  Some of the earlier versions might have had trouble, but that hasn't been the case for quite a while now.


    I'm sure it works as great as possible for a dvcs. Still, all versions of all files are stored in, and transferred over the network to, every local and remote repo. When a repo is mostly text, and binary content is mostly static, that's fine. And fortunately this is the typical case for most development repos. If a repo has a lot of binary content that changes frequently, though, repo size will grow rapidly, with all the implications for clone/fetch.

    * How does it not work well with shared code?  One of Git's (and other DVCSs') main strengths is its branching model and its merging capabilities.

    I said nothing on how well it works, nor did I question its branching and merging capabilities.
    I did question the "easy to learn" claim, though. E.g. imagine a mental exercise for someone new to git who did "git fetch" and noticed "(forced update)" for the branch he forked from. What does it mean? What are the next steps?
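
    For what it's worth, the usual resolution of that puzzle -- assuming the remote branch was deliberately rewritten, and with hypothetical branch names -- is to replay your own commits onto the rewritten tip:

```shell
# `git fetch` printed something like:
#   + abc1234...def5678 topic -> origin/topic  (forced update)
# meaning the remote branch history was rewritten under you.
git checkout topic
git rebase origin/topic      # reapply your local-only commits onto the new tip
# If stale, abandoned commits get picked up by mistake, the surgical form is:
#   git rebase --onto origin/topic <old-base> topic
```

    Knowing that this is even a question a newcomer has to face is, of course, exactly the learning-curve point being made here.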

    I've watched different developers learning git and svn, and I've seen how git was harder to master for them. And myself.

  9. Can't agree with your comments

    Well, I'll try to explain.

    * Continuous integration certainly can work well with Git.  Certainly there are no technical reasons it can't. Obviously where you run it depends how you do your releases and branching.  Most companies that I have seen utilize remote branches which individual developers push their changes to.  A natural place for continuous integration. Certainly nothing stopping you using continuous integration on your local branches either.

    Well, 2 technical reasons. One is the topic branches model. People work in/push to separate branches that aren't integrated until someone decides to merge or rebase. One can set up a CI tool to build those branches, but that's certainly not integration, even though formally a CI tool is in use.

    Sure don't follow you there.  If there has been no integration (no merge or rebase), there really isn't a need for an integration test.

    Any CI tool that can run a job from one branch of a vcs can certainly be configured to run multiple jobs, each from a different branch.  Any branch that CI runs from would be an 'integration' branch.  This applies to git, svn, whatever.  In svn, run a CI job on trunk and any number of 'branches' that have ongoing development.  Same with git.

    Another one is local branches/"offline" commits. One can (and typically would) make several commits to a local branch, and then push it. No one is aware of the commits until the branch is pushed. And then fetched by other developers. That's where "continuous" is undermined.

    The idea of pushing a branch is 'publicizing it' to a central location.  With respect to CI, there is no distinction between using git to push a branch or svn to commit to a branch.

    It's not necessary to run CI on every branch on each developer's machine.  Why bother?  You don't HAVE to work that way.  It seems you have an assumption that every branch has to pass tests on every commit.  The vcs technology in use is irrelevant to this question.

    If anyone can share a specific recipe for truly continuous integration with git (or any other dvcs for that matter), that would be just awesome.

    Define a workflow that targets maybe two branches for CI: a release maintenance branch, and a next release branch.  The workflow won't release commits until they are committed to one or the other of these branches.

     

    * Git works great with binary content.  Some of the earlier versions might have had trouble, but that hasn't been the case for quite a while now.


    I'm sure it works as great as possible for a dvcs. Still, all versions of all files are stored in, and transferred over the network to, every local and remote repo. When a repo is mostly text, and binary content is mostly static, that's fine. And fortunately this is the typical case for most development repos. If a repo has a lot of binary content that changes frequently, though, repo size will grow rapidly, with all the implications for clone/fetch.

    This indeed is an extreme condition.  On a clone, the size copied is a function of the size of the image and the number of times it changed.  However, I would still use git in this case because git still saves time and bandwidth.  Compare the download patterns of git and svn.  Let's say a project has 3 images, each changed 3 times: 9 images in the repo.  One git clone copies down all 9 images once and for all.  Each svn checkout copies down 3 images.  It would take only 3 checkouts in this pattern for git to be more efficient.  Then, let's say somebody changes one of the images.  In git, the changed image is fetched once.  In svn, the image has to be downloaded for every local checkout.  git wins.

    * How does it not work well with shared code?  One of Git's (and other DVCSs') main strengths is its branching model and its merging capabilities.

    I said nothing on how well it works, nor did I question its branching and merging capabilities.
    I did question the "easy to learn" claim, though. E.g. imagine a mental exercise for someone new to git who did "git fetch" and noticed "(forced update)" for the branch he forked from. What does it mean? What are the next steps?

    We have 20 years of code control culture (sccs -> rcs -> cvs -> svn) embedded in our dna.  Doing something different will hurt a bit.  I agree git is not good for every situation.  However, in the situation where git is a good fit, overly weighting the learning effort during evaluation of git is a huge mistake in the long term (IMHO).

    I've watched different developers learning git and svn, and I've seen how git was harder to master for them. And myself.

  10. (cutting a few quotations of myself to keep things readable)

    Sure don't follow you there.  If there has been no integration (no merge or rebase), there really isn't a need for an integration test.

    Well, this may be the core of the misunderstanding. CI is about integrating every meaningful change (commit) as soon as you have it. If you don't do that, your integration is not continuous.

    Any CI tool that can run a job from one branch of a vcs can certainly be configured to run multiple jobs, each from a different branch.  Any branch that CI runs from would be an 'integration' branch.  This applies to git, svn, whatever.  In svn, run a CI job on trunk and any number of 'branches' that have ongoing development.  Same with git.

    Indeed, there are no problems with setting up a job against a branch. Or several jobs against several branches. The problem is how you get an up-to-date branch that integrates all changes from all committers. With svn or another centralized vcs that happens naturally - people commit to a shared trunk, every commit is integrated and is available for the CI tool to pull from the remote repo.

    With git you have topic branches. In a team of 10 developers working in topic branches, what is the way to get a branch that integrates their work in an automated fashion? I'm personally not sure there is a way.

     

    The idea of pushing a branch is 'publicizing it' to a central location.  With respect to CI, there is no distinction between using git to push a branch or svn to commit to a branch.

    Yes, somewhat. It does also show the difference of git commit vs svn commit for CI. With svn you make a meaningful change (commit), and it is available for CI immediately. That is not the case for git. In order to achieve the same effect, it would be the developer's responsibility to push the branch immediately after each commit. Which would be a really strange use of git.

    Its not necessary to run CI on every branch on each developer's machine.  Why bother?  You don't HAVE to work that way. 

    Absolutely. I would never do that. I prefer a dedicated machine for the CI tool. And I need to feed it with the current integrated result of everyone's work. And getting the latter is problematic.

    It seems you have an assumption that every branch has to pass tests on every commit.  The vcs technology in use is irrelevant to this question.

    Well, here is the short, vcs-neutral requirement for CI: every meaningful change by every developer should be integrated into the result of the teamwork right away, and validated automatically right away.

    Define a workflow that targets maybe two branches for CI: a release maintenance branch, and a next release branch.  The workflow won't release commits until they are committed to one or the other of these branches.

    Thank you. This is similar to what my team is doing now - we have a workflow that includes review, rebase against mainline (or 2 mainlines, if the branch targets the "current" and "next" release, just like you described), and a build on Hudson. After that, the branch can be merged.

     

    (On frequently changing binary content)

    This indeed is extreme condition.

    Several times I've seen projects keeping requirement specs, design artifacts and project plans under version control side by side with code. Another interesting case I've seen was Drools rules in Excel in the repo. That doesn't happen often, but I wouldn't call it an extreme condition. I would for database dumps or build products under version control.
    (There are also old-school ant-backed projects that have libraries checked in, but these don't change often.)

    Compare the download patterns of git and svn.  Let's say a project has 3 images, each changed 3 times: 9 images in the repo.  One git clone copies down all 9 images once and for all.  Each svn checkout copies down 3 images.  It would take only 3 checkouts in this pattern for git to be more efficient.  Then, let's say somebody changes one of the images.  In git, the changed image is fetched once.  In svn, the image has to be downloaded for every local checkout.  git wins.

    That would be true if you checked out the svn project each and every time. But the normal workflow is to check out once, and update onwards. Update only gives the most recent version, and this is also what the working copy stores. I'd say for this scenario the only chance for git to win in bandwidth/storage is if the usage pattern implies going back in history often.

    We have 20 years of code control culture (sccs -> rcs -> cvs -> svn) embedded in our dna.  Doing something different will hurt a bit.  I agree git is not good for every situation.  However, in the situation where git is a good fit, overly weighting the learning effort during evaluation of git is a huge mistake in the long term (IMHO).

    Well said, and this also explains my point with slightly different accents.

  11. (cutting a few quotations of myself to keep things readable)

    Sure don't follow you there.  If there has been no integration (no merge or rebase), there really isn't a need for an integration test.

    Well, this may be the core of the misunderstanding. CI is about integrating every meaningful change (commit) as soon as you have it. If you don't do that, your integration is not continuous.

    Ok, I'll also use that term: 'meaningful change'.  It seems to say that some changes are not meaningful.  We are not applying any rule as to what is meaningful here, leaving that to the developer shop.  It makes sense that a change that is not meaningful is not worth putting through integration with meaningful changes.

    Any CI tool that can run a job from one branch of a vcs can certainly be configured to run multiple jobs, each from a different branch.  Any branch that CI runs from would be an 'integration' branch.  This applies to git, svn, whatever.  In svn, run a CI job on trunk and any number of 'branches' that have ongoing development.  Same with git.

    Indeed, there are no problems with setting up a job against a branch. Or several jobs against several branches. The problem is how you get an up-to-date branch that integrates all changes from all committers. With svn or another centralized vcs that happens naturally - people commit to a shared trunk, every commit is integrated and is available for the CI tool to pull from the remote repo.

    To be more precise, only meaningful changes are committed into a shared branch ('trunk' is merely the name of a branch in svn; by community convention, not by the svn software, it is the 'next release' branch).  Users of git can also have a central repo with a 'trunk' branch; they just have not carried over the convention of calling it 'trunk'.

    You lose me with the term 'happens naturally'.  I'm left guessing what you mean.  Users have to have write permission to the repo.  They have to deal with merging their commits with changes on that branch.  They have to test the result of their merge.  They have to be responsible for the result of the commit.  Integration of a meaningful change is intentional, manual and consumes some amount of time.

    With git you have topic branches. In a team of 10 developers working in topic branches, what is the way to get a branch that integrates their work in an automated fashion? I'm personally not sure there is a way.

    'topic branch' is a matter of user convention and can be supported by svn, by convention, as well. svn lets developers create 'topic branches' by convention in the 'branches' subdirectory.  So, your paragraph above could have easily started as: "With svn you have topic branches...".  I'm not sure what you intend to say by 'integrates in an automated fashion', but the question is due to the use of topic branches, not to any inherent feature of the vcs tool.  In other words, if topic branches don't work, don't use them.  svn and git do not prevent the creation of a topic branch, so in both cases this is a matter of a developer conforming to the site's workflow.

    If I can take a guess at what you want with 'integrates topic branches in an automated fashion': maybe you expect some automated tool to gather all topic branches together into a single build for integration testing?  The issues here are the tool does not know which topic branches have meaningful changes and which do not, and the tool has no way of resolving merge conflicts that would occur.  Both of these require manual, deliberate decisions.  So, if this is what you mean, I would agree there is no automated solution.

    But the bottom line for me is that dvcs and cvcs both support topic branches and both do not require use of topic branches.  Whether they are put to use or not depends on the developer shop's workflow.

    The idea of pushing a branch is 'publicizing it' to a central location.  With respect to CI, there is no distinction between using git to push a branch or svn to commit to a branch.

    Yes, somewhat. It does also show the difference of git commit vs svn commit for CI. With svn you make a meaningful change (commit), and it is available for CI immediately. That is not the case for git. In order to achieve the same effect, it would be the developer's responsibility to push the branch immediately after each commit. Which would be a really strange use of git.

    Another way of looking at this: commits to cvcs are visible to everyone and so every commit must be meaningful change; whereas commits to dvcs are not visible to everyone so not every commit has to be meaningful change.  dvcs allows me to work through a series of private commits into a single meaningful change.  This gives the advantage of further incrementalizing (?) the change.  I can privately commit a single unit of work even if my effort as a whole is still unstable.  Smaller increments of change are easier to manage.  cvcs users might avoid committing anything until the full single meaningful change is ready.  On changes that take a week or more to implement, I prefer private change control (i.e., all intermediary change is 'not meaningful' :)  For testing of these private commits, I invoke unit tests manually.

    Not sure what you mean by 'strange'.  "Life is strange, but compared to what?".  I have the feeling your constraint to get every commit under a CI run ("Every commit is a meaningful change") is somewhat contrived.  If I had to work that way, I would defer commit until I knew CI test would pass, potentially retaining a large amount of uncommitted work in my local file system without a change history log.  I would have to work out an alternate backup (not using the vcs).  Sorry, I don't like this constraint.

    It seems you have an assumption that every branch has to pass tests on every commit.  The vcs technology in use is irrelevant to this question.

    Well, here is the short, vcs-neutral requirement for CI: every meaningful change by every developer should be integrated into the result of the teamwork right away, and validated automatically right away.

    Yup. I'm working from that view also.  I, for one, have not been convinced that dvcs is obstructing this requirement, which I understand was your main point to begin with.

    Define a workflow that targets maybe two branches for CI: a release maintenance branch, and a next release branch.  The workflow won't release commits until they are committed to one or the other of these branches.

    Thank you. This is similar to what my team is doing now - we have a workflow that includes review, rebase against mainline (or 2 mainlines, if the branch targets the "current" and "next" release, just like you described), and a build on Hudson. After that, the branch can be merged.

    Sorry, I don't use rebase.  It changes history, which tangles merges.  I like 'merge --squash' for merging a 'meaningful change' into a shared branch.
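
    That preference boils down to a few commands -- the branch names here are hypothetical:

```shell
# Fold a finished topic branch into one commit on the shared branch.
git checkout next-release
git merge --squash topic     # stage the combined result of all topic commits
git commit -m "Add feature X as a single meaningful change"
# The topic branch's messy intermediate commits never enter shared history.
```

    The shared branch then records exactly one commit per meaningful change, with no history rewriting involved.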

     

    (On frequently changing binary content)

    This indeed is extreme condition.

    Several times I've seen projects keeping requirement specs, design artifacts and project plans under version control side by side with code. Another interesting case I've seen was Drools rules in Excel in the repo. That doesn't happen often, but I wouldn't call it an extreme condition. I would for database dumps or build products under version control.
    (There are also old-school ant-backed projects that have libraries checked in, but these don't change often.)

    Sorry, I meant that binary files having a performance impact on git is an extreme condition.  It's not extreme to commit binaries into the repo.  It's extreme to have large binaries, modified often, in such a manner as to impact the performance of git.

    Compare the download patterns of git and svn.  Let's say a project has 3 images, each changed 3 times: 9 images in the repo.  One git clone copies down all 9 images once and for all.  Each svn checkout copies down 3 images.  It would take only 3 checkouts in this pattern for git to be more efficient.  Then, let's say somebody changes one of the images.  In git, the changed image is fetched once.  In svn, the image has to be downloaded for every local checkout.  git wins.

    That would be true if you checked out the svn project each and every time. But the normal workflow is to check out once, and update onwards. Update only gives the most recent version, and this is also what the working copy stores. I'd say for this scenario the only chance for git to win in bandwidth/storage is if the usage pattern implies going back in history often.

    I'll have one checkout for the release maintenance branch so I can respond to production emergencies.  I'll have one for the 'next release' branch for my main project of the week.  I'll get re-prioritized before I'm done with my main project of the week, and I'll have a branch or two hanging out that my manager has not yet declared as 'meaningful change'.  Sometimes, I screw up a merge so badly, it's better if I re-checkout head and load my changes atop that.  So, I will normally have more than one checkout; for me, git wins.

     

    Regards.

  12. Re: Can't agree with your comments

    Ok, I guess I'll just acknowledge there is a good amount of disagreement and misunderstanding in this discussion, so I'll leave it at that, instead of arguing on multiple points:

    * Whether every single commit should be a meaningful change, and whether it should be integrated.
      - and whether it has any implications on commit granularity.

    * Which pattern fits naturally and is typically used with cvcs and dvcs, and which isn't: topic branches vs shared trunk.

    * Whether intermediate commits that make your work unstable are good.

    * What does it mean from CI perspective to implement change that takes a week under a private change control.

    * Whether it is good or bad to defer commit until it passes the test.

    * Whether or not integrating on every commit results in large amounts of uncommitted work.

    And I'll finish with a neutral, trivial yet relevant statement: when selecting a vcs, one should understand the implications and have a plan.

  13. * Continuous integration doesn't fit well into git. I've seen people claiming the opposite - they redefine either "continuous" or "integration". Or both.

     

    Why? A continuous integration system only needs to repeatedly check out the code, in read-only mode. Where is the difficulty here?

  14. A continuous integration system only needs to repeatedly check out the code, in read-only mode. Where is the difficulty here?

    That is the responsibility of the CI tool. CI as an engineering practice is more than that. It requires code changes to be integrated from the beginning and continuously, and validated by the CI tool right away.

    I elaborated this point a few messages above.

  15. GIT has just too many pitfalls that mediocre folks fall into. Try explaining when and when not to use rebasing to your team. Half the folks' heads will asplode.

    For 99% of shops out there, Mercurial is the sane alternative (about the same features, half the hassle).

  16. when not to use rebase

    Like in cvs/svn or any vcs, don't change history after sharing the project.  This is a fundamental rule: don't change history after sharing the project.  How is that hard to understand?  git has ways of changing history, rebase is one of them.  Don't use rebase on shared commits.  The difficulty here is what?
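
    Concretely, the safe case looks like this (branch names are hypothetical); the unsafe case is running the same command on commits that are already on the shared server:

```shell
git checkout topic
git fetch origin
git rebase origin/master     # replay only your unpublished commits onto the new tip
# Rule of thumb: if `git log origin/topic..topic` shows nothing, every commit on
# the branch is already shared -- don't rebase it.
```

    Rebasing unpublished work rewrites only history that nobody else has, so the fundamental rule is never violated.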

  17. when not to use rebase

    Like in cvs/svn or any vcs, don't change history after sharing the project.  This is a fundamental rule: don't change history after sharing the project.  How is that hard to understand?  git has ways of changing history, rebase is one of them.  Don't use rebase on shared commits.  The difficulty here is what?

     

    Didn't you read my comment? The difficulty is getting the idea across. What you just said makes no sense to most developers.

    Rebasing is supereasy indeed, yet people **** it up all the time. For proof, search LKML.

     

  18. GIT has too much detail

    I'm using GIT as a member of the JFXtras project and BZR in some other projects. But I'm advising our corporate system to switch to BZR. GIT works fine, but you too often have to actually think about / understand what you are doing, and that puts up a threshold. I want developers to be able to just check in their changes, without having to put everyone through a distributed revision control course.

    This means the default needs to be SVN-like, because that is what is used most (otherwise it would not be so popular), and optionally local check-ins and other repository dynamics should be possible. Available for those interested. BZR allows for different operating modes, and SVN-like is one of them.

  19. I have a problem with the assumption that 'other' systems are hard and we should all therefore switch to git.

    I really would be concerned about the capabilities of any developer who finds Subversion/Tortoise SVN hard to use. What's so hard to use about it?

    I didn't even find it hard to install and set up: I'm not a system admin, but I installed the server, installed the Tortoise SVN client, and it all just works really well.

    Perhaps if someone is currently using UCM/ClearCase/ClearQuest or PVCS or something then I can understand the 'hard to use' claim but not Subversion.

    Nothing could be simpler to use than Subversion - and the 'multiple repositories' concept of git is just going to help people make a complete mess with git. The longer multiple people leave resyncing their repository with another repository, the greater chance they will have of an unmanageable merge effort.

  20. toys for tots

    git is a toy that appeals to script kiddies and their lot

    it is not suited for serious work

     

    the whole "distributed version control" thing is an oxymoron

    it encourages bad habits by undisciplined 'developers'

     

    git is not a vcs, it is a tinkertoy for endlessly playing with merging files,

    which is a far cry from true version control

     

    the best evidence for all this is the fanboi zealotry itself surrounding git

    look! a bright shiny object! must be better than anything else!