Thursday, 13 March 2008

Git

The other day Ryan told me about Git and sent me a video of Linus Torvalds talking about it. Clearly I'm ugly and stupid by Linus's definition. Let's just say I don't really "get it" yet. I'll list the issues I see with it, and people can tell me why I'm wrong or right.

Here's my understanding of Git:
Git was designed for open source projects with many people working on the project at the same time. Low bandwidth, quick merges, etc. No one is special. Everyone can modify the code, but which code becomes the "lead" code is based on a network of trust among developers.

Here's why I'm not sure it will work well in a corporate environment (as they are now):

1) corporations have servers that they back up; workstations they do not. If the code lives only on workstations, I see that as very dangerous: a hard drive failure or a virus could wipe out your whole project (generally all workstations are built from the same image, and usually they all run Windows).

2) for smaller projects with 1 or 2 developers, this whole network of trust isn't needed. You can't tell your boss "Fred spent 2 weeks on that feature, but I don't trust him, so I'm not going to take it". If you don't trust someone else's code in your company, that's a different issue, and I think you have to address it in a different way. The whole "no node is special, but you need a network of trust" strikes me as an Animal Farm-esque way of saying "all animals are equal, but some animals are more equal than others".

3) most project teams are co-located on the same LAN. Bandwidth is a non-issue.

4) if you have a build system that's doing continuous integration, what's it building against? Do you need to have it installed on each person's workstation?

Overall it looks pretty exciting. It looks like it does what it was designed for very well. There doesn't seem to be much tool support for it (IDE integration, etc.) yet, but that only comes when things hit a certain critical mass.

Tell me what I'm missing and/or how it would be addressed. I might be bitching about not being able to climb through the window while missing the door I'm standing beside. ;-)

6 comments:

  1. I'll start off by saying Git, like Linux, wasn't explicitly designed to work in a corporate environment. However this doesn't mean the corporate world can't adjust the tool to work well for them (and/or their processes to work well with the tool) -- Git is open source after all.
    There are lots of things about Git that Linus doesn't go over. Number one being that you *can* have a centralized repository that people "push" their changes to and pull others' changes from. GitHub.com is offering this service now.
    The Linux kernel is different: Linus takes patches from others (or pulls from other trees) and then releases his repository as pull-only for others. Linus uses his network of trust to decide who to pull from.
    Decentralized repositories mean that everyone has a full copy of the whole repository. If the central server goes down and the hard drive fails, no problem -- just grab a copy off a developer's machine (if you're a douche and you don't make backups of the central server). You can't do this with SVN or CVS or most other SCMs because they only check out "working copies".
    Git really is a paradigm shift, which requires all team members to "get it" and "buy in". If that doesn't happen, using Git is doomed in that corporate environment. Like most open source projects, the Git people really don't care about that! :) If people want to continue to use old SCMs (or none at all) that's their business.

  2. More: the centralized repository with Git solves your issues 1, 2 and 4. As for 3, I would argue that bandwidth is *always* an issue, even on a corporate LAN. Projects get big, you don't want to wait around for a "synchronize" to figure itself out. Git doesn't need to go to the network to do this or even a commit, so it's blazing fast. You only need to use bandwidth when pushing or pulling, and that's compressed traffic!

  3. I think that for any paradigm shift, you need to get everyone to "get it" and buy in. Otherwise you just end up with a hybrid solution, which usually works worse than either of the things the hybrid was based on.
    Thank you for pointing out that you can use a central repo with Git... you're right, that addresses the issues that I can think of (other than tool support being young).
    I knew you'd correct me. :-D

  4. The command line tools for Git are great. There's also a GUI-based tool that helps with merging. I'm not sure any more tool support is needed, so I wouldn't worry too much about that. There's also an Eclipse plugin, but it's probably just a wrapper for the command-line tools.

  5. I'm thinking of tools like Maven that can be tied to different source control systems (through the scm section of the pom.xml file). Same with other systems that will display diffs in a webpage or send an email based on a changeset.
    Command line is great, but it's also helpful to build in hooks so that other apps can integrate and leverage the good stuff. ;-)

  6. As far as bandwidth not being an issue, and assuming that everybody is on the corporate LAN, well, let's just say that that's how things like Visual SourceSafe get invented.
    Personally my favourite source control system (haven't tried Git) is Subversion. It seems to work pretty well, although I could see it getting out of hand if you had a lot of people trying to commit to it. Even with a very large repository, you probably wouldn't have everybody committing to the same files. Each person would have their own small parts that they would be working on, so I could see how that could still work.
    As far as everybody having a copy of the repository in Git, well, this is from Linus, the guy who says "real men don't backup, they put their stuff on FTP and let everybody mirror it". I personally think it would be really nice to have the entire repository on my computer when I'm trying to look through the history of the code to figure out which change is causing the new bug-of-the-day. I think that one of the downfalls of only having a working copy is that you have to have network access to look at the history.

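For anyone who wants to try the central-repository setup described in comments 1 and 2, here is a minimal sketch in Python that drives the real git command line. It assumes git is installed and on the PATH. The names central.git, alice, notes.txt and the helper functions git() and demo() are made up for illustration and are not from the discussion above. It makes two commits locally (no network involved), pushes them to a bare "central" repository, then deletes the central repository and rebuilds it from the developer's clone.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run a git command in cwd and return its stdout (hypothetical helper)."""
    cmd = [
        "git",
        # Throwaway identity so commits work on a machine with no git config.
        "-c", "user.name=Example Dev",
        "-c", "user.email=dev@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    result = subprocess.run(cmd, cwd=cwd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def demo():
    """Return commit counts: (developer clone, central repo, restored central repo)."""
    root = Path(tempfile.mkdtemp())
    try:
        # A bare repository plays the role of the backed-up central server.
        central = root / "central.git"
        git("init", "--bare", str(central), cwd=root)

        # A developer clones it; the clone carries the full history, not just a working copy.
        git("clone", str(central), "alice", cwd=root)
        alice = root / "alice"

        # Commits are purely local: nothing touches the central repository yet.
        for n in (1, 2):
            (alice / "notes.txt").write_text(f"revision {n}\n")
            git("add", "notes.txt", cwd=alice)
            git("commit", "-m", f"revision {n}", cwd=alice)
        local_count = int(git("rev-list", "--count", "HEAD", cwd=alice))

        # Only the push talks to the central repository.
        git("push", "origin", "HEAD", cwd=alice)
        central_count = int(git("rev-list", "--all", "--count", cwd=central))

        # Simulate losing the server, then rebuild it from the developer's clone.
        shutil.rmtree(central)
        git("clone", "--bare", str(alice), str(central), cwd=root)
        restored_count = int(git("rev-list", "--all", "--count", cwd=central))

        return local_count, central_count, restored_count
    finally:
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

If everything works, the three counts should match: the history in the developer's clone, on the central repository after the push, and on the central repository rebuilt from the clone. A continuous integration box would simply be one more clone of central.git in this picture.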