Secrets to Project Happiness Using Git
I have worked with dozens of projects over my career, on platforms ranging from Drupal to Yii to Django to AngularJS to ASP.NET. Most of them have had a good, solid strategy for how to manage their Git repositories. A few notable ones have not.
Here are a few things I've learned about managing a repository to help a project succeed.
A branch for each environment
Your repository should contain a branch for each hosting environment. At Metal Toad, we typically call these "dev", "staging", and "prod." Code is developed on feature branches and not on the environment branches. The first step to deploying code is to merge the feature branch into the environment branch and push. From there, typically a CI server or other deployment script takes over.
Once you have your environment branches set up, the servers should stay on the same branch forever. Don't go changing branches on the server. Doing so would confuse developers, make tracking the history harder, and possibly break the deployment or CI script.
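As a rough sketch of that first deployment step, the merge-and-push could be wrapped in a small shell function. The function name "deploy_to_env", the remote name "origin", and the feature branch name in the usage note are assumptions made for illustration; your CI setup may expect something different.

```shell
# Hypothetical helper: merge a feature branch into an environment branch
# and push it, so the CI server or deployment script can take over.
# Assumes a remote named "origin" and an existing environment branch.
deploy_to_env() {
    feature_branch="$1"
    env_branch="$2"

    # Make sure we are merging into the latest copy of the environment branch
    git fetch origin || return 1
    git checkout "$env_branch" || return 1
    git pull --ff-only origin "$env_branch" || return 1

    # --no-ff keeps a merge commit, so history shows when the feature landed
    git merge --no-ff -m "Merge $feature_branch into $env_branch" "$feature_branch" || return 1

    git push origin "$env_branch"
}
```

For example, "deploy_to_env feature/login-form dev" would merge that feature branch into dev and push the result. Note that the server itself never changes branches; only the contents of its branch change.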
* We prefer not to use the "master" branch, to avoid ambiguity. Some developers like to use "master" as their development branch, and others like to use it as their production branch. Our solution to this is to delete it entirely and use the branches named above.
Environment branch merging
Try to keep your environment branches in sync as much as possible. Avoid committing anything that would cause conflicts between them.
You should be able to merge a higher environment branch (e.g., prod) into a lower environment branch (e.g., staging) at any time.
This is a handy thing to do after deployments. It helps make sure that any hotfixes deployed to production also exist on the staging and dev environments.
Additionally, you should be able to merge the production branch into any of your feature branches at any time.
Did you just deploy that bug fix to production? And now you need it on the other feature branch you're working on? Simple. Just merge the production branch into your feature branch and you're all set.
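Both kinds of downward merge might look something like the following sketch. The function names are made up for this example, and it assumes a remote called "origin" plus the "prod", "staging", and "dev" branch names used earlier.

```shell
# Hypothetical helper: after a production deployment, merge prod down into
# the lower environment branches so hotfixes exist everywhere.
sync_envs_from_prod() {
    git fetch origin || return 1
    for env_branch in staging dev; do
        git checkout "$env_branch" || return 1
        git pull --ff-only origin "$env_branch" || return 1
        # If the branches are in sync this is a fast-forward or a clean merge
        git merge -m "Merge prod into $env_branch" origin/prod || return 1
        git push origin "$env_branch" || return 1
    done
}

# Hypothetical helper: bring the latest production code into whichever
# feature branch is currently checked out.
update_feature_from_prod() {
    git fetch origin || return 1
    current_branch="$(git rev-parse --abbrev-ref HEAD)"
    git merge -m "Merge prod into $current_branch" origin/prod
}
```

If either merge stops with conflicts, that is a sign the branches have drifted apart and need attention before the next deployment.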
Every branch should be able to compile or run on any server or developer machine. If your "dev" branch only works on the dev server and won't run on a developer's local machine, you won't be able to tell if your code actually works until you deploy it. Avoid surprises during deployment and standardize your branches.
What if your "dev" branch has gotten so out of sync with production that you can no longer merge "prod" into "dev"? It might be time to blow away your "dev" branch and reset your dev server to production code, then re-merge your feature branches. It might be painful, but it will save you a lot of time in the long run.
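One possible way to do that reset is sketched below. The function name and the "origin" remote are assumptions, and the sketch rewrites history on a shared branch, so warn the team before running anything like it.

```shell
# Hypothetical helper: throw away the current dev branch, recreate it from
# production, and re-merge the feature branches named as arguments.
rebuild_dev_from_prod() {
    git fetch origin || return 1

    # -B recreates dev at origin/prod even if a local dev branch exists;
    # --no-track stops dev from being set up to track origin/prod
    git checkout --no-track -B dev origin/prod || return 1

    for feature_branch in "$@"; do
        git merge --no-ff -m "Re-merge $feature_branch into dev" "$feature_branch" || return 1
    done

    # --force-with-lease refuses to overwrite the remote dev branch if
    # someone else has pushed to it since the fetch above
    git push --force-with-lease origin dev
}
```

A call such as "rebuild_dev_from_prod feature/a feature/b" would leave dev containing production code plus those two features and nothing else. Other developers would then need to reset their own local dev branches to match.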
Dependency management
This one is usually pretty simple, but I've seen a few projects thrown into chaos when it wasn't managed effectively.
Your repository should contain everything you need to run the project. It should be as simple as "git clone", get a copy of the database, run a package install script, and start coding.
I recently worked on a big .NET project where the former dev team had been manually copying DLLs around after every build. The dependencies were not managed in Git at all, and the only way to get the site running was to get a whole working copy from another developer! If this situation describes you, it's time to rethink how you're managing your files.
There are two common methods for managing dependencies in Git:
1. Keep a complete, up-to-date manifest of dependencies which your package manager can install
Every package manager has its own dependency manifest file. npm has package.json, pip (Python) has requirements.txt, and NuGet (ASP.NET) has packages.config. Always keep these files in your repo, and keep them up to date. Then, as packages get upgraded, all the developers need to do is run "npm install" or "pip install -r requirements.txt" or another appropriate command to install all the dependencies.
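A project setup script built on those manifests could be as small as the sketch below. The function name "install_dependencies" is hypothetical, and it only covers the npm and pip commands quoted above; other package managers would need their own branch.

```shell
# Hypothetical setup helper: install dependencies from whichever manifest
# files are present in the current directory.
install_dependencies() {
    found=0

    if [ -f package.json ]; then
        npm install || return 1
        found=1
    fi

    if [ -f requirements.txt ]; then
        pip install -r requirements.txt || return 1
        found=1
    fi

    # Fail loudly if the repo has no manifest at all
    if [ "$found" -eq 0 ]; then
        echo "No dependency manifest found in $(pwd)" >&2
        return 1
    fi
}
```

Keeping a script like this in the repo means a new developer does not need to know which package managers the project uses.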
2. If necessary, keep the dependency installer files in your Git repo
This could take many forms: NuGet package files (.nupkg) or DLLs for ASP.NET, gzipped source archives or wheels for pip, etc.
Yes, this will cause some repo bloat, but it is sometimes the best course of action, for example if:
- Your project relies on older packages which have been removed from the package repository
- The firewall on your web server prevents access to the package repository
- You have patched some of the packages to fix bugs
- You're using Drupal. (PHP was late to the game with the package manager idea.)
If you're going to do this, you'll likely need to do some additional work. First, keep the packages in a clearly labeled folder so everyone understands what they are. Second, make sure your package manifest or setup command contains the necessary flags to install the packages from the files in Git, instead of downloading them from the package repository server.
For example, if you have Python packages in a folder called "packages", your pip install script might look like this:
pip install -r requirements.txt --no-index --find-links=path/to/packages
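To fill that folder in the first place, pip has a "download" command that fetches packages without installing them. The sketch below pairs it with the offline install shown above; the function names are hypothetical, and it assumes the folder is called "packages" and sits next to requirements.txt.

```shell
# Folder in the repo that holds the downloaded package files (assumed name)
PACKAGE_DIR="packages"

# Download every package listed in requirements.txt, plus its dependencies,
# into the folder so the files can be committed to Git.
vendor_packages() {
    mkdir -p "$PACKAGE_DIR" || return 1
    pip download -r requirements.txt -d "$PACKAGE_DIR"
}

# Install only from the committed files, never from the package index.
install_vendored_packages() {
    pip install -r requirements.txt --no-index --find-links="$PACKAGE_DIR"
}
```

By default "pip download" picks files that match the machine and Python version it runs on, so it is safest to run it in an environment that resembles the server.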
If you're keeping the dependencies in the Git repo along with the project, it might be tempting for cowboy developers to hack the code within the packages. Don't do this! Or, if you really must patch them to fix a bug, at least keep a detailed readme file explaining what you did. Keep it in the repo next to the package itself.
Configuration files
Most applications have a few configuration files for storing the application settings, such as Drupal's settings.php, Django's settings.py, or ASP.NET's Web.config. These should be tracked in Git. If you're not sure about that, re-read what I wrote above: "git clone", get the database, install packages, and go.
There's a gotcha to keeping these files in Git, though. What if each developer uses different database connection strings? Or different debug settings? How can the team stop these from being accidentally committed to Git?
Simple. Create a "base" configuration file with all the shared settings, and a "local" configuration file which overrides them on a per-machine basis. Drupal 8 has this built in, with its "settings.php" and "settings.local.php" files. It's possible to implement in other systems, such as Django, as well.
Keep your base configuration file in Git, and keep the machine-specific configuration file out of Git (for example, by listing it in .gitignore).
To make the next developer's life a little easier, it's nice to keep a "template" of the local configuration file in Git. For example, if your local config file is called "settings.local.php", keep a file called "settings.local.php.default" around. In it, place the connection strings you need to run the project, with placeholder usernames and passwords. Then, a new developer just needs to copy that file to the correct file name and enter their username and password, and they're up and running.
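That copy step can also be scripted so nobody overwrites a working local file by accident. This is only a sketch: the function name is invented, and it assumes it is run from the directory that holds the template and that a .gitignore in the same directory is acceptable.

```shell
# Hypothetical helper: create the untracked local config from the tracked
# template, and make sure Git ignores the local copy.
setup_local_config() {
    template="settings.local.php.default"
    target="settings.local.php"

    if [ ! -f "$template" ]; then
        echo "Template $template not found in $(pwd)" >&2
        return 1
    fi

    # Add the local file to .gitignore once, so it cannot be committed
    if ! grep -qxF "$target" .gitignore 2>/dev/null; then
        echo "$target" >> .gitignore
    fi

    # Never overwrite a developer's existing local settings
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it untouched"
        return 0
    fi

    cp "$template" "$target" || return 1
    echo "Created $target; edit it to add your own username and password"
}
```

Running it a second time is harmless: the existing local file is kept and the .gitignore entry is not duplicated.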
ASP.NET configuration files
ASP.NET is its own animal when it comes to config file management, but you can use a similar technique here as well. If you're clever enough with your Web.config transformations, you can keep a config file for each environment in Git, and use a switch at compile time to identify which one to use.
If you do this, keep a Web.Base.config file with the base settings, and Web.local.config with developer machine connection strings, Web.prod.config with production server connection strings, etc. Then, configure your project to compile these into Web.config at build time. See Seb Nilsson's blog post for an example. Keep all the environment config files in Git, but add Web.config itself (which is auto-generated in this scenario) to your .gitignore.
Keep your Git repository in good working order, including branches for each environment and complete package manager information. New developers on the project will have a much easier time setting everything up.