
Merging multiple Git repositories into one

Recently, I came across an interesting use-case for Git, and I thought it would be neat to make a little write-up about it, so here we go.

The setup

I had a project that was composed of multiple components:

  • An API;
  • A Ruby wrapper for the API in the form of a gem;
  • A CLI that used the gem to talk to the API.

[Diagram: CLI <--> Gem <--> API]

These three components each lived in their own Git repository, with their own GitLab project, their own CI, et cætera... The problem with this configuration was that in order to develop a new feature, I had to:

  1. Develop the API:
    • Push some commits
    • Make a tag
    • Run a pipeline to build and deploy
  2. Develop the gem:
    • Push some commits
    • Make a tag
    • Run a pipeline to build and release
  3. Develop the CLI:
    • Update the gem to the newer version
    • Push some commits
    • Make a tag
    • Run a pipeline to build and release

It was a lot of boilerplate for such a simple architecture. I don't want to delve too deep into the intricacies of monorepos, especially since I know so little about them, but nevertheless, I decided I wanted to have all of my code in a single place.

The catch

What I wanted was a single Git repository where my three components could live in different subfolders. I could have set up a new repository, copied all the stuff from the other ones and made one huge "Initial commit", but that would have meant losing all the history from the previous repos.

So I started looking up ways to make my dream come true.

Here is how it went down.

A clean slate

I created a new, empty Git repository. I could have used one of the old repositories as a base for the merged one, but there were other considerations at play, like project naming, Docker registry URLs and whatnot, so I decided to keep the old histories on archived GitLab projects, and get a fresh start.

mkdir new-repo
cd new-repo
git init
git remote add origin git@gitlab.com:Richard-Degenne/new-repo.git

Importing the API

Since the API was, in my opinion, the central piece of this whole project, I wanted to reuse its master branch as the master branch for the new monorepo. To do this, I added the old repository as a new remote called old-api, set up master to track old-api/master, and removed the remote. Easy enough.

git remote add old-api git@gitlab.com:Richard-Degenne/old-api.git
git fetch old-api master
git branch --track master old-api/master
git remote remove old-api
[Git graph: master points at the historic API commits taken from old-api/master.]
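To check that the history really comes along, here is a minimal sketch that replays this step against a throwaway repository. Everything in it is an assumption made for the demo: the temporary directory, the demo identity, the config.ru file and the single "Historic API commit" stand in for the real old-api project. It also uses git reset --hard to point the still-empty master at the fetched history and check the files out. That is one possible way to populate the working tree, not necessarily the exact command sequence shown above.

```shell
set -e

# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the old API repository (hypothetical content).
git init -q "$work/old-api"
cd "$work/old-api"
git symbolic-ref HEAD refs/heads/master
echo 'run App' > config.ru
git add config.ru
git commit -q -m 'Historic API commit'

# The new, empty monorepo.
git init -q "$work/new-repo"
cd "$work/new-repo"
git symbolic-ref HEAD refs/heads/master
git remote add old-api "$work/old-api"
git fetch -q old-api master

# Point the unborn master at the fetched history and check the files out.
git reset -q --hard old-api/master
git remote remove old-api

# The old commit is now the tip of the new master.
imported_subject=$(git log -1 --format=%s master)
echo "$imported_subject"
```

Removing the remote only drops the remote-tracking references; the commits stay reachable from the local master branch.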

Before pulling in more stuff, it's necessary to move the source code of the API to its own subfolder so that it doesn't cause conflicts with the code of the gem or the CLI. So, I made a new src/api folder, moved most of the files in there, and tweaked whatever needed tweaking, such as the CI configuration.

I say "most" here because some files are still intended to stay at the root of the project, like the CI configuration for instance.

mkdir -p src/api
mv <bunch of API stuff> src/api/
# Edit things so that everything runs smoothly
git add .
git commit -m 'Moved API to its own folder'
[Git graph: master holds the historic API commits, followed by the "Moved API" commit.]
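A move commit like this one does not cut files off from their past, but plain git log on the new path stops at the move. The sketch below illustrates the difference with git log --follow. It is only an illustration under assumptions: the throwaway repository, the demo identity and the config.ru file are made up, and it uses git mv, which is equivalent to mv followed by git add for this purpose.

```shell
set -e

# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# One "historic" commit with a file at the root (hypothetical content).
echo 'run App' > config.ru
git add config.ru
git commit -q -m 'Historic API commit'

# Move the file into its subfolder and record the move.
mkdir -p src/api
git mv config.ru src/api/
git commit -q -m 'Moved API to its own folder'

# Without --follow, the log of the new path starts at the move commit.
plain_commits=$(git log --format=%s -- src/api/config.ru | wc -l | tr -d ' ')

# With --follow, Git keeps tracing the file under its previous name.
followed_commits=$(git log --follow --format=%s -- src/api/config.ru | wc -l | tr -d ' ')

echo "plain: $plain_commits, followed: $followed_commits"
```

Note that --follow works on a single file at a time, so it is a tool for digging into one file's past rather than a whole folder's.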

Merging another component

So far, so good, right? In order to merge the gem, I wanted to do the same, except that instead of tracking the old-gem/master branch, I would merge it into my own master branch. Sounds like a plan.

git remote add old-gem git@gitlab.com:Richard-Degenne/old-gem.git
git fetch old-gem master
git merge old-gem/master
fatal: refusing to merge unrelated histories

Oh no! Git has a safety check that prevents merges between "unrelated histories", i.e. it refuses to merge two branches that share no common ancestor commit. Fortunately for us, and because Git is the pinnacle of the "If you know what you're doing..." approach, the option --allow-unrelated-histories lets us bypass that check.

git merge old-gem/master --allow-unrelated-histories
# Solve conflicts that show up, and conclude the merge.
git remote remove old-gem
[Git graph: master holds the historic API commits and "Moved API"; old-gem/master, carrying the historic gem commits, is merged into master.]
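The refusal and the bypass are easy to reproduce in isolation. The following sketch builds two throwaway repositories with no commit in common, tries a plain merge, then retries with the flag. The make_repo helper, the repository names and the file names are hypothetical and exist only for this demo.

```shell
set -e

# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# make_repo NAME: hypothetical helper creating a one-commit stand-in repository.
make_repo() {
  git init -q "$work/$1"
  (
    cd "$work/$1"
    git symbolic-ref HEAD refs/heads/master
    echo "# $1" > "$1.rb"
    git add "$1.rb"
    git commit -q -m "Historic $1 commit"
  )
}

make_repo api
make_repo gem

cd "$work/api"
git remote add old-gem "$work/gem"
git fetch -q old-gem master

# A plain merge is refused because the two branches share no ancestor.
if git merge -q old-gem/master 2>/dev/null; then
  plain_merge=accepted
else
  plain_merge=refused
fi

# With the flag, Git creates a merge commit joining both histories.
git merge -q --allow-unrelated-histories -m 'Merged gem history' old-gem/master
git remote remove old-gem

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "plain merge: $plain_merge, merge commit parents: $parents"
```

The demo files have different names, so no conflict shows up here. With real projects, files such as README.md or .gitignore usually exist on both sides and need resolving by hand.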

Now, we can make a new src/gem folder, move all the necessary files in there, and add a move commit.

mkdir -p src/gem
mv <bunch of gem stuff> src/gem
git add .
git commit -m 'Moved gem to its own folder'
[Git graph: same as before, with the "Moved gem" commit added on master after the merge of old-gem/master.]

We can now repeat the same strategy with the CLI.

git remote add old-cli git@gitlab.com:Richard-Degenne/old-cli.git
git fetch old-cli master
git merge old-cli/master --allow-unrelated-histories
git remote remove old-cli

mkdir src/cli
mv <bunch of CLI stuff> src/cli
git add .
git commit -m 'Moved CLI to its own folder'
[Git graph: master holds the historic API commits, "Moved API", the merge of old-gem/master, "Moved gem", the merge of old-cli/master (historic CLI commits), and "Moved CLI".]
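Since the gem and the CLI follow exactly the same steps, the routine could be wrapped in a small function. This is only a sketch under assumptions: import_component and make_repo are hypothetical helpers, the list of files to move is passed explicitly, and the throwaway repositories stand in for the real GitLab projects. It also assumes the merges are conflict-free, which was not the case for me.

```shell
set -e

# Throwaway identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# make_repo NAME: hypothetical helper creating a one-commit stand-in repository.
make_repo() {
  git init -q "$work/$1"
  (
    cd "$work/$1"
    git symbolic-ref HEAD refs/heads/master
    echo "# $1" > "$1.rb"
    git add "$1.rb"
    git commit -q -m "Historic $1 commit"
  )
}

# import_component NAME URL FILE...: hypothetical helper that merges the
# master branch of URL, then moves the listed files into src/NAME.
import_component() {
  name=$1
  url=$2
  shift 2
  git remote add "old-$name" "$url"
  git fetch -q "old-$name" master
  git merge -q --allow-unrelated-histories -m "Merged $name history" "old-$name/master"
  git remote remove "old-$name"
  mkdir -p "src/$name"
  git mv "$@" "src/$name/"
  git commit -q -m "Moved $name to its own folder"
}

make_repo api
make_repo gem
make_repo cli

# The api stand-in plays the role of the monorepo with the API already moved.
cd "$work/api"
mkdir -p src/api
git mv api.rb src/api/
git commit -q -m 'Moved api to its own folder'

import_component gem "$work/gem" gem.rb
import_component cli "$work/cli" cli.rb

# One root commit per original repository proves all three histories survived.
tracked_files=$(git ls-files | tr '\n' ' ')
root_commits=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "files: $tracked_files"
echo "root commits: $root_commits"
```

Counting root commits with git rev-list --max-parents=0 is a quick sanity check: a repository built from a single "Initial commit" would report one root, whereas this one reports one per imported project.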

Conclusion

After that, we can finally tag a version for release and publish everything at once!

[Git graph: the full history on master, ending with the "Released 1.0.0" commit tagged 1.0.0.]
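For completeness, tagging and publishing could look like the sketch below. A local bare repository stands in for the GitLab project, and the demo identity, the CHANGELOG.md content and the tag message are assumptions; only the 1.0.0 tag name comes from my actual release.

```shell
set -e

# Throwaway identity so commits and tags work even where Git is not configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Bare repository standing in for the remote GitLab project (hypothetical).
git init -q --bare "$work/origin.git"

git init -q "$work/new-repo"
cd "$work/new-repo"
git symbolic-ref HEAD refs/heads/master
echo '1.0.0' > CHANGELOG.md
git add CHANGELOG.md
git commit -q -m 'Released 1.0.0'
git remote add origin "$work/origin.git"

# Annotated tag on the release commit, then push the branch and the tag together.
git tag -a 1.0.0 -m 'Release 1.0.0'
git push -q origin master --follow-tags

# The tag is now visible on the remote side.
published_tag=$(git --git-dir="$work/origin.git" tag -l)
echo "$published_tag"
```

The --follow-tags option pushes annotated tags that point at the commits being pushed, so a single push is enough to trigger one pipeline for the whole project.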

My final structure for the repository looks something like this.

.
├── CHANGELOG.md
├── .gitignore
├── .gitlab-ci.yml
├── README.md
└── src
    ├── api
    │   ├── app
    │   ├── bin
    │   ├── config
    │   ├── config.ru
    │   ├── db
    │   ├── Dockerfile
    │   ├── Gemfile
    │   ├── Gemfile.lock
    │   ├── .gitignore
    │   ├── lib
    │   ├── Procfile
    │   ├── Rakefile
    │   └── spec
    ├── cli
    │   ├── Dockerfile
    │   ├── Gemfile
    │   ├── Gemfile.lock
    │   ├── .gitignore
    │   ├── lib
    │   ├── Procfile
    │   ├── Rakefile
    │   └── spec
    └── gem
        ├── Gemfile
        ├── Gemfile.lock
        ├── .gemspec
        ├── .gitignore
        ├── lib
        ├── Rakefile
        └── spec

Honestly, this was a pretty fun use-case, and it is a testament to Git's sheer depth and flexibility.

Any questions or feedback? Leave a comment!