5 Tips for managing monorepos in GitLab
GitLab was founded 10 years ago on Git because it is the market leading version control system. As Marc Andressen pointed out in 2011, we see teams and code bases expanding at incredible rates, testing the limits of Git. Organizations are experiencing significant slowdowns in performance and added administration complexity working on enormous repositories or monolithic repositories. Why do organizations develop on monorepos? Great question. While some might believe that monorepos are a no-no, there are valid reasons why companies, including Google or GitLab (that’s right! We operate a monolithic repository), choose to do so. The main benefits are: Monorepos can reduce silos between teams, streamlining collaboration on design, development, and operation of different services because everything is within the same repository. Monorepos help organizations standardize on tooling and processes. If a company is pursuing a DevOps transformation, a monorepo can help accelerate change management when it comes to new workflows or the rollout of new tools. Monorepos simplify dependency management because all packages can be updated in a single commit. Monorepos offer unified CI/CD and build processes. Having all services in a single repository means that you can set up one system of pipelines for everyone. While we still have a ways to go before monorepos or monolithic repositories are as easy to manage as multi-repos in GitLab, we put together five tips and tricks to maintain velocity while developing on a monorepo in GitLab. 1. Use CODEOWNERS to streamline merge request approvals CODEOWNERS files live in the repository and assign an owner to a portion of the code, making it super efficient to process changes. Investing time in setting up a robust CODEOWNERS file that you can then use to automate merge request approvals from required people will save time down the road for developers. You can then set your merge requests so they must be approved by Code Owners before merge. CODEOWNERS specified for the changed files in the merge request will be automatically notified. 2. Improve git operation performance with Git LFS A universal truth of git is that managing large files is challenging. If you work in the gaming industry, I am sure you’ve been through the annoying process of trying to remove a binary file from the repository history after a well-meaning coworker committed it. This is where Git LFS comes in. Git LFS keeps all the big files in a different location so that they do not exponentially increase the size of a repository. The GitLab server communicates with the Git LFS client over HTTPS. You can enable Git LFS for a project by toggling it in project settings. All files in Git LFS can be tracked in the GitLab interface. GitLab indicates what files are stored there with the LFS icon. 3. Reduce download time with partial clone operations Partial clone is a performance optimization that allows Git to function without having a complete copy of the repository. The goal of this work is to allow Git to better handle extremely large repositories. As we just talked about, storing large binary files in Git is normally discouraged, because every large file added is downloaded by everyone who clones or fetches changes thereafter. These downloads are slow and problematic, especially when working from a slow or unreliable internet connection. Using partial clone with a […]
