CI

An SA story about hyperscaling GitLab Runner workloads using Kubernetes

The following fictional story reflects a repeating pattern that Solutions Architects at GitLab encounter frequently. In the analysis of this story we intend to demonstrate three things: (a) why one should be thoughtful in leveraging Kubernetes for scaling, (b) how unintended consequences of an approach to automation can create a net productivity loss for an organization (a reversal of ROI), and (c) how solutions architecture perspectives can help surface anti-patterns, whether retrospectively or when applied during a development process.

A DevOps transformation story snippet

Gild Investment Trust went through a DevOps transformation effort to build efficiency into their development process through automation with GitLab. Dakota, the application development director, knew that their current system handled about 80 pipelines with 600 total tasks and over 30,000 CI minutes, so scaled CI was clearly needed. Since development occurred primarily during European business hours, they were interested in reducing compute costs outside of peak work hours. Cloud compute was also a target because of its pay-per-use model combined with elastic scaling.

Ingrid was the infrastructure engineer for developer productivity tasked with building out the shared GitLab Runner fleet to meet the needs of the development teams. At the beginning of the project she made a successful bid to leverage Kubernetes to scale CI and CD, taking advantage of elastic scaling and high availability with the efficiency of containers. Ingrid had recently achieved the Certified Kubernetes Administrator (CKA) certification and was eager to put her knowledge to practical use. She did some additional reading on applications running on Kubernetes and noted the strong emphasis on minimizing the resource profile of microservices to achieve efficiency in the form of compute density.
She defined runner containers with 2GB of memory and 750 millicores (about three quarters of a CPU) and had good results running some test CI pipelines. She also decided to leverage the Kubernetes Cluster Autoscaler, which uses overall cluster utilization and scheduling pressure to automatically add and remove Kubernetes worker nodes for smooth elastic scaling in response to demand.

About three months into the proof-of-concept implementation, Sasha, a developer team lead, noted that many of their new job types were failing with strange error messages. The same jobs ran fine on quickly provisioned GitLab shell runners. Since the primary difference between the environments was the liberal allocation of machine resources on a shell runner, Sasha reasoned that the failures were likely due to the constrained CPU and memory resources of the Kubernetes pods.

To test this hypothesis, Ingrid decided to add a new pod definition. She found it difficult to discern which job types were failing due to CPU constraints, which due to memory constraints, and which due to a combination of both, and she knew it could take a lot of her time to find out. She decided to simply define a pod that was more liberal with both CPU and memory and make it selectable by runner tagging when certain CI jobs needed more resources. She created a GitLab Runner pod definition with 4GB of memory and 1750 millicores of CPU to cover the failing job types. Developers could then use these larger containers when the smaller ones failed by adding the […]
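A two-tier setup like the one described could be sketched in the GitLab Runner `config.toml` as two Kubernetes-executor runners with different resource requests and limits. This is a minimal, hypothetical sketch: the runner names, URL, and token placeholder are assumptions, and only the resource figures come from the story.

```toml
# Hypothetical config.toml: two Kubernetes-executor runners with the
# resource profiles from the story. Names, URL, and token are placeholders.
concurrent = 80

[[runners]]
  name = "k8s-small"            # default profile: 750m CPU, 2GB memory
  url = "https://gitlab.example.com"
  token = "REDACTED"
  executor = "kubernetes"
  [runners.kubernetes]
    cpu_request = "750m"
    cpu_limit = "750m"
    memory_request = "2Gi"
    memory_limit = "2Gi"

[[runners]]
  name = "k8s-large"            # opt-in profile: 1750m CPU, 4GB memory
  url = "https://gitlab.example.com"
  token = "REDACTED"
  executor = "kubernetes"
  [runners.kubernetes]
    cpu_request = "1750m"
    cpu_limit = "1750m"
    memory_request = "4Gi"
    memory_limit = "4Gi"
```

If the larger runner is registered with a tag such as `k8s-large`, a failing job could opt into the bigger pod by listing that tag under `tags:` in its `.gitlab-ci.yml` job definition.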

Read More


When the pursuit of simplicity creates complexity in container-based CI pipelines

In a GitLab book club, I recently read “The Laws of Simplicity,” a great book on a topic that has deeply fascinated me for many years. The book contains an acronym that expresses simplicity generation approaches: SHE, which stands for “shrink, hide, embody.” These three approaches for simplicity generation all share a common attribute: They are all creating illusions – not eliminations. I’ve seen this illusion repeat across many, many realms of pursuit for many years. Even in human language, vocabulary development, jargon, and acronyms all simply encapsulate worlds of complexity that still exist, but can be more easily referenced in a compact form that performs SHE on the world of concepts.

Any illusion has a boundary or curtain: in front of the curtain the complexity can be dealt with by following simple rules, but, behind the curtain, the complexity must be managed by a stage manager. For instance, when the magic show creates the spectre of sawing people in half, what appears to be a simple box is in fact an exceedingly elaborate contraption. Not only that, but the manufacturing processes for an actual simple box and for the sawing box are markedly different in terms of complexity. The manufactured complexity and its result are essentially the tradeoff for what would be the real-world complexity of actually sawing people in half and having them heal and stand up unharmed immediately afterward.

To bring this into the technical skills realm, consider that when you leverage a third-party component or API to add functionality, you only need to know the parameters to obtain the desired result. The people maintaining that component or API must know, down to the quantum-mechanics level of detail, how to perform that work in a reliable and complete way. Docker containers are a mechanism for embodying complexity, and are used in scaled applications and within container-based CI.

When a CI/CD automation engineer uses container-based CI, it is possible to make things more complex and more expensive while attempting to do exactly the opposite. At its core, this post is concerned with how pursuing a simpler world through containers can turn into an antipattern – a reversal of desired outcomes – often without us noticing that the reversal is affecting our productivity. The prison of a paradigm is secure indeed.

The Second Law of Complexity Dynamics

Over the years I have come to believe that the pursuit of reducing complexity has characteristics similar to the Second Law of Thermodynamics. The net result of a change between mass and energy is the same net amount of mass and energy, but their ratio and form have changed. In what I will coin “The Second Law of Complexity Dynamics,” complexity is similarly “conserved”; it is just reformed. If complexity is not eliminated by simplifying efforts, we reduce its impact in a given realm by changing the ratio of complexity to simplicity on each side of one or more curtains. But alas, complexity did not die; it just hid, and is now someone else’s management challenge. It is important not to think of this as cheating. There is no question that hiding complexity carries the potential for massive efficiency gains when the world behind the hiding mechanisms becomes the realm of specialty skills and specialists. […]

Read More