Parallelism and Resource Groups: Improving Pipeline Performance and Resource Utilization
Running a Job on Multiple Instances Using a Single Pipeline
We have covered several ways to optimize a pipeline, such as managing artifacts and running jobs conditionally, but there are still other ways to make it faster.
In the following examples, we return to our todo app and run its tests in parallel. We will switch to Pytest as our testing framework in place of Python's built-in unittest. The switch itself matters less than what it enables: showing how to run tasks in parallel in GitLab CI/CD. For many projects, testing is the most time-consuming part of the pipeline, so running tests in parallel can save a lot of time.
The Python community has developed a Pytest plugin called "pytest-test-groups" that lets us split tests into groups and run each group separately (hence our switch to Pytest). Here is how it works:
- The plugin splits the tests into as many groups as you ask for.
- Each Pytest invocation runs only the tests in the group you select, so the groups can run in parallel as separate processes or jobs.
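To make the idea of grouping concrete, here is a minimal sketch of one way a test list could be divided. It is only an illustration: the round-robin assignment, the split_into_groups helper and the test names are assumptions made for this example, not the plugin's actual algorithm or our todo app's real tests.

```python
def split_into_groups(tests, group_count):
    """Assign tests to groups round-robin; groups are numbered from 1.

    Hypothetical helper for illustration only; pytest-test-groups may
    distribute tests differently.
    """
    if group_count < 1:
        raise ValueError("group_count must be at least 1")
    return {
        group: tests[group - 1::group_count]
        for group in range(1, group_count + 1)
    }


# Hypothetical test names, used only for this example.
tests = ["test_add", "test_delete", "test_list"]

for group, members in split_into_groups(tests, 3).items():
    print(f"group {group}: {members}")
```

With 3 tests and 3 groups, each group ends up with exactly one test. With more tests than groups, the groups differ in size by at most one test.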
Example: 3 groups and 3 tests
pytest --test-group-count 3 --test-group=1
pytest --test-group-count 3 --test-group=2
pytest --test-group-count 3 --test-group=3
Each command runs a single group, so the three commands can run at the same time, for example in separate terminals or on separate machines. How do we do the same in GitLab CI/CD? We might be tempted to use the following configuration, but it will not work as expected: the commands in a job's script run one after another, so the groups execute in sequence, not in parallel:
image: python:3.12
stages:
  - test
test:
  stage: test
  script:
    - pip install -r requirements.txt
    - pip install pytest pytest-test-groups
    - pytest --test-group-count 3 --test-group=1
    - pytest --test-group-count 3 --test-group=2
    # ..etc
To run the tests in parallel, we need the parallel keyword, which runs multiple instances of the same job at the same time. It takes an integer that specifies the number of instances to run. This is how it is done (we're using 2 parallel jobs in this example):
cat <$HOME/todo/app/.gitlab-ci.yml && \
cd $HOME/todo/app && \
git add . && \
git commit -m "Add parallel" && \
git push origin main
image: python:3.12
stages:
