Accelerating Deployment with Kubernetes and Streamlined Automation

2023-04-16

Currently, our staging environment relies on GitHub, Jenkins, AWS, and Jira. The workflow is: create a branch for a Jira task, open a pull request on GitHub, then use Jenkins pipelines to provision a machine on AWS and deploy every application needed for testing onto that machine. This setup has two main problems. First, it relies on a single machine, which leads to resource contention and slows down testing and deployment. Second, managing and deploying all of our applications on one machine is complex and time-consuming.

Proposed Staging Environment

We propose to use Kubernetes to create a more flexible and scalable staging environment. Kubernetes is a container orchestration platform that allows us to deploy our applications across multiple machines. This will allow us to better utilize our resources and speed up testing and deployment times.

In addition to Kubernetes, we will also use a number of other tools to streamline our staging process and automate tasks. These tools include:

  • AWS Codebuild: A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.

  • AWS Lambda: A serverless computing platform that runs code in response to events and automatically manages the underlying infrastructure.

  • Terraform: A tool for building, changing, and versioning infrastructure safely and efficiently.

  • Helm Charts: A package manager for Kubernetes that makes it easy to deploy, upgrade, and manage applications on Kubernetes.

  • AWS EKS: A managed Kubernetes service that makes it easy to deploy and run applications on Kubernetes.

  • AWS ECR: A fully managed container registry that makes it easy to store, manage, and deploy Docker container images.

  • Slack: A communication platform for teams.

  • Requestly: A tool that allows users to modify HTTP requests and responses to test, debug, and demo applications.

  • Mitmproxy: An open-source interactive HTTPS proxy that lets you inspect and modify requests from the client.

  • Traefik Ingress: A Kubernetes ingress controller that routes traffic to the correct service within a cluster.

Vision Bot: Managing Deployments for Feature Branch Tests

We will also be using a new tool called the Vision Bot to manage deployments for feature branch tests. The Vision Bot is triggered via Slack and allows developers to request a staging environment by providing their JIRA task ID and desired environment duration. The Vision Bot retrieves a list of necessary projects from JIRA and checks for the existence of feature branches in the corresponding repositories on GitHub. If a feature branch exists, the Vision Bot generates Terraform files to create an AWS CodeBuild project that builds the project from the feature branch. If no feature branch exists, the bot deploys the project to Kubernetes directly from the develop branch of the repository.

To allow for the concurrent testing and deployment of multiple features, we are using namespaces on Kubernetes to isolate the different tasks.

Namespaces: Enabling Concurrent Testing and Deployment

Isolation for Concurrent Testing
Projects built from the develop branch are deployed to the default namespace, while projects built from feature branches are deployed to their own namespaces named after the JIRA task ID. This allows us to test and deploy multiple features at the same time without interfering with each other.
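As a small illustration, a per-task namespace would look something like the following (the task ID jira-12345 is the running example used later in this post; the manifest itself is a sketch, not taken from our actual tooling):

```yaml
# Hypothetical namespace for one feature-branch test,
# named after the JIRA task ID
apiVersion: v1
kind: Namespace
metadata:
  name: jira-12345
```

Everything deployed for that task lands in this namespace, while the default namespace keeps serving the develop-branch builds.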

DNS Configurations: Enabling Internal Requests and Dependencies

Efficient Dependency Management

To facilitate internal requests between projects within the same namespace, we are using DNS configurations that allow projects to look for dependencies within their own namespace before reaching out to the default namespace.

Traefik Ingress: Routing Requests to the Correct Namespace

External Request Routing

To facilitate external requests to the staging environment, we are using the Traefik Ingress to route requests to the correct namespace. The developer can use the Requestly tool to add a special namespace header to their requests, which will be matched by the Traefik Ingress and routed to the namespaced version of the desired project.

Overall, this new setup allows us to more efficiently and effectively test and deploy features. We can build and deploy only the necessary projects, rather than all projects on a single machine. It also allows us to test multiple features concurrently without interfering with each other, thanks to the use of namespaces and DNS configurations.

Here is an example of how this new setup would work in practice.
Let's say you are a developer working on a JIRA task with the task ID of jira-12345. You have been assigned to work on projects project_1 and project_2, which both depend on project_3.

  • You would first create feature branches for project_1 and project_2. Then, you would open pull requests on GitHub and add these projects to the project list on JIRA.

  • After pushing your feature branches, you would trigger the Vision Bot, which deploys project_1 and project_2 to their own namespace, named after the JIRA task ID. project_3 would be served from the default namespace.

  • You would then be able to test project_1 and project_2 in isolation, and other tasks could be tested concurrently in their own namespaces without interference.

In short, this is a far more scalable and flexible workflow than our previous single-machine setup.

Testing Changes and Deployment

The Testing Process

To test the changes you made in a controlled environment, you can use the Vision Bot to request a test environment through Slack. The bot will gather the necessary project information from JIRA and check if there are corresponding branches in GitHub. It will find that project_1 and project_2 have feature branches, but project_3 doesn't.

Next, the Vision Bot will generate Terraform files to create AWS CodeBuild projects for project_1 and project_2. These files are pushed to a repository called "codebuild-files," where another in-house tool picks them up and creates the CodeBuild projects. The feature branches for project_1 and project_2 trigger these pipelines, which build the projects from their respective branches in parallel and deploy them to the "jira-12345" namespace in Kubernetes. Because the builds run in parallel, the total build time is bounded by the longest single build, with a maximum of about 20 minutes for Insider. As for project_3, it is deployed to the "jira-12345" namespace directly as an image from AWS ECR, without a CodeBuild step.
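As a minimal sketch of what one of these feature-branch pipelines does (the real build definitions are generated by the Vision Bot and are not shown here; the registry URL, chart path, and the ECR_REGISTRY and JIRA_TASK_ID environment variables are all hypothetical), a CodeBuild buildspec might look like:

```yaml
# Hypothetical buildspec.yml for a feature-branch CodeBuild project.
# ECR_REGISTRY and JIRA_TASK_ID are assumed to be injected as
# environment variables by the generated Terraform configuration.
version: 0.2
phases:
  pre_build:
    commands:
      # Authenticate Docker against the private ECR registry
      - aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
  build:
    commands:
      # Build and push an image tagged with the JIRA task ID
      - docker build -t $ECR_REGISTRY/project_1:$JIRA_TASK_ID .
      - docker push $ECR_REGISTRY/project_1:$JIRA_TASK_ID
  post_build:
    commands:
      # Deploy the freshly built image into the per-task namespace
      - helm upgrade --install project-1 ./chart --namespace $JIRA_TASK_ID --set image.tag=$JIRA_TASK_ID
```

The pre_build/build/post_build split mirrors the standard CodeBuild phase model: authenticate, build and push the image, then roll it out with Helm.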

To test your changes, you can utilize the Requestly tool to add a special header to your requests. This header will direct the requests to the "jira-12345" namespace through the Traefik Ingress. By doing this, you can access project_1 and project_2 within the "jira-12345" namespace, while still using the same domain names (project_1.example.com and project_2.example.com) as you would for accessing the default namespace. The Traefik Ingress Routes have specific conditions based on headers to control which namespace is being requested.

Here is the code snippet related to Traefik Ingress Routes with header conditions controlling the requested namespace:

{{- if .Values.ingress.enabled -}}
{{- $namespace := .Release.Namespace -}}
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: {{ .Release.Name }}-{{ $namespace }}-ingress-route
spec:
  entryPoints:
    - web
    - websecure
  routes:
  {{- range .Values.ingress.hosts }}
  {{- $host := .host -}}
  {{- range .paths }}
  {{ if eq $namespace "default" }}
  - match: HostRegexp(`{{ $host }}`, `{subdomain:[a-z]+}.{{ $host }}`) && PathPrefix(`{{ .path }}`)
  {{ else }}
  - match: Headers(`x-namespace`, `{{ $namespace }}`) && HostRegexp(`{{ $host }}`, `{subdomain:[a-z]+}.{{ $host }}`) && PathPrefix(`{{ .path }}`)
  {{ end }}
    kind: Rule
    priority: 1
    services:
      - kind: Service
        namespace: {{ $namespace }}
        passHostHeader: true
        name: {{ .service }}
        port: {{ .servicePort }}
  {{ end }}
  {{ end }}
{{- end }}
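For reference, the values.yaml fragment that this template iterates over would look roughly like the following (the host, service name, and port are illustrative, not our actual values):

```yaml
# Hypothetical values.yaml consumed by the IngressRoute template above
ingress:
  enabled: true
  hosts:
    - host: example.com
      paths:
        - path: /
          service: project-1
          servicePort: 80
```

Each entry under paths supplies the .path, .service, and .servicePort fields the template references, and the host feeds the HostRegexp matcher.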

To ensure that project_3 can still reach its dependencies (project_4 and project_5) when it runs in the "jira-12345" namespace, we have made some changes.
We added a dnsConfig to every project's deployment that appends "default.svc.cluster.local" to the pod's DNS search list. Kubernetes resolves unqualified service names in the pod's own namespace first, so each project looks for its dependencies locally; if a service is not found there, the extra search domain makes the lookup fall back to the same service in the default namespace.

Here is an example of a deployment configuration that includes dnsConfig:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-name
spec:
  ...
  template:
    ...
    spec:
      dnsConfig:
        searches:
          - default.svc.cluster.local

This new setup for the staging environment allows us to test and deploy features more efficiently. Instead of building and deploying all projects on a single machine, we can now focus on building and deploying only the necessary projects. This significantly reduces the build time, making it almost five times faster. Additionally, it enables us to test multiple features simultaneously without any interference, thanks to the use of namespaces and the DNS configurations.

Conclusion

In conclusion, our existing staging environment depends on a single machine and has become difficult to manage. To overcome these challenges, we are adopting a new technology stack built around Kubernetes, together with in-house tools such as the Vision Bot. These changes give us a staging environment that is more adaptable and scalable: we can deploy and test multiple features at the same time, simplifying our testing and deployment procedures. We are confident this approach will lead to quicker, more efficient development and deployment of features, ultimately improving the stability and reliability of our systems for our users.