Deployment

Architecture diagram

_images/webcompat-ml-deploy.png

Technology stack

Current deployment is using the following technologies

  • Docker
    • Package models and dependencies in a single container
    • Distribute docker images in Docker Hub
  • AWS
    • AWS Lambda
      • Handle GitHub web hooks
    • AWS API Gateway
      • Expose lambda handler as API
    • AWS Batch
      • Schedule webcompat-ml tasks
    • AWS S3
      • Store webcompat-ml tasks results
  • GitHub API / webhooks
    • Extract data to build datasets
    • Consume webhook events to trigger the automation

Even though most of the services are deployed in the cloud, all the primitives can be self hosted. The idea is that a webhook from GitHub triggers the automation and a simple HTTP API handles the request and spawns a task.

_images/webcompat-ml-deploy-generic.png

Infrastructure as Code

About

All the infrastructure is managed as code and the codebase lives under mozilla/webcompat-ml-deploy.

To avoid over-complicating things, terraform is maintained in the git repository encrypted using git-crypt.

Important

For each change maintainers should make sure that the state is also checked in the repository. The state also leaks credentials so its important to always make sure that the state is encrypted before pushing.

All ML tasks should be described as a Dockerfile under docker/ and should have the ML model prebundled.

Examples

Regular maintenance tasks

Build the needsdiagnosis model dataset

$ webcompat-ml-needsdiagnosis build-dataset --es-url "<URL>" --es-index-name="<INDEX>" --es-doc-type="<TYPE>" --output "</path/to/dataset.csv>"

Train the needsdiagnosis model

$ webcompat-ml-needsdiagnosis train --data "</path/to/dataset.csv>" --output "</path/to/model.bin>"

Releasing a new needsdiagnosis task image

$ cd webcompat-ml-deploy/docker/needsdiagnosis
$ docker build . -t ml-task:needsdiagnosis --build-arg MODEL_PATH="</path/to/model.bin>"
$ docker tag ml-task:needsdiagnosis mozillawebcompat/ml-task:needsdiagnosis
$ docker push mozillawebcompat/ml-task:needsdiagnosis

Applying a terraform change

$ git-crypt unlock
$ terraform plan
$ terraform apply
$ git add .
$ git add terraform.tfstate
$ git add terraform.tfstate.backup
$ git commit -m '<change applied>'