Deploying Dagster to Amazon Web Services

To Nha Notes | July 22, 2024, 1:05 p.m.

Download sample code

Clone the Dagster sample code from one of the following GitHub repositories:

https://github.com/elestio-examples/dagster/tree/main

https://github.com/menendes/dagster_deployment_docker/tree/main

https://github.com/AntonFriberg/dagster-project-example

Deploying the sample code to ECS with the AWS Copilot tool

## Step 1: Install AWS Copilot

Before you begin, make sure AWS Copilot is installed on your local machine. If it is not, the following commands install the Linux binary and select the AWS profile to use:

```sh
curl -Lo copilot https://github.com/aws/copilot-cli/releases/latest/download/copilot-linux \
  && chmod +x copilot \
  && sudo mv copilot /usr/local/bin/copilot

export AWS_PROFILE=<YOUR_AWS_PROFILE>
```
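Before moving on, it can help to confirm that the tools this walkthrough relies on are on your `PATH`. The sketch below is one way to do that. The function name `require_cmds` is made up for illustration, and the tool list in the usage line assumes you also need the AWS CLI and Docker (Copilot builds images locally with Docker).

```sh
# Preflight check: report every required command that is missing from PATH.
# `require_cmds` is a hypothetical helper, not part of Copilot or the AWS CLI.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  # Returns 0 when everything was found, 1 otherwise.
  return "$missing"
}

# Usage (tool list is an assumption):
#   require_cmds copilot aws docker && copilot --version
```

The function checks all tools before returning, so a single run lists everything that still needs installing.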

## Step 2: Initialize the Copilot Application

Navigate to the project directory that contains your Docker Compose files. Initialize a new Copilot application, then create and deploy an environment for it:

```sh
copilot app init

copilot env init

copilot env deploy
```

Follow the prompts to set up your application. This step creates a new Copilot application and an environment (named `test` in the later steps) into which the services are deployed.
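If you would rather skip the prompts, the same three steps can be scripted. This is a sketch under stated assumptions, not the only way to do it: `bootstrap_copilot` is a hypothetical helper, and the app name, environment name, and profile in the usage line are placeholders (`dagster001` is a guess based on the ECR repository names shown later in this post).

```sh
# Non-interactive version of: copilot app init / env init / env deploy.
# `bootstrap_copilot` is a hypothetical wrapper; all three arguments are placeholders.
bootstrap_copilot() {
  app="$1"
  env_name="$2"
  profile="$3"
  # Create the application, stopping if any step fails.
  copilot app init "$app" || return 1
  # Create the environment with Copilot's default network configuration.
  copilot env init --name "$env_name" --profile "$profile" --default-config || return 1
  # Provision the environment's infrastructure.
  copilot env deploy --name "$env_name"
}

# Usage (values are assumptions):
#   bootstrap_copilot dagster001 test default
```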

## Step 3: Create Copilot Services

### Create PostgreSQL Service

The Docker Compose setup defines a PostgreSQL service for Dagster's storage. On AWS you have two options: use a managed database such as Amazon RDS, or run PostgreSQL in its own ECS service. Here, we'll run it in ECS for simplicity. Note that a container's local storage does not survive task replacement, so this setup suits experiments rather than production data.

Create a Dockerfile for PostgreSQL, named `Dockerfile_postgres`:

```Dockerfile
# Dockerfile for PostgreSQL
FROM postgres:14

# Set environment variables
ENV POSTGRES_USER=postgres_user
ENV POSTGRES_PASSWORD=postgres_password
ENV POSTGRES_DB=postgres_db

# Expose port 5432 (default PostgreSQL port)
EXPOSE 5432

# Create the data directory and set permissions
RUN mkdir -p /var/lib/postgresql/data && chown -R postgres:postgres /var/lib/postgresql/data

# Specify the entrypoint script
ENTRYPOINT ["docker-entrypoint.sh"]

# Add a health check
HEALTHCHECK CMD pg_isready -U $POSTGRES_USER -d $POSTGRES_DB

# Start PostgreSQL
CMD ["postgres"]
```

Then, initialize the PostgreSQL service:

```sh
copilot svc init --name postgres --svc-type "Backend Service" --dockerfile ./Dockerfile_postgres
```

Deploy the service:

```sh
copilot svc deploy --name postgres
```

### Create Dagster User Code Service

For the Dagster user code service, use the Dockerfile `Dockerfile_user_code`:

```Dockerfile
FROM python:3.10-slim

RUN pip install \
 dagster \
 dagster-postgres \
 dagster-docker

WORKDIR /opt/dagster/app
COPY ./repo.py /opt/dagster/app

EXPOSE 4000

CMD ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4000", "-f", "repo.py"]
```

Initialize the Dagster user code service:

```sh
copilot svc init --name dagster-user-code --svc-type "Backend Service" --dockerfile ./Dockerfile_user_code
```

Configure environment variables in the service manifest. The build path and port must match the user code Dockerfile, which serves gRPC on port 4000:

```yaml
# copilot/dagster-user-code/manifest.yml
name: dagster-user-code
type: Backend Service

image:
  build: ./Dockerfile_user_code
  port: 4000

variables:
  DAGSTER_POSTGRES_USER: postgres_user
  DAGSTER_POSTGRES_PASSWORD: postgres_password
  DAGSTER_POSTGRES_DB: postgres_db
```

Deploy the service:

```sh
copilot svc deploy --name dagster-user-code
```

### Create Dagit Service

For the Dagit service (Dagster's web UI, which newer Dagster releases ship as `dagster-webserver`), create a Dockerfile named `Dockerfile_dagster`:

```Dockerfile
FROM python:3.10-slim

RUN pip install \
    dagster \
    dagster-graphql \
    dagit \
    dagster-postgres \
    dagster-docker \
    dagster-webserver

ENV DAGSTER_HOME=/opt/dagster/dagster_home/

RUN mkdir -p $DAGSTER_HOME

COPY ./dagster.yaml ./workspace.yaml $DAGSTER_HOME

WORKDIR $DAGSTER_HOME

EXPOSE 3000

ENTRYPOINT ["dagit", "-h", "0.0.0.0", "-p", "3000", "-w", "workspace.yaml"]
```

Initialize the Dagit service:

```sh
copilot svc init --name dagit --svc-type "Load Balanced Web Service" --dockerfile ./Dockerfile_dagster --port 3000
```

Configure environment variables in the service manifest:

```yaml
# copilot/dagit/manifest.yml
name: dagit
type: Load Balanced Web Service

image:
  build: ./Dockerfile_dagster
  port: 3000

variables:
  DAGSTER_POSTGRES_USER: postgres_user
  DAGSTER_POSTGRES_PASSWORD: postgres_password
  DAGSTER_POSTGRES_DB: postgres_db

# Any other necessary configuration for your service
```

Deploy the service:

```sh
copilot svc deploy --name dagit
```

### Create Dagster Daemon Service

For the Dagster daemon service, copy `Dockerfile_dagster` to `Dockerfile_daemon` and change the entrypoint:

```Dockerfile
# Dockerfile_daemon: same contents as Dockerfile_dagster, with this entrypoint instead
ENTRYPOINT ["dagster-daemon", "run"]
```

Initialize the Dagster daemon service:

```sh
copilot svc init --name dagster-daemon --svc-type "Backend Service" --dockerfile ./Dockerfile_daemon
```

Configure environment variables in the service manifest:

```yaml
# copilot/dagster-daemon/manifest.yml
name: dagster-daemon
type: Backend Service

image:
  # No port here: the daemon does not accept inbound connections
  build: ./Dockerfile_daemon

variables:
  DAGSTER_POSTGRES_USER: postgres_user
  DAGSTER_POSTGRES_PASSWORD: postgres_password
  DAGSTER_POSTGRES_DB: postgres_db

# Any other necessary configuration for your service
```

Deploy the service:

```sh
copilot svc deploy --name dagster-daemon
```

## Step 4: Configure Services

After adjusting environment variables or other manifest settings, redeploy each service so the changes take effect. The commands below target an environment named `test`:

```sh
copilot svc deploy --name postgres --env test
copilot svc deploy --name dagster-user-code --env test
copilot svc deploy --name dagit --env test
copilot svc deploy --name dagster-daemon --env test

```
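The four deploy commands can also be wrapped in a loop that stops at the first failure, so a broken dependency is not followed by further deployments. This is a sketch: `deploy_all` is a hypothetical helper, and the service order (database first, then user code, then the UI and daemon) is an assumption about what each service needs at startup.

```sh
# Deploy every service to one environment, stopping at the first failure.
# `deploy_all` is a hypothetical helper; service names match the ones created above.
deploy_all() {
  env_name="$1"
  for svc in postgres dagster-user-code dagit dagster-daemon; do
    copilot svc deploy --name "$svc" --env "$env_name" || {
      echo "deploy failed: $svc" >&2
      return 1
    }
  done
}

# Usage:
#   deploy_all test
```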

## Step 5: Clean Up Residual Docker Compose Setup

After migrating your setup to AWS Copilot, you can remove or archive your old Docker Compose files as they are no longer needed.

 

## How to define docker_image in the dagster.yaml file

### List ECR Repositories

```sh
aws ecr describe-repositories
```

```json
{
    "repositories": [
        {
            "repositoryArn": "arn:aws:ecr:ap-northeast-1:910188066190:repository/dagster001/dagster-user-code",
            "registryId": "910188066190",
            "repositoryName": "dagster001/dagster-user-code",
            "repositoryUri": "910188066190.dkr.ecr.ap-northeast-1.amazonaws.com/dagster001/dagster-user-code",
            "createdAt": "2024-07-22T23:18:44.917000+07:00",
            "imageTagMutability": "MUTABLE",
            "imageScanningConfiguration": {
                "scanOnPush": false
            },
            "encryptionConfiguration": {
                "encryptionType": "AES256"
            }
        },
        {
            "repositoryArn": "arn:aws:ecr:ap-northeast-1:910188066190:repository/dagster001/dagster-tutorial",
            "registryId": "910188066190",
            "repositoryName": "dagster001/dagster-tutorial",
            "repositoryUri": "910188066190.dkr.ecr.ap-northeast-1.amazonaws.com/dagster001/dagster-tutorial",
            "createdAt": "2024-07-23T11:30:31.263000+07:00",
            "imageTagMutability": "MUTABLE",
            "imageScanningConfiguration": {
                "scanOnPush": false
            },
            "encryptionConfiguration": {
                "encryptionType": "AES256"
            }
        }
    ]
}
```

### List Images in a Specific Repository

```sh
aws ecr list-images --repository-name 'dagster001/dagster-user-code'
```

```json
{
    "imageIds": [
        {
            "imageDigest": "sha256:d74e8578d19ede2b4ff05855901d21e73a05ee04dee910c6249876bb8c6913eb"
        },
        {
            "imageDigest": "sha256:f72b2cabccdef1d369fee9bae99965fd6061a99feb06da5160431595a9f42996"
        },
        {
            "imageDigest": "sha256:3caf66210b1c3431347227943a9223cd5ccd70f27358799ffefd5886f21b3916",
            "imageTag": "latest"
        },
        {
            "imageDigest": "sha256:da3b57f693da1e715158bde84c827b1e649311f869c638de1a822ab3f009df9b"
        },
        {
            "imageDigest": "sha256:03a115077c5bc2be429fcb18eba18e1f3985e1cb77eb6f817317da49944cf916"
        }
    ]
}
```

### Get Detailed Information about Images

```sh
aws ecr describe-images --repository-name dagster001/dagster-user-code
```

```json
{
    "imageDetails": [
        {
            "registryId": "910188066190",
            "repositoryName": "dagster001/dagster-user-code",
            "imageDigest": "sha256:d74e8578d19ede2b4ff05855901d21e73a05ee04dee910c6249876bb8c6913eb",
            "imageSizeInBytes": 110528528,
            "imagePushedAt": "2024-07-23T00:11:30+07:00",
            "imageManifestMediaType": "application/vnd.docker.distribution.manifest.v2+json",
            "artifactMediaType": "application/vnd.docker.container.image.v1+json",
            "lastRecordedPullTime": "2024-07-23T00:12:26.607000+07:00"
        }
    ]
}
```
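The repository URI and the tag or digest reported by these commands combine into a full image reference, which is presumably the value a `docker_image` setting expects. The sketch below shows how the pieces fit together. Both function names are hypothetical, and the `--query` expression is one way to pick the most recently pushed image, not something the original setup prescribes.

```sh
# Build a pullable image reference from an ECR repository URI plus a tag or digest.
# `ecr_image_ref` and `latest_digest` are hypothetical helper names.
ecr_image_ref() {
  repo_uri="$1"
  id="$2"
  case "$id" in
    sha256:*) echo "${repo_uri}@${id}" ;;  # digests are joined with "@"
    *)        echo "${repo_uri}:${id}" ;;  # tags are joined with ":"
  esac
}

# Print the digest of the most recently pushed image in a repository.
# Requires AWS credentials; sorting by imagePushedAt is an assumption about
# which image you want.
latest_digest() {
  aws ecr describe-images --repository-name "$1" \
    --query 'sort_by(imageDetails,&imagePushedAt)[-1].imageDigest' \
    --output text
}

# Usage:
#   ecr_image_ref "<repositoryUri>" latest
#   ecr_image_ref "<repositoryUri>" "$(latest_digest dagster001/dagster-user-code)"
```

Referencing an image by digest pins the exact build, while a tag such as `latest` follows whatever was pushed most recently, since the repositories above are `MUTABLE`.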

References

https://docs.dagster.io/deployment/guides/aws

https://ibrahimhkoyuncu.medium.com/dagster-complete-guide-to-deploy-multiple-data-pipelines-to-dagster-on-docker-environment-aae47028a4ce

https://docs.dagster.io/deployment/guides/docker