A Curious Setup

I've been working through Rob Conery's excellent book A Curious Moon. It's an interesting take on the technical book genre. Instead of dryly telling you everything about some topic in a structured way, Conery creates a narrative around a young data scientist named Dee Yan, who is tasked with sifting through tons of data from the Cassini mission, focused on Saturn's moon Enceladus. Along the way, she (and the reader) learns PostgreSQL.

The book recommends installing Postgres, obviously, but more and more I'm hesitant to install software directly on my PC. Instead, my preference is to spin up a Docker container purpose-built for the application. For A Curious Moon, here's what that looks like:

Docker Compose

The whole point of the book is to learn the Postgres database, but containers are ephemeral: anything written inside one is lost when the container is removed. To get around this, we can use a Docker volume to persist the data. Here's my docker-compose.yml file:

version: '3.1'

services:
  db:
    image: postgres:latest
    container_name: red1
    restart: always
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_USER: dee

    volumes:
      - data:/var/lib/postgresql/data
    ports:
      - 5432:5432

volumes:
  data:

This is pretty straightforward. We're using the latest postgres image from Docker Hub, and we set environment variables for the username and password. Docker creates a named volume, data, and mounts it at /var/lib/postgresql/data, which is where Postgres keeps its database files, so the data survives the container being recreated. Just run this using docker-compose up.
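
To convince yourself the volume is doing its job, a quick sanity check might look like this (the container name red1 and credentials come from the compose file above; the official image creates a default database named after the user):

# Start the container in the background
docker-compose up -d

# Create a table and add a row
docker exec -it red1 psql -U dee -c "CREATE TABLE t (id int); INSERT INTO t VALUES (1);"

# Destroy the container (docker-compose down keeps named volumes)
docker-compose down
docker-compose up -d

# The row is still there, because it lives in the data volume
docker exec -it red1 psql -U dee -c "SELECT * FROM t;"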

Makefile

Throughout the book, Dee uses a Makefile to set up her database. Using a container running Postgres rather than a local install requires a few changes to the Makefile:

HOST=<The IP address of the machine running docker-compose>
PORT=5432
DB=enceladus
USER=dee
PASSWORD=password
BUILD=${CURDIR}/build.sql
SCRIPTS=${CURDIR}/scripts
CSV='${CURDIR}/master_plan.csv'
MASTER=$(SCRIPTS)/import.sql
NORMALIZE=$(SCRIPTS)/normalize.sql

all: normalize
    PGPASSWORD=$(PASSWORD) psql -U $(USER) -h $(HOST) -p $(PORT) -d $(DB) -f $(BUILD)

master:
    @cat $(MASTER) >> $(BUILD)

import: master
    @echo "\COPY import.master_plan FROM $(CSV) WITH DELIMITER ',' HEADER CSV;" >> $(BUILD)

normalize: import
    @cat $(NORMALIZE) >> $(BUILD)

clean:
    @rm -rf $(BUILD)

The big change here is that we need to specify a host IP address, port, user, and password now that the database is remote. Just make sure the username, password, and port match what was set above in docker-compose.yml. The other small-but-important change is that the COPY command has been replaced with \COPY (note the backslash). COPY asks the Postgres server itself to open the file, which means the path would have to exist inside the container; \COPY is a psql meta-command that reads the file on the client machine and streams it to the remote database. Since we don't easily have access to the container's filesystem, this is the way to do it.
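
To make the difference concrete, here's a sketch of both forms (the server-side path is hypothetical):

-- COPY runs on the server: Postgres itself opens the file,
-- so this path would have to exist inside the container
COPY import.master_plan FROM '/var/lib/postgresql/master_plan.csv' WITH DELIMITER ',' HEADER CSV;

-- \COPY is a psql meta-command: psql opens the file on the
-- client machine and streams it to the server over the connection
\COPY import.master_plan FROM 'master_plan.csv' WITH DELIMITER ',' HEADER CSV;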

Conclusion

That's it! A few simple changes that let me work the way I prefer. One thing I didn't mention: I'm running the Docker container on my Windows host, but interacting with it through Ubuntu running on Windows Subsystem for Linux. This way I can easily use Linux-specific tools like make without needing to install MinGW. Check out A Curious Moon; I'm really enjoying it.
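
For anyone trying the same setup, the loop from the Ubuntu side ends up being something like this (assuming psql and createdb from the postgresql-client package, and the host IP you put in the Makefile):

# One-time setup: create the database the book uses
PGPASSWORD=password createdb -h <host IP> -p 5432 -U dee enceladus

# Rebuild build.sql from scratch and run it against the container
make clean && make

# Or poke at the database directly
PGPASSWORD=password psql -h <host IP> -p 5432 -U dee -d enceladus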

#docker #make #postgres