Deploying to S3 upon Git Push

October 23, 2013

With a simple post-receive hook and s3cmd, you can have Git deploy to S3 after each push to your remote repository. If you’re only interested in the hook code, I have provided it at the bottom of this post.

Setting up s3cmd

To get started, configure s3cmd on the user account that holds the bare repository, using the security credentials of either your AWS account or an IAM user. I highly recommend creating a dedicated IAM user for s3cmd, with a user policy that grants it full control of S3, and using its security credentials rather than handing s3cmd the unlimited permissions of your AWS account credentials.

$ s3cmd --configure

You will be prompted for the access key and secret key:

Access key and Secret key are your identifiers for Amazon S3
Access Key: ACCESSKEY
Secret Key: SECRETKEY

Next, you'll be prompted for a GPG encryption key and the path to GPG that will be used when transferring files to S3. You can leave these blank to not use GPG when transferring.

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:

Then, you'll be prompted if you want to use HTTPS when transferring files:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

If you said `no` to HTTPS, you will be able to provide a proxy. Leave the proxy name blank if you do not wish to provide a proxy.

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't conect to S3 directly
HTTP Proxy server name:
HTTP Proxy server port [0]:

You will then have a chance to review what you have provided and to test access with the supplied credentials.

New settings:
  Access Key: ACCESSKEY
  Secret Key: SECRETKEY
  Encryption password:
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n]

If all goes well, you will be provided with the following:

Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Encryption will also be tested. Finally, you will be prompted whether to save the configuration.

Save settings? [y/N] Y
Configuration saved to '/home/git/.s3cfg'

Setting up the hook

Navigate into your bare Git repository on the remote server (a bare repository has no working tree of its own). Then, open up hooks/post-receive in your favorite text editor. Let’s start with the following:

#!/bin/sh

S3_BUCKET=yourbucket
TEMP_DEPLOY_DIR=/tmp/$S3_BUCKET/

These are the variables we will be working with in the hook. You’ll want to set S3_BUCKET to the actual name of your S3 bucket. For now, we’ll write to a directory in /tmp/ named after the bucket, but you can change this if necessary.

We will want to ensure that the temporary directory is clean and that no Git environment variables will conflict, so we’ll add the following to the hook:

# Ensure that the temporary directory is clean and unset potential conflicting
# environment variables
rm -rf "$TEMP_DEPLOY_DIR"
unset GIT_DIR
unset GIT_WORK_TREE
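
Since the hook runs rm -rf on a path built from S3_BUCKET, an empty or mistyped bucket name would point it at /tmp/ itself. Here is a minimal sketch of a guard for that. The helper name deploy_dir_for is my own invention for illustration, not something Git or s3cmd provides.

```shell
# Hypothetical helper: print the deploy directory for a bucket name, or fail
# if the name is empty, is "." or "..", or contains a slash, so that a later
# rm -rf can never target /tmp itself or a path outside of it.
deploy_dir_for() {
    case $1 in
        ''|.|..|*/*)
            echo "refusing unsafe bucket name: '$1'" >&2
            return 1
            ;;
    esac
    printf '%s\n' "/tmp/$1/"
}

# Possible use in the hook (left commented out here):
# TEMP_DEPLOY_DIR=$(deploy_dir_for "$S3_BUCKET") || exit 1
```

With this in place, the hook would stop before deleting anything if S3_BUCKET were left blank.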

Next, we populate the working tree. If you have no submodules in your repository, we will use the following:

# Create a working tree with a bare repo that does not have submodules
mkdir -p "$TEMP_DEPLOY_DIR"
export GIT_DIR="$(pwd)"
export GIT_WORK_TREE="$TEMP_DEPLOY_DIR"
git checkout -f
cd "$TEMP_DEPLOY_DIR"

If you do have submodules, dealing with them using the above method is problematic. I found the best solution is to make an entire clone of the repository in order to get the submodules to initialize and update properly:

# Create a working tree with a bare repo that has submodules
git clone "$(pwd)" "$TEMP_DEPLOY_DIR"
cd "$TEMP_DEPLOY_DIR"
git submodule update --init --recursive

We can now sync the working tree with S3:

# Sync with S3
s3cmd sync --delete-removed --acl-public --exclude '.git/*' ./ "s3://$S3_BUCKET/"
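
Because --delete-removed deletes remote files that are missing locally, it can be worth previewing a sync before running it for real. Here is a minimal sketch, assuming a wrapper I am calling sync_to_s3 and an S3CMD variable that exists only so the command can be swapped out for inspection. Neither is part of s3cmd, but --dry-run is s3cmd’s own flag.

```shell
# Hypothetical wrapper around the sync command used in this post.
#   $1 = local source directory, $2 = bucket name, remaining args = extra flags
# Setting S3CMD=echo prints the command line instead of running s3cmd.
sync_to_s3() {
    src=$1
    bucket=$2
    shift 2
    "${S3CMD:-s3cmd}" sync --delete-removed --acl-public \
        --exclude '.git/*' "$@" "$src" "s3://$bucket/"
}

# Possible use (left commented out here):
# sync_to_s3 ./ "$S3_BUCKET" --dry-run   # preview only
# sync_to_s3 ./ "$S3_BUCKET"             # real sync
```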

If anything should be preprocessed before syncing with S3, say a Jekyll site, we can build the site and sync only the _site directory:

# Build and sync
jekyll build
s3cmd sync --delete-removed --acl-public --exclude '.git/*' _site/ "s3://$S3_BUCKET/"

Make sure that anything run from the hook (s3cmd, jekyll, and so on) is installed and available to the repository’s user on the remote server; otherwise, the hook will fail.
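
One way to make that failure obvious is to check for the tools at the top of the hook. This is only a sketch. The function name require_cmds is hypothetical, and the list of commands to check depends on what your hook actually runs.

```shell
# Hypothetical helper: fail with a clear message if any named command is
# missing from the PATH of the user running the hook.
require_cmds() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "post-receive: required command not found: $cmd" >&2
            return 1
        fi
    done
}

# Possible use near the top of the hook (left commented out here):
# require_cmds git s3cmd || exit 1
```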

Finally, we clean up the temporary directory we were using to sync with S3.

# Clean up
cd ..
rm -rf "$TEMP_DEPLOY_DIR"
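
If a step fails partway through, the cleanup lines at the end of the hook are never reached. An alternative I could imagine, shown here as a sketch rather than what the hook in this post does, is to let the shell clean up through an EXIT trap. The with_temp_dir name is made up, and mktemp -d is assumed to be available (it is not strictly POSIX).

```shell
# Hypothetical wrapper: run a command inside a fresh temporary directory.
# The body is a subshell, so the cd and the trap stay local to it, and the
# directory is removed when the subshell exits, whether or not the command
# succeeded.
with_temp_dir() (
    dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$dir"' EXIT
    cd "$dir" || exit 1
    "$@"
    exit $?
)

# Possible use (left commented out here):
# with_temp_dir sh -c 'pwd; ls -la'
```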

That’s all there is to it. Git will now deploy to your S3 bucket each time you push to your remote repository.
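
Note that the hook as written runs on every push, whichever branch was updated. If you only want to deploy when one particular branch is pushed, the hook can inspect what Git sends it on standard input, one line per updated ref in the form "old-sha new-sha refname". Here is a minimal sketch, assuming a helper I am calling pushed_branch and a deploy branch named master. Both are placeholders.

```shell
# Hypothetical helper: read post-receive input from stdin and succeed only
# if one of the updated refs is the given branch.
pushed_branch() {
    target="refs/heads/$1"
    found=1
    while read -r oldrev newrev refname; do
        if [ "$refname" = "$target" ]; then
            found=0
        fi
    done
    return "$found"
}

# Possible use near the top of the hook (left commented out here):
# pushed_branch master || exit 0
```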

Example post-receive hook

Here is the complete post-receive hook code.

#!/bin/sh
# post-receive hook that syncs with S3 upon a push

S3_BUCKET=yourbucket
TEMP_DEPLOY_DIR=/tmp/$S3_BUCKET/

# Ensure that the temporary directory is clean and unset potential conflicting
# environment variables
rm -rf "$TEMP_DEPLOY_DIR"
unset GIT_DIR
unset GIT_WORK_TREE

# Create a working tree with a bare repo that does not have submodules
mkdir -p "$TEMP_DEPLOY_DIR"
export GIT_DIR="$(pwd)"
export GIT_WORK_TREE="$TEMP_DEPLOY_DIR"
git checkout -f
cd "$TEMP_DEPLOY_DIR"

# If the repo has submodules, comment out or remove the above and uncomment the below:
#
# git clone "$(pwd)" "$TEMP_DEPLOY_DIR"
# cd "$TEMP_DEPLOY_DIR"
# git submodule update --init --recursive

# Sync with S3
s3cmd sync --delete-removed --acl-public --exclude '.git/*' ./ "s3://$S3_BUCKET/"

# If you use Jekyll, comment out or remove the above line and uncomment the below:
#
# jekyll build
# s3cmd sync --delete-removed --acl-public --exclude '.git/*' _site/ "s3://$S3_BUCKET/"
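
One last thing to check after saving the hook: Git only runs hooks that are executable, so run chmod +x hooks/post-receive once the file is in place. The sketch below bundles that check with a syntax check (sh -n parses a script without executing it). The check_hook name is hypothetical.

```shell
# Hypothetical helper: verify that a hook file is executable and that it
# parses as valid shell, without running it.
check_hook() {
    hook=$1
    if [ ! -x "$hook" ]; then
        echo "not executable: $hook (try: chmod +x $hook)" >&2
        return 1
    fi
    sh -n "$hook"
}

# Possible use from inside the bare repository (left commented out here):
# chmod +x hooks/post-receive
# check_hook hooks/post-receive
```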