The innards of pisoc.net

Posted: August 18, 2018 | Author: Tom Mitchell
Repository: https://github.com/pisoc/pisoc.net
python hugo flask github

TL;DR

Our Flask app listens for webhooks triggered by pushes to the repository. When it receives a POST request to the “webhook endpoint”, it pulls the latest versions of the repositories for the website (main and theme submodule) and uses Hugo to rebuild the static portion of the site.

In some more detail

Python and Flask

Flask is a web development “microframework” written in Python. We’re using it as the “bridge” between the two pieces of functionality we need for this project: serving web content, and executing commands on the host machine. There are probably “more proper” ways to achieve the same result (some sort of task queue running in the background), but this approach is simpler.

Serving web content

The first of these is achieved with a modification of flask.helpers.send_from_directory. The original function takes a directory and a file path as arguments and, if the path is “below” the directory, returns the file to send to the user.

This wasn’t quite fit for purpose because of the way Hugo generates content. Certain resources aren’t a file - they are a directory containing a file called index.html. This causes a problem for send_from_directory because it only sends single files, not directories, and will fail if you don’t give it a file.

For example, the structure for the news directory:

news/
├── election-results-2017
│   └── index.html
├── first-gameathon-anounce-2017
│   └── index.html
├── first-gameathon-update-2017
│   └── index.html
├── first-linux-install-party-2016
│   └── index.html
├── first-linux-install-party-2017
│   └── index.html
├── index.html
├── index.xml
├── page
│   ├── 1
│   │   └── index.html
│   └── 2
│       └── index.html
├── pentesting-announcement
│   └── index.html
├── trip-to-cambridge
│   └── index.html
└── welcome-freshers-2017
    └── index.html

However, the fix for this is fairly simple - instead of failing if the requested resource isn’t a file, we perform a further check to see if the resource the user requested is a directory. If it is, we assume there’s an index.html inside the directory and try to send that to the user (failing if our assumption was wrong).
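The lookup described above can be sketched with the standard library alone. This is a minimal illustration and not the site's actual code: the function name resolve_static_path is hypothetical, and it only works out which file to send, leaving the sending itself to Flask.

```python
import os


def resolve_static_path(directory, requested):
    """Return the path of the file to serve, or None if nothing matches.

    Hypothetical helper for illustration; not taken from the site's app.py.
    """
    root = os.path.realpath(directory)
    target = os.path.realpath(os.path.join(root, requested))

    # Refuse anything that is not "below" the served directory,
    # such as "../secret.txt" or an absolute path.
    if os.path.commonpath([root, target]) != root:
        return None

    # A plain file is served as-is.
    if os.path.isfile(target):
        return target

    # A directory is assumed to contain an index.html, as Hugo generates.
    if os.path.isdir(target):
        index = os.path.join(target, "index.html")
        if os.path.isfile(index):
            return index

    # Neither a file nor a directory with an index.html: the caller fails.
    return None
```

With the news directory shown earlier, a request for news/trip-to-cambridge would resolve to news/trip-to-cambridge/index.html, and a request for news/index.xml would resolve to that file directly.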

Using Python to execute shell commands

In app.py, we define a function called rebuild. Flask registers this as the view function for the “rebuild endpoint”. Essentially, when a request is sent to the URL associated with that endpoint, rebuild is run with the context of the request we received. Once we get that far, two checks are performed. We ensure that the hook came from GitHub, and that the push was to the master branch (this is considered the “live” production copy of the codebase).
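As a rough sketch of what two such checks can look like: this writeup does not say how the app confirms that a hook came from GitHub, so the example below assumes the common approach of verifying the HMAC signature GitHub sends in the X-Hub-Signature-256 header when a webhook secret is configured. The function names are hypothetical and the real rebuild function may do this differently.

```python
import hashlib
import hmac
import json


def is_from_github(secret, body, signature_header):
    """Check the HMAC-SHA256 signature of the raw request body.

    Assumption: a webhook secret is configured and GitHub sends
    "sha256=<hex digest>" in the X-Hub-Signature-256 header.
    """
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode()
    received = (signature_header or "").encode()
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, received)


def is_push_to_master(body):
    """Check that the push event payload refers to the master branch."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("ref") == "refs/heads/master"


def should_rebuild(secret, body, signature_header):
    """Both checks must pass before anything is run on the server."""
    return is_from_github(secret, body, signature_header) and is_push_to_master(body)
```

In a Flask view, body would be the raw request data and signature_header the value read from the request headers.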

If these two conditions are met, we use Python’s subprocess module to run the following on the server the site is hosted on:

$ git submodule update --recursive --remote
$ git pull --recurse-submodules
$ hugo --cleanDestinationDir -s hugo/

This updates the “commit pointers” for all submodules (in our case, we only have one), checks out the appropriate commits from GitHub, and uses Hugo to rebuild our static content.
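Running those three commands from Python might look roughly like the following. This is a sketch assuming the standard library's subprocess.run; the names REBUILD_COMMANDS and run_commands, and the runner parameter (there only so the function can be exercised without git or Hugo installed), are hypothetical.

```python
import subprocess

# The three commands from above, one argument list per command,
# so no shell is needed to parse them.
REBUILD_COMMANDS = [
    ["git", "submodule", "update", "--recursive", "--remote"],
    ["git", "pull", "--recurse-submodules"],
    ["hugo", "--cleanDestinationDir", "-s", "hugo/"],
]


def run_commands(commands, cwd=None, runner=subprocess.run):
    """Run each command in order, stopping at the first failure.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so Hugo never rebuilds after a failed pull.
    """
    for command in commands:
        runner(command, cwd=cwd, check=True)
```

Calling run_commands(REBUILD_COMMANDS, cwd=...) with the path to the repository checkout would then perform the update and rebuild.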

Hugo

All of the “content” for the site (HTML, etc.) is generated by Hugo. Hugo is a static site generator that uses markdown files (the content of the site) and a theme (how to style that markdown) to generate static webpages. This static content isn’t stored on GitHub: the repository already holds everything we need to recreate it, so keeping the generated files anywhere other than the server would be redundant.

All of the site’s markdown is under the hugo/content/ directory. News posts are under hugo/content/news, and the project writeups (like this article) are under hugo/content/projects. Likewise, the theme we get from the pisoc-theme repository is stored under hugo/theme.