Architecting a Raspberry Pi air quality monitor with AWS & IoT Core (V2)

Kimberly McManus
Jan 3, 2022
Figure 1: Architecture of approach (Image by author)

Dec 2021: Hi Friends! This is an updated version of this post with a more sustainable architecture. The main update is swapping in PostgreSQL/Lambda/S3 and swapping out DynamoDB, which I found costly to read from and non-performant.

I live in Northern California and it is wildfire season. Increasingly, this means poor air quality even in cities. With COVID (don't go indoors!) and wildfires (don't go outdoors!), there was really nowhere to go besides my own apartment. That, of course, meant it was time for a new project. If my goal were just to get an air quality monitor, I probably would have gone with PurpleAir: it doesn't require coding, comes with a social component, and allows API access for maximum flexibility. I chose the approach below because I wanted to use a Raspberry Pi, wanted the flexibility to add more sensors in the future, and plan to build up my sensor device collection.

Some features of the approach I took below:

  • Data can be contributed to this open source community.
  • Sensors: BME280 (tracks temperature, humidity, & pressure), PMS5003 (laser particle counter that tracks particulate matter), LTR-559 (light sensor), and a MEMS microphone (noise sensor)

Things I knew how to do: Write Python

Things I didn’t know how to do: Everything else described below

Cost: Raspberry Pi Zero W + Enviro+, plus a soldering iron if you don't have one; everything on AWS was free.

Below are my steps to create a Raspberry Pi driven air quality monitor.

Figure 2: Enviro+ and particulate matter sensor (Image by author)

Raspberry Pi and enviro+

This is my first Raspberry Pi adventure, and now I understand why more than 30 million have already been sold worldwide. I purchased a Raspberry Pi Zero W, which is essentially a 10 dollar computer. It has its own operating system, Raspbian, and connects to Wi-Fi. (I've discovered it is not, however, fast enough to load YouTube.) A Raspberry Pi is basically a circuit board, and one can choose what to attach to it. I got a starter pack that contained some extra cords, which ended up coming in handy.

Individual sensors (humidity, temperature, etc.) are sold for a few dollars. However, since I'm a newbie, I went with the more inclusive Enviro+ attachment. It worked out of the box and comes with multiple sensors: temperature, pressure, humidity, light, etc. To monitor air quality, there is an additional particulate matter attachment that records PM1, PM2.5, and PM10. The Air Quality Index (AQI) is derived from these measurements. The Enviro+ also has its own Python library, so it is simple to extract measurements from the sensors. After following this and this (minus the outdoor construction part), I successfully graphed sensor readings on the Raspberry Pi. Overall, the tutorials worked well. The most challenging part for me (a software, not hardware, person) was the soldering, but it is possible to buy a pre-soldered Raspberry Pi. I'd recommend that approach if you don't have a relative with a garage full of metal- and wood-working equipment.
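To give a feel for how the AQI relates to the raw PM2.5 readings, here is a minimal sketch using the US EPA PM2.5 breakpoint table and linear interpolation. The function name and structure are my own illustration, not part of the Enviro+ library:

```python
# Sketch: convert a 24-hour average PM2.5 reading (µg/m³) to a US AQI value.
# (The official EPA calculation first truncates the concentration to 0.1 µg/m³,
# which is why the breakpoint ranges below have 0.1-wide gaps between them.)

PM25_BREAKPOINTS = [
    # (C_low, C_high, I_low, I_high)
    (0.0, 12.0, 0, 50),        # Good
    (12.1, 35.4, 51, 100),     # Moderate
    (35.5, 55.4, 101, 150),    # Unhealthy for sensitive groups
    (55.5, 150.4, 151, 200),   # Unhealthy
    (150.5, 250.4, 201, 300),  # Very unhealthy
    (250.5, 350.4, 301, 400),  # Hazardous
    (350.5, 500.4, 401, 500),  # Hazardous
]

def aqi_from_pm25(pm25: float) -> int:
    """Linearly interpolate the AQI within the matching breakpoint range."""
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= pm25 <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (pm25 - c_lo) + i_lo)
    return 500  # off the top of the scale

print(aqi_from_pm25(9.0))   # a "Good" reading
print(aqi_from_pm25(35.0))  # a "Moderate" reading
```

A PM2.5 of 9.0 µg/m³ lands comfortably in the "Good" band, while 35.0 µg/m³ sits near the top of "Moderate".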

Graphing sensor readings on the Raspberry Pi itself was a great first step, but I wanted to monitor air quality remotely. My eventual goals were to 1. build a dashboard with metrics and analyses I can refer to, and 2. leverage this data to turn various other appliances on or off (not discussed in this post).

AWS: Sending data to a database

The basic steps:

1. Run Python code on the Raspberry Pi to get sensor measurements

2. From the Raspberry Pi, send the sensor measurements via an MQTT message to IoT Core

3. From IoT Core, create two ‘rules’: one to send the data to S3 via Kinesis Firehose, and another to send data to PostgreSQL via Lambda
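Steps 1 and 2 can be sketched roughly as follows. The topic name, field names, and helper functions are my own illustrative choices (the author's actual script is linked below), and the publish function assumes the AWS IoT Device SDK for Python (AWSIoTPythonSDK) plus the certificate files downloaded during device registration:

```python
# Sketch: package sensor readings as JSON and publish them to AWS IoT Core
# over MQTT. Topic, field names, endpoint, and file paths are placeholders.
import json
import time

def build_payload(temperature, humidity, pm2_5, timestamp=None):
    """Bundle one round of sensor readings into a JSON string for MQTT."""
    return json.dumps({
        "timestamp": timestamp if timestamp is not None else int(time.time()),
        "temperature_c": temperature,
        "humidity_pct": humidity,
        "pm2_5": pm2_5,
    })

def publish_reading(payload: str) -> None:
    """Publish one reading; assumes AWSIoTPythonSDK and downloaded certs."""
    from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

    client = AWSIoTMQTTClient("raspberrypi-air-quality")
    client.configureEndpoint("YOUR-ENDPOINT.iot.us-west-2.amazonaws.com", 8883)
    client.configureCredentials("root-CA.crt", "private.pem.key",
                                "certificate.pem.crt")
    client.connect()
    client.publish("sensors/airquality", payload, 1)  # QoS 1
```

On the Pi, a loop would read the Enviro+ sensors, call `build_payload`, and hand the result to `publish_reading` every sampling interval.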

Register a device

I decided to use AWS IoT Core, as it has great (and free) tools for connecting devices to the internet and to each other. I registered a ‘thing’ in IoT Core, keeping all the defaults. At the end, you download the certificates and policy and put them on the ‘thing’. I just scp’d the files to the Raspberry Pi.

Figure 3: Creating a ‘thing’ on AWS IoT core (Image by author from AWS website)

Data storage

In this updated approach, I use Amazon Kinesis Firehose to send data to S3 for future use, following this method. The basic step is to create a ‘rule’ within IoT Core that sends incoming messages (individually, or in batches buffered for up to 5 minutes) to an S3 bucket for storage. I chose S3 as one of my storage methods because it is a simple, low-cost form of data storage.
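This kind of rule can also be defined in code. The sketch below builds the topic-rule payload for an IoT-to-Firehose rule; the rule name, topic, stream name, and role ARN are placeholders, and actually creating the rule assumes boto3 with an IAM role that lets IoT write to Firehose:

```python
# Sketch: define an IoT Core topic rule that forwards every message on a topic
# to a Kinesis Firehose delivery stream (which in turn batches data into S3).
# All names/ARNs are illustrative placeholders.

def build_firehose_rule_payload(topic: str, stream_name: str,
                                role_arn: str) -> dict:
    """Payload for iot.create_topic_rule: select everything from the topic."""
    return {
        "sql": f"SELECT * FROM '{topic}'",
        "actions": [{
            "firehose": {
                "roleArn": role_arn,
                "deliveryStreamName": stream_name,
                "separator": "\n",  # newline-delimited records in S3
            }
        }],
    }

def create_rule(payload: dict) -> None:
    """Register the rule; assumes boto3 is installed and credentials are set."""
    import boto3
    boto3.client("iot").create_topic_rule(
        ruleName="air_quality_to_s3", topicRulePayload=payload
    )
```

The same payload can of course be entered through the IoT Core console UI instead, which is what the linked method walks through.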

I also send data to PostgreSQL, which is the primary data store I use for analysis. I chose PostgreSQL over newer time-series or NoSQL databases primarily because it is simple and I don’t expect to have massive amounts of data to query. For those new to PostgreSQL + AWS, this is a great tutorial. The steps are: create a PostgreSQL RDS instance on AWS, create a database and table, create a Lambda function that inserts new messages into PostgreSQL, and create a rule in IoT Core that invokes the Lambda function. The Lambda function creation was by far the most challenging part due to many non-obvious errors, and I’ve included my functioning Lambda function code below.
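For orientation, here is a stripped-down sketch of what such a Lambda handler can look like. This is my illustration, not the author's code (which is linked below): the table and column names are made up, and the handler assumes psycopg2 is bundled with the Lambda deployment:

```python
# Sketch of an IoT -> PostgreSQL Lambda. The IoT rule invokes the handler with
# the MQTT message as the event; the handler inserts one row per message.
# Table/column names are illustrative; psycopg2 must ship with the Lambda.

def build_insert(event: dict):
    """Turn one MQTT event into a parameterized (SQL, params) INSERT."""
    sql = ("INSERT INTO readings (ts, temperature_c, humidity_pct, pm2_5) "
           "VALUES (%s, %s, %s, %s)")
    params = (event["timestamp"], event["temperature_c"],
              event["humidity_pct"], event["pm2_5"])
    return sql, params

def lambda_handler(event, context):
    import os
    import psycopg2  # packaged in the deployment zip or a Lambda layer
    conn = psycopg2.connect(
        host=os.environ["DB_HOST"], dbname=os.environ["DB_NAME"],
        user=os.environ["DB_USER"], password=os.environ["DB_PASS"],
    )
    with conn, conn.cursor() as cur:  # commits on success, rolls back on error
        cur.execute(*build_insert(event))
    conn.close()
    return {"statusCode": 200}
```

Using a parameterized INSERT (rather than string formatting) keeps the handler safe against malformed payloads, and pulling connection details from environment variables keeps credentials out of the code.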

Code running on Raspberry Pi

https://github.com/kimberlymcm/raspberrypi

I set this code up to run automatically on start-up:

# Type into terminal

crontab -e

# Added lines

@reboot sudo python3 /path_to_script/raspberrypi/src/read_and_send_to_aws.py &

Code for lambda function

https://github.com/kimberlymcm/flaskapp/tree/master/lambda

AWS: Displaying data on a web app or website

Code for this part (running first locally & then on AWS): https://github.com/kimberlymcm/flaskapp

Great, so now I have my Raspberry Pi up and running, with sensor readings stored in my AWS PostgreSQL database and in S3. The next step is to read from PostgreSQL and create a dashboard. For this, I chose Flask + Plotly + AWS Elastic Beanstalk. Flask is a super simple Python micro web framework. Of the two most popular Python web frameworks (the other being Django), it is the more lightweight and makes iterating very fast. Plotly makes it easy to create professional-looking Python-based dashboards and graphs. AWS Elastic Beanstalk is a service for deploying web applications that hides most of the complicated parts. I found it to be the most finicky part of the whole project.
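Since a Plotly figure is ultimately just JSON, the data-shaping step can be sketched without any dashboard framework at all. Here the rows stand in for the output of a PostgreSQL SELECT; the function name and row layout are my own illustration, and a Flask route would simply return this dict as JSON for the page to render:

```python
# Sketch: shape rows from the readings table into a Plotly-style figure dict.
# Row layout (timestamp, pm2_5) and all names here are illustrative.

def pm25_figure(rows):
    """rows: iterable of (timestamp, pm2_5) tuples, e.g. from a SELECT."""
    timestamps = [r[0] for r in rows]
    values = [r[1] for r in rows]
    return {
        "data": [{
            "type": "scatter",
            "mode": "lines",
            "x": timestamps,
            "y": values,
        }],
        "layout": {
            "title": "PM2.5 over time",
            "yaxis": {"title": "µg/m³"},
        },
    }
```

Keeping the query and the figure-building separate like this also makes the dashboard easy to test without a live database.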

First, the goal is to get a Flask app running on my own computer. I’m new to Flask, so I followed some tutorials (like this) to get the hang of how it works. Then I edited the tutorial code for my use case (see my git repo above). It was pretty quick to get this up and running; I’ve definitely jumped on the Flask bandwagon.

The next step, uploading the web app to Elastic Beanstalk, took forever. The crux of the problem appears to be that the Elastic Beanstalk CLI had some inconsistencies with the framework I was using, so I eventually just uploaded the app through the web console UI, which worked. I also turned off the automated load balancer because it was sending many requests to check the table health, which was costing dollars :). The last steps were configuring the website to use HTTPS (like this) and routing it to a subdomain of my personal website.

MVP is complete! See at https://airquality.kimberlymcmanus.com

This blog was more of a general overview of how I went about the project; happy to provide more specifics if helpful.

Next steps: Send myself notifications and automatically turn on my air purifier. Until next time…

Figure 4: MVP air quality dashboard (Image by author)
