
Dying house plants? AWS IoT Services to the rescue!
In the face of unrelenting house plant death there is only one option: to data the hell out of it! We apply cutting-edge AWS IoT services to extract an unnecessary quantity of data from an unfortunate succulent. Just how far over-the-top can we go? Tune in to find out!
Playing around with AWS’ IoT stack
So it happened again; yet another of my house plants was knocking on death’s door. This is sadly not an uncommon occurrence for house plants in my apartment. They seem to do fine for a while, then decide to pack it in and keel over.
This time, however, the ailing of yet another plant and purchase of its possibly ill-fated successor coincided with a desire to play around with AWS’ IoT stack. I’d been looking for a little project that would make use of some Raspberry Pi sensor kits I had lying around, so…
How closely could I monitor my new house plant? Could excessive levels of surveillance bring an end to the unceasing cycle of senseless plant death? Could I have totally over-datafied this thing?
To answer these questions, and maybe more, we must first engineer an absurd solution to a meaningless problem!
IoT hardware with Raspberry Pi and friends
The hardware is often the most time-consuming aspect of IoT systems. Deploying, hooking up and maintaining large fleets of tiny little IoT devices necessitates a lot of leg work.
In my case I only had one device to build, and I already had a collection of the ever-useful Raspberry Pis, an IR Pi camera, and Grove sensor hardware that had accumulated over the years, so I figured this would be a breeze; all done in no time, super easy-peasy… right???
Of course not! The code for the GrovePi+ hasn’t been updated in years and, instead of installing via a nice PyPI package or the like, it relies on an imperative Bash script… which didn’t want to play nice with the latest version of the Raspberry Pi OS. Not wanting to run an outdated OS, or mess around with camera drivers later, I hacked my way through dependency hell for the next few days, tracing and working around each failed dependency install from the script.
I never did get that script to run to completion. However, after a couple of days I’d gotten it far enough that I could read sensor data via the GrovePi+ successfully with the “grovepi.py” file included directly in my project directory. So I declared victory and moved on.
Side note: why are we still using imperative deployment scripts in 2022? A declarative approach using Infrastructure-as-Code (IaC) and standard package or container managers would seem like a much more robust approach to software deployment.
So what sensors should I hook up?
Common sense might advise tracking a couple of obvious parameters, like soil moisture level, sunlight intensity and temperature, but that’s just not disco enough for an absurdly over-engineered solution!
So I attached three moisture sensors to the plant, measuring soil moisture levels at the top, middle and bottom of its pot along with a sunlight sensor providing separate readings for visible, UV and IR light. To ensure the sunlight sensor wouldn’t be blocked by the plant, and to give a reasonable approximation of sunlight levels experienced by a horizontal leaf, I built a small rig out of scrap furniture parts (IKEA hacking anyone?) to hold the sunlight sensor over the plant. This also provided a good place to position the IR camera.
I wanted to experiment with an IoT solution that involved large file capture, like images, and thought that an image dataset of my own might come in handy for playing with computer vision ML things in the future. IR seemed like a good option for getting consistent images night or day. For this, I used an “F” type Pi camera with a pair of attached IR lights governed by built-in photoresistors to provide consistent illumination and, again, keep the captured data as consistent as possible.
The camera was by far the easiest part of the Raspberry Pi setup. Once you get up to speed on the switch over to libcamera in the Bullseye version of Raspberry Pi OS, it’s all very straightforward. The only downside of the transition is the lack of a supporting Python module, resulting in the need to use an “os.system” call to run the libcamera CLI tools externally.
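For the curious, here is a minimal sketch of what shelling out to the libcamera CLI can look like. It swaps the “os.system” call for “subprocess.run”, which raises an error if the capture fails. The function names, the “/tmp” directory and the two-second timeout are my own illustrative assumptions, not the code from the project.

```python
import subprocess
from datetime import datetime, timezone


def build_capture_command(output_path, timeout_ms=2000):
    """Return the libcamera-still argument list for one PNG still capture."""
    return [
        "libcamera-still",
        "--nopreview",                 # headless Pi, so skip the preview window
        "--timeout", str(timeout_ms),  # time in ms to let exposure settle
        "--encoding", "png",
        "--output", output_path,
    ]


def capture_image(directory="/tmp"):
    """Capture a still named after the current UTC time and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H%M%S")
    path = f"{directory}/{stamp}.png"
    # check=True raises CalledProcessError if the CLI exits non-zero
    subprocess.run(build_capture_command(path), check=True)
    return path
```

Keeping the command construction separate from the call that runs it makes the arguments easy to inspect without a camera attached.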
Finally, I added some guides to the base of my rig to keep the plant in a set position relative to the camera, for consistency. If you can engineer consistency into your data at the point of generation/capture, you’ll save yourself a ton of hassle down the line!
With this, my plant surveillance device was complete!
But wait, there’s more (data)!
Now I can capture seven different parameters about my hapless home vegetation, but surely we can ramp up our solution’s absurdity level even more? After all, my new Raspberry Pi plant surveillance rig isn’t the only IoT device in the room…
In the same room as the plant I have an Awair, a WiFi-connected air quality monitoring device, and Awair recently added a local network API as a beta feature. The API actually provides even more detailed information than the Awair device or app display, although the beta maturity of the feature, unfortunately, shines through: it will randomly return empty responses to GETs with an HTTP 200 response code. This renders the typical approach to detecting HTTP API errors (checking the HTTP response code) useless. However, this issue was easily overcome by simply checking whether the “.text” property of the response collected with the Python “requests” module is empty.
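A rough sketch of that empty-body check might look like the following. The helper name is hypothetical, and it accepts any object with “status_code” and “text” attributes (which is what “requests.get” returns), so it can be exercised without a device on the network.

```python
import json


def parse_awair_response(response):
    """Return the decoded readings, or None if the response is unusable.

    `response` is any object exposing `status_code` and `text`,
    such as the result of `requests.get(...)`.
    """
    if response.status_code != 200:
        return None
    # The beta API sometimes answers 200 with an empty body,
    # so the status code alone is not enough.
    if not response.text.strip():
        return None
    try:
        return json.loads(response.text)
    except json.JSONDecodeError:
        return None


# Usage (the device address and path below are assumptions for illustration):
#   import requests
#   readings = parse_awair_response(
#       requests.get("http://awair.local/air-data/latest", timeout=5))
#   if readings is None:
#       ...  # skip this cycle or retry
```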
The Awair Local API provides 15 distinct fields of data, although some of them are device baseline calibrations or estimates that aren’t so useful for this project. In the end, I pulled nine distinct fields from the Awair to provide fairly comprehensive capture of the unfortunate succulent’s environment, for a total of 16 pieces of data captured overall:
- Infrared image of the plant’s leaves
- Visible light level
- Ultraviolet light level
- Infrared light level
- Soil moisture level (top of pot)
- Soil moisture level (middle of pot)
- Soil moisture level (bottom of pot)
- Room air temperature
- Dew point for the room
- Relative humidity level
- Absolute humidity level
- Carbon Dioxide (CO2) level
- Air Volatile Organic Compounds (VOC) level
- Air hydrogen (H2) level
- Air ethanol level
- Airborne particulate matter (PM2.5) level
So with the actual “thing” part of “Internet-of-Things” sorted, it was time to move on to that whole “internet” part.
Cometh the data, cometh the cloud
Being able to measure a bunch of stuff with an IoT device is all well and good, but not so useful unless you can collect all that data. That’s where we turn to AWS IoT Core.
IoT Core is, shockingly, the core of AWS’ IoT suite! It provides the core services for things to securely connect to AWS and send data. Connections with “Things”, as AWS officially calls them, are secured using certificates and mutual TLS (mTLS), with policies attached to each Thing’s certificate granting it permissions to use AWS services. With the necessary permissions granted, a Thing can then send and receive messages via AWS IoT Core’s MQTT message bus, which provides the typical structure of publish-subscribe (pub/sub) topics.
Our over-the-top plant monitoring solution is pretty simple, by AWS’ standards, so I just had to set up one of everything, basically. With that done, the real work was in getting the Raspberry Pi to send the data to AWS.
Fortunately, the AWS IoT Python SDK is pretty straightforward to get up and running. Using a handy Knowledge Center article as a starting point, I built a simple little Python script that performed the bare basics of MQTT connection management and message publishing, which I could expand on later.
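To give a flavour of those bare basics, here is a minimal sketch assuming version 1 of the SDK (the “AWSIoTPythonSDK” package). The function names, client ID and file paths are placeholders of my own, and the real script may well differ.

```python
import json


def connect_client(endpoint, client_id, ca_path, key_path, cert_path):
    """Create and connect an MQTT client over mutual TLS on port 8883."""
    # Imported here so the publishing helper below can be read and
    # exercised without the SDK installed.
    from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

    client = AWSIoTMQTTClient(client_id)
    client.configureEndpoint(endpoint, 8883)
    client.configureCredentials(ca_path, key_path, cert_path)
    client.connect()
    return client


def publish_reading(client, topic, reading):
    """Serialize `reading` to JSON and publish it with QoS 1."""
    payload = json.dumps(reading)
    client.publish(topic, payload, 1)
    return payload


# Usage (all values are placeholders):
#   client = connect_client("example-ats.iot.eu-west-1.amazonaws.com",
#                           "succulentpi", "root-ca.pem",
#                           "private.key", "certificate.pem")
#   publish_reading(client, "succulentpi/readings", {"co2": 471})
#   client.disconnect()
```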
However, MQTT is really only for short messages; after all, the maximum payload size for an IoT Core publish request is only 128 KB. That raises the question of how to handle the IR camera images, which average around 5.8 MB in size… And that question was easily answered, because when you have doubts about where to put BLOBs (Binary Large Objects) in AWS, use S3! Using the standard Boto3 AWS Python SDK, I wrote a little bit of code to upload each image to S3, and problem solved.
Finally, I had to bring all the bits of data together and send some well-formatted JSON to AWS IoT Core. Personally, I prefer my semi-structured data blobs to carry a bit more structure that conveys the semantic relationships between individual fields, rather than use a flat structure and rely on field names to pass along this potentially valuable information. Also, it’s usually easier to flatten the structure later, if need be, than to add structure back in.
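As a sketch of that split between MQTT and S3, the helpers below check a payload against the 128 KB limit and upload an image with Boto3’s “upload_file”. The helper names are hypothetical, and the URL is simply assembled in the virtual-hosted style; whether it is actually reachable depends on the bucket’s access settings.

```python
MQTT_MAX_PAYLOAD_BYTES = 128 * 1024  # IoT Core publish limit quoted above


def fits_in_mqtt(payload: bytes) -> bool:
    """True if the payload is small enough for one IoT Core publish."""
    return len(payload) <= MQTT_MAX_PAYLOAD_BYTES


def image_url(bucket, region, key):
    """Build a virtual-hosted-style S3 URL for the uploaded object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def upload_image(s3_client, local_path, bucket, region, key):
    """Upload a local file to S3 and return the URL to embed in the JSON.

    `s3_client` is expected to behave like `boto3.client("s3")`.
    """
    s3_client.upload_file(local_path, bucket, key)
    return image_url(bucket, region, key)


# Usage (bucket, region and paths are placeholders):
#   import boto3
#   url = upload_image(boto3.client("s3"), "/tmp/2022-01-01-000000.png",
#                      "bucket", "region", "infrared/2022-01-01-000000.png")
```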
To maintain the link between each camera image and its associated sensor readings I added the S3 URL of the uploaded image to the JSON and the results look something like this:
{
  "timestamp": "2022-01-01-000000",
  "plant": {
    "pot": {
      "soil": {
        "moisture_top_a0": 87,
        "moisture_middle_a1": 31,
        "moisture_bottom_a2": 49
      }
    },
    "env": {
      "visible_light": 278,
      "uv_light": 0.11,
      "ir_light": 349
    },
    "images": {
      "infrared": "https://bucket.s3.region.amazonaws.com/infrared/2022-01-01-000000.png"
    }
  },
  "room": {
    "env": {
      "dew_point": 16.66,
      "temp": 28.34,
      "rel_humid": 49.17,
      "abs_humid": 13.6,
      "co2": 471,
      "voc_total": 40,
      "voc_h2": 26,
      "voc_ethanol": 36,
      "pm25": 8
    }
  }
}
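Assembling a payload of that shape is mostly a matter of nesting dictionaries. The sketch below is one way it could be done; the function and its parameters are illustrative assumptions rather than the project’s actual code.

```python
from datetime import datetime, timezone


def build_payload(soil, light, infrared_url, room, now=None):
    """Nest the individual readings into the hierarchical structure above."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.strftime("%Y-%m-%d-%H%M%S"),
        "plant": {
            "pot": {"soil": dict(soil)},
            "env": dict(light),
            # Link back to the image captured alongside these readings
            "images": {"infrared": infrared_url},
        },
        "room": {"env": dict(room)},
    }
```

Passing “now” explicitly means the same timestamp can be reused for the image file name, which keeps each image tied to its readings.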
A quick cron job on the Raspberry Pi runs the Python code to capture all of the data and publish it to AWS every five minutes. The full code for this cron job is available on GitHub here. I’ve written it to be a minimal, easy-to-get-started-with example of using AWS IoT Core in practice, so it doesn’t include all of the bits you’d want to consider in a production environment, like proper MQTT (re)connection management. I don’t recommend copying-and-pasting this for enterprise production usage!
Amazon Timestream: For all your streaming time-series needs!
So we’re now producing a sequence of sensor readings and images, which are linked to specific points in time, and are published to AWS IoT Core’s message bus as they’re produced… which means we have time-series data ingested as a stream… I wonder which AWS service might be suitable for persistent storage of streaming time-series data?
The clue’s in the name: Amazon Timestream provides a “serverless” (in that we don’t pay for fixed, pre-provisioned capacity) service for storing and utilizing time-series data. It also handily has native integration with IoT Core and a bunch of other AWS services we might want to use to consume that data later, like SageMaker (for analysis with machine learning) and QuickSight (for dashboards and visualization).
The final feature of IoT Core that we’ll utilize is message routing. Message routing rules let us select data from each message on a given topic using that old stalwart of the data world, SQL, and send it to one or more destinations, including a good range of AWS services and custom VPC or HTTP endpoints. This is where the native integration with Timestream comes in.
In this case, the message routing rule is also where I had to do some restructuring of the data to suit a time-series database. Whilst I could just inject the whole JSON payload into a Timestream record, it would just be a VARCHAR-type record; Timestream doesn’t (as far as I’m aware) have a method of querying JSON fields within a record. This would likely greatly limit the options for consuming the data and calculating aggregates from it down the line. Simply selecting “*” works, but it retains only the field name and not the full JSON path, losing all of that semantic relationship data we injected earlier.
Instead, I used the message routing rule’s SQL query, where JSON fields can be addressed, to rename each field and effectively flatten my hierarchical JSON structure:
SELECT
  plant.pot.soil.moisture_top_a0 AS plant_pot_soil_moisture_top,
  plant.pot.soil.moisture_middle_a1 AS plant_pot_soil_moisture_middle,
  plant.pot.soil.moisture_bottom_a2 AS plant_pot_soil_moisture_bottom,
  plant.env.visible_light AS plant_env_visiblelight,
  plant.env.uv_light AS plant_env_uvlight,
  plant.env.ir_light AS plant_env_irlight,
  plant.images.infrared AS plant_image_infrared,
  room.env.dew_point AS room_env_dewpoint,
  room.env.temp AS room_env_temperature,
  room.env.rel_humid AS room_env_relativehumidity,
  room.env.abs_humid AS room_env_absolutehumidity,
  room.env.co2 AS room_env_co2,
  room.env.voc_total AS room_env_voctotal,
  room.env.voc_h2 AS room_env_voch2,
  room.env.voc_ethanol AS room_env_vocethanol,
  room.env.pm25 AS room_env_pm25
FROM 'succulentpi/readings'
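For comparison, the same flattening idea can be sketched in Python by joining each field’s JSON path with underscores. This is purely an illustration of the technique: it is not part of the pipeline, and the names it generates are the raw paths rather than the hand-picked aliases in the rule.

```python
def flatten(nested, prefix=""):
    """Flatten nested dicts into one level, joining key paths with '_'."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the path so far as the prefix
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat
```

Because the full path ends up in every field name, the semantic relationships survive the trip into a flat record.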
And with that our IoT plant panopticon was complete and running, and the overall solution looks like this:
What’s Next?
Basically, data collection. I plan to collect data over the next few months and then experiment with visualization and analysis of it. As is typical for data & analytics projects we can’t know in advance precisely what the data might tell us, so… watch this space, I guess!
In the meantime, if you’d like to see some not so absurd IoT solutions built by Cloudreach (for actual customers, rather than unfortunate house plants!) there’s one covering an energy company’s solution for managing their wind farms, a fitness studio chain’s automated customer check-in system using AWS DeepLens, and the traffic data management solution for a US state’s department of transportation.