In a recent project, a colleague asked me to look at a HIVE_CURSOR_ERROR in Athena that they weren’t able to get rid of. Since the error message was not incredibly helpful and the way this error appeared is not that uncommon, I thought writing this may help you, dear reader, and future me when I inevitably forget about it again.
Working with datasets in pandas will almost inevitably bring you to the point where your dataset doesn’t fit into memory. Especially parquet is notorious for that since it’s so well compressed and tends to explode in size when read into a data
Working with CSV files and Big Data tools such as AWS Glue and Athena can lead to interesting challenges. In this blog I will explain to you how to solve a particular problem that I encountered in a project - the HIVE_PARTITION_SCHEMA_MISMATCH.
In a previous blog post we built QuickSight Datasets by directly loading files from S3. In the wild the data in S3 is often already aggregated or even transformed in Athena. In this new blog post we see how to create a QuickSight Dataset directly relying
Automating Athena Queries with Python Introduction Over the last few weeks I’ve been using Amazon Athena quite heavily. For those of you who haven’t encountered it, Athena basically lets you query data stored in various formats on S3 using SQL
It is surprising how many resources on the Internet are carrying on outdated or deprecated information - the Chef ecosystem is no exception to this. While outdated style in Ruby files has been detected via cookstyle for a while, Test Kitchen files still h