Amazon Ion is one of the data serialization formats you can use when exporting data from DynamoDB to S3. Recently, I tried to select data from one of these exports with Athena after using a Glue Crawler to create the schema and table. It didn’t work, and I got a weird error message. In this post, I’ll show you how to fix that problem. If you’re not familiar with Ion yet, check out my recent blog post introducing it for more details.
This is the story of how we accidentally learned more about the internals of Glue Crawlers than we ever wanted to know. Once upon a time (a few days ago), André and I were debugging a crawler that didn’t do what it was supposed to. Before we dive in
This post explains how to work around Glue’s problem of selective amnesia when creating Dynamic Frames from the Glue data catalog.
There are many components under the Glue umbrella that can fit together into a cohesive big picture. In this introduction to Glue, I explain my version of that big picture.
In today’s data-driven world, effective data management is crucial for organizations that want to make well-informed decisions. As the importance of data continues to grow, so does the significance of robust data management practices. This
Transforming large amounts of data into formats that help solve business problems is what data engineers excel at. A combination of serverless tools such as Athena, Step Functions, Lambda, or Glue can get the job done in many projects. However, some custom