Live, Learn, and Lung

Tag: Python

Powertools for AWS Lambda (4) - Idempotency

In real-world applications, functions can be computationally expensive or time-consuming. To avoid repeating that work, we can store the result of a successful call and return it for identical inputs. This makes the function idempotent: calling it again with the same input yields the same result without running the logic twice. Powertools for AWS Lambda provides an idempotency utility that returns the previously successful result when a function is called repeatedly with the same input. In this post, we will explore how to use it.

Powertools for AWS Lambda (3) - Tracing

AWS X-Ray helps developers analyze and debug distributed applications by providing a holistic view of requests as they travel through the system. Tracer is a wrapper around the AWS X-Ray SDK for Python that offers a simplified interface for instrumenting Lambda functions. In this post, I will show you how to use Tracer to gain insights into your function’s execution.

How to Yield Batches from an Iterable in Python

In a past project, I built a pipeline to ingest crawled job postings from Elasticsearch and standardize them into a unified schema. Since the postings couldn’t all fit in memory, I needed to process them in batches. The Elasticsearch client yields postings one by one, so I wrapped it with a batching generator that groups them into chunks.

To illustrate, suppose you have a generator that yields the numbers 0 to 9 (in reality, it might yield thousands or millions of items). To read them in batches of, say, 4, you can loop and call itertools.islice to take the next 4 items. Each call consumes up to 4 elements from the generator, and the loop ends when a call returns nothing, meaning the generator is exhausted. The final batch may therefore be shorter than the rest; here it holds only 8 and 9.
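
The loop described above might look something like the following minimal sketch. The function name yield_batches and its parameters are my own choices for illustration, not necessarily what the original pipeline used.

```python
from itertools import islice


def yield_batches(iterable, batch_size):
    """Yield lists of up to `batch_size` items from `iterable` (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # iter() makes sure every islice call advances the same underlying iterator,
    # even if a list or other re-iterable object is passed in.
    iterator = iter(iterable)
    while True:
        # Take the next `batch_size` items, or fewer if the source runs out.
        batch = list(islice(iterator, batch_size))
        if not batch:
            # An empty batch means the iterator is exhausted.
            return
        yield batch


# A generator standing in for a large stream such as an Elasticsearch scan.
numbers = (n for n in range(10))
batches = list(yield_batches(numbers, 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Only one batch is held in memory at a time when you iterate over yield_batches directly instead of collecting everything into a list as the demo does. On Python 3.12 and later, itertools.batched does the same grouping out of the box, yielding tuples rather than lists.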