In this tutorial, we explore tqdm in depth and demonstrate how we build powerful, real-time progress tracking into modern Python workflows. We begin with nested progress bars and manual progress control, then move into practical scenarios such as streaming downloads, pandas data processing, parallel execution, structured logging, and asynchronous tasks. Throughout this tutorial, we focus on writing clean, production-ready code that runs in Colab while showcasing the advanced capabilities of tqdm beyond simple loops.
!pip -q install -U tqdm
import time, math, random, asyncio, hashlib, logging
import pandas as pd
import requests
from tqdm.auto import tqdm, trange
from tqdm.contrib.concurrent import thread_map, process_map
from tqdm.contrib.logging import logging_redirect_tqdm
import tqdm as tqdm_pkg
print("tqdm version:", tqdm_pkg.__version__)
print("pandas version:", pd.__version__)
print("requests version:", requests.__version__)
We install and configure tqdm in a Colab-safe manner while preserving the existing environment dependencies. We import all required libraries, including concurrency and logging helpers from tqdm.contrib. We also print version information to verify that our runtime setup is stable before proceeding.
print("1) Nested progress bars (position/leave) + tqdm.write()")
outer = trange(5, desc="Outer loop", leave=True)
for i in outer:
    inner = trange(20, desc=f"Inner loop {i}", leave=False, position=1)
    for j in inner:
        time.sleep(0.01)
        if j in (0, 10, 19):
            tqdm.write(f"  note: i={i}, j={j}")
print()
print("2) Manual progress (unknown -> known total, update(), set_postfix())")
items = list(range(1, 101))
pbar = tqdm(total=None, desc="Processing (discovering total)", unit="item")
seen = 0
for x in items:
    time.sleep(0.005)
    seen += 1
    if seen == 25:
        pbar.total = len(items)
        pbar.refresh()
    pbar.update(1)
    if x % 20 == 0:
        pbar.set_postfix(last=x, sqrt=round(math.sqrt(x), 3))
pbar.close()
print()
We demonstrate nested progress bars and show how we manage multiple levels of iteration cleanly using position and leave. We also explore manual progress control by dynamically setting totals and updating progress explicitly. By using set_postfix, we enrich the progress bar with live metadata as it runs.
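As a complementary sketch (not part of the original notebook), the same manual-control idea can be written with tqdm as a context manager, so the bar is closed automatically even if the loop body raises; the log-sum workload here is a placeholder:

```python
import math
from tqdm.auto import tqdm

values = list(range(1, 51))
running = 0.0
# The context manager guarantees bar.close() runs, even on exceptions
with tqdm(total=len(values), desc="Scoring", unit="val") as bar:
    for v in values:
        running += math.log(v)                      # placeholder "work"
        bar.update(1)                               # advance by one item
        bar.set_postfix(mean=round(running / v, 3)) # live metadata, as above
```

This avoids the explicit `pbar.close()` call and is the safer default whenever the loop can fail partway through.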
print("3) Download with streaming progress")
url = "https://raw.githubusercontent.com/tqdm/tqdm/master/README.rst"
out_path = "/content/tqdm_README.rst"
with requests.get(url, stream=True, timeout=30) as r:
    r.raise_for_status()
    total = int(r.headers.get("Content-Length", 0)) or None
    chunk = 1024 * 32
    with open(out_path, "wb") as f, tqdm(
        total=total,
        unit="B",
        unit_scale=True,
        unit_divisor=1024,
        desc="Downloading README",
        miniters=1,
    ) as bar:
        for part in r.iter_content(chunk_size=chunk):
            if not part:
                continue
            f.write(part)
            bar.update(len(part))
print("Saved:", out_path)
print()
We implement a real-world streaming download scenario using requests with chunk-based updates. We track byte-level progress with automatic unit scaling and accurate handling of content length. This shows how we efficiently and transparently monitor external I/O operations.
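tqdm also ships a shortcut for exactly this pattern: `tqdm.wrapattr` instruments a stream's `read` (or `write`) method so every chunk advances the bar automatically. The sketch below demonstrates it on an in-memory `BytesIO` standing in for `response.raw`, so it runs without network access:

```python
import io
import shutil
from tqdm.auto import tqdm

# Hypothetical "download": a BytesIO source stands in for response.raw
payload = b"x" * (256 * 1024)
src = io.BytesIO(payload)
dst = io.BytesIO()

# wrapattr wraps src.read so each chunk read advances the bar
with tqdm.wrapattr(src, "read", total=len(payload), unit="B",
                   unit_scale=True, unit_divisor=1024,
                   desc="wrapattr copy") as wrapped:
    shutil.copyfileobj(wrapped, dst, length=32 * 1024)
```

With a real `requests` response you would pass `r.raw` in place of `src`, which removes the manual `bar.update(len(part))` bookkeeping from the loop above.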
print("4) pandas progress_apply (Series) + DataFrame row-wise progress (safe)")
tqdm.pandas()
df = pd.DataFrame({
    "user_id": range(1, 2001),
    "value": [random.random() for _ in range(2000)],
})

def heavy_fn(v: float) -> str:
    time.sleep(0.0005)
    s = f"{v:.10f}".encode("utf-8")
    return hashlib.sha256(s).hexdigest()[:10]

df["hash"] = df["value"].progress_apply(heavy_fn)
df2 = df[["value"]].copy()
df2["hash2"] = [
    heavy_fn(float(v))
    for v in tqdm(df2["value"].to_list(), desc="Row-wise hash2", total=len(df2))
]
df["hash2"] = df2["hash2"]
print(df.head(3))
print()
We integrate tqdm with pandas to monitor vectorized operations using progress_apply. We implement a hashing workload to simulate realistic, computationally heavy transformations. We also demonstrate a safe row-wise progress pattern to ensure compatibility with Colab’s pinned pandas version.
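`tqdm.pandas()` also registers `progress_apply` on groupby objects, so per-group aggregations get the same treatment. A small self-contained sketch (the ten-group dataset here is hypothetical, not the notebook's DataFrame):

```python
import random
import pandas as pd
from tqdm.auto import tqdm

tqdm.pandas(desc="groupby apply")

# Hypothetical dataset: 1000 rows spread across 10 groups
df = pd.DataFrame({
    "group": [i % 10 for i in range(1000)],
    "value": [random.random() for _ in range(1000)],
})

# progress_apply is patched onto GroupBy as well, so the bar
# ticks once per group during the aggregation
means = df.groupby("group")["value"].progress_apply(lambda s: s.mean())
print(means.head())
```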
print("5) Concurrency progress: thread_map / process_map")
def cpuish(n: int) -> int:
    x = 0
    for i in range(50_000):
        x = (x + (n * i)) % 1_000_003
    return x
nums = list(range(80))
thread_results = thread_map(cpuish, nums, max_workers=8, desc="thread_map")
print("thread_map done:", len(thread_results))
proc_results = process_map(cpuish, nums[:20], max_workers=2, chunksize=2, desc="process_map")
print("process_map done:", len(proc_results))
print()
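When `thread_map`'s mapping semantics are too rigid, for example when you need per-task metadata or per-task error handling, an equivalent hand-rolled pattern wraps `concurrent.futures.as_completed` in a plain tqdm bar. This is a sketch with a trivial squaring workload, not part of the original notebook:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm.auto import tqdm

def work(n: int) -> int:
    return n * n  # placeholder workload

nums = list(range(50))
results = {}
with ThreadPoolExecutor(max_workers=8) as ex:
    # Map each future back to its input so results can be re-associated
    futures = {ex.submit(work, n): n for n in nums}
    # as_completed yields futures as they finish; wrapping it gives progress
    for fut in tqdm(as_completed(futures), total=len(futures), desc="futures"):
        results[futures[fut]] = fut.result()
```

The bar advances in completion order, while the `futures` dict preserves the link between each result and its input.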
print("6) logging_redirect_tqdm (logs won’t break bars)")
logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.handlers = [handler]
with logging_redirect_tqdm():
    for k in tqdm(range(60), desc="Work with logs"):
        time.sleep(0.01)
        if k in (5, 25, 45):
            logger.info(f"checkpoint k={k}")
print()
print("7) asyncio progress (as_completed) — Colab/Jupyter-safe")
async def io_task(i: int):
    await asyncio.sleep(random.uniform(0.02, 0.12))
    return i, random.random()

async def run_async():
    tasks = [asyncio.create_task(io_task(i)) for i in range(80)]
    results = []
    for fut in tqdm(asyncio.as_completed(tasks), total=len(tasks), desc="async tasks"):
        results.append(await fut)
    return results

results = await run_async()
print("async done:", len(results), "results")
We explore advanced execution patterns, including multithreading, multiprocessing, structured logging, and asynchronous task tracking. We use thread_map and process_map to parallelize CPU-bound workloads with visible progress. We also handle asyncio safely in a notebook environment using top-level await, ensuring smooth progress tracking without event-loop conflicts.
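tqdm additionally ships a dedicated async variant, `tqdm.asyncio.tqdm`, whose `gather()` returns results in submission order while the bar still advances as tasks complete. A self-contained sketch (note: `asyncio.run` works in a plain script; in Colab, where an event loop is already running, you would `await main()` directly instead):

```python
import asyncio
import random
from tqdm.asyncio import tqdm as atqdm

async def io_task(i: int) -> int:
    await asyncio.sleep(random.uniform(0.01, 0.05))  # simulated I/O
    return i * 2

async def main():
    # gather() preserves input order, unlike the as_completed loop above
    return await atqdm.gather(*(io_task(i) for i in range(40)),
                              desc="async gather")

results = asyncio.run(main())
print("async gather done:", len(results))
```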
In conclusion, we integrated tqdm across synchronous, parallel, logging-aware, and asynchronous environments. We saw how progress bars enhance observability, improve debugging clarity, and make long-running workloads more transparent and user-friendly. With these advanced patterns, we now have a solid foundation to incorporate robust progress monitoring into data pipelines, machine learning workflows, distributed systems, and real-world production applications.
The post How to Build Progress Monitoring Using Advanced tqdm for Async, Parallel, Pandas, Logging, and High-Performance Workflows appeared first on MarkTechPost.
