This directory contains the scripts and configuration for the SA-Text dataset curation pipeline from the paper "Text-Aware Image Restoration with Diffusion Models". The pipeline processes images from the SA-1B ...