FlashFlood AI

FlashFlood AI is a real-time flash flood prediction system that identifies which streets are most likely to flood next while heavy rainfall is still occurring. Unlike traditional flood warning systems that issue broad, slow updates, FlashFlood AI continuously updates street-level flood risk using live rainfall intensity, elevation, drainage capacity, and soil saturation data.

Because flash floods evolve minute by minute, the system is fundamentally speed-dependent. Ultra-fast inference enables FlashFlood AI to stay ahead of rapidly changing flood conditions, allowing earlier, more targeted alerts and emergency response in high-risk, densely populated regions such as South Asia.

Video

FlashFlood AI’s Interface

Why FlashFlood needs high-speed inference

We chose flash flood prediction because it is a real-world problem where speed directly determines whether an AI system is useful or not. Unlike many applications where a few seconds of delay is acceptable, flash floods evolve rapidly and often leave people with only minutes to react. This makes flood prediction a clear example of a use case that fundamentally depends on ultra-fast inference.

We also focused on flash flooding because it disproportionately affects dense, high-risk regions such as South Asia, where millions of people are displaced each year during monsoon seasons. Existing flood warning systems in these regions tend to be broad, slow, and reactive, issuing alerts after flooding has already begun. This gap between prediction speed and real-world conditions makes it an ideal case study for demonstrating why faster inference enables entirely new capabilities.

FlashFlood was designed to show what becomes possible when AI operates at the speed of real events. By predicting which streets will flood next in real time, the system moves beyond general warnings and toward actionable, location-specific insight. This use case demonstrates that ultra-fast inference is not just an optimization, but a requirement for meaningful impact in time-critical environments. One current limitation is that the system does not yet use data from real-world flooding events.

People look towards a flooded area along the bank of the overflowing Bagmati River following heavy rains in Kathmandu. Credit: CNN

Technical aspects of the project

The FlashFlood AI system implements a Python-based microservices architecture centered on hydrological simulation and AI-augmented risk assessment. The core technologies include FastAPI for high-performance asynchronous API delivery, WebSocket protocols for real-time client-server communication, and the Cerebras Cloud SDK for accelerated inference workloads. The backend features a custom spatiotemporal flood modeling engine that combines physics-based flow routing with machine learning. It uses a graph-based representation of terrain in which streets and drainage networks are modeled as connected nodes, with water routed between them by D8 flow-direction algorithms.
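To make the D8 idea concrete, here is a minimal sketch of flow-direction assignment on a small elevation grid. It is an illustration only, not the project's code: the function name `d8_flow_direction`, the `cell_size` parameter, and the sample grid are all assumptions. Each cell drains to whichever of its eight neighbours gives the steepest downward slope, with diagonal neighbours counted as farther away.

```python
import math

# Offsets (row, col) to the eight neighbours: N, NE, E, SE, S, SW, W, NW.
D8_OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def d8_flow_direction(elevation, cell_size=1.0):
    """For each cell, return the (row, col) of its steepest downslope
    neighbour, or None when no neighbour is lower (a pit or flat cell)."""
    rows, cols = len(elevation), len(elevation[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in D8_OFFSETS:
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # neighbour lies outside the grid
                # Diagonal neighbours are sqrt(2) cell widths away.
                distance = cell_size * math.hypot(dr, dc)
                slope = (elevation[r][c] - elevation[nr][nc]) / distance
                if slope > best_slope:
                    best_slope = slope
                    directions[r][c] = (nr, nc)
    return directions


# Hypothetical 3x3 elevation grid (metres) sloping toward the bottom-right.
grid = [
    [9.0, 8.0, 7.0],
    [8.0, 6.0, 5.0],
    [7.0, 5.0, 1.0],
]
flow = d8_flow_direction(grid)
print(flow[0][0], flow[1][1], flow[2][2])
```

Following these pointers from cell to cell gives the directed graph over which water can be routed. A real terrain model would also need to handle pits and flat areas, which this sketch simply marks as `None`.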

The system ingests rainfall intensity data and processes it through a physics-informed accumulation model that calculates water flow across elevation grids, factors in drainage capacity with simulated blockage rates, and tracks soil saturation levels. This hydrological computation is enhanced by Cerebras-powered LLM inference (Llama 3.3 70B) for contextual risk assessment, generating structured predictions including flood depth estimates, time-to-flood calculations, and evacuation recommendations. The architecture supports continuous prediction loops with minute-by-minute update cycles, automated alert generation with severity classification, and broadcast distribution to connected clients via persistent WebSocket connections.
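One simple way to picture the accumulation step is a per-cell water balance run once per update cycle. The sketch below is an assumption-laden illustration, not the project's model: the `StreetCell` fields, their default values, and the order of operations (infiltration first, then drainage) are all hypothetical. Infiltration shrinks as the soil saturates, and a blocked drain removes proportionally less water.

```python
from dataclasses import dataclass


@dataclass
class StreetCell:
    # All fields and defaults are illustrative assumptions.
    depth_mm: float = 0.0             # standing water on the street
    saturation: float = 0.0           # soil saturation, 0.0 (dry) to 1.0 (full)
    drain_capacity_mm: float = 2.0    # water a clear drain removes per step
    blockage: float = 0.0             # fraction of drain capacity lost to debris
    max_infiltration_mm: float = 1.0  # infiltration per step into dry soil
    soil_storage_mm: float = 50.0     # water the soil holds when saturated


def step(cell, rain_mm, inflow_mm=0.0):
    """Advance one time step and return the new standing-water depth."""
    available = cell.depth_mm + rain_mm + inflow_mm

    # Wetter soil absorbs less; fully saturated soil absorbs nothing.
    infiltration = min(available, cell.max_infiltration_mm * (1.0 - cell.saturation))
    cell.saturation = min(1.0, cell.saturation + infiltration / cell.soil_storage_mm)
    available -= infiltration

    # Drains remove water up to their capacity, reduced by blockage.
    drained = min(available, cell.drain_capacity_mm * (1.0 - cell.blockage))
    cell.depth_mm = available - drained
    return cell.depth_mm


# Hypothetical cell with a half-blocked drain under steady 5 mm/step rainfall.
cell = StreetCell(blockage=0.5)
depths = [step(cell, rain_mm=5.0) for _ in range(3)]
print(depths)
```

In a routed model, `inflow_mm` would carry water arriving from upslope cells, so each cell's outflow feeds the next cell along its flow direction.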

Key technical capabilities include fault-tolerant design with physics-only fallback predictions when AI inference encounters errors, geospatial bounding for regional analysis across Pakistan, Nepal, and India, and background task orchestration for simultaneous monitoring of multiple catchment areas. The frontend is served as static files through FastAPI's mounting system, suggesting a decoupled React-based interface consuming these real-time endpoints.
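The physics-only fallback can be sketched as a wrapper that tries the AI path first and reverts to a simple extrapolation if that call fails or returns something unusable. Everything here is hypothetical: `ai_predict` stands in for whatever function calls the LLM, and the 150 mm flood threshold and the response keys are assumptions made for the example.

```python
import math

FLOOD_THRESHOLD_MM = 150.0  # assumed depth at which a street counts as flooded


def physics_only_prediction(depth_mm, rise_mm_per_min):
    """Estimate time-to-flood by extrapolating the current rise rate linearly."""
    if depth_mm >= FLOOD_THRESHOLD_MM:
        minutes = 0.0
    elif rise_mm_per_min <= 0:
        minutes = math.inf  # water is not rising
    else:
        minutes = (FLOOD_THRESHOLD_MM - depth_mm) / rise_mm_per_min
    return {"depth_mm": depth_mm, "minutes_to_flood": minutes, "source": "physics"}


def predict_with_fallback(depth_mm, rise_mm_per_min, ai_predict):
    """Use the AI prediction when it succeeds, otherwise the physics estimate."""
    try:
        result = ai_predict(depth_mm, rise_mm_per_min)
        # Treat malformed output the same way as a failed call.
        if not isinstance(result.get("minutes_to_flood"), (int, float)):
            raise ValueError("AI response missing minutes_to_flood")
        return {**result, "source": "ai"}
    except Exception:
        return physics_only_prediction(depth_mm, rise_mm_per_min)


def failing_model(depth_mm, rise_mm_per_min):
    """Stand-in for an inference call that times out."""
    raise TimeoutError("inference backend unavailable")


prediction = predict_with_fallback(50.0, 10.0, failing_model)
print(prediction)
```

Tagging each result with its `source` lets downstream alerting distinguish AI-assisted predictions from fallback estimates.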

GitHub repository of the libraries used

Next Steps

Data Integration: Replacing simulated elevation and drainage parameters with live SRTM/ASTER DEM queries and municipal stormwater infrastructure databases would sharpen prediction accuracy for specific urban catchments.

Model Optimization: Evolving from general-purpose LLM inference to custom PyTorch Geometric models compiled for Cerebras wafer-scale hardware could unlock the full potential of sub-second inference for million-node street network graphs.