Module 27 Systems

Data Wrangling & Everyday Code

There's a category of engineering work that doesn't fit neatly into any framework or architecture: the everyday glue code that connects systems, transforms data, and handles the messy reality of information in the wild. JSON with precision pitfalls. Regex that catastrophically backtracks on certain inputs. Dates and timezones that silently corrupt data across daylight saving transitions. This module covers the code that every engineer writes constantly and that traps the unwary in subtle, hard-to-debug errors.

Vibe coders reach for these tools all the time, but they use them with false confidence. They generate a regex pattern without knowing about catastrophic backtracking. They use new Date() without thinking about timezones. They JSON.stringify a number with 18 decimal places without knowing that a JavaScript number is a 64-bit float that holds only about 15 to 17 significant decimal digits. They mutate arrays directly instead of creating immutable copies. Small errors in these everyday utilities cause bugs that are disproportionately hard to track down because the tools feel simple.

This module puts you in command of the tools you use every day. You'll understand the data validation ecosystem (Zod, Joi, JSON Schema) and learn to use it systematically rather than hoping for valid input. You'll learn to stream large files instead of loading everything into memory. You'll understand the Unicode edge cases in string manipulation that affect real-world data. By the end, you'll write everyday code that's correct, robust, and performant.
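To make the JSON precision pitfall concrete, here is a minimal sketch. The payloads and the `id` field are hypothetical, chosen only to show what happens when an integer larger than `Number.MAX_SAFE_INTEGER` passes through `JSON.parse`.

```typescript
// Hypothetical payload: an API sends a 64-bit integer id as a JSON number.
const rawNumeric = '{"id": 9007199254740993}';
const lossyId: number = JSON.parse(rawNumeric).id;

// 2^53 + 1 has no exact double representation, so the parsed value is rounded.
console.log(Number.MAX_SAFE_INTEGER);        // 9007199254740991
console.log(Number.isSafeInteger(lossyId));  // false
console.log(String(lossyId));                // "9007199254740992"

// The same id sent as a JSON string survives parsing unchanged.
const rawString = '{"id": "9007199254740993"}';
const exactId: string = JSON.parse(rawString).id;
console.log(exactId);                        // "9007199254740993"

// Fractions have the same limit: 0.1 + 0.2 is not exactly 0.3.
console.log(0.1 + 0.2 === 0.3);              // false
```

One common way around this, assuming you control or can negotiate the payload format, is to carry large identifiers and money amounts as strings and convert them deliberately (for example to `BigInt` or an integer number of cents) at the point of use.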

What You'll Learn

  1. JSON Inside and Out: parse, stringify, JSON Schema, streaming, and precision pitfalls
  2. Working with Files: the fs module, streaming large files, CSV parsing, temp files
  3. Regular Expressions: syntax, common patterns, catastrophic backtracking, when not to use regex
  4. Date and Time: Unix timestamps, ISO 8601, the Temporal API, timezones, and DST
  5. String Manipulation and Template Literals: methods, tagged templates, UTF-16, Unicode edge cases
  6. Array and Object Transformation Patterns: groupBy, zip, chunk, partition, immutable updates
  7. Working with Third-Party APIs: authentication, error handling, rate limiting, pagination
  8. Data Validation and Format Interoperability: Zod, Joi, YAML, XML, Protocol Buffers
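As a preview of the transformation patterns in lesson 6, the sketch below shows one possible way to write `chunk`, `partition`, and `groupBy`, plus an immutable update. The function signatures are illustrative assumptions, not the module's official implementations; libraries such as Lodash ship their own versions.

```typescript
// Split an array into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Split an array into [matching, nonMatching] without mutating the input.
function partition<T>(
  items: readonly T[],
  predicate: (item: T) => boolean,
): [T[], T[]] {
  const pass: T[] = [];
  const fail: T[] = [];
  for (const item of items) {
    (predicate(item) ? pass : fail).push(item);
  }
  return [pass, fail];
}

// Group items by a computed key. A Map avoids collisions with
// Object.prototype keys such as "constructor".
function groupBy<T, K>(items: readonly T[], keyOf: (item: T) => K): Map<K, T[]> {
  const groups = new Map<K, T[]>();
  for (const item of items) {
    const key = keyOf(item);
    const bucket = groups.get(key);
    if (bucket) bucket.push(item);
    else groups.set(key, [item]);
  }
  return groups;
}

// Immutable update: build a new array instead of assigning into the old one.
const prices = [10, 20, 30];
const discounted = prices.map((p, i) => (i === 1 ? p - 5 : p));
// prices is still [10, 20, 30]; discounted is [10, 15, 30]
```

All three helpers accept `readonly` arrays and return fresh ones, so callers can rely on their input being left untouched.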

Capstone Project: Build a Data Pipeline from Three External Sources

Build a data pipeline that fetches data from three different external APIs, each with its own authentication scheme and pagination style. The pipeline should transform and validate the data using Zod schemas, handle rate limiting with exponential backoff, normalize date/time values across different formats and timezones, and output the combined result as a validated JSON file. It should handle every error case gracefully and log what went wrong for any record that failed validation.
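The exponential backoff requirement can be sketched roughly as follows. The names `backoffDelayMs` and `withRetry`, the default delays, and the attempt limit are all assumptions made for illustration; a production version would usually add random jitter and retry only on errors known to be transient (for example HTTP 429 or 5xx responses).

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying on any thrown error with exponentially growing
// pauses. After `maxAttempts` failures, the last error is rethrown.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Do not sleep after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

With the defaults above, the pauses between attempts are 250 ms, 500 ms, 1000 ms, and 2000 ms. A call might look like `await withRetry(() => fetchPage(url))`, where `fetchPage` is a hypothetical function that throws when the API responds with a rate-limit error.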

Why This Matters for Your Career

The everyday code (parsing, transforming, validating) is where a surprising number of production bugs live. A regex that works for test inputs but catastrophically backtracks on certain user inputs can bring a server to its knees. A date that looks right in UTC but displays wrong in a user's timezone shows that user the wrong time. A number that loses precision in JSON serialization produces subtle financial calculation errors. These aren't exotic edge cases; they're common patterns that cause real incidents.

Data validation at the boundary is one of the highest-leverage practices in application development. Every piece of data that enters your system, whether from an API, a form, a database, or a file, should be validated against a schema at the point of entry. Zod and similar tools make this explicit and type-safe, eliminating a whole class of runtime errors caused by invalid assumptions about data shape. Engineers who validate systematically write applications that fail loudly at the boundary rather than silently in the middle.

Third-party API integration is a daily reality in modern software development, and the difference between a robust integration and a fragile one lies largely in the handling of edge cases: rate limiting, pagination, partial failures, retries with idempotency. Engineers who understand these patterns build integrations that survive real-world API behavior rather than working only on the happy path.
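To show what "fail loudly at the boundary" can look like, here is a hand-written validator in plain TypeScript. It is a stand-in for what a schema library such as Zod does for you, not Zod's API; the `UserRecord` shape, the field names, and the `validateUserRecord` function are hypothetical.

```typescript
// Hypothetical record shape, used only for illustration.
interface UserRecord {
  id: string;
  email: string;
  createdAt: Date;
}

// Either a typed value or a list of human-readable problems.
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Check untrusted input once, at the point of entry, and collect every problem.
function validateUserRecord(input: unknown): ValidationResult<UserRecord> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["expected an object"] };
  }
  const { id, email, createdAt } = input as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof id !== "string" || id.length === 0) {
    errors.push("id: expected a non-empty string");
  }
  if (typeof email !== "string" || !email.includes("@")) {
    errors.push("email: expected a string containing '@'");
  }
  // Anything that is not a parseable date string becomes an invalid Date.
  const parsedDate =
    typeof createdAt === "string" ? new Date(createdAt) : new Date(NaN);
  if (Number.isNaN(parsedDate.getTime())) {
    errors.push("createdAt: expected a parseable date string");
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    value: { id: id as string, email: email as string, createdAt: parsedDate },
  };
}

const good = validateUserRecord(
  JSON.parse('{"id":"u1","email":"a@example.com","createdAt":"2024-03-10T12:00:00Z"}'),
);
const bad = validateUserRecord(JSON.parse('{"id":42,"email":"nope"}'));
console.log(good.ok); // true
console.log(bad.ok);  // false, with one error per failing field
```

Downstream code only ever sees a `UserRecord` after the `ok` check, so invalid shapes are rejected and logged at the edge instead of surfacing later as an unrelated runtime error.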