Data File Formats

About data files

Data formats are the plumbing of the internet. Every API response, every config file, every spreadsheet export, every database dump: it's all structured data, and most of the time it lives in a text file (binary formats such as Parquet, HDF5, and SQLite are the exceptions). The format just determines how the structure is expressed: commas, curly braces, indentation, or angle brackets. Pick the wrong one and your evening is ruined.
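
To make "how the structure is expressed" concrete, here is a minimal sketch that writes one record as CSV, JSON, and XML using only the Python standard library. The record, its field names, and the helper names (to_csv, to_json, to_xml) are made up for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical record, used only for illustration.
record = {"name": "Ada", "city": "London", "zip": "02134"}


def to_csv(row):
    """Express the record with commas: a header line, then one data line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buf.getvalue()


def to_json(row):
    """Express the record with curly braces."""
    return json.dumps(row)


def to_xml(row):
    """Express the record with angle brackets: one child element per field."""
    root = ET.Element("record")
    for key, value in row.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(record)
json_text = to_json(record)
xml_text = to_xml(record)
```

The data is identical in all three outputs; only the punctuation around it changes.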

CSV is the lowest common denominator: almost everything can read it, although delimiters, quoting, and encodings vary between tools (and Excel will helpfully mangle your dates). JSON is what web APIs speak. XML is what enterprise systems speak. YAML is what DevOps speaks. TOML is what Rust and Python build tools speak. INI is what Windows spoke in the 1990s and somehow still does.

The choice between formats usually isn't a choice — it's dictated by whatever system you're talking to. But when you do get to choose: JSON for APIs and web data, YAML for configuration, CSV for tabular data, and SQL for database operations.

All data formats
.accdb ACCDB is the current Microsoft Access database format, repla...
.avro Avro is a row-based data serialisation format with built-in ...
.cfg CFG is a generic configuration file — plain text key-value p...
.conf CONF is a configuration file format common on Linux and Unix...
.csv CSV is a plain text file where each line is a row and commas...
.db DB is a generic database file extension — it could be SQLite...
.env ENV files store environment variables as key-value pairs — A...
.geojson GeoJSON is a JSON-based format for encoding geographic featu...
.hdf5 HDF5 is a format for storing large, complex scientific datas...
.ini INI is a simple configuration format using sections and key-...
.json JSON is a lightweight, human-readable data format used by vi...
.jsonl JSONL is the JSON Lines format — one JSON object per line, i...
.log LOG is a plain text file recording events, errors, or system...
.mat MAT is MATLAB's binary data format for storing matrices, arr...
.mdb MDB is the legacy Microsoft Access database format (pre-2007...
.ndjson NDJSON stores one JSON object per line — designed for stream...
.parquet Parquet is a columnar storage format for big data analytics ...
.plist Plist is Apple's configuration and data storage format — fou...
.properties Properties files are simple key-value configuration files us...
.sql SQL files contain database commands — CREATE, INSERT, SELECT...
.sqlite SQLite is a self-contained relational database in a single f...
.toml TOML is a configuration format designed for readability — us...
.tsv TSV is like CSV but uses tabs instead of commas — avoids del...
.xml XML is a markup language for structured data — verbose but p...
.yaml YAML is a human-readable configuration format using indentat...

Safety notes
.accdb Use caution

ACCDB files can contain VBA macros. Only open files from trusted sources, and consider disabling macros if you don't need them.

.env Use caution

ENV files often contain API keys, passwords, and other secrets. Never share .env files publicly.

.mdb Use caution

MDB files can contain VBA macros that execute automatically. Only open MDB files from trusted sources.

.sql Use caution

SQL files can contain destructive commands (DROP, DELETE). Review before executing against a database.
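
One way to start that review is a quick keyword scan before anything is run. The sketch below is deliberately crude and does not parse SQL: it flags lines containing DROP or DELETE (plus TRUNCATE, which is an addition here and not part of the note above) and skips lines that begin with a `--` comment. The function name flag_destructive is hypothetical.

```python
import re

# Keywords to flag. DROP and DELETE come from the safety note;
# TRUNCATE is an extra assumption added for this sketch.
DESTRUCTIVE = re.compile(r"\b(DROP|DELETE|TRUNCATE)\b", re.IGNORECASE)


def flag_destructive(sql_text):
    """Return (line_number, keyword, line) for each line that looks destructive.

    This is a plain text scan, so it can miss statements or raise false
    alarms (for example, a keyword inside a string literal).
    """
    hits = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        if line.lstrip().startswith("--"):
            continue  # skip lines that are entirely a SQL comment
        match = DESTRUCTIVE.search(line)
        if match:
            hits.append((number, match.group(1).upper(), line.strip()))
    return hits
```

A scan like this only tells you where to look first. It is no substitute for reading the file or testing it against a disposable copy of the database.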

FAQ
What's the difference between JSON and XML?
Both represent structured data. JSON uses curly braces and square brackets — it's compact and easy to parse. XML uses opening and closing tags — it's verbose but supports schemas, namespaces, and complex document structures. JSON dominates web APIs; XML dominates enterprise systems.
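
As a rough illustration of the parsing difference, the sketch below reads the same made-up item from JSON and from XML with Python's standard library. The sample documents and the function names are assumptions for the example. Note that JSON hands back numbers and lists directly, while XML attribute values and element text arrive as strings and have to be converted.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample documents describing the same item.
json_text = '{"id": 7, "tags": ["a", "b"]}'
xml_text = '<item id="7"><tag>a</tag><tag>b</tag></item>'


def parse_json(text):
    """JSON maps straight onto native types: id is an int, tags is a list."""
    data = json.loads(text)
    return data["id"], data["tags"]


def parse_xml(text):
    """XML values are strings, so the id needs an explicit int() conversion."""
    root = ET.fromstring(text)
    return int(root.get("id")), [tag.text for tag in root.findall("tag")]
```
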
How do I open a CSV file?
Any text editor shows the raw data. Excel, Google Sheets, and LibreOffice Calc display it in a proper grid. Be aware that Excel may auto-format data (turning ZIP codes into numbers, dates into different formats) unless you import with explicit column types.
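
If you would rather read the file in code, Python's csv module returns every field as a string, so a leading zero in a ZIP code survives. The sample data and the function name read_rows below are made up for illustration.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file opened
# with open(path, newline="", encoding="utf-8").
sample = "name,zip,joined\nAda,02134,2024-01-05\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, keeping every value as a string."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(sample)
```

Converting a column to a number or a date is then an explicit step you choose to take, which is the opposite of what Excel does by default.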
What's YAML used for?
YAML is the de facto standard for DevOps and infrastructure configuration: Docker Compose files, Kubernetes manifests, GitHub Actions workflows, and other CI/CD pipelines. It uses indentation instead of brackets, which makes it readable but fragile: a single misplaced space can change the structure or stop the file from parsing.