Writing a Great Dataset README

A README is the front door to your dataset. Learn what to put in it so anyone — including future you — can understand, trust, and reuse your data without asking for help.

The front door to your data

A README is a plain, human-readable file that explains a dataset. It is usually the first thing a reuser opens, and often the difference between data that gets reused and data that gets abandoned. A basic one takes minutes to write and can save every reuser hours.

What to include

  • Title and summary — what the dataset is, in one or two sentences.
  • Creators and contact — who made it and how to reach them.
  • Files — a list of files and what each contains.
  • Variables — every column/field explained, with units and codes (e.g. what "-99" or "NA" means).
  • Methods — how the data was collected or generated.
  • Licence — the terms of reuse.
  • Citation — how you would like the dataset cited, ideally with its DOI.
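The checklist above can also be applied mechanically. The following is a minimal sketch, not a standard tool: it assumes the README is plain text with each section heading on its own line, and the names `REQUIRED_SECTIONS`, `readme_skeleton`, and `missing_sections` are hypothetical, chosen only for illustration.

```python
# Headings mirror the checklist above; all names here are illustrative assumptions.
REQUIRED_SECTIONS = [
    "Title and summary",
    "Creators and contact",
    "Files",
    "Variables",
    "Methods",
    "Licence",
    "Citation",
]


def readme_skeleton(sections=REQUIRED_SECTIONS):
    """Return a plain-text README template with one placeholder section per heading."""
    blocks = [f"{name}\n{'-' * len(name)}\nTODO\n" for name in sections]
    return "\n".join(blocks)


def missing_sections(readme_text, sections=REQUIRED_SECTIONS):
    """List the headings that do not appear on a line of their own in the README."""
    lines = {line.strip().lower() for line in readme_text.splitlines()}
    return [name for name in sections if name.lower() not in lines]


if __name__ == "__main__":
    # A draft that covers only three of the seven sections.
    draft = readme_skeleton(["Title and summary", "Files", "Licence"])
    print(missing_sections(draft))
    # prints ['Creators and contact', 'Variables', 'Methods', 'Citation']
```

A check like this only confirms that a heading exists. Whether the Variables section really explains every column, unit, and missing-value code still needs a human read.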

Write for a stranger

Assume the reader knows your field but nothing about your specific project. Spell out abbreviations, state assumptions, and never rely on context that lives only in your head or your lab notebook.

The payoff

A good README makes your data more Findable and Reusable, two of the four FAIR principles (Findable, Accessible, Interoperable, Reusable). It also cuts the emails you get asking "what does this column mean?" and makes your work more likely to be cited.