
Data Preparation

Raw data is rarely chart-ready.

PressViz works best when your dataset is focused, clear, and shaped around a single story. This guide shows how to reduce noisy spreadsheets into something that produces a clean chart instead of an overloaded one.

Charts are for communication, not storage.

When a dataset is too large or too detailed, the result is usually harder to read:

  • bars, lines, or dots begin to overlap
  • axis labels become crowded
  • interactions feel slower
  • the chart stops telling a clear story

PressViz includes a 500-row import limit to help keep both the editor and the published chart usable. That limit is more than a technical guardrail: it also encourages better data storytelling.

The best charts usually answer one focused question, not every possible question in the source file.

If you are importing years of daily data, start by narrowing the time period.

Instead of uploading:

  • All daily ridership from 2020–2025

Try something more focused:

  • January 2024 daily ridership compared with January 2023
  • The last 90 days of signups
  • Q4 2024 weekly revenue

Ask yourself:

  • What story am I trying to tell?
  • What date range actually matters to that story?

To narrow the data:

  1. Open the data in Excel or Google Sheets.
  2. Filter to the date range that matters.
  3. Copy the filtered rows into a new sheet.
  4. Export that smaller sheet as CSV.
  5. Upload the refined CSV to PressViz.
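
If you would rather script the date filter than do it in a spreadsheet, the steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a PressViz feature: the column names, the sample rows, and the `filter_by_date` helper are all hypothetical, and it assumes dates are stored as `YYYY-MM-DD`.

```python
import csv
import io
from datetime import date


def filter_by_date(rows, date_column, start, end):
    """Keep rows whose ISO date (YYYY-MM-DD) falls within [start, end]."""
    return [
        row for row in rows
        if start <= date.fromisoformat(row[date_column]) <= end
    ]


# Hypothetical sample standing in for a much larger raw export.
raw_csv = """date,ridership
2023-12-31,1200
2024-01-01,1500
2024-01-15,1800
2024-01-31,1750
2024-02-01,1600
"""

rows = list(csv.DictReader(io.StringIO(raw_csv)))
january = filter_by_date(rows, "date", date(2024, 1, 1), date(2024, 1, 31))

# Write the smaller dataset back out as CSV text, ready to save and upload.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["date", "ridership"], lineterminator="\n")
writer.writeheader()
writer.writerows(january)
refined_csv = out.getvalue()
```

Here `january` holds only the three January 2024 rows, and `refined_csv` is the text you would save as the refined file.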

Daily data is often too dense for a clean chart, especially over long periods.

If you want to show a trend, aggregate the data into:

  • weekly values
  • monthly values
  • quarterly values

For example:

  • Raw data: 365 daily records
  • Refined data: 52 weekly aggregates

You keep the trend, but remove the clutter.

  1. Add a helper column for week, month, or quarter.
  2. Use a pivot table or formulas like SUMIF() to group the data.
  3. Calculate the metric you need: sum, average, or count.
  4. Keep only the aggregated rows for export.
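
The same grouping can be sketched in Python if a pivot table is not convenient. The example below is an assumption-laden illustration: it treats weeks as starting on Monday, uses made-up daily values, and the `week_start` and `weekly_totals` helpers are hypothetical names. Swap the sum for an average or a count if that is the metric you need.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week containing `day` (the helper column)."""
    return day - timedelta(days=day.weekday())


def weekly_totals(records):
    """Sum values per week. `records` is an iterable of (date, value) pairs."""
    totals = defaultdict(int)
    for day, value in records:
        totals[week_start(day)] += value
    return dict(sorted(totals.items()))


# Hypothetical data: 14 daily records of 100 each, starting Monday 2024-01-01.
daily = [(date(2024, 1, 1) + timedelta(days=i), 100) for i in range(14)]
weekly = weekly_totals(daily)
```

Fourteen daily rows collapse into two weekly rows, one per Monday, which is the shape you would export.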

Too many categories or series can turn a chart into noise.

Instead of showing everything:

  • All 50 U.S. states
  • All 150 product SKUs
  • Every department in the company

Focus on a useful subset:

  • Top 5 states by population
  • Best-selling categories
  • Sales and Marketing only

Once a chart goes beyond about 5–7 series, it usually becomes much harder to read.

Filter to:

  • top performers
  • most relevant groups
  • the categories tied to the story you are telling
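
For the "top performers" case, one possible approach is to rank categories by their total and keep only the rows for the leaders. The sketch below uses hypothetical sales rows and a hypothetical `top_categories` helper; the right ranking metric depends on your story.

```python
def top_categories(rows, category_key, value_key, n=5):
    """Return the n category names with the largest total value."""
    totals = {}
    for row in rows:
        name = row[category_key]
        totals[name] = totals.get(name, 0) + row[value_key]
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:n]


# Hypothetical sales rows; a real export would have many more categories.
sales = [
    {"state": "CA", "revenue": 500},
    {"state": "TX", "revenue": 400},
    {"state": "NY", "revenue": 300},
    {"state": "CA", "revenue": 200},
    {"state": "VT", "revenue": 50},
]

keep = top_categories(sales, "state", "revenue", n=3)
focused = [row for row in sales if row["state"] in keep]
```

`focused` keeps every row for the three leading states and drops the rest, so the chart only has three series to draw.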

Many raw exports contain far more columns than a chart needs.

PressViz usually needs:

  • one label column
  • one or more value columns

Everything else is often extra noise.

  1. Identify the label column.
  2. Identify the value column or columns.
  3. Remove metadata, notes, audit fields, and unused calculations.
  4. Rename columns clearly.
  5. Export the cleaned sheet.

Good column names help both you and your readers:

  • Region
  • Week
  • Q4 Revenue
  • Average Ridership
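
Dropping and renaming columns can also be scripted. In this sketch the raw header names (`rgn_cd`, `q4_rev_usd`, and so on) and the `select_and_rename` helper are invented for illustration; substitute whatever your own export uses.

```python
import csv
import io

# Hypothetical mapping from raw export headers to clear chart names.
# Any column not listed here is dropped.
COLUMNS = {"rgn_cd": "Region", "q4_rev_usd": "Q4 Revenue"}


def select_and_rename(rows, columns):
    """Keep only the mapped columns and give them their new names."""
    return [{new: row[old] for old, new in columns.items()} for row in rows]


raw_csv = """rgn_cd,q4_rev_usd,updated_by,notes
North,120000,jsmith,checked
South,95000,adoe,
"""

rows = list(csv.DictReader(io.StringIO(raw_csv)))
cleaned = select_and_rename(rows, COLUMNS)
```

The audit and notes fields disappear, leaving one label column and one value column with readable names.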

Imagine you have MTA daily ridership data for 7 years and want to compare recent trends across transit lines.

  • about 2,555 rows
  • about 12 columns
  • too much detail for a quick visual comparison

To refine it:

  1. Filter to the last 3 months.
  2. Aggregate to weekly averages.
  3. Keep only the columns for:
    • week
    • line
    • average ridership
  4. Limit the comparison to the top 3 lines.

Instead of a massive raw sheet, you now have:

  • roughly 36 focused rows
  • 3 useful columns
  • a chart that clearly compares weekly ridership trends

That is the difference between dumping data into a chart and shaping data into a story.
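
To make the arithmetic concrete, here is a rough Python sketch of that refinement. Everything in it is synthetic: the five line names, the ridership numbers, and the choice of exactly 12 Monday-to-Sunday weeks as the "last 3 months" window are assumptions for illustration, not real MTA data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# Hypothetical raw data, already limited to the recent window (step 1):
# 5 lines, 84 days (12 full weeks) starting Monday 2024-01-01.
LINES = {"A": 900, "B": 700, "C": 500, "D": 300, "E": 100}
raw = [
    {"date": date(2024, 1, 1) + timedelta(days=d), "line": name, "ridership": base + d}
    for d in range(84)
    for name, base in LINES.items()
]

# Step 4: find the top 3 lines by total ridership.
totals = defaultdict(int)
for row in raw:
    totals[row["line"]] += row["ridership"]
top3 = sorted(totals, key=totals.get, reverse=True)[:3]

# Steps 2 and 3: weekly average per line, keeping only three columns.
groups = defaultdict(list)
for row in raw:
    if row["line"] in top3:
        monday = row["date"] - timedelta(days=row["date"].weekday())
        groups[(monday, row["line"])].append(row["ridership"])

refined = [
    {"Week": week.isoformat(), "Line": line, "Average Ridership": mean(values)}
    for (week, line), values in sorted(groups.items())
]
```

With 12 weeks and 3 lines, `refined` contains 36 rows of 3 columns each, which matches the rough size described above.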

| Mistake | Why it fails | Better approach |
| --- | --- | --- |
| Uploading the entire raw dataset | Too much overlap and poor readability | Filter by time range or category |
| Mixing daily and weekly data | Inconsistent granularity confuses the chart | Aggregate everything to one level |
| Showing 15 or more series | The visual becomes spaghetti | Limit to the top 5–7 series |
| Keeping missing values | Gaps can mislead or weaken the chart | Fill or remove incomplete rows |
| Using vague column names | Hard to understand inside the editor and on the final chart | Rename columns clearly before import |
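
For the missing-values case, removing incomplete rows might look like the sketch below. The `is_blank` and `drop_incomplete` helpers and the sample rows are hypothetical. Note that a literal `0` is treated as a real value, not as missing.

```python
def is_blank(value):
    """True for None or whitespace-only text; a literal 0 counts as a value."""
    return value is None or str(value).strip() == ""


def drop_incomplete(rows, required):
    """Keep only rows where every required column has a non-blank value."""
    return [
        row for row in rows
        if not any(is_blank(row.get(col)) for col in required)
    ]


# Hypothetical weekly rows with one gap.
rows = [
    {"Week": "2024-W01", "Average Ridership": "1500"},
    {"Week": "2024-W02", "Average Ridership": ""},
    {"Week": "2024-W03", "Average Ridership": "0"},
]

complete = drop_incomplete(rows, ["Week", "Average Ridership"])
```

Whether to drop a row or fill the gap is a judgment call: dropping is simpler, but filling may be more honest when the gap itself is part of the story.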

You do not need special software to prepare data well.

Useful options:

  • Excel
  • Google Sheets
  • Python with pandas
  • SQL queries
  • OpenRefine

For most users, Excel or Google Sheets is enough.

Different chart types tolerate different levels of density.

| Chart type | Ideal row count | Max series | Notes |
| --- | --- | --- | --- |
| Bar / Line | 10–50 | 3–5 | Good for trends and comparisons |
| Pie / Doughnut | 5–12 | 1 | Keep slices limited |
| Area | 10–50 | 2–3 | Works best with fewer series |
| Scatter | 20–100 | Multiple | Each point matters, so clarity still matters |

These are not hard rules, but they are strong defaults.

Before you upload, check that:

  • File has fewer than 500 rows
  • File is under 5 MB
  • First row contains headers
  • Column names are clear and descriptive
  • Extra columns have been removed
  • Missing values are handled
  • Dates use a consistent format
  • You can explain the chart in one sentence

Example:

This chart shows weekly average ridership for our top 3 subway lines over the last quarter.

If you cannot describe the chart simply, the data probably still needs refinement.
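
Some of the mechanical checklist items can be automated before upload. The following is a rough sketch, not an official PressViz validator: the `check_csv` function is hypothetical, it assumes 5 MB means 5 × 1,048,576 bytes, it assumes a single expected date format, and it cannot tell whether your column names are actually descriptive.

```python
import csv
import io
from datetime import datetime

MAX_ROWS = 500                 # row limit stated in this guide
MAX_BYTES = 5 * 1024 * 1024    # 5 MB, assuming 1 MB = 1,048,576 bytes


def check_csv(text, date_column=None, date_format="%Y-%m-%d"):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if len(text.encode("utf-8")) >= MAX_BYTES:
        problems.append("file is 5 MB or larger")

    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    rows = list(reader)

    if not header or any(not name.strip() for name in header):
        problems.append("first row is missing or has blank headers")
    if len(rows) >= MAX_ROWS:
        problems.append(
            f"{len(rows)} data rows; the checklist asks for fewer than {MAX_ROWS}"
        )
    if any(not cell.strip() for row in rows for cell in row):
        problems.append("some cells are empty")

    # Only check dates when the caller names a date column that exists.
    if date_column in header:
        idx = header.index(date_column)
        for row in rows:
            try:
                datetime.strptime(row[idx], date_format)
            except (ValueError, IndexError):
                problems.append(
                    f"dates in '{date_column}' do not all match {date_format}"
                )
                break
    return problems


good = "date,riders\n2024-01-01,10\n2024-01-02,12\n"
bad = "date,riders\n2024-01-01,10\n01/02/2024,\n"

good_report = check_csv(good, date_column="date")
bad_report = check_csv(bad, date_column="date")
```

`good_report` comes back empty, while `bad_report` flags the empty cell and the inconsistent date. The one-sentence test still has to be done by a person.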

The 500-row limit is a practical cap designed to keep charts readable and the editor responsive. Even below that number, smaller and more focused datasets usually produce better results.

You can chart many series at once, but most readers will struggle to compare them. In practice, 3–7 series is a much better range for clear charts.

Aggregate your data before you import it. PressViz does not aggregate raw datasets for you, so preparing the data first gives you more control and a better result.

What if I want to show trends over five years?

Use monthly or quarterly values instead of daily rows. 60 monthly points are easier to understand than 1,825 daily ones.