dygraphs Data Format

When you create a Dygraph object, your code looks something like this:

g = new Dygraph(document.getElementById("div"), data, { options });

This document is about what you can put in the data parameter.

There are five types of input that dygraphs will accept:

  1. CSV data
  2. URL
  3. array (native format)
  4. function
  5. DataTable

These are all discussed below. If you're trying to debug why your input won't parse, check the JS error console. dygraphs tries to log informative errors explaining what's wrong with your data, and these can often point you in the right direction.

There are several options which affect how your input data is interpreted. These are:

  • xValueParser affects CSV only.
  • errorBars affects all input types.
  • customBars affects all input types.
  • fractions affects all input types.
  • labels affects all input types.

CSV

Here's an example of what CSV data should look like:

Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201

"CSV" is actually a bit of a misnomer: the data can be tab-delimited, too. The delimiter is set by the delimiter option, which defaults to ",". If that delimiter is not found in the first row, dygraphs switches over to tab.

CSV parsing can be split into three parts: headers, x-value and y-values.

Headers

If you don't specify the labels option, dygraphs will look at the first line of your CSV data to get the labels. If you see numbers for series labels when you hover over the dygraph, it's likely because your first line contains data but is being parsed as a label. The solution is to either add a header line or specify the labels like this:

new Dygraph(el,
    "2009/07/12,100,200\n" +
    "2009/07/19,150,201\n",
    { labels: [ "Date", "Series1", "Series2" ] });

x-values

Once the headers are parsed, dygraphs needs to determine the type of the x-values: they're either dates or numbers. To make this determination, it looks at the first column of the first row ("2009/07/12" in the example above). Here's the heuristic: if it contains a '-' or a '/', or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a number.
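
The heuristic can be sketched in a few lines. This is only an illustration of the rule as stated above, not the library's actual code; the function name xValueType is hypothetical.

```javascript
// Hypothetical helper (not part of the dygraphs API): classifies the first
// x-value of a CSV file using the rule described in the text.
function xValueType(firstCell) {
  const hasDateSeparator =
    firstCell.indexOf("-") >= 0 || firstCell.indexOf("/") >= 0;
  const parsesAsFloat = !Number.isNaN(parseFloat(firstCell));
  // A date separator, or a failure to parse as a float, means "date".
  return hasDateSeparator || !parsesAsFloat ? "date" : "number";
}

console.log(xValueType("2009/07/12")); // date
console.log(xValueType("12.5"));       // number
```

Read literally, this rule would also class a negative number such as "-5" as a date. The library may treat that case more carefully, so check the console if a numeric axis unexpectedly shows dates.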

Once the type is determined, that doesn't mean all the values will parse correctly. The general rule is:

  • For dates, your strings have to be parseable by Date.parse.
  • For numbers, your strings have to be parseable by parseFloat.

You can manually verify this using a JavaScript console. If a value doesn't parse, dygraphs will put a warning about it on your console. But beware: different browsers support different date formats!
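
A check like the following can be pasted into a console to see which strings the two built-in parsers accept. The helper names are hypothetical; only Date.parse and parseFloat are real built-ins.

```javascript
// Hypothetical helpers (not part of dygraphs) wrapping the built-in parsers.
function parsesAsDate(str) {
  // Date.parse returns milliseconds since epoch, or NaN if it cannot parse.
  return !Number.isNaN(Date.parse(str));
}

function parsesAsNumber(str) {
  // parseFloat returns NaN when the string does not start with a number.
  return !Number.isNaN(parseFloat(str));
}

for (const s of ["2009-07-12", "2009/07/12 12:34", "foo"]) {
  console.log(JSON.stringify(s), "date?", parsesAsDate(s));
}
for (const s of ["12.", "1.24e+1", "abc"]) {
  console.log(JSON.stringify(s), "number?", parsesAsNumber(s));
}
```

Note that parseFloat is lenient: it reads a leading number and ignores trailing text, so "12abc" yields 12 rather than a failure. Results for non-ISO date strings depend on the browser, so run the check in every browser you need to support.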

Here are some valid date formats:

  • 2009-07-12
  • 2009/07/12
  • 2009/07/12 12
  • 2009/07/12 12:34
  • 2009/07/12 12:34:56

If you specify the xValueParser option, then all this detection is bypassed and your function is called instead. Your parser function takes in a string and needs to return a number. For dates/times, you should return milliseconds since epoch. You may also want to specify a few other options, such as a formatter and a ticker for the x-axis, to make sure that everything gets displayed properly.

Here's code which parses a CSV file with unix timestamps in the first column:

new Dygraph(el,
    "Date,Series1,Series2\n" +
    "1247382000,100,200\n" +
    "1247986800,150,201\n",
    {
      xValueFormatter: Dygraph.dateString_,
      // Unix timestamps are in seconds; dygraphs expects milliseconds.
      xValueParser: function(x) { return 1000 * parseInt(x, 10); },
      xTicker: Dygraph.dateTicker
    });

y-values

Dependent (y-axis) values are simpler than x-values because they're always numbers. The complexity here comes from the various ways that you can specify the uncertainty in your measurements.

If your y-values are just numbers, then they need to be parseable by JavaScript's parseFloat function. Acceptable formats include:

  • 12
  • -12
  • 12.
  • 12.3
  • 1.24e+1
  • -1.24e+1

If you have missing data, just leave the column blank (your CSV file will probably contain a ",," in it).
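
To make the blank-column case concrete, here is a rough sketch of how one data line might be split and converted. It is not the library's parser; the helper name parseDataLine and the choice of null for a blank cell are assumptions made for illustration.

```javascript
// Hypothetical helper (not the dygraphs parser): splits one CSV data line,
// keeps the x-value as a string, and converts each y-value with parseFloat.
// Representing a blank column as null is an assumption of this sketch.
function parseDataLine(line, delimiter = ",") {
  const [x, ...cells] = line.split(delimiter);
  const ys = cells.map((cell) =>
    cell.trim() === "" ? null : parseFloat(cell)
  );
  return [x, ...ys];
}

// The ",," marks a missing value for Series1 on this date.
console.log(parseDataLine("2009/07/12,,200"));
console.log(parseDataLine("2009/07/19,150,201"));
```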

If your numbers have uncertainty associated with them, then there are three basic ways to express this: using fractions, standard deviations or explicit ranges.

Fractions

If you specify the fractions option, then your data will all be interpreted as ratios between zero and one. This is often the case if you're plotting a percentage.

new Dygraph(el,
    "X,Frac1,Frac2\n" +
    "1,1/2,3/4\n" +
    "2,1/3,2/3\n" +
    "3,2/3,17/49\n" +
    "4,25/30,100/200",
    { fractions: true });

Why not just divide the fractions out yourself? There are two attractive reasons not to:

  • If you set both fractions and errorBars, then the denominator is interpreted as a sample size and dygraphs will plot Wilson binomial proportion confidence intervals around each point.
  • If you set showRoller, then dygraphs will combine the values as fractions. If two points are a/b and c/d, it will plot (a+c) / (b+d) rather than (a/b + c/d) / 2, which is what you'd get if you divided the fractions through. This will also shrink the confidence intervals.
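
The two points above can be illustrated with a small sketch. The helper names are hypothetical, and wilsonInterval uses the textbook Wilson score formula with z = 2 as an assumed default; the exact computation inside dygraphs may differ in detail.

```javascript
// Hypothetical helpers for illustration; none of these are dygraphs APIs.

// Pool fractions as described above: sum numerators and denominators.
function poolFractions(fractions) {
  let num = 0;
  let den = 0;
  for (const [n, d] of fractions) {
    num += n;
    den += d;
  }
  return [num, den];
}

// Naive alternative: average the already-divided ratios.
function averageRatios(fractions) {
  const sum = fractions.reduce((acc, [n, d]) => acc + n / d, 0);
  return sum / fractions.length;
}

// Textbook Wilson score interval for the proportion num/den.
// z is the number of standard deviations (assumed default: 2).
function wilsonInterval(num, den, z = 2) {
  const p = num / den;
  const z2 = z * z;
  const scale = 1 + z2 / den;
  const center = (p + z2 / (2 * den)) / scale;
  const half =
    (z * Math.sqrt((p * (1 - p)) / den + z2 / (4 * den * den))) / scale;
  return [center - half, center + half];
}

const points = [[1, 3], [25, 30]];
const [num, den] = poolFractions(points);
console.log("pooled:", num / den);             // weights by sample size
console.log("averaged:", averageRatios(points)); // treats both points equally
console.log("interval, 1/2:", wilsonInterval(1, 2));
console.log("interval, 100/200:", wilsonInterval(100, 200));
```

Pooling weights each point by its sample size, and the larger pooled denominator is what narrows the interval: 100/200 gets a much tighter band than 1/2 even though both ratios equal 0.5.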

Standard Deviations

Often you have a measurement and also a measure of its uncertainty: a standard deviation. If you specify the errorBars option, dygraphs will look for alternating value and standard deviation columns in your CSV data. Here's what it should look like:

new Dygraph(el,
    "X,Y1,Y2\n" +
    "1,10,5,20,5\n" +
    "2,12,5,22,5\n",
    { errorBars: true });

The "5" values are standard deviations. When each point is plotted, a region of two standard deviations on either side of it is shaded, which corresponds to roughly a 95% confidence interval. If you want more or less confidence, you can set the sigma option to something other than 2.0.

When you roll data with standard deviations, dygraphs will plot the average of your values in each rolling period and the RMS value of your standard deviations: sqrt(std1^2 + std2^2 + std3^2 + ... + stdN^2)/N.
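
As a rough sketch of the arithmetic, the following assumes the shaded band is value ± sigma × std and that rolled deviations are combined as the square root of the sum of squares, divided by N. The helper names are hypothetical and this is not the library's code.

```javascript
// Hypothetical helpers for illustration; not part of the dygraphs API.

// Shaded band around one point: sigma standard deviations on either side.
function errorBand(value, std, sigma = 2.0) {
  return [value - sigma * std, value + sigma * std];
}

// Roll N [value, std] pairs into one: mean of the values, and
// sqrt(std1^2 + ... + stdN^2) / N for the combined deviation.
function rollWithStd(points) {
  const n = points.length;
  let sum = 0;
  let sumOfSquares = 0;
  for (const [value, std] of points) {
    sum += value;
    sumOfSquares += std * std;
  }
  return [sum / n, Math.sqrt(sumOfSquares) / n];
}

console.log(errorBand(10, 5));                  // band for the point 10 ± 2*5
console.log(rollWithStd([[10, 5], [12, 5]]));   // rolled value and deviation
```

Rolling two points that each have a deviation of 5 gives a combined deviation of about 3.54, so the band narrows as the rolling period grows.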

Custom error bars

Sometimes your data has asymmetric uncertainty or you want to specify something else with the error bars around a point. One example of this is the "temperatures" demo on the dygraphs home page, where the point is the daily average and the bars denote the low and high temperatures for the day.

To specify this format, set the customBars option. Your CSV values should each be three numbers separated by semicolons ("low;mid;high"). Here's an example:

new Dygraph(el,
    "X,Y1,Y2\n" +
    "1,10;20;30,20;5;25\n" +
    "2,10;25;35,20;10;25\n",
    { customBars: true });

The middle value need not lie between the low and high values. If you set a rolling period, the three values will all be averaged independently.
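
Independent averaging can be sketched as follows. The helpers are hypothetical illustrations of the "low;mid;high" cell format and of averaging each position separately; they are not the library's implementation.

```javascript
// Hypothetical helpers for illustration; not part of the dygraphs API.

// Parse one "low;mid;high" CSV cell into three numbers.
function parseCustomBar(cell) {
  return cell.split(";").map((part) => parseFloat(part));
}

// Average a window of [low, mid, high] triples position by position.
function rollCustomBars(bars) {
  const n = bars.length;
  return [0, 1, 2].map(
    (i) => bars.reduce((acc, bar) => acc + bar[i], 0) / n
  );
}

const bars = ["10;20;30", "10;25;35"].map(parseCustomBar);
console.log(rollCustomBars(bars)); // low, mid and high averaged separately
```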

URL

If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and attempt to parse the returned data as CSV.

Common problems. Make sure the URL is accessible and returns data in text format (as opposed to a CSV file with an HTML header). You can see what the response looks like by checking your JS console or by requesting the URL yourself.

Array (native format)

If you'll be constructing your data set from a server-side program (or from JavaScript) then you're better off producing an array than CSV data. This saves the cost of parsing the CSV data and also avoids common parser errors.

The downside is that it's harder to look at your data (you'll need to use a JS debugger) and that the data format is a bit less clear for values with uncertainties.

Here's an example of "native format":

new Dygraph(document.getElementById("graphdiv2"),
    [
      [1,10,100],
      [2,20,80],
      [3,50,60],
      [4,70,80]
    ],
    { labels: [ "x", "A", "B" ] });

Headers

Headers for native format must be specified via the labels option. There's no other way to set them.

x-values

If you want your x-values to be dates, you'll need to specify a Date object in the first column. Otherwise, specify a number. Here's a sample array with dates on the x-axis:

[
  [ new Date("2009/07/12"), 100, 200 ],
  [ new Date("2009/07/19"), 150, 220 ]
]
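
If your data starts out as records (for example, decoded from a JSON response), a mapping like the one below produces rows in this shape. The record fields and variable names are hypothetical. The sketch builds each Date from numeric parts, which is one way to sidestep browser differences in date-string parsing.

```javascript
// Hypothetical input records; the field names are made up for illustration.
const records = [
  { year: 2009, month: 7, day: 12, a: 100, b: 200 },
  { year: 2009, month: 7, day: 19, a: 150, b: 220 },
];

// Native format needs labels supplied separately via the labels option.
const labels = ["Date", "A", "B"];

// One row per record: a Date object first, then one number per series.
// JavaScript months are zero-based, hence month - 1.
const rows = records.map((r) => [
  new Date(r.year, r.month - 1, r.day),
  r.a,
  r.b,
]);

console.log(rows);
// In a page, these would be passed as:
//   new Dygraph(el, rows, { labels: labels });
```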

y-values

You can specify errorBars, fractions or customBars with the array format. If you specify any of these, the values become arrays (rather than numbers). Here's what the format looks like for each one:

errorBars: [x, [value1, std1], [value2, std2], ...]
fractions: [x, [num1, den1], [num2, den2], ...]
customBars: [x, [low1, val1, high1], [low2, val2, high2], ...]

To specify missing data, set the value to null or NaN. When a series uses arrays (errorBars, fractions or customBars), you may not set an element inside the array to null or NaN; replace the entire array with null or NaN instead. The only difference between the two appears when the connectSeparatedPoints option is true: in that case, the gaps created by nulls are filled in, and gaps created by NaNs are preserved.
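
A row builder for the errorBars case might look like the sketch below. The helper name and the shape of the input measurements are hypothetical; the point is that an incomplete measurement becomes a bare null rather than an array containing null.

```javascript
// Hypothetical helper: builds one native-format row for errorBars: true.
// Each measurement is either null (missing) or { value, std }.
function toErrorBarRow(x, measurements) {
  const cells = measurements.map((m) => {
    const missing =
      m === null || Number.isNaN(m.value) || Number.isNaN(m.std);
    // Replace the whole [value, std] pair with null when data is missing;
    // never emit something like [null, 5].
    return missing ? null : [m.value, m.std];
  });
  return [x, ...cells];
}

const rows = [
  toErrorBarRow(1, [{ value: 10, std: 5 }, { value: 20, std: 5 }]),
  toErrorBarRow(2, [{ value: 12, std: 5 }, null]),
];
console.log(JSON.stringify(rows));
```

Use NaN instead of null in the helper if you want the gap to survive when connectSeparatedPoints is true.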

Functions

You can specify a function that returns any of the other types. If x is a valid piece of dygraphs input, then so is

function() { return x; }

Functions can return strings, arrays, data tables, URLs, or any other data type.
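
For example, a function can generate native-format rows on demand. The factory name below is hypothetical; the returned function takes no arguments and returns an array, which is one of the accepted input types.

```javascript
// Hypothetical factory: returns a zero-argument function that produces
// native-format rows [x, x squared] for x = 0 .. n-1.
function makeSquaresData(n) {
  return function () {
    const rows = [];
    for (let x = 0; x < n; x++) {
      rows.push([x, x * x]);
    }
    return rows;
  };
}

const data = makeSquaresData(4);
console.log(data());
// In a page, the function itself would be passed as the data parameter:
//   new Dygraph(el, data, { labels: ["x", "x squared"] });
```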

DataTable

You can also specify a Google Visualization Library DataTable object as your input data. This lets you easily switch between dygraphs and other gviz visualizations such as the Annotated Timeline. It also lets you embed a Dygraph in a Google Spreadsheet.

You'll need to set your first column's type to one of "number", "date" or "datetime".

DataTable TODO:
- When to use Dygraph.GvizWrapper
- how to specify fractions
- how to specify missing data
- how to specify value + std. dev.
- how to specify [low, middle, high]
- walkthrough of embedding a gadget in google docs/on a web page
- walkthrough of using std. dev. in a spreadsheet chart