The Visualization Process

1. Acquisition and Preparation
To make visualizations of data, we perform several operations:

  1. Acquire data
  2. Prepare data formats for import
  3. Import data into visualization program
  4. Manipulate data realizations
  5. Produce output
Obviously, the first step is to acquire some data, although simple models can be generated and run entirely within DX using the remarkable Compute module, with no external data source at all. Normally, however, one comes to DX with data acquired through experiment, database extraction, or simulation on a computer.

An important step that newcomers think they can leave out is to assemble clear documentation of the details of the stored data. All too often, someone who worked in your research lab five years ago invented the format you are still using, but no one there now knows exactly what that format is. Or you invented the scheme six months ago, but didn't document it because you were sure you'd remember it (any red faces out there?). Or you use more than one format variation for different types of data and cannot quite remember the specific details of each. Sorry: you need the details. DX is a visualization program, not a mind-reading program. So good documentation of how the data files are structured must be available at the time you sit down to describe your data to DX. It should cover the number type (float, integer, etc.), the binary format (IEEE, MSB, LSB), the structure (scalar or vector, the vector shape, whether the components are stored together or in separate blocks), and so on.
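
To see why these details matter, here is a minimal Python sketch (it is not part of DX) that decodes the same eight bytes under two different byte-order assumptions. The function name `decode_floats` and the sample values are hypothetical, chosen only for illustration, and the sketch assumes 32-bit IEEE floats.

```python
import struct

def decode_floats(raw, count, byte_order):
    """Decode `count` 32-bit IEEE floats from the bytes in `raw`.

    byte_order is "msb" (most significant byte first, big-endian)
    or "lsb" (least significant byte first, little-endian).
    """
    prefix = {"msb": ">", "lsb": "<"}[byte_order]
    return list(struct.unpack(f"{prefix}{count}f", raw[:4 * count]))

# Two floats written most-significant-byte first.
raw = struct.pack(">2f", 1.0, 2.5)

right = decode_floats(raw, 2, "msb")  # matches how the data was written
wrong = decode_floats(raw, 2, "lsb")  # same bytes, wrong assumption
```

With the correct byte order the original values come back; with the wrong one the same bytes decode, without any error, to meaningless numbers. Nothing in the file itself tells you which reading is right, which is why the documentation has to.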

 

If your data is in a "spreadsheet-like" format (a tab-delimited text file, for instance), many of these details are unnecessary. The ImportSpreadsheet module does a bit of mind-reading: it figures out the data type from the first entry in each column, and it is clever enough to stop reading at the end of the data, so you don't have to tell it in advance how many rows there are. However, many data sets are impossible or unwieldy to describe in spreadsheet form. The Import module, by contrast, can read so many different types of data that you must provide it with this detailed information.
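
The idea of guessing a column's type from its first entry can be sketched in a few lines of Python. This is only an illustration of the principle, not ImportSpreadsheet's actual logic; the names `guess_type` and `read_table` are hypothetical, and the sketch assumes the first row of the file holds column names.

```python
import csv
import io

def guess_type(text):
    """Guess a type name from one cell: integer, then float, else string."""
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            caster(text)
            return name
        except ValueError:
            pass
    return "string"

def read_table(stream, delimiter="\t"):
    """Read delimited text until the data ends; no row count is needed."""
    rows = list(csv.reader(stream, delimiter=delimiter))
    header, body = rows[0], rows[1:]
    # Only the first data row is inspected to decide each column's type.
    types = [guess_type(cell) for cell in body[0]] if body else []
    return header, types, body

sample = "time\tstation\ttemp\n0\tA1\t12.5\n1\tA2\t13.0\n"
header, types, body = read_table(io.StringIO(sample))
```

A guess based on the first entry alone can go wrong: a column whose first value happens to be written as 12 would be taken for integers even if later rows hold 12.5. Keep that in mind when preparing spreadsheet-style files.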

We've talked at some length about the necessity of describing both the data itself and its spatial coordinate frame. The next step in visualizing data is translating the data format you use into a form that DX can interpret. Sometimes this is trivial: either you already use a format that DX directly understands, or your format is straightforward and easy to describe to DX using the Data Prompter (most useful for simple grid-type data), or your data is in a "spreadsheet-like" form, such as tab-delimited columns of text (the ImportSpreadsheet module is for you!). Occasionally the researcher's data format is arbitrary and complicated. This may call for a bit more effort in creating a scheme for translating the data into DX form. While it cannot always be done, there are many cases in which using the so-called "DX native file format" permits this translation without any need to modify the original data. You can read more about the DX native file format if you are inclined, but you probably should complete this workshop before venturing into those deeper waters. The relevant section of the OpenDX documentation is in the User's Guide, Appendix B.2.

There is no program to automatically generate "native file format" (unless you write it), but it's not difficult to understand and employ when necessary. On rare occasions, the input data is so tangled or complex that even this fails and the only recourse is to write a filter program to convert the researcher's output into a more parseable format.
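
As a rough idea of what such a filter program might look like, the Python sketch below writes scalar data on a regular two-dimensional grid as DX native file format text. The keywords follow our reading of Appendix B.2 of the User's Guide and should be checked against it before you rely on them. The function name `to_dx_native`, the origin at (0, 0), and the unit grid spacing are all assumptions made for illustration.

```python
def to_dx_native(values, nx, ny):
    """Return DX native-format text for scalar data on a regular nx-by-ny grid.

    `values` is a flat sequence of nx * ny numbers with the last grid
    index (y) varying fastest. Origin and spacing are assumed, not read
    from any real data set.
    """
    if len(values) != nx * ny:
        raise ValueError("expected nx * ny data values")
    lines = [
        f"object 1 class gridpositions counts {nx} {ny}",
        "origin 0 0",
        "delta 1 0",
        "delta 0 1",
        f"object 2 class gridconnections counts {nx} {ny}",
        f"object 3 class array type float rank 0 items {nx * ny} data follows",
    ]
    # One text line per grid row keeps the file easy to inspect by eye.
    for start in range(0, len(values), ny):
        lines.append(" ".join(f"{v:g}" for v in values[start:start + ny]))
    lines += [
        'attribute "dep" string "positions"',
        'object "field" class field',
        'component "positions" value 1',
        'component "connections" value 2',
        'component "data" value 3',
        "end",
    ]
    return "\n".join(lines) + "\n"

text = to_dx_native([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], nx=2, ny=3)
```

Here the data values are embedded in the output. When the original file must stay untouched, the native format can instead describe data held in a separate file; Appendix B.2 covers how.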

 

Optional: View this Technical Aside if you want more detail on this subject; the information is not required to understand the upcoming material.