Long ago, in 1993, I complained to IBM that the DX documentation was too complicated for beginners to understand. The folks at DX development then challenged me to do something about it, so I wrote them a fairly long chapter designed to ease scientists and analysts like yourself into the hot water and bizarre nomenclature of visualization. Since that time, my chapter has been cut into pieces and distributed throughout the DX documentation, but the majority of it survives as User's Guide Chapter 2 and Appendices A.2-A.5. At this point, I won't take complete credit for everything in these sections, since others have added or modified them over time, but for the most part, I think you'll find it useful to read these sections. Some of the same material is covered in this workshop in other modules (as written in Y2K by me) but these chapters spend a bit more time discussing why one does visualization rather than dwelling on the details of Data Explorer itself.
Here's the original introduction that got left on the cutting room floor:
In this introductory chapter to Data Explorer, we'll discuss a variety of issues you'll encounter as you learn to perform scientific visualization of your research data using DX. Here we want to introduce a number of topics, show how they relate, and give you some inspiration to guide you as you discover how to create three-dimensional full-color animations from your data.
The problem with documentation is that people don't like to read it until they get stuck. The problem is even worse when the program is inherently visual and has a friendly user interface. Many people assume they can "figure it out" just by playing with the program.
If you are the kind of person who dislikes reading manuals, we suggest you skim User's Guide Chapter 2 and Appendices A.2-A.5 to get some ideas, then dive into using Data Explorer. DX is a very powerful "visual programming language" that will enable you to make quite elaborate and instructive imagery. Because we recognize that many people learn a new computer language by studying other people's working programs, the example "networks" that are included with your copy of Data Explorer are a great place to begin. You can open up any net, study how the different modules are interconnected, and run the networks to observe the visual output. In no time, you'll be modifying the examples and creating new images of your own. Use the online help system to get more information about these example networks; use the Context Sensitive Help (Help Menu) to click on items in the user interface for more detail. The online help system contains hypermedia references to additional information, so you can click on highlighted words to jump to other sections.
If you get stuck or would like more detailed information on a topic or on a specific module, the documentation contains a wealth of information including graphics, sample code, and data examples that may help you solve your problem. The complete documentation is provided with OpenDX. We recommend you work through this workshop before starting to read the more advanced chapters in the documentation; the workshop will direct you to read certain parts of the online documentation, then offer some additional explanation and background material to help you understand it better.
Scientists and engineers have used graphic techniques for centuries to help them order raw data into information. Scatterplots, plans and elevations, grids, colored maps, and artistic renderings are all common techniques that connect numeric data values (whether quantitative measures or qualitative indices) into coherent structures appealing to the human visual system. When such visual structures are perceived by our built-in parallel processor, the eye-brain system, a unique type of mental processing is engaged and a higher order of comprehension can be achieved.
The development of written language (numbers and letters as abstract symbolic representations or descriptors of reality) is a much younger cultural development than is the biological evolution of our million-year-old visual system. We are therefore attuned to interpret input through visual perception in quite different ways than we are to the "higher brain" rational symbol interpretation we perform when analyzing written text. Vision gives us direct perception of motion rates and directions, edge detection, stereoscopic depth cues, light and shade, material surface properties, and so on. While all of these parameters can be codified into numbers (as in a computer graphics system), it is virtually impossible for the unaided human mind to turn a huge mass of numbers into images containing the rich detail of a computer-generated image.
Scientific visualization is a new term for an old process that scientists and engineers have performed in the past through mental effort, sketches on the backs of envelopes, or renderings by scientific illustrators. Instead of attempting to extract a correspondence between large quantities of written numbers, visualization, with the aid of appropriate software and computer hardware, permits the scientist to directly perceive patterns, connections or disjunctions, correlations or lack thereof between the parameters he or she has measured or calculated. Scientific visualization becomes a camera for taking pictures of events that are (frequently) inherently non-visual. This "camera" can slow down or speed up time by many orders of magnitude and can transform space to zoom in on microscopic features or zoom out to encompass galaxies.
Current State of Scientific Visualization
Scientific visualization grew in tandem with the rapid expansion of supercomputer use and the concomitant ability to generate enormous data sets in relatively short amounts of time. Of course, there are many data sets of equal or greater size (from remote sensing or data mining, for example) for which the same visualization tools are also appropriate. Yet the power of a tool like Data Explorer does not preclude its use for visualizing more modest data sets. The relevant factor in determining the utility of visualization for a project is not the quantity of data involved but the power and immediacy that visual comprehension gives to the person exploring that data.
The current state of the art of scientific visualization, as embodied in OpenDX, has five key attributes. Data Explorer is:
- Interactive
The software can be learned, "programmed", and used by the scientist himself or herself, thus removing intermediate agents (viz gurus) from the critical direct perception and discovery process. Furthermore, Data Explorer permits a researcher to develop a "visual program" with an easy-to-control user interface that allows the analyst to interactively modify the imagery by changing input values with dials, sliders, and other graphic controls. Additionally, the data itself can directly drive the program where appropriate.
- Sharable
Like any publication of a new scientific technique, the visual program developed by one scientist can be shared and manipulated by other scientists around the world. Since a visual program is stored as a digital computer file, it may be transmitted either electronically or through a print medium. This permits others to adapt a visual program to their own needs or to inspect the underlying assumptions made by the first researcher as he or she made a data visualization.
- Multi-dimensional
Data Explorer generates multi-dimensional visual objects from multi-parametric numeric data. Three-dimensional objects can be created and manipulated just as easily as two-dimensional plots. Other parameters can be mapped onto objects through the use of color, "glyphs," or "isosurfaces," thus increasing the dimensionality of the image.
- Temporal
This software provides a degree of "animation" or motion. This permits temporal processes to be perceived directly. Time is frequently one of the most important dimensions in a scientific or engineering process. Time steps in the data can be shown sequentially, or objects can be rotated, allowing the researcher to study them from different points of view. Moving or rotating an object helps us better perceive it as 3D.
- Modular
Generic software building blocks, called "modules" and "macros" in Data Explorer, can be assembled in many ways for the analysis of different classes of data. Once built, these larger assemblies, called "networks" or "visual programs" in Data Explorer, can be used to visualize different instances of similar data sets. Thus, the effort of developing a new network is repaid many times over when used throughout a project. And macros (custom tools developed by the researcher) can be reused in many different networks, speeding development of future visualization programs as researchers build and share personal suites of tools.
We are very much proponents of incorporating scientific visualization into the entire methodology of scientific research, and not simply using it to create images for presentations at the end of the research process. Think of OpenDX as a configurable "apparatus" to be used throughout the process of exploration, hypothesis development and testing, and the development of new theory and knowledge. It displaces nothing in the traditional scientific method but adds a powerful new means for investigating large quantities of complex data. It has been our observation that researchers who employ scientific visualization immediately begin to make new discoveries about their own data: they spot new details, features, or patterns they had not predicted; they discover and correct errors in their computer simulation code or data measurement techniques by visually identifying irregularities; and very often, they go on to refine their hypotheses and methods based on the new insights that visualization gives them.