
Package edu.cmu.cs.cs214.hw5

The Data Analysis Framework enables users to gather data from various sources and display it in different formats and visualizations.


Package edu.cmu.cs.cs214.hw5 Description

The Data Analysis Framework enables users to gather data from various sources and display it in different formats and visualizations. The framework was designed to support the incorporation of third party plugins for both data gathering and data visualization.

The main design goal was to make it easy for users to write new plugins. This document covers the fundamentals required to write compatible plugins.

Overview

The framework's core is based on representing all data in tabular form. In other words, everything is represented internally as a set of columns where each cell represents a data value. Plugins use this format as the intermediate representation for all data.

The minimal data structure used by the framework is the Column; columns are essentially lists of Strings with an optional header label. Data Source plugins are expected to return these objects. The framework uses the Data Source Specifications to list and instantiate Data Source plugins. It then uses the Data View Specifications to parse the columns into a meaningful structured format for the Data View plugins.

The parsing involves creating Series: each series represents a related group of entries, like "Rainfall" or "Pineapple mass". Some plugins may also require Shared Data common to all series: for example, the labels on the bottom of a bar chart. These will be combined into a Data object and passed to the Data View plugins. Finally, the view plugins will retrieve the raw data from the Data object and display it visually.

Thus, the general data flow may be summarized as follows:

Source → Source Plugin → Columns → View Specification → Series and/or Shared Data → Data → View Plugin.

This concludes the overview of the framework's architecture and data flow. For more details about how these classes interact with each other, please refer to the sections below and the relevant class documentation.

Part 1: Data gathering

Closely related data, such as coordinates or temperature measurements, is held in a Column object as plain strings. Each Column object is accompanied by a ColumnSpecification; this class tells the framework what kind of data a particular Column object contains: for example, Integer, String, Shape, or Color. When writing a new ColumnSpecification, it is recommended to extend the AbstractColumn class instead of implementing ColumnSpecification directly. AbstractColumn implements most of the required functionality and also provides some convenience methods; the only thing you need to implement is a method describing how to parse a single cell.
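The idea of "implement only the single-cell parser" can be sketched as follows. This is a minimal illustration, not the framework's actual API: SketchAbstractColumn, parseCell, parseAll, and the use of List<String> in place of Column are all hypothetical names chosen for this example, and cells are assumed to be non-null. Refer to the AbstractColumn and ColumnSpecification documentation for the real signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for AbstractColumn; the real class and its method
// names may differ. Subclasses only describe how to parse one cell.
abstract class SketchAbstractColumn<T> {
    // Parse one raw cell; an empty Optional means the cell is malformed.
    protected abstract Optional<T> parseCell(String cell);

    // A column is considered valid when every one of its cells parses.
    public boolean isValid(List<String> cells) {
        for (String cell : cells) {
            if (!parseCell(cell).isPresent()) {
                return false;
            }
        }
        return true;
    }

    // Convert a whole column; callers are expected to check isValid first.
    public List<T> parseAll(List<String> cells) {
        List<T> parsed = new ArrayList<>();
        for (String cell : cells) {
            parsed.add(parseCell(cell).orElseThrow(
                    () -> new IllegalArgumentException("Unparseable cell: " + cell)));
        }
        return parsed;
    }
}

// Example specification for columns that hold integers.
class SketchIntegerColumn extends SketchAbstractColumn<Integer> {
    @Override
    protected Optional<Integer> parseCell(String cell) {
        try {
            return Optional.of(Integer.parseInt(cell.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}

class IntegerColumnDemo {
    public static void main(String[] args) {
        SketchIntegerColumn spec = new SketchIntegerColumn();
        List<String> raw = Arrays.asList("3", " 14 ", "-15");
        if (spec.isValid(raw)) {
            System.out.println(spec.parseAll(raw));
        }
    }
}
```

In this sketch, validation and bulk conversion live in the base class, so a new column type only has to supply parseCell.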

The framework will use the specifications alongside columns to pass meaningful data between plugins in a type-safe way. It is strongly advised to review the ColumnSpecification documentation to understand the close relationship between the Column and ColumnSpecification classes.

Part 2: Data source plugins

As mentioned above, each data source must be accompanied by a corresponding DataSourceSpecification class. This class is very simple: it provides the name of the source and a method for creating the source. The framework uses it to let users select the plugin and to create DataSource objects.
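A specification of this kind amounts to a name plus a factory method. The sketch below illustrates that shape under stated assumptions: SketchDataSource, SketchDataSourceSpecification, getName, createSource, and the CSV example are hypothetical and are not taken from the framework; the real DataSourceSpecification and DataSource declare their own members.

```java
// Hypothetical stand-in for DataSource; the real class also owns the GUI
// components used to gather raw data.
abstract class SketchDataSource {
    abstract String describe();
}

// Hypothetical stand-in for DataSourceSpecification: a name and a factory.
interface SketchDataSourceSpecification {
    String getName();                 // shown to the user when picking a plugin
    SketchDataSource createSource();  // called once the user selects the plugin
}

// Illustrative plugin: a source that would read a CSV file.
class SketchCsvSource extends SketchDataSource {
    @Override
    String describe() {
        return "Reads columns from a CSV file";
    }
}

class SketchCsvSourceSpecification implements SketchDataSourceSpecification {
    @Override
    public String getName() {
        return "CSV file";
    }

    @Override
    public SketchDataSource createSource() {
        // A fresh source per selection keeps per-use state separate.
        return new SketchCsvSource();
    }
}

class SourceSpecificationDemo {
    public static void main(String[] args) {
        SketchDataSourceSpecification spec = new SketchCsvSourceSpecification();
        System.out.println(spec.getName() + ": " + spec.createSource().describe());
    }
}
```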

The DataSource class is responsible for all the GUI components required to gather the raw data that will be passed to the framework. Plugins must extend this class to be compatible with the framework.

Finally, each DataSource must return a DataResult object. This object enables the framework to handle failures when retrieving data from a plugin. A DataResult is a simple object that can represent either a successful computation or a failure. This class is responsible for passing the raw data to the framework, as a List of Column objects. For more information regarding the proper use of this class, refer to its documentation.
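A success-or-failure result object can be sketched as below. This is an illustration only: SketchDataResult, its factory methods, and its accessors are hypothetical, and List<String> stands in for the framework's Column class. The DataResult documentation describes the actual API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for DataResult: holds either columns or an error.
final class SketchDataResult {
    private final List<List<String>> columns; // null on failure
    private final String errorMessage;        // null on success

    private SketchDataResult(List<List<String>> columns, String errorMessage) {
        this.columns = columns;
        this.errorMessage = errorMessage;
    }

    // Successful retrieval: wrap the raw columns in a read-only view.
    static SketchDataResult success(List<List<String>> columns) {
        return new SketchDataResult(Collections.unmodifiableList(columns), null);
    }

    // Failed retrieval: carry a message the framework can show to the user.
    static SketchDataResult failure(String errorMessage) {
        return new SketchDataResult(null, errorMessage);
    }

    boolean isSuccess() {
        return columns != null;
    }

    List<List<String>> getColumns() {
        if (!isSuccess()) {
            throw new IllegalStateException("No columns: " + errorMessage);
        }
        return columns;
    }

    String getErrorMessage() {
        return errorMessage;
    }
}

class DataResultDemo {
    public static void main(String[] args) {
        SketchDataResult result = SketchDataResult.success(
                Arrays.asList(Arrays.asList("Jan", "Feb"), Arrays.asList("1.5", "2.0")));
        // The caller checks the outcome before touching the columns.
        if (result.isSuccess()) {
            System.out.println(result.getColumns().size() + " columns received");
        } else {
            System.out.println("Failed: " + result.getErrorMessage());
        }
    }
}
```

Returning a result object instead of throwing lets the caller decide how to report the failure, which matches the stated purpose of DataResult.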

Part 3: Data visualization plugins

Following the same design paradigm, each DataView requires a corresponding DataViewSpecification class. The DataViewSpecification describes to the framework which kinds of columns the corresponding DataView class requires in order to display its data. For more information, please refer to Part 4. Important: DataViewSpecification objects are responsible for determining what prerequisites raw data must fulfill to be used by DataView plugins.

Classes that extend DataView are responsible for displaying the data they receive in a meaningful manner. In essence, these classes form the visualization plugins of the framework.

The framework guarantees that all data provided to a view plugin will pass verification by its column specification. Thus, if the column specification's isValid(Column) method is correctly implemented, then the view plugin may assume that all data provided is valid.

Note: There could be cases where no data is passed by the framework; that is, the series or the shared data might be empty.

Part 4: The Data Object

The view plugins will receive a Data object from the framework containing all the information necessary to display something meaningful to the user. Each Data object is composed of a List of Series objects and a Map representing the Shared Data. View plugins can retrieve the data from this object and display it as they see fit.
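The following sketch shows how a view might read series and shared data, including the case where no data is supplied. All names here are hypothetical: SketchSeries, SketchData, SketchTextView, the "labels" key, and the Map<String, List<String>> type are assumptions made for the example, since the actual field types are defined in the Series and Data class documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Series: a named group of related entries.
final class SketchSeries {
    final String name;
    final List<Double> values;

    SketchSeries(String name, List<Double> values) {
        this.name = name;
        this.values = values;
    }
}

// Hypothetical stand-in for Data: a list of series plus shared data.
final class SketchData {
    final List<SketchSeries> series;
    final Map<String, List<String>> sharedData;

    SketchData(List<SketchSeries> series, Map<String, List<String>> sharedData) {
        this.series = series;
        this.sharedData = sharedData;
    }
}

// A text-only "view" that renders one line per series.
final class SketchTextView {
    static List<String> render(SketchData data) {
        List<String> lines = new ArrayList<>();
        if (data.series.isEmpty()) {
            // The framework may pass no series at all.
            lines.add("(no data)");
            return lines;
        }
        // Shared labels are optional; fall back to positional labels.
        List<String> labels =
                data.sharedData.getOrDefault("labels", Collections.emptyList());
        for (SketchSeries s : data.series) {
            StringBuilder line = new StringBuilder(s.name).append(':');
            for (int i = 0; i < s.values.size(); i++) {
                String label = i < labels.size() ? labels.get(i) : "#" + i;
                line.append(' ').append(label).append('=').append(s.values.get(i));
            }
            lines.add(line.toString());
        }
        return lines;
    }
}

class DataObjectDemo {
    public static void main(String[] args) {
        SketchData data = new SketchData(
                Arrays.asList(new SketchSeries("Rainfall", Arrays.asList(1.5, 2.0))),
                Collections.singletonMap("labels", Arrays.asList("Jan", "Feb")));
        SketchTextView.render(data).forEach(System.out::println);
    }
}
```

Checking for empty series and missing shared entries up front keeps the view robust when the framework supplies no data.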

See Also:
ColumnSpecification, DataViewSpecification, DataSourceSpecification, DataView, DataSource