A Visual Framework for Longitudinal and Panel Studies (with Examples in R)
Date
2020
Publisher
Uva Wellassa University of Sri Lanka
Abstract
Visual analysis is an essential part of modeling as it helps identify potential data issues
and select appropriate methods for further analysis. We focus on simple yet effective
visual tools applicable to panel and longitudinal data. Our objective was to find
suitable tools for the following sequence of tasks: i) detect data anomalies/outliers, ii)
describe patterns for missing and zero values, iii) identify data patterns and patches
requiring special attention or modeling approaches, iv) assess the properties of
distributions for the variables of interest, v) choose the most suitable transformations, and vi)
assess the evidence for trends. The study shows that existing software is not
always suitable for these tasks and leaves room for improvement. It also
finds that many journal papers on panel studies omit visual analysis and provide
only summary tables, which do not give a clear picture of data features. Thus, the
study proposes a framework aiming to solve the above tasks using a methodology
based on a set of principles for effective statistical graphics. One well-known difficulty
when plotting panels is overplotting: when many points
belonging to different panels are shown on one plot, the plot becomes hard to read. The
proposed framework consists of a set of visual techniques that help solve this problem
and increase the readability of plots. In particular, the study uses colors and symbols
with high visual discrimination and several plots to describe different data features,
keeping colors and symbols consistent across all plots. The study describes several tools for
improved analysis, provides guidelines, and presents R code for the implementation
of the tools. It demonstrates the flexibility and ease of use of R to plot,
summarize, slice-and-dice panels, and transform or impute variables. Real data
(financial variables of Sri Lankan companies) are used to show how
the framework works.
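
As a rough illustration of two of the ideas above (summarizing missing and zero values per panel unit, and keeping one color and symbol per unit across plots), a minimal base R sketch might look like the following. It is not the authors' code: the toy data are simulated, and the names panel, revenue, firm_col, firm_pch and plot_panel are hypothetical choices made only for this example.

```r
# Hypothetical toy panel: 4 firms observed over 6 years.
# Simulated values, NOT the data used in the study.
set.seed(1)
panel <- expand.grid(firm = paste0("F", 1:4), year = 2010:2015,
                     stringsAsFactors = FALSE)
panel$revenue <- round(rlnorm(nrow(panel), meanlog = 5, sdlog = 1), 1)
panel$revenue[c(3, 10)] <- NA   # inject two missing values
panel$revenue[c(7, 18)] <- 0    # inject two zero values

# Classify each observation as missing, zero, or positive,
# then cross-tabulate by firm to see where the gaps sit.
status <- ifelse(is.na(panel$revenue), "missing",
                 ifelse(panel$revenue == 0, "zero", "positive"))
status <- factor(status, levels = c("missing", "zero", "positive"))
status_by_firm <- table(firm = panel$firm, status = status)
print(status_by_firm)

# One fixed color and plotting symbol per firm, reused in every plot
# (assumed palette; any set of clearly distinguishable colors would do).
firms <- sort(unique(panel$firm))
firm_col <- setNames(c("#E69F00", "#56B4E9", "#009E73", "#D55E00"), firms)
firm_pch <- setNames(c(16, 17, 15, 18), firms)

# Draw one line per firm; 'transform' is applied to the plotted variable.
plot_panel <- function(data, transform = identity, ylab = "revenue") {
  y <- transform(data$revenue)
  plot(data$year, y, type = "n", xlab = "year", ylab = ylab)
  for (f in firms) {
    rows <- which(data$firm == f)
    rows <- rows[order(data$year[rows])]
    lines(data$year[rows], y[rows], col = firm_col[f])
    points(data$year[rows], y[rows], col = firm_col[f], pch = firm_pch[f])
  }
  legend("topleft", legend = firms, col = firm_col, pch = firm_pch, bty = "n")
}

plot_panel(panel)                                               # raw scale
plot_panel(panel, transform = log1p, ylab = "log(1 + revenue)") # transformed
```

Because both plots draw on the same firm_col and firm_pch lookups, a given firm looks the same on the raw and the transformed scale. Missing values appear as breaks in a firm's line. The log1p transform is used here only because it is defined at zero; it is one possible choice, not necessarily the one the study recommends.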
Keywords: Exploratory data analysis, Visual tools, Panel data, Longitudinal data, R
programming language
Keywords
Business Management, Statistics, Data Analysis, Computer Science