inspectdf 0.0.13

  • Fixed compatibility with dplyr >= 1.1.0 by replacing deprecated functions: select_if() replaced with select(where()), and mutate_if() replaced with mutate(across(where())).
  • Fixed critical bug in plot_cat() where bind_rows(.id = ) with unnamed lists caused failures in newer dplyr versions. Function now properly assigns column names to list elements.
  • Fixed issue in plot_cat() where filtering by non-existent jsd column removed all rows when plotting single dataframe summaries.
  • Fixed #48, where inspect_num(df1, df2) failed for dataframes with different ranges and different sets of columns. Thanks to cregouby for the #51 fix.
  • Fixed #45, partial argument matching warning in format_size(). Changed unit = "auto" to units = "auto" in call to format(). Thanks to @salim-b for the report.
  • Updated CRAN checks badge URL from deprecated cranchecks.info/badges/ to new badges.cranchecks.info/ service.
  • Fixed ggplot2 deprecation warning by replacing size parameter with linewidth in geom_bar().
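
As an illustration of the dplyr migration described above, the following minimal sketch contrasts the older scoped verbs with their replacements. The tibble df and its columns are hypothetical and are not taken from the package source.

```r
library(dplyr)

# Hypothetical example data, for illustration only
df <- tibble(x = 1:3, y = c("a", "b", "c"), z = c(0.5, 1.5, 2.5))

# Older style: df %>% select_if(is.numeric)
num_cols <- df %>% select(where(is.numeric))

# Older style: df %>% mutate_if(is.character, as.factor)
df_fct <- df %>% mutate(across(where(is.character), as.factor))
```

Both forms select columns with a predicate; where() and across() simply make that selection explicit inside select() and mutate().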

inspectdf 0.0.11

CRAN release: 2021-04-02

inspectdf 0.0.10

CRAN release: 2021-02-20

  • Add include_int option in inspect_cat() to allow treatment of integer columns as categorical.
  • Improved p-values associated with binned categorical and numeric comparisons. These are now based on a modified chi-squared test and are labelled as pval in the resulting output.
  • Fixed #27, ensuring plots for inspect_cat() respect any filtering or sorting of the summary output prior to show_plot(). Thanks to Roel Verbelen for the report.
  • Additional detail in inspect_types() comparison of two dataframes to make it easier to see which columns and types differ.

inspectdf 0.0.9

CRAN release: 2020-09-07

  • Minor change, ensuring all functions use return properly.

inspectdf 0.0.8

CRAN release: 2020-06-25

  • Important change: the show_plot argument has been removed from all inspect_*() functions. To generate visualisations of data frame summaries, use the more flexible show_plot(inspect_*()) or, with the pipe, inspect_*() %>% show_plot().
  • show_plot() now nudges points that might otherwise have coincided in dataframe comparisons of imbalance (for example, inspect_imb(df1, df2) %>% show_plot()).
  • Added plots for grouped summaries: inspect_cor(), inspect_na() and inspect_num().
  • inspect_cor() slight speed up for dataframes with large numbers of columns.
  • inspect_cor() can be filtered prior to plotting, for example inspect_cor(starwars) %>% filter(abs(corr) > 0.2) %>% show_plot(). Thanks to Roel Verbelen for the suggestion.
  • Fixed bug causing inspect_imb() to fail on certain types of factor columns. Thanks to Roel Verbelen for the report.
  • show_plot() has new arguments label_size, label_angle and label_color. Each provides adjustments to text annotation where applicable. Thanks to Bartosz Bursa for the suggestion.
  • Changes to text annotation to improve how coord_flip() works on resulting plots. Thanks to Roel Verbelen for the report.
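
The plotting workflow described in this release can be sketched as follows. This is a minimal example that assumes inspectdf and dplyr are installed; it uses the starwars data shipped with dplyr, and the object names p_cat and p_cor are illustrative.

```r
library(dplyr)
library(inspectdf)

# Summarise first, then plot the summary (replaces the removed show_plot argument)
p_cat <- inspect_cat(starwars) %>% show_plot()

# Filter the correlation summary before plotting, as in the entry above
p_cor <- inspect_cor(starwars) %>%
  filter(abs(corr) > 0.2) %>%
  show_plot()
```

Because the summary is an ordinary data frame, it can be filtered or sorted with dplyr verbs before it is passed to show_plot().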

inspectdf 0.0.7

CRAN release: 2019-11-05

  • Added bytes column to inspect_mem() output, for downstream numeric comparison and consistency with inspectpd.
  • Added pcnt_nna column to inspect_cor() output containing the percentage of pairwise complete observations used to calculate correlations. Thanks to Theo Broekman for the suggestion.
  • Fixed bug causing order of grouping variable in grouped inspect_*() statements to be incorrect. Thanks to Theo Broekman for the report.
  • Removed erroneous print statement from inspect_num().

inspectdf 0.0.6

CRAN release: 2019-09-29

  • Updates to documentation throughout.
  • inspect_*() functions now return results by group for grouped dataframes.
  • Added option for inspect_num() %>% show_plot() to show histograms with color palettes specified by the col_palette argument.
  • Fixed bug causing inspect_imb() to sometimes fail when factors are present. Thanks to Doug Friedman for the report.

inspectdf 0.0.5

CRAN release: 2019-08-26

  • Fixed error causing inspect_num() to fail when columns contained all NA values. Thanks to Ryan Tanner for the report.
  • Speed-up of inspect_cor() for large data frames with many numeric columns.
  • Added approximate confidence intervals and tests for method = 'kendall' and method = 'spearman' in inspect_cor().

inspectdf 0.0.4

CRAN release: 2019-07-27

  • Fixed issue causing inspect_na() %>% show_plot() to fail when no NA values are present. Thanks to Metin Yazici for the report.
  • show_plot() now returns a ggplot2 object rather than printing the plot. Thanks to Garrick Aden-Buie for the suggestion.
  • Dramatic speed up of inspect_cat() plotting by avoiding text labels for small regions.
  • Added tech dataset.
  • Fix for text annotation of inspect_cat() plots when labels are empty strings. By default "" will be shown. Thanks to Michael Swenson for the report.
  • inspect_cor(method = ...) argument added, thanks to a suggestion from George Dontas. Options are pearson, spearman and kendall. Note that confidence intervals and tests are currently only supported for pearson.
  • Fixed error when duplicate factor labels are present in inspect_cat() and inspect_imb().
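
A brief sketch of the method argument noted above, assuming inspectdf and dplyr are installed. The starwars data comes from dplyr, and the object names are illustrative.

```r
library(dplyr)
library(inspectdf)

# Default is Pearson correlation
cor_pearson <- inspect_cor(starwars)

# Rank-based alternatives selected through the method argument
cor_spearman <- inspect_cor(starwars, method = "spearman")
cor_kendall <- inspect_cor(starwars, method = "kendall")
```

Each call returns a summary data frame with one row per pair of numeric columns.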

inspectdf 0.0.3

CRAN release: 2019-06-27

  • text_labels autoscale size using ggfittext::geom_fit_text(). For an example see inspect_cat(). Thanks to David Wilkins for the PR.
  • Six different color palettes supported in show_plot() via the col_palette argument. Colorblind-friendly option specified via show_plot(col_palette = 1). Thanks to Richard Careaga for the suggestion.
  • inspect_imb()
    • include_na option for categorical columns. Columns that are 100% missing or constant are underlined in the plot for easier comprehension.
  • inspect_cor()
    • Points and whiskers changed to coloured bands for single dataframe summaries - these are easier to see when CIs are narrow.
    • Points changed to bars for inspect_cor() comparison plots - makes it easier to see smaller differences in correlations.
    • NA correlations omitted from inspect_cor() comparison when plotted. Ordering of correlations reversed to be consistent with returned tibble.

inspectdf 0.0.2

CRAN release: 2019-05-23

  • Added show_plot() function (the show_plot argument in inspect_*() functions will be dropped in a future version).
  • Added high_cardinality argument in show_plot() for combining unique or near-unique categories when plotting inspect_cat().
  • Progress bars shown when processing larger datasets.
  • Improvements to plots throughout.

inspectdf 0.0.1

CRAN release: 2019-04-24

  • Initial CRAN release