14  vroom Data I/O

vroom is a high-performance reader and writer for delimited data. Much of its speed comes from indexing the file first and reading values lazily, only when they are actually used.
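
Since vroom writes as well as reads, a quick round trip may help show both directions. This is a minimal sketch: the data frame `df`, the temporary file `tmp`, and the result `back` are made up for illustration and are not part of the dataset used in this chapter.

```r
library(vroom)

# Hypothetical toy data, used only for this illustration
df <- data.frame(id = 1:3, score = c(0.5, 1.25, 2))

# Write to a temporary comma-delimited file
tmp <- tempfile(fileext = ".csv")
vroom_write(df, tmp, delim = ",")

# Read it back; show_col_types = FALSE suppresses the column specification message
back <- vroom(tmp, show_col_types = FALSE)
back
```

`vroom_write()` defaults to a tab delimiter, so `delim = ","` is set explicitly here to produce a CSV file.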

14.1 Read delimited file

Important arguments to vroom():

  • delim: Field delimiter. If NULL (the default), vroom guesses the delimiter from the data.
  • na: Character vector of strings to interpret as missing values, i.e. NA.

dat <- vroom("https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv")
Rows: 299 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (13): age, anaemia, creatinine_phosphokinase, diabetes, ejection_fractio...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
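
The message above suggests either specifying the column types or inspecting them with `spec()`. The following is a minimal sketch of both, using a small temporary file with made-up rows rather than the full dataset; `tmp` and `typed` are names chosen for this illustration.

```r
library(vroom)

# Small stand-in file with three of the columns (made-up rows)
tmp <- tempfile(fileext = ".csv")
writeLines(c("age,anaemia,platelets",
             "75,0,265000",
             "55,1,263358.03"), tmp)

# Declare the column types up front instead of letting vroom guess them
typed <- vroom(
  tmp,
  col_types = cols(
    age = col_integer(),
    anaemia = col_integer(),
    platelets = col_double()
  )
)

# Retrieve the column specification that was used
spec(typed)
```

Declaring types this way also guards against a column being guessed differently if the file contents change.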

vroom returns a tibble.

dat
# A tibble: 299 × 13
     age anaemia creatinine_phosphokinase diabetes ejection_fraction
   <dbl>   <dbl>                    <dbl>    <dbl>             <dbl>
 1    75       0                      582        0                20
 2    55       0                     7861        0                38
 3    65       0                      146        0                20
 4    50       1                      111        0                20
 5    65       1                      160        1                20
 6    90       1                       47        0                40
 7    75       1                      246        0                15
 8    60       1                      315        1                60
 9    65       0                      157        0                65
10    80       1                      123        0                35
# ℹ 289 more rows
# ℹ 8 more variables: high_blood_pressure <dbl>, platelets <dbl>,
#   serum_creatinine <dbl>, serum_sodium <dbl>, sex <dbl>, smoking <dbl>,
#   time <dbl>, DEATH_EVENT <dbl>
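
The `delim` and `na` arguments listed at the start of this section can also be set explicitly. As a rough sketch, the example below writes a small semicolon-delimited file (hypothetical data, with -999 standing in as a missing-value code) and reads it back; `tmp` and `x` are illustrative names.

```r
library(vroom)

# Hypothetical semicolon-delimited file with two kinds of missing values
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;age;group",
             "1;75;ctrl",
             "2;-999;trt",
             "3;;ctrl"), tmp)

# Set the delimiter explicitly and treat both "" and "-999" as NA
x <- vroom(tmp, delim = ";", na = c("", "-999"), show_col_types = FALSE)

# Count the missing ages
sum(is.na(x$age))
```

Because both missing-value strings are converted to NA before type guessing, `age` should still be read as a numeric column.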

14.2 Select columns to read

Use the col_select argument to read only the columns you need.

dat <- vroom(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv", 
    col_select = c("age", "anaemia", "ejection_fraction"))
Rows: 299 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (3): age, anaemia, ejection_fraction

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

dat
# A tibble: 299 × 3
     age anaemia ejection_fraction
   <dbl>   <dbl>             <dbl>
 1    75       0                20
 2    55       0                38
 3    65       0                20
 4    50       1                20
 5    65       1                20
 6    90       1                40
 7    75       1                15
 8    60       1                60
 9    65       0                65
10    80       1                35
# ℹ 289 more rows
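
Besides a character vector of names, `col_select` accepts the same selection mini-language as `dplyr::select()`, including helpers such as `starts_with()`. The sketch below uses a small temporary file with made-up rows instead of the remote dataset; `tmp` and `sub` are illustrative names.

```r
library(vroom)

# Small stand-in file with a few of the dataset's column names (made-up rows)
tmp <- tempfile(fileext = ".csv")
writeLines(c("age,serum_creatinine,serum_sodium,sex",
             "75,1.9,130,1",
             "55,1.1,136,1"), tmp)

# Select one column by bare name and two more with a tidyselect helper
sub <- vroom(
  tmp,
  col_select = c(age, starts_with("serum")),
  show_col_types = FALSE
)

names(sub)
```

Here the `sex` column is left out of the result because it matches neither selection expression.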

14.3 Resources

Read more about vroom in the package documentation and vignettes.