Cannot check for panel data id on a tibble

Posted: 09 Apr 2021, 23:34
by gregorymacfarlane
I wrote about this in another post that has yet to be moderated, and I can link to it once it is up. Checking the number of unique IDs goes wrong when the database object is a tibble. The error arises from this line in apollo_validateControls():

Code:

length(unique(database[, apollo_control$indivID])) < nrow(database)
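On a tibble, the result of df[, name] is a one-column tibble, not a vector. length() of a data frame counts its columns, so length(unique(...)) is always 1 and the comparison with nrow(database) is TRUE for any dataset with more than one row. The correct code should be df[[name]], which pulls the column out as a vector. In a base R data frame, df[, name] drops a single column to a vector by default, so the distinction goes unnoticed there; on a tibble it matters greatly. I believe this is a bug that should be fixed, and the same pattern probably occurs in other places as well.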
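
A minimal sketch of why the two forms disagree, using base R only. Here drop = FALSE is used to mimic what a tibble's `[` returns, and the database contents and the "ID" column name are made up for illustration rather than taken from Apollo.

```r
# Illustrative data: 210 individuals, one row each (names are assumptions).
database <- data.frame(ID = as.character(1:210),
                       choice = rep(1:2, 105),
                       stringsAsFactors = FALSE)
id_col <- "ID"

# drop = FALSE keeps a one-column data frame, as a tibble's `[` does.
as_frame <- database[, id_col, drop = FALSE]
# `[[` always extracts the column itself as a vector.
as_vector <- database[[id_col]]

# length() of a data frame counts columns, so this is 1 whatever the IDs are.
n_unique_frame <- length(unique(as_frame))
# length() of a vector counts elements, i.e. the number of distinct IDs.
n_unique_vector <- length(unique(as_vector))

# The panel check from the post, evaluated both ways.
panel_flag_frame <- n_unique_frame < nrow(database)
panel_flag_vector <- n_unique_vector < nrow(database)
```

With one row per individual, the `[[` version reports no repeated IDs, while the one-column version flags the data as a panel regardless of its contents.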

Code:

> length(unique(database[[apollo_control$indivID]])) < nrow(database)
[1] FALSE
> length(unique(database[, apollo_control$indivID])) <  nrow(database)
[1] TRUE

> database[,apollo_control$indivID]
# A tibble: 210 x 1
   ID   
   <chr>
 1 1    
 2 2    
 3 3    
 4 4    
 5 5    
 6 6    
 7 7    
 8 8    
 9 9    
10 10   
# … with 200 more rows
> database[[apollo_control$indivID]]
  [1] "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15" 
 [16] "16"  "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "27"  "28"  "29"  "30" 
 [31] "31"  "32"  "33"  "34"  "35"  "36"  "37"  "38"  "39"  "40"  "41"  "42"  "43"  "44"  "45" 
 [46] "46"  "47"  "48"  "49"  "50"  "51"  "52"  "53"  "54"  "55"  "56"  "57"  "58"  "59"  "60" 
 [61] "61"  "62"  "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71"  "72"  "73"  "74"  "75" 
 [76] "76"  "77"  "78"  "79"  "80"  "81"  "82"  "83"  "84"  "85"  "86"  "87"  "88"  "89"  "90" 
 [91] "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98"  "99"  "100" "101" "102" "103" "104" "105"
[106] "106" "107" "108" "109" "110" "111" "112" "113" "114" "115" "116" "117" "118" "119" "120"
[121] "121" "122" "123" "124" "125" "126" "127" "128" "129" "130" "131" "132" "133" "134" "135"
[136] "136" "137" "138" "139" "140" "141" "142" "143" "144" "145" "146" "147" "148" "149" "150"
[151] "151" "152" "153" "154" "155" "156" "157" "158" "159" "160" "161" "162" "163" "164" "165"
[166] "166" "167" "168" "169" "170" "171" "172" "173" "174" "175" "176" "177" "178" "179" "180"
[181] "181" "182" "183" "184" "185" "186" "187" "188" "189" "190" "191" "192" "193" "194" "195"
[196] "196" "197" "198" "199" "200" "201" "202" "203" "204" "205" "206" "207" "208" "209" "210"
In the meantime, it is probably important to coerce tibble inputs to plain data frames (for example with as.data.frame()) before passing them to Apollo.
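
One possible way to apply that workaround is sketched below. strip_tibble() is a hypothetical helper, not an Apollo function, and the tibble is faked by setting the class attribute so that the sketch runs without the tibble package installed.

```r
# Hypothetical helper (not part of Apollo): return a plain data.frame.
strip_tibble <- function(database) {
  if (inherits(database, "tbl_df")) {
    # as.data.frame() removes the tibble classes and leaves "data.frame".
    database <- as.data.frame(database)
  }
  database
}

# Stand-in for a tibble: a data frame carrying the tibble class vector.
database <- data.frame(ID = as.character(1:5), stringsAsFactors = FALSE)
class(database) <- c("tbl_df", "tbl", "data.frame")

database <- strip_tibble(database)

# On a plain data.frame, single-column `[` drops to a vector again.
ids <- database[, "ID"]
```

Calling the helper once, right after reading the data, means the rest of the script does not need to care how the data were loaded.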

Re: Cannot check for panel data id on a tibble

Posted: 14 Apr 2021, 16:30
by stephanehess
Thanks. We will coerce tibbles back to data.frame in the next version.