A couple of weeks ago, I wrote a popular article, “Pry, Ruby, and Fun With the Hash Constructor”, demonstrating the usefulness of pry with the Hash bracket constructor. I just ran into another super fun example of pry that I couldn’t resist sharing!
The Task: Convert CSV File without Headers to Array of Hashes
For example, you want to take a csv file like:
|--- -------- --------|
| 1 | Justin | Gordon |
| 2 | Tender | Love   |
|--- -------- --------|
And create an array of hashes like this with column headers “id”, “first_name”, “last_name”:
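The target shape would look roughly like the sketch below. The values are shown as strings on the assumption that they come straight out of the CSV parser with no type conversion, and the variable name `target` is only for illustration.

```ruby
# Target structure: one hash per CSV row, keyed by the column headers.
# Assumption: values stay strings, because nothing converts them.
target = [
  { "id" => "1", "first_name" => "Justin", "last_name" => "Gordon" },
  { "id" => "2", "first_name" => "Tender", "last_name" => "Love" }
]

puts target.first["first_name"]
```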
You’d think that you could just pass the headers to the CSV.parse, but that doesn’t work:
Using Array#zip
I stumbled upon a note about the CSV parser that suggested using Array#zip to add keys to the results created by the CSV parser when headers don’t exist in the file.
Using Array#zip? What the heck is the zip method? Compression?
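Not compression, as it turns out. As a quick sketch of what Array#zip does, it pairs up elements from two arrays by position, like the teeth of a zipper. The variable names here are just for illustration.

```ruby
keys = %w[id first_name last_name]
row  = %w[1 Justin Gordon]

# zip pairs the first element with the first, the second with the second, etc.
pairs = keys.zip(row)
p pairs
# => [["id", "1"], ["first_name", "Justin"], ["last_name", "Gordon"]]

# When the argument is shorter than the receiver, missing slots become nil.
short = [1, 2, 3].zip(%w[a b])
p short
# => [[1, "a"], [2, "b"], [3, nil]]
```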
Hmmmm….Why would that be useful?
Here are some pry commands that demonstrate this. I encourage you to follow along in pry!
I first created a CSV string by hand like this:
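The exact string from the original session is not preserved here, so the following is only a guess at the kind of mistake that bites: a hypothetical hand-typed string with a space after each comma followed by a quoted field, which the strict CSV parser rejects.

```ruby
require "csv"

# Hypothetical hand-typed input: note the space between each comma and quote.
handmade = %(1, "Justin", "Gordon"\n2, "Tender", "Love"\n)

parse_error = nil
begin
  CSV.parse(handmade)
rescue CSV::MalformedCSVError => e
  # A quote that is not the first character of a field is illegal
  # unless liberal parsing is enabled.
  parse_error = e
end

puts parse_error.class
puts parse_error.message
```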
Doooh!!!! That taught me that creating a legit CSV string is not as easy as it sounds.
Let’s create a legit csv string:
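One way to get a well-formed string is to let the library build it. This sketch uses CSV.generate with the two sample rows from the table above.

```ruby
require "csv"

# Let the CSV library handle quoting and separators.
csv_string = CSV.generate do |csv|
  csv << [1, "Justin", "Gordon"]
  csv << [2, "Tender", "Love"]
end

p csv_string
# => "1,Justin,Gordon\n2,Tender,Love\n"
```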
Notice that there are no quotes around the single-word names!
If I use CSV to parse this, we get the reverse result, the array of arrays, back:
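A minimal sketch of that round trip, assuming the generated string from the previous step. Note that every value comes back as a String, including the ids.

```ruby
require "csv"

csv_string = "1,Justin,Gordon\n2,Tender,Love\n"

# Without a headers option, CSV.parse returns an array of arrays of strings.
parsed = CSV.parse(csv_string)
p parsed
# => [["1", "Justin", "Gordon"], ["2", "Tender", "Love"]]
```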
Ahh…Could we use the Hash[] constructor to convert these arrays into hashes with the proper keys?
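Here is a small sketch of the idea for a single row: Hash[] accepts an array of [key, value] pairs, which is exactly what zip produces. The names `keys`, `row`, and `row_hash` are illustrative.

```ruby
keys = %w[id first_name last_name]
row  = ["1", "Justin", "Gordon"]

# zip builds [key, value] pairs; Hash[] turns those pairs into a hash.
row_hash = Hash[keys.zip(row)]

puts row_hash["id"]          # the id is still a String
puts row_hash["first_name"]
```

On Ruby 2.1 and later, `keys.zip(row).to_h` should give the same result.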
Bingo!
Now, let’s fix the array of arrays, creating an array called rows
Then the grand finale!
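Putting the pieces together might look like the sketch below, which maps every parsed row through the Hash[] and zip combination. The variable names `keys`, `rows`, and `people` are assumptions for illustration.

```ruby
require "csv"

csv_string = "1,Justin,Gordon\n2,Tender,Love\n"
keys = %w[id first_name last_name]

# Parse into an array of arrays, then pair each row with the keys.
rows = CSV.parse(csv_string)
people = rows.map { |row| Hash[keys.zip(row)] }

people.each { |person| puts "#{person['id']}: #{person['first_name']} #{person['last_name']}" }
```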
And sure, you can do this all on one line by inlining the rows variable:
Using headers option in CSV?
Well, you’d think that you could just pass the headers to the CSV.parse, but that doesn’t work:
Well, what’s the doc?
Hmmm…seems that passing the headers should have worked.
The CSV docs clearly state that the initialize method takes a :headers option:
:headers If set to :first_row or true, the initial row of the CSV file will be treated as a row of headers. If set to an Array, the contents will be used as the headers. If set to a String, the String is run through a call of ::parse_line with the same :col_sep, :row_sep, and :quote_char as this instance to produce an Array of headers. This setting causes #shift to return rows as CSV::Row objects instead of Arrays and #read to return CSV::Table objects instead of an Array of Arrays.
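As a small sketch of what that documentation describes: when an Array is passed as headers, the parser hands back a CSV::Table of CSV::Row objects rather than plain arrays, which may be why the result does not look like an array of hashes at first glance.

```ruby
require "csv"

csv_string = "1,Justin,Gordon\n2,Tender,Love\n"
keys = %w[id first_name last_name]

# With headers given as an Array, every line is treated as data.
table = CSV.parse(csv_string, headers: keys)

puts table.class        # CSV::Table
puts table[0].class     # CSV::Row
p table.headers
puts table[0]["first_name"]
```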
So, what can we call on a new CSV object? Let’s list the methods.
How about this:
Well, that’s getting closer.
How about if I just map those rows with a to_hash?
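One way this could look, as a sketch rather than the exact session: a CSV object is Enumerable, so with the headers option set, each yielded row is a CSV::Row that responds to to_hash.

```ruby
require "csv"

csv_string = "1,Justin,Gordon\n2,Tender,Love\n"
keys = %w[id first_name last_name]

# Each row is a CSV::Row; to_hash converts it to a plain Hash.
people = CSV.new(csv_string, headers: keys).map(&:to_hash)

people.each { |person| puts person["first_name"] }
```

This skips the zip step entirely, at the cost of building CSV::Row objects along the way.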
Bingo!
I hope you enjoyed this!
This is a companion discussion topic for the original entry at http://www.railsonmaui.com//blog/2014/09/15/pry-ruby-array-zip-csv-and-the-hash-constructor/