Bumblebee

Higher-level languages, such as Ruby, make interacting with CSV (Comma-Separated Values) files trivial. Even so, this library provides a very simple object/CSV mapper that lets you interact with CSVs in a declarative way. Locking in common patterns, even in higher-level languages, is important in large codebases. Using a library such as this one helps standardize CSV interaction.

Installation

To install through Rubygems:

gem install bumblebee

You can also add this to your Gemfile:

bundle add bumblebee

Examples

A Simple 1:1 Example

Imagine the following CSV:

id	name	dob	phone
1	Matt	1901-02-03	555-555-5555
2	Nick	1921-09-03	444-444-4444
3	Sam	1932-12-12	333-333-3333

Using the following column configuration:

columns = %i[id name dob phone]

We could parse this data and turn it into hashes:

objects = Bumblebee::Template.new(columns: columns).parse(data)

Then objects is this array of hashes:

[
  { id: '1', name: 'Matt', dob: '1901-02-03', phone: '555-555-5555' },
  { id: '2', name: 'Nick', dob: '1921-09-03', phone: '444-444-4444' },
  { id: '3', name: 'Sam',  dob: '1932-12-12', phone: '333-333-3333' }
]

Note: data, in this case, is the CSV file contents as a string.
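To make the 1:1 mapping concrete, here is a minimal sketch of the same idea using only Ruby's standard CSV library. It is not Bumblebee's implementation; it only illustrates how a list of column names can turn each row into a symbol-keyed hash. The comma-separated sample data is an assumption made for illustration.

```ruby
require 'csv'

# CSV file contents as a string (comma-separated here for illustration).
data = <<~CSV
  id,name,dob,phone
  1,Matt,1901-02-03,555-555-5555
  2,Nick,1921-09-03,444-444-4444
  3,Sam,1932-12-12,333-333-3333
CSV

columns = %i[id name dob phone]

# Read each row by header name and build one symbol-keyed hash per row.
objects = CSV.parse(data, headers: true).map do |row|
  columns.to_h { |column| [column, row[column.to_s]] }
end
```

As in the example above, every value is still a string at this point; no type conversion has happened.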

Custom Headers

If our headers are not a perfect 1:1 match to our object, such as:

ID #	First Name	Date of Birth	Phone #
1	Matt	1901-02-03	555-555-5555
2	Nick	1921-09-03	444-444-4444
3	Sam	1932-12-12	333-333-3333

Then we can explicitly map those as:

columns = {
  'ID #' => :id,
  'First Name' => :name,
  'Date of Birth' => :dob,
  'Phone #' => :phone
}
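As a rough illustration of what a header-to-property mapping does, the sketch below applies the same columns hash by hand with Ruby's standard CSV library. It is an approximation for explanation only, not the library's code, and the comma-separated sample data is assumed.

```ruby
require 'csv'

data = <<~CSV
  ID #,First Name,Date of Birth,Phone #
  1,Matt,1901-02-03,555-555-5555
  2,Nick,1921-09-03,444-444-4444
CSV

# CSV header on the left, object property on the right.
columns = {
  'ID #' => :id,
  'First Name' => :name,
  'Date of Birth' => :dob,
  'Phone #' => :phone
}

# Look up each cell by its CSV header and store it under the mapped property.
people = CSV.parse(data, headers: true).map do |row|
  columns.to_h { |header, property| [property, row[header]] }
end
```

The same hash can be read in the other direction when generating: the keys become the header row and the values name the properties to read.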

Nested Objects

Let's say we have the following data which we want to create a CSV from:

objects = [
  {
    id: 1,
    name:     { first: 'Matt' },
    demo:     { dob: '1901-02-03' },
    contact:  { phone: '555-555-5555' }
  },
  {
    id: 2,
    name:     { first: 'Nick' },
    demo:     { dob: '1921-09-03' },
    contact:  { phone: '444-444-4444' }
  },
  {
    id: 3,
    name:     { first: 'Sam' },
    demo:     { dob: '1932-12-12' },
    contact:  { phone: '333-333-3333' }
  }
]

We could create a flat-file CSV:

ID #	First Name	Date of Birth	Phone #
1	Matt	1901-02-03	555-555-5555
2	Nick	1921-09-03	444-444-4444
3	Sam	1932-12-12	333-333-3333

Using the following column config:

columns = {
  'ID #' => :id,
  'First Name': {
    property: :first,
    through: :name
  },
  'Date of Birth': {
    property: :dob,
    through: :demo
  },
  'Phone #': {
    property: :phone,
    through: :contact
  }
}

And executing the following:

csv = Bumblebee::Template.new(columns: columns).generate(objects)

The above columns config would work both ways, so if we received the CSV, we could parse it to an array of nested hashes.
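The two-way behavior of property and through can be pictured with a small sketch. The helper names read_value and write_value are hypothetical and exist only in this example; the code uses Ruby's standard CSV library and is not how Bumblebee is implemented internally.

```ruby
require 'csv'

COLUMNS = {
  'ID #' => { property: :id },
  'First Name' => { property: :first, through: :name }
}.freeze

# Generating: walk through the parent key(s), then read the property.
def read_value(object, column)
  object.dig(*column[:through], column[:property])
end

# Parsing: create the parent hash(es) on demand, then set the property.
def write_value(object, column, value)
  target = Array(column[:through]).reduce(object) { |memo, key| memo[key] ||= {} }
  target[column[:property]] = value
end

objects = [
  { id: 1, name: { first: 'Matt' } },
  { id: 2, name: { first: 'Nick' } }
]

csv = CSV.generate do |out|
  out << COLUMNS.keys
  objects.each { |object| out << COLUMNS.values.map { |column| read_value(object, column) } }
end

parsed = CSV.parse(csv, headers: true).map do |row|
  COLUMNS.each_with_object({}) do |(header, column), object|
    write_value(object, column, row[header])
  end
end
```

Note that in this sketch the round trip returns the id as a string, since nothing converts it back to an integer.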

Custom Formatting

You can also pass in built-in or custom functions that can do the value formatting. For example:

columns = {
  'ID #': {
    property: :id,
    to_object: :integer
  },
  'First Name': {
    property: :first,
    through: :name,
    to_csv: ->(v) { v.to_s.upcase }
  },
  'Date of Birth': {
    property: :dob,
    through: :demo,
    to_object: { type: :date, nullable: true }
  },
  'Phone #': {
    property: :phone,
    through: :contact
  }
}

would ensure:

id is an integer data type when parsed
the CSV has only upper-case First Name values
dob is a date data type when parsed
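The conversions in that configuration can be approximated with plain lambdas. This is a sketch under assumptions: the lambda names are hypothetical, and the handling of blank input (0 for the non-nullable integer, nil for the nullable date) follows the defaults described in this README rather than the library's actual code.

```ruby
require 'date'

# Parsing direction (to_object): CSV strings become typed values.
to_integer = ->(value) { value.to_s.strip.empty? ? 0 : Integer(value.to_s, 10) }
to_nullable_date = ->(value) { value.to_s.strip.empty? ? nil : Date.parse(value.to_s) }

# Generating direction (to_csv): values are formatted before being written.
to_upper = ->(value) { value.to_s.upcase }

id = to_integer.call('1')
dob = to_nullable_date.call('1901-02-03')
missing_dob = to_nullable_date.call('')
first_name = to_upper.call('Matt')
```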

Other formatting functions that can be used for to_object and/or to_csv:

bigdecimal: converts to BigDecimal (nullable, non-nullable default is 0)
boolean: converts to flexible boolean (nullable; non-nullable default is false). 1,t,true,y,yes all parse to true, 0,f,false,n,no all parse to false
date: converts to Date (nullable; non-nullable default is 1900-01-01)
integer: converts to Integer (nullable; non-nullable default is 0)
join: array is joined by separator option (defaults to comma)
float: converts to Float (nullable; non-nullable default is 0.0)
function: custom lambda function (input is the resolved value; the output of the lambda is used as the resolved value)
pluck_join: maps the sub-property (sub_property option) of each object, then joins the values with separator (defaults to comma)
pluck_split: string is split by separator option (defaults to comma), then a new object (object_class option) is created for each value and its sub-property (sub_property option) is set
split: string is split by separator option (defaults to comma)
string: calls to_s method on the value
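To illustrate the nullable versus non-nullable distinction, here is a sketch of the flexible boolean rule described above. The method name to_boolean is hypothetical, and two details are assumptions: matching is case-insensitive, and any unrecognized value is treated as false.

```ruby
TRUE_VALUES = %w[1 t true y yes].freeze

def to_boolean(value, nullable: false)
  text = value.to_s.strip.downcase

  # Nullable: a blank cell stays nil instead of falling back to the default.
  return nil if nullable && text.empty?

  # Non-nullable: anything not recognized as true becomes false.
  TRUE_VALUES.include?(text)
end
```

With this sketch, to_boolean('yes') is true, to_boolean('n') is false, to_boolean(nil) is false, and to_boolean(nil, nullable: true) is nil.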

Pluck Join / Pluck Split Explained

Pluck join and pluck split come in handy when you have an array of objects and would like to:

map one value from each object and join the results (in order to output them in a CSV)
take a string value, split it, then map each value to a new object (in order to parse it into objects)

Take this input and configuration for example:

objects = [
  {
    id: 1,
    name:     { first: 'Matt' },
    demo:     { dob: '1901-02-03' },
    contact:  { phone: '555-555-5555' },
    children: [ { id: 9, name: 'Spunky' }, { id: 10, name: 'Dunker' } ]
  },
  {
    id: 2,
    name:     { first: 'Nick' },
    demo:     { dob: '1921-09-03' },
    contact:  { phone: '444-444-4444' },
    children: [ { id: 11, name: 'Bonzi' }, { id: 12, name: 'Buddy' } ]
  },
  {
    id: 3,
    name:     { first: 'Sam' },
    demo:     { dob: '1932-12-12' },
    contact:  { phone: '333-333-3333' }
  }
]

columns = {
  'ID #': {
    property: :id,
    to_object: :integer
  },
  'Children ID #s': {
    property: :children,
    to_csv: { type: :pluck_join, separator: ';', sub_property: :id },
    to_object: { type: :pluck_split, separator: ';', sub_property: :id },
  }
}

Generating a CSV:

csv = Bumblebee::Template.new(columns: columns).generate(objects)

would output:

ID #	Children ID #s
1	9;10
2	11;12
3	

Parsing a CSV:

objects = Bumblebee::Template.new(columns: columns).parse(csv)

would output:

objects = [
  {
    id: 1,
    children: [ { id: 9 }, { id: 10 } ]
  },
  {
    id: 2,
    children: [ { id: 11 }, { id: 12 } ]
  },
  {
    id: 3
  }
]
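The two transformations can be sketched in a few lines of plain Ruby. The method names mirror the formatter names but are hypothetical stand-ins written for this example, not the library's internals.

```ruby
# Map one sub-property from each object, then join the values.
def pluck_join(objects, sub_property:, separator: ',')
  Array(objects).map { |object| object[sub_property] }.join(separator)
end

# Split the cell, then build one new object per value.
def pluck_split(value, sub_property:, separator: ',', object_class: Hash)
  value.to_s.split(separator).map do |part|
    object_class.new.tap { |object| object[sub_property] = part }
  end
end

children = [{ id: 9, name: 'Spunky' }, { id: 10, name: 'Dunker' }]

cell = pluck_join(children, sub_property: :id, separator: ';')
restored = pluck_split(cell, sub_property: :id, separator: ';')
```

In this sketch the restored ids are strings, and only the plucked sub-property survives the round trip; the children's names are not written to the CSV, so they cannot be recovered from it.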

Parsing Into Custom Classes

Hash is the default return type when parsing a CSV. You can change this by providing a Hash-like class:

objects = Bumblebee::Template.new(columns: columns, object_class: OpenStruct).parse(csv)

Objects will now be an array of OpenStruct objects instead of Hash objects.

Note: you must also specify this in pluck_split:

columns = {
  'ID #': {
    property: :id,
    to_object: :integer
  },
  'Children ID #s': {
    property: :children,
    to_csv: { type: :pluck_join, separator: ';', sub_property: :id },
    to_object: { type: :pluck_split, separator: ';', sub_property: :id, object_class: OpenStruct },
  }
}
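One way to read "Hash-like" is that the class can be instantiated without arguments and accepts values through []=. The sketch below shows that idea with a hypothetical build helper; it is an assumption about the contract, not a statement of what Bumblebee requires.

```ruby
require 'ostruct'

# Instantiate the given class and assign each value through []=.
def build(object_class, values)
  values.each_with_object(object_class.new) do |(key, value), object|
    object[key] = value
  end
end

values = { number: 1, first: 'Matt' }

hash_row = build(Hash, values)
struct_row = build(OpenStruct, values)
```

Both rows hold the same data, but struct_row exposes it through reader methods (struct_row.first) while hash_row uses key lookups (hash_row[:first]).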

Further CSV Customization

The two main methods:

Template#generate
Template#parse

also accept custom options that Ruby's CSV::new accepts. The only caveat is that Bumblebee needs headers for its mapping, so it overrides the header options.
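For reference, this is what one such option looks like when used directly with Ruby's CSV library. col_sep is a standard CSV option; passing it through as template.parse(data, col_sep: "\t") is an assumption based on the statement above and has not been verified against the library.

```ruby
require 'csv'

data = "ID #\tFirst Name\n1\tMatt\n2\tNick\n"

# Parse tab-separated content by overriding the column separator.
table = CSV.parse(data, headers: true, col_sep: "\t")
first_names = table.map { |row| row['First Name'] }

# The same option applies when generating.
pipe_csv = CSV.generate(col_sep: '|') do |out|
  out << ['ID #', 'First Name']
  out << [1, 'Matt']
end
```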

Template DSL

You can pass in a block for template/column specification if you prefer a code-first approach over a configuration-first approach.

Using Blocks

csv = Bumblebee::Template.new do |t|
  t.column 'ID #',        property: :id,
                          to_object: :integer

  t.column 'First Name',  property: :first,
                          through: :name
end.generate(objects)

objects = Bumblebee::Template.new do |t|
  t.column 'ID #',        property: :id,
                          to_object: :integer

  t.column 'First Name',  property: :first,
                          through: :name
end.parse(data)

Subclassing Bumblebee::Template

Another option is to subclass Template and declare your columns at the class-level:

class PersonTemplate < Bumblebee::Template
  column 'ID #',        property: :id,
                        to_object: :integer

  column 'First Name',  property: :first,
                        through: :name,
                        to_object: :pluck_split
end

template  = PersonTemplate.new
csv       = template.generate(objects)
objects   = template.parse(data)

Column Precedence

The preceding examples showed three ways to declare columns, and each is additive to the next (in the following order):

Class level (parent-first)
Argument level (passed into constructor)
Block level

To illustrate all three:

class PersonTemplate < Bumblebee::Template # first
  column 'ID #',        property: :id,
                        to_object: :integer

  column 'First Name',  property: :first,
                        through: :name,
                        to_object: :pluck_split
end

columns = {
  'Middle Name': {
    property: :middle
  }
}

template  = PersonTemplate.new(columns: columns) do |t| # second
  t.column 'Last Name', property: :last # third
end

When executed to generate a CSV, the columns would be (in order): ID #, First Name, Middle Name, Last Name.
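The additive ordering can be modeled with a small stand-in class. MiniTemplate below is hypothetical and written only to show how class-level, argument-level, and block-level declarations could accumulate in that order; it is not Bumblebee::Template and does not parse or generate anything.

```ruby
class MiniTemplate
  class << self
    # Class-level declaration, stored per class.
    def column(header, options = {})
      own_columns[header.to_s] = options
    end

    def own_columns
      @own_columns ||= {}
    end

    # Parent-first: inherited columns come before the subclass's own.
    def all_columns
      inherited = superclass.respond_to?(:all_columns) ? superclass.all_columns : {}
      inherited.merge(own_columns)
    end
  end

  attr_reader :columns

  def initialize(columns: {})
    # First class level, then constructor arguments.
    @columns = self.class.all_columns.merge(columns.transform_keys(&:to_s))
    # Block-level declarations are applied last.
    yield self if block_given?
  end

  def column(header, options = {})
    @columns[header.to_s] = options
  end
end

class PersonSketch < MiniTemplate
  column 'ID #', property: :id
  column 'First Name', property: :first, through: :name
end

template = PersonSketch.new(columns: { 'Middle Name': { property: :middle } }) do |t|
  t.column 'Last Name', property: :last
end

headers = template.columns.keys
```

Because Ruby hashes preserve insertion order, headers comes out in declaration order: ID #, First Name, Middle Name, Last Name.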

Encoding Support

This library currently supports only UTF-8. You can force the inclusion of the UTF-8 byte order mark, for example:

csv = Bumblebee::Template.new(columns: columns).generate(objects, bom: true)
# csv will now start with "\xEF\xBB\xBF"

UTF-8 byte order marks will also be ignored while parsing.
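For readers unfamiliar with the byte order mark, the sketch below shows what adding and ignoring it amounts to at the string level. The helper names with_bom and without_bom are hypothetical and are not part of Bumblebee.

```ruby
# The UTF-8 byte order mark is the three bytes EF BB BF.
BOM = "\xEF\xBB\xBF"

# Prefix the content with the BOM unless it is already there.
def with_bom(csv)
  csv.start_with?(BOM) ? csv : BOM + csv
end

# Drop a leading BOM, if any, before handing the content to a parser.
def without_bom(csv)
  csv.delete_prefix(BOM)
end

plain = "ID #,First Name\n1,Matt\n"
marked = with_bom(plain)
```

Some spreadsheet applications use the BOM to detect UTF-8, which is a common reason to include it when a CSV will be opened outside of code.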

Contributing

Development Environment Configuration

Basic steps to get this repository up and running:

Install Ruby (check bumblebee.gemspec for versions supported)
Install bundler (gem install bundler)
Clone the repository (git clone git@github.com:bluemarblepayroll/bumblebee.git)
Navigate to the root folder (cd bumblebee)
Install dependencies (bundle)

Running Tests

To execute the test suite run:

bundle exec rspec spec --format documentation

Alternatively, you can have Guard watch for changes:

bundle exec guard

Also, do not forget to run Rubocop:

bundle exec rubocop

Publishing

Note: ensure you have proper authorization before trying to publish new versions.

After code changes have passed the Pull Request review process, follow these steps to publish a new version:

Merge Pull Request into master
Update lib/bumblebee/version.rb using semantic versioning
Install dependencies: bundle
Update CHANGELOG.md with release notes
Commit & push master to remote and ensure CI builds master successfully
Run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Code of Conduct

Everyone interacting in this codebase, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

License

This project is MIT Licensed.
