Datasets

Datasets is an abstraction layer over various data backends (MongoDB, Elasticsearch, etc.) that provides a generic way of accessing, manipulating, and filtering data. It introduces common concepts that map to the particular terminology of each backend:

backend - the root namespace representing a particular instance of a database technology, e.g. mongo
namespace - encapsulates the notion of a database or folder
dataset - represents a table, collection, or file
item - represents a row, document, or line
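
The concept mapping above can be pictured as a small lookup table. This is only an illustrative sketch, not code from the project: the `Terms` type, the `TERMS` table, the `csv` key, and the `describe` helper are all hypothetical names, and the per-backend wording is inferred from the definitions above.

```python
from typing import NamedTuple


class Terms(NamedTuple):
    """Backend-specific words for the three generic concepts."""
    namespace: str
    dataset: str
    item: str


# Hypothetical mapping for illustration; the keys and terms are assumptions,
# not identifiers taken from the Datasets code base.
TERMS = {
    "mongo": Terms(namespace="database", dataset="collection", item="document"),
    "csv": Terms(namespace="folder", dataset="file", item="line"),
}


def describe(backend: str, namespace: str, dataset: str) -> str:
    """Spell out a generic (backend, namespace, dataset) triple in backend terms."""
    terms = TERMS[backend]
    return f"{terms.dataset} '{dataset}' in {terms.namespace} '{namespace}' ({backend})"


print(describe("mongo", "shop", "orders"))
print(describe("csv", "exports", "orders.csv"))
```

The same generic triple reads as a collection inside a database for the mongo backend and as a file inside a folder for CSV storage.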

When run as a standalone server (a Pyramid app) or as part of another server, Datasets exposes RESTful resources mapped to the configured backends.

For example, if MongoDB connections are configured, it exposes:

api/mongo/{namespace}/{dataset} - a set of nested resources that map one-to-one to MongoDB databases and collections.
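
To make the URL scheme concrete, here is a minimal sketch of how a client might build and split such resource paths. The helpers `resource_path` and `split_resource_path` are hypothetical and not part of Datasets; only the `api/{backend}/{namespace}/{dataset}` layout comes from the description above, and the database and collection names are made up.

```python
from urllib.parse import quote, unquote


def resource_path(backend: str, namespace: str, dataset: str, prefix: str = "api") -> str:
    """Build a path of the form api/{backend}/{namespace}/{dataset}."""
    # Percent-encode each segment so names containing "/" or spaces stay one segment.
    segments = [quote(part, safe="") for part in (backend, namespace, dataset)]
    return "/".join([prefix, *segments])


def split_resource_path(path: str, prefix: str = "api") -> tuple:
    """Return (backend, namespace, dataset) for a path built by resource_path."""
    head, backend, namespace, dataset = path.strip("/").split("/")
    if head != prefix:
        raise ValueError(f"expected path to start with {prefix!r}: {path!r}")
    return unquote(backend), unquote(namespace), unquote(dataset)


# Hypothetical example: the "orders" collection in the "shop" MongoDB database.
path = resource_path("mongo", "shop", "orders")
print(path)
print(split_resource_path(path))
```

Encoding each segment separately keeps the path at exactly three levels below the prefix, which matches the one-to-one mapping to databases and collections.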

Datasets is an essential part of the ETL pipeline server.

The current implementation supports the following data backends:

MongoDB
Elasticsearch
CSV files in the local file system
CSV files in AWS S3 buckets
HTTP backend - any public HTTP resource that returns JSON data
