Crawler (.NET Core REST API)

A simple crawler API written in .NET Core. It starts crawling from a root URL and traverses the whole tree of pages to collect the unique URLs within that domain, storing each page's content in a file. The discovered URLs are also stored in a JSON file.

Getting Started

Clone the repository and run the project through Visual Studio, or run it from the command line in the project directory with the following command.

dotnet run

If your browser is allowed to launch, it will open and show the API's default output.

Now you can call the API, for example from Postman, by passing a url parameter as follows.
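The same call can be made from the command line. This is only an illustration: the host, port, and route below are assumptions, so check your launch settings and the controller's route attribute for the actual values.

```shell
# Illustrative only: host, port, and route depend on your launchSettings.json
# and the controller's [Route] attribute.
curl "https://localhost:5001/api/crawler?url=https://example.com"
```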

Response

The response contains each crawled URL together with its child URLs, and all child URLs are unique. A URL that is already a child of some other URL is skipped, because we do not want to crawl the same URL again and again.
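The skip-if-already-seen behaviour can be sketched as a breadth-first traversal over a shared set of visited URLs. This is a minimal sketch, not the repository's actual code; the ExtractLinks helper is a hypothetical stand-in for the real page-parsing step.

```csharp
using System;
using System.Collections.Generic;

class CrawlSketch
{
    // Hypothetical stub: the real crawler would download the page and
    // parse out its anchor hrefs (e.g. with HtmlAgilityPack).
    static IEnumerable<string> ExtractLinks(string url) =>
        Array.Empty<string>();

    static void Crawl(string rootUrl)
    {
        // Every URL ever seen, across all parent pages.
        var visited = new HashSet<string> { rootUrl };
        var queue = new Queue<string>();
        queue.Enqueue(rootUrl);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            foreach (var child in ExtractLinks(current))
            {
                // HashSet<string>.Add returns false if the URL is already
                // a child of some other page, so it is skipped and each
                // page is crawled at most once.
                if (visited.Add(child))
                    queue.Enqueue(child);
            }
        }
    }
}
```

A single HashSet shared across the whole traversal is what makes the child lists unique: membership is checked against everything seen so far, not just the current page's siblings.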

The crawler uses the following packages:

  1. HtmlAgilityPack
  2. RobotsTxt
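As a rough illustration of how HtmlAgilityPack fits in, the snippet below extracts the href values of a page's anchors. It is a sketch under the assumption that the crawler parses links this way; the actual parsing code in the repository may differ.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

class LinkExtractor
{
    static IEnumerable<string> GetHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty list) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            yield break;

        foreach (var a in anchors)
            yield return a.GetAttributeValue("href", string.Empty);
    }
}
```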

Repository Pattern

Although this is a small project, I have still tried to follow the Repository Pattern approach. Normally the Repository Pattern is used to add an abstraction layer between the database and the business logic. Here we do not have any database involved, so I put all the logic in the repository and kept the controller clean.
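The shape described above can be sketched as an interface injected into a thin controller. The interface, route, and CrawlResult type below are illustrative names, not necessarily the ones used in this repository.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Placeholder result type; the real project returns its own model.
public class CrawlResult { }

// The repository holds all the crawling logic.
public interface ICrawlerRepository
{
    Task<CrawlResult> CrawlAsync(string rootUrl);
}

[ApiController]
[Route("api/[controller]")]
public class CrawlerController : ControllerBase
{
    private readonly ICrawlerRepository _repository;

    // The repository is constructor-injected, which keeps the
    // controller free of crawl logic and easy to test.
    public CrawlerController(ICrawlerRepository repository) =>
        _repository = repository;

    [HttpGet]
    public async Task<IActionResult> Get([FromQuery] string url) =>
        Ok(await _repository.CrawlAsync(url));
}
```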

If you want to learn more about the Repository Pattern, please follow this link.
