crawling-and-serving Akamai Task

Kfir Dayan e04a9dd3b1 pushing for testing mongoDB		2023-04-20 18:52:55 +03:00
src	pushing for testing mongoDB	2023-04-20 18:52:55 +03:00
test	first commit	2023-04-16 23:00:55 +03:00
.dockerignore	work version , still bugs for auth in mongo, changing vesrion to 5.0.15	2023-04-19 14:54:30 +03:00
.env.example	pushing for testing mongoDB	2023-04-20 18:52:55 +03:00
.eslintrc.js	first commit	2023-04-16 23:00:55 +03:00
.gitignore	installing mongoose + continuing to rest api	2023-04-19 01:47:05 +03:00
.prettierrc	first commit	2023-04-16 23:00:55 +03:00
docker-compose.yaml	pushing for testing mongoDB	2023-04-20 18:52:55 +03:00
Dockerfile	work version , still bugs for auth in mongo, changing vesrion to 5.0.15	2023-04-19 14:54:30 +03:00
nest-cli.json	first commit	2023-04-16 23:00:55 +03:00
package-lock.json	trying to fix problem	2023-04-19 11:10:15 +03:00
package.json	work version , still bugs for auth in mongo, changing vesrion to 5.0.15	2023-04-19 14:54:30 +03:00
README.md	consistency	2023-04-19 15:14:47 +03:00
tsconfig.build.json	first commit	2023-04-16 23:00:55 +03:00
tsconfig.json	first commit	2023-04-16 23:00:55 +03:00

README.md

Crawling & Serving

The project has two parts. The crawler crawls a given URL, stores the results in a database, and saves the assets to the file system. The server serves the results stored by the crawler.

Crawler

Usage

Send a POST request to the crawler's /crawl endpoint with a JSON body in the following format:

POST domain.com/crawl { "url": "http://www.example.com" }

The crawler then crawls the given URL, stores the results in the database, and saves the assets on the file system under crawler_assests/www.example.com/.
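The request described above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's own client: the helper names buildCrawlRequest and assetDirFor are hypothetical, the base URL http://domain.com is a placeholder, and the rule that the asset folder is named after the URL's hostname is an assumption inferred from the single example path in this README.

```typescript
// Hypothetical helper: the README only specifies the /crawl path
// and a JSON body of the form { "url": ... }.
interface CrawlRequest {
  endpoint: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildCrawlRequest(baseUrl: string, targetUrl: string): CrawlRequest {
  // Validate early: new URL() throws a TypeError on a malformed URL.
  new URL(targetUrl);
  return {
    endpoint: new URL("/crawl", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // JSON.stringify never emits a trailing comma, so the body is valid JSON.
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Assumption: the asset folder is named after the hostname of the crawled URL.
// The directory name is copied as written in the README.
function assetDirFor(targetUrl: string): string {
  return `crawler_assests/${new URL(targetUrl).hostname}/`;
}

const crawlRequest = buildCrawlRequest("http://domain.com", "http://www.example.com");
console.log(crawlRequest.endpoint, crawlRequest.init.body);
console.log(assetDirFor("http://www.example.com"));
```

To actually send the request, the result could be passed to fetch(crawlRequest.endpoint, crawlRequest.init) in a runtime that provides fetch.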

API

The API serves the results stored by the crawler.

Routes

GET

/sites - Returns a list of all sites
/sites/:id - Returns the site object for the given site ID
/sites/domain/:domain - Returns the domain object for the given domain

DELETE

/sites/:id - Deletes the site object for the given site ID
/sites/domain/:domain - Deletes the domain object for the given domain

POST

/sites/:id - Updates the site object for the given site ID
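The routes listed above can be collected in a small typed table so that callers do not assemble paths by hand. This is only a sketch: the siteRoutes object and its method names are hypothetical, and because the README does not describe the body of the update request, the sketch builds only the method and path.

```typescript
type Method = "GET" | "DELETE" | "POST";

interface RouteCall {
  method: Method;
  path: string;
}

// Encode path parameters so an unexpected character cannot change the route.
const seg = (value: string): string => encodeURIComponent(value);

// Hypothetical route table mirroring the routes listed in this README.
const siteRoutes = {
  list: (): RouteCall => ({ method: "GET", path: "/sites" }),
  getById: (id: string): RouteCall => ({ method: "GET", path: `/sites/${seg(id)}` }),
  getByDomain: (domain: string): RouteCall => ({
    method: "GET",
    path: `/sites/domain/${seg(domain)}`,
  }),
  deleteById: (id: string): RouteCall => ({ method: "DELETE", path: `/sites/${seg(id)}` }),
  deleteByDomain: (domain: string): RouteCall => ({
    method: "DELETE",
    path: `/sites/domain/${seg(domain)}`,
  }),
  // The README documents the update as a POST to /sites/:id.
  updateById: (id: string): RouteCall => ({ method: "POST", path: `/sites/${seg(id)}` }),
};

const lookup = siteRoutes.getByDomain("www.example.com");
console.log(lookup.method, lookup.path);
```

Keeping the method and path together in one place makes it harder to, for example, send a DELETE where a GET was intended, since both share the same path.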