crawling-and-serving
Akamai Task
Crawling & Serving
The crawler fetches a given URL, stores the crawl results in a database, and saves downloaded assets to the file system. The server is a simple HTTP API that serves the crawler's results.
Crawler
Usage
POST a JSON object with the following format to the crawl endpoint:

POST domain.com/crawl
{
  "url": "http://www.example.com"
}
The crawler will then crawl the given URL, store the results in the database, and save the page assets to the file system under:

crawler_assests/www.example.com/
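As a sketch, a crawl job could be submitted from TypeScript like this. The base URL (`http://localhost:3000`) and the helper names are illustrative assumptions; only the `/crawl` path and the `{ "url": ... }` body shape come from this README.

```typescript
// Hypothetical client helper for the /crawl endpoint.
// baseUrl is an assumption (the README does not state a host or port).

function buildCrawlRequest(
  baseUrl: string,
  url: string,
): { endpoint: string; body: string } {
  return {
    endpoint: `${baseUrl}/crawl`,
    // The endpoint expects a JSON object with a single "url" field.
    body: JSON.stringify({ url }),
  };
}

async function submitCrawl(baseUrl: string, url: string): Promise<Response> {
  const { endpoint, body } = buildCrawlRequest(baseUrl, url);
  return fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
  });
}
```

For example, `submitCrawl("http://localhost:3000", "http://www.example.com")` would POST the JSON body shown above to `/crawl`.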
API
The API is a simple API that serves the results stored by the crawler.
Routes
GET
/sites - Returns a list of all sites
/sites/:id - Returns the site object for the given site ID
/sites/domain/:domain - Returns the domain object for the given domain
DELETE
/sites/:id - Deletes the site object for the given site ID
/sites/domain/:domain - Deletes the domain object for the given domain
POST
/sites/:id - Updates the site object for the given site ID
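The routes above could be called from a small TypeScript client like the following sketch. Only the route paths come from this README; the base URL, response shapes, and helper names are illustrative assumptions.

```typescript
// Hypothetical thin client over the API routes listed above.

const routes = {
  listSites: () => "/sites",
  siteById: (id: string) => `/sites/${id}`,
  siteByDomain: (domain: string) => `/sites/domain/${encodeURIComponent(domain)}`,
};

// Fetch a single site by ID (GET /sites/:id).
async function getSite(baseUrl: string, id: string): Promise<unknown> {
  const res = await fetch(`${baseUrl}${routes.siteById(id)}`);
  if (!res.ok) throw new Error(`GET ${routes.siteById(id)} failed: ${res.status}`);
  return res.json();
}

// Delete a single site by ID (DELETE /sites/:id).
async function deleteSite(baseUrl: string, id: string): Promise<void> {
  const res = await fetch(`${baseUrl}${routes.siteById(id)}`, { method: "DELETE" });
  if (!res.ok) throw new Error(`DELETE ${routes.siteById(id)} failed: ${res.status}`);
}
```

Building the paths in one `routes` map keeps the client in sync with the route table above if a path changes.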