clj-robots-parser

What

A Clojure(Script) library that parses robots.txt files as specified by The Great Goog themselves. Because robots.txt is woefully underspecified in the "official" docs, this library tolerates anything it doesn't understand and extracts the data it does understand.

It can use the extracted data to query whether a given user-agent is allowed to crawl a given URL.
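
To make the parse-then-query idea concrete, here is a minimal, self-contained sketch. It is not this library's API: the namespace `robots-sketch` and the functions `parse-line`, `parse`, and `allowed?` are hypothetical names chosen for illustration. It assumes plain prefix matching (no `*` or `$` wildcards), exact case-insensitive user-agent matching with a fallback to the `*` group, and Google's documented precedence in which the longest matching path wins and `Allow` wins ties.

```clojure
(ns robots-sketch
  (:require [clojure.string :as str]))

(defn parse-line
  "Splits one robots.txt line into [lowercased-key value]. Returns nil for
  blank lines, comments, and anything that is not a `key: value` pair."
  [line]
  (let [line  (str/trim (first (str/split line #"#" 2)))
        [k v] (str/split line #":" 2)]
    (when (and v (not (str/blank? k)))
      [(str/lower-case (str/trim k)) (str/trim v)])))

(defn parse
  "Returns a map of lowercased user-agent -> vector of {:allow? :path} rules.
  Directives other than user-agent, allow, and disallow are ignored."
  [text]
  (:rules
   (reduce
    (fn [{:keys [agents in-rules?] :as state} [k v]]
      (case k
        "user-agent"
        (let [ua (str/lower-case v)]
          (if in-rules?
            ;; A user-agent line after a rule starts a new group.
            (assoc state :agents [ua] :in-rules? false)
            ;; Consecutive user-agent lines share one group.
            (update state :agents conj ua)))

        ("allow" "disallow")
        (let [;; An empty path restricts nothing, so it adds no rule,
              ;; but the group itself is still registered.
              new-rules (if (seq v) [{:allow? (= k "allow") :path v}] [])
              add       (fn [rules ua]
                          (update rules ua (fnil into []) new-rules))]
          (-> state
              (assoc :in-rules? true)
              (update :rules #(reduce add % agents))))

        ;; Tolerate anything else by skipping it.
        state))
    {:agents [] :in-rules? false :rules {}}
    (keep parse-line (str/split-lines text)))))

(defn allowed?
  "True if `user-agent` may crawl `path` under the parsed `rules`."
  [rules user-agent path]
  (let [group   (or (get rules (str/lower-case user-agent))
                    (get rules "*")
                    [])
        matches (filter #(str/starts-with? path (:path %)) group)
        ;; Longest path wins; on equal length, allow (true) sorts last.
        best    (last (sort-by (juxt (comp count :path) :allow?) matches))]
    (if best (:allow? best) true)))

;; Usage
(def example-rules
  (parse "User-agent: *\nDisallow: /private\nAllow: /private/public\n"))

(allowed? example-rules "anybot" "/private/x")        ;; => false
(allowed? example-rules "anybot" "/private/public/x") ;; => true
```

A path that matches no rule is treated as allowed, which mirrors the tolerant behavior described above. A real implementation would also need wildcard handling and URL normalization.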

Why

Why use Google's (much more stringent) documentation for handling robots.txt? For SEO purposes, Googlebot is the crawler you ought to care about the most.

isker/clj-robots-parser

Folders and files

Latest commit

History

Repository files navigation

clj-robots-parser

What

Why

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages