A C++ library for parsing WARC (Web ARChive) files according to the WARC specification. This is a work in progress: there are no tests yet, and decompressing response bodies and parsing the HTTP headers inside responses are not supported. Basic parsing works on a recent Common Crawl dump file.