The main goal of comprehensive crawls is to automatically harvest as many Czech web resources as possible.
The requirements for comprehensive crawls are:
- Domain – web resources in the Czech domain (.cz) are collected. Resources in other domains
can also be harvested, but in other types of harvests.
- Format – which resource formats are harvested depends on the technical settings of
- Access – only freely accessible resources are harvested
- Number of files – a maximum of 5000 files per domain
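The domain, access, and file-count requirements above could be expressed as a simple scope filter. The following is only a minimal sketch under stated assumptions, not the archive's actual crawler configuration: the names `ScopeFilter`, `registered_domain`, and the `freely_accessible` flag are hypothetical, a "domain" is approximated as the last two labels of the host name, and the format requirement is left out because it depends on harvester settings not described here.

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_FILES_PER_DOMAIN = 5000  # limit stated in the requirements list


def registered_domain(url: str) -> str:
    """Approximate the domain as the last two host labels, e.g. 'example.cz'."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


class ScopeFilter:
    """Hypothetical filter: .cz only, freely accessible, capped per domain."""

    def __init__(self, max_files: int = MAX_FILES_PER_DOMAIN) -> None:
        self.max_files = max_files
        self.counts: Counter = Counter()  # files accepted so far, per domain

    def accept(self, url: str, freely_accessible: bool = True) -> bool:
        domain = registered_domain(url)
        if not domain.endswith(".cz"):
            return False  # other domains belong to other types of harvests
        if not freely_accessible:
            return False  # only freely accessible resources are harvested
        if self.counts[domain] >= self.max_files:
            return False  # per-domain file budget is exhausted
        self.counts[domain] += 1
        return True
```

In this sketch a caller would ask `ScopeFilter().accept(url)` for each discovered URL and fetch only those for which it returns `True`; how accessibility is actually determined is outside the scope of the example.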
Comprehensive harvesting would not be possible without our partner CZ.NIC, the .cz domain registry, which provides us with a list of .cz domains.