What is a web crawler?
A web crawler, also called a web spider, web ant, or web robot, is a program that automatically browses information on the network. Of course, when browsing, it must follow certain rules; these rules are called web crawler algorithms. With Python, you can easily write a crawler that automatically retrieves information from the Internet. To do so, you need the following:

① Have a solid foundation in Python syntax; this is the basis of everything.

② Have some understanding of front-end knowledge, at least enough to read a page's HTML.

③ Know how to obtain the target data: the requests module, etc. (see the fetching sketch after this list).

④ Know how to parse the target data: regular expressions, XPath, JSONPath, etc. (see the parsing sketch after this list).

⑤ Know how to deal with anti-crawling measures: this is largely a matter of accumulated experience.

⑥ Know how to obtain data on a large scale: the Scrapy framework (see the Scrapy sketch after this list).
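
For item ③, here is a minimal sketch of fetching a page with the requests module. The URL and User-Agent string are placeholders for illustration only; real targets often require additional headers or cookies.

```python
import requests

# Hypothetical target URL, used only for illustration.
url = "https://example.com/"

# A browser-like User-Agent header; many sites reject the library default.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()                      # raise an error on 4xx/5xx responses
response.encoding = response.apparent_encoding   # guess the correct character set
print(response.status_code)
print(response.text[:200])                       # first 200 characters of the page
```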
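For item ④, this sketch parses a small hard-coded HTML snippet two ways: with an XPath query via lxml, and with a regular expression on the raw text. The snippet and the selectors are assumptions chosen for the example, not part of the original article.

```python
import re
from lxml import etree

# A small HTML snippet standing in for a downloaded page.
html = """
<html><body>
  <h1>Example Domain</h1>
  <a href="https://example.com/page1">Page 1</a>
  <a href="https://example.com/page2">Page 2</a>
</body></html>
"""

# XPath: select the <h1> text and all link addresses from the parsed tree.
tree = etree.HTML(html)
title = tree.xpath("//h1/text()")[0]
links = tree.xpath("//a/@href")

# Regular expression: extract the same links directly from the raw text.
links_re = re.findall(r'href="([^"]+)"', html)

print(title)     # Example Domain
print(links)     # ['https://example.com/page1', 'https://example.com/page2']
print(links_re)  # same list, obtained without building a DOM tree
```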
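For item ⑥, here is a minimal Scrapy spider as one possible starting point. It crawls quotes.toscrape.com, a public practice site; the spider name, selectors, and output fields are assumptions for the example.

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    """Minimal spider that collects quotes from a public practice site."""
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # Extract every quote block on the page with CSS selectors.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }
        # Follow the "Next" link so the crawl covers all pages.
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)
```

You can run it without creating a full project, for example: `scrapy runspider quotes_spider.py -o quotes.json`.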