After running the instructor's code, the output looks like this. It has happened many times before. What exactly is the cause?

Source: 7-4 Hands-on: Custom Text-Processing Class

破邪返瞳

2022-07-09 22:00:44

import requests

# //div[@class='e']/a[@class='el']/p[@class='t']/span/@title
url = "https://search.51job.com/list/000000,000000,0000,00,9,99,python,2,1.html"
header = {
    "Accept": "text / html, application / xhtml + xml, application / xml;q = 0.9, image / avif, image / webp, image / apng, * / *;q = 0.8, application / signed - exchange;v = b3;q = 0.9",
    "Accept - Encoding": "gzip, deflate, br",
    "Accept - Language": "zh - CN, zh;q = 0.9, en;q = 0.8Cache - Control: max - age = 0",
    "Connection": "keep - alive",
    "Cookie": "_uab_collina=165724966571579439025188;guid=e43c9ae66458b2fbcf09698769fca338;partner=class_imooc_com;acw_tc=ac11000116572811682083828e00df52ec398a53440c85411c1a91d34a1c2f;acw_sc__v2=62c81b6a896474647bf585785c630e7a2ac1bc4f; ssxmod_itna=YqUxBQG=KmqDqwxl4iqYKE=xfhQQFDu0me8io=BG8x0vcIeGzDAxn40iDtoOTOBYpwxkC7GiL+hR0icdwt827wxEh4c3SnpqmDB3DEx06i1=YxiiSDCeDIDWeDiDGR8DXO50OD7qiOD7otDj4GS9qGcDYQ2+u4DCOD51GtDI4GMDqDuDGt3EorDYLNDmdtDYfNDjqQDKLX3oeD2mhWMDYPSCDDlCAH502xTsuFkqlb=MlTrdPnrxdab66pMnuXBDYo27crSAIX/j=P38ODW4Sqo9BpijAx3CYme7re5b0zdADpr9AAenB+Qe0GBDDDpPCDK4bYeD;ssxmod_itna2=YqUxBQG=KmqDqwxl4iqYKE=xfhQQFDu0me8io=BixnKLPa4PDsFkqDLBG5X2QvGDUC+9e32QD6im4Ui3o82oWCoBrc+nvdLtpkFXeyZCva8COeKvxXbyC=EWLkpR/j991N5fdOCXyMXLKdNuAfEMBoXCAkAPv85EbWdpmbR31nNE/ot102Kd0WUxIobCYAnNhnExa=2ylq2WOeuI8mX=ba9th=a/nCaDTpR1mpiVA1vmalj7WaqdYa3UlLuf=K5LTpS6V1ICbR96Ps5UdHBRZzSzdvljuhIR=jC9fz/jPvXBm180U/f/rEEgUmWX8m=cp=V66gj6P9UapmneNlPrdQKfmeqco=fI0aPpAXKRrX2pm0tPS2z2r8rbBjUhBiNR538=FDG2C0QD08DiQ1SDPAh8iDdYD===",
    "Host": "jobs.51job.com",
    "Referer": "https: // jobs.51job.com / nanjing / 133201744.html",
    "Sec - Fetch - Dest": "document",
    "Sec - Fetch - Mode": "navigate",
    "Sec - Fetch - Site": "same - origin",
    "Sec - Fetch - User": "?1",
    "Upgrade - Insecure - Requests": "1",
    "User - Agent": "Mozilla / 5.0(WindowsNT10.0;Win64;x64) AppleWebKit / 537.36(KHTML, likeGecko) Chrome / 103.0.0.0Safari / 537.36sec - ch - ua: '.Not/A)Brand';v = '99', 'Google Chrome';v = '103', 'Chromium';v = '103'",
    "sec - ch - ua - mobile": "?0",
    "sec - ch - ua - platform": "Windows",
}
response = requests.get(url=url, headers=header)
response.encoding='ascii'
print(response.text)
https://img.mukewang.com/climg/62c989e20923cd0d18090230.jpg            

1 answer

好帮手慕凡

2022-07-10

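Hello! This happens because the site applies anti-scraping measures. First make sure the page opens correctly in your browser, then copy the request headers from that successful request.

You can open the URL directly: https://search.51job.com/list/000000,000000,0000,00,9,99,python,2,1.html and copy the request headers you need, as in the screenshot below. You can also change response.encoding to utf-8 so that the text is not garbled.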

https://img.mukewang.com/climg/62ca41210914b58021270908.jpg
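Copying headers by hand is where stray spaces such as "Accept - Encoding" tend to creep in. As a rough sketch (not part of the course code), the raw "Name: value" lines from the browser's developer tools could be pasted into a string and converted into a dict. The helper name `parse_raw_headers` and the sample text are assumptions made for illustration.

```python
def parse_raw_headers(raw: str) -> dict:
    """Turn 'Name: value' lines copied from the browser into a headers dict."""
    headers = {}
    for line in raw.strip().splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ':authority'.
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        headers[name.strip()] = value.strip()
    return headers


# Sample text; in practice this would be pasted from the browser (assumption).
raw = """
Host: search.51job.com
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3610.2 Safari/537.36
"""
header = parse_raw_headers(raw)
print(header["Host"])
```

The resulting `header` dict can be passed to `requests.get(url, headers=header)` in the same way as a hand-written one.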

Reference code:

import requests

url = "https://search.51job.com/list/000000,000000,0000,00,9,99,python,2,1.html"
header = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    # "Accept-Encoding": "gzip,deflate,r",
    # "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8Cache-Control:max-age=0",
    "Connection": "keep-alive",
    "Cookie": "_uab_collina=165724966571579439025188;guid=e43c9ae66458b2fbcf09698769fca338;partner=class_imooc_com;acw_tc=ac11000116572811682083828e00df52ec398a53440c85411c1a91d34a1c2f;acw_sc__v2=62c81b6a896474647bf585785c630e7a2ac1bc4f; ssxmod_itna=YqUxBQG=KmqDqwxl4iqYKE=xfhQQFDu0me8io=BG8x0vcIeGzDAxn40iDtoOTOBYpwxkC7GiL+hR0icdwt827wxEh4c3SnpqmDB3DEx06i1=YxiiSDCeDIDWeDiDGR8DXO50OD7qiOD7otDj4GS9qGcDYQ2+u4DCOD51GtDI4GMDqDuDGt3EorDYLNDmdtDYfNDjqQDKLX3oeD2mhWMDYPSCDDlCAH502xTsuFkqlb=MlTrdPnrxdab66pMnuXBDYo27crSAIX/j=P38ODW4Sqo9BpijAx3CYme7re5b0zdADpr9AAenB+Qe0GBDDDpPCDK4bYeD;ssxmod_itna2=YqUxBQG=KmqDqwxl4iqYKE=xfhQQFDu0me8io=BixnKLPa4PDsFkqDLBG5X2QvGDUC+9e32QD6im4Ui3o82oWCoBrc+nvdLtpkFXeyZCva8COeKvxXbyC=EWLkpR/j991N5fdOCXyMXLKdNuAfEMBoXCAkAPv85EbWdpmbR31nNE/ot102Kd0WUxIobCYAnNhnExa=2ylq2WOeuI8mX=ba9th=a/nCaDTpR1mpiVA1vmalj7WaqdYa3UlLuf=K5LTpS6V1ICbR96Ps5UdHBRZzSzdvljuhIR=jC9fz/jPvXBm180U/f/rEEgUmWX8m=cp=V66gj6P9UapmneNlPrdQKfmeqco=fI0aPpAXKRrX2pm0tPS2z2r8rbBjUhBiNR538=FDG2C0QD08DiQ1SDPAh8iDdYD===",
    "Host": "search.51job.com",
    # "Referer": "https://jobs.51job.com/nanjing/133201744.html",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "same-origin",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3610.2 Safari/537.36",
}
response = requests.get(url=url, headers=header)
response.encoding = 'utf-8'
print(response.text)
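To see why the encoding setting matters, here is a small offline sketch. `response.text` decodes the response body with the chosen encoding and replaces bytes it cannot decode, so 'ascii' turns every non-ASCII byte into a replacement character. The sample body below is made up, and the helper `decode_like_requests` is a hypothetical stand-in, not a requests API.

```python
def decode_like_requests(body: bytes, encoding: str) -> str:
    # Mirrors how response.text is built: undecodable bytes are replaced
    # with U+FFFD instead of raising an error.
    return body.decode(encoding, errors="replace")


# Made-up body standing in for a UTF-8 encoded page (assumption).
body = "职位名称: Python 开发工程师".encode("utf-8")

as_ascii = decode_like_requests(body, "ascii")
as_utf8 = decode_like_requests(body, "utf-8")

print(as_ascii)  # Chinese characters are lost, only the ASCII part survives
print(as_utf8)   # text is restored
```

If you are unsure which encoding a page uses, `response.apparent_encoding` gives a guess based on the body and may be worth printing before hard-coding a value.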

Happy learning!
