Scraping Lagou (拉勾网) to see how much Python pays per month!

piaodoo · Programming Tutorials · 2020-02-22 22:01:27 · Python tutorials

Source: 吾爱破解论坛 (52pojie forum)

2018-9-23 21:38:02
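Today I suddenly wanted to see how much a fresh university graduate doing Python earns per month in Chengdu, so I decided to scrape the data.
1. First, open Lagou (拉勾网).
2. In the F12 developer tools, find the request's headers and form data, and copy all of them, like this:
[Python]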

headers = {
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Connection': 'keep-alive',
    'Cookie': 'WEBTJ-ID=20180918181741-165ec2f83d743-0ee41db9c69e0e-6e1f147a-2073600-165ec2f83d95d1; _ga=GA1.2.1559975793.1537265862; user_trace_token=20180918181742-13d9fe9b-bb2c-11e8-baf2-5254005c3644; LGUID=20180918181742-13da0831-bb2c-11e8-baf2-5254005c3644; JSESSIONID=ABAAABAAAFCAAEG0FCA6542C1D06F1F470B0A7189B8AD06; index_location_city=%E6%88%90%E9%83%BD; TG-TRACK-CODE=index_search; _gid=GA1.2.1437892044.1537705170; LGSID=20180923201929-eb4f9cfb-bf2a-11e8-bb56-5254005c3644; PRE_UTM=; PRE_HOST=www.baidu.com; PRE_SITE=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DXpamso_IxFbfBezXbGYWv8-vI3sYyGf67_89jrtjXQK%26wd%3D%26eqid%3Da27f010200061c48000000025ba784cc; PRE_LAND=https%3A%2F%2Fwww.lagou.com%2F; Hm_lvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1537265862,1537265872,1537354446,1537705170; LGRID=20180923201940-f1eea38b-bf2a-11e8-bb56-5254005c3644; Hm_lpvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1537705181; SEARCH_ID=baf28c394763403c8afc3fc36dce76a4',
    'Host': 'www.lagou.com',
    'Origin': 'https://www.lagou.com',
    'Referer': 'https://www.lagou.com/jobs/list_python?isSchoolJob=1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.62 Safari/537.36',
    'X-Anit-Forge-Code': '0',
    'X-Anit-Forge-Token': 'None',
    'X-Requested-With': 'XMLHttpRequest',
}

[Python]
data = {
    'first': 'true',
    'pn': 1,
    'kd': 'python',
}
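To make it concrete what the URL parameters and the `data` dict become on the wire, here is a small offline sketch using only the standard library. It assumes the query parameters that appear in the script's request URL, and the names `query` and `form` are only for illustration. It shows how the city name 成都 becomes `%E6%88%90%E9%83%BD` and how `data` becomes the form-encoded POST body; `requests` applies the same kind of form encoding when it is given a dict via `data=`.

```python
from urllib.parse import urlencode, unquote

# Query-string parameters, as they appear in the request URL
query = {
    'city': '成都',
    'needAddtionalResult': 'false',  # spelling follows the site's own parameter name
    'isSchoolJob': 1,
}

# Form fields sent in the POST body (same keys as the data dict above)
form = {'first': 'true', 'pn': 1, 'kd': 'python'}

query_string = urlencode(query)   # percent-encodes the UTF-8 bytes of 成都
form_body = urlencode(form)

print(query_string)
print(form_body)
print(unquote('%E6%88%90%E9%83%BD'))  # decodes back to the city name
```

Going the other way with `unquote` is handy for checking what a copied URL actually asks for.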


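The code is actually very simple, so here it is!
If anything is unclear, leave a comment!
One more thing: the scraped data could be saved to Excel, which would be easier to read. I forgot how to do that, so I left it out; it can be added in a later iteration.
[Python]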
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2018/9/23 20:22
# @Author : TrueNewBee

import requests
import time

# Search endpoint; city=%E6%88%90%E9%83%BD is the URL-encoded form of 成都 (Chengdu)
url = 'https://www.lagou.com/jobs/positionAjax.json?city=%E6%88%90%E9%83%BD&needAddtionalResult=false&isSchoolJob=1'

headers = {
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Connection': 'keep-alive',
    'Cookie': 'WEBTJ-ID=20180918181741-165ec2f83d743-0ee41db9c69e0e-6e1f147a-2073600-165ec2f83d95d1; _ga=GA1.2.1559975793.1537265862; user_trace_token=20180918181742-13d9fe9b-bb2c-11e8-baf2-5254005c3644; LGUID=20180918181742-13da0831-bb2c-11e8-baf2-5254005c3644; JSESSIONID=ABAAABAAAFCAAEG0FCA6542C1D06F1F470B0A7189B8AD06; index_location_city=%E6%88%90%E9%83%BD; TG-TRACK-CODE=index_search; _gid=GA1.2.1437892044.1537705170; LGSID=20180923201929-eb4f9cfb-bf2a-11e8-bb56-5254005c3644; PRE_UTM=; PRE_HOST=www.baidu.com; PRE_SITE=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DXpamso_IxFbfBezXbGYWv8-vI3sYyGf67_89jrtjXQK%26wd%3D%26eqid%3Da27f010200061c48000000025ba784cc; PRE_LAND=https%3A%2F%2Fwww.lagou.com%2F; Hm_lvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1537265862,1537265872,1537354446,1537705170; LGRID=20180923201940-f1eea38b-bf2a-11e8-bb56-5254005c3644; Hm_lpvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1537705181; SEARCH_ID=baf28c394763403c8afc3fc36dce76a4',
    'Host': 'www.lagou.com',
    'Origin': 'https://www.lagou.com',
    'Referer': 'https://www.lagou.com/jobs/list_python?isSchoolJob=1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.62 Safari/537.36',
    'X-Anit-Forge-Code': '0',
    'X-Anit-Forge-Token': 'None',
    'X-Requested-With': 'XMLHttpRequest',
}

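# The search returned four pages of results, so loop over pages 1 to 4
for page in range(1, 5):
    data = {
        'first': 'true',
        'pn': page,
        'kd': 'python',
    }
    response = requests.post(url, data=data, headers=headers, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error status
    # Decode the JSON body and pick out the list of job postings
    positions = response.json()['content']['positionResult']['result']
    for item in positions:
        print('*' * 80)
        print(item['companyFullName'])
        print(item['workYear'])
        print(item['positionName'])
        print(item['salary'])
        print(item['secondType'])
    time.sleep(1)  # pause between page requests rather than between printed items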
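The post mentions saving the results to Excel but leaves that part out. One possible way to do it (not the author's) is to write a CSV file, which Excel opens directly. The sketch below assumes each row is a dict shaped like the `item` objects printed above; `save_rows`, `FIELDS`, the sample row, and the file name are all hypothetical.

```python
import csv
import os
import tempfile

# The five fields printed by the script above
FIELDS = ['companyFullName', 'workYear', 'positionName', 'salary', 'secondType']


def save_rows(rows, path):
    """Write job dicts to a CSV file, keeping only the columns in FIELDS."""
    # utf-8-sig writes a BOM so that Excel detects the encoding of Chinese text
    with open(path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(rows)


# Made-up sample row standing in for the scraped items
sample = [
    {'companyFullName': 'Example Co.', 'workYear': '应届毕业生',
     'positionName': 'Python开发', 'salary': '6k-8k',
     'secondType': '后端开发', 'extraKey': 'dropped on write'},
]

out_path = os.path.join(tempfile.gettempdir(), 'lagou_python.csv')
save_rows(sample, out_path)
print(out_path)
```

In the real script, the rows could be collected in a list inside the page loop and passed to `save_rows` once at the end.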


Remember to leave a 热心 rating, it's free! I'll keep sharing what I learn.

[Screenshot: QQ截图20180923214006.png (234.87 KB), uploaded 2018-9-23 21:40. Caption: find the page first.]

[Screenshot: 222332214434.png (33.64 KB), uploaded 2018-9-23 21:44. Caption: part of the scraped data.]
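To actually answer the question of how much per month, the `salary` strings printed by the script can be turned into numbers. This is only a sketch: it assumes salaries look like `6k-8k` (a monthly range in thousands of yuan) and skips anything else. `parse_salary` and the sample values are hypothetical, not taken from the scraped output.

```python
import re
from statistics import mean

# Assumed format: "<low>k-<high>k", case-insensitive
SALARY_RE = re.compile(r'^(\d+)k-(\d+)k$', re.IGNORECASE)


def parse_salary(text):
    """Return (low, high) in yuan per month, or None if the format is unexpected."""
    match = SALARY_RE.match(text.strip())
    if match is None:
        return None
    low, high = (int(group) * 1000 for group in match.groups())
    return low, high


# Made-up values; the last one stands for any string that does not fit the pattern
samples = ['6k-8k', '4K-6K', '面议']
ranges = [r for r in map(parse_salary, samples) if r is not None]
midpoints = [(low + high) / 2 for low, high in ranges]

print(ranges)
print(mean(midpoints))  # average of the range midpoints
```

Averaging midpoints is a rough choice; reporting the low and high ends separately would be just as reasonable.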

Article link: https://www.piaodoo.com/7322.html