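Introduction:
As the internet industry keeps booming, it has attracted more and more talented people, and many newcomers are eager to try their hand and make their mark in this wave. Today we use data from Kanzhun (看准网) to give a general overview of the major internet companies.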
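01 Data source
Kanzhun hosts many employee reviews of companies. We extract the data we need from them: overall rating, interview difficulty, recommendation rate, the share of employees optimistic about the company's prospects, and CEO approval rate. The code is as follows: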
- import time
- from urllib import request
- from urllib.request import urlopen
- import numpy as np
- from bs4 import BeautifulSoup
- # `driver` is assumed to be a Selenium WebDriver created elsewhere in the script
- ## Get company information
- def get_company_info(num, headers):
-     ## Get review data
-     url = 'https://www.kanzhun.com/gsr' + str(num) + '.html?ka=com-blocker1-review'
-     js = 'window.open("' + url + '")'
-     driver.execute_script(js)  # open the page in a new tab
-     time.sleep(5)              # wait for the page to load
-     driver.close()             # close the previous tab
-     driver.switch_to.window(driver.window_handles[0])
-     bsObj = BeautifulSoup(driver.page_source, "html.parser")
-     # Review category labels and counts, e.g. "label(count) label(count) ..."
-     tag = bsObj.find('div', attrs={'class': 'all_item'}).text.replace('\t', '').replace('\n', '').replace('(', ' ').replace(')', ' ').split(' ')
-     tag = tag[0:len(tag) - 1]
-     this_tag = {tag[i * 2]: tag[i * 2 + 1] for i in np.arange(int(len(tag) / 2 - 1))}
-     this_name = bsObj.find('div', attrs={'class': 'co_name t_center'}).text
-     this_overal = float(bsObj.find('div', attrs={'class': 'res_box_star f_right'}).find('em').text)
-     points = bsObj.find('ul', attrs={'class': 'score_rate clearfix'}).text.replace('\n', ' ').split()
-     # Convert each percentage to a 5-point scale (reads only the first two digits)
-     this_recommend = float(points[0][0:2]) / 100 * 5
-     this_future = float(points[2][0:2]) / 100 * 5
-     this_ceo = float(points[4][0:2]) / 100 * 5
-     ## Get the CEO photo and the company logo
-     ceo_pic = bsObj.find('div', attrs={'class': 'ceo_info'}).find('div').find('img').attrs['src']
-     ceo_name = bsObj.find('div', attrs={'class': 'ceo_info'}).find('p').text
-     head_logo = bsObj.find('div', attrs={'class': 'com_logo f_left'}).find('img').attrs['src']
-     head_loc = 'D:/爬虫/看准/公司logo/' + this_name + '.jpg'
-     ceo_loc = 'D:/爬虫/看准/CEOlogo/' + this_name + '.jpg'
-     request.urlretrieve(head_logo, head_loc)
-     request.urlretrieve(ceo_pic, ceo_loc)
-     ## Get interview difficulty
-     url = 'https://www.kanzhun.com/gsm' + str(num) + '.html?ka=com-floater-interview'
-     js = 'window.open("' + url + '")'
-     driver.execute_script(js)
-     time.sleep(5)
-     driver.close()
-     driver.switch_to.window(driver.window_handles[0])
-     bsObj = BeautifulSoup(driver.page_source, "html.parser")
-     # The page is fetched again with urllib; this parse replaces the one above
-     req = request.Request(url, headers=headers)
-     html = urlopen(req)
-     bsObj = BeautifulSoup(html.read(), "html.parser")
-     this_difficulty = float(bsObj.find('section', attrs={'class': 'interview_feel'}).find('em').text)
-     this_feeling = bsObj.find('ul', attrs={'class': 'score_list'}).find_all('span', attrs={'class': 'percent'})
-     this_feeling = [float(k.text.replace('%', '')) for k in this_feeling]
-     # Weighted interview-experience score: the three percentages weighted 5, 3 and 1
-     this_feeling = (this_feeling[0] * 5 + this_feeling[1] * 3 + this_feeling[2] * 1) / 100
-     ## Combine the data into a dictionary
-     this_company = {'name': this_name, 'overal': this_overal, 'comments': tag[1], 'recommend': this_recommend,
-                     'future': this_future, 'ceo': this_ceo, 'difficulty': this_difficulty, 'feeling': this_feeling}
-     return this_company, this_tag, this_name
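The two score conversions in the listing above can be isolated into small helpers, which makes them easier to check without opening a browser. The following is only a sketch: the function names `percent_to_five_point` and `weighted_feeling` are hypothetical and do not appear in the original code, and the assumption that the three interview percentages arrive in the order good, neutral, bad is inferred from the weights 5, 3, 1 rather than stated in the article. Unlike the `[0:2]` slice used for `this_recommend`, `this_future` and `this_ceo`, stripping the `%` sign also handles one-digit and three-digit values such as `5%` or `100%`.

```python
def percent_to_five_point(text):
    """Convert a percentage string such as '85%' to a score on a 0-5 scale."""
    # Strip whitespace and the trailing percent sign instead of slicing two characters.
    value = float(text.strip().rstrip('%'))
    return value / 100 * 5


def weighted_feeling(percentages, weights=(5, 3, 1)):
    """Weighted average of percentage shares (assumed order: good, neutral, bad)."""
    # Each share is a number between 0 and 100, so dividing by 100 yields
    # a score on the same scale as the weights.
    return sum(p * w for p, w in zip(percentages, weights)) / 100


if __name__ == '__main__':
    print(percent_to_five_point('85%'))
    print(weighted_feeling([60.0, 30.0, 10.0]))
```

With these helpers, a line such as `this_recommend = float(points[0][0:2])/100*5` could be written as `this_recommend = percent_to_five_point(points[0])`, assuming `points[0]` has the form `'85%'`.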
02 Overall comparison