Unveiling the Internet Companies: A Data-Driven Look at the Companies That Slay the Whole Industry

Published: 2018-10-02 09:37:06 | Category: Tutorials | Source: 徐麟 (Xu Lin)


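Preface:

As the internet industry continues to boom, it has attracted more and more talented people, and many others are eager to try their hand at riding the internet wave. Today we use data from Kanzhun (看准网) to give you a broad overview of the major internet companies.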

01 Data Source


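Kanzhun hosts a large number of employee reviews of companies. From these we extract the data we need: the overall rating, interview difficulty, recommendation rate, outlook on the company's future, and CEO approval rate. The code is as follows: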

## Fetch company information
import time
from urllib import request
from urllib.request import urlopen

from bs4 import BeautifulSoup

## `driver` is assumed to be a Selenium WebDriver created elsewhere in the script
def get_company_info(num, headers):
    ## Fetch the review data: open the page in a new window, close the old
    ## window, then switch to the only window that remains
    url = 'https://www.kanzhun.com/gsr'+str(num)+'.html?ka=com-blocker1-review'
    js = 'window.open("'+url+'")'
    driver.execute_script(js)
    time.sleep(5)
    driver.close()
    driver.switch_to.window(driver.window_handles[0])
    bsObj = BeautifulSoup(driver.page_source, "html.parser")
    ## Strip tabs and newlines, then split the "label (count)" text into tokens
    tag = bsObj.find('div', attrs={'class': 'all_item'}).text.replace('\t', '').replace('\n', '').replace('(', ' ').replace(')', ' ').split(' ')
    tag = tag[0:len(tag)-1]
    this_tag = {tag[i*2]: tag[i*2+1] for i in range(len(tag)//2 - 1)}
    this_name = bsObj.find('div', attrs={'class': 'co_name t_center'}).text
    this_overal = float(bsObj.find('div', attrs={'class': 'res_box_star f_right'}).find('em').text)
    points = bsObj.find('ul', attrs={'class': 'score_rate clearfix'}).text.replace('\n', ' ').split()
    ## Rescale the percentages to a 5-point scale
    this_recommend = float(points[0][0:2])/100*5
    this_future = float(points[2][0:2])/100*5
    this_ceo = float(points[4][0:2])/100*5
    ## Download the CEO portrait and the company logo
    ceo_pic = bsObj.find('div', attrs={'class': 'ceo_info'}).find('div').find('img').attrs['src']
    ceo_name = bsObj.find('div', attrs={'class': 'ceo_info'}).find('p').text
    head_logo = bsObj.find('div', attrs={'class': 'com_logo f_left'}).find('img').attrs['src']
    head_loc = 'D:/爬虫/看准/公司logo/'+this_name+'.jpg'
    ceo_loc = 'D:/爬虫/看准/CEOlogo/'+this_name+'.jpg'
    request.urlretrieve(head_logo, head_loc)
    request.urlretrieve(ceo_pic, ceo_loc)
    ## Fetch the interview difficulty
    url = 'https://www.kanzhun.com/gsm'+str(num)+'.html?ka=com-floater-interview'
    js = 'window.open("'+url+'")'
    driver.execute_script(js)
    time.sleep(5)
    driver.close()
    driver.switch_to.window(driver.window_handles[0])
    bsObj = BeautifulSoup(driver.page_source, "html.parser")
    ## The page is fetched again with urllib and the given headers;
    ## this parse replaces the Selenium one above
    req = request.Request(url, headers=headers)
    html = urlopen(req)
    bsObj = BeautifulSoup(html.read(), "html.parser")
    this_difficulty = float(bsObj.find('section', attrs={'class': 'interview_feel'}).find('em').text)
    this_feeling = bsObj.find('ul', attrs={'class': 'score_list'}).find_all('span', attrs={'class': 'percent'})
    this_feeling = [float(k.text.replace('%', '')) for k in this_feeling]
    ## Weighted score: the three percentages get weights 5, 3 and 1
    this_feeling = (this_feeling[0]*5+this_feeling[1]*3+this_feeling[2]*1)/100
    ## Combine the data into a dictionary
    this_company = {'name': this_name, 'overal': this_overal, 'comments': tag[1], 'recommend': this_recommend,
                    'future': this_future, 'ceo': this_ceo, 'difficulty': this_difficulty, 'feeling': this_feeling}
    return this_company, this_tag, this_name
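
The two score conversions inside the function are easier to see in isolation. Below is a minimal sketch, not the author's code: the helper names `percent_to_five_point` and `interview_feeling` are hypothetical, and it assumes the percentages appear as strings such as '85%' and that the interview page lists its three percentages from most to least positive. The regular expression replaces the `[0:2]` slicing used above, which would read '100%' as 10.

```python
import re


def percent_to_five_point(text):
    """Rescale a percentage string such as '85%' to a 0-5 score."""
    match = re.search(r'\d+(?:\.\d+)?', text)
    if match is None:
        raise ValueError('no number found in %r' % text)
    return float(match.group()) / 100 * 5


def interview_feeling(percents):
    """Weighted score of three percentages using the weights 5, 3 and 1.

    Assumes the three values sum to roughly 100 and are ordered from
    most to least positive, as in the scraping function above.
    """
    first, second, third = percents
    return (first * 5 + second * 3 + third * 1) / 100


if __name__ == '__main__':
    print(percent_to_five_point('85%'))
    print(interview_feeling([50.0, 30.0, 20.0]))
```

Putting both metrics on the same 1-to-5 range as the overall rating makes the recommendation rate, future outlook, CEO approval, and interview experience directly comparable across companies.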

02 Overall Comparison

