Now more than ever, companies use data analysis to improve decision-making. However, the value of analytics is no longer limited to companies; with access to affordable hardware and open source tools, it now extends to private life. Individuals can use data science methods to make their own decisions.
Recently, as explained in this blog post, I used publicly available data to discover which cities in the United States are the best match for my customized criteria. Now, I want to go a step further and use data science to help me find a job that meets my requirements. This is a necessary data science task because recruiting and job listing platforms do not provide all the details I need in one place. By leveraging data science methods, however, I can build my own tool to support my job-seeking decisions. As an engineer, I avoid reinventing the wheel. So let us first look at the features a job listing website should have, and then judge whether the existing job listing websites can fulfill my needs.
An ideal job listing website
I checked multiple job listing websites such as Indeed, LinkedIn, Glassdoor, and CareerBuilder. In some ways the platforms are similar: First, the job seeker looks for available positions by searching the job title and the job description. The website responds to the query by returning a list of available positions. Then, the job seeker reads the job descriptions and learns about the hiring companies. Finally, the job seeker applies for the selected job postings. Sounds easy? Let’s put it to the test.
Let’s say I am looking for a “Data Scientist” position. I tried Indeed.com, and it returned more than 25,000 full-time positions for this query. Obviously, this is far too many entries to review. Thus, I refined the query as much as possible by searching for job postings that are less than 15 days old, excluding staffing agencies, and adding “python” to the filters. Indeed still returned 1,566 jobs, which is incredibly difficult to examine. To be more precise, it would take about 13 days if you spent 10 hours a day and 5 minutes on each posting. Clearly, the results should be filtered even more, but how can this be done?
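The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the function name `review_days` is made up for illustration, and the inputs are the figures quoted above (1,566 postings, 5 minutes each, 10 hours a day), which work out to roughly 13 working days.

```python
def review_days(postings: int, minutes_per_posting: float, hours_per_day: float) -> float:
    """Return how many working days it takes to read every posting."""
    total_hours = postings * minutes_per_posting / 60
    return total_hours / hours_per_day


# Figures from the Indeed search described in the text.
days = review_days(postings=1566, minutes_per_posting=5, hours_per_day=10)
print(f"{days:.2f} working days")
```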
As mentioned earlier, job listing websites mostly focus on job title and job description. I believe the job title is only useful for narrowing down the field of work, not for finding a specific job entry. Job descriptions convey more information, enough to give job seekers a general understanding of the position. That being said, in today’s fast-paced and changing environment, job descriptions are often inaccurate and become outdated as soon as they are written.
I hypothesize that in order to find a job that is the right fit, it is more useful to focus on the company and the location (i.e. city and state) rather than the job title and job description. The company determines the people you work with, defines your impact on the community, defines the location you commute to, the benefits you are getting and so on. In the next sections, I will elaborate.
Features of a company
Although it is possible to analyze a company by using frameworks such as the McKinsey 7S Framework, it is not realistic to expect company attributes such as culture and structure to be accessible to job listing websites. Consequently, it is important to be practical and stick to information that is either publicly available or easy to collect. It is also worth noting that even if all the information about a company were available, not all of it would be understandable to the general public. For example, according to the Competing Values Framework, the culture of companies can be divided into four categories: Group, Developmental, Hierarchical, and Rational. A typical job seeker does not understand the difference between these categories, so such a classification may not be useful. Coming up with a list of attributes that are easy to understand and easy to investigate requires a considerable amount of effort and research. To help, I compiled a basic list of attributes that are fairly easy to understand and important for the job seeker to consider:
- Startup vs Corporate: If you are adventurous and looking for a fast-paced environment, look for startups. If you prefer more stability and less pressure, corporate is the way to go. Here is a nice infographic.
- Size of the company: This is similar to “Startup vs Corporate” but with more focus on the visibility of your impact. The larger the company, the less visible your impact is. Here is a summary from Glassdoor.com.
- Work-life balance
- Industry: Each company is active in a specific industry. For a job seeker with professional expertise in a specific industry such as telecommunications or healthcare, being able to filter companies by industry could save a lot of time.
Characteristics of a Location
According to the US Census, around 12% of people in the US move each year. Another study shows that 50% of relocations are related to jobs. A job dictates so much about a person’s life. According to the Economist, about 3.9 million married Americans ages 18 and over live apart from their spouses in order to keep their jobs, which suggests that the location of a job (i.e. city and state) plays an undeniable role in people’s careers. Unfortunately, job listing websites assume the job seeker already knows the city or state they wish to live in and provide minimal support for filtering jobs based on the attributes of a location. To show what I mean by the attributes of a location, consider the following scenarios, which existing job listing websites cannot fulfill:
“Sara is a data scientist who loves skiing. She likes to live in large cities. Considering that she is not a fan of driving, public transportation plays an important role in her day to day life.”
“Joe is from Austin, TX. He is looking for a job in an affordable location so he can save money to buy a house in the near future.”
One might say these requirements are outside the scope of a standard job search engine. I disagree: adding smart filters that refine results based on the attributes of the location could save job seekers days of searching.
Here is the summary of attributes that are worth considering:
- Affordability: The median salary of an occupation in an area should be more than the average living cost in the same area.
- Weather: Determine a methodology for bucketing regions by climate. This attribute requires more research, but we can either consider five different types of climate in the US or divide the US into two regions based on average annual high temperature.
- Population Density: Metropolitan areas with high population density such as New York and Los Angeles are suitable for people who love living in crowded areas. On the other hand, Austin, TX, and Cleveland, OH are more suitable for people who like to live in more relaxed and easygoing areas.
There are additional location attributes worth mentioning, such as the crime rate, education quality, political views, etc. I would not include these attributes at the regional level because they belong to a more granular location level such as zip code or school district. People can use them to find a neighborhood to live in, but for a job location I believe they add no value to the job search: choosing a different neighborhood may only change the commute by a few minutes.
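The three location attributes above could be turned into filters along the following lines. This is a minimal sketch, not an existing feature of any job site: the `Location` record, the placeholder cities, all numbers, and the 70 °F threshold for the two-region climate split are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Location:
    name: str
    median_salary: float       # yearly median salary for the occupation (USD)
    living_cost: float         # yearly average living cost (USD)
    population_density: float  # people per square mile
    avg_high_temp_f: float     # average annual high temperature (Fahrenheit)


def is_affordable(loc: Location) -> bool:
    # Affordability rule from the list: median salary must exceed living cost.
    return loc.median_salary > loc.living_cost


def climate_bucket(loc: Location, threshold_f: float = 70.0) -> str:
    # Two-region split by average annual high temperature (threshold is assumed).
    return "warm" if loc.avg_high_temp_f >= threshold_f else "cool"


def filter_locations(locations: List[Location], min_density: float = 0.0,
                     climate: Optional[str] = None) -> List[str]:
    """Return names of affordable locations matching density and climate."""
    return [
        loc.name for loc in locations
        if is_affordable(loc)
        and loc.population_density >= min_density
        and (climate is None or climate_bucket(loc) == climate)
    ]


# Placeholder data, not real statistics.
locations = [
    Location("City A", 95_000, 60_000, 12_000, 62.0),
    Location("City B", 80_000, 85_000, 3_000, 80.0),
    Location("City C", 90_000, 50_000, 2_500, 78.0),
]

dense_cities = filter_locations(locations, min_density=5_000)
warm_cities = filter_locations(locations, climate="warm")
print(dense_cities, warm_cities)
```

A job seeker like Sara would combine a high `min_density` with further filters (for example public transportation), while Joe would rely mostly on the affordability rule.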
Based on what we discussed so far, the following table shows the types of filters each job listing website offers.
| Website | Company-based filters | Job-based filters | Location-based filters | Company attributes shown |
|---|---|---|---|---|
| Indeed | Name | Job Description / Salary / Type / Experience Level / Posting age | City, state, or zip within a radius | Overall ranking / Name / Location |
| | Name / Industry | Job Description / Salary / Type / Experience Level / Posting age / Job Function | City or State | Employee and Applicant insights / Name / Location |
| Glassdoor | Name / Ranking / Size / Industry | Job Description / Salary / Type / Experience Level / Posting age | City or state within a radius | Overall ranking / Name / Location |
| Google For Jobs | Name / Industry | Job Description / Salary / Type / Posting age | City or state within a radius | Name / Location |
How I did it
As shown in the table, existing job listing websites do not support the features that I think are necessary to efficiently look for a job. Fortunately, I have some ideas on how to create a data product to fulfill my job hunting needs.
I developed a tool that suggests a city based on given preferences. The tool ranks 7,174 cities in the US by considering housing prices, salaries, crime rate, job opportunities, etc., and applies a custom user-assigned weight to each of these features. When I set the weights based on my preferences, Pflugerville, TX comes up among the top results. This city is in the Austin, TX metropolitan area. The area is affordable and diverse, with a lot of job opportunities. Now I only need to find out whether I can find a job there that suits my requirements.
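To illustrate the idea of user-weighted ranking, here is a minimal sketch. It is not the code of the tool itself: the min-max normalization, the weighted sum, the feature names, and the three placeholder cities with made-up numbers are all assumptions chosen to keep the example small.

```python
def min_max(values):
    """Scale values to the range [0, 1]; constant columns map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_cities(cities, weights, lower_is_better):
    """Return city names sorted from best to worst weighted score."""
    names = list(cities)
    scores = dict.fromkeys(names, 0.0)
    for feature, weight in weights.items():
        normalized = min_max([cities[name][feature] for name in names])
        for name, value in zip(names, normalized):
            # Flip features where a smaller raw value is preferable.
            if feature in lower_is_better:
                value = 1.0 - value
            scores[name] += weight * value
    return sorted(names, key=lambda name: scores[name], reverse=True)


# Placeholder data, not real statistics.
cities = {
    "City A": {"housing_price": 300_000, "salary": 90_000, "crime_rate": 3.0},
    "City B": {"housing_price": 600_000, "salary": 120_000, "crime_rate": 5.0},
    "City C": {"housing_price": 450_000, "salary": 100_000, "crime_rate": 2.0},
}
weights = {"housing_price": 0.5, "salary": 0.3, "crime_rate": 0.2}

ranking = rank_cities(cities, weights, lower_is_better={"housing_price", "crime_rate"})
print(ranking)  # ['City A', 'City C', 'City B']
```

Changing the weights changes the order, which is the point of letting each user assign them.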
I used Indeed.com for my search because it lists more jobs than any other job listing website. Furthermore, its “advanced search” works well for filtering job descriptions, offering extensive text-based filters. Unfortunately, there is no way to filter companies based on attributes such as size or culture. Companies can be found only by name, which is not helpful for my use case. Taking all that into consideration, I still have a lot to do to find the jobs that satisfy my expectations.
First, to narrow down the list of companies, I need to be clearer about the type of job I am looking for. My background is in software engineering, and for the past couple of years I have been studying Big Data. Thus, my current interest is to work in a tech-savvy company where I can enrich my experience with Big Data. I tried some keywords and found that if I search the job descriptions for “Hadoop,” Indeed returns a list of Big Data related jobs. After some data scraping, I was able to get a list of companies that presumably use Big Data. The Python script I wrote to search and produce this list of 187 companies is below.
```python
import re
import time

import requests
from bs4 import BeautifulSoup


# Method of the scraper class. cls.url, cls.headers, cls.BASE_URL,
# cls.PAGE_COUNTER, cls.list_template and cls.company_list are class attributes.
@classmethod
def getList(cls, limit=0):
    # download the first result page
    response = requests.get(cls.url, headers=cls.headers)
    response.raise_for_status()
    # how many jobs are available
    jobs_count = int(re.search(r'of\s(\d+)\sjobs', response.text).group(1))
    print(jobs_count)
    # 'limit' is for test purposes; 0 means "go through every job"
    max_jobs = jobs_count if limit == 0 else limit
    job_counter = 0
    company_names = []
    # browse page by page to get name and link for each company
    # ... related to a job
    while True:
        # delay to make sure we are not disturbing the server
        time.sleep(1)
        print(job_counter)
        # getting name and link of the company
        page_dom = BeautifulSoup(response.text, 'html.parser')
        results = page_dom.find_all('span', 'company')
        for c in results:
            current = c.find('a')
            if current is None:
                c_name = c.text
                link = ''
            else:
                c_name = current.text
                link = cls.BASE_URL + current['href']
            c_name = c_name.strip()
            if c_name not in company_names:  # check if already have the company
                company_names.append(c_name)
                rankings = cls.list_template.copy()
                rankings['name'] = c_name
                rankings['link'] = link.strip()
                cls.company_list.append(rankings)
        job_counter += 10
        # check if we are done before requesting another page
        if job_counter >= max_jobs:
            break
        response = requests.get(cls.url + cls.PAGE_COUNTER.format(job_counter),
                                headers=cls.headers)
        response.raise_for_status()
    return cls.company_list
```
After getting the list of candidate companies, I need to review them and scrape all available data points so that I can filter out the companies that do not match my expectations. I prefer to work for a medium-sized company with a collaborative environment and a good work-life balance. Fortunately, Indeed and Glassdoor have a very rich company ranking section based on employee reviews. Although it is unwise to trust rankings that are based on only a few reviews, when hundreds of employees give consistent opinions I feel confident making a decision based on them. By extracting the ratings for work-life balance, job advancement, culture, benefits, and management, I can gain meaningful insight into the “quality” of each company.
Finally, by filtering out companies that have a low score on my desired features, I end up with a list of 9 companies that satisfy all of my expectations. At this point, I prefer to do the rest of the work manually and study these companies and the jobs they offer.
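The filtering step can be sketched as follows. This is an illustrative example rather than the script I ran: the record layout, the rating keys, the minimum scores, and the cutoff of 100 reviews are assumptions, and the companies are placeholders with made-up numbers.

```python
def keep_company(company, min_scores, min_reviews=100):
    """Keep a company only if it has enough reviews and meets every minimum score."""
    # Ratings backed by only a few reviews are not trusted.
    if company["review_count"] < min_reviews:
        return False
    ratings = company["ratings"]
    # A missing rating counts as 0.0, so the company is filtered out.
    return all(ratings.get(feature, 0.0) >= minimum
               for feature, minimum in min_scores.items())


# Placeholder data, not real companies or ratings.
companies = [
    {"name": "Company A", "review_count": 450,
     "ratings": {"work_life_balance": 4.2, "culture": 4.0, "management": 3.8}},
    {"name": "Company B", "review_count": 12,
     "ratings": {"work_life_balance": 4.9, "culture": 4.8, "management": 4.7}},
    {"name": "Company C", "review_count": 300,
     "ratings": {"work_life_balance": 3.1, "culture": 4.1, "management": 3.9}},
]
min_scores = {"work_life_balance": 4.0, "culture": 3.5}

shortlist = [c["name"] for c in companies if keep_company(c, min_scores)]
print(shortlist)  # ['Company A']
```

Company B is dropped despite its high ratings because they rest on too few reviews, and Company C is dropped for its work-life balance score.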
It took me a couple of days to write the scripts and find the desired result. I suggest that job listing websites invest in an integrated search engine that can filter jobs based on location, company, and job description. I understand that adding the features I suggested requires extensive research and controlled experiments. However, considering how similar job listing websites are to one another, it could be the start of a groundbreaking innovation in how people look for a job.