Our client is one of the leading brands in Search Engine Optimization (SEO) tools. The company runs an internet-scale bot that crawls the whole Web 24/7, storing huge volumes of information to be indexed and structured in a timely fashion. On top of that, we're building various analytical services that let our customers explore all this data.
We're looking for someone who is proficient in working with large amounts of data and can extract all kinds of insights and patterns from it.
In a nutshell, we're trying to reverse engineer how Google ranks pages in its search results by analyzing:
- Search queries that people enter in Google;
- Pages that appear in the top 100 search results for each search query;
- The content of each page (whether it is topically relevant to the search query);
- Which other pages link to a given page, and what their attributes are;
- Many more metrics and variables.
We're also building our own models for calculating the "power" of a web page and its likelihood of ranking high in search results.
Requirements:
- Good knowledge of a wide variety of advanced data analysis and machine learning techniques, and experience developing solutions to real-world business problems;
- Proficiency in programming languages such as Python and R;
- Good knowledge of NLP, and experience with open source NLP libraries such as Gensim;
- Experience with TensorFlow, Theano, or other deep learning frameworks preferred;
- Experience with Elasticsearch or a similar database, and the ability to conduct independent research using large unstructured data sets;
- Experience with data visualisation using open source tools.
What you get:
- Competitive salary
- Informal and thriving atmosphere
- First-class workplace equipment (hardware, tools)
- Medical insurance
- Modern office in the CBD (with an amazing view)
- No dress code
How to apply:
Please send your updated CV to Jobs@glenhill-group.com for more information and an informal chat. We regret that only shortlisted candidates will be contacted.
Glenhill Group EA Licence No: 16S8180 / EA Registration No: R1105270