To design, build, and deploy high-quality solutions across client products, platforms, and applications, ensuring they meet data engineering and QA standards. To promote engineering best practices, act as the point of expertise for all data-related projects, and ensure standards and performance are met across the data engineering team. As a core member of our DB team, you will help implement our data strategy and transformation roadmap.

Based in London, our customer is a leading global law research, ranking, and directories business, ranking 'elite' (top 5%) law firms and lawyers in over 200 jurisdictions across the world.
The company is the 'Michelin star' of the legal world, and law firms invest considerable time trying to be ranked.
The customer has 350 staff based in London, of whom 250 are researchers who are experts in given jurisdictions and practice areas.
The company is around 30 years old; in 2018 it was purchased by a private equity house with a view to modernizing and growing the business substantially.
This entails the digital transformation of all corporate processes and flows, which will be our main goal in both the short and the long term.

The customer's strategy is to transform from a print-based publishing business, with advertising as its prime revenue, to an online subscription business with data insight, ranking directories, and differentiated content at its core.
This means moving from a largely anonymous audience to one that is known, and thus building the strategic customer platforms and products of the future.

Requirements:
- 5+ years of professional data experience
- 3+ years of experience with Databricks and PySpark
- Excellent understanding of SQL and CosmosDB databases
- Experience with massive JSON files
- Ability to write clean and testable code using SQL and Python
- Excellent knowledge of designing, constructing, administering, and maintaining data warehouses and data lakes
- Knowledge of Azure Cloud Services
- Good understanding of T-SQL programming
- Good exposure to Azure Data Lake technologies such as ADF, HDFS, and Synapse
- Good knowledge of Data Governance, Data Catalog, and Master Data Management
- Knowledge of advanced analytics and model management, including Azure Databricks and Azure ML / MLflow, as well as deployment of models using Azure Kubernetes Service
- Excellent oral and written communication skills
- Highly driven, positive attitude, team player, self-motivated, and very flexible
- Strong analytical skills, attention to detail, and excellent problem-solving/troubleshooting
- Good time management skills
- Knowledge of agile methodology
- Knowledge of GitHub
- Prioritisation skills to handle a fast-paced, dynamic environment
- Experience in the media, publishing, research, or a similar consumer-focused industry (highly desirable)