This is a growth hire, adding to Climate Policy Radar’s data science capacity. Reporting to the Head of Data Science, you'll work closely with our existing team of data scientists to analyse our datasets, build tools to deepen that analysis, and then productionise those tools for users to leverage in their own research.
As a skilled generalist within NLP, you'll work on a range of interesting problems here, with a range of interesting people! You'll collaborate with climate experts to understand the problems we need to solve, and with software engineers to take your work from research through to production.
About the data science team
Climate Policy Radar's engineering org is split into a few functional teams: programmes (domain experts), platform (high-quality data sharing), application (user-facing tools), and data science (models and evaluation). Despite those splits, most of our work is cross-functional.
The data science team builds the models that power our search engine, classifiers, and LLM workflows, and the evaluation frameworks that keep them honest. Our work directly informs policy decisions, so we care a lot about evaluation, monitoring, and minimising bias. We're research-informed but production-focused, and we default to working in the open by publishing datasets, models, and papers wherever we can. We also share what we've learned through blogs, papers, and public talks.
Tech preferences
The vast majority of the data science team’s work is written in Python, so we’ll expect you to read and write it fluently!
In addition, here’s an (unordered) list of tools we work with regularly. We’re always open to suggestions for new tools you've had success with, but familiarity with these will help you integrate with our existing stack:
ML & NLP: PyTorch, Hugging Face Transformers, Pydantic AI, Pandas, spaCy
APIs & backend: Python, FastAPI, Pydantic
Data & infrastructure: DuckDB, Snowflake, PostgreSQL, Docker, AWS (ECS, S3, etc.), Pulumi, Prefect, GitHub Actions
Search & demos: Vespa, Streamlit
Testing, evaluation & monitoring: pytest, Hypothesis, Weights & Biases, PostHog
Development tools: GitHub, Cursor, Claude Code
What will you be working on?
Climate Policy Radar's core product is a search engine for 30,000+ climate documents. That engine is supported by a set of single-class classifiers which detect mentions of important climate concepts (e.g. flooding, or deforestation). In your role, you're likely to:
Work closely with the programmes and platform teams to help shape our research & development priorities
Develop, evaluate, and deploy state-of-the-art NLP models which extract structured information from climate-related documents, with an emphasis on small, efficient architectures
Help to build and deploy APIs which demonstrate new model capabilities, for other teams to prototype around
Contribute to building evaluation harnesses for search relevance, classifier accuracy, data quality, etc., which allow other teams to iterate on their work reliably and independently
Implement approaches for data labelling or sampling to more efficiently create training and evaluation datasets
Analyse real user search logs and behavioural analytics to describe users' journeys across our site and understand their needs
Research the use of LLMs to automate tedious parts of the climate policy research process, and partner with experts to build evaluation datasets which ensure we retain our high standards for accuracy and minimising bias