PhD Position AI Alignment: Value Assessment for Open Models & AI Systems
Challenge: Developing, operationalizing, quantifying, and embedding complex human and legal values into alignment pipelines for AI systems, open-weights, and foundation models.
Change: Advancing from static, generic benchmarks to dynamic, automated validation and red-teaming frameworks tailored for high-risk deployments.
Impact: Enhancing police trustworthiness through AI alignment at the Netherlands Police and ensuring compliance with the EU AI Act by engineering measurably aligned AI systems.
Job description
AI alignment is the goal of making AI systems behave in line with human intentions and values, so that advanced AI systems operate safely and within the bounds of ethical standards and prevailing legal frameworks. As AI systems such as frontier LLMs, multimodal models, and autonomous agents proliferate and grow more capable, so do the risks of misalignment with human, organizational, and societal values through behavioral drift, hallucination, and adversarial exploitation. Validating models is crucial before decisions about implementation can be made, and it is equally important for continuous monitoring of systems in use and for facilitating effective human oversight of AI. This is particularly important in high-stakes environments like law enforcement. The main challenge is that validation must happen simultaneously along a range of values that matter in a law enforcement context: accuracy, but also fairness, reliability, trustworthiness, and more. How can we translate abstract democratic, organizational, and societal values such as algorithmic fairness, transparency, and explainability (XAI) into rigorous, quantifiable engineering metrics without sacrificing the general utility of these models and AI systems?
As a PhD student at TU Delft, you will conduct impactful research on two key aspects of advancing the responsible use of AI within the Netherlands Police. First, you will investigate the standards and values surrounding AI usage, particularly in the context of publicly available models. This entails defining which criteria these models must meet, beyond common considerations like bias and fairness. Second, you will design methods to systematically evaluate various models against these standards and values. This contributes to the responsible deployment of AI within policing in the Netherlands, advances our understanding of how to align AI models in practice, and makes the use of publicly available models more efficient.
1. Formalizing Value Taxonomies and Alignment Metrics
You will investigate the ethical, legal, and operational guardrails required for deploying open-weights foundation models in sensitive public-facing domains. Moving beyond superficial bias benchmarks, you will conduct deep-dive case studies within the Netherlands Police to map operational requirements to formal alignment criteria. You will define what constructs such as "trustworthiness" and "fairness" mean mathematically and procedurally when applied to complex law enforcement workflows.
2. Engineering Automated and Human-in-the-Loop Evaluation and Red-Teaming Pipelines
You will design and implement scalable methodologies to systematically stress-test, audit, and benchmark AI models against your established criteria. This includes exploring red-teaming methods, synthetic data generation for vulnerability probing, and investigating how downstream alignment techniques (e.g., DPO, RLHF, or constitutional AI) can be customized to enforce strict adherence to organizational values.
Your project is part of the Model-Driven Decisions Lab, a Netherlands Police - TU Delft initiative, where you will join an interdisciplinary community of four fellow PhD students who have already been hired. Together, you will share knowledge to tackle AI-assisted decision-making from different perspectives. To foster close collaboration with the stakeholders and work on practical implementation, you will spend 20% of your time at the Netherlands Police’s strategy and innovation division. Given the ethical and moral facets of your research, you will also work closely with colleagues of the Delft Digital Ethics Centre at the Faculty of Technology, Policy, and Management (TPM). Your home base will be the Web Information Systems research group at the Computer Science faculty (EEMCS). As an internationally diverse team of driven academics and students, we cultivate a welcoming and collaborative environment. We will give you all the support and training you need to evolve both personally and professionally. Learn more about your project at the Model-Driven Decisions Lab.
Job requirements
- You hold an MSc in computer science, data science, or another relevant subject such as ethics of AI, with practical machine learning/artificial intelligence courses and relevant project and thesis experience.
- You have a keen interest in AI alignment, human-AI interaction, and explainable AI, and enjoy collaborating with experts in different disciplines.
- You thrive on conducting research geared to real-world applications in the security domain and are intrinsically motivated to collaborate with the Netherlands Police.
- You harness your communication skills to work with different scientific and nonscientific stakeholders in different work cultures.
You have a good command of written and spoken English, as you will be working in an international environment. Since you will be working with the Netherlands Police, a good command of the Dutch language is also a prerequisite. This is a firm requirement because the project involves interacting with stakeholders and working with data in Dutch.
TU Delft (Delft University of Technology)
Working at TU Delft means contributing to solutions that really make a difference.
For over 180 years, we have been training engineers who make an impact worldwide in companies, government bodies, or as entrepreneurs. Our alumni turn knowledge into concrete solutions for the challenges of today and tomorrow.
These challenges are changing rapidly. That is why we focus on themes such as energy, climate, digitalisation, artificial intelligence (AI), and smart mobility every day. Our education and research are directly aligned with what society needs now and in the future.
At TU Delft, our people make the difference. With their knowledge and curiosity, our staff provide a high-quality education and conduct pioneering research that extends beyond the campus. You will have the opportunity to take the initiative, work with others, and grow as a professional.
Working at TU Delft means joining an international community of professionals and students. Together, we create knowledge, innovations, and solutions that help move the world forward.
Faculty of Electrical Engineering, Mathematics and Computer Science
The Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) brings together three scientific disciplines. Combined, they reinforce each other and are the driving force behind the technology we all use in our daily lives. Technology such as the electricity grid, which our faculty is helping to make completely sustainable and future-proof. At the same time, we are developing the chips and sensors of the future, whilst also setting the foundations for the software technologies to run on this new generation of equipment, which of course includes AI. Meanwhile, we are pushing the limits of applied mathematics, for example mapping out disease processes using single cell data, and using mathematics to simulate gigantic ash plumes after a volcanic eruption. In other words: there is plenty of room at the faculty for ground-breaking research. We educate innovative engineers and have excellent labs and facilities that underline our strong international position. In total, more than 1,000 employees and 4,000 students work and study in this innovative environment.
Click here to go to the website of the Faculty of Electrical Engineering, Mathematics and Computer Science.
Conditions of employment
Pending the screening result, a temporary employment contract as a researcher can be offered for up to 4 months, if requested by the candidate. This contract will be converted to a PhD contract upon a positive screening result. These are 5-year PhD positions; the additional fifth year (compared to a standard 4-year PhD program) accommodates the extra activities of getting to know the police organization and securing the results within it. Doctoral candidates will in principle be offered 5 years of employment, in the form of two employment contracts: an initial 1.5-year contract with an official go/no-go progress assessment within 15 months, followed by an additional contract for the remaining 3.5 years, provided that performance requirements are met.
Salary and benefits are in accordance with the Collective Labour Agreement for Dutch Universities, increasing from € 3059 per month in the first year to € 3881 in the fourth year, based on a full-time contract (38 hours per week), plus 8% holiday allowance and an end-of-year bonus of 8.3%. In the fifth year, you will receive a temporary monthly allowance based on the gross difference between salary scale P, step 3, and salary scale 10, step 3.
As a PhD candidate, you will be enrolled in the TU Delft Graduate School. The TU Delft Graduate School provides an inspiring research environment with an excellent team of supervisors, academic staff, and a mentor. The Doctoral Education Programme is aimed at developing your transferable, discipline-related, and research skills. The TU Delft offers a customizable compensation package, discounts on health insurance, and a monthly work costs contribution. Flexible work schedules can be arranged.
Will you need to relocate to the Netherlands for this job? TU Delft is committed to making your move as smooth as possible! The HR unit, Coming to Delft Service, offers information on their website to help you prepare your relocation. In addition, Coming to Delft Service organises events to help you settle in the Netherlands and expand your (social) network in Delft. A Dual Career Programme is available to support your accompanying partner with their job search in the Netherlands.
Additional information
If you would like more information about this vacancy or the selection procedure, please contact Dr. Ujwal Gadiraju via U.K.Gadiraju@tudelft.nl. As you will be working in the security domain, you must undergo a security screening carried out by the Dutch government before starting this position. This screening takes 2 to 3 months on average and can take up to 6 months. A positive outcome of the screening is a prerequisite for the contract for these PhD positions to come into effect. At least a BO screening is needed for these PhD positions.
Application procedure
Are you interested in this vacancy? Please apply no later than 10 August 2026 via the application button and upload the following documents:
- CV
- Motivation letter
You can address your application to Dr. Ujwal Gadiraju.
Doing a PhD at TU Delft requires a level of English proficiency that allows the candidate to communicate and interact well, participate in English-taught Doctoral Education courses, and write scientific articles and a final thesis. For more details, please check the Graduate School's Admission Requirements.
Please note:
- You can apply online. We will not process applications sent by email and/or post.
- As part of knowledge security, TU Delft conducts a risk assessment during the recruitment of personnel. We do this, among other things, to prevent the unwanted transfer of sensitive knowledge and technology. The assessment is based on information provided by the candidates themselves, such as their motivation letter and CV, and takes place at the final stages of the selection process. When the outcome of the assessment is negative, the candidate will be informed. The processing of personal data in the context of the risk assessment is carried out on the legal basis of the GDPR: performing a public task in the public interest. You can find more information about this assessment on our website about knowledge security.
- Please do not contact us for unsolicited services.