Public procurement refers to governments’ purchasing of goods, services, and public works. It accounts for a large share of government budgets. According to the OECP (the French Economic Observatory of Public Procurement), procurement contracts are believed to amount to some €200 billion a year in France, or 10% of GDP. Yet, as one of the main areas where the state and the private sector interact extensively, public procurement is exposed to the diversion of public resources toward interests other than the public good. Procurement processes may thus involve corrupt transfers between state officials and private-sector firms, for reasons ranging from personal interest to the funding of political parties. As a 2005 International Monetary Fund (IMF) report puts it, “corruption in public procurement is the most severe type of corruption”. The World Bank has estimated that roughly USD 1.5 trillion in public contract awards are influenced by corruption, and that the volume of bribes changing hands for public sector procurement alone is about USD 200 billion per year.

In its ‘Compendium of good practices on the use of open data for Anti-corruption’ (2017), the Organisation for Economic Co-operation and Development argues that “Open data can help increase government performance by enabling decision-makers to design better policies for anti-corruption and follows up on their effective implementation. Not only Open Data allows the provision of incentives to avoid illegal acts by increasing the odds of exposing governmental misconduct, but it can help discover and dismantle corrupt activities by facilitating critical information, tools and mechanism for judicial enforcement, and for media and society to detect the abuse of entrusted power for private gain.”

This focus goes hand in hand with regulatory developments. The new public procurement rules adopted pursuant to the European directives transposed into French law (Directive 2014/24/EU on public procurement, Directive 2014/25/EU on utilities, and Directive 2014/23/EU on concession contracts) explicitly state that: “the traceability and transparency of decision-making in procurement procedures is essential for ensuring sound procedures, including efficiently fighting corruption and fraud. […]. The essential elements and decisions of individual procurement procedures should be documented in a procurement report. […] The electronic systems for publication of those notices, managed by the Commission, should also be improved with a view to facilitating the entry of data while making it easier to extract global reports and exchange data between systems.”

It was in that vein that France announced its willingness to support the implementation of the Open Contracting Data Standard during the London G20 Anti-Corruption Summit in May 2016. The implementation of the Open Contracting Data Standard (developed by the Open Contracting Partnership, which supports DeCoMaP; cf. Appendix) aims to foster public sector transparency and to fight corruption and nepotism in public procurement by following an open-by-default approach throughout the whole public contracting process.

DeCoMaP lies at the intersection of these three considerations: regulatory developments, open data, and automatic tools for fraud detection and economic analysis. Since public procurement data are essentially relational in nature, we will notably use machine learning and graph-based approaches to model and automate fraud detection. DeCoMaP is innovative in its field of inquiry (France), in its methodology, and in its multidisciplinary approach.

The design of automatic tools for fraud detection in public procurement is not new and has been under way for several years (see Section b1). However, no such tool has been adapted to the French public procurement legal framework and its data. Moreover, the detection methods applied so far have three major limitations. First, no real ground truth has been defined to date, although one is required to assess the performance of automatic fraud detection tools objectively. Second, these methods rely mainly on standard regression tools and do not take advantage of recent developments in machine learning. Third, they focus on individual data that characterize customers and suppliers independently, and therefore ignore the relational information derived from their interrelationships. Existing tools thus need to be improved to take into account the interactions between the actors concerned. This project aims to solve these problems and is divided into three phases.