Reproducible samples and analyses are critical for data quality, particularly in the monitoring and evaluation of project activities. Okay, some may say, but does it really matter at all? Yes, it does. It helps to set a seed for the future.

Evolution in project monitoring and evaluation

Let us take a capacity-development activity such as a training course as an example. In the past, it was often enough to write a “qualitative” description of the training contents, with some technical jargon and acronyms, and to mention the number of participants (hopefully also providing, or “estimating”, the percentage of women among them).

Monitoring and reporting project implementation was largely seen as a bureaucratic requirement. There was no structured way to learn from participants. There was virtually no systematic data collection for knowing how to improve training contents, methods, materials and results based on participants’ views and suggestions (feedback).

This fitted well with the traditional top-down approach to capacity development. In a way, it also made sense. Digital data-collection tools were complicated to use and considered expensive to maintain. They also represented more pressure on project managers and staff.

Project managers could be confident (and I am afraid some still are) that treating reports merely as regular “literary” exercises while focusing efforts on financial compliance would be enough. After all: “in the end we will write something nice in the implementation report”. Learning from project implementation and evolving from experience were suffocated by a binary logic of “success” or “failure”. In such a context, it is easy to miss the fact that “failed” experiments are important steps towards learning and, ultimately, towards impact.

Paradigm change

This context has been changing fast in the era of data abundance and analytics. Many still see terms such as “automation” and “machine learning” as threatening. Personally, I think that improvised, unstructured and scientifically weak monitoring, evaluation, accountability and learning systems have done enough harm in terms of lost resources and opportunities. This is particularly so in the public and international development sectors. It is great to see that things are finally evolving from discourse to practice.

Learning from experience is gradually becoming easier and cheaper. Powerful, freely available open-source tools such as R and Python make it easier to reduce sample-selection bias, but they require at least some basic knowledge of their syntax. Many organisations are still adapting to the paradigm shift from top-down, specialised expertise to a more collaborative and multidisciplinary data-driven approach to monitoring, evaluation and learning. This process requires data-science skills that blend computing and statistics while following professional monitoring and evaluation standards. Investment in human resources and targeted recruitment or contracting are key. Data management and analysis using traditional spreadsheet software such as MS Excel and conventional, proprietary statistical packages (e.g., SPSS and Stata) are no longer enough for a world with complex, unstructured data.

Sampling in a scientifically robust (but simple) way

A question that clients often ask me is how best to select participants for feedback surveys on activities such as training courses and events. With this and the context above in mind, I developed a very simple app with Shiny, hosted on shinyapps.io. This app generates a reproducible random sample list of numbers. Samples are reproducible because the app fixes the seed of R's random number generator with the function set.seed(44343).

You can access and use the app at: https://wferreira.shinyapps.io/randomsample/. You simply need to enter the total number of participants in the event or activity and a sample percentage. The appropriate percentage will depend on the size of your activity. After that, you can view the result in a reactive table and download the output in XLSX format (MS Excel).
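For readers who prefer to see the steps spelled out, here is a minimal Python sketch of the same workflow: take a total number of participants and a sample percentage, draw that share of participant numbers at random, and render the result as a one-column table. It is not the app's source code (the app is written in R and exports XLSX). The function names, the rounding-up rule for the sample size and the CSV layout are assumptions made for illustration only.

```python
import csv
import io
import math
import random


def sample_size(total, percent):
    """Number of participants to draw; rounding up is an assumption."""
    if total < 1 or not 0 < percent <= 100:
        raise ValueError("total must be >= 1 and percent in (0, 100]")
    return math.ceil(total * percent / 100)


def draw_participants(total, percent, seed):
    """Return a sorted list of participant numbers (1..total), drawn without replacement."""
    rng = random.Random(seed)  # dedicated generator, seeded for reproducibility
    chosen = rng.sample(range(1, total + 1), sample_size(total, percent))
    return sorted(chosen)


def to_csv(numbers):
    """Render the sample as CSV text with one participant number per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["participant_number"])  # hypothetical column name
    for number in numbers:
        writer.writerow([number])
    return buffer.getvalue()


if __name__ == "__main__":
    # 120 participants, 25% sample -> 30 numbers
    sample = draw_participants(total=120, percent=25, seed=44343)
    print(to_csv(sample))
```

The numbers in the output refer to positions in your participant list, so the list should be fixed (for example, sorted by registration order) before the sample is drawn.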

The “magic” here is that anyone who executes set.seed() with the same number between the parentheses, and then draws the sample with the same inputs, will see the same randomised sample. This makes the sample reproducible while avoiding the problem of sample-selection bias. So, people in the future can also learn from your experience with the assurance that you put some thought into data quality.

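It is also possible to draw reproducible samples in many other statistical computing languages. In Python, for example, you can import random, call random.seed() to set your seed number and then call random.sample() to get your sample. If you prefer NumPy, seed NumPy's own generator instead (for example, create one with numpy.random.default_rng() and call its choice() method), because random.seed() has no effect on NumPy's random functions. However, be aware that the seed number (44343) used as reference in the randomsample app will generate a different sample in Python, as the app is built in R and the two languages use different random number generators.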
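A minimal sketch using only Python's standard library (random.seed() followed by random.sample()), assuming 200 participants and a sample of 20, both hypothetical figures. It reseeds the generator before each draw and compares the results. The second seed, 2024, is arbitrary.

```python
import random


def reproducible_sample(population_size, k, seed):
    """Seed the global generator, then draw k distinct numbers from 1..population_size."""
    random.seed(seed)
    return random.sample(range(1, population_size + 1), k)


first = reproducible_sample(200, 20, seed=44343)
second = reproducible_sample(200, 20, seed=44343)
other = reproducible_sample(200, 20, seed=2024)

print(first == second)  # same seed, same sample
print(first == other)   # a different seed will almost certainly give a different sample
```

Reseeding inside the function keeps each call independent of whatever random numbers were drawn earlier in the session.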

The app’s source code is publicly available for download on GitHub. I hope that this helps others to learn more about these tools. Code contributions are very welcome too.

Let us learn for real. It is time to set.seed() for the future.

Written by: Eduardo W. Ferreira, PhD / Consultant, data scientist, trainer and facilitator. Eduardo supports designing, managing and evaluating projects and programmes for consultancy firms, non-governmental organisations, governments, research institutions and international organisations (Additional information).