A key objective in conducting a Bell test is to quantify the statistical evidence against a local hidden-variable model (LHVM), given that only a finite number of trials can be collected in any experiment. Statistical evidence is formulated in the framework of hypothesis testing, where the null hypothesis is that the experiment can be described by an LHVM. The confidence with which this null hypothesis is rejected is quantified by the so-called P value, where a smaller P value implies higher confidence. Establishing good statistical evidence is especially challenging if the number of trials is small or the Bell violation is very low. Here, we derive the optimal P value for a large class of Bell inequalities. Moreover, we obtain very sharp upper bounds on the P value for all Bell inequalities. These values are easily computed from the experimental data and remain valid even if we allow arbitrary memory in the devices. Our analysis can deal with imperfect random number generators and with event-ready schemes, even if such a scheme can create different kinds of entangled states. Finally, we review requirements for sound data collection and a method for combining P values of independent experiments. The methods discussed here are not specific to Bell inequalities. For instance, they can also be applied to the study of certified randomness or to tests of noncontextuality.
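To make the notion of a P value computed from experimental data concrete, the following is a minimal sketch and not necessarily the exact construction derived in this work. It assumes the CHSH inequality written as a game that any LHVM wins with probability at most 3/4 per trial, and perfectly uniform input choices; under these assumptions a binomial tail bounds the P value. The function names `chsh_p_value_bound` and `fisher_combine` are hypothetical, and Fisher's method is used only as one standard way to combine independent P values; it is an assumption that it matches the method reviewed here.

```python
from math import comb, exp, log, factorial


def chsh_p_value_bound(n_trials: int, n_wins: int, p_win_lhvm: float = 0.75) -> float:
    """Binomial tail bound on the P value for a CHSH-type game.

    Returns the probability of observing at least `n_wins` wins in
    `n_trials` trials when every trial is won with probability at most
    `p_win_lhvm` (assumed 3/4 for CHSH with uniform inputs).
    """
    if not 0 <= n_wins <= n_trials:
        raise ValueError("n_wins must lie between 0 and n_trials")
    if not 0.0 < p_win_lhvm < 1.0:
        raise ValueError("p_win_lhvm must lie strictly between 0 and 1")
    return sum(
        comb(n_trials, k) * p_win_lhvm**k * (1.0 - p_win_lhvm) ** (n_trials - k)
        for k in range(n_wins, n_trials + 1)
    )


def fisher_combine(p_values: list[float]) -> float:
    """Combine P values of independent experiments with Fisher's method.

    The statistic -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom under the null hypothesis; for an even
    number of degrees of freedom its tail has the closed form used below.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    t = -sum(log(p) for p in p_values)  # half of Fisher's statistic
    k = len(p_values)
    return exp(-t) * sum(t**j / factorial(j) for j in range(k))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    p1 = chsh_p_value_bound(n_trials=100, n_wins=85)
    p2 = chsh_p_value_bound(n_trials=200, n_wins=165)
    print(p1, p2, fisher_combine([p1, p2]))
```

A smaller returned value corresponds to stronger evidence against the LHVM hypothesis. An imperfect random number generator would raise the per-trial winning bound above 3/4; in this sketch that would be modeled by passing a larger `p_win_lhvm`, with the appropriate value left to the analysis in the main text.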