View this issue as a PDF: CrimHighlightsV18N4.pdf
This issue of Criminological Highlights addresses the following questions:
- Does being detained prior to trial affect the likelihood of conviction?
- How do US colleges respond to applicants with criminal records?
- What’s wrong with predictive models of sentencing?
- Are objective-looking tools for predicting repeat domestic violence useful?
- How can the use of risk assessment tools increase a youth’s risk of reoffending?
- How does the strength of the evidence used to convict people affect the sentence that they get?
- Do judges and lawyers understand the reliability and validity of psychological evidence?
- Are sex offender registration and notification laws useful?
People who are detained prior to trial are more likely to be found guilty than are equivalent people who are released into the community while awaiting trial.
Nobody is likely to be surprised by findings suggesting that those who are detained prior to trial are more likely to be found guilty than are people who are released prior to trial. Some of the factors that lead to detention (e.g., criminal record) are also factors that may increase the likelihood of findings of guilt and/or of harsh punishments.
In order to determine whether it is detention per se that increases the likelihood of a finding of guilt, this study compares those who are detained with accused people who have an equal probability of being detained but, for unpredictable reasons, happen to be released.
Cases from six large urban counties in Florida in which an accused was detained were matched with cases in which the accused was released. This was done using a technique called ‘propensity score matching’ in which the likelihood of release was predicted for each person using such variables as offence severity, prior criminal record, age, race, offence, number of charges, whether a cash bail had been offered (but not necessarily met), prior arrests, and type of lawyer. People who were released were then matched (on their ‘propensity score’) with people who were detained. The main outcome variable that was examined for these two groups was the likelihood of being convicted.
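The matching step described above can be illustrated with a minimal sketch. It assumes each case already has an estimated propensity score (for example, from a logistic regression on the variables listed) and pairs cases by greedy one-to-one nearest-neighbour matching within a caliper. The summary does not say which matching algorithm the study used, so the algorithm, the function name `match_on_propensity`, the caliper of 0.05, and all case IDs and scores below are illustrative assumptions rather than the study's method or data.

```python
def match_on_propensity(detained, released, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    detained, released: lists of (case_id, propensity_score) tuples.
    Returns a list of (detained_id, released_id) pairs whose scores
    differ by no more than `caliper`.
    """
    available = dict(released)  # case_id -> score; shrinks as matches are used
    pairs = []
    for d_id, d_score in sorted(detained, key=lambda c: c[1]):
        if not available:
            break
        # Closest still-unmatched released case.
        r_id = min(available, key=lambda rid: abs(available[rid] - d_score))
        if abs(available[r_id] - d_score) <= caliper:
            pairs.append((d_id, r_id))
            del available[r_id]
    return pairs


# Hypothetical scores (predicted probability of release), invented for illustration.
detained = [("D1", 0.30), ("D2", 0.55), ("D3", 0.90)]
released = [("R1", 0.32), ("R2", 0.58), ("R3", 0.60), ("R4", 0.10)]

pairs = match_on_propensity(detained, released)
print(pairs)
```

In this toy example the detained case with a score of 0.90 has no released counterpart within the caliper and is left unmatched. Conviction rates would then be compared only across the matched pairs.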
The comparison of the matched accused (released vs. detained) showed that about 63% of the accused people who were released were found guilty whereas 71% of the accused people who were detained prior to trial were found guilty. It is also worth noting that many people were offered a chance at putting up cash bail, but were not able to and, as a result, were detained. What is not clear from these results is why those who were detained were more likely to be convicted. It could be that individuals who were detained were less able to prepare a defence or that detained individuals simply pleaded guilty at a higher rate in order to speed up the process.
Conclusion: Detention before trial increases the likelihood that an accused will be convicted. Although 41% of the original sample were in fact detained, 93% were offered some form of release, but were unable to meet the conditions of release (the deposit of cash bail). In other words, much of the impact of pretrial detention can be traced to monetary bail terms that inherently disadvantage the already disadvantaged part of the community. To the extent that other conditions often required for release (e.g., stable housing away from those with criminal records or an acceptable surety) also restrict release of members of disadvantaged groups, it would seem that even justice systems that do not rely on cash bail may end up convicting people in part for who they are as well as what they might have done.
Reference: Lee, Jacqueline G. (2019). To Detain or Not to Detain? Using Propensity Scores to Examine the Relationship Between Pretrial Detention and Conviction. Criminal Justice Policy Review, 30(1), 128-152.
Many US post-secondary educational institutions require criminal history information from those applying for admission. In an experimental study examining actual admissions decisions, applicants with criminal records applying to 4-year colleges were more likely to be rejected than equivalent applicants without records.
There is substantial evidence that those with criminal records are disadvantaged when seeking employment and housing (e.g., Criminological Highlights 17(2)#6, 18(3)#6). Given that “Higher education has long been considered an instrument of social mobility and social cohesion” (p. 157) it is important to examine whether those with criminal records are being excluded from these opportunities.
Studies of youths typically demonstrate that most youths have committed what are considered to be criminal offences. Because most are never convicted, most of these ‘offences’ do not affect their later lives. However, many US colleges – especially the most competitive institutions – require applicants to disclose their criminal records. Over 80% of the top third of US colleges on ‘competitiveness’ ask criminal history questions on their applications, ostensibly because of concerns related to violence, drug use, and other crime. There is, however, no evidence that having “criminal history questions on college applications are effective tools for reducing campus crime” (p. 161).
People were recruited to participate in this study as applicants to 4-year US colleges. Their actual high school transcripts, test scores, and criminal records were used. Other materials were produced as needed for the different applications. Those with criminal records had a single felony conviction (for aiding and abetting robbery or for burglary). The applicants with records were matched with a young person without a record such that the person with the criminal record was always slightly better qualified academically. Two pairs of male testers were used. When applying to a particular college, both testers described themselves as either Black/African American or White. Criminal record was conveyed to the college only when it was asked for. The colleges that were tested were ones in which admission was plausible; the most competitive colleges were not included. 280 pairs of applications (one with a record, the other not) were completed and sent to each college. The pairs clearly were comparable: the two members of each pair (one with a record, one without) were admitted at equal rates at colleges without questions about criminal records.
At colleges with questions about criminal records, 10% of people without criminal records were rejected compared to 25% of people with criminal records and slightly stronger applications. The presence of a criminal record disadvantaged both Black and White applicants. A criminal record appeared to be a bigger disadvantage at colleges in higher crime locations. In addition, it was discovered in the process of carrying out the study that “it was far more difficult to complete applications that requested criminal history information” (p. 179) because of the need to file additional information. Other studies suggest, not surprisingly, that those with criminal records are likely to drop out of the application process in part because of the extra required information and effort.
Conclusion: It is clear that 4-year colleges discriminate on the basis of criminal records for both Black and White applicants. Applicants – both Black and White – with criminal records are more likely to be rejected than those without criminal records, even though they are equally, or perhaps slightly more, qualified for admission and the records had no obvious relevance to higher education. Given that the presence of a criminal record is not equally distributed across groups, it means that certain groups are excluded from this opportunity to succeed in modern society.
Reference: Stewart, Robert and Christopher Uggen (2020). Criminal Records and College Admissions: A Modified Experimental Audit. Criminology, 58, 156-188.
It has been well known for decades that predictions of future violent offending are more often than not wrong. A sizable majority of people placed in preventive detention awaiting trial, and a sizable majority of people sentenced to longer sentences because of violence predictions, would not have committed violent crimes.
Forty years ago, 4 of 6 people predicted to be violent were not. Today, using the best instruments, 3 of 5 predictions of future violence are wrong. Much more is now known than in earlier times. The mistaken predictions disproportionately affect members of racial and other minority groups and are in large part based on factors such as age, gender, and socioeconomic status that are per se unjust. They are heavily based on criminal history factors that result in part from racial profiling, stereotyping, and police bias.
“Minority offenders are more often incorrectly predicted to be violent than are white offenders” (p. 440). All prediction instruments use socioeconomic status variables such as employment, marital status, education, and residential stability that penalize members of minority groups. Criminal history variables exaggerate differences between minority and white offenders and increase racial disparities. “Increasing the severity of a sentence on the basis of risk prediction in effect punishes many offenders for crimes that would not have happened” (p. 440).
In some (largely US) sentencing systems, risk is explicitly a determinant of sentence severity. In others, such as Canada’s, risk enters in subtler ways – in the purposes of sentencing and the use of criminal record in determining the sentence even though sentences are supposed to be proportional to the harm done. But in addition, commercial risk instruments (see Criminological Highlights 17(2)#1, 17(6)#7) with serious limitations have become popular because they appear to be based on modern artificial intelligence principles and do not explicitly involve race or ethnicity as predictors.
The dilemma is a stark one. Some people believe it is “irresponsible not to use state-of-the-art prediction methods” (p. 447). Others believe that offenders should be punished only for what they have done. Some people acknowledge that violence predictions are often inaccurate (just as some of those exposed to an illness do not get it but are quarantined to prevent spread of the disease) but believe society is better served by being cautious and incarcerating those who might offend. Those incarcerated people who would not have reoffended, however, are unlikely to accept that they are being treated justly, especially since the predictions are often implicitly based on factors (e.g., age, race, gender, socioeconomic status) over which they have no or minimal control. Few disadvantaged people in any meaningful sense choose to be poorly educated, to lack work skills, or to have had unstable home lives.
The use of a person’s criminal history to support a harsher sentence is typically justified by the assumption that those with extensive criminal histories are more likely to reoffend than are first offenders. The result, as others have noted, is that a person sentenced at one time for three offences (e.g., three burglaries) will be punished considerably more leniently than will a person who committed the same offences but is sentenced separately for each. Proportionality is lost when criminal record enters the calculation. Scandinavian countries recognize this and typically acknowledge that a person should not be punished twice for the earlier offences. Moving away from the sometimes enormous weight given to criminal record in sentencing would almost certainly reduce (but not eliminate) disproportionate imprisonment rates for various groups (e.g., Blacks in the US; Indigenous people in Canada).
Conclusion: A sentencing system, or, more generally, a criminal justice system less based on prediction would, as a result, impose punishments based on the blameworthiness of the offender and the seriousness of the crime. It would also restore the idea that who the offender happens to be is less important than what the offender has done. The challenge, however, is that “Prevention concerns and prevailing emotionalism may make elimination of preventive sentencing unachievable” (p. 474). On the other hand, it needs to be remembered that sentencing decisions in most civilized countries in the world are not normally based on predictions of future violence and many have low crime and imprisonment rates.
Reference: Tonry, Michael (2019). Predictions of Dangerousness in Sentencing: Déjà Vu All Over Again. Crime and Justice: A Review of Research, 48, 439-482.
The prediction tool used most frequently by British police forces in domestic violence cases to assess the risk for future domestic violence is found to have failed to give substantial assistance to police officers in identifying high-risk re-victimization or recidivism cases.
“One of the most notable reforms on policing domestic abuse internationally has been the introduction of standardized risk assessment” (p. 1013). The purpose in using these instruments, obviously, is to identify perpetrators who are likely to reoffend and to focus interventions (e.g., pretrial detention or special conditions of release) on them.
Though logically such an approach makes sense, the success of such instruments has not been great. Although they may show statistically significant effects larger than chance, the overall accuracy of many of these measures is weak. Said differently, there are many false negatives (recidivism that is not identified by the test) and false positives (people who are predicted to commit offences, but, in fact, do not). A separate question, of course, is what these ‘objective’ tests should be compared to. An earlier study (Criminological Highlights 3(2)#7) found that domestic violence victims were at least as accurate in predicting future violence as were the ‘objective’ measures.
In this study, data from 41,570 intimate partner violence (IPV) incidents were examined in which a British standardized risk assessment tool – the Domestic Abuse, Stalking, and Honour Based Violence (DASH) form – was administered. Data from 19,510 non-IPV cases were also included in some analyses. DASH uses data from 27 questions that are asked of the victim and is described as a “structured professional judgement scale” in which final judgements are made either by a frontline officer or a police specialist, on the basis of the data collected by the officer. Risks are described as high (victim at risk of serious harm), medium (serious harm unlikely unless the circumstances change) and standard (no evidence of the likelihood of serious future harm).
5.6% of the original victims reported being revictimized by the same person within one year of the occurrence in which DASH data were collected. For every 100 people who, in fact, were revictimized, about 6 were initially given a ‘high risk’ rating, 27 were given a ‘medium risk’ rating, and 67 were given a ‘standard’ (low risk) rating. Of those initially given ‘high risk’ ratings, only about 10% actually reoffended. This was a higher rate than among those initially assessed as having a ‘standard’ risk, of whom only about 5% reoffended. But to say that the reoffending rate in the ‘high risk’ group was twice that in the ‘standard’ group ignores the fact that 90% of the ‘high risk’ group did not reoffend.
Even if the ‘medium risk’ people were considered ‘high risk’, the instrument would have identified only about 33% of those who reoffended. More dramatic is the fact that, using this cutoff (high and medium risk), only about 8% of those predicted to reoffend actually did. Said differently, if coercive interventions had been used, solely to stop reoffending, on all of those rated as having a ‘medium’ or ‘high’ likelihood of reoffending, the intervention would not have been justified for 92% of them.
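The arithmetic behind these figures can be checked with a short calculation. The sketch below takes the three rounded percentages reported above (the 5.6% revictimization rate, the 33% of revictimization cases rated high or medium, and the 8% of high or medium cases that were revictimized) and uses Bayes' rule to back out the share of all cases that must have been rated high or medium. That implied share is a back-of-envelope figure derived from rounded numbers, not a value reported in the summary, and the function name is hypothetical.

```python
def implied_flagged_share(base_rate, sensitivity, ppv):
    """Share of all cases flagged, implied by Bayes' rule:
    ppv = sensitivity * base_rate / flagged_share."""
    return sensitivity * base_rate / ppv


base_rate = 0.056    # revictimized within one year (from the text)
sensitivity = 0.33   # share of revictimization cases rated high or medium
ppv = 0.08           # share of high/medium cases that were revictimized

flagged = implied_flagged_share(base_rate, sensitivity, ppv)
false_positive_share = 1 - ppv      # flagged, but no recorded revictimization
missed_share = 1 - sensitivity      # revictimized, but rated 'standard'

print(f"implied share of cases rated high or medium: {flagged:.1%}")
print(f"flagged cases with no recorded revictimization: {false_positive_share:.0%}")
print(f"revictimization cases rated 'standard': {missed_share:.0%}")
```

On these rounded inputs, roughly a quarter of all cases would have been rated high or medium, yet two thirds of the revictimization cases would still have been rated ‘standard’.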
Other approaches – logistic regression and various machine learning methods – were used with the IPV and non-IPV data to see if more accurate predictions were possible using more sophisticated approaches. These more sophisticated approaches did not improve predictions, possibly because of unmeasured differences in the incidents (above and beyond the IPV/non-IPV distinction) or because of low reliability of the initial measures.
Conclusion: This highly used prediction instrument clearly has relatively low validity – leading to high rates of false positives and false negatives in the prediction of intimate partner violence as well as other types of violence. Hence the results underline the more general conclusion that the use of comprehensive measures (27 separate questions in this measure) or sophisticated looking measures (as in various high-tech approaches – see Criminological Highlights 17(2)#1, 17(6)#7) is unlikely to predict future violence adequately.
Reference: Turner, Emily, Juanjo Medina, and Gavin Brown (2019). Dashing Hopes? The Predictive Accuracy of Domestic Abuse Risk Assessment by Police. British Journal of Criminology, 59, 1013-1034.
Being arrested for a criminal offence increases youths’ likelihood of reoffending. But this contact with the criminal justice system also increases youths’ measured risk of recidivism, furthering the likelihood, therefore, that they will in the future end up deeper in the youth justice system if “risk assessment” is applied to them.
The measurement of criminogenic risk is now seen as “an integral evidence-based component of the criminal justice system” as if identifying and intervening to reduce criminogenic risk factors was uncontroversial. “Proponents suggest that criminogenic risk assessment can improve sentencing procedures, facilitate jail diversion, reduce prison populations, help scale down mass incarceration, improve policing, reduce violence and correctional spending, increase resources for community development, and ultimately prevent crime altogether” (p. 478). In other words, some people see risk prediction as contributing to solving many criminal justice problems.
Risk assessment tools typically measure antisocial behavior, antisocial personality patterns, antisocial cognitions and antisocial associates. Interventions – often imposed or applied as a result of risk assessment – traditionally attempt to lower one or more of these.
The only problem, as has been shown elsewhere (e.g., Criminological Highlights 18(4)#3), is that there are some risks of using risk assessments. And these risks are not evenly distributed across those subjected to the assessments. Furthermore, “exposure to the criminal justice system [may increase] some of the risk factors used to predict recidivism and re-arrest; therefore, risk factors for recidivism and onset/duration of exposure to the criminal justice system are not interchangeable” (p. 480).
This study examines data from the Pittsburgh Youth Study involving boys entering Grade 1 in 1987-8. It oversampled more troubled boys. Measures of antisocial attitudes, behaviours, and the number of anti-social peers were obtained, as well as a host of control variables. A variety of different statistical techniques were applied to the data to assess the impact of arrest and convictions.
The data demonstrated that arrest significantly increased antisocial attitudes, antisocial behaviours (the frequency of minor, moderate, and serious delinquency) and the number of anti-social peers. Conviction for an offence significantly increased antisocial attitudes and the number of anti-social peers, but had a less consistent effect on anti-social behaviours.
The challenge that these data present is with the idea that the risk factors that are measured are simple “causes” of crime. The risk factors measured when a youth is in custody are, at least partially, a result of the youth having been brought into the criminal justice system. In simple words, then, the use of risk assessments can increase crime because those subject to them have criminogenic experiences imposed on them by the justice system. Traditional risk assessments, then, “do not fully distinguish between individual-level propensities for criminal behavior and the criminalizing effects of the criminal justice system itself” (p. 484).
Conclusion: If a youth comes from a troubled neighbourhood, that youth may be more likely to be arrested than a youth from a more affluent neighbourhood. The fact that the youth may have a higher ‘risk’ score would appear to suggest that more intrusive interventions might be warranted. This, however, ignores the fact that the higher risk score might, in fact, be the direct result of criminal justice interventions. The more intrusive intervention (arrest rather than release with a warning for example) is likely to increase ‘risk’ and subsequent offending. Interventions, then, should be aimed at reducing the absolute number of people “flowing into criminological risk. Population prevention strategies, versus population management strategies, would aim to reduce first exposure to the criminal justice system, not merely to deploy criminogenic risk assessments during or after first exposure.” (p. 487).
Reference: Prins, Seth J. (2019). Criminogenic or Criminalized: Testing an Assumption for Expanding Criminogenic Risk Assessment. Law and Human Behavior, 43(5), 477-490.
Judges, at sentencing, appear to revisit the decision to find an accused person guilty. They give harsher sentences in cases where the decision to find an accused guilty was based on stronger evidence.
Sentencing is usually assumed to be largely determined by such factors as the seriousness of the offence(s) of conviction and the criminal record of the person being sentenced. This paper suggests that the confidence the judge has in the appropriateness of the verdict also affects the severity of the sentence.
The paper uses 484 case files completed between 2003 and 2006 in five American locations. The accused in these cases was convicted of a homicide offence, rape, robbery or aggravated assault and, in almost all cases, was sentenced to prison (only 2% received a non-prison sentence). The focus of the study was the weight of the evidence entered into the court proceedings and its relationship to sentence severity. Three separate measures of the strength of the evidence were employed: presence of a forensic lab report, the quantity of physical evidence, and presence of an eyewitness. 42% of the cases involved examination of one or more pieces of physical evidence in a crime lab. 57% of cases involved eyewitness evidence.
Control variables included the offence (and whether, in cases other than homicide, it involved an injury to the victim), number of prior convictions, whether the case was resolved by a plea or a trial, sex and race of the defendant, sex and race of the victim, and court location.
Above and beyond the control variables, cases in which there was a large amount of evidence against the accused and cases in which a forensic lab report was introduced by the prosecution were each associated with longer custodial sentences for the defendant. The presence of an eyewitness was unrelated to sentence severity.
Perhaps a more ‘pure’ test of the idea that sentence lengths are longer in cases in which there is more convincing evidence comes from the 158 cases in which an accused was convicted at trial. The results were substantially the same: harsher sentences were handed down in cases in which the conviction was apparently based on stronger evidence.
It is possible, of course, that the weight of forensic evidence, itself, is dependent on law enforcement priorities and efforts. If that is the case, “the status characteristics of the parties to the offence actually produce specific types of evidentiary packages. The degree of effort exerted by law enforcement to not only collect multiple types of physical evidence during their investigations but also to expend resources on forensic examination of some or all of that evidence may help explain certain of the sentencing disparities often directly attributed to the race and class of victims and offenders” (p. 381-2).
Conclusion: “These findings provide cause for concern because the type and quantity of the evidence used to establish elements of the crime at the point of conviction, especially in trial cases, should have little or no bearing on the severity of sentence imposed by judges, once offence severity is considered” (p. 381). Nevertheless “when judges use evidentiary strength as a basis for their decisions in the sentence phase post trial, they are in effect revisiting the guilt-phase decisions of the trier of fact and are thus overstepping to evaluate the merit of the case after it has already been decided” (p. 381).
Reference: Nir, Esther and Elizabeth Griffiths (2018). Sentencing on the Evidence. Criminal Justice Policy Review, 29(4), 365-390.
Judges and lawyers do not appear to understand issues related to the reliability and validity of expert psychological evidence offered in court and as a result are not good at deciding whether to admit or exclude this kind of evidence.
In most jurisdictions, judges are expected to exclude from court expert testimony that is unreliable or invalid. The problem is that “Although judges may make a good faith effort to distinguish between low- and high-quality expert testimony, they may lack the skills necessary to detect flaws in research” (p. 543). In one survey of US state judges, only about half thought that they had the skills to do this kind of filtering of evidence.
Most lawyers report that they routinely attempt to exclude expert evidence introduced by opposing counsel. But arguments in court about admitting this evidence, or arguments to a judge or jury about the weight that it should be given, each assume that counsel can evaluate scientific evidence and communicate its strengths or weaknesses. The situation becomes more problematic, obviously, for jurors when they are evaluating evidence since they may assume that, if the evidence is admitted, it must be reliable and valid.
This study evaluated judges’, lawyers’, and ordinary people’s ability to evaluate the reliability of evidence and a crucial component of the validity of evidence – whether the expert could have biased the outcome of an intelligence test. For a test to be reliable, it would normally be expected that the components of it would all be measuring the same thing and thus would be highly correlated, that the test, if repeated, would give essentially the same results, and that two testers would arrive at the same conclusion.
95 judges and 91 lawyers in the US were presented with a written version of a case that centred on the use of an intelligence test with a young person. One group was told that the clinician administering the test knew that the young person’s IQ was expected to be low; the others were told that the person administering the test did not have any specific expectation of the result. Some respondents were given indicators that the test was relatively unreliable; the others were given information suggesting that it was a highly reliable test. Judges were asked whether they would exclude the evidence; lawyers were asked whether they would move to exclude the (opposing) evidence. They were also asked to rate the scientific quality of evidence.
Judges and lawyers indicated awareness of the issue of the importance of the validity of the evidence and they noticed the manipulation of the possible ‘expectation’ of the clinician administering the test. Nevertheless, judges’ decisions on admitting the evidence were not influenced by the manipulations of validity and reliability. “The justifications that judges provided for their admissibility decisions suggest… that judges… will exclude evidence when they believe it is unreliable or invalid; the problem is that they cannot recognize when the evidence actually is unreliable or invalid” (p. 550). But in addition, “variations in the validity of the testing conditions did not affect the likelihood that attorneys would move to exclude the expert’s testimony” (p. 551).
In a separate experiment, ordinary community members were presented with similar evidence (that varied in reliability and validity), with some of each group also hearing naïve or scientifically informed cross examination (from the opposing lawyer or from the judge). Jurors were apparently unaffected by the quality of the scientific evidence or the quality of the cross-examination of the expert.
Conclusion: It would appear that judges, lawyers, and ordinary people have trouble dealing with issues of validity and reliability of expert psychological evidence. The result is that assertions and ‘findings’ are accepted – or not – but the decision to accept the evidence does not seem to be highly responsive to the scientific quality of the evidence. To the extent that legal cases depend on proper evaluation of scientific evidence by judges, lawyers, as well as jurors, it would appear that there is a need to address ways in which decision-making on scientific issues might be improved.
Reference: Chorn, Jacqueline Austin and Margaret Bull Kovera (2019). Variations in Reliability and Validity Do Not Influence Judge, Attorney, and Mock Juror Decisions about Psychological Expert Evidence. Law and Human Behavior, 43(6), 542-557.
A careful analysis of the effects of sex offender registration and notification (SORN) laws in Houston, Texas using data from a 35 year period finds that the original law, as well as changes which broadened its impact, had no overall impact on the incidence of sexual offences. Furthermore, the law had no apparent effect on four subsets of these offences/offenders: sexual assaults, sexual offences against children, first-time offenders, and repeat offenders.
Quick fixes to crime problems are almost always popular. One such ‘quick fix’ is sex offender registration and notification (SORN) laws. These laws vary considerably and have been subject to a fair amount of research, the results of which are consistent: they do not prevent sex offending (Criminological Highlights 5(6)#1, 7(4)#4, 8(6)#5, 9(2)#7, 10(3)#7, 11(4)#7, 11(6)#6, 12(5)#4, 14(5)#8, 18(1)#8).
However, these laws do vary. Hence it is worth examining whether a law – and changes to it – might have an impact in a local context, in this case Houston, Texas, the fourth largest city in the US. The original Texas law was implemented in 1991, requiring all those then in custody (as well as those subsequently imprisoned) to register. In 1997, the law was extended, retroactively, to anyone convicted of a sex offence after 1970. In 2005, it was extended to those living in Texas, but whose convictions were elsewhere.
In this study, the impact of the original law (and changes to it) was examined for different outcome measures using monthly data on the number of charges filed for (a) all sexual offences, (b) sexual offences involving first time offenders, (c) sexual offences involving repeat offenders, (d) sexual assaults, and (e) sexual offences against children. As one way of ruling out effects that might not be caused by the implementation of SORN laws, changes in the number of non-sexual assaults were also examined. Data were collected for 424 months starting on 1 January 1977 (before the first law came into effect) until 30 April 2012 (about 7 years after the last change).
Because there were sufficient observations both before the laws were first implemented and after the last set of changes, it was possible to use ‘interrupted time series’ analyses to examine the impact of the change in law. Various controls were included to remove both seasonal trends and overall trends across the full time period that were not related to changes in the law. Various models were examined, but the obvious expectation was that if the law was effective, it “should have an abrupt and permanent impact on sexual offence case filings as would each of the two subsequent modifications…” (p. 1502).
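The logic of an interrupted time series with an "abrupt and permanent" effect can be sketched as a segmented regression: monthly counts are modelled as an intercept, a linear trend, seasonal terms, and a step variable that switches from 0 to 1 when the law takes effect. The coefficient on the step is the estimated level shift. This is a simplified illustration under stated assumptions, not the study's model (its exact specification is not given here). The series is synthetic with a known step of -4 built in, the month index used for the 1991 law is a placeholder because the implementation month is not stated above, and the small `ols` helper is written out only to keep the example free of third-party libraries.

```python
import math


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination. Adequate for a few well-scaled columns."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


N_MONTHS = 424   # January 1977 through April 2012 (from the text)
LAW_MONTH = 168  # ASSUMED index for the 1991 law (January 1991); placeholder only


def design_row(t):
    return [
        1.0,                                # intercept
        t / 12.0,                           # linear trend, in years
        1.0 if t >= LAW_MONTH else 0.0,     # abrupt, permanent level shift
        math.sin(2 * math.pi * t / 12.0),   # seasonal terms
        math.cos(2 * math.pi * t / 12.0),
    ]


X = [design_row(t) for t in range(N_MONTHS)]

# Synthetic, noise-free series with a known step of -4, invented to check the estimator.
true_beta = [60.0, 0.5, -4.0, 3.0, 1.0]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]

beta = ols(X, y)
step_effect = beta[2]
print(f"estimated level shift at the law change: {step_effect:.3f}")
```

With real filing counts, a step coefficient indistinguishable from zero after controlling for trend and seasonality would correspond to the kind of null result reported for the Texas laws. Each later modification of the law could be added as a further step variable in the same way.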
The results of the study are easy to summarize and are consistent with most of the existing research on SORN laws. Texas’ SORN laws “generally reveal a lack of relationship between SORN laws and rates of all sexual offenses, or with specific measures of sexual assaults against adults, or sexual offences against children” (p. 1504). Not surprisingly, most of the sexual offences that were reported were committed by those who had not been previously arrested for a sex offence that was subject to the state’s SORN requirements.
Conclusion: “A growing number of studies suggest that the implementation of SORN requirements has had little relationship to the rates of sexual offending over time” (p. 1507). These findings, combined with the knowledge that these laws have unintended (and additionally punitive) negative impacts on those who are subject to them, should lead those interested in reducing the incidence of these serious offences to consider other approaches.
Reference: Bouffard, Jeff A. and LaQuana N. Askew (2019). Time-Series Analyses of the Impact of Sex Offender Registration and Notification Law Implementation and Subsequent Modifications on Rates of Sexual Offences. Crime & Delinquency, 65(11), 1483-1512.