By: Jonno Bourne

# Summary

There has been a lot of coverage of Brexit and why the UK voted Out. A large part of the discussion ran along the lines of the uneducated poor vs the metropolitan elites. We take a different approach, looking at whether communities where the White British and non-White British (Asian, Polish, etc.) populations are highly integrated were more or less likely to vote Out. At the extremes, 10 of the 10 most segregated local authority districts voted Out, whilst 8 of the 10 most integrated voted In. In fact, more diversity and more integration both reduce the probability of voting Out. These findings hold even when we control for deprivation. What's more, we find that the most and least deprived areas were more likely to vote In, and it was the local authorities in the middle of the deprivation scale that were more likely to vote Out, challenging the poor vs elite argument. Given these results we conclude that segregation is a serious issue for the UK that can create mistrust between communities, and that raising levels of residential integration should be a concern for policy makers and local government.

# Introduction

Earlier this month Ted Cantle, founder of the Institute of Community Cohesion (iCoCo), and Eric Kaufmann, Professor of Politics at Birkbeck College, published a paper called “Is segregation on the increase in the UK?”. In it they compare changes in levels of ethnic segregation between White British citizens and other ethnic groups over the last 15 years. They found that overall levels of segregation are increasing and that, when there is a large increase in ethnic minority residents in an area, the number of White British residents tends to decrease sharply. Few people are likely to find the conclusions of Cantle and Kaufmann particularly cheering; however, they are definitely interesting from a government policy point of view and when considering the current political situation in the UK.

This year Brexit has been the only topic in British politics. Like it or not, the UK voted by a slim margin to leave the European Union and go it alone. The run-up to the Brexit vote was, to put it mildly, emotionally charged, and arguably the campaign to leave was underpinned by concerns about immigration. These concerns covered illegal immigrants from Sub-Saharan Africa and the Middle East, as well as legal immigrants from the EU.

The combination of the Brexit results and the report by Cantle and Kaufmann encouraged us to revisit some work from earlier in the year that was inspired by a Nate Silver article on the website 538 called “The Most Diverse Cities Are Often The Most Segregated”. Silver looks at how some American cities, although seemingly very diverse, are in fact highly segregated once you examine which neighbourhoods people actually live in. This is shown rather beautifully in Dustin Cable’s Dot Map. Obviously the USA has a very different history of ethnic interaction and movement to the UK; however, the techniques that Silver uses can definitely be applied here.

So, basically: segregation in the UK is increasing; the UK (or, more accurately, England) has voted for Brexit; and in the US, diverse cities can also be highly segregated. All those bits are very interesting, but how do they tie together? What we want to know is "What are the most and least segregated parts of the UK, and how did that, if at all, affect voting for Brexit?".

# Method

## The areas to be analysed

Although we have discussed our question in terms of the UK, this work actually covers only Scotland, England and Wales, with the detailed analysis using England alone due to data limitations.

In Northern Ireland, White British as an ethnic group is less important than Catholic and Protestant. Personal identity and concepts of nationality are much more strongly related to type of Christianity than in the rest of the UK. Northern Ireland also has the added complication of the border with the Republic of Ireland, and the effect a hard border would have further complicates voting. Because of these factors Northern Ireland will not be considered in the analysis.

A common point of discussion after the Brexit vote was that poorer people voted for Brexit. To control for this effect we want to use a measure of wealth or poverty and see whether integration and diversity are still significant. One way of doing this is to use the "multiple indices of deprivation", which score each LSOA on several dimensions, such as financial, educational and access to services. Unfortunately the index is only calculated for each individual country within the UK and not for the UK as a whole, making it impossible to compare areas in England with areas in Wales or Scotland. Because of this, the most detailed analysis will only use England.

In summary, Northern Ireland is excluded from all analysis as it has a fundamentally different view of segregation from the rest of the UK. Scotland, England and Wales (SEW) will be used for the general analysis, but due to data limitations only England will be used for the detailed analysis.

## Key Terms

LSOA: Lower Super Output Area. This is a geographical area with a minimum population of 1,000. It is usually the smallest area for which the government publishes statistics, as its size prevents identification of individuals. This area is used to define local, or neighbourhood, diversity.

LAD: Local Authority District, the larger administrative area analysed here. It is similar to a borough, and "Borough" is used instead at some points.

## The process

The process for the analysis is as follows:

1. Get the data from the Office for National Statistics (Open Government FTW)

2. Split the ethnic make-up of each LSOA into White British and Other, where Other is everyone who can't be described as White British.

3. Calculate the Diversity index for each LAD and LSOA

4. Use a multi-linear model to create an Integration index

5. Use the voting results of the Brexit referendum to create a logistic model that predicts voting patterns using the diversity and segregation scores

Using a generalised linear model (GLM) allows us to rate LADs and counties relative to the level of local diversity we would expect given the diversity of the LAD as a whole. This matters because we can't expect a city with very few ethnic minorities to have the same kind of distribution as a city with a large percentage of ethnic minorities.

## What is the Diversity score?

The diversity score is the probability that two people chosen at random from a given area belong to different ethnic groups. In this analysis we calculate the diversity at both LSOA and LAD level using the equation

$$Diversity = 1-W^2-O^2$$

Where W is the proportion of the population that is White British and O is the proportion that is Other, both expressed as fractions between 0 and 1 (so W + O = 1).
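As a rough illustration of the formula above, here is a minimal Python sketch. The function name `diversity` and the example populations are hypothetical and are not taken from the census data used in the article.

```python
def diversity(white_british: float, other: float) -> float:
    """Return 1 - W^2 - O^2, where W and O are the two population shares.

    Accepts raw counts or shares; they are normalised so that W + O = 1.
    """
    total = white_british + other
    if total <= 0:
        raise ValueError("population must be positive")
    w = white_british / total
    o = other / total
    return 1.0 - w * w - o * o


# Hypothetical areas, not real census figures.
evenly_mixed = diversity(500, 500)   # two equal groups: the highest possible score
typical = diversity(820, 180)        # W = 0.82, O = 0.18
single_group = diversity(1000, 0)    # only one group present: no diversity
```

With only two groups the score runs from 0 (one group only) to 0.5 (an even split), which is worth bearing in mind when reading the charts.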

### Why Only White British and Other?

Short answer: It doesn't really make a difference.

Long answer: The Brexit campaign's argument against immigration wasn't focused solely on immigration from Europe but on immigration and asylum in general. White British is clearly the ethnically native population, and such framing creates a native vs non-native split, aka White British and Other. As such, it is less important how Arab and Caribbean people are mixing with each other and more important how they are mixing with the White British population.
From a results perspective, Britain is about 82% White British, much higher than the US, which is about 63% White (non-Hispanic), so the effect of including how mixed non-White British communities are with each other is negligible. More detailed ethnic mixing only really has an effect in areas that have a low proportion of White British residents and high proportions of two other ethnic groups. This only happens in a few places and the overall effect is small.

## What is the Integration Index/Score?

Once all the LADs have a diversity score we predict the average LSOA diversity score for each LAD using the equation

$$\text{Average LSOA diversity} = x \cdot \text{LAD diversity} + z \cdot \text{LAD diversity}^2$$

Where x and z are coefficients chosen to find the best fit. We then find the ratio between the actual average LSOA diversity score and the expected score calculated by our linear model. This ratio shows whether a LAD is more or less integrated than would be expected for its overall level of diversity. A score of 1 means a LAD has exactly the score we would have expected, a score larger than 1 means the LAD is more integrated than would be expected, whilst a score of less than 1 means it is more segregated than would be expected.
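The fit and the ratio can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses ordinary least squares with no intercept (as the equation above is written), solved via the 2x2 normal equations, and the function names and the five LADs are hypothetical rather than the article's actual code or data.

```python
def fit_expected_diversity(lad_diversity, avg_lsoa_diversity):
    """Least-squares fit of avg_lsoa = x * lad + z * lad**2, with no intercept.

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    pairs = list(zip(lad_diversity, avg_lsoa_diversity))
    s_d2 = sum(d ** 2 for d, _ in pairs)
    s_d3 = sum(d ** 3 for d, _ in pairs)
    s_d4 = sum(d ** 4 for d, _ in pairs)
    s_dy = sum(d * y for d, y in pairs)
    s_d2y = sum(d ** 2 * y for d, y in pairs)
    det = s_d2 * s_d4 - s_d3 ** 2
    if det == 0:
        raise ValueError("need at least two distinct non-zero LAD diversity values")
    x = (s_dy * s_d4 - s_d2y * s_d3) / det
    z = (s_d2 * s_d2y - s_d3 * s_dy) / det
    return x, z


def integration_index(lad_div, actual_avg_lsoa_div, x, z):
    """Ratio of actual to expected average LSOA diversity (1.0 = as expected)."""
    expected = x * lad_div + z * lad_div ** 2
    return actual_avg_lsoa_div / expected


# Hypothetical LADs: name -> (LAD diversity, average LSOA diversity). Not real data.
lads = {
    "A": (0.10, 0.09),
    "B": (0.20, 0.17),
    "C": (0.30, 0.24),
    "D": (0.40, 0.22),  # local diversity well below what its LAD diversity suggests
    "E": (0.45, 0.33),
}

x, z = fit_expected_diversity(
    [d for d, _ in lads.values()],
    [y for _, y in lads.values()],
)
scores = {name: integration_index(d, y, x, z) for name, (d, y) in lads.items()}
```

In this made-up example LAD "D" ends up with a score below 1 (more segregated than expected) and LAD "E" with a score above 1 (more integrated than expected).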

# Results

Mapping the Integration Index showed some very clear patterns, specifically that the North West of England is, relatively speaking, a hub of segregation.
By plotting local against regional diversity and drawing a line of expected integration, we can clearly identify the towns and areas that are least integrated. Although Manchester, the region's largest city, is more integrated than average, the post-industrial satellite towns that lie to its north are the most segregated areas of the country: Burnley, Oldham, Blackburn and Bradford sit a long way below the expected line (shown in red) for cities of their level of diversity. A smaller area of segregation is the Midlands, with Birmingham, the UK's second largest city, being significantly more segregated than would be expected for a city that diverse. Some of the most integrated places in the UK are Oxford and Cambridge, arguably because of the high international student population, which is relatively evenly dispersed and not separated from the White British student population. This may also have helped Bristol and Cardiff, two other large cities, to keep above the line in terms of integration, as they also have substantial student populations relative to their size; however, the analysis doesn't explicitly pinpoint such reasons.

Due to its size and population density, London is broken out into a separate graphic. Almost all of the city is above the line for integration, with the exception of three boroughs: Redbridge, Hounslow and Croydon. The most integrated boroughs in London are Kensington and Chelsea, Islington and Camden.

The more blue an area, the more integrated it is; the more red, the more segregated. As can be seen, the North West around Manchester is one of the most segregated areas of the country. The Midlands near Birmingham is also more segregated than would be expected.

London boroughs are generally more integrated than expected; however, Redbridge, Hounslow and Croydon are more segregated.

The red line shows the expected amount of neighbourhood diversity for a given borough diversity; the dotted line shows perfect integration.

## Are White British People a minority?

A side question is where White Britons are in a minority, and whether that is a common thing. In fact White Britons are a minority in about 11% of LSOAs, which account for approximately 12% of the whole population; however, only about 4% of White Britons live in an area where they are a minority. In general most LSOAs are majority White British, with the average LSOA being 82% White British.

The average person in England and Wales lives in an area that is 83% White British.

### What does a White British minority mean?

We should also clarify again that, as the analysis was done using effectively only two ethnic groups, White British and everyone else, if White British is not the majority group it doesn't necessarily mean that a different ethnicity is the majority, as the "everyone else" group includes Irish, Caribbean, Pakistani, etc. An excellent example of this is Kensington and Chelsea, which has a White British minority at just under 42% of the population, but where White British is still the largest single group by almost 13 percentage points. The second largest ethnic group is White Other, which includes Irish, European and American citizens living in the UK. This group makes up just under 30% of the borough's population, meaning that the ethnic group White is about 72% of the borough, with the combination of all other minorities making up the remaining 28%.

## The relationship to Brexit

So far we have seen that there are reasonably large differences in how integrated different parts of the country are, even after accounting for different levels of diversity. In some areas, such as Burnley and Oldham, the levels of segregation are quite high, which is what Cantle and Kaufmann discuss. We have also seen that although certain areas, such as Birmingham and Bradford, have high levels of diversity, they may actually be quite highly segregated, which is what Silver discusses. However, we have not yet explored whether this has any meaning in people's everyday lives.

The tables below show the ten most segregated and ten most integrated parts of England and Wales. There is a clear distinction between them in terms of voting In or Out on Brexit. Of the 10 most segregated areas 100% voted Out, whilst of the 10 most integrated areas 80% voted In.
To check whether this pattern holds more generally, we created the boxplot "Brexit vote by integration score". It shows that, although the results are close, the LADs that voted In have a higher average integration score than the LADs that voted Out, and the "In" group also has a tighter spread than "Out". In fact there is a statistically significant difference between the average scores of the two groups, suggesting that levels of integration and segregation are predictive of voting preference.

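The 10 most segregated LADs:

| Rank | Region | LAD | Int Score | Brexit |
|---|---|---|---|---|
| 1 | North West | Blackburn | -0.44 | Out |
| 2 | North West | Oldham | -0.44 | Out |
| 3 | Yorkshire and The Humber | Bradford | -0.41 | Out |
| 4 | North West | Burnley | -0.35 | Out |
| 5 | North West | Pendle | -0.34 | Out |
| 6 | Yorkshire and The Humber | Calderdale | -0.32 | Out |
| 7 | North West | Hyndburn | -0.31 | Out |
| 8 | Yorkshire and The Humber | Kirklees | -0.3 | Out |
| 9 | North West | Rochdale | -0.28 | Out |
| 10 | North West | Bolton | -0.25 | Out |

The 10 most integrated LADs:

| Rank | Region | LAD | Int Score | Brexit |
|---|---|---|---|---|
| 1 | South West | Isles of Scilly | 0.14 | In |
| 2 | London | Sutton | 0.14 | Out |
| 3 | East of England | Cambridge | 0.13 | In |
| 4 | South East | Oxford | 0.13 | In |
| 5 | London | Camden | 0.13 | In |
| 6 | London | Islington | 0.13 | In |
| 7 | London | Kensington and Chelsea | 0.13 | In |
| 8 | East of England | Hertsmere | 0.12 | Out |
| 9 | London | Newham | 0.12 | In |
| 10 | London | Richmond upon Thames | 0.12 | In |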

The boxplot shows how the integration scores were distributed between the In and Out votes in the Brexit vote. There is a small but statistically significant difference between the average scores of the two groups, showing that more integrated areas were more likely to vote In whereas more segregated areas were more likely to vote Out.

## Building statistical models

In order to explore further how diversity and segregation relate to voting, we made two different statistical models: a linear regression model, which predicts the final vote percentage (what percentage of Hartlepool ends up voting each way?), and a logistic regression classification model (will Hartlepool vote In or Out?). We will just touch on the linear regression and look more closely at the classification results. As mentioned earlier in the article, this detailed analysis is done using just England, as it is then possible to include the indices of deprivation.

Before making the models the variables were normalised to make the coefficients more easily comparable. This meant that, for each variable, we subtracted the average and divided by the standard deviation.
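The normalisation step can be sketched as follows. This is a minimal illustration, not the article's own code: the function name `standardise` and the input values are hypothetical, and the choice of the sample standard deviation (`statistics.stdev`) rather than the population one is an assumption, since the article does not say which was used.

```python
from statistics import mean, stdev


def standardise(values):
    """Subtract the mean and divide by the (sample) standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation
    if sigma == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(v - mu) / sigma for v in values]


# Hypothetical integration scores for five LADs, not real data.
raw_scores = [0.80, 0.95, 1.00, 1.05, 1.12]
z_scores = standardise(raw_scores)
```

After this step every variable has a mean of 0 and a standard deviation of 1, so a coefficient describes the effect of a one standard deviation change in that variable.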

### Linear regression:

The Integration Index and the average local diversity were both significantly predictive at the 1% level (5% is normally used) when estimating the final percentage of the population that would vote to remain in or leave the EU. In addition, the model as a whole was significant at the 0.1% level. Higher levels of local diversity and higher levels of integration both increased the expected final percentage that would vote to remain. To be fair, the model was never going to predict the result of the vote, as it was far too inaccurate; however, it did clearly indicate the direction of voting given an area's Integration score and overall diversity.

### Classification:

We built 3 models (4 really, but the last was just for fun), shown in the table below. We were interested in two things: were the variables statistically significant, and were the coefficients of the variables stable?

The variables of LAD diversity, Integration Score, Average Deprivation and Deprivation Score squared were significant at the 95% confidence level in every model in which they appeared, with one exception: the Average Deprivation score was only significant at the 90% confidence level in the Deprivation model.

Significance is important: if a variable has low significance, its apparent effect may be due to chance, and such variables shouldn't be included in a model. When deprivation was included it had a negative sign, indicating that as areas became more deprived they tended to vote to leave the EU, which was in line with much commentary on poorer people voting Leave and wealthier types voting to stay. However, the low level of significance was surprising, and so a squared term was added. This increased the significance of both deprivation terms, even if its results were no less surprising. The positive coefficient on the squared deprivation term (DeprSqd in the table) indicates that the most and least deprived areas were more likely to vote In and those in the middle were most likely to vote Out. Although 7 of the 10 most deprived areas voted Out and 8 of the 10 least deprived voted In, closer analysis reveals a more nuanced relationship between views on Brexit and deprivation.

Also of interest was the stability of the coefficients. The three models had very similar coefficients as variables were added, with only small changes in the LAD diversity and Integration Score coefficients across all three models. There was a relatively large change in the deprivation coefficient when the squared term was included; however, that is reasonable as both terms represent the same underlying concept. The stability of the coefficients gives confidence in the link between diversity, integration and Brexit.

London is worth commenting on again, as 5 of its 33 boroughs voted Out. Havering, Bexley and Barking are all next to each other on the eastern edge of London, while Hillingdon is on its own in the west; however, all three have positive integration scores. Sutton, the final local authority to vote Out, is London's most integrated borough. This contrasts with Croydon, Redbridge and Hounslow, which all have negative scores and all voted In. This suggests that there are other explanatory variables that would help predict the voting result, which is to be expected as the model is very simple.

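| | Model | Intercept | Diversity | IntScore | Depr | DeprSqd | Accuracy |
|---|---|---|---|---|---|---|---|
| 1 | Basic | -1.522 | 0.933 | 1.106 | | | 0.806 |
| 2 | Deprivation | -1.589 | 1.060 | 0.939 | -0.440 | | 0.815 |
| 3 | Deprivation Squared | -2.223 | 1.045 | 0.959 | -0.592 | 0.618 | 0.833 |
| 4 | SEW Basic | -0.890 | 0.744 | 0.855 | | | 0.740 |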
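To make the coefficients a little more concrete, here is a rough Python sketch of how a logistic model turns them into a probability. The coefficients are copied from the "Deprivation Squared" row of the table; everything else is an assumption made for illustration: that the outcome is coded so that a positive coefficient means "more likely to vote In" (which matches the interpretation in the text), that the inputs are the normalised variables, and that the squared term is simply the square of the normalised deprivation score. The function name `probability_in` is hypothetical.

```python
import math

# Coefficients from the "Deprivation Squared" row of the model table.
COEF = {
    "intercept": -2.223,
    "diversity": 1.045,
    "int_score": 0.959,
    "depr": -0.592,
    "depr_sqd": 0.618,
}


def probability_in(diversity_z, int_score_z, depr_z):
    """Estimated probability that a LAD votes In, given normalised inputs.

    Assumptions (not confirmed by the article): outcome 1 = In, and the
    squared term is the square of the normalised deprivation score.
    """
    log_odds = (
        COEF["intercept"]
        + COEF["diversity"] * diversity_z
        + COEF["int_score"] * int_score_z
        + COEF["depr"] * depr_z
        + COEF["depr_sqd"] * depr_z ** 2
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical LADs that differ only in deprivation (in standard deviations).
least_deprived = probability_in(0.0, 0.0, -2.0)
middling = probability_in(0.0, 0.0, 0.5)
most_deprived = probability_in(0.0, 0.0, 2.0)
```

Under these assumptions the middling LAD gets the lowest probability of voting In and both extremes get higher ones, which is the U-shaped pattern described in the classification section.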

## SEW model

Just for fun we rebuilt the basic model including Scotland and Wales (the fourth model, called SEW Basic). In this model the diversity and integration scores were both significant. The coefficients had changed more, especially for the diversity score, but they still pointed the same way as previously, and as we had changed the composition of the data this level of change is acceptable. It is interesting to see the large drop in model accuracy with the other two countries included. This may be due to the relatively low levels of diversity in Scotland and Wales compared to England, or it may be that voting in these two countries was driven by different factors than in England.

# Conclusions

In this article we have considered how ethnic integration and segregation are distributed across Scotland, England and Wales and explored possible ways in which that segregation may affect the society we live in. We have seen that the most segregated areas of England and Wales are in the North West of the country. We have also seen that there is a link between voting preferences in the Brexit referendum and levels of diversity and segregation, even when controlling for deprivation. To be specific, higher levels of integration and higher levels of diversity seem to go with a more positive view of the EU than the other way round. Despite these results we should be cautious, as correlation does not imply causation: segregation and opinions of the EU could both be symptoms of some other factor.

Bearing in mind the caution on causality, the link between segregation and voting is significant and seems intuitive. If two ethnic groups exist in the same area but have little interaction, it is unsurprising that there is mistrust, especially if one of those groups isn't "native". It is harder to mistrust people with whom you have regular social interactions, such as neighbours, work colleagues or the parents of your children's friends. In social psychology this dynamic is described in terms of in-groups and out-groups. It seems only natural that in the febrile atmosphere of the Brexit campaign, areas that already had high levels of segregation would be more inclined to vote Out.

Given this link between ethnic integration and a very real societal reaction to it, it seems sensible that the causes of segregation, how it is reduced and how it is maintained are explored further by both researchers and policy makers, with a view to improving integration and trust between all communities in our country.

## Next Steps

A natural next step is to combine concepts from the work of Cantle and Kaufmann with this article and create a predictive model which includes the rate of change of segregation. This is interesting because, intuitively, an abrupt change in the ethnic make-up of an area is likely to be more uncomfortable for long-term residents than a gradual change, thus increasing the likelihood that those residents would vote to reject Europe and prefer tighter immigration controls.

Coming up: the technical version of this article, giving all the details on the algorithms used, formulae, etc.

Want to try for yourself? Fork the code!