Trust, Fakes and Overload

Dr Sophie Carr

image

https://www.lms.ac.uk/content/sophie-carr

https://twitter.com/SophieBays?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor

https://baysconsulting.co.uk/our-team/

https://www.linkedin.com/in/sophiecarrbays/?originalSubdomain=uk

The talk was about statistics, mathematics and numbers.

The following are notes from the online lecture. Even though I could stop the video and go back over things, there are likely to be mistakes because I haven’t heard things correctly or haven’t understood them. I hope Dr Carr and my readers will forgive any mistakes and let me know what I got wrong.

I would also like to point out to all my ex-students that the maths that they did in school is useful even if they went into non-STEM careers. A frequent moan was “when am I ever going to use this in my adult life.”

Dr Carr began her talk by explaining that her initial background was engineering and that this interest started very early. As a child she was very interested in aeroplanes and liked to build things including space rockets out of Lego. At school her favourite subject was physics (yay) and maths was something she had to do in order to be good at physics.

Bernoulli’s equation fascinated her during her A level studies. An equation where physics, maths and aeroplanes came together.

http://hyperphysics.phy-astr.gsu.edu/hbase/pber.html

The Bernoulli Equation can be considered to be a statement of the conservation of energy principle appropriate for flowing fluids. The qualitative behaviour that is usually labelled with the term “Bernoulli effect” is the lowering of fluid pressure in regions where the flow velocity is increased. This lowering of pressure in a constriction of a flow path may seem counterintuitive, but seems less so when you consider pressure to be energy density. In the high velocity flow through the constriction, kinetic energy must increase at the expense of pressure energy.

image

It should be noted that the application of the Bernoulli Equation in the above form is limited to cases of steady flow.
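For reference, the usual textbook form of the equation for steady, incompressible flow along a streamline is sketched below. The symbols are the standard conventions rather than anything taken from the talk: p is the pressure, \rho the fluid density, v the flow speed, g the gravitational acceleration and h the height.

```latex
p + \tfrac{1}{2}\rho v^{2} + \rho g h = \text{constant}
\quad\Longrightarrow\quad
p_1 + \tfrac{1}{2}\rho v_1^{2} = p_2 + \tfrac{1}{2}\rho v_2^{2}
\quad (\text{horizontal flow, } h_1 = h_2)
```

If the flow speeds up in a constriction (v_2 > v_1) then the pressure there must be lower (p_2 < p_1), which is the Bernoulli effect described above.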

https://en.wikipedia.org/wiki/Laminar_flow

In fluid dynamics, laminar flow is characterised by fluid particles following smooth paths in layers, with each layer moving smoothly past the adjacent layers with little or no mixing.

image

The velocity profile associated with laminar flow resembles a deck of cards. This flow profile of a fluid in a pipe shows that the fluid acts in layers that slide over one another.

Dr Carr studied aeronautical engineering at university. It was during her studies that she became interested in information overload. Why do people stop being able to take in information?

This ultimately led her to set up her own company. Since then, she has been working on statistics (also known as data science).

How do people see maths? How do they perceive numbers? How does the media portray numbers and how do people react to what they show? How does this affect people’s trust in numbers? Do people suffer from information overload when faced with lots of numbers?

Maths is important in everyday life.

Physicists, mathematicians and engineers work with maths all the time but people outside of these sorts of professions don’t think it is relevant to them. This is not the case. For example, two coffees made in exactly the same way can have very different tastes. The University of Portsmouth did some work on this. They were looking at how to make the perfect espresso.

https://www.port.ac.uk/news-events-and-blogs/news/brewing-a-better-espresso-with-a-shot-of-maths

The University of Portsmouth is a public university in the city of Portsmouth, Hampshire, England. The history of the university dates back to 1908, when the Park building opened as a Municipal college and public library. It was previously known as Portsmouth Polytechnic until 1992, when it was granted university status through the Further and Higher Education Act 1992.

The group at Portsmouth wanted to understand why sometimes two shots of espresso, made in seemingly the same way, can sometimes taste rather different.

They began by creating a new mathematical theory to describe extraction from a single grain, many millions of which comprise a coffee ‘bed’ which you would find in the basket of an espresso machine.

In order to solve the equations on a realistic coffee bed an army of super computers would be required, so a way of simplifying the equations was found.

The hard mathematical work was in making these simplifications systematically, in such a way that none of the important detail was lost.

The conventional wisdom is that if you want a stronger cup of coffee, you should grind your coffee finer. This makes sense because finer grounds expose more surface area of the coffee bean to the water, which should mean a stronger coffee.

When the beans were ground finely, however, the particles were so small that in some regions of the bed they clogged up the space where the water should be flowing.

These clogged sections of the bed were wasted because the water couldn’t flow through them and access the coffee. If the coffee was ground coarser the whole bed was accessed and there was a more efficient extraction. The process was also cheaper because changing the grind setting used fewer beans. Using fewer beans would also be kinder to the environment.

The research showed that making the coffee taste the same comes down to the size of the grind. It was found that fewer coffee beans, ground more coarsely, are the key to a drink that is cheaper to make, more consistent from shot to shot, and just as strong.

Coffee was more reliable from cup to cup when using fewer beans ground coarsely.

The new recipes have been trialled in a small US coffee shop over a period of one year and they have reported saving thousands of dollars. Estimates indicate that scaling this up to encompass the whole US coffee market could save over US$1.1bn per year.

Previous studies have looked at drip filter coffee. This is the first time mathematicians have used theoretical modelling to study the science of the perfect espresso – a more complicated process due to the additional pressure.

Tea has also got some maths.

https://en.wikipedia.org/wiki/Tea_leaf_paradox

The tea leaf paradox is a phenomenon where tea leaves in a cup of tea migrate to the centre and bottom of the cup after being stirred rather than being forced to the edges of the cup, as would be expected in a spiral centrifuge.

image

The blue line is the secondary flow that pushes the tea leaves to the middle of the bottom.

https://www.imperial.ac.uk/news/203853/imperial-mathematician-scoops-3m-breakthrough-prize/

https://www.theguardian.com/science/2020/sep/10/uk-mathematician-martin-hairer-wins-richest-prize-in-academia-breakthrough

https://www.facebook.com/watch/?v=942543536252848

Hairer landed the prize for his work on stochastic analysis, a field that describes how random effects turn the maths of things like stirring a cup of tea, the growth of a forest fire, or the spread of a water droplet that has fallen on a tissue into a fiendishly complex problem.

https://en.wikipedia.org/wiki/Stochastic_process

In probability theory and related fields, a stochastic or random process is a mathematical object usually defined as a family of random variables. Many stochastic processes can be represented by time series. However, a stochastic process is by nature continuous while a time series is a set of observations indexed by integers. A stochastic process may involve several related random variables.

https://en.wikipedia.org/wiki/Martin_Hairer

image

Sir Martin Hairer KBE FRS (born 14 November 1975) is an Austrian-British mathematician working in the field of stochastic analysis, in particular stochastic partial differential equations. He is Professor of Mathematics at Imperial College London, having previously held appointments at the University of Warwick and the Courant Institute of New York University. In 2014 he was awarded the Fields Medal, one of the highest honours a mathematician can achieve. In 2020 he won the 2021 Breakthrough Prize in Mathematics.

Now the general public would probably think such research was frivolous and switch off when confronted with the term “stochastic analysis”, but Dr Carr regards this, especially statistics, as a superpower.

https://en.wikipedia.org/wiki/Statistics

Statistics is the discipline that concerns the collection, organisation, analysis, interpretation, and presentation of data.

Statistics can be presented as a villain or a hero. Media can portray numbers in very different ways.

To get her point across Dr Carr used some of the superheroes from the Marvel and DC Universe and used different event criteria to group them in a Venn diagram. The criterion she used was what caused these people to gain their superpowers.

image

https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe

https://en.wikipedia.org/wiki/DC_Universe

https://en.wikipedia.org/wiki/Venn_diagram

A Venn diagram is a widely-used diagram style that shows the logical relation between sets.

Dr Carr said it was ok to disagree with her as maths does include lots of debates.

There are two superheroes that she couldn’t place in her Venn diagram. One was the Black Widow. She was both a villain and a hero and she learnt her skills without any external help.

Thinking about the Black Widow got Dr Carr thinking about how we look at trust and fakes within numbers themselves and how we can show that maths and statistics are not just villains but are heroes we can trust.

Some aspects of mathematics can be turned into stories. For example:

Arithmancy. At Hogwarts students can learn to predict the future with numbers. Mathematicians would call this a predictive algorithm;

https://en.wikipedia.org/wiki/Wizarding_World

https://en.wikipedia.org/wiki/Predictive_analytics

It is true that predictive algorithms do have a rather mixed reputation in the media (the 2020 A level and GCSE results, for example). They are often vilified for being wrong, for not being accurate enough or for not producing useful information. But mathematicians know that there is rarely certainty about what the future holds.

Premonition. Knowing what the pattern is helps to develop strategies for answering the question. Destiny is a character in the X-Men who can predict the future, as is Alice Cullen in the Twilight series. In reality, predicting the future is about spotting patterns. It’s about the ability to look ahead, or back into the past, in order to understand what questions need to be asked to understand the pattern, to change the pattern, or to stop the pattern if it is not particularly interesting or useful or if we want to stop it from happening again. The maths is available for all of this to happen and can be turned into stories that people think are quite unusual, perhaps a superpower;

https://en.wikipedia.org/wiki/X-Men_(film_series)

https://en.wikipedia.org/wiki/The_Twilight_Saga_(film_series)

Solving puzzles. What questions should be asked for Batman to defeat the Riddler? But how do we ask the questions? The one thing that unites all mathematicians, whatever their field, is their love of working with and solving puzzles. The problem is how they can convince the general public how wonderful it is to work with numbers and solve puzzles.

Numbers can actually move us, whether they be in books, newspapers or on social media. One current example (March 2021) is the fact that 140 thousand people have died from Covid-19 in the UK.

https://coronavirus.data.gov.uk/details/deaths

People have a real visceral reaction to numbers. For example, only 27% of FTSE-100 companies have females on the board (as a female this makes me very cross) and 60% of start-up companies fail.

https://en.wikipedia.org/wiki/FTSE_100_Index

It’s these reactions that spur communities to change things.

There is a beauty and elegance in a proof. A beautiful simplicity when things come together such as when two curves on a graph meet.

https://www.youtube.com/watch?v=OOPyNFdfCyE

Mathematics can move us in the same way as art and music can (although we aren’t always aware of this).

When we think about trust, fakes and overload of information, it’s always important to remember that we are often talking about an emotive reaction.

How can we actually build up trust in numbers? This involves the people who are involved in science and maths and those people who produce the numbers doing the hard work.

This might explain why maths can have a bit of a reputation. It can be a bit tedious and repetitive but mathematicians don’t mind this. It is what they find beautiful.

Working in maths and science can involve automation. It’s never been easier to input data and get a whole lot of numbers out. But mathematicians and scientists need to look at these numbers in different ways. Do they prove or disprove a hypothesis?

Slowing down the analysis and doing the work thoroughly provides a greater depth of what is going on. It comes down to whether they want to get the numbers right or get the right number.

There is such a difference in how these things are calculated and how the results are communicated.

If the numbers are to stand the test of time then they need to be reliable and repeatable. They need to be understood and communicated.

There needs to be a reason for collecting them and the method of collecting them needs to be understood.

Effort needs to be put in to understand any bias and limitations that are in the data set. The population and sample size need to be defined and described.

What and who is the study engaging with? This is important when communicating the results. Can the people receiving the results understand them?

There is often such a focus on obtaining the numbers that checking to see that they are correct is forgotten. Is it simply the case that the numbers are being used to stop the public pestering the people who asked for them to be done or are they being used to get a boss to leave their staff alone?

Letting people have the time to do the deep work is really hard to achieve, but is absolutely crucial to the work that mathematicians and scientists do. It underpins trust in numbers.

If the numbers are trusted, then explaining why they should be trusted becomes a lot easier.

When starting the process of trusting the numbers, the first “superpower” is really to slow down when developing the method. This gives time to look at the little things. “Are we getting the numbers right?” How complicated does the method have to be? Can it be expressed in a way that people can understand, or can the people be taught to understand it? Sometimes understanding what is not included in the data set and the method being used is just as important as understanding what has been included.

The second thing to look at is the uncertainty. Is a number absolute? This is unlikely as uncertainties are everywhere and can’t be ignored (although they can be easily overlooked). They can be glossed over because people want certainty.

Dr Carr thinks uncertainty is awesome and we all need to get comfortable with it. Talking about uncertainty doesn’t make us weak. It makes us honest and truthful. It allows mathematicians and scientists to see where their work is applicable and how they can take things forward.

There are lots of different ways of showing uncertainty, and communicating it properly will enable people to see which numbers can be trusted and help them find what they are looking for in the numbers.

The final thing is the language used to communicate any findings. Can anybody look at the work and understand why the research was done and why the data was collected? How does the work impact on the public? Why should they worry about it? Why should they invest time in reading about it?

The Royal Statistical Society gives a prize for statistical excellence in journalism. This was established in 2007 to encourage excellence in journalists’ use of statistics and data.

https://rss.org.uk/training-events/events/excellence-awards/journalism-awards/

The winning articles were able to give accurate data and communicate the findings in a way that people could understand.

Statistics is often treated with suspicion.

Winston Churchill once said, “I only believe in statistics that I doctored myself.”

https://en.wikipedia.org/wiki/Winston_Churchill

image

Sir Winston Leonard Spencer Churchill, KG, OM, CH, TD, DL, FRS, RA (30 November 1874 – 24 January 1965) was a British politician, statesman, army officer, and writer. He was Prime Minister of the United Kingdom from 1940 to 1945, during the Second World War, and again from 1951 to 1955.

A very common phrase that is used about statistics is “lies, damned lies and statistics”.

https://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics

“Lies, damned lies, and statistics” is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments. It is also sometimes colloquially used to doubt statistics used to prove an opponent’s point.

The phrase derives from the full sentence, “There are three kinds of lies: lies, damned lies, and statistics.”; it was popularized in the United States by Mark Twain and others, who mistakenly attributed it to the British prime minister Benjamin Disraeli.

https://en.wikipedia.org/wiki/Mark_Twain (below left)

image

Samuel Langhorne Clemens (November 30, 1835 – April 21, 1910), known by his pen name Mark Twain, was an American writer, humourist, entrepreneur, publisher, and lecturer.

https://en.wikipedia.org/wiki/Benjamin_Disraeli (above right)

Benjamin Disraeli, 1st Earl of Beaconsfield, KG, PC, FRS (21 December 1804 – 19 April 1881) was a British politician of the Conservative Party who twice served as Prime Minister of the United Kingdom.

Dr Carr feels that the statement should be re-written as “lies, damned lies and misunderstandings”.

There is a lot of “fake news” but most people don’t set out to deceive and researchers in science and maths work really hard to explain what is going on. But unfortunately, some people misunderstand the work and that is often because numbers get misrepresented, misused or get taken out of context. This may serve somebody else’s purpose and produce a negative outcome or produce an outcome which wasn’t what the statistics were meant to be used for.

How are fake statistics spotted so that trust can be maintained?

Is there a clear link between the headline and the text that is actually written?

There are three stories that have to be managed in an article.

Is the data in the story and is it easy to find? Is there a story in the results? What about the story that gets picked up by the media?

Handling all of this is quite hard. Is there a clear connection between the headline and the written content? Is there anything in the content that makes the headline believable?

Visualisations often cause problems. It is very easy to make a graph show something that is, in fact, wrong.

https://www.youtube.com/watch?v=1F7gm_BG0iQ

This is particularly important at the moment.

https://towardsdatascience.com/stopping-covid-19-with-misleading-graphs-6812a61a57c9

https://www.datasciencecentral.com/profiles/blogs/the-worst-covid-19-misleading-graphs

https://venngage.com/blog/misleading-graphs/

image

A pie chart should add up to 100%, not 193%. Also, the 60% portion looks bigger than the 70% portion.

https://flowingdata.com/2009/11/26/fox-news-makes-the-best-pie-chart-ever/

https://www.theguardian.com/science/2018/apr/12/one-extra-glass-of-wine-will-shorten-your-life-by-30-minutes

A graph may have correct data but if it is set out badly then what it is showing may not be accepted or may be misused.

Another misuse of data is incorrectly linking things together. For example, it has been stated that drinking an extra glass of wine can shorten your life by 30 minutes, which was said to be as bad as smoking. However, if you look more deeply at the report, what it is in fact saying is that if you drink three glasses a night, which is one over the recommended limit, every single day of your life, then your life expectancy is lowered. Just one extra glass of wine will not have an effect.

https://www.bbc.co.uk/news/business-42802526

The BBC reported in January 2018 that official UK figures showed that the UK unemployment level had fallen by 3,000 to 1.44 million. This seems like good news, but the uncertainty was +/- 77,000.

https://blogs.scientificamerican.com/observations/the-problem-with-failing-to-admit-we-dont-know/

In January 2018 the BBC News Web site announced that in the three months before November 2017, “UK unemployment fell by 3,000 to 1.44 million.” The reason for this fall was debated, but nobody questioned whether this figure really was accurate. But forensic scrutiny of the U.K. Office of National Statistics Web site revealed that the margin of error on this total was plus or minus 77,000—in other words, the true change could have been between a fall of 80,000 and a rise of 74,000, and a more honest headline would have been “UK unemployment may have gone up or gone down.”
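The arithmetic behind that range can be sketched in a few lines of Python. This is only an illustration using the two figures quoted above (a fall of 3,000 and a margin of plus or minus 77,000); the function names are my own and not from the talk or the article.

```python
def change_interval(estimate, margin):
    """Return the (low, high) range implied by an estimate and its margin of error."""
    return estimate - margin, estimate + margin


def describe(value):
    """Turn a signed change into words, e.g. -80000 -> 'a fall of 80,000'."""
    direction = "a fall" if value < 0 else "a rise"
    return f"{direction} of {abs(value):,}"


# Figures quoted above: a reported fall of 3,000 with a margin of +/- 77,000.
low, high = change_interval(-3_000, 77_000)
print(f"True change could be between {describe(low)} and {describe(high)}")

# If the interval straddles zero, the data cannot tell a rise from a fall.
direction_unknown = low < 0 < high
print("Direction of change uncertain:", direction_unknown)
```

Because the range runs from a fall of 80,000 to a rise of 74,000, it includes zero, so the headline figure on its own cannot say which way unemployment moved.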

Then there is the wonderful world of the spurious correlation.

https://www.buzzfeednews.com/article/kjh2110/the-10-most-bizarre-correlations

Do women get PhDs because people go and watch sport?

http://tylervigen.com/view_correlation?id=79806

image
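A rough sketch of why such correlations turn up so easily: any two quantities that both happen to rise over the same period will have a correlation coefficient close to 1, whether or not they have anything to do with each other. The numbers below are made up purely for illustration and are not the figures from the chart above; the function and variable names are also my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustrative data (NOT the real figures):
# two unrelated quantities that both happen to grow over ten years.
phds_awarded = [100, 104, 109, 115, 118, 124, 131, 135, 142, 150]
sports_revenue = [20, 22, 23, 26, 27, 30, 31, 35, 36, 40]

r = pearson(phds_awarded, sports_revenue)
print(f"correlation = {r:.3f}")
```

The coefficient comes out very close to 1, yet nothing in the calculation says that one quantity causes the other. Both simply increase with time.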

The problem with headlines is that we do have short attention spans and the media want us to look at the numbers in a certain way. Numbers can give us an emotional reaction.

When numbers are presented in a way that is inadvertently misleading or, even worse, actually from spuriously created sources it is really hard to separate them out.

There are people who deliberately set out to deceive. So the question is: how can people be helped when there is an absolute overload of information?

People have to balance a perception that statistics and numbers are manipulated to deceive with the knowledge that you just can’t discount every single number that is seen.

Numbers inform conversations. They help people understand the world. Examples include how society is changing, the effects of global warming, changes to the economy and the spreading of a virus.

You can’t assume all numbers are false. Statistics and maths are there to help the important conversations.

How can mathematicians help people swamped by information when people have done their very best to get the numbers right?

Never stop reading, but stop and walk away when you are just skim reading and looking at the key words.

How can you get over the problem that headlines are designed to catch your eye when scrolling through items on your phone which may lead you to accept something which is rubbish if you were to read the whole text?

Dr Carr feels that there isn’t a simple way to stop this or help people, but there are three things that we could do to improve the trust in numbers.

1) Get comfortable with uncertainty. In other words, get comfortable with being uncomfortable. Point answers are not the solution to everything. People should be happy to talk about error values, ranges, confidence intervals and uncertainties. They need to become used to a wider range of numbers. A number might not be exactly 10 but somewhere between 8 and 12. It will stop people expecting a definite answer when there isn’t one.

2) Love conditional probabilities. What do they mean? There is a real difference between the probability that you have a disease given that you have a positive test result and a probability of a positive test result given that you have the disease.

https://www.bmj.com/content/369/bmj.m1808

image

A lot of people would not see the difference. How can conversations enable them to interpret correctly what they are seeing? What questions should they be asking so they can see the differences in the sentence?

3) Question everything. Question the numbers. Query when things don’t look right. Query information when you are told that doing something will help you. Can you trust your sources of data? If you read something then go away and think about it, perhaps look in other places to see if they agree with the data. Don’t rush the search for data. Think about whether you actually need it and whether you can trust it.
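The difference between the two conditional probabilities in point 2 can be made concrete with Bayes’ theorem. The sketch below uses hypothetical numbers that I have chosen only for illustration (a disease affecting 1 person in 100, and a test with 90% sensitivity and 95% specificity); they are not taken from the talk or from the BMJ article.

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(disease | positive test).

    prevalence  = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers chosen only for illustration.
prevalence = 0.01    # 1 person in 100 has the disease
sensitivity = 0.90   # P(positive | disease)
specificity = 0.95   # P(negative | no disease)

posterior = prob_disease_given_positive(prevalence, sensitivity, specificity)
print(f"P(positive | disease) = {sensitivity:.2f}")
print(f"P(disease | positive) = {posterior:.2f}")
```

With these assumed figures the test picks up 90% of people who have the disease, but only roughly 15% of the people who test positive actually have it, because the false positives from the much larger healthy group outnumber the true positives.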

Working in science and maths enables researchers to listen to different viewpoints because there will be some data attached. These viewpoints can be refuted or endorsed if new data is collected.

In real life it is far too easy to dismiss opinions just because you don’t like them and far too easy to give them because there is no actual data to refute them.
