Brett Romero

Data Inspired Insights

Month: April 2015

Labor Statistics Part III – The Unemployed

Following on from Part II where I looked at the population of people who had left the labor force completely, this week I turn my attention to the unemployed. The unemployed are defined as those who are currently not employed but have made “specific efforts to find employment some time during the previous 4 week-period ending with the reference week”. Chart 1 maps the unemployment rate since 1948.

Chart 1 – US Unemployment Rate 1948 to 2015chart_3_1

Courtesy of the Bureau of Labor Statistics, there are several ways we can divide up the population of unemployed people to better understand what is driving the changes over time.

Cause of Unemployment

The first breakdown (shown in Chart 2) is the unemployed population (as a percentage of the total civilian labor force) broken down according to the cause of unemployment:

  1. Lost a job
  2. Left a job
  3. Rejoining the labor force after some hiatus
  4. Joining the labor force for the first time

Chart 2 – Unemployed Persons by Cause 1967 to 2015

chart_3_2

From this breakdown, the first conclusion we can draw is that people losing their jobs drives almost all the variation in the total unemployment rate over time. This stands in stark contrast to the population of job leavers and new entrants to the labor force, both of which have remained remarkably consistent over a long period of time.

The second thing to note is that the changes for those reentering the labor force appear to track the changes for job losers, but with smaller peaks and troughs. This suggests that when there is a spike in people losing their jobs (due to a recession for example), a population of people who had left the labor force is returning to look for jobs. Although counterintuitive (why would you rejoin the labor force in the middle of a downturn?), this likely reflects cases such as a family where the primary breadwinner loses their job, and both parents begin the hunt for jobs to make ends meet.

This is interesting primarily because it shows a feedback loop that potentially increases the spike in unemployment in a downturn. That is, just as large numbers of people are getting laid off from their jobs, an additional population of people who weren’t in the labor force also begins looking for jobs, further boosting the population of unemployed. Conversely, this also means that unemployment can fall much quicker than anticipated (for example when one parent becomes employed and the other drops out of the labor force again).

Education Level of the Unemployed

Chart 3 shows the unemployed population broken down by education level and the obvious conclusion to draw is that your teachers were right; finishing school will help you get (and keep) a job. The rates of unemployment for those people that didn’t finish high school are significantly higher than for everyone else, keeping in mind this is for people actively looking for work (as opposed to cruising on their parents couch or living off a wealthy spouse). Conversely, the unemployment rate for those that completed a bachelor’s degree or higher is by far the lowest of the four groups.

Chart 3 – Unemployed Persons by Education Level 1992 to 2015

chart_3_3

The other observation to be made is that there is not a huge difference in the unemployment rates for those that finished high school but didn’t go on to further studies, and those that went on to get an associates degree or attend, but not finish, college (university for those not in the US). Contrast this with the large gap between the ‘Some College/Associates Degree’ group and the ‘Bachelor’s or Higher’ group, and the advantage of graduating from college (at least in regards to getting employed) becomes plain to see.

Length of Unemployment

One of the more interesting and discussed breakdowns of the BLS unemployment data is the breakdown by length of time unemployed. Chart 3 shows how these percentages have changed over time for three groups:

Chart 4 – Unemployed Persons by Length of Unemployment 1948 to 2015

chart_3_4

The main feature that immediately stands out on this chart is the huge spike in the percentage of people unemployed for more than 15 weeks in 2009. This peak is well well above anything since the end of World War II and remains high today. This indicates that in addition to unemployment spiking in the global financial crisis (as we saw in Chart 1), people tended to stay out of work for significantly longer than in any other downturn since the end of World War II.

What this chart also shows us is how far the US economy is from what would be considered ‘historically normal’. For most of the past 60 years, the majority of unemployed people were unemployed for less than 5 weeks, followed by those unemployed for 5-14 weeks, and then finally the smallest group was those unemployed for 15 weeks or more. However, with the financial crisis we saw this split reverse and, unlike previous downturns, over 6 years after the financial crisis the population of people unemployed for 15+ weeks is still significantly higher than the population of people unemployed for less than 5 weeks.

Further confirming this shift, an additional series that the BLS produces is the average weeks unemployed (see Chart 5). From this chart we see that the latest downturn caused a huge spike in the average weeks unemployed, but also that the average period of unemployment remains at a level higher than at any other point pre-crisis.

Chart 5 – Average Period of Unemployment 1948 to 2015

chart_3_5

The other interesting point from Chart 5 is that even before the spike in 2009, if we look past the ups and downs of the recessions and recoveries, there appears to a trend of slowly increasing average time unemployed in the preceding 60 years. What would cause this average to creep up over time? It is likely to be a combination of a number of factors. Below are some factors that have occurred over time that could help explain this trend:

  1. Professionalization of recruiting – recruiting is increasingly a function that is handled by a professional team within an organization, or is outsourced to a professional firm, even for smaller companies. This practice generally ensures a certain minimum standard of hire, but also means it is increasingly rare that a firm will take a chance on someone with a long period of unemployment or a spotty employment history.
  2. Increasingly technical nature of jobs – with many professional jobs, even outside of the tech world, there is increasing pressure to continually develop new skills and adapt to new software and best practices just to keep up with the requirements of the job. As difficult as this can be for someone in the job, it is essentially impossible for someone who is unemployed leaving that person heavily disadvantaged in the job market.
  3. Improved ability to validate work history – previously, if a person had been unemployed for an extended period, they could fudge the dates (or flat out lie) with little chance of being found out. In 2015, with online networks such as LinkedIn and generally more thorough background check processes in place, it is much more difficult to get away with this type of deception (although it definitely still happens).

Many of these changes would appear to be positive changes, such as increasing professionalism in the recruitment process and less room to mislead potential employers, so surely we are just reducing the number of dishonest people and under qualified children of bosses/friends getting jobs? That is probably true to some extent. But what is also true is that those underdog stories that we love to hear about and watch, like a mother becoming hugely successful after years of staying home to raise the kids, or a super smart kid scamming his way into a prestigious law firm, are becoming close to impossible in reality. For better or worse, the job market is becoming a place for the Louis Litts of the world, not the Mike Rosses.

People in Part Time Work

Finally, although officially classified as employed, the BLS also tracks the number of people who want full time work but that are currently only working part time (also referred to as ‘under employed’). The change in this population is shown in Chart 6.

Chart 6 – Persons at Work Part Time for Economic Reasons 1956 to 2015

chart_3_6

One of the criticisms of the recovery post-2009 has been that it is a “part-time recovery” (see here and here for example). In other words, the belief is that the jobs being created are mostly part-time jobs and so the unemployment rate is not accurately reflecting the poor state of the economy. However, we can see that although the peak in 2009 was high (but not the highest, the peak in this series was actually 6.2% in October 1982), it has since fallen back to around average for the period and continues to fall in both absolute and percentage terms.

Watch this space for the final part of this series, Part IV, where we will explore the employed population.

Excel Tips – Array Functions

Excel’s array functions probably rival pivot tables for the title of most misunderstood and underutilized features available to Excel users. That is a shame as they are powerful tools that can be used to simply and elegantly address some of the key problems that users will regularly encounter. Although there is an almost endless list of ways in which array functions can be used, below I discuss three scenarios that I regularly encountered in which array functions can be useful.

Ensuring Formula Consistency

One of the simplest and most useful ways to utilize array functions is to ensure the consistency of formulae in large spreadsheets. There are numerous infamous examples where simple formula errors have undermined the credibility of analyses done in Excel (including a recent popular economics paper) and although this method (or any method for that matter) can’t prevent all errors, it can help to minimize the places where things can go wrong.

This method can be applied to any situation where you have a column or row in which you need to apply the same formula repeatedly (anywhere you would normally drag the formula across or down). To use an array function in this situation, select the full range of cells you need the formula to apply to and then create the formula in the same way you would normally, except replacing the single cell references with references to the range of cells instead. When the formula is ready, press Control + Shift + Enter to confirm it [1]. A simple example summing two columns is shown below:

sample_2_1

Using this method ensures that the formulae are consistent and also that individual formula in the column or row cannot be modified. You can test how this works yourself – once you have created an array formula (or opened the example file at the bottom of this page) try to delete or modify one of the individual cell formulas. You should be presented with an error like the following:

sample_2_2

This error prevents any modifications that users may inadvertently make, including the deletion of a row or column in the dataset.

Transposing Datasets

Occasionally the need will arise to transpose data (convert data that runs across a row into data that runs down a column, or vice versa). In my experience, these cases typically involve time series data running across a spreadsheet being transposed to run down the spreadsheet to make it easier to view.

The method here is to highlight the cells where you want the transposed results to display and enter the function “=TRANSPOSE( “, then highlight the values to be transposed. Note, that if the size of the ranges selected do not match (e.g. the number of columns to be transposed is greater than the number of rows in the range selected to display the results) the extra values will be dropped, or in the reverse case, the extra cells in the result set will be display as errors. An example is provided below:

sample_2_3

The big advantage of doing this (as opposed to simply copying and pasting the set with the transpose option selected) is that it maintains a live connection. Any updates in the original dataset will be reflected in the transposed dataset.

SUMIF with two conditions[2]

One of the primary reasons I started learning about array functions was this exact problem – how do I do a SUMIF with two conditions? This may seem like an esoteric requirement but, as the datasets you are working with become more complex, the need to summarize the data by multiple criteria becomes increasingly useful. Looking at an example, let’s imagine we have a small dataset of campaign contributions that contains the name, gender, city and the amount contributed for 10 individuals. This data is shown below:

sample_2_4

Now let’s imagine you want to identify the total contributions from all females in Boston. Broadly speaking, there are three ways you can do this:

  1. Manually select all the values and add them up
  2. Create a pivot table [3]
  3. Use an array formula

Often, the actual dataset is too big for method 1 (any dataset with more than 10 entries probably falls into this category), and a pivot table can be overkill when you only need to derive one value. In these cases, method 3 can be the simplest option.

To create our SUMIF with two criteria, we create a formula that actually looks like a SUMIF in some ways. Essentially we create two nested IF statements inside a SUM function. Our final formula is going to be as follows (if you are recreating this, don’t forget to use Control + Shift + Enter):

sample_2_5

The formula works by creating an array of TRUEs and FALSEs based on the criteria specified in the IF statements (i.e. that the value in the gender column = “F” and the value in the City column = “Boston”). If both conditions are met, the formula captures the corresponding value in the specified range (the contributions column in this case); otherwise, the FALSE value (zero) is captured. Once there is an array of values and zeroes, the SUM function will sum up all the values to provide the answer.

Finally, this structure can also be used for a range of functions including:

  • MAX – replace SUM with MAX
  • MIN – replace SUM with MIN and the false value (zero) with a value larger than the maximum value in the dataset
  • COUNT – set the TRUE value to 1 instead of a range
  • AVERAGE – use the COUNT and SUM functions described above to create an average

Still confused? Please feel free to download this example file: 

 

[1] This combination confirms to Excel that the formula is an array formula. If you simply press enter, the formula will not work as expected. Completed array formulas will display in the formula bar surrounded by {}, but you do not manually type these characters in.


[2] Since writing this article I have become aware of two new(ish) Excel functions that have effectively replaced array functions for this purpose. SUMIFS and COUNTIFS are built in functions available from the 2007 version of Excel onwards which allow the user to sum or count records and specify multiple criteria.


[3] Pivot tables are very useful for a range of purposes, but will be covered in a separate blog entry

Labor Statistics Part II – The Non-Participants

Previously in Labor Statistics Part I, we looked at data from the Bureau of Labor Statistics that showed, among other things, a falling participation rate since the turn of the century (see Chart 1). We also saw that even though unemployment has fallen over the last 5 years or so, that fall was at least partially a result of people leaving the labor force entirely rather than finding employment. Here we will take a more detailed look at those people classified as not in the labor force (those deemed to be not participating) to see if we can explain why the participation rate is falling.

Chart 1 – Participation Rate vs. Employed Population as a Percentage of Total Population 1947 to 2014

chart_4

Courtesy of Bureau of Labor Statistics (BLS) data we can actually break the population of people not in the labor force into several different subgroups, starting with two main groups: people who want a job (but don’t fall under the unemployed category); and people who don’t want a job. These populations can then be further broken down into subgroups based on: gender; race; and age. For our purposes we are going to focus mainly on the two main groups, people who want work and people who do not, with the latter further split into three age brackets (16-24, 25-54 and 55+). Chart 2 shows the break down of these groups (12 month moving averages have been used to smooth the series).

Chart 2 – Breakdown of Population Not in the Labor Force 1995 to 2015

chart_2_2

Looking at Chart 2, aside from an increase of over 20 million people over the last 20 years, no obvious trends emerge – none of the subgroups really appears to be shrinking or expanding significantly and all seem to be more or less moving in line with population growth. And, in fact, if we look at the percentages, this is more or less what we see:

Chart 3 – Breakdown of Population Not in the Labor Force – Percent of Total – 1995 to 2015

chart_2_3

Keeping in mind that the left hand axis of these charts has been zoomed in to exaggerate any changes, we can see that over 20 years, the proportion of the total population of people not in the labor force in each age bracket has not changed significantly. Summarizing:

  • Don’t Want a Job – 55+ years: fell a couple of percent from 2001 through to 2011, but in the past four years has basically recovered to where it was pre-2001.
  • Don’t Want a Job – 25-54 years: slowly increased around 1.5% from 1995 to 2005, but has now fallen back below 1995 levels.
  • Don’t Want a Job – 16-24 years: Increased around 5% from 1995 to 2011 but has fallen back a couple of percent since.
  • Do want a job – All Ages: Fell 4% from 1995 to 2001, was flat from 2001 to 2009 but has increased around a percentage point since.

Looking at the data this way, there aren’t many conclusions to draw. There is not enough real movement in the numbers to suggest anything is fundamentally changing in the pool of people considered not in the labor force.

However, one of the issues with analyzing data like this can be that the large pool of existing people can tend to obscure more rapid changes happening when you look at new entrants to, and exits from, that pool. So what does the picture look like if we analyze only the changes in the population from year to year?

Chart 4 – Annual Change to the Population Not in the Labor Force 1995 to 2015

chart_2_4

* 2015 figures are only for January, February and March

Looking at Chart 4, we can see several interesting trends and changes.

The first trend that becomes clear is how much of the growth in the population of people not in the labor force since the global recession in 2008-2009 is due to over 55 years olds. In fact, from 2010 to March 2015, the 55+ years age group contributed just over 75% of the growth in the population of people not in the labor force. And this trend has been accelerating. In 2014, the last year for which we have complete data, the 55+ years age group contributed close to 95% of total growth.

The second trend that stands out concerns 16-24 year olds. In this case there are large increases in the number of 16-24 year olds entering the population not in the labor force immediately after the two major downturns, the bursting of the tech bubble in 2000 and the financial crisis in 2008-2009. This suggests that following a downturn, young people are electing to either delay entering the workforce to stay in school longer, or are leaving the workforce and going back to school.

The third and final observation relates to those who indicated that they want a job, but do not meet the formal criteria to be classified as unemployed. In 2008 and 2009 we see large increases in the population of people who wanted a job but are not in the labor force. This is expected and corresponds with the financial crisis when millions of Americans lost their jobs. However, from 2010 onwards, the rate of this increase slowed, and, in some years, even reversed.

Drawing all the threads together, what are the main conclusions we can make from the above observations? My main takeaways are as follows:

  1. The growth in the population of people not in the labor force (and the decline in the participation rate), particularly over the past 5 years, has been mostly driven by over 55 year olds. This actually agrees with one of the more prevalent theories for why the participation rate is falling – the baby boomer generation is reaching retirement age and retiring en masse. Additionally, this trend of decreasing participation seems set to continue, meaning a decreasing percentage of the population will be required to fund the government for an ageing population.
  2. 16-24 year olds tend to drop out of the labor force in the immediate aftermath of downturns. Given the age range, a logical explanation for this is that, when the job market is poor, young people elect to either stay in school longer or return to school to pursue further studies. The positive news is that the spikes in young people leaving the labor force appear to be short term. Additionally, taking a long-term perspective, more young people pursuing higher education will arguably benefit the wider economy in later years.
  3. Since 2013, the number of people not in the labor force who do want a job appears to be on the decline (or has at least leveled off). This in turn suggests the current recovery is a genuine one, even in the face of a falling participation rate.

Finally, I want to leave you with one last chart. Chart 5 shows the changes in the proportion of men and women in the population of people not in the labor force. The positive news is that, although there is still a gender divide, it appears that the gap is continuing to close. In fact, 2014 was the year in which women represented the smallest proportion of the population not in the labor force, and the first time (probably since the US existed) that the percentage has dropped under 60%.

Chart 5 – Gender Split of Not in the Labor Force Population 1947 to 2014

chart_2_5

Excel Tips – Template to Table

There was an interesting Excel problem I encountered a while back which I thought I would share as the solution ended up being quite well received and even got me a nice Swiss dinner.

Problem

The problem arose out of a reasonably common scenario – a friend of mine had begun working in a new workplace where they had been using a MS Word template to collect the results of survey they had been undertaking. Being more of a data focused person, she managed to convince them to migrate the template to MS Excel as this would allow easier aggregation and analysis of the data. However, in convincing them to migrate, she had to design the Excel template in a manner more or less mirroring the Word template, which meant merged cells, data in various columns and rows on the spreadsheet, and generally no easy way to tabulate the data across multiple surveys.

After the surveys starting coming in from various areas and she had spent an inordinate amount of time manually copying and pasting the results into a table, she asked me if there was a better way to do this.

Solution

After a couple of false starts, I came to a point where I thought the only way to do this was going to be a macro. The issues with using a macro are:

  • if it breaks, it is likely that no one else will be able to fix it
  • it isn’t easy for others to determine what exactly the macro is doing, and
  • moving a macro across multiple spreadsheets and workbooks can get messy.

Asides from those issues, non-technical people tend to get a little squeamish around macros.

I ended up stumbling across this solution almost by pure chance as I happened to be doing something at work that resulted in me using an Excel function that I hadn’t used very often: INDIRECT. What INDIRECT does is takes any string (either typed directly into the formula, referenced from another cell, or some combination of both) and tries to read it as a cell reference.

For example – if you type “A3” into a cell, then reference that cell (“=A1” for example), the result will be that both the original cell and the cell with the reference will now read “A3”. But what if you wanted the cell referencing the original cell to actually use the value in that cell as a reference to look up a third value in cell A3? That is where INDIRECT comes in:

sample_1_1

In this simple example we have used it to reference a cell in the same sheet, but we can also use INDIRECT to build a reference to a different sheet or workbook. And that’s where this is going. All the templates were the same – the same cell reference for each data point every time – but the name of the sheet (or workbook if they haven’t been added together yet) was always different. So what if we use an INDIRECT formula to dynamically look up the name of the sheet, which the user inputs? We can then use that value to complete the cell references for all the cells looking up the information on that sheet. Here is an example of what that would look like:

sample_1_2

Once you have the basic formula working there are 3 steps left:

  1. Wrap the formula in an IFERROR clause ( =IFERROR(formula, “”) ) so that if the ‘Sheet Name’ column is blank, you get blank data fields instead of a ton of #REF errors.
  2. Manually ‘program’ the first row. You will have to ensure all the right cells are being looked up in the right columns.
  3. Drag the formula down for as many rows as you like.

Once this is done, users of the workbook will be able to move a completed template into this workbook, enter the sheet name in the table, and all the results should immediately populate in the table. No Macros, no mess, and a convenient and simple way to convert information from an irregularly formatted template into tabular data.

Still confused? Please feel free to download this example file: 

Labor Statistics Part I – Setting the Scene

The unemployment rate: in western countries this tends to be one of the most discussed and politicized of the official statistics produced by Governments. In the US, the unemployment rate has been under a higher than usual level of scrutiny since the 2008 financial meltdown led to historically high levels of unemployment.

However, as of February 2015 the unemployment rate has fallen to 5.5%, meaning it is back within the normal historical range and is expected to keep falling. So everything is all good right? Maybe.

Chart 1 – US Unemployment Rate Jun 1976 to Feb 2015

chart_1_1

Before we get into that, first a bit more detail on how the unemployment rate is calculated. For such a well-known statistic, it appears to be relatively poorly understood outside of the world of policy wonks. Ask the average person to guess how the unemployment rate is calculated and they are likely to guess something along the lines of the following:

equation_1

Someone with a bit more time to think on it may consider the fact that not all of the population are of a working age and factor that into their guess:

equation_2

But even this more refined calculation would result in an unemployment rate well into the 30%-40% range. The reason for that is, as the Bureau of Labor Statistics (BLS) outlines, to qualify as unemployed, a person has to be part of the Civilian non-institutional population [1] and meet one of the following two criteria (emphasis mine):

  • had no employment during the reference week, were available for work, except for temporary illness, and had made specific efforts to find employment some time during the 4 week-period ending with the reference week, or
  • were waiting to be recalled to a job from which they had been laid off.

As I am sure you can imagine, this definition leaves a lot of people that many would consider “unemployed” in some third pool, neither employed nor unemployed as the BLS defines it. These people are actually in a pool labeled “not in the labor force”.

From Bureau of Labor Statistics data, in 2014, people in the labor force made up 62.9% of the Civilian non-institutional population, leaving just over 92 million people outside the labor force. But this percentage of people in the labor force (also called the “Participation Rate”) has changed significantly over time. Chart 2 shows the Participation Rate since 1947:

Chart 2 – US Participation Rate 1947 to 2014

chart_1_2

There are several interesting things to notice here.

Firstly, from the early 60s until the late 90’s, the percentage of people considered in the labor force surged – from 58.7% in 1963 to 67.1% in the year 2000. To give you an idea of how that translates into numbers of people, that increase meant there were an additional 17.8 million people in the labor force in the year 2000 than there would have been if the participation rate from 1963 had remained unchanged.

Who were all these extra people? Most of this increase represents the movement of women into the labor force over time and the rise of the two income household, both of which can be seen in the increasing participation rate for woman (see Chart 3). Although this indirectly led to a lower participation rate for men, the overall result was an increase in the participation rate in general.

Chart 3 – Participation Rate by Gender 1948 to 2014

chart_1_3

The other major trend we can see in the Participation Rate over time is the downwards trend since 2000. By the end of 2014 the US had returned to a Participation Rate not seen since 1977. The result is that even though the headline unemployment rate has dropped back down to 5.5% as of February 2015, once the lower participation rate is factored, the picture isn’t nearly as rosy, as shown in Chart 4:

Chart 4 – Participation Rate vs. Employed Population as a Percentage of Total Population 1947 to 2014

chart_4

This has not gone unnoticed (see ForbesCNN and Bloomberg for example), but what is the cause and what impact does this have for public policy going forward? This will be something we will explore over the next few weeks in a series of articles – watch this space.

In the meantime, keep an eye on the Datasets section of this website for downloads of the various datasets being used in these articles.

 


[1] Persons 16 years of age and older residing in the 50 states and the District of Columbia, who are not inmates of institutions (e.g., penal and mental facilities, homes for the aged), and who are not on active duty in the Armed Forces.

© 2018 Brett Romero

Theme by Anders NorenUp ↑