Data collection

  • Planning your data collection
  • Sampling
  • Collecting and storing your data
Data for improvement

Once you have decided on what you want to measure in your family of outcome, process and balancing measures, you need to work out how to get data to measure them.  You will find it useful to use the measurement plan tool.

In designing your data collection, make sure that:  

  • You have clear operational definitions for the measures
  • You collect data as often as possible, and plot over time
  • The people doing the improvement own the measurement

Planning your data collection

Having decided what to measure, you need to work out how to get the data on a regular basis.  Keep this as simple as possible.  Labour-intensive data collection is usually doomed to failure.  If data can be obtained from existing sources, or built into work routines, this will give you the best chance of success.  Some things to consider:

  • Who will collect the data? Someone needs to do it, and they need to know it’s them (or when it’s them).  Ideally, it’s the people doing the work, but not at the expense of the work getting done.
  • Where will they collect the data? You need to know at which point in the process the data will be gathered, and in which location(s).
  • When will they collect the data? Does it need to be collected at a particular time, or on a particular day of the week?
  • How often will they collect the data? You need to agree the frequency of data collection, which depends on process throughput and cycle time.  There may be a trade-off between getting enough data at each time point and having more time points, as illustrated in the charts below showing weekly and daily data.

[Charts: weekly and daily data points]
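
The trade-off between more time points and more data per point can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the daily figures, the measure ("seen on time") and the helper name `weekly_totals` are all made up for the example.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily records: (date, patients seen on time, patients seen).
# All figures are invented for illustration.
start = date(2024, 1, 1)  # a Monday
counts = [
    (4, 5), (3, 6), (5, 5), (2, 4), (4, 6), (1, 2), (2, 3),
    (5, 6), (4, 5), (3, 4), (5, 7), (4, 4), (2, 3), (1, 1),
]
daily = [
    (start + timedelta(days=i), on_time, seen)
    for i, (on_time, seen) in enumerate(counts)
]


def weekly_totals(records):
    """Sum numerators and denominators by ISO week."""
    totals = defaultdict(lambda: [0, 0])
    for day, numerator, denominator in records:
        week = tuple(day.isocalendar()[:2])  # (ISO year, ISO week number)
        totals[week][0] += numerator
        totals[week][1] += denominator
    return dict(totals)


weekly = weekly_totals(daily)

# Daily: many points, but each one rests on only a handful of cases.
smallest = min(seen for _, _, seen in daily)
largest = max(seen for _, _, seen in daily)
print(f"Daily: {len(daily)} points, {smallest} to {largest} cases per point")

# Weekly: fewer points, each based on many more cases.
for (year, week), (num, den) in sorted(weekly.items()):
    print(f"{year}-W{week:02d}: {num}/{den} = {100 * num / den:.0f}%")
```

With these made-up numbers, two weeks give 14 daily points based on between 1 and 7 cases each, or 2 weekly points based on about 30 cases each. Which is more useful depends on how quickly you need to see change and how noisy the small daily percentages are.
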

Sampling

One of the best ways to avoid building an industry around measurement is to use sampling; it can reduce data burden, saving time and resource.

Depending on what you’re measuring you might want to think about different sampling approaches – see box below.  The most important thing is to make sure you’re getting data that tells you what you need to know. 

[Box: different sampling approaches]

We are often more used to collecting data for research or judgement.  In those cases, we are trying to say something about the whole population and need to make sure we have an acceptable level of accuracy.  In improvement, the most important thing is to learn, and to sample in such a way that we can understand any changes over time.
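
As a rough illustration of what drawing a sample can look like in practice, the sketch below shows two common approaches: a simple random sample and a systematic sample (every nth item). These two are chosen here as assumptions for the example and are not necessarily the approaches the guidance has in mind. The function names and the list of case notes are hypothetical.

```python
import random


def simple_random_sample(items, n, seed=None):
    """Pick n items at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(items, n)


def systematic_sample(items, step, start=0):
    """Take every `step`-th item, beginning at index `start`."""
    return items[start::step]


# Hypothetical list of 40 case-note identifiers for one week.
case_notes = [f"note-{i:03d}" for i in range(1, 41)]

random_five = simple_random_sample(case_notes, 5, seed=1)
every_eighth = systematic_sample(case_notes, step=8)

print(random_five)
print(every_eighth)
```

Whichever approach you use, applying the same one at every time point keeps the points on your chart comparable with each other.
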

Sample Size

The number in your samples will come down to balancing what’s practical with getting useful information. Generally, the bigger your sample, the less variability there will be in your results, so it will be easier to detect change.  

Other factors to consider are: 

  • The type of data (attribute or variables).  In general, you need bigger samples for attribute data.  If you are averaging a variables measure, then 5 could be a reasonable sample size.  This is not really enough if your measure is a percentage: in that case 20 would be best, and 10 should be considered a minimum.
  • The amount of common cause variation in your measure of interest.  If your data is quite consistent then your sample size can be smaller than if your variation is high.  An initial data collection with a larger sample size might be needed to understand this.
  • The availability of data, or how time consuming it is to obtain data.
  • The expected visibility of the data – if you will need to use the results to influence others, you might want to consider bigger samples than if it is just for your team.
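
To see why bigger samples give less variable percentages, one rough guide is the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below assumes independent observations and a hypothetical underlying rate of 50%. It is an illustration of the general principle, not a rule for choosing a sample size.

```python
import math


def percentage_standard_error(p, n):
    """Standard error, in percentage points, of a percentage
    estimated from n independent observations with true rate p."""
    return 100 * math.sqrt(p * (1 - p) / n)


# Hypothetical true rate of 50%, at several sample sizes.
for n in (5, 10, 20, 50):
    print(n, round(percentage_standard_error(0.5, n), 1))
# 5 22.4
# 10 15.8
# 20 11.2
# 50 7.1
```

Under these assumptions, moving from 5 to 20 observations per point halves the expected spread of the plotted percentages, which makes a real change easier to spot.
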

Collecting and storing your data

When starting to collect data, one of the first things people often try to do is adapt their IT systems.  This can take a long time, during which you have no data and no tests of change.  In general, if you can start with a simpler method, even just writing it down, you can get going sooner.  This also gives you an opportunity to find out whether the measures are going to be the right ones to tell you what’s happening.

Excel is a useful and widely available way to store and analyse your data.  If you are new to it, find someone who can help.  An Excel file is usually called a workbook, and it contains separate worksheets.  Plan how you are going to arrange your data, and keep your data files tidy.

Create a worksheet for each measure.  Put your dates or other identifiers in the first column, with a title in the top row.  Give each column a title that says what goes in it.  If your measure is a percentage or a rate, create a column for the numerator and a column for the denominator (don’t just calculate the % and throw away the underlying data).
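
The same layout works outside Excel too. As a minimal sketch, the example below reads a worksheet saved as CSV and calculates the percentage from the stored numerator and denominator. The column titles and figures are hypothetical, chosen only to illustrate the layout.

```python
import csv
import io

# Hypothetical worksheet for one measure, saved as CSV.
# Dates in the first column, then numerator and denominator.
worksheet = """Week commencing,Patients seen within 4 hours,Patients seen
2024-01-01,18,20
2024-01-08,15,20
2024-01-15,19,20
"""

rows = list(csv.DictReader(io.StringIO(worksheet)))

# Calculate the percentage from the raw counts rather than storing it.
percentages = []
for row in rows:
    numerator = int(row["Patients seen within 4 hours"])
    denominator = int(row["Patients seen"])
    percentages.append((row["Week commencing"], 100 * numerator / denominator))

for week, pct in percentages:
    print(f"{week}: {pct:.0f}%")
```

Keeping the raw counts means you can later combine periods correctly (add the numerators and add the denominators) instead of averaging percentages.
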

Stratification / Influencing factors

Stratification is the separation and classification of data according to selected factors, reflecting known or suspected differences in the process or outcome.  For example, you might stratify by shift, day or team, or by characteristics of service users such as sex, age group or reason for attending.  Make sure you record any information that you might need to stratify by.

A chart that shows fairly regular patterns can be a sign that the data should be stratified.  In that case you would plot a separate chart for each stratum, e.g. weekdays vs weekends.  If this could be a possibility, make sure you’ve kept a note of the date or day of the week.
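
As an illustrative sketch of stratifying by weekday and weekend, the example below splits dated observations into two groups that could then be charted separately. The waiting times are made up, and the helper name `stratum` is hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical daily waiting times in minutes; figures are invented.
observations = [
    (date(2024, 1, 1), 32), (date(2024, 1, 2), 30), (date(2024, 1, 3), 35),
    (date(2024, 1, 4), 31), (date(2024, 1, 5), 33), (date(2024, 1, 6), 52),
    (date(2024, 1, 7), 55), (date(2024, 1, 8), 29), (date(2024, 1, 9), 34),
]


def stratum(day):
    """Classify a date as 'weekend' (Saturday or Sunday) or 'weekday'."""
    return "weekend" if day.weekday() >= 5 else "weekday"


# Split the data so that each stratum can be plotted on its own chart.
strata = {"weekday": [], "weekend": []}
for day, minutes in observations:
    strata[stratum(day)].append((day, minutes))

for name, points in strata.items():
    average = mean(minutes for _, minutes in points)
    print(f"{name}: {len(points)} points, mean {average:.1f} minutes")
```

This only works because the date was recorded alongside each value, which is why it matters to capture possible stratification factors at the time of collection.
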

There may also be other factors that could influence your measures, and you should consider these before collecting your data.  Think about them up front and record the information you need, or try to keep them constant so that they do not introduce variation into your data.

For example, if you think time of day will influence your data, try to collect at the same time each day, or at least record the time so you can check.  Maybe things are different if it’s raining…  if so, record the weather or you will forget!