Medical statistics books download




















Hence it is critical that abstracts are well-written, accurate and unbiased. Sometimes, sub-group analyses are reported in abstracts as if they were the primary analysis. This is misleading, especially if the primary analysis is not reported.

If the results give rise to a new hypothesis, state this clearly.

Objectives: To test the hypothesis that nurse led follow-up programmes are effective and cost effective in improving quality of life after discharge from intensive care.
Design: A pragmatic, non-blinded, multicentre, randomised controlled trial.
Setting: Three UK hospitals (two teaching hospitals and one district general hospital).
Participants: patients aged 18 years or more were recruited after discharge from intensive care between September and October
Intervention: Nurse led intensive care follow-up programmes versus standard care.
Main outcome measure(s): Health related quality of life measured with the SF questionnaire at 12 months after randomisation. A cost effectiveness analysis was also performed.
Results: patients were recruited and completed one year follow-up. At 12 months, there was no evidence of a difference in the SF physical component score mean
Further work should focus on the roles of early physical rehabilitation, delirium, cognitive dysfunction, and relatives in recovery from critical illness. Intensive care units should review their follow-up programmes in light of these results.

These results agreed with those presented in the main body of the paper although the number followed up to 12 months was not explicitly stated in the paper.

The PRaCTICaL study of nurse led, intensive care follow-up programmes for improving long term outcomes from critical illness: a pragmatic randomised controlled trial. BMJ.

Research articles: introduction and methods sections

Introduction section

The introduction section gives the background to the current study and often includes details of previous research work in the subject area.

Extract from the Introduction of a research paper (1)

However, little has been published about factors linked to high service demand or about variations in demand across the country. Carlisle et al. Wass and Zoltie reported that increased use of accident and emergency departments is disproportionately high among elderly patients.

Methods section

The methods section should describe how the study was conducted.

Assuming power of 0. This week was chosen as having a low probability of extreme weather conditions and to avoid school and public holidays, both of which may affect the nature and volume of calls.

Data for and were already held in electronic form, having been taken from routine data forms (LA4s) by the LAS Management Information department. The following data were retrieved for each call: time and date, patient age, and patient sex.

Virtually all calls were made for a single patient, allowing us to calculate call rates using the resident population for Greater London. The earliest year, , was used as the baseline so that changes in and were each compared with This allowed us to maximise the use of the data available. All analyses were performed using Stata version 7.

References

1 Hall GM. Emergency call work-load, deprivation and population density: an investigation into ambulance services across England.

Flowcharts are a useful way of presenting these data (see Fig.). See also Presenting statistics: managing computer output. More recently, online publishing has enabled additional information to be made available on journal websites, to supplement the data in the printed journal version.

Figure 4. Reproduced from BMJ, Engebretsen et al.

Immobilisation versus immediate mobilisation after intrauterine insemination: randomised controlled trial.
Radial extracorporeal shockwave treatment compared with supervised exercises in patients with subacromial pain syndrome: single blind randomised study.

Although this section tends to include fewer statistics than the results section, a sound understanding of statistics is important in forming conclusions and critically evaluating the study methodology. Adjusted mean vocabulary scores of children with hearing impairment, assessed at the age of 5 years, were higher in children enrolled before 11 months of age in an early intervention program in Nebraska than in those enrolled at 11 to 23 months of age by 0.

Together with the lack of power this may explain why few studies have been able to show this dose-response relation.

Language ability after early detection of permanent childhood hearing impairment. N Engl J Med.
Hormonal contraception and risk of venous thromboembolism: national follow-up study.

Presenting statistics: managing computer output

Computer output

It is common practice to use a computer program to perform statistical analyses.

These often produce more results than are needed and so the relevant results need to be extracted and put into a new document in a new format for presentation.

Even if the computer only gives the relevant results, these may not be suitable for presentation because they are usually given to too many decimal places. The data in the example on p. The researchers used a Mann-Whitney U test (equivalent to the Wilcoxon two-sample rank sum test).

The computer output from a statistical test is shown with an arrow indicating the P value that can be reported. The text below the computer output illustrates how the results could be reported in a paper.

Many more examples for both SPSS and Stata are given in Presenting medical statistics, and each example also gives the commands needed to perform the particular analysis in the statistical program. More details about statistical programs in general are given in Chapter 5.

The data are presented as medians and interquartile range (IQR).

Results section

The median (IQR) number of portions of fruit and vegetable eaten per day at baseline among smokers was 3 (2, 4) and 3.

Presenting statistics: numerical results

Rounding

Computers usually give results to many decimal places and these should be rounded for presentation to make them easier to read and to avoid implying a falsely high level of precision. The following suggestions make numbers easy to read and absorb but also include all relevant information.

State the actual number as well unless it is obvious. Very small proportions may be easier to read if given as rates per or per 10 , etc. Simply give the numbers alone. Statistical programs give P values to many decimal places and these are not needed for reporting. For example if two P values are close to the 0. P values are probabilities, presented as proportions, and so it is unnecessary to report many decimal places as this obscures the meaning. The following are P values as given by a statistics program.

They can be rounded and reported as shown: 0.

Presenting statistics: tables and graphs

Introduction

Tables and graphs are a useful way of presenting the results of statistical analyses. When used in written reports, a table or graph should stand alone so that a reader does not need to read the text of the report or article to be able to understand it.

Table 4.

Common errors

Avoid graphs with missing zeros or stretched scales, which can exaggerate relationships (see Fig.).

Example of stretching the scale

Figure 4. By stretching the scale (second graph), the effect looks more dramatic. [Two graphs of the stillbirth rate, shown as a rate per births, drawn with different vertical scales.]

Randomised trial of high frequency oscillatory ventilation or conventional ventilation in babies of gestational age 28 weeks or less: respiratory and neurological outcomes at 2 years.

Statistics and the publication process

Introduction

Many journals in medicine and health research now include statistical review as part of the peer-review process in response to the increased use of statistics in these disciplines.

As a result of this move towards statistical review, journals have developed guidelines for authors, which include a section on the statistical aspects. These are discussed in more detail in the section Research articles: guidelines. In recent years more guidelines have been developed for other study designs. Some of these are listed in the next section, Research articles: guidelines (continued).

State the results in absolute numbers when feasible e. See also www. Up-to-date lists and links can be found on the website, and some of those available at the time of writing are as follows: TREND: non-randomized controlled trials www.

Statistical problems in medical papers

Common causes for rejection

Statistical review is wider in scope than might perhaps be expected. Alternatively, it may be that the methods used are not appropriate and that the analysis needs to be repeated using a more suitable method. If uncertain about statistical comments on a paper, it is worth talking to a statistician, who might be able to provide advice on how to respond.

Choosing and using statistical software for analysing data

Statistical software packages
Choosing a package
Using a package
Examples of using statistical packages
Using spreadsheets for analysis
Transferring data between packages
Common packages

Introduction

In this chapter we will describe the main features of statistics packages, what they do, and what they do not do.

We will describe how we as users interact with packages, how we transfer data between packages, and how to decide which package to use. There are many statistical analysis computer packages and programs on the market and this chapter will not provide a review of what is available.

Instead we will discuss the main issues that drive the choice of package to use.

Statistical software packages

What is a statistical package?

A statistical analysis package is a suite of computer programs that can be used to carry out manipulations of data and perform statistical analyses. Most of them have a user-friendly interface and do not require the user to be an expert in statistical programming. Many statistical packages are produced by commercial companies and can be purchased from suppliers or bought online.

There are increasing numbers of programs and packages available free on the Internet, although the onus is on the user to check that they come from a reputable source, as they may not have been checked to the same extent as commercial programs.

What do statistical packages do?

How packages work

Most statistical packages in common use among medical researchers are either menu-driven or command-driven. Menu-driven packages provide options for the user to select, in menus that are usually hierarchical in design.

This has the advantage that the user does not have to remember commands or computer syntax. Command-driven packages work by the user entering a particular command, which will execute the required process or statistical method. This method is usually quicker than using menus, but it does require the user to remember and enter the actual commands. If the syntax is entered incorrectly, for example with a typo, the command will not run. That is, the results come back immediately after each command is entered or menu selected.

Packages used over networks may run online or off-line.

Costs

Some statistical packages run under a licensing arrangement whereas others are sold with perpetual licences. For most commercial packages, updated versions are regularly supplied by the vendor to allow new statistical procedures to be incorporated or existing procedures to be extended. These are usually cheaper for existing customers.

Some software is available on an institutional licence. Prices for individual and licence copies may be a little less for academic institutions than commercial institutions, and greater discounts may be available for students. An increasing number of statistical packages can be bought on the Internet and some allow you to download the full version and try it out for free for a few days.

Scope of packages

The scope varies hugely, with some packages providing a very wide range of utilities. Some, such as SPSS (4), are sold as a basic package with a number of specialized add-ons. Other packages, such as Stata (5), have user-written procedures that are available free online to licence holders. Stata also has several different versions which are priced according to the size of the dataset that the package will analyse.

Choosing a package

Introduction

There are a number of things to consider when choosing a statistical package.

What is your budget? Are you looking for a free package? Do you have colleagues who already use a particular package and can provide support? Do the authors or marketers of the package provide support if you encounter problems when using it? Does it have any site licences or purchasing agreements? Do you want a menu-driven package or a command-driven one? These may require a separate package or a separate add-on to an existing package.

Many packages produce good graphics but only a few claim to produce state-of-the-art graphs. Packages tend to have an upper limit for the amount of data that they can process. This may depend on the package or on the computer used, or both. Some packages sell different versions, which can process different amounts of data with the larger versions costing more. Is this easy to do? Which operating system will you be using?

Using a package

Introduction

Statistical packages are wonderful tools that enable us to perform complex calculations easily. They facilitate statistical analyses that would previously have been impossible to do by hand or with a calculator, and for which the details may be technically challenging. There is, however, a real danger of inadvertently conducting inappropriate analyses, since these packages make it possible to use statistical methods we may not fully understand.

This can result in a vast set of results that have no logical thread and are impenetrable. Many of us have succumbed to this danger at times since the computer is so intoxicating. For these reasons some general advice on using statistical packages follows, which will help avoid these pitfalls and improve the quality of your statistical analyses.

Plan the analysis

It is always good statistical practice to plan the analysis beforehand.

This applies globally to a whole project and also to individual analyses within a project. Planning helps to keep us on track and avoid unnecessary analyses or data dredging that can lead us to make wrong inferences. It is also important to check that the statistical analyses planned are appropriate, that any distributional assumptions are met and the analyses answer the questions that are intended.

The statistical package may still perform analyses which are invalid because the sample size, or distributional model, or design assumptions do not hold. Hence we need to be careful.

Extracting the relevant results

Many packages produce lots of output, some of which is relevant for a given situation and some of which is not. It is necessary to know what is appropriate so that we can present the results later in a concise format.

It is particularly important to report the numbers of observations included in each analysis.

Missing data

All research has some degree of missing data and it is important to be aware of how the package handles it. For example, missing data are sometimes denoted by a blank cell or a dot. For example, multiple regression usually requires data to be present on all variables included in an analysis and so the number of subjects included in a multiple regression analysis may be much less than the total sample size if many subjects have one or more missing values for some variables.
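As a minimal sketch of why the number of subjects can shrink in a multiple regression, the following Python counts the complete cases for two sets of variables. The records, the variable names, and the use of `None` as the missing-value code are all invented for illustration; each statistical package has its own conventions.

```python
# Hypothetical records; None marks a missing value (an assumption for this sketch).
records = [
    {"age": 34, "bmi": 22.1, "sbp": 118},
    {"age": 51, "bmi": None, "sbp": 135},
    {"age": None, "bmi": 27.4, "sbp": 142},
    {"age": 45, "bmi": 30.2, "sbp": None},
    {"age": 29, "bmi": 24.8, "sbp": 121},
]


def complete_cases(rows, variables):
    """Return the rows that have a value for every listed variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


n_total = len(records)
n_model_a = len(complete_cases(records, ["age", "sbp"]))
n_model_b = len(complete_cases(records, ["age", "bmi", "sbp"]))

print(f"total sample size: {n_total}")
print(f"complete cases for age + sbp: {n_model_a}")
print(f"complete cases for age + bmi + sbp: {n_model_b}")
```

In this made-up dataset the two analyses use three and two subjects respectively out of five, which is why the number of observations should be reported for each analysis.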

Also different multiple regression analyses may have different patterns of missing data and so may be based on different sets of individuals (see Missing data).

It may be necessary to export data into a separate graphics package to improve the quality of the graphs.

Format

We need to make sure that the data are in the format that the package accepts and to ensure that, for particular analyses, variables are coded appropriately.

Examples of using statistical packages

Chi-squared test

The examples that follow show the computer output results from a chi-squared test done in three commercial statistical packages: SPSS (1), Stata (2), and SAS. The test examines whether there is any evidence for a relationship between smoking during pregnancy and low birthweight. For explanation of the chi-squared test, see Chi-squared test.

SPSS output (columns: Value, df, Asymp. Sig., Exact Sig.; footnote: minimum expected count)

All give similar P values. This analysis has also given several versions of the chi-squared test.

Stata output

The values are the same as given by SPSS.

SAS output

This is similar to the Stata output. SAS additionally states the number of observations used in the analysis and the number of subjects with missing data.

Comparisons between outputs

All three give the same test statistics and P values for the main chi-squared test. Varying additional test statistics are given. The layout is slightly different in the three packages. For all three sets of results, the output is not suitable for reporting as it is. All three packages, and most packages in practice, give more information than is needed for reporting. The appropriate results should be extracted and reported either in text, or if part of a set of analyses, in a table (see example below, and also Peacock and Kerry (1)).
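To make the calculation behind this output concrete, here is a minimal Python sketch of the Pearson chi-squared test for a 2x2 table, without a continuity correction. The counts are invented and are not the data analysed in the text, and the function name is hypothetical. The P value uses the fact that, with 1 degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(x/2)).

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Returns (test statistic, P value) on 1 degree of freedom.
    """
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, NOT the data analysed in the text:
# rows = smoker / non-smoker, columns = low / normal birthweight.
counts = [[20, 10], [10, 20]]
chi2, p = chi_squared_2x2(counts)

# Report rounded values only, together with the number of observations used.
print(f"n = {sum(map(sum, counts))}, chi-squared = {chi2:.2f}, df = 1, P = {p:.3f}")
```

The single printed line is closer to what would be reported than the full output of a package: the number of observations, the test statistic, the degrees of freedom, and a rounded P value.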

These three packages have been shown as they are familiar to us and to illustrate what you might see when you use a package. There are many other packages which can also be used (see Common packages).

Example: presenting results from a chi-squared test

Table 5.

In this example, the results could be combined with those for other risk factors for low birthweight, such as alcohol and illicit drugs (data not shown here). Underneath the table is an example of the accompanying text that could appear in a document reporting the results.

Table 5.

Description

There was a higher prevalence of smoking during pregnancy among mothers with low birthweight babies, compared with those with normal weight babies.

Using spreadsheets for analysis

Spreadsheets can be used for data entry and data analysis by means of their in-built routines. The statistical methods available are limited but there are also add-ons available for purchase online that will extend the scope of the spreadsheet. These can be found by searching on the Internet. We show below the results of doing the chi-squared analysis shown in the section Examples of using statistical packages.

XLSTAT output

Test of independence between the rows and the columns (Chi-square): Chi-square (Observed value) 3.
Ha: There is a link between the rows and the columns of the table.
The risk to reject the null hypothesis H0 while it is true is 5.

Comment on XLSTAT output

The test statistic and P value were, as expected, the same as for the other three packages shown in the section Examples of using statistical packages. In medical statistics we do not usually use this interpretation since it implies that if P is greater than 0.

In fact, a P value greater than 0. In this case P is only just greater than 0.

Analyse-it

The same analysis was repeated using Analyse-it. The package has a report facility but this was not available in the free evaluation version.

Comment on Analyse-it output

The test statistic and P value were the same as found in the other packages.

General comments

Both packages were easy to download and use for a chi-squared test but a full review has not been undertaken for either add-in. Since these and other similar add-in packages can be tried for free, it is easy to do your own evaluation before buying.

Absence of evidence is not evidence of absence.

It is worth thinking about this at the outset to reduce the possibility of things going wrong.

Not all packages can do this. Using a transfer program will usually mean that variable names and labels are also transferred. Other methods listed above may not do this. Has it worked properly? Have numerical data transferred correctly and are they in the right format? Check all the data if possible or a representative sample. More packages, particularly free ones, and specialized packages can be found in the same way.

It is worth trying out different ones.

London: BMJ Books.

Why summarize data?

Introduction

In this chapter we describe types of quantitative and categorical data and show how these different types of data can be summarized numerically and in graphs. We give worked examples of how to calculate mean, median, standard deviation, and interquartile range, and give examples of displaying data in graphs. Often all that is needed is a count of the data items for each variable or question to check for any missing items.

For example, a particular question in a self-completed questionnaire may frequently be missed because it is on another page. This can be picked up early by simply counting the number of replies to each question.
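Counting the replies to each question takes only a few lines. The sketch below uses invented questionnaire data, with `None` standing for a question that was left blank; both the data and the question names are assumptions made for illustration.

```python
# Hypothetical questionnaire responses; None means the question was left blank.
responses = [
    {"q1": "yes", "q2": 3, "q3": "no"},
    {"q1": "no", "q2": 5, "q3": None},
    {"q1": "yes", "q2": None, "q3": None},
    {"q1": "yes", "q2": 2, "q3": None},
]

questions = ["q1", "q2", "q3"]
reply_counts = {
    q: sum(1 for person in responses if person[q] is not None) for q in questions
}

for q in questions:
    missing = len(responses) - reply_counts[q]
    print(f"{q}: {reply_counts[q]} replies, {missing} missing")
```

Here the third question stands out with only one reply out of four, the kind of pattern that would prompt a look at the layout of the questionnaire.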

Data checking and data cleaning

The aim of this is to make sure that the data are correct on the computer record. Errors can arise if a research subject mis-reports information or the researcher mis-records that information. Further errors may be introduced when the data are transferred onto a computer. Checking the range of each variable will highlight values outside the expected range, but errors that are still within range will not be found in this way.
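A range check of this kind might be sketched as follows. The plausible limits, variable names, and records are hypothetical; in practice the limits would come from the study protocol.

```python
# Hypothetical plausible ranges; real limits should come from the study protocol.
PLAUSIBLE_RANGE = {"age_years": (18, 110), "systolic_bp": (60, 260)}

# Hypothetical data: the age of 230 is a likely keying error.
data = [
    {"id": 1, "age_years": 42, "systolic_bp": 128},
    {"id": 2, "age_years": 230, "systolic_bp": 117},
    {"id": 3, "age_years": 67, "systolic_bp": 45},
    {"id": 4, "age_years": 76, "systolic_bp": 131},
]


def out_of_range(rows, limits):
    """List (id, variable, value) for every value outside its plausible range."""
    flagged = []
    for row in rows:
        for variable, (low, high) in limits.items():
            value = row[variable]
            if value is not None and not (low <= value <= high):
                flagged.append((row["id"], variable, value))
    return flagged


for record_id, variable, value in out_of_range(data, PLAUSIBLE_RANGE):
    print(f"check record {record_id}: {variable} = {value}")
```

Records 2 and 3 are flagged. If the age of 76 in record 4 had been mistyped from 67, the check would not notice, because both values lie within the plausible range.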

For example, the results of a study conducted in one country may apply in another country if both countries have similar baseline characteristics.

For example, before doing any sort of regression analysis with several variables, simple descriptive analyses are needed for the variables involved to determine the individual inter-relationships.

Types of data

Quantitative and categorical data

In order to know what sort of statistical analysis is appropriate, it is important to know what type of data we are handling.

There are several ways of classifying data, which are discussed in this chapter, but the simplest is to consider data as either quantitative or categorical (see Quantitative data). The term "qualitative data" is rather ambiguous as it can be confused with those data collected from a qualitative study, such as text obtained from in-depth interviews.

Data from purely qualitative studies are analysed using non-statistical methods and are not considered in this book.

A variable

A variable is a quantity that is measured or observed in an individual and which varies from person to person.

For example, blood pressure is a variable because blood pressure varies from person to person. Another example is blood group, which also varies from person to person. Note that variables can be derived when the research subject is an organizational unit rather than a person, such as when studying the use of operating theatres in a set of hospitals and calculating the proportion of time that they are in use in each hospital.

The concept of variables is discussed further in Chapter 7 (Independence: data and variables).

Statistic

A statistic is any quantity that is calculated from a set of data. For example, mean blood pressure calculated in a group of subjects is a statistic. Another example is the proportion of people who are overweight in a sample.

A statistic summarizes the data in some sense. There are many different statistics that can be calculated from data and the choice of which to use is driven partly by the type of data and partly by the purpose of the study. In many cases several statistics will be calculated from the same set of data. A simple example of this is if we calculate both the minimum and maximum age of subjects in a study: these are two different statistics, both of which are useful summary measures.

Interval scales

On an interval scale, differences between values at different points of the scale have the same meaning.

Ratio scales

Data can be regarded as on a ratio scale if the ratio of two measurements has a meaning. For example, we can say that twice as many people in one group had a particular characteristic compared with another group and this has a sensible meaning.

In contrast, temperature in degrees Celsius or Fahrenheit is not ratio data because we cannot say that one temperature is twice as hot as another: when a temperature doubles in degrees Celsius, it does not double in degrees Fahrenheit.
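A short calculation makes the point. The two temperatures below are chosen purely for illustration, and the function name is our own.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Illustrative temperatures chosen for this sketch.
cool_c, warm_c = 10.0, 20.0
cool_f, warm_f = celsius_to_fahrenheit(cool_c), celsius_to_fahrenheit(warm_c)

print(f"ratio in Celsius:    {warm_c / cool_c:.2f}")  # 2.00
print(f"ratio in Fahrenheit: {warm_f / cool_f:.2f}")  # 1.36
# Differences, unlike ratios, agree between the scales up to the factor 9/5.
print(f"difference: {warm_c - cool_c:.0f} C = {warm_f - cool_f:.0f} F")
```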

This is of course because of the arbitrary zero on the scale for temperature.

Ordinal data

Quantitative data are always ordinal: the data values can be arranged in a numerical order from the smallest to the largest. Categorical data may also have an inherent ordering and so be ordinal, such as stage of disease.

For example, gestational age of babies is often reported in whole weeks, such as 38 weeks, and so appears to be discrete.

It is however continuous because it could be reported to a greater degree of accuracy, for example as a decimal.

Apparently similar gaps between categories may not have the same clinical meaning.

Similarly, calculating a mean stage of cancer for a group of individuals would be nonsensical. Where categorical data are coded with numerical codes, it might appear that there is an ordering but this may not necessarily be so. It is important to distinguish between ordered and non-ordered data because it affects the analysis.

Dichotomous data

This is where there are only two classes and all individuals fall into one or other of the classes. These data are also known as binary data.

Categorizing continuous data

It is possible to reclassify continuous data into groups, perhaps for ease of reporting. For example, it is common to report birthweight in bands, giving the numbers of babies who fall into each birthweight band. In addition, the nature of any relationships may be masked.

For example, if the relationship was curved, this may be weaker if the data were categorized and if the relationship was U-shaped, categorization may totally obscure it. The analysis may be more straightforward and more meaningful if the data are grouped.
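Grouping birthweight into bands, as described above, can be sketched as follows. The cut points, labels, and birthweights are all hypothetical and were chosen only to show the mechanics.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical band limits in grams; each band includes its lower limit.
CUT_POINTS = [2500, 3000, 3500, 4000]
LABELS = ["<2500", "2500-2999", "3000-3499", "3500-3999", ">=4000"]


def birthweight_band(grams):
    """Return the label of the band that a birthweight falls into."""
    return LABELS[bisect_right(CUT_POINTS, grams)]


# Hypothetical birthweights in grams.
birthweights = [2310, 2980, 3120, 3450, 3499, 3500, 3760, 4210]
band_counts = Counter(birthweight_band(w) for w in birthweights)

for label in LABELS:
    print(f"{label:>10}: {band_counts[label]}")
```

The example also shows what is lost: babies of 3499 g and 3500 g fall into different bands although they differ by 1 g, whereas babies of 3120 g and 3499 g are treated as the same.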

Summarizing quantitative data

Continuous data

Continuous data can be summarized in several different ways and many of these are either a measure of the centre of the data distribution or a measure of the variability of the data. This mean is known as the arithmetic mean. Two other types of mean, the geometric mean and the harmonic mean, are described in the section Geometric mean, harmonic mean, mode.

Median

This is the middle value when the data are arranged in ascending order of size. If there is an odd number of values in the sample then the median will be the value with the same number of values both bigger than it and smaller than it. If there is an even number of values, there will be two middle values and the median will be the mean of the two.
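The rule for odd and even sample sizes translates directly into code. This is a minimal sketch with invented values; Python's standard library offers `statistics.median`, which follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical values.
print(median([7, 3, 9, 1, 5]))      # odd n: 5
print(median([7, 3, 9, 1, 5, 11]))  # even n: (5 + 7) / 2 = 6.0
```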

Standard deviation

This indicates how dispersed the data are and is a measure of the average difference between the mean and each data value. It is calculated by taking the square root of the variance.

The variance is calculated by summing the squared differences between the overall mean and each value and then dividing by the number of values minus one.
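The two steps just described, the variance with a divisor of n - 1 followed by its square root, can be sketched as below. The data values are invented, and the function names are our own.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical values.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"mean = {sum(sample) / len(sample):.2f}")
print(f"variance = {sample_variance(sample):.3f}")
print(f"SD = {sample_sd(sample):.3f}")

# Cross-check against the standard library, which also divides by n - 1.
assert math.isclose(sample_sd(sample), statistics.stdev(sample))
```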

Since we virtually always have a sample, the SD is obtained by dividing by n - 1 because it can be shown to give a more accurate estimate of the population standard deviation.

Range

This is the difference between the smallest and largest value and is usually expressed as the minimum and maximum.

Sometimes the actual difference between the two extremes is presented, but this is not a good idea as it does not show the extremes.

Percentiles (centiles) in general

The median and quartiles are examples of percentiles: points which divide the distribution of the data into set percentages above or below a certain value.

The median is the 50th centile, the lower quartile is the 25th and the upper quartile is the 75th. Although these are the most common centiles that we calculate, any percentile can be calculated from continuous data. For some data, a different percentile may provide a useful summary. For example, child growth charts show several different centiles calculated from the general population to allow detection of children with poor growth.
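As a sketch, Python's standard library can compute the quartiles with `statistics.quantiles`. The values are invented. Different packages interpolate between data values in slightly different ways, so the quartiles for a small dataset may not match exactly between programs; the default method of `statistics.quantiles` is used here and is not necessarily the formula used in this book.

```python
import statistics

# Hypothetical values.
values = [1, 2, 3, 4, 5, 6, 7]

# n=4 cuts the distribution into quarters and returns the three cut points:
# the lower quartile, the median, and the upper quartile.
lower_q, med, upper_q = statistics.quantiles(values, n=4)

print(f"25th centile (lower quartile): {lower_q}")
print(f"50th centile (median):         {med}")
print(f"75th centile (upper quartile): {upper_q}")
print(f"interquartile range: {lower_q} to {upper_q}")
```

Other centiles can be requested by changing `n`; for example, `n=10` returns the nine deciles.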

The formula is given below (see Calculation of median, interquartile range).

Standard deviation

Calculation of median, interquartile range

The data

These are as given in Calculation of mean, SD.

Geometric mean, harmonic mean, mode

Introduction

The mean that we calculated previously (see Summarizing quantitative data) is the arithmetic mean. This gives a measure of the middle of the distribution when the data follow a reasonably symmetrical distribution, but when the data are skewed it will not represent the middle.

Most non-symmetrical data distributions have a positive skew, that is, the tail of the distribution is longer on the right-hand side.

Geometric mean

This is calculated using log-transformed data: each data value is replaced by its logarithm to base e. The arithmetic mean is then calculated on the new log-transformed scale and this is back-transformed using the exponential transformation to give a mean that is in the same units as the original data.
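The log, average, back-transform sequence can be sketched in a few lines. The data are invented and deliberately skewed; the function name is our own, although recent versions of Python also provide `statistics.geometric_mean`.

```python
import math
import statistics


def geometric_mean(values):
    """Back-transformed arithmetic mean of the natural logs (values must be > 0)."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)


# Hypothetical positively skewed data.
skewed = [1, 2, 4, 8, 16]
print(f"arithmetic mean = {statistics.mean(skewed):.1f}")  # 6.2
print(f"geometric mean  = {geometric_mean(skewed):.1f}")   # 4.0
print(f"median          = {statistics.median(skewed)}")    # 4
```

For these skewed values the geometric mean sits at the median, while the arithmetic mean is pulled towards the long right-hand tail.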

It can be used when the data are highly positively skewed, but it is not commonly seen in practice.

Mode

The mode is the value which has the greatest frequency. It has limited usefulness for continuous data but is useful for categorical data where it indicates the most common category.

Example

Figure 6. The distribution is positively skewed. These data are used to illustrate the calculation of geometric and harmonic means.

Calculation of geometric and harmonic means

As this is a large dataset, we only show a few values before and after transformation of the data to illustrate the calculations.

To calculate the geometric mean: 1. Note that the geometric mean is smaller than the arithmetic mean and is close to the median value, 20 g. The harmonic mean is smaller still.

Choosing a summary measure for quantitative data

Introduction

It is usually useful to present more than one summary measure for a set of data and we give some suggestions as to what summary measures will be useful in different situations. If the data are going to be analysed later using methods based on means then it makes sense to present means rather than medians.

If the data are skewed they may need to be transformed before analysis and so it is best to present summaries based on the transformed data, such as geometric means. See notes on transformations (Transforming data).

This may be particularly useful when comparing two groups where the medians are the same but the outer tails of the distributions are different. If analyses are planned which are based on means then it makes sense to be consistent and give standard deviations.

In this case the untransformed standard deviation can be given or another measure of spread. This is discussed further in Chapter 8 (Transforming data).

Summarizing categorical data

Unordered categories (nominal data)

These can be summarized using the frequencies in each category together with either the overall proportions or percentages. The choice of whether to use proportions or percentages is a personal one although percentages are more commonly seen.

The complete set of frequencies is the frequency distribution. An example is given in Table 6.

Table 6.

Ordered categories (ordinal data)

These can also be summarized by frequencies and percentages as above but in addition we can calculate cumulative frequencies and percentages. This can be useful to show the percentage below a certain cut-off.
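A frequency table with cumulative columns might be built as in this sketch. The categories and counts are invented; the important detail is that the categories are processed in their inherent order rather than alphabetically or by size.

```python
from collections import Counter

# Hypothetical ordered categories and observations.
CATEGORIES = ["stage I", "stage II", "stage III", "stage IV"]
observations = (
    ["stage I"] * 8 + ["stage II"] * 6 + ["stage III"] * 4 + ["stage IV"] * 2
)

counts = Counter(observations)
total = len(observations)

rows = []
cumulative = 0
for category in CATEGORIES:  # iterate in the inherent order of the categories
    cumulative += counts[category]
    rows.append(
        (category, counts[category], 100 * counts[category] / total,
         cumulative, 100 * cumulative / total)
    )

print(f"{'category':<10}{'n':>4}{'%':>7}{'cum n':>7}{'cum %':>7}")
for category, n, pct, cum_n, cum_pct in rows:
    print(f"{category:<10}{n:>4}{pct:>7.0f}{cum_n:>7}{cum_pct:>7.0f}")
```

The cumulative percentage column answers questions such as what percentage of subjects are at stage II or below, which is 70% in this made-up example.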

Cross tabulations

It is often useful to tabulate one categorical variable against another to show the proportions or percentages of the categories of one variable by the other, for example, see Table 6.

Section 2 - Snapshot of Health and Well-being in England, 8. London, Crown Publications.

In a histogram, the rectangles have heights or areas that are proportional to the frequencies in these categories. The vertical (y) scale is the frequency per interval (see Fig.). Note that if the widths of the bins are the same then the height of each rectangle is proportional to its frequency, but if they are not the area indicates the frequency.
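The relationship between height, width, and frequency for unequal bins can be checked with a small calculation. The bins and frequencies below are hypothetical.

```python
# Hypothetical bins of unequal width: (lower limit, upper limit, frequency).
bins = [(0, 10, 20), (10, 20, 30), (20, 40, 40), (40, 80, 40)]

heights = []
for lower, upper, frequency in bins:
    width = upper - lower
    height = frequency / width  # frequency per unit of the horizontal scale
    heights.append(height)
    area = height * width       # the area recovers the frequency
    print(f"{lower:>2}-{upper:<2} width={width:>2} "
          f"frequency={frequency:>2} height={height:.1f} area={area:.0f}")
```

The last two bins contain the same number of observations, but the wider bin is drawn half as tall, so that equal areas represent equal frequencies.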

It is best where possible to keep the width the same for all bins.

Stem and leaf plot

A stem and leaf plot is a graph that shows the main features of a set of data. In the stem and leaf plot the numbers themselves are used to demonstrate the shape of the distribution. It may be used instead of a histogram for small datasets or alongside to show patterns of occurrence for certain numbers (see Fig.). The plot provides a useful summary of data structure while at the same time showing other characteristics such as a tendency for certain trailing digits to be more common than others (so called digit preference).

We can see here that cm and cm both occur twice, cm occurs six times, and so on. In some datasets where observers are reporting measurements to the nearest 5 or 10, there will be an excess of these trailing digits. That does not appear to be the case in these data but is a common feature of blood pressure data.
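A stem and leaf plot is simple enough to build by hand or in a few lines of code. The sketch below assumes non-negative whole numbers, takes the last digit as the leaf and the remaining digits as the stem, and uses invented heights rather than the data shown in the figure.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem and leaf plot for non-negative whole numbers.

    The stem is the value without its last digit; the leaf is the last digit.
    """
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(str(leaf))
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):  # keep stems with no leaves
        lines.append(f"{stem:>3} | {''.join(leaves.get(stem, []))}")
    return lines


# Hypothetical heights in cm, NOT the data shown in the figure.
heights_cm = [152, 155, 158, 160, 160, 161, 163, 165, 165, 165, 168, 170, 172, 185]
print("\n".join(stem_and_leaf(heights_cm)))
```

Each row reads as a horizontal bar of a histogram, and repeated leaves, such as the three 5s on the 16 stem here, show at a glance which values recur.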

It illustrates how useful a box and whisker plot can be to display data in groups. Note that an outlier is indicated by a separate circle outside the plot.

This is a height of cm which is quite small, but was found to be a correct value and not an error.

Medical Statistics Made Easy has been a perennial bestseller since the first edition was published; it is consistently a #1 bestseller in medical statistics on Amazon. It is widely recommended on a variety of courses and programmes, from undergraduate medicine through to professional medical qualifications.

It is a book of key statistics principles for anyone studying or working in medicine and healthcare who needs a basic overview of the subject. It is ideal for non-statisticians who need to understand how statistics are used and applied in medicine and medical research. Using a consistent format, the authors describe the most common statistical methods in turn and then rate them on how difficult they are to understand and how common they are.

The worked examples that demonstrate the statistical method in action have been updated to include current articles from the medical literature and now feature a wider range of medical journals.


