“Is it dishonest to remove outliers and/or transform data?”
An outlier is a data point that is extremely different from the others in the sample. By “extremely”, it is meant that the point lies several standard deviations away from the sample mean and follows a completely different pattern from the other data points. Outliers can often be spotted because one participant's scores are highly inconsistent with those of the other participants. Outliers can undermine the experiment and affect the results produced, which in turn affects the write-up of the test. For example, if ten people read 20 pages per hour of some textbook and one person reads 200 pages per hour, the average pages per hour will be thrown off by the one person who seems to have far above-average skills. An outlier could even make the difference between the hypothesis being supported or not. But is it dishonest to remove outliers from the data completely?
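To put rough numbers on the reading example, here is a minimal sketch in Python using only the standard library. The variable names are made up for illustration, and the choice of the sample standard deviation for the z-score is an assumption, not something the example itself specifies.

```python
from statistics import mean, median, stdev

# Ten readers at 20 pages/hour plus one reader at 200 pages/hour,
# as in the textbook example.
pages_per_hour = [20] * 10 + [200]

mean_all = mean(pages_per_hour)      # pulled upwards by the fast reader
median_all = median(pages_per_hour)  # not affected by the single extreme value
sd_all = stdev(pages_per_hour)       # sample standard deviation (assumed choice)

# How many standard deviations the fast reader sits above the sample mean.
z_fast_reader = (200 - mean_all) / sd_all

# The same average with the fast reader left out.
without_fast_reader = [x for x in pages_per_hour if x != 200]
mean_without = mean(without_fast_reader)

print(f"mean with outlier:    {mean_all:.2f}")
print(f"median with outlier:  {median_all}")
print(f"mean without outlier: {mean_without}")
print(f"z-score of outlier:   {z_fast_reader:.2f}")
```

With these figures the mean rises from 20 to about 36 pages per hour because of one person, while the median stays at 20. The fast reader sits roughly three standard deviations above the mean, which fits the "several standard deviations" description.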
Many scientists do not believe that outliers should be removed from the data. Removing them affects the reliability of the experiment: should other scientists wish to carry out their own versions, their results would simply not match those of the initial test. It also makes it easier to tamper with data, because scientists who see that the data will not support the hypothesis could easily remove certain data points so that it is then supported.
However, other scientists believe that removing outliers should be allowed, and that this can even be done before the data is analysed. A participant could be removed from the experimental condition if it is seen that they are not following the instructions carefully or that they fail to engage in the experiment at all. Not removing these participants affects the validity of the results. Cleaning the data before the final write-up helps ensure that the data represents the construct properly.
In conclusion, I believe that it is okay to remove outliers from the experiment if this is done before the data is analysed, based on the observation of an unengaged or despondent participant. However, I do not believe that outliers should be removed once the experiment and write-up have begun. This is because I believe it affects the reliability of the experiment too much: when other scientists carry out the test again, the same results will not be found.