Monday, February 23, 2009
Surveys are done to generalize about a population, whereas case studies look at more discrete factors of particular individuals and so cannot be generalized.
n = sample population
N = target population
You have to be specific whenever you define/describe population
What about confidence limits? They require a random sampling technique.
How do we do a true random sample? Number the target population, then choose people at random using some method. You can also use a system to randomize (e.g. every fifth student)--systematic random sampling.
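A minimal Python sketch of the "every fifth student" idea (the population size and interval here are hypothetical, chosen just for illustration):

```python
import random

def systematic_sample(population, k):
    """Systematic random sampling: pick a random start in the first
    interval, then take every k-th member after that."""
    start = random.randrange(k)   # random starting point among the first k
    return population[start::k]   # every k-th person from the start

# Number the target population (here, 100 hypothetical students)...
students = list(range(1, 101))
# ...and take every 5th student from a random start.
sample = systematic_sample(students, 5)
print(len(sample))  # 20 students
```

Because the start is random, each run yields a different sample, but the spacing (every k-th person) and sample size stay fixed.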
Cluster sampling--sampling intact groups (clusters) of individual units within a large population.
n : k ratios--the k's are variables; for every k, the rule of thumb is to have 10 n's (subjects) to get reliable data. Formulas include Spearman-Brown.
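The notes name the Spearman-Brown formula without stating it; the standard "prophecy" form predicts reliability when a test is lengthened or shortened by a factor of n. A small sketch (the 0.70 starting reliability is a hypothetical value):

```python
def spearman_brown(r, n):
    """Spearman-Brown prophecy formula: predicted reliability of a test
    lengthened by a factor of n (n > 1) or shortened (n < 1),
    given current reliability r."""
    return (n * r) / (1 + (n - 1) * r)

# Doubling a test whose current reliability is 0.70:
print(round(spearman_brown(0.70, 2), 3))  # 0.824
```

The intuition: adding more parallel items raises reliability, but with diminishing returns as r approaches 1.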
Wednesday, February 18, 2009
Notes taken by Alicia
Qualitative descriptive research (case studies) - Ultimate goal is to improve practice. This presupposes a cause/effect relationship between behavior and outcome; however, this method will ONLY let you hypothesize about variables and describe them. When you move to show correlation among them, you’re doing quantitative work. But remember, correlation does not mean causation.
With these studies, you can examine factors that *might* be influencing behaviors, environments, circumstances, etc. You cannot prove cause/effect for certain.
Purpose - Case studies identify and provide evidence to support the fact that certain parts/variables exist, that they have construct validity (i.e. people agree these are the parts). Qualitative-descriptive method is a necessary precursor to quantitative research: you always need to operationalize variables–define them.
Subject selection criteria -
- Begin with a theory, which already has construct validity
[in UXD, a text for this would be Universal Principles of Design, since it is ripe for application to projects]
- Subjects need to be representative of the thing under study so that it becomes possible to generalize findings to a wider community.
Data collection techniques -
- Content analysis: coding for patterns (i.e. pattern recognition) across subjects
- Think/talk aloud protocol
**The success of this methodology hinges on inter-rater reliability, that measure of agreement between coders. [The best example of how to do inter-rater reliability in composition is "The Pregnant Pause: An Inquiry Into the Nature of Planning" by Linda Flower and John R. Hayes.]
When there’s low inter-rater reliability, it could be because…
- All else being equal, the raters weren’t well trained
- Your categories were not operationally defined to a sufficient degree. Categories should be as concrete as possible.
- The raters themselves are flawed: they are not experts; they are ideologically opposed to the study or to potential findings; they are fatigued. When critiquing a coding study, it’s apt to question who the raters are.
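Inter-rater reliability for coding studies like this is often reported as Cohen's kappa, which corrects raw agreement for chance. A self-contained sketch, assuming two hypothetical coders categorizing ten think-aloud segments (the data and category labels are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for the
    agreement expected by chance given each rater's category frequencies."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical coders labeling ten protocol segments:
a = ["plan", "plan", "revise", "plan", "revise",
     "plan", "plan", "revise", "plan", "revise"]
b = ["plan", "plan", "revise", "revise", "revise",
     "plan", "plan", "revise", "plan", "plan"]
print(round(cohens_kappa(a, b), 2))  # 0.58 -- raw agreement is 0.80, but chance-corrected agreement is lower
```

Note how 80% raw agreement shrinks once chance agreement is subtracted--one reason concretely defined categories matter.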
Saturday, February 14, 2009
Foundational terms in empirical research:
Measurement - the process of quantifying variables.
- Its 2 parts are qualitative and quantitative.
- 2 options are to select:
- measurements which already exist and have proven reliability and validity.
- Measures which exist can be direct or indirect.
- measurements which don’t exist. In this case you have to show reliability and validity.
Variables - That which is manipulated in quantitative research. These are described in qualitative research.
Qualitative Research - identify (fact), describe (definition) potential variables, and attempt to prove they exist (& that they have construct validity, reliability, etc.). You actually have to persuade people that this is the case, and so you’re always already engaged in a rhetorical practice, even when you’re doing empirical research.
Quantitative Research - object is to show relationship (quality) between variables–i.e. to persuade people that some [usually causal] relationship exists. This presupposes that the variables exist, which means that you have to 1st qualitatively show this to be the case.
Methods of evaluating composition:
(The critical question is always: are these direct or indirect measures? In other words, do these methods measure the student...or the rater...? The answer tells you which they are.)
- Holistic Evaluation - Give one score based on overall impression. No factorial breakdown. [low inter-rater reliability]
- Analytic Evaluation - Break down based on categories or variables
- Primary Traits - Put the object into categories based on how well it fits into the description of that category. Will end up with several scores. (e.g. the Eng 103 descriptive grading rubric)
Types of Data:
(Type must match statistical measure to ensure validity.)
- Nominal - Classifications which can be named.
- Ordinal - Rank ordering. Not equidistant between points. [e.g. A, B, C, D, F...Likert scales]
- Interval - Has equal distance between variables. [more powerful statistics are associated with these]
- Ratio - Interval data with an absolute zero. Almost never crops up with human subjects.
- Z-Score - lets you normalize across a large population. But you have to exclude to normalize. (measure of reliability)
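The z-score normalization mentioned above is just z = (x − mean) / standard deviation. A minimal sketch, using a hypothetical set of holistic essay scores:

```python
def z_scores(scores):
    """Normalize raw scores: z = (x - mean) / standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5  # population SD
    return [(x - mean) / sd for x in scores]

# Hypothetical holistic scores; after normalization they share mean 0, SD 1:
print([round(z, 2) for z in z_scores([2, 4, 4, 4, 5, 5, 7, 9])])
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Because every set of raw scores maps onto the same mean-0, SD-1 scale, z-scores let you compare ratings drawn from different scales or populations.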
Reliability - measure of agreement or disagreement between raters or instruments (r ranges from -1 to +1). Inter-rater reliability should approach +0.7 to have predictive power that what you’re describing is actually there.
(Watch out for a nightmarish situation: if your research is predicated on r=0.7 and you don’t achieve it, you can argue that r is a social construct, inherently predicated on truth by consensus, and therefore not really a “scientific” or precise measure.)
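For scores (rather than categories), inter-rater r is commonly computed as a Pearson correlation between the two raters. A sketch against the 0.7 benchmark above, with hypothetical holistic scores from two raters on six essays:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two raters' scores (-1 to +1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical holistic scores from two raters on six essays:
rater1 = [4, 3, 5, 2, 4, 5]
rater2 = [4, 2, 5, 3, 4, 4]
print(round(pearson_r(rater1, rater2), 2))  # 0.77 -- above the 0.7 benchmark
```

Note that a high r only means the raters rank essays similarly; it says nothing about whether the rubric measures what it claims to (that is validity, below).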
3 types of reliability:
- Equivalency - you’re triangulating with multiple instruments and all of the instruments are giving the same results.
- Stability - do the instruments/people change over time? If yes, that’s a reliability issue.
- Internal - Consistency of instruments/people. Granularity and scale.
Validity - measuring what you say you’re measuring. Reliability is a necessary condition for validity, but it’s not sufficient. You can have construct validity, without being able to measure it reliably. Likewise, you can measure something reliably without it having construct validity.
What is the difference between reliability and validity? You can reliably measure something you did not intend to measure in your study, or around which your study does not hinge.