• = a set of techniques to answer research questions with quantitative data
  • typical procedure:
    1. come up with a model, theory and hypothesis
    2. develop or select concrete measurement instruments
    3. collect data
    4. analyse data (e.g. statistical modelling)
    5. evaluate results

Variables and confounders

  • in the models and theories, we use constructs (those are general terms, like “business” or “IT alignment”)
    • variables are operationalized versions of constructs (so they have values and can be measured)
    • we have:
        1. dependent variables (outcomes)
        2. independent variables
        3. confounders - a variable that is related to both the dependent and the independent variable (at least partially)
        • why do they matter?
          • if we don’t take them into account, we may think there is a relationship between A and B when in fact a confounder C is driving/influencing them both
          • e.g. “IT investments” (independent) → “company performance” (dependent)
            • the confounder is the company’s size (because different sizes of companies have different IT investments, different scenarios, and different performance drivers)
        • we can control them by using control variables
        4. control variables
        • a) keeping the confounder constant in the study (e.g. study only companies of the same/similar size)
        • b) measure them and include them in the statistical model (e.g. add company size as a variable to the regression - this way we can isolate the effect of IT investment on company performance after accounting for company size)
          • = testing the hypothesis over all possible values of the confounder
        5. mediator variables
        • mediator variables “sit” between the independent and dependent variable (A influences the mediator and the mediator influences B)
        • there are two types of mediation:
          • full mediation
            • e.g. IT governance → business/IT alignment → business value from IT
              • the direct relationship between IT governance and business value from IT drops to zero (“everything goes through the mediator”)
          • partial mediation
            • there are two types of relationship: direct and indirect (through the mediator)
            • so not everything goes through the mediator; it is only partially involved
        6. moderator variables
        • they act as catalysts: they increase or decrease the strength of the relationship between the independent and dependent variable
        • e.g. the IT savviness of the board members increases the relationship strength between “board involvement in IT governance” and “business value from IT”
    • conditional process modelling = a combination of influences of mediators and moderators on the relationship between variables
    • variable types:
      • categorical variables (nominal, ordinal)
        • binary vs. polytomous (more than two categories)
        • frequency analysis is useful (e.g. how many subjects fall in each category)
      • quantitative variables
        • on the level of interval and ratio measurement levels
        • a lot of statistical analyses can be performed
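The control-variable idea above (option b) can be illustrated with a small simulation (a sketch with synthetic data, assuming numpy; variable names are hypothetical): company size drives both IT investment and performance, so a naive regression shows a spurious effect that shrinks once size is added to the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
size = rng.normal(0, 1, n)                    # confounder: company size
it_invest = 0.8 * size + rng.normal(0, 1, n)  # size drives IT investment
perf = 1.0 * size + rng.normal(0, 1, n)       # size drives performance; no direct IT effect

# naive model: perf ~ it_invest (confounded)
X_naive = np.column_stack([np.ones(n), it_invest])
b_naive = np.linalg.lstsq(X_naive, perf, rcond=None)[0]

# controlled model: perf ~ it_invest + size (size as control variable)
X_ctrl = np.column_stack([np.ones(n), it_invest, size])
b_ctrl = np.linalg.lstsq(X_ctrl, perf, rcond=None)[0]

print(round(b_naive[1], 2))  # sizeable spurious “effect” of IT investment
print(round(b_ctrl[1], 2))   # shrinks toward zero once size is controlled for
```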

Quality criteria

    1. reliability
    • whether we find consistent results when we replicate the study
    2. validity
    • whether the conclusions based on the study are true
    • a) construct validity = the methods are really measuring and manipulating the properties that we intend them to; this is a prerequisite for internal and external validity
    • b) internal validity = the observed effect is actually due to the hypothesized cause
      • i.e. X happened because of what we hypothesized would cause it, not because of something else
      • the main risk here are confounders (they may affect both variables, so X happens anyway, but not because of the hypothesized cause)
    • c) external validity = the hypothesized relationship also holds in other settings (e.g. other groups, industries, times etc.)

Measurement

  • what is operationalization?
    • choosing/creating a concrete method to measure a construct
    • variables are operationalized versions of constructs
    • there are different operationalizations (some of them are capturing the whole construct, some of them are capturing only some aspects)
  • levels of measurements:
    • nominal: differentiate between values (male/female, right-handed/left-handed)
    • ordinal: differentiate and order (levels of education, levels of agreement from strongly disagree to strongly agree)
    • interval: differentiate between values, ordering and interpreting differences (e.g. temperature)
    • ratio - the highest level of measurement
      • it has a meaningful zero point (zero means “none of the measured quantity exists”)
        • e.g. 0 °C does not mean that there is no temperature, ratios of Celsius values are not meaningful, and those ratios change when converting to Fahrenheit
        • but 0 cm means no length at all, in every unit system
          • a meaningful zero also means we can form ratios (20 cm is twice as much as 10 cm, which holds in every unit system)
    • the level of measurement reached determines which statistical tools can be used to get interesting insights
  • measurement reliability and validity:
    • validity = the chosen instrument is measuring the construct we intended to measure
      • influenced by systematic error, if the systematic error is high, we are measuring the wrong construct
      • we have to have reliable measurement in order to assess the validity (necessary, but not sufficient condition)
    • reliability = the consistency and precision of the measurement
      • if we replicate the measurement in the same conditions we should get consistent, stable and same/similar results
      • influenced by random error, if the random error is high, we are measuring imprecisely
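The interval-vs-ratio point above can be checked numerically (a minimal stdlib sketch): ratios of Celsius values change under unit conversion, while ratios of lengths do not.

```python
def c_to_f(celsius):
    return celsius * 9 / 5 + 32

def cm_to_inch(cm):
    return cm / 2.54

# interval scale: the ratio 20 °C / 10 °C is not unit-invariant
ratio_c = 20 / 10                       # 2.0 in Celsius
ratio_f = c_to_f(20) / c_to_f(10)       # 68 / 50 = 1.36 in Fahrenheit, the ratio changed

# ratio scale: the ratio 20 cm / 10 cm survives any unit conversion
ratio_cm = 20 / 10                          # 2.0 in cm
ratio_in = cm_to_inch(20) / cm_to_inch(10)  # still 2.0 in inches
```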

Surveys and questionnaires

  • difference:
    • survey: a list of questions covering a variety of topics
    • questionnaire: focuses on one construct or a related set of constructs
  • rules:
    • provide clear instructions
    • administered by the interviewer
    • one construct should be measured by multiple questions (to enable assessing reliability/internal consistency)
    • avoid complicated questions, double negatives, and extreme wording
    • response options should be exhaustive (each respondent should be able to position themselves) and mutually exclusive (they don’t overlap)

Sampling

  • we sample population (taking a subset of units/elements from the whole set)
  • stratum = a subset of the population, where elements share some characteristic
  • sampling frame = the actual list of population elements we can choose from
    • population = all IT experts
    • sampling frame = the results of the LinkedIn search and my network
  • types of sampling:
    • probability sampling:
      • for each element in the population, the probability of the element ending up in the sample is known and non-zero
      • a) simple random sampling (each element has an equal and independent chance to be in the sample)
      • b) systematic sampling (only the first element is chosen at random; the rest is predetermined, e.g. every X-th element afterwards)
        • the sample could be biased if there is some pattern in the sampling frame
      • c) stratified random sampling (divide population into mutually exclusive strata and then do simple random sampling from each stratum)
    • non-probability sampling:
      • some elements have zero probability of being selected or their probability is unknown (we cannot claim a representative sample)
      • a) convenience sampling (selected elements are easy to get, e.g. in the case study)
      • b) snowball sampling (small group is selected, then they are asked to get other participants)
        • e.g. Delphi study and focus groups
      • c) purposive sampling - elements are chosen based on the judgement of the researcher
        • for very specific case research
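The three probability-sampling schemes above can be sketched with the standard library (synthetic population; the field names are hypothetical):

```python
import random

random.seed(42)
# population of 90 companies; every third one is "large"
population = [{"id": i, "stratum": "large" if i % 3 == 0 else "small"}
              for i in range(90)]
sample_size = 9

# a) simple random sampling: equal, independent chance for every element
simple = random.sample(population, sample_size)

# b) systematic sampling: random start, then every k-th element
k = len(population) // sample_size
start = random.randrange(k)
systematic = population[start::k]

# c) stratified random sampling: simple random sampling inside each stratum,
#    proportionally to the stratum's share of the population
strata = {}
for element in population:
    strata.setdefault(element["stratum"], []).append(element)
stratified = [e for group in strata.values()
              for e in random.sample(group, round(sample_size * len(group) / len(population)))]
```

Note how systematic sampling would be biased if, say, every tenth company in the frame shared some characteristic, which is exactly the pattern risk mentioned above.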

Statistical tests

  • correlation tests (Pearson, Spearman) - tests a relationship between variables (correlation does not mean causation)
  • regression analysis - test of the relationship between independent variables and one dependent variable
    • e.g. if the IT investment predicts the company’s performance
  • tests of mean differences:
    • Independent samples t-test (difference between two groups on a single, continuous dependent variable)
      • null hypothesis: the group means are equal
      • e.g. measuring differences between companies adopting COBIT or not on their performance
    • Independent samples z-test (difference between two groups on a single proportion)
      • null hypothesis: the proportions (percentages) are equal
    • ANOVA (difference between more than two groups on a single, continuous variable)
      • tests the null hypothesis that all group means are equal
      • the idea is not to run multiple t-tests, but to run one test on all groups simultaneously
      • a significant result only says that some means differ; post-hoc tests are needed to find out where the differences are
    • ANCOVA (extension of ANOVA which controls the effect of covariates on the dependent variable)
      • it just handles the confounders statistically
  • tests of repeated measures (“longitudinal research”):
    • paired samples t-test (difference for one group between two time-points)
      • e.g. a company before adopting COBIT and after
    • repeated measures ANOVA (difference for one group between more than two time-points)
    • mixed ANOVA
      • when sample is divided into two or more independent groups and we are measuring differences in more than two time-points
      • e.g. measure business/IT alignment before, during, and after COBIT 5 adoption in small and large organizations
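The independent-samples t-test above can be sketched with the standard library (pooled-variance formula; the performance scores are synthetic and the COBIT scenario is just the running example from these notes):

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Pooled-variance t statistic for H0: the two group means are equal."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

cobit = [3.1, 3.4, 2.9, 3.8, 3.5, 3.3]     # performance scores, COBIT adopters
control = [2.4, 2.8, 2.6, 3.0, 2.5, 2.7]   # performance scores, non-adopters
t = independent_t(cobit, control)
# |t| well above ~2.23 (two-tailed critical value, df = 10, alpha = .05) -> reject H0
```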

Latent constructs/models

  • variables like age are directly measured; variables like happiness are indirectly inferred through observed variables
  • factor analysis = examines how observed variables cluster under latent dimensions
    • exploratory factor analysis (exploring the underlying factor structure and try to explain it theoretically)
      • e.g. “I like people”, “I don’t mind presenting”, “I am often in the center of attention” will load on a common factor, which we could label “extraversion”
      • we can run a survey without a strong theoretical basis and see which clusters emerge
    • confirmatory factor analysis (formally testing the hypothesis of a specific factor structure)
      • we can test the variable “extraversion” with a new dataset
      • it takes the clusters from the exploratory factor analysis and tests them as hypothesis against a new dataset to see, if the data patterns actually fit
    • structural equation modelling (regression with latent variables)
      • we can assess the effect of extraversion on happiness (modelling causal relationships between latent constructs)
  • two types:
    • reflective latent constructs
      • the hypothesis is that the reflective latent construct influences/causes changes in all the indicators
    • formative latent constructs
      • the hypothesis is that all the indicators represent some part of the formative latent construct (changes in the indicators cause changes in the latent construct)
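A tiny simulation of a reflective construct (numpy assumed; the “extraversion” items are hypothetical): the unobserved factor causes each indicator, so indicators of the same construct correlate highly while an unrelated item does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
extraversion = rng.normal(size=n)   # latent construct, never observed directly

# reflective indicators: the latent variable causes each observed item
likes_people = 0.8 * extraversion + rng.normal(0, 0.6, n)
likes_presenting = 0.7 * extraversion + rng.normal(0, 0.6, n)
unrelated_item = rng.normal(size=n)  # taps a different construct

corr = np.corrcoef([likes_people, likes_presenting, unrelated_item])
# corr[0, 1] is high (shared latent cause); corr[0, 2] hovers around zero
```

Exploratory factor analysis essentially recovers such correlation clusters from the data and labels them as latent factors.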