Models for discrete longitudinal data and their application in the analysis of questionnaires

Summary

This diploma thesis introduces regression models and longitudinal analysis. The theory was then applied to practical problems in the R software.

The chosen regression models were the linear model (LM), the mixed linear model (LMM), the generalized linear model (GLM) and the generalized mixed linear model (GLMM). The models differ mainly in the distribution they assume for the data and in whether the observations are treated as independent.

Longitudinal data are data with a time dimension. They consist of repeated measurements taken on the same, or very similar, subjects over a continuous period of time. Longitudinal analysis is the analysis of such data.

Key words:

Regression, regression analysis, regression models, linear model, LM, mixed linear model, LMM, generalized linear model, GLM, generalized mixed linear model, GLMM, longitudinal data, longitudinal analysis, questionnaires, tree damage.

Regression analysis is a method for modeling and analyzing the relationship between dependent and independent variables. It estimates the dependent variable(s) on the basis of known independent variable(s), which makes it useful for characterizing all kinds of data. Because data differ in their dependency structure and distribution, regression models are divided into several groups. The linear model (LM) assumes normally distributed, independent data. The mixed linear model (LMM) assumes normally distributed, dependent data. The generalized linear model (GLM) handles non-normally distributed, independent data. Finally, the generalized mixed linear model (GLMM) handles non-normally distributed, dependent data. All of these models can be used for longitudinal analysis.
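The four model classes can be written compactly in the usual textbook notation. The symbols below are assumptions introduced only for this sketch and are not taken from the thesis: y is the response vector, X and Z are the design matrices of the fixed and random effects, \beta are the fixed-effect coefficients, b are the random effects with covariance matrix D, \varepsilon is the residual error, and g is a link function for a response from the exponential family (for example Poisson or binomial).

```latex
\begin{align*}
\text{LM:}\quad   & y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^{2} I) \\
\text{LMM:}\quad  & y = X\beta + Zb + \varepsilon, \qquad b \sim N(0, D),\ \varepsilon \sim N(0, \sigma^{2} I) \\
\text{GLM:}\quad  & g\bigl(\mathrm{E}[y]\bigr) = X\beta \\
\text{GLMM:}\quad & g\bigl(\mathrm{E}[y \mid b]\bigr) = X\beta + Zb, \qquad b \sim N(0, D)
\end{align*}
```

Read this way, the term Zb is what introduces dependence between observations on the same subject, and the link function g is what allows a non-normal response.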

In this thesis, the term questionnaire is understood broadly as any type of data set, regardless of whether the data are subjective or objective. Regression and longitudinal analysis work on both types. Moreover, the aim of the thesis was to present the widest possible range of areas in which these analyses can be applied.

The second chapter gives examples of where these statistical methods can be used and explains why they are so beneficial for the field of geoinformatics.

The next chapter gives a mathematical summary of these models, with practical illustrations on real data sets in the R language.

The main theoretical block describes mixed linear models (LMM), but the findings are equally useful for generalized mixed linear models (GLMM). It explains random and fixed effects, maximum likelihood and restricted maximum likelihood estimation, and the individual sections of the function outputs. The essential terms were Residual, Intercept, Standard deviation, Variance and Standard Error. Further model assessment and parameter estimation were included as an opportunity for follow-up work on the statistical methods.
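The difference between maximum likelihood (ML) and restricted maximum likelihood (REML) is easiest to see in the special case of a linear model without random effects, where ML divides the residual sum of squares by n and REML divides it by n - p, with p the number of fixed-effect coefficients. The following is a minimal sketch of that special case only, not of a full mixed-model fit. The thesis itself works in R; plain Python is used here so that the sketch runs without extra packages. The data and the helper `fit_simple_lm` are invented for illustration.

```python
# Toy data, invented for illustration only: response y measured at times x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]


def fit_simple_lm(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, rss


b0, b1, rss = fit_simple_lm(x, y)
n, p = len(x), 2  # p = number of fixed-effect coefficients (intercept, slope)

sigma2_ml = rss / n          # ML: ignores the degrees of freedom used by b0, b1
sigma2_reml = rss / (n - p)  # REML: corrects for the p estimated coefficients

print(f"intercept={b0:.3f} slope={b1:.3f}")
print(f"ML variance={sigma2_ml:.4f}  REML variance={sigma2_reml:.4f}")
```

The REML estimate is always the larger of the two, and the gap shrinks as the number of observations grows relative to the number of fixed effects.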

Crossed and nested random effects also had to be explained, which was done with examples in the next section. The main point of that section was hypothesis testing, which can be used to evaluate the chosen models.
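Whether two grouping factors are nested or crossed is a property of the data layout and can be checked before any model is fitted. The sketch below assumes a hypothetical design of plots, trees and observers; the data and the helper names `is_nested` and `is_fully_crossed` are invented for illustration and do not come from the thesis.

```python
from collections import defaultdict

# Hypothetical (outer, inner) observations, invented for illustration.
# nested_obs: every tree belongs to exactly one plot.
# crossed_obs: every observer appears on every plot.
nested_obs = [("plot1", "treeA"), ("plot1", "treeB"),
              ("plot2", "treeC"), ("plot2", "treeD")]
crossed_obs = [("plot1", "obs1"), ("plot1", "obs2"),
               ("plot2", "obs1"), ("plot2", "obs2")]


def is_nested(pairs):
    """Return True if each inner level occurs within exactly one outer level."""
    outers_by_inner = defaultdict(set)
    for outer, inner in pairs:
        outers_by_inner[inner].add(outer)
    return all(len(outers) == 1 for outers in outers_by_inner.values())


def is_fully_crossed(pairs):
    """Return True if every outer level occurs together with every inner level."""
    seen = set(pairs)
    outers = {outer for outer, _ in pairs}
    inners = {inner for _, inner in pairs}
    return all((o, i) in seen for o in outers for i in inners)


print("trees nested in plots:", is_nested(nested_obs))
print("observers crossed with plots:", is_fully_crossed(crossed_obs))
```

Real designs are often only partially crossed, in which case both checks return False and the structure has to be described explicitly in the model.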

The last theoretical block was devoted to longitudinal analysis, to which the regression models can be applied. The model outputs were also explained in detail.

The case studies were designed around atypical data, which allow the theory to be understood from a different angle. The first case study showed an example with the Poisson distribution and hypothesis testing. The next one explained stepwise regression and the elimination of superfluous parameters. The last one fitted models with correlated and uncorrelated random effects.
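As an illustration of hypothesis testing with Poisson-distributed counts, the sketch below runs a likelihood ratio test that compares a model with one shared rate against a model with a separate rate per group. It is a generic textbook test under stated assumptions, not a reproduction of the first case study: the counts are invented, the function names are hypothetical, and plain Python is used instead of R only to keep the example free of extra packages.

```python
import math

# Hypothetical counts, invented for illustration: events per unit in two groups.
group_a = [2, 3, 1, 4, 2, 3]
group_b = [5, 7, 6, 4, 8, 6]


def poisson_loglik(counts, rate):
    """Poisson log-likelihood of `counts` under a single common `rate` (> 0)."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)


def likelihood_ratio_test(a, b):
    """Compare H0 (one shared rate) with H1 (one rate per group).

    Returns (statistic, p_value). The statistic is referred to a
    chi-square distribution with 1 degree of freedom.
    """
    pooled = a + b
    rate_0 = sum(pooled) / len(pooled)   # maximum likelihood rate under H0
    rate_a = sum(a) / len(a)             # maximum likelihood rates under H1
    rate_b = sum(b) / len(b)
    ll_0 = poisson_loglik(pooled, rate_0)
    ll_1 = poisson_loglik(a, rate_a) + poisson_loglik(b, rate_b)
    stat = max(2.0 * (ll_1 - ll_0), 0.0)        # guard against rounding below 0
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value


stat, p_value = likelihood_ratio_test(group_a, group_b)
print(f"LR statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

A small p-value speaks against the shared-rate model. The same comparison of nested models by their log-likelihoods carries over to models with more parameters, with the degrees of freedom equal to the number of extra parameters.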