-
Spatial-temporal Modelling of Oesophageal and Lung Cancers in Kenya’s Counties
Joseph Kuria Waitara,
Gregory Kerich,
John Kihoro,
Anne Korir
Issue:
Volume 10, Issue 4, July 2021
Pages:
175-183
Received:
1 June 2021
Accepted:
18 June 2021
Published:
30 June 2021
Abstract: Oesophageal cancer forms in the tissues lining the oesophagus (the muscular tube through which food passes from the throat to the stomach), while lung cancer forms in the tissues of the lung, usually in the cells lining the air passages. In this study, data collected by the Nairobi Cancer Registry (NCR) were used to produce the spatial-temporal distribution of oesophageal cancer cases across Kenya's counties. Among the counties where data were available, Bomet had the highest relative risk of oesophageal cancer, followed by Meru, Nyeri, Embu, Nakuru, Kakamega, Nairobi, Mombasa, Kiambu and Machakos counties respectively. The study revealed that smoking and alcohol use were significant risk factors for oesophageal cancer in Kenya. The generation of spatio-temporal maps and the identification of risk factors across the counties with notified oesophageal cancer cases is a major milestone, since previous studies focused on specific regions. The multiplicative effect of smoking was 1.012, indicating that the risk of oesophageal cancer was 1.2% higher among smokers than among non-smokers. The multiplicative effect of alcohol use was 1.0346, indicating that the risk of oesophageal cancer was 3.5% higher among alcohol users than among non-users. For lung cancer, the multiplicative effect of smoking was 1.4021, indicating that the risk was 40.21% higher among smokers than among non-smokers, and the multiplicative effect of alcohol use was 1.3689, indicating that the risk was 36.89% higher among alcohol users than among non-users. In counties where data were not available the estimated relative risks were comparatively low; nevertheless, applying the spatial-temporal model with covariates revealed that a risk of oesophageal and lung cancer exists in those counties as well. To enhance research on oesophageal, lung and other types of cancer in Kenya, the National Cancer Registry and the county health departments should work closely together to strengthen cancer data collection, so as to facilitate research and inform the measures implemented to mitigate the increase in cancer cases.
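The abstract reports these effects as multiplicative relative risks; in log-linear disease-mapping models such effects are the exponentiated regression coefficients. A minimal illustrative sketch in Python, using only the effect sizes reported in the abstract (the paper's actual model is not reproduced here):

```python
import math

# Reported multiplicative effects (relative risks) from the abstract.
# In a log-linear model log(RR) = beta * x with a binary covariate x,
# the multiplicative effect is exp(beta).
effects = {
    "oesophageal / smoking": 1.012,
    "oesophageal / alcohol": 1.0346,
    "lung / smoking": 1.4021,
    "lung / alcohol": 1.3689,
}

for name, rr in effects.items():
    beta = math.log(rr)       # implied regression coefficient
    pct = (rr - 1.0) * 100.0  # percentage increase in risk
    print(f"{name}: beta = {beta:.4f}, risk is {pct:.2f}% higher")
```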
-
An Entropy Based Objective Bayesian Prior Distribution
Issue:
Volume 10, Issue 4, July 2021
Pages:
184-193
Received:
20 July 2021
Accepted:
6 August 2021
Published:
23 August 2021
Abstract: Bayesian statistical analysis requires that a prior probability distribution be assumed. This prior describes the likelihood that a given probability distribution generated the sample data. When no information is provided about how data samples are drawn, a statistician must use what is called an "objective prior distribution" for analysis. Some common objective prior distributions are the Jeffreys prior, the Haldane prior, and the reference prior. The choice of an objective prior has a strong effect on statistical inference, so it must be chosen with care. In this paper, a novel entropy-based objective prior distribution is proposed. It is proven to be uniquely defined given a few postulates, which are based on well-accepted properties of probability distributions. This novel objective prior distribution is shown to be the exponential of the entropy of a probability distribution (e^S), which suggests a strong connection to information theory. This result confirms the maximal entropy principle, which paves the way for a more robust mathematical foundation for thermodynamics. It also suggests a possible connection between quantum mechanics and information theory. The novel objective prior distribution is used to derive a new regularization technique that is shown to improve the accuracy of modern-day artificial intelligence on a few real-world data sets on most test runs. On just a couple of trials, the new regularization technique over-regularized a neural network and led to poorer results. This showed that, while often quite effective, the new regularization technique must be used with care. It is anticipated that this novel objective prior will be an integral part of many new algorithms that focus on finding an appropriate model to describe a data set.
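The abstract defines the proposed prior only as the exponential of a distribution's entropy, e^S. As a purely illustrative sketch (the paper's actual construction and normalization are not given here), the unnormalized prior weight of a discrete distribution p can be computed as follows:

```python
import numpy as np

def entropy_prior_weight(p):
    """Unnormalized prior weight e^S for a discrete distribution p,
    where S = -sum(p * log p) is the Shannon entropy (natural log)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # the term 0 * log(0) is taken as 0
    entropy = -np.sum(p * np.log(p))
    return np.exp(entropy)

# The uniform distribution has maximal entropy, hence the largest weight:
print(entropy_prior_weight([0.25, 0.25, 0.25, 0.25]))  # e^{ln 4} = 4.0
print(entropy_prior_weight([0.7, 0.1, 0.1, 0.1]))      # smaller weight
```

Note that for a uniform distribution over n outcomes this weight equals n, so e^S can be read as an effective support size, which is consistent with the maximal entropy principle mentioned in the abstract.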
-
Exploring the Effects of Assumption Violations on Simple Linear Regression and Correlation Using Excel
William Henry Laverty,
Ivan William Kelly
Issue:
Volume 10, Issue 4, July 2021
Pages:
194-201
Received:
25 June 2021
Accepted:
21 August 2021
Published:
30 August 2021
Abstract: Regression analysis plays a central role in statistics and in our understanding of the world. Linear regression models are the simplest type of regression, and an understanding of them is an essential basis for more advanced models. In this article we show how to use Excel to generate data from a simple linear regression model and illustrate how the statistical methods behave both when the fundamental assumptions of the model hold and when they are violated. The advantage of using Excel is that when you press the recalculate button under the Formulas menu, the randomly generated data are regenerated, the statistical calculations are recomputed and the relevant graphs are redrawn. Least squares is the statistical technique typically used when the assumptions are satisfied. A statistical technique that can be used when the normality assumption is violated is the non-parametric technique introduced by Kendall and Theil. The latter is useful when data are skewed or heteroskedastic, and it is as powerful as least squares regression for normally distributed data. Exercises are provided to illustrate both procedures. In these exercises we generate samples from a simple linear regression in which the error term follows either a Normal distribution or the heavy-tailed t-distribution.
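The article carries out this experiment in Excel; the following Python sketch mirrors it for readers who prefer code, comparing ordinary least squares with the Kendall-Theil (Theil-Sen) slope under Normal and heavy-tailed t errors. All parameter values here are arbitrary illustrations, not the article's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate(error="normal", n=50, beta0=2.0, beta1=0.5):
    """Generate one sample from y = beta0 + beta1*x + error."""
    x = rng.uniform(0, 10, n)
    if error == "normal":
        eps = rng.normal(0, 1, n)
    else:  # heavy-tailed t-distribution with 2 degrees of freedom
        eps = rng.standard_t(df=2, size=n)
    return x, beta0 + beta1 * x + eps

for error in ("normal", "t"):
    x, y = simulate(error)
    ols = stats.linregress(x, y)                        # least squares fit
    ts_slope, ts_inter, _, _ = stats.theilslopes(y, x)  # Kendall-Theil fit
    print(f"{error:>6} errors: OLS slope = {ols.slope:.3f}, "
          f"Theil-Sen slope = {ts_slope:.3f}")
```

Re-running the script (like pressing recalculate in Excel) draws a fresh sample; under t errors the Theil-Sen slope is typically the more stable of the two.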
-
Comparison of the New Estimators: The Semi-Parametric Weighted Likelihood Estimator, SPW, and the Weighted Conditional Pseudo Likelihood Estimator, WCPE
Samuel Joel Kamun,
Richard Simwa,
Stanley Sewe
Issue:
Volume 10, Issue 4, July 2021
Pages:
202-207
Received:
6 August 2021
Accepted:
21 August 2021
Published:
31 August 2021
Abstract: The analysis of sample-based studies involving sampling designs with small sample sizes is challenging because the sample selection probabilities (and hence the sample weights) depend on the response variable and the covariates. The study focused on using systems of weighted regression estimating equations, with different modified weights, to estimate the coefficients of weighted likelihood estimators. Usually, the design-consistent (weighted) estimators are obtained by solving (sample) weighted estimating equations. They are then used to construct estimates which have better relative efficiencies and smaller finite-sample bias than the estimates from the Horvitz-Thompson weighted estimator with unmodified weights (option A). The purpose of our study is to compare the derived estimators of the weighted regression estimating equations, namely the Semi-Parametric Weighted Likelihood Estimator, SPW, and the Weighted Conditional Pseudo Likelihood Estimator, WCPE, with the conventional Horvitz-Thompson weighted likelihood estimator, using relative efficiency, sample bias and standard error for small sample sizes. The estimates constructed from the system of weighted regression estimating equations, using the different modified weights, are in fact the weighted likelihood estimators. The study compared the two new estimators, SPW and WCPE, with both unmodified and modified weights, and found them to have better relative efficiency and smaller finite-sample bias than the estimates from the conventional Horvitz-Thompson weighted estimator, for both simulated and real data, with the tests on real data showing performance strongly similar to the results obtained from the simulated data.
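For readers unfamiliar with the baseline being compared against, the Horvitz-Thompson estimator weights each sampled observation by the inverse of its selection probability. A minimal sketch with toy values (the paper's SPW and WCPE estimators and their modified weights are not reproduced here):

```python
import numpy as np

def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimator of a population total: each sampled
    y_i is weighted by 1/pi_i, the inverse of its selection probability
    (the unmodified design weight, option A in the abstract)."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return np.sum(y / pi)

# Toy sample: 4 observations drawn with unequal selection probabilities.
y = [12.0, 7.5, 9.0, 15.0]
pi = [0.10, 0.25, 0.20, 0.05]  # informative design: pi may depend on y
print(horvitz_thompson_total(y, pi))  # design-unbiased estimate of the total
```

The modified-weight estimators studied in the paper aim to keep this design-based unbiasedness while reducing the variance and finite-sample bias that the raw 1/pi weights incur at small sample sizes.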