Publications



2020

(C)
Christos Diou, Ioannis Sarafis, Vasileios Papapanagiotou, Leonidas Alagialoglou, Irini Lekka, Dimitrios Filos, Leandros Stefanopoulos, Vasileios Kilintzis, Christos Maramis, Youla Karavidopoulou, Nikos Maglaveras, Ioannis Ioakimidis, Evangelia Charmandari, Penio Kassari, Athanasia Tragomalou, Monica Mars, Thien-An Ngoc Nguyen, Tahar Kechadi, Shane O'Donnell, Gerardine Doyle, Sarah Browne, Grace O'Malley, Rachel Heimeier, Katerina Riviou, Evangelia Koukoula, Konstantinos Filis, Maria Hassapidou, Ioannis Pagkalos, Daniel Ferri, Isabel Pérez and Anastasios Delopoulos
"BigO: A public health decision support system for measuring obesogenic behaviors of children in relation to their local environment"
42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2020 May

Obesity is a complex disease and its prevalence depends on multiple factors related to the local socioeconomic, cultural and urban context of individuals. Many obesity prevention strategies and policies, however, are horizontal measures that do not depend on context-specific evidence. In this paper we present an overview of BigO, a system designed to collect objective behavioral data from children and adolescent populations as well as their environment in order to support public health authorities in formulating effective, context-specific policies and interventions addressing childhood obesity. We present an overview of the data acquisition, indicator extraction, data exploration and analysis components of the BigO system, as well as an account of its preliminary pilot application in 33 schools and 2 clinics in four European countries, involving over 4,200 participants.

@inproceedings{diou2020bigo,
author={Christos Diou and Ioannis Sarafis and Vasileios Papapanagiotou and Leonidas Alagialoglou and Irini Lekka and Dimitrios Filos and Leandros Stefanopoulos and Vasileios Kilintzis and Christos Maramis and Youla Karavidopoulou and Nikos Maglaveras and Ioannis Ioakimidis and Evangelia Charmandari and Penio Kassari and Athanasia Tragomalou and Monica Mars and Thien-An Ngoc Nguyen and Tahar Kechadi and Shane O'Donnell and Gerardine Doyle and Sarah Browne and Grace O'Malley and Rachel Heimeier and Katerina Riviou and Evangelia Koukoula and Konstantinos Filis and Maria Hassapidou and Ioannis Pagkalos and Daniel Ferri and Isabel Pérez and Anastasios Delopoulos},
title={BigO: A public health decision support system for measuring obesogenic behaviors of children in relation to their local environment},
booktitle={42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
publisher={IEEE},
year={2020},
month={05},
date={2020-05-06},
url={https://arxiv.org/pdf/2005.02928.pdf},
abstract={Obesity is a complex disease and its prevalence depends on multiple factors related to the local socioeconomic, cultural and urban context of individuals. Many obesity prevention strategies and policies, however, are horizontal measures that do not depend on context-specific evidence. In this paper we present an overview of BigO, a system designed to collect objective behavioral data from children and adolescent populations as well as their environment in order to support public health authorities in formulating effective, context-specific policies and interventions addressing childhood obesity. We present an overview of the data acquisition, indicator extraction, data exploration and analysis components of the BigO system, as well as an account of its preliminary pilot application in 33 schools and 2 clinics in four European countries, involving over 4,200 participants.}
}

(C)
Vasileios Papapanagiotou, Ioannis Sarafis, Christos Diou, Ioannis Ioakimidis, Evangelia Charmandari and Anastasios Delopoulos
"Collecting big behavioral data for measuring behavior against obesity"
42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2020 May

Obesity is currently affecting very large portions of the global population. Effective prevention and treatment starts at the early age and requires objective knowledge of population-level behavior on the region/neighborhood scale. To this end, we present a system for extracting and collecting behavioral information on the individual-level objectively and automatically. The behavioral information is related to physical activity, types of visited places, and transportation mode used between them. The system employs indicator-extraction algorithms from the literature which we evaluate on publicly available datasets. The system has been developed and integrated in the context of the EU-funded BigO project that aims at preventing obesity in young populations.

@inproceedings{papapanagiotou2020collecting,
author={Vasileios Papapanagiotou and Ioannis Sarafis and Christos Diou and Ioannis Ioakimidis and Evangelia Charmandari and Anastasios Delopoulos},
title={Collecting big behavioral data for measuring behavior against obesity},
booktitle={42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
year={2020},
month={05},
date={2020-05-11},
url={https://arxiv.org/pdf/2005.04928.pdf},
abstract={Obesity is currently affecting very large portions of the global population. Effective prevention and treatment starts at the early age and requires objective knowledge of population-level behavior on the region/neighborhood scale. To this end, we present a system for extracting and collecting behavioral information on the individual-level objectively and automatically. The behavioral information is related to physical activity, types of visited places, and transportation mode used between them. The system employs indicator-extraction algorithms from the literature which we evaluate on publicly available datasets. The system has been developed and integrated in the context of the EU-funded BigO project that aims at preventing obesity in young populations.}
}

(C)
Ioannis Sarafis, Christos Diou, Vasileios Papapanagiotou, Leonidas Alagialoglou and Anastasios Delopoulos
"Inferring the Spatial Distribution of Physical Activity in Children Population from Characteristics of the Environment"
42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2020 May

Obesity affects a rising percentage of the children and adolescent population, contributing to decreased quality of life and increased risk for comorbidities. Although the major causes of obesity are known, the obesogenic behaviors manifest as a result of complex interactions of the individual with the living environment. For this reason, addressing childhood obesity remains a challenging problem for public health authorities. The BigO project relies on large-scale behavioral and environmental data collection to create tools that support policy making and intervention design. In this work, we propose a novel analysis approach for modeling the expected population behavior as a function of the local environment. We experimentally evaluate this approach in predicting the expected physical activity level in small geographic regions using urban environment characteristics. Experiments on data collected from 156 children and adolescents verify the potential of the proposed approach. Specifically, we train models that predict the physical activity level in a region, achieving 81% leave-one-out accuracy. In addition, we exploit the model predictions to automatically visualize heatmaps of the expected population behavior in areas of interest, from which we draw useful insights. Overall, the predictive models and the automatic heatmaps are promising tools in gaining direct perception for the spatial distribution of the population's behavior, with potential uses by public health authorities.

@conference{sarafis2020inferring,
author={Ioannis Sarafis and Christos Diou and Vasileios Papapanagiotou and Leonidas Alagialoglou and Anastasios Delopoulos},
title={Inferring the Spatial Distribution of Physical Activity in Children Population from Characteristics of the Environment},
booktitle={42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
publisher={IEEE},
year={2020},
month={05},
date={2020-05-08},
url={https://arxiv.org/pdf/2005.03957.pdf},
abstract={Obesity affects a rising percentage of the children and adolescent population, contributing to decreased quality of life and increased risk for comorbidities. Although the major causes of obesity are known, the obesogenic behaviors manifest as a result of complex interactions of the individual with the living environment. For this reason, addressing childhood obesity remains a challenging problem for public health authorities. The BigO project relies on large-scale behavioral and environmental data collection to create tools that support policy making and intervention design. In this work, we propose a novel analysis approach for modeling the expected population behavior as a function of the local environment. We experimentally evaluate this approach in predicting the expected physical activity level in small geographic regions using urban environment characteristics. Experiments on data collected from 156 children and adolescents verify the potential of the proposed approach. Specifically, we train models that predict the physical activity level in a region, achieving 81\% leave-one-out accuracy. In addition, we exploit the model predictions to automatically visualize heatmaps of the expected population behavior in areas of interest, from which we draw useful insights. Overall, the predictive models and the automatic heatmaps are promising tools in gaining direct perception for the spatial distribution of the population's behavior, with potential uses by public health authorities.}
}

2019

(J)
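Christos Diou, Ioannis Sarafis, Vasileios Papapanagiotou, Ioannis Ioakimidis and Anastasios Delopoulos
"A methodology for obtaining objective measurements of population obesogenic behaviors in relation to the environment"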
Statistical Journal of the IAOS, 35, (4), pp. 677-690, 2019 Dec

The way we eat and what we eat, the way we move and the way we sleep significantly impact the risk of becoming obese. These aspects of behavior decompose into several personal behavioral elements including our food choices, eating place preferences, transportation choices, sleeping periods and duration etc. Most of these elements are highly correlated in a causal way with the conditions of our local urban, social, regulatory and economic environment. To this end, the H2020 project “BigO: Big Data Against Childhood Obesity” (http://bigoprogram.eu) aims to create new sources of evidence together with exploration tools, assisting the Public Health Authorities in their effort to tackle childhood obesity.
In this paper, we present the technology-based methodology that has been developed in the context of BigO in order to: (a) objectively monitor a matrix of a population’s obesogenic behavioral elements using commonly available wearable sensors (accelerometers, gyroscopes, GPS), embedded in smart phones and smart watches; (b) acquire information for the environment from open and online data sources; (c) provide aggregation mechanisms to correlate the population behaviors with the environmental characteristics; (d) ensure the privacy protection of the participating individuals; and (e) quantify the quality of the collected big data.

@article{DiouIAOS2019,
author={Christos Diou and Ioannis Sarafis and Vasileios Papapanagiotou and Ioannis Ioakimidis and Anastasios Delopoulos},
title={A methodology for obtaining objective measurements of population obesogenic behaviors in relation to the environment},
journal={Statistical Journal of the IAOS},
volume={35},
number={4},
pages={677-690},
year={2019},
month={12},
date={2019-12-10},
url={https://arxiv.org/pdf/1911.08315.pdf},
doi={10.3233/SJI-190537},
abstract={The way we eat and what we eat, the way we move and the way we sleep significantly impact the risk of becoming obese. These aspects of behavior decompose into several personal behavioral elements including our food choices, eating place preferences, transportation choices, sleeping periods and duration etc. Most of these elements are highly correlated in a causal way with the conditions of our local urban, social, regulatory and economic environment. To this end, the H2020 project “BigO: Big Data Against Childhood Obesity” (http://bigoprogram.eu) aims to create new sources of evidence together with exploration tools, assisting the Public Health Authorities in their effort to tackle childhood obesity.
In this paper, we present the technology-based methodology that has been developed in the context of BigO in order to: (a) objectively monitor a matrix of a population’s obesogenic behavioral elements using commonly available wearable sensors (accelerometers, gyroscopes, GPS), embedded in smart phones and smart watches; (b) acquire information for the environment from open and online data sources; (c) provide aggregation mechanisms to correlate the population behaviors with the environmental characteristics; (d) ensure the privacy protection of the participating individuals; and (e) quantify the quality of the collected big data.}
}

(C)
Ioannis Sarafis, Christos Diou, Ioannis Ioakimidis and Anastasios Delopoulos
"Assessment of In-Meal Eating Behaviour using Fuzzy SVM"
41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019 Jul

Certain patterns of eating behaviour during meal have been identified as risk factors for long-term abnormal eating development in healthy individuals and, eventually, can affect the body weight. To detect early signs of problematic eating behaviour, this paper proposes a novel method for building behaviour assessment models. The goal of the models is to predict whether the in-meal eating behaviour resembles patterns associated with obesity, eating disorders, or low-risk behaviours. The models are trained using meals recorded with a plate scale from a reference population and labels annotated by a domain expert. In addition, the domain expert assigned scores that characterise the degree of any exhibited abnormal patterns. To improve model effectiveness, we use the domain expert’s scores to create training error regularisation weights that alter the importance of each training instance for its class during model training. The behaviour assessment models are based on the SVM algorithm and the fuzzy SVM algorithm for their instance-weighted variation. Experiments conducted on meals recorded from 120 individuals show that: (a) the proposed approach can produce effective models for eating behaviour classification (for individuals), or for ranking (for populations); and (b) the instance-weighted fuzzy SVM models achieve significant performance improvements, compared to the non-weighted, standard SVM models.

@conference{sarafis2019assessment,
author={Ioannis Sarafis and Christos Diou and Ioannis Ioakimidis and Anastasios Delopoulos},
title={Assessment of In-Meal Eating Behaviour using Fuzzy SVM},
booktitle={41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
year={2019},
month={07},
date={2019-07-27},
url={https://mug.ee.auth.gr/wp-content/uploads/sarafis2019assessment.pdf},
doi={10.1109/EMBC.2019.8857606},
abstract={Certain patterns of eating behaviour during meal have been identified as risk factors for long-term abnormal eating development in healthy individuals and, eventually, can affect the body weight. To detect early signs of problematic eating behaviour, this paper proposes a novel method for building behaviour assessment models. The goal of the models is to predict whether the in-meal eating behaviour resembles patterns associated with obesity, eating disorders, or low-risk behaviours. The models are trained using meals recorded with a plate scale from a reference population and labels annotated by a domain expert. In addition, the domain expert assigned scores that characterise the degree of any exhibited abnormal patterns. To improve model effectiveness, we use the domain expert’s scores to create training error regularisation weights that alter the importance of each training instance for its class during model training. The behaviour assessment models are based on the SVM algorithm and the fuzzy SVM algorithm for their instance-weighted variation. Experiments conducted on meals recorded from 120 individuals show that: (a) the proposed approach can produce effective models for eating behaviour classification (for individuals), or for ranking (for populations); and (b) the instance-weighted fuzzy SVM models achieve significant performance improvements, compared to the non-weighted, standard SVM models.}
}

(C)
Ioannis Sarafis, Christos Diou and Anastasios Delopoulos
"Behaviour Profiles for Evidence-based Policies Against Obesity"
41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019 Jul

Obesity is a preventable disease that affects the health of a significant population percentage, reduces the life expectancy and encumbers the health care systems. The obesity epidemic is not caused by isolated factors, but it is the result of multiple behavioural patterns and complex interactions with the living environment. Therefore, in-depth understanding of the population behaviour is essential in order to create successful policies against obesity prevalence. To this end, the BigO system facilitates the collection, processing and modelling of behavioural data at population level to provide evidence for effective policy and interventions design. In this paper, we introduce the behaviour profiles mechanism of BigO that produces comprehensive models for the behavioural patterns of individuals, while maintaining high levels of privacy protection. We give examples for the proposed mechanism from real world data and we discuss usages for supporting various types of evidence-based policy design.

@conference{sarafis2019behaviour,
author={Ioannis Sarafis and Christos Diou and Anastasios Delopoulos},
title={Behaviour Profiles for Evidence-based Policies Against Obesity},
booktitle={41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
year={2019},
month={07},
date={2019-07-26},
url={https://mug.ee.auth.gr/wp-content/uploads/sarafis2019behaviour.pdf},
doi={10.1109/EMBC.2019.8857161},
abstract={Obesity is a preventable disease that affects the health of a significant population percentage, reduces the life expectancy and encumbers the health care systems. The obesity epidemic is not caused by isolated factors, but it is the result of multiple behavioural patterns and complex interactions with the living environment. Therefore, in-depth understanding of the population behaviour is essential in order to create successful policies against obesity prevalence. To this end, the BigO system facilitates the collection, processing and modelling of behavioural data at population level to provide evidence for effective policy and interventions design. In this paper, we introduce the behaviour profiles mechanism of BigO that produces comprehensive models for the behavioural patterns of individuals, while maintaining high levels of privacy protection. We give examples for the proposed mechanism from real world data and we discuss usages for supporting various types of evidence-based policy design.}
}

2018

(J)
Ioannis Sarafis, Christos Diou and Anastasios Delopoulos
"Span error bound for weighted SVM with applications in hyperparameter selection (preprint)"
CoRR, abs/1809.06124, 2018 Sep

Weighted SVM (or fuzzy SVM) is the most widely used SVM variant owing its effectiveness to the use of instance weights. Proper selection of the instance weights can lead to increased generalization performance. In this work, we extend the span error bound theory to weighted SVM and we introduce effective hyperparameter selection methods for the weighted SVM algorithm. The significance of the presented work is that it enables the application of span bound and span-rule with weighted SVM. The span bound is an upper bound of the leave-one-out error that can be calculated using a single trained SVM model. This is important since leave-one-out error is an almost unbiased estimator of the test error. Similarly, the span-rule gives the actual value of the leave-one-out error. Thus, one can apply span bound and span-rule as computationally lightweight alternatives of leave-one-out procedure for hyperparameter selection. The main theoretical contributions are: (a) we prove the necessary and sufficient condition for the existence of the span of a support vector in weighted SVM; and (b) we prove the extension of span bound and span-rule to weighted SVM. We experimentally evaluate the span bound and the span-rule for hyperparameter selection and we compare them with other methods that are applicable to weighted SVM: the K-fold cross-validation and the $\xi - \alpha$ bound. Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than K-fold cross-validation.

@article{Sarafis2018CoRR,
author={Ioannis Sarafis and Christos Diou and Anastasios Delopoulos},
title={Span error bound for weighted SVM with applications in hyperparameter selection (preprint)},
journal={CoRR},
volume={abs/1809.06124},
year={2018},
month={09},
date={2018-09-17},
url={https://arxiv.org/pdf/1809.06124.pdf},
abstract={Weighted SVM (or fuzzy SVM) is the most widely used SVM variant owing its effectiveness to the use of instance weights. Proper selection of the instance weights can lead to increased generalization performance. In this work, we extend the span error bound theory to weighted SVM and we introduce effective hyperparameter selection methods for the weighted SVM algorithm. The significance of the presented work is that it enables the application of span bound and span-rule with weighted SVM. The span bound is an upper bound of the leave-one-out error that can be calculated using a single trained SVM model. This is important since leave-one-out error is an almost unbiased estimator of the test error. Similarly, the span-rule gives the actual value of the leave-one-out error. Thus, one can apply span bound and span-rule as computationally lightweight alternatives of leave-one-out procedure for hyperparameter selection. The main theoretical contributions are: (a) we prove the necessary and sufficient condition for the existence of the span of a support vector in weighted SVM; and (b) we prove the extension of span bound and span-rule to weighted SVM. We experimentally evaluate the span bound and the span-rule for hyperparameter selection and we compare them with other methods that are applicable to weighted SVM: the K-fold cross-validation and the $\xi - \alpha$ bound. Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than K-fold cross-validation.}
}

(C)
Alexandros Papadopoulos, Konstantinos Kyritsis, Ioannis Sarafis and Anastasios Delopoulos
"Personalised meal eating behaviour analysis via semi-supervised learning"
40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, Honolulu, HI, USA, 2018 Oct

Automated monitoring and analysis of eating behaviour patterns, i.e., “how one eats”, has recently received much attention by the research community, owing to the association of eating patterns with health-related problems and especially obesity and its comorbidities. In this work, we introduce an improved method for meal micro-structure analysis. Stepping on a previous methodology of ours that combines feature extraction, SVM micro-movement classification and LSTM sequence modelling, we propose a method to adapt a pretrained IMU-based food intake cycle detection model to a new subject, with the purpose of improving model performance for that subject. We split model training into two stages. First, the model is trained using standard supervised learning techniques. Then, an adaptation step is performed, where the model is fine-tuned on unlabeled samples of the target subject via semisupervised learning. Evaluation is performed on a publicly available dataset that was originally created and used in [1] and has been extended here to demonstrate the effect of the semisupervised approach, where the proposed method improves over the baseline method.

@conference{papadopoulos2018personalised,
author={Alexandros Papadopoulos and Konstantinos Kyritsis and Ioannis Sarafis and Anastasios Delopoulos},
title={Personalised meal eating behaviour analysis via semi-supervised learning},
booktitle={40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)},
publisher={IEEE},
address={Honolulu, HI, USA},
year={2018},
month={10},
date={2018-10-29},
url={http://mug.ee.auth.gr/wp-content/uploads/papadopoulos2018personalised.pdf},
doi={10.1109/EMBC.2018.8513174},
abstract={Automated monitoring and analysis of eating behaviour patterns, i.e., “how one eats”, has recently received much attention by the research community, owing to the association of eating patterns with health-related problems and especially obesity and its comorbidities. In this work, we introduce an improved method for meal micro-structure analysis. Stepping on a previous methodology of ours that combines feature extraction, SVM micro-movement classification and LSTM sequence modelling, we propose a method to adapt a pretrained IMU-based food intake cycle detection model to a new subject, with the purpose of improving model performance for that subject. We split model training into two stages. First, the model is trained using standard supervised learning techniques. Then, an adaptation step is performed, where the model is fine-tuned on unlabeled samples of the target subject via semisupervised learning. Evaluation is performed on a publicly available dataset that was originally created and used in [1] and has been extended here to demonstrate the effect of the semisupervised approach, where the proposed method improves over the baseline method.}
}

2017

(C)
Christos Diou, Ioannis Sarafis, Ioannis Ioakimidis and Anastasios Delopoulos
"Data-driven assessments for sensor measurements of eating behavior"
Biomedical & Health Informatics (BHI), 2017 IEEE EMBS International Conference on, pp. 129-132, 2017 Jan

Two major challenges in sensor-based measurement and assessment of healthy eating behavior are (a) choosing the behavioral indicators to be measured, and (b) interpreting the measured values. While much of the work towards solving these problems belongs in the domain of behavioral science, there are several areas where technology can help. This paper outlines an approach for representing and interpreting eating and activity behavior based on sensor measurements and data available from a reference population. The main idea is to assess the “similarity” of an individual's behavior to previous data recordings of a relevant reference population. Thus, by appropriate selection of the indicators and reference data it is possible to perform comparative behavioral evaluation and support decisions, even in cases where no clear medical guidelines for the indicator values exist. We examine the simple, univariate case (one indicator) and then extend these ideas to the multivariate problem (several indicators) using one-class SVM to measure the distance from the reference population.

@inproceedings{diou2017data,
author={Christos Diou and Ioannis Sarafis and Ioannis Ioakimidis and Anastasios Delopoulos},
title={Data-driven assessments for sensor measurements of eating behavior},
booktitle={Biomedical \& Health Informatics (BHI), 2017 IEEE EMBS International Conference on},
pages={129-132},
year={2017},
month={01},
date={2017-01-01},
url={http://ieeexplore.ieee.org/document/7897222/},
abstract={Two major challenges in sensor-based measurement and assessment of healthy eating behavior are (a) choosing the behavioral indicators to be measured, and (b) interpreting the measured values. While much of the work towards solving these problems belongs in the domain of behavioral science, there are several areas where technology can help. This paper outlines an approach for representing and interpreting eating and activity behavior based on sensor measurements and data available from a reference population. The main idea is to assess the “similarity” of an individual's behavior to previous data recordings of a relevant reference population. Thus, by appropriate selection of the indicators and reference data it is possible to perform comparative behavioral evaluation and support decisions, even in cases where no clear medical guidelines for the indicator values exist. We examine the simple, univariate case (one indicator) and then extend these ideas to the multivariate problem (several indicators) using one-class SVM to measure the distance from the reference population.}
}

2016

(J)
Ioannis Sarafis, Christos Diou and Anastasios Delopoulos
"Online training of concept detectors for image retrieval using streaming clickthrough data"
Engineering Applications of Artificial Intelligence, 51, pp. 150-162, 2016 Jan
[Abstract][BibTex][pdf]

Clickthrough data from image search engines provide a massive and continuously generated source of user feedback that can be used to model how the search engine users perceive the visual content. Image clickthrough data have been successfully used to build concept detectors without any manual annotation effort, although the generated annotations suffer from labeling errors. Previous research efforts therefore focused on modeling the sample uncertainty in order to improve concept detector effectiveness. In this paper, we study the problem in an online learning setting using streaming clickthrough data where each click is treated separately when it becomes available; the concept detector model is therefore continuously updated without batch retraining. We argue that sample uncertainty can be incorporated in the online learning setting by exploiting the repetitions of incoming clicks at the classifier level, where these act as an implicit importance weighting mechanism. For online concept detector training we use the LASVM algorithm. The inferred weighting approximates the solution of batch trained concept detectors using weighted SVM variants that are known to achieve improved performance and high robustness to noise compared to the standard SVM. Furthermore, we evaluate methods for selecting negative samples using a small number of candidates sampled locally from the incoming stream of clicks. The selection criteria aim at drastically improving the performance and the convergence speed of the online concept detectors. To validate our arguments we conduct experiments for 30 concepts on the Clickture-Lite dataset.
The experimental results demonstrate that: (a) the proposed online approach produces effective and noise resilient concept detectors that can take advantage of streaming clickthrough data and achieve performance that is equivalent to Fuzzy SVM concept detectors with sample weights and 78.6% improved compared to standard SVM concept detectors; and (b) the selection criteria speed up convergence and improve effectiveness compared to random negative sampling even for a small number of available clicks (up to 134% after 100 clicks).

@article{Sarafis2016Online,
author={Ioannis Sarafis and Christos Diou and Anastasios Delopoulos},
title={Online training of concept detectors for image retrieval using streaming clickthrough data},
journal={Engineering Applications of Artificial Intelligence},
volume={51},
pages={150-162},
year={2016},
month={01},
date={2016-01-29},
url={http://www.sciencedirect.com/science/article/pii/S095219761600021X},
doi={10.1016/j.engappai.2016.01.017},
keywords={Clickthrough data;Online learning;Image retrieval;Label noise;Fuzzy SVM;LASVM},
abstract={Clickthrough data from image search engines provide a massive and continuously generated source of user feedback that can be used to model how the search engine users perceive the visual content. Image clickthrough data have been successfully used to build concept detectors without any manual annotation effort, although the generated annotations suffer from labeling errors. Previous research efforts therefore focused on modeling the sample uncertainty in order to improve concept detector effectiveness. In this paper, we study the problem in an online learning setting using streaming clickthrough data where each click is treated separately when it becomes available; the concept detector model is therefore continuously updated without batch retraining. We argue that sample uncertainty can be incorporated in the online learning setting by exploiting the repetitions of incoming clicks at the classifier level, where these act as an implicit importance weighting mechanism. For online concept detector training we use the LASVM algorithm. The inferred weighting approximates the solution of batch trained concept detectors using weighted SVM variants that are known to achieve improved performance and high robustness to noise compared to the standard SVM. Furthermore, we evaluate methods for selecting negative samples using a small number of candidates sampled locally from the incoming stream of clicks. The selection criteria aim at drastically improving the performance and the convergence speed of the online concept detectors. To validate our arguments we conduct experiments for 30 concepts on the Clickture-Lite dataset.
The experimental results demonstrate that: (a) the proposed online approach produces effective and noise resilient concept detectors that can take advantage of streaming clickthrough data and achieve performance that is equivalent to Fuzzy SVM concept detectors with sample weights and 78.6% improved compared to standard SVM concept detectors; and (b) the selection criteria speed up convergence and improve effectiveness compared to random negative sampling even for a small number of available clicks (up to 134% after 100 clicks).}
}

2015

(J)
Ioannis Sarafis, Christos Diou and Anastasios Delopoulos
"Building effective SVM concept detectors from clickthrough data for large-scale image retrieval"
International Journal of Multimedia Information Retrieval, 4, (2), pp. 129-142, 2015 Jun
[Abstract][BibTex][pdf]

Clickthrough data is a source of information that can be used for automatically building concept detectors for image retrieval. Previous studies, however, have shown that in many cases the resulting training sets suffer from severe label noise that has a significant impact on the SVM concept detector performance. This paper evaluates and proposes a set of strategies for automatically building effective concept detectors from clickthrough data. These strategies focus on: (1) automatic training set generation; (2) assignment of label confidence weights to the training samples and (3) using these weights at the classifier level to improve concept detector effectiveness. For training set selection and in order to assign weights to individual training samples three Information Retrieval (IR) models are examined: vector space models, BM25 and language models. Three SVM variants that take into account importance at the classifier level are evaluated and compared to the standard SVM: the Fuzzy SVM, the Power SVM, and the Bilateral-weighted Fuzzy SVM. Experiments conducted on the MM Grand Challenge dataset (consisting of 1M images and 82.3M unique clicks) for 40 concepts demonstrate that (1) on average, all weighted SVM variants are more effective than the standard SVM; (2) the vector space model produces the best training sets and best weights; (3) the Bilateral-weighted Fuzzy SVM produces the best results but is very sensitive to weight assignment and (4) the Fuzzy SVM is the most robust training approach for varying levels of label noise.

@article{Sarafis2015Building,
author={Ioannis Sarafis and Christos Diou and Anastasios Delopoulos},
title={Building effective SVM concept detectors from clickthrough data for large-scale image retrieval},
journal={International Journal of Multimedia Information Retrieval},
volume={4},
number={2},
pages={129-142},
year={2015},
month={06},
date={2015-06-01},
url={http://link.springer.com/article/10.1007/s13735-015-0080-5},
doi={10.1007/s13735-015-0080-5},
abstract={Clickthrough data is a source of information that can be used for automatically building concept detectors for image retrieval. Previous studies, however, have shown that in many cases the resulting training sets suffer from severe label noise that has a significant impact on the SVM concept detector performance. This paper evaluates and proposes a set of strategies for automatically building effective concept detectors from clickthrough data. These strategies focus on: (1) automatic training set generation; (2) assignment of label confidence weights to the training samples and (3) using these weights at the classifier level to improve concept detector effectiveness. For training set selection and in order to assign weights to individual training samples three Information Retrieval (IR) models are examined: vector space models, BM25 and language models. Three SVM variants that take into account importance at the classifier level are evaluated and compared to the standard SVM: the Fuzzy SVM, the Power SVM, and the Bilateral-weighted Fuzzy SVM. Experiments conducted on the MM Grand Challenge dataset (consisting of 1M images and 82.3M unique clicks) for 40 concepts demonstrate that (1) on average, all weighted SVM variants are more effective than the standard SVM; (2) the vector space model produces the best training sets and best weights; (3) the Bilateral-weighted Fuzzy SVM produces the best results but is very sensitive to weight assignment and (4) the Fuzzy SVM is the most robust training approach for varying levels of label noise.}
}

2014

(C)
Ioannis Sarafis, Christos Diou and Anastasios Delopoulos
"Building Robust Concept Detectors from Clickthrough Data: A Study in the MSR-Bing Dataset"
2014 9th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), pp. 66-71, 2014 Nov
[Abstract][BibTex][pdf]

In this paper we extend our previous work on strategies for automatically constructing noise-resilient SVM detectors from clickthrough data for large-scale concept-based image retrieval. First, search log data is used in conjunction with Information Retrieval (IR) models to score images with respect to each concept. The IR models evaluated in this work include Vector Space Models (VSM), BM25 and Language Models (LM). The scored images are then used to create training sets for SVM and appropriate sample weights for two SVM variants: the Fuzzy SVM (FSVM) and the Power SVM (PSVM). These SVM variants incorporate weights for each individual training sample and can therefore be used to model label uncertainty at the classifier level. Experiments on the MSR-Bing Image Retrieval Grand Challenge dataset (consisting of 1M images and 82.3M unique clicks) show that FSVM is the most robust SVM algorithm for handling label noise and that the highest performance is achieved with weights derived from VSM. These results extend our previous findings on the value of FSVM from professional image archives to large-scale general purpose search engines, and furthermore identify VSM as the most appropriate sample weighting model.

@inproceedings{Sarafis2014Building,
author={Ioannis Sarafis and Christos Diou and Anastasios Delopoulos},
title={Building Robust Concept Detectors from Clickthrough Data: A Study in the MSR-Bing Dataset},
booktitle={2014 9th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP)},
pages={66-71},
year={2014},
month={11},
date={2014-11-01},
url={http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6978955},
doi={10.1109/SMAP.2014},
abstract={In this paper we extend our previous work on strategies for automatically constructing noise-resilient SVM detectors from clickthrough data for large-scale concept-based image retrieval. First, search log data is used in conjunction with Information Retrieval (IR) models to score images with respect to each concept. The IR models evaluated in this work include Vector Space Models (VSM), BM25 and Language Models (LM). The scored images are then used to create training sets for SVM and appropriate sample weights for two SVM variants: the Fuzzy SVM (FSVM) and the Power SVM (PSVM). These SVM variants incorporate weights for each individual training sample and can therefore be used to model label uncertainty at the classifier level. Experiments on the MSR-Bing Image Retrieval Grand Challenge dataset (consisting of 1M images and 82.3M unique clicks) show that FSVM is the most robust SVM algorithm for handling label noise and that the highest performance is achieved with weights derived from VSM. These results extend our previous findings on the value of FSVM from professional image archives to large-scale general purpose search engines, and furthermore identify VSM as the most appropriate sample weighting model.}
}

(C)
Ioannis Sarafis, Christos Diou, Theodora Tsikrika and Anastasios Delopoulos
"Weighted SVM from clickthrough data for image retrieval"
2014 IEEE International Conference on Image Processing (ICIP), pp. 3013-3017, 2014 Aug
[Abstract][BibTex][pdf]

In this paper we propose a novel approach to training noise-resilient concept detectors from clickthrough data collected by image search engines. We take advantage of the query logs to automatically produce concept detector training sets; these, however, suffer from label noise, i.e., erroneously assigned labels. We explore two alternative approaches for handling noisy training data at the classifier level by training concept detectors with two SVM variants: the Fuzzy SVM and the Power SVM. Experimental results on images collected from a professional image search engine indicate that 1) Fuzzy SVM outperforms both SVM and Power SVM and is the most effective approach towards handling label noise and 2) the performance gain of Fuzzy SVM compared to SVM increases progressively with the noise level in the training sets.

@inproceedings{Sarafis2014Weighted,
author={Ioannis Sarafis and Christos Diou and Theodora Tsikrika and Anastasios Delopoulos},
title={Weighted SVM from clickthrough data for image retrieval},
booktitle={2014 IEEE International Conference on Image Processing (ICIP)},
pages={3013-3017},
year={2014},
month={08},
date={2014-08-01},
url={http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=7025609},
doi={10.1109/ICIP.2014.7025609},
abstract={In this paper we propose a novel approach to training noise-resilient concept detectors from clickthrough data collected by image search engines. We take advantage of the query logs to automatically produce concept detector training sets; these, however, suffer from label noise, i.e., erroneously assigned labels. We explore two alternative approaches for handling noisy training data at the classifier level by training concept detectors with two SVM variants: the Fuzzy SVM and the Power SVM. Experimental results on images collected from a professional image search engine indicate that 1) Fuzzy SVM outperforms both SVM and Power SVM and is the most effective approach towards handling label noise and 2) the performance gain of Fuzzy SVM compared to SVM increases progressively with the noise level in the training sets.}
}