Browsing by Author "Soto, Alvaro"

A STATISTICAL APPROACH TO SIMULTANEOUS MAPPING AND LOCALIZATION FOR MOBILE ROBOTS
(INST MATHEMATICAL STATISTICS, 2007) Araneda, Anita; Fienberg, Stephen E.; Soto, Alvaro
Mobile robots require basic information to navigate through an environment: they need to know where they are (localization) and they need to know where they are going. For the latter, robots need a map of the environment. Using sensors of a variety of forms, robots gather information as they move through an environment in order to build a map. In this paper we present a novel sampling algorithm for solving the simultaneous mapping and localization (SLAM) problem in indoor environments. We approach the problem from a Bayesian statistics perspective. The data correspond to a set of range finder and odometer measurements, obtained at discrete time instants. We focus on the estimation of the posterior distribution over the space of possible maps given the data. By exploiting different factorizations of this distribution, we derive three sampling algorithms based on importance sampling. We illustrate the results of our approach by testing the algorithms with two real data sets obtained through robot navigation inside office buildings at Carnegie Mellon University and the Pontificia Universidad Catolica de Chile.
Active Visual Perception for Mobile Robot Localization
(2010) Correa, Javier; Soto, Alvaro
Localization is a key issue for a mobile robot, in particular in environments where a globally accurate positioning system, such as GPS, is not available. In these environments, accurate and efficient robot localization is not a trivial task, as an increase in accuracy usually leads to a loss of efficiency and vice versa. Active perception appears as an appealing way to improve the localization process by increasing the richness of the information acquired from the environment. In this paper, we present an active perception strategy for a mobile robot provided with a visual sensor mounted on a pan-tilt mechanism. The visual sensor has a limited field of view, so the goal of the active perception strategy is to use the pan-tilt unit to direct the sensor to informative parts of the environment. To achieve this goal, we use a topological map of the environment and a Bayesian non-parametric estimation of robot position based on a particle filter. We slightly modify the regular implementation of this filter by including an additional step that selects the best perceptual action using Monte Carlo estimations. We understand the best perceptual action as the one that produces the greatest reduction in uncertainty about the robot position. We also consider in our optimization function a cost term that favors efficient perceptual actions. Previous works have proposed active perception strategies for robot localization, but mainly in the context of range sensors, grid representations of the environment, and parametric techniques, such as the extended Kalman filter. Accordingly, the main contributions of this work are: i) Development of a sound strategy for active selection of perceptual actions in the context of a visual sensor and a topological map; ii) Real time operation using a modified version of the particle filter and Monte Carlo based estimations; iii) Implementation and testing of these ideas using simulations and a real case scenario.
Our results indicate that, in terms of accuracy of robot localization, the proposed approach decreases mean average error and standard deviation with respect to a passive perception scheme. Furthermore, in terms of efficiency, the active scheme is able to operate in real time without adding a relevant overhead to the regular robot operation.
An accelerated algorithm for density estimation in large databases using Gaussian mixtures
(2007) Soto, Alvaro; Zavala, Felipe; Araneda, Anita
Today, with the advances of computer storage and technology, there are huge datasets available, offering an opportunity to extract valuable information. Probabilistic approaches are especially suited to learn from data by representing knowledge as density functions. In this paper, we choose Gaussian mixture models (GMMs) to represent densities, as they possess great flexibility to adapt to a wide class of problems. The classical estimation approach for GMMs corresponds to the iterative algorithm of expectation maximization (EM). This approach, however, does not scale properly to meet the highly demanding processing requirements of large databases. In this paper we introduce an EM-based algorithm that solves the scalability problem. Our approach is based on the concept of data condensation which, in addition to substantially diminishing the computational load, provides sound starting values that allow the algorithm to reach convergence faster. We also focus on the model selection problem. We test our algorithm using synthetic and real databases, and find several advantages when compared to other standard existing procedures.
An autonomous educational mobile robot mediator
(2008) Mitnik, Ruben; Nussbaum, Miguel; Soto, Alvaro
So far, most of the applications of robotic technology to education have mainly focused on supporting the teaching of subjects that are closely related to the Robotics field, such as robot programming, robot construction, or mechatronics. Moreover, most of the applications have used the robot as an end or a passive tool of the learning activity, where the robot has been constructed or programmed. In this paper, we present a novel application of robotic technologies to education, where we use the real world situatedness of a robot to teach non-robotic related subjects, such as math and physics. Furthermore, we also provide the robot with a suitable degree of autonomy to actively guide and mediate in the development of the educational activity. We present our approach as an educational framework based on a collaborative and constructivist learning environment, where the robot is able to act as an interaction mediator capable of managing the interactions occurring among the working students. We illustrate the use of this framework by a 4-step methodology that is used to implement two educational activities. These activities were tested at local schools with encouraging results. Accordingly, the main contributions of this work are: i) A novel use of a mobile robot to illustrate and teach relevant concepts and properties of the real world; ii) A novel use of robots as mediators that autonomously guide an educational activity using a collaborative and constructivist learning approach; iii) The implementation and testing of these ideas in a real scenario, working with students at local schools.
APEX: affordance-based plan executor for indoor robotic navigation
(2024) Sepulveda, Gabriel; Vazquez, Marynel; Soto, Alvaro
The concept of affordance is commonly described as the set of possibilities an environment offers an animal to select and execute actions. In the context of autonomous robot navigation, determining which affordances are available to a mobile agent provides crucial information to perform navigation tasks such as executing navigation plans or exploring new environments. In this work, we provide a robot with the capability of detecting changes in the set of navigation affordances that it has available as it moves through indoor spaces. Specifically, given a navigation plan represented as a sequence of high-level behaviors such as 'turn left' or 'follow a corridor', we propose the Affordance-Based Plan Executor (APEX) as a new learning-based method to execute this plan based on the identification of navigation affordances. As a relevant fact, the proposed method does not require a map of the environment at execution time. To the best of our knowledge, our work is one of the first to explicitly consider the identification of navigation affordances as a central element to implement the execution of a behavior-based navigation plan in an indoor environment. Our experiments using the Gibson simulator and environments from the Stanford 2D-3D-S dataset suggest that our approach performs well in previously seen environments in comparison to baselines, exceeding alternative approaches' success rate by up to 43%. Additionally, our approach shows promise at generalizing to previously unseen environments, exceeding the success rate of alternative approaches by up to 26.3%. Finally, our experiments confirm that explicit reasoning about affordances is key to APEX's performance.
Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations
(ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2021) Araujo Vasquez, Vladimir Giovanny; Villa, Andres; Mendoza Rocha, Marcelo Gabriel; Moens, Marie-Francine; Soto, Alvaro
Current language models are usually trained using a self-supervised scheme, where the main focus is learning representations at the word or sentence level. However, there has been limited progress in generating useful discourse-level representations. In this work, we propose to use ideas from predictive coding theory to augment BERT-style language models with a mechanism that allows them to learn suitable discourse-level representations. As a result, our proposed approach is able to predict future sentences using explicit top-down connections that operate at the intermediate layers of the network. By experimenting with benchmarks designed to evaluate discourse-related knowledge using pre-trained sentence representations, we demonstrate that our approach improves performance in 6 out of 11 tasks by excelling in discourse relationship detection.
Automated Design of a Computer Vision System for Visual Food Quality Evaluation
(2013) Mery Quiroz, Domingo Arturo; Pedreschi, Franco; Soto, Alvaro
Automated fish bone detection using X-ray imaging
(ELSEVIER SCI LTD, 2011) Mery, Domingo; Lillo, Ivan; Loebel, Hans; Riffo, Vladimir; Soto, Alvaro; Cipriano, Aldo; Miguel Aguilera, Jose
In countries where fish is often consumed, fish bones are some of the most frequently ingested foreign bodies encountered in foods. In the production of fish fillets, fish bone detection is performed by human inspectors using their sense of touch and vision, which can lead to misclassification. Effective detection of fish bones in the quality control process would help avoid this problem. For this reason, an X-ray machine vision approach to automatically detect fish bones in fish fillets was developed. This paper describes our approach and the corresponding experiments with salmon and trout fillets. In the experiments, salmon X-ray images were analyzed using 10 x 10 pixel detection windows and 24 intensity features (selected from 279 features). The methodology was validated using representative fish bones and trout provided by a salmon industry and yielded a detection performance of 99%. We believe that the proposed approach opens new possibilities in the field of automated visual inspection of salmon, trout and other similar fish. (C) 2011 Elsevier Ltd. All rights reserved.
Automatic document screening of medical literature using word and text embeddings in an active learning setting
(SPRINGER, 2020) Carvallo, Andres; Parra, Denis; Lobel, Hans; Soto, Alvaro
Document screening is a fundamental task within Evidence-based Medicine (EBM), a practice that provides scientific evidence to support medical decisions. Several approaches have tried to reduce physicians' workload of screening and labeling vast amounts of documents to answer clinical questions. Previous works tried to semi-automate document screening, reporting promising results, but their evaluation was conducted on small datasets, which hinders generalization. Moreover, recent works in natural language processing have introduced neural language models, but none have compared their performance in EBM. In this paper, we evaluate the impact of several document representations such as TF-IDF along with neural language models (BioBERT, BERT, Word2Vec, and GloVe) on an active learning-based setting for document screening in EBM. Our goal is to reduce the number of documents that physicians need to label to answer clinical questions. We evaluate these methods using both a small challenging dataset (CLEF eHealth 2017) and a larger one that is easier to rank (Epistemonikos). Our results indicate that word as well as textual neural embeddings always outperform the traditional TF-IDF representation. When comparing among neural and textual embeddings, in the CLEF eHealth dataset the models BERT and BioBERT yielded the best results. On the larger dataset, Epistemonikos, Word2Vec and BERT were the most competitive, showing that BERT was the most consistent model across different corpora. In terms of active learning, an uncertainty sampling strategy combined with a logistic regression achieved the best performance overall, above other methods under evaluation, and in fewer iterations. Finally, we compared the results of evaluating our best models, trained using active learning, with other authors' methods from CLEF eHealth, showing better results in terms of work saved for physicians in the document-screening task.
Collaborative robotic instruction: A graph teaching experience
(PERGAMON-ELSEVIER SCIENCE LTD, 2009) Mitnik, Ruben; Recabarren, Matias; Nussbaum, Miguel; Soto, Alvaro
Graphing is a key skill in the study of Physics. Drawing and interpreting graphs play a key role in the understanding of science, while the lack of these skills has proved to be a handicap and a limiting factor in the learning of scientific concepts. It has been observed that despite the amount of previous graph-working experience, students of all ages experience a series of difficulties when trying to comprehend graphs or when trying to relate them to physical concepts such as position, velocity and acceleration. Several computational tools have arisen to improve the students' understanding of kinematical graphs; however, these approaches fail to develop graph construction skills. On the other hand, robots have opened new opportunities in learning. Nevertheless, most of their educational applications focus on Robotics related subjects, such as robot programming, robot construction, and artificial intelligence. This paper describes a robotic activity based on face-to-face computer supported collaborative learning. By means of a set of handhelds and a robot wirelessly interconnected, the aim of the activity is to develop graph construction and graph interpretation skills while also reinforcing kinematics concepts. Results show that students using the robotic activity achieve a significant increase in their graph interpreting skills. Moreover, when compared with a similar computer-simulated activity, it proved to be almost twice as effective. Finally, the robotic application proved to be a highly motivating activity for the students, fostering collaboration among them. (C) 2009 Elsevier Ltd. All rights reserved.
Features: The More The Better
(2008) Mery Quiroz, Domingo Arturo; Soto, Alvaro
Human detection using a mobile platform and novel features derived from a visual saliency mechanism
(ELSEVIER, 2010) Montabone, Sebastian; Soto, Alvaro
Human detection is a key ability for an increasing number of applications that operate in human inhabited environments or need to interact with a human user. Currently, most successful approaches to human detection are based on background subtraction techniques that apply only to the case of static cameras or cameras with highly constrained motions. Furthermore, many applications rely on features derived from specific human poses, such as systems based on features derived from the human face, which is only visible when a person is facing the detecting camera. In this work, we present a new computer vision algorithm designed to operate with moving cameras and to detect humans in different poses under partial or complete view of the human body. We follow a standard pattern recognition approach based on four main steps: (i) preprocessing to achieve color constancy and stereo pair calibration, (ii) segmentation using depth continuity information, (iii) feature extraction based on visual saliency, and (iv) classification using a neural network. The main novelty of our approach lies in the feature extraction step, where we propose novel features derived from a visual saliency mechanism. In contrast to previous works, we do not use a pyramidal decomposition to run the saliency algorithm, but we implement this at the original image resolution using the so-called integral image. Our results indicate that our method: (i) outperforms state-of-the-art techniques for human detection based on face detectors, (ii) outperforms state-of-the-art techniques for complete human body detection based on different sets of visual features, and (iii) operates in real time onboard a mobile platform, such as a mobile robot (15 fps). (C) 2009 Elsevier B.V. All rights reserved.
Indoor Mobile Robotics at Grima, PUC
(2012) Caro, Luis; Correa, Javier; Espinace, Pablo; Langdon, Daniel; Maturana, Daniel; Mitnik, Ruben; Montabone, Sebastian; Pszczolkowski, Stefan; Araneda, Anita; Mery Quiroz, Domingo Arturo; Torres, Miguel; Soto, Alvaro
Inspecting the concept knowledge graph encoded by modern language models
(Association for Computational Linguistics (ACL), 2021) Aspillaga, Carlos; Soto, Alvaro; Mendoza Rocha, Marcelo Gabriel
The field of natural language understanding has experienced exponential progress in the last few years, with impressive results in several tasks. This success has motivated researchers to study the underlying knowledge encoded by these models. Despite this, attempts to understand their semantic capabilities have not been successful, often leading to inconclusive or contradictory conclusions among different works. Via a probing classifier, we extract the underlying knowledge graph of nine of the most influential language models of recent years, including word embeddings, text generators, and context encoders. This probe is based on concept relatedness, grounded on WordNet. Our results reveal that all the models encode this knowledge, but suffer from several inaccuracies. Furthermore, we show that the different architectures and training strategies lead to different model biases. We conduct a systematic evaluation to discover specific factors that explain why some concepts are challenging. We hope our insights will motivate the development of models that capture concepts more precisely.
Learning Sentence-Level Representations with Predictive Coding
(2023) Araujo, Vladimir; Moens, Marie-Francine; Soto, Alvaro
Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train big models on a massive text corpus, focusing mainly on learning the representation of contextualized words. As a result, these models cannot generate informative sentence embeddings since they do not explicitly exploit the structure and discourse relationships existing in contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer in the networks. We conduct extensive experimentation with various benchmarks for the English and Spanish languages, designed to assess sentence- and discourse-level representations and pragmatics-focused assessments. Our results show that our approach improves sentence representations consistently for both languages. Furthermore, the experiments also indicate that our models capture discourse and pragmatics knowledge. In addition, to validate the proposed method, we carried out an ablation study and a qualitative study with which we verified that the predictive mechanism helps to improve the quality of the representations.
Overcoming Catastrophic Forgetting Using Sparse Coding and Meta Learning
(2021) Hurtado, Julio; Lobel, Hans; Soto, Alvaro
Continuous learning occurs naturally in human beings. However, Deep Learning methods suffer from a problem known as Catastrophic Forgetting (CF) that consists of a model drastically decreasing its performance on previously learned tasks when it is sequentially trained on new tasks. This situation, known as task interference, occurs when a network modifies relevant weight values as it learns a new task. In this work, we propose two main strategies to face the problem of task interference in convolutional neural networks. First, we use a sparse coding technique to adaptively allocate model capacity to different tasks avoiding interference between them. Specifically, we use a strategy based on group sparse regularization to specialize groups of parameters to learn each task. Afterward, by adding binary masks, we can freeze these groups of parameters, using the rest of the network to learn new tasks. Second, we use a meta learning technique to foster knowledge transfer among tasks, encouraging weight reusability instead of overwriting. Specifically, we use an optimization strategy based on episodic training to foster learning weights that are expected to be useful to solve future tasks. Together, these two strategies help us to avoid interference by preserving compatibility with previous and future weight values. Using this approach, we achieve state-of-the-art results on popular benchmarks used to test techniques to avoid CF. In particular, we conduct an ablation study to identify the contribution of each component of the proposed method, demonstrating its ability to avoid retroactive interference with previous tasks and to promote knowledge transfer to future tasks.
PIVOT: Prompting for Video Continual Learning
(IEEE Computer Soc., 2023) Villa Ojeda, Andres Felipe; Alcazar, Juan Leon; Alfarra, Motasem; Alhamoud, Kumail; Hurtado, Julio; Heilbron, Fabian Caba; Soto, Alvaro; Ghanem, Bernard
Modern machine learning pipelines are limited due to data availability, storage quotas, privacy regulations, and expensive annotation processes. These constraints make it difficult or impossible to train and update large-scale models on such dynamic annotated sets. Continual learning directly approaches this problem, with the ultimate goal of devising methods where a deep neural network effectively learns relevant patterns for new (unseen) classes, without significantly altering its performance on previously learned ones. In this paper, we address the problem of continual learning for video data. We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain, thereby reducing the number of trainable parameters and the associated forgetting. Unlike previous methods, ours is the first approach that effectively uses prompting mechanisms for continual learning without any in-domain pre-training. Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
Quality classification of corn tortillas using computer vision
(ELSEVIER SCI LTD, 2010) Mery, Domingo; Chanona Perez, Jorge J.; Soto, Alvaro; Miguel Aguilera, Jose; Cipriano, Aldo; Velez Rivera, Nayeli; Arzate Vazquez, Israel; Gutierrez Lopez, Gustavo F.
Computer vision is playing an increasingly important role in automated visual food inspection. However, quality control in tortilla production is still performed by human operators, which may lead to misclassification due to their subjectivity and fatigue. In order to reduce the need for human operators and therefore misclassification, we developed a computer vision framework to automatically classify the quality of corn tortillas according to five hedonic sub-classes given by a sensory panel. The proposed framework analyzed 750 corn tortillas obtained from 15 different Mexican commercial stores which were either small, medium or large in size. More than 2300 geometric and color features were extracted from 1500 images capturing both sides of the 750 tortillas. After implementing a feature selection algorithm, in which the most relevant features were selected for the classification of the five sub-classes, only 64 features were required to design a classifier based on support vector machines. Cross-validation yielded a performance of 95% in the classification of the five hedonic sub-classes. Additionally, using only 10 of the selected features and a simple statistical classifier, it was possible to determine the origin of the tortillas with a performance of 96%. We believe that the proposed framework opens up new possibilities in the field of automated visual inspection of tortillas. (c) 2010 Elsevier Ltd. All rights reserved.
Unsupervised anomaly detection in large databases using Bayesian networks
(TAYLOR & FRANCIS INC, 2008) Cansado, Antonio; Soto, Alvaro
Today, there has been a massive proliferation of huge databases storing valuable information. The opportunities of an effective use of these new data sources are enormous; however, the huge size and dimensionality of current large databases call for new ideas to scale up current statistical and computational approaches. This article presents an application of artificial intelligence technology to the problem of automatic detection of candidate anomalous records in a large database. We build our approach with three main goals in mind: 1) an effective detection of the records that are potentially anomalous; 2) a suitable selection of the subset of attributes that explains what makes a record anomalous; and 3) an efficient implementation that allows us to scale the approach to large databases. Our algorithm, called Bayesian network anomaly detector (BNAD), uses the joint probability density function (pdf) provided by a Bayesian network (BN) to achieve these goals. By using appropriate data structures, advanced caching techniques, the flexibility of Gaussian mixture models, and the efficiency of BNs to model joint pdfs, BNAD manages to efficiently learn a suitable BN from a large dataset. We test BNAD using synthetic and real databases, the latter from the fields of manufacturing and astronomy, obtaining encouraging results.
Using data mining techniques to predict industrial wine problem fermentations
(ELSEVIER SCI LTD, 2007) Urtubia, Alejandra; Perez Correa, J. Ricardo; Soto, Alvaro; Pszczolkowski, Philippo
Winemakers currently lack the tools to identify early signs of undesirable fermentation behavior and so are unable to take possible mitigating actions. Data collected from tracking 24 industrial fermentations of Cabernet Sauvignon were used in this study to explore how useful data mining is for detecting anomalous behaviors in advance. A database held periodic measurements of 29 components that included sugar, alcohols, organic acids and amino acids. Owing to the scale of the problem, we used a two-stage classification procedure. First, PCA was used to reduce system dimensionality while preserving metabolite interaction information. Cluster analysis (K-Means) was then performed on the lower-dimensioned system to group fermentations into clusters of similar behavior. Numerous classifications were explored depending on the data used. Initially data from just the first three days were assessed, and then the entire data set was used. Information from the first three days' fermentation behavior provides important clues about the final classification. We also found a strong association between problematic fermentations and specific patterns found by the data mining tools. In short, data from the first three days contain sufficient information to establish the likelihood of a fermentation finishing normally. Results from this study are most encouraging. Data from many more fermentations and of different varieties need to be collected, however, to develop a reliable and more broadly applicable diagnostic tool. (c) 2006 Elsevier Ltd. All rights reserved.