Browsing by Author "Perez, Jorge"
- An extension of SPARQL for RDFS (SPRINGER-VERLAG BERLIN, 2008)
  Arenas, Marcelo; Gutierrez, Claudio; Perez, Jorge; Christophides, V; Collard, M; Gutierrez, C
  RDF Schema (RDFS) extends RDF with a schema vocabulary with a predefined semantics. Evaluating queries which involve this vocabulary is challenging, and there is not yet consensus in the Semantic Web community on how to define a query language for RDFS. In this paper, we introduce a language for querying RDFS data. This language is obtained by extending SPARQL with nested regular expressions that allow to navigate through an RDF graph with RDFS vocabulary. This language is expressive enough to answer SPARQL queries involving RDFS vocabulary, by directly traversing the input graph.
- Bidirectional Constraints for Exchanging Data: Beyond Monotone Queries (IJCAI-INT JOINT CONF ARTIF INTELL, 2015)
  Arenas, Marcelo; Dieguez, Gabriel; Perez, Jorge; Yang, Q; Wooldridge, M
  In this paper, we propose to use the language of bidirectional constraints to specify schema mappings in the context of data exchange. These constraints impose restrictions over both the source and the target data, and have the potential to minimize the ambiguity in the description of the target data to be materialized. We start by making a case for the usefulness of bidirectional constraints to give a meaningful closed-world semantics for st-tgds, which is motivated by Clark's predicate completion and Reiter's formalization of the closed-world assumption of a logical theory. We then formally study the use of bidirectional constraints in data exchange. In particular, we pinpoint the complexity of the existence-of-solutions and the query evaluation problems in several different scenarios, including in the latter case both monotone and nonmonotone queries.
- Composition and Inversion of Schema Mappings (ASSOC COMPUTING MACHINERY, 2009)
  Arenas, Marcelo; Perez, Jorge; Reutter, Juan; Riveros, Cristian
- nSPARQL: A Navigational Language for RDF (SPRINGER-VERLAG BERLIN, 2008)
  Perez, Jorge; Arenas, Marcelo; Gutierrez, Claudio; Sheth, A; Staab, S; Paolucci, M; Maynard, D; Finin, T; Krishnaprasad, T
  Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In particular, we have argued in a previous paper that nested regular expressions are appropriate to navigate RDF data, and we have proposed the nSPARQL query language for RDF, that uses nested regular expressions as building blocks. In this paper, we study some of the fundamental properties of nSPARQL concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that the evaluation of a nested regular expression E over an RDF graph G can be computed in time O(|G| · |E|).
- On the expressiveness of LARA: A proposal for unifying linear and relational algebra (2022)
  Barcelo, Pablo; Higuera, Nelson; Perez, Jorge; Subercaseaux, Bernardo
  We study the expressive power of the LARA language - a recently proposed unified model for expressing relational and linear algebra operations - both in terms of traditional database query languages and some analytic tasks often performed in machine learning pipelines. Since LARA is parameterized by a set of user-defined functions which allow to transform values in tables, known as extension functions, the exact expressive power of the language depends on how these functions are defined. We start by showing LARA to be expressive complete with respect to a syntactic fragment of relational algebra with aggregation (under the mild assumption that extension functions in LARA can cope with traditional relational algebra operations such as selection and renaming). We then look further into the expressiveness of LARA based on different classes of extension functions, and distinguish two main cases depending on the level of genericity that queries are enforced to satisfy. Under strong genericity assumptions the language cannot express matrix convolution, a very important operation in current machine learning pipelines. This language is also local, and thus cannot express operations such as matrix inverse that exhibit a recursive behavior. For expressing convolution, one can relax the genericity requirement by adding an underlying linear order on the domain. This, however, destroys locality and turns the expressive power of the language much more difficult to understand. In particular, although under complexity assumptions some versions of the resulting language can still not express matrix inverse, a proof of this fact without such assumptions seems challenging to obtain.
- Query language-based inverses of schema mappings: semantics, computation, and closure properties (2012)
  Arenas, Marcelo; Perez, Jorge; Reutter, Juan; Riveros, Cristian
  The inversion of schema mappings has been identified as one of the fundamental operators for the development of a general framework for metadata management. During the last few years, three alternative notions of inversion for schema mappings have been proposed (Fagin-inverse (Fagin, TODS 32(4), 25:1-25:53, 2007), quasi-inverse (Fagin et al., TODS 33(2), 11:1-11:52, 2008), and maximum recovery (Arenas et al., TODS 34(4), 22:1-22:48, 2009)). However, these notions lack some fundamental properties that limit their practical applicability: most of them are expressed in languages including features that are difficult to use in practice, some of these inverses are not guaranteed to exist for mappings specified with source-to-target tuple-generating dependencies (st-tgds), and it has been futile to search for a meaningful mapping language that is closed under any of these notions of inverse. In this paper, we develop a framework for the inversion of schema mappings that fulfills all of the above requirements. It is based on the notion of C-maximum recovery, for a query language C, a notion designed to generate inverse mappings that recover back only the information that can be retrieved with queries in C. By focusing on the language of conjunctive queries (CQ), we are able to find a mapping language that contains the class of st-tgds, is closed under CQ-maximum recovery, and for which the chase procedure can be used to exchange data efficiently. Furthermore, we show that our choices of inverse notion and mapping language are optimal, in the sense that choosing a more expressive inverse operator or mapping language causes the loss of these properties.
- Querying Semantic Data on the Web (ASSOC COMPUTING MACHINERY, 2012)
  Arenas, Marcelo; Gutierrez, Claudio; Miranker, Daniel P.; Perez, Jorge; Sequeda, Juan F.
- Semantics and Complexity of SPARQL (2009)
  Perez, Jorge; Arenas, Marcelo; Gutierrez, Claudio
  SPARQL is the standard language for querying RDF data. In this article, we address systematically the formal study of the database aspects of SPARQL, concentrating in its graph pattern matching facility. We provide a compositional semantics for the core part of SPARQL, and study the complexity of the evaluation of several fragments of the language. Among other complexity results, we show that the evaluation of general SPARQL patterns is PSPACE-complete. We identify a large class of SPARQL patterns, defined by imposing a simple and natural syntactic restriction, where the query evaluation problem can be solved more efficiently. This restriction gives rise to the class of well-designed patterns. We show that the evaluation problem is coNP-complete for well-designed patterns. Moreover, we provide several rewriting rules for well-designed patterns whose application may have a considerable impact in the cost of evaluating SPARQL queries.
- Simple and Efficient Minimal RDFS (ELSEVIER, 2009)
  Munoz, Sergio; Perez, Jorge; Gutierrez, Claudio
  The original RDFS language design includes several features that hinder the task of developers and theoreticians. This paper has two main contributions in the direction of simplifying the language. First, it introduces a small fragment which, preserving the normative semantics and the core functionalities, avoids the complexities of the original specification, and captures the main semantic functionalities of RDFS. Second, it introduces a minimalist deduction system over this fragment, which by avoiding certain rare cases, obtains a simple deductive system and a computationally efficient entailment checking. (C) 2009 Elsevier B.V. All rights reserved.
- The Expressive Power of Graph Neural Networks as a Query Language (2020)
  Barcelo, Pablo; Kostylev, Egor V.; Monet, Mikael; Perez, Jorge; Reutter, Juan L.; Silva, Juan-Pablo
  In this paper we survey our recent results characterizing various graph neural network (GNN) architectures in terms of their ability to classify nodes over graphs, for classifiers based on unary logical formulas, or queries. We focus on the language FOC2, a well-studied fragment of FO. This choice is motivated by the fact that FOC2 is related to the Weisfeiler-Lehman (WL) test for checking graph isomorphism, which has the same ability as GNNs for distinguishing nodes on graphs. We unveil the exact relationship between FOC2 and GNNs in terms of node classification. To tackle this problem, we start by studying a popular basic class of GNNs, which we call AC-GNNs, in which the features of each node in a graph are updated, in successive layers, according only to the features of its neighbors. We prove that the unary FOC2 formulas that can be captured by an AC-GNN are exactly those that can be expressed in its guarded fragment, which in turn corresponds to graded modal logic. This result implies in particular that AC-GNNs are too weak to capture all FOC2 formulas. We then seek for what needs to be added to AC-GNNs for capturing all FOC2. We show that it suffices to add readouts layers, which allow updating the node features not only in terms of its neighbors, but also in terms of a global attribute vector. We call GNNs with readouts ACR-GNNs. We also describe experiments that validate our findings by showing that, on synthetic data conforming to FOC2 but not to graded modal logic, AC-GNNs struggle to fit in while ACR-GNNs can generalise even to graphs of sizes not seen during training.
- The language of plain SO-tgds: Composition, inversion and structural properties (2013)
  Arenas, Marcelo; Perez, Jorge; Reutter, Juan; Riveros, Cristian
  The problems of composing and inverting schema mappings specified by source-to-target tuple-generating dependencies (st-tgds) have attracted a lot of attention, as they are of fundamental importance for the development of Bernstein's metadata management framework. In the case of the composition operator, a natural semantics has been proposed and the language of second-order tuple generating dependencies (SO-tgds) has been identified as the right language to express it. In the case of the inverse operator, several semantics have been proposed, most notably the maximum recovery, the only inverse notion that guarantees that every mapping specified by st-tgds is invertible. Unfortunately, less attention has been paid to combining both operators, which is the motivation of this paper. More precisely, we start our investigation by showing that SO-tgds are not good for inversion, as there exist mappings specified by SO-tgds that are not invertible under any of the notions of inversion proposed in the literature. To overcome this limitation, we borrow the notion of CQ-composition, which is a relaxation obtained by parameterizing the composition of mappings by the class of conjunctive queries (CQ), and we propose a restriction over the class of SO-tgds that gives rise to the language of plain SO-tgds. Then we show that plain SO-tgds are the right language to express the CQ-composition of mappings given by st-tgds, in the same sense that SO-tgds are the right language to express the composition of st-tgds, and we prove that every mapping specified by a plain SO-tgd admits a maximum recovery, thus showing that plain SO-tgds have a good behavior w.r.t. inversion. Moreover, we show that the language of plain SO-tgds shares some fundamental structural properties with the language of st-tgds, but being much more expressive, and we provide a polynomial-time algorithm to compute maximum recoveries for mappings specified by plain SO-tgds (which can also be used to compute maximum recoveries for mappings given by st-tgds). All these results suggest that the language of plain SO-tgds is a good alternative to be implemented in data exchange and data integration applications. (C) 2013 Elsevier Inc. All rights reserved.