Marcia Haskell Group

Public·14 members

D NetGuide Journal Vol (4), Issue (22)




  • About

    Welcome to the group! You can connect with other members, ge...
