Home
Search results for “Sparsity in text mining”
Weka Tutorial 25: Sparse Data (Data Preprocessing)
 
07:14
In this tutorial I demonstrate how to represent sparse data in the ARFF file format that Weka can read. LinkedIn: http://www.linkedin.com/pub/rushdi-shams/3b/83b/9b3
Views: 8543 Rushdi Shams
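The sparse ARFF format the tutorial covers writes each instance as `{index value, ...}` pairs, omitting zero entries (indices are 0-based). A minimal Python sketch of the encoding, using hypothetical attribute names:

```python
def to_sparse_arff_row(values):
    """Encode one instance as a sparse ARFF row: {index value, ...}.

    Zero entries are omitted; indices are 0-based, as Weka expects."""
    pairs = ["%d %s" % (i, v) for i, v in enumerate(values) if v != 0]
    return "{" + ", ".join(pairs) + "}"

# Hypothetical bag-of-words attributes for illustration
header = "\n".join([
    "@relation docs",
    "@attribute w_cat numeric",
    "@attribute w_dog numeric",
    "@attribute w_fish numeric",
    "@data",
])

rows = [to_sparse_arff_row(v) for v in [[0, 3, 0], [1, 0, 2]]]
print(header)
print("\n".join(rows))   # {1 3} and {0 1, 2 2}
```

For text-mining data, where most word counts are zero, this representation is far smaller than the dense one-value-per-attribute form.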
On the Anonymization of Sparse High-Dimensional Data
 
01:13
Title: On the Anonymization of Sparse High-Dimensional Data Domain: Data Mining Description: 1, Privacy preservation is the most focussed issue in information publication, on the grounds that the sensitive data shouldn't be disclosed. For this regard, several privacy preservation data mining algorithms are proposed. 2, Generalisation, Bucketisation and Anatomisation techniques are used as a part of this regard. They ensure the privacy of the user,either by modifying quasi identifier values or by including noise. 3, These techniques are well suited for low dimensional data and they expel the most valuable information from the dataset.In this work,we concentrate on protection against identity disclosure in the publication of sparse high dimensional data. 4, The sparse dataset which is scanty has less information distributed in the entire dataset.So,in the first phase we transform the dataset into a band matrix framework by coordinating Genetic algorithm with Cuckoo search algorithm.This makes the nearest rows associated and makes the non zero components near to the diagonal and lessens the search space and also memory. 5, In the other phase a novel anatomisation technique based on disassociation is introduced to safeguard privacy.This technique isolates the quasi identifier values with sensitive attributes and publishes quasi identifiers straightforwardly.Then density based clustering is employed to anonymise the underlying data,ands protects against identity disclosure and increases data utility The adversary cannot relate the sensitive value with high probability.Experimental results demonstrate that this technique decreases information loss, reconstruction error and increases data utility. For more details contact: E-Mail: [email protected] Buy Whole Project Kit for Rs 5000%. 
Project Kit: • 1 Review PPT • 2nd Review PPT • Full Coding with described algorithm • Video File • Full Document Note: *For bull purchase of projects and for outsourcing in various domains such as Java, .Net, .PHP, NS2, Matlab, Android, Embedded, Bio-Medical, Electrical, Robotic etc. contact us. *Contact for Real Time Projects, Web Development and Web Hosting services. *Comment and share on this video and win exciting developed projects for free of cost. Search Terms: 1. 2017 ieee projects 2. latest ieee projects in java 3. latest ieee projects in data mining 4. 2017 – 2018 data mining projects 5. 2017 – 2018 best project center in Chennai 6. best guided ieee project center in Chennai 7. 2017 – 2018 ieee titles 8. 2017 – 2018 base paper 9. 2017 – 2018 java projects in Chennai, Coimbatore, Bangalore, and Mysore 10. time table generation projects 11. instruction detection projects in data mining, network security 12. 2017 – 2018 data mining weka projects 13. 2017 – 2018 b.e projects 14. 2017 – 2018 m.e projects 15. 2017 – 2018 final year projects 16. affordable final year projects 17. latest final year projects 18. best project center in Chennai, Coimbatore, Bangalore, and Mysore 19. 2017 Best ieee project titles 20. best projects in java domain 21. free ieee project in Chennai, Coimbatore, Bangalore, and Mysore 22. 2017 – 2018 ieee base paper free download 23. 2017 – 2018 ieee titles free download 24. best ieee projects in affordable cost 25. ieee projects free download 26. 2017 data mining projects 27. 2017 ieee projects on data mining 28. 2017 final year data mining projects 29. 2017 data mining projects for b.e 30. 2017 data mining projects for m.e 31. 2017 latest data mining projects 32. latest data mining projects 33. latest data mining projects in java 34. data mining projects in weka tool 35. data mining in intrusion detection system 36. intrusion detection system using data mining 37. intrusion detection system using data mining ppt 38. 
intrusion detection system using data mining technique 39. data mining approaches for intrusion detection 40. data mining in ranking system using weka tool 41. data mining projects using weka 42. data mining in bioinformatics using weka 43. data mining using weka tool 44. data mining tool weka tutorial 45. data mining abstract 46. data mining base paper 47. data mining research papers 2017 - 2018 48. 2017 - 2018 data mining research papers 49. 2017 data mining research papers 50. data mining IEEE Projects 52. data mining and text mining ieee projects 53. 2017 text mining ieee projects 54. text mining ieee projects 55. ieee projects in web mining 56. 2017 web mining projects 57. 2017 web mining ieee projects 58. 2017 data mining projects with source code 59. 2017 data mining projects for final year students 60. 2017 data mining projects in java 61. 2017 data mining projects for students
Joseph Salmon (ENST ParisTech): Convex Optimization, Sparsity and Regression in High Dimension
 
01:33:03
Talk given by Joseph Salmon at CIMAT on November 5th, during the Workshop on Image Processing/Statistical Pattern Recognition. Following seminal works by R. Tibshirani and D. Donoho in the mid-1990s, a tremendous number of new tools have been developed to handle regression when the number of explanatory variables (or features) is potentially larger than the sample size. The main ingredient, though, has been the design of methods leveraging sparsity. In this lecture, I will present a point of view relying mainly on modern convex optimization techniques that provide sparse solutions. Particular emphasis will be given to non-smooth regularized regression, including l1 regularization (Lasso) and sparse-group regularization (Group-Lasso). Algorithmic challenges depending on the nature of the data will be addressed, with potential applications in image processing, biostatistics and text mining. Last but not least, statistical results assessing the successes and failures of such methods will be presented.
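The Lasso the talk centers on induces sparsity by l1-penalizing the regression coefficients. A minimal scikit-learn sketch of the p > n regime the abstract describes, on synthetic data with hypothetical settings:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 50, 200                          # fewer samples than features
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]             # only 3 truly active features
y = X @ beta + 0.1 * rng.standard_normal(n)

# alpha is the l1 penalty strength (a hypothetical choice here)
model = Lasso(alpha=0.1).fit(X, y)
active = np.flatnonzero(model.coef_)
print("nonzero coefficients:", len(active), "of", p)
```

Despite p being four times n, the l1 penalty drives most coefficients exactly to zero, which is the sparsity property the lecture builds on.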
Lessons Learned while Implementing a Sparse Logistic Regression Algorithm in Spark with Lorand Dali
 
26:12
"This talk tells the story of implementation and optimization of a sparse logistic regression algorithm in spark. I would like to share the lessons I learned and the steps I had to take to improve the speed of execution and convergence of my initial naive implementation. The message isn't to convince the audience that logistic regression is great and my implementation is awesome, rather it will give details about how it works under the hood, and general tips for implementing an iterative parallel machine learning algorithm in spark. The talk is structured as a sequence of ""lessons learned"" that are shown in form of code examples building on the initial naive implementation. The performance impact of each ""lesson"" on execution time and speed of convergence is measured on benchmark datasets. You will see how to formulate logistic regression in a parallel setting, how to avoid data shuffles, when to use a custom partitioner, how to use the 'aggregate' and 'treeAggregate' functions, how momentum can accelerate the convergence of gradient descent, and much more. I will assume basic understanding of machine learning and some prior knowledge of spark. The code examples are written in scala, and the code will be made available for each step in the walkthrough. Session hashtag: #EUds9"
Views: 735 Databricks
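One of the talk's lessons, momentum-accelerated gradient descent, can be sketched outside Spark with plain NumPy. This is a toy single-machine batch version on synthetic data; the talk's actual implementation is distributed Scala, and the step sizes here are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, beta=0.9, iters=500):
    """Batch gradient descent with momentum on the mean logistic loss."""
    w = np.zeros(X.shape[1])
    v = np.zeros_like(w)                     # velocity term
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        v = beta * v + grad                  # momentum accumulates past gradients
        w -= lr * v
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = (sigmoid(X @ w_true) > rng.random(200)).astype(float)
w = fit_logistic(X, y)
```

The velocity term smooths successive gradients, which is the acceleration effect the talk measures on its benchmark datasets.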
Sparse and large-scale learning with heterogeneous data
 
54:58
Google Tech Talks September 5, 2006 Gert Lanckriet is assistant professor in the Electrical and Computer Engineering Department at the University of California, San Diego. He conducts research on machine learning, applied statistics and convex optimization with applications in computational biology, finance, music and vision. ABSTRACT An important challenge for the field of machine learning is to deal with the increasing amount of data that is available for learning and to leverage the (also increasing) diversity of information sources, describing these data. Beyond classical vectorial data formats, data in the format of graphs, trees, strings and beyond have become widely available for data mining, e.g., the linked structure of the world wide web, text, images and sounds on web pages, protein interaction networks, phylogenetic trees, etc. Moreover, for interpretability and economical reasons, decision rules that rely on a small subset of the information sources and/or a small subset of the features describing the data are highly desired: sparse learning algorithms are a must. This talk will outline two recent approaches that address sparse, large-scale learning with heterogeneous data, and show some applications. Google engEDU Speaker: Gert Lanckriet
Views: 814 GoogleTalksArchive
GMove: Group-Level Mobility Modeling using Geo-Tagged Social Media
 
18:35
Author: Chao Zhang, Department of Computer Science, University of Illinois at Urbana-Champaign Abstract: Understanding human mobility is of great importance to various applications, such as urban planning, traffic scheduling, and location prediction. While there has been fruitful research on modeling human mobility using tracking data (e.g., GPS traces), the recent growth of geo-tagged social media (GeoSM) brings new opportunities to this task because of its sheer size and multi-dimensional nature. Nevertheless, how to obtain quality mobility models from the highly sparse and complex GeoSM data remains a challenge that cannot be readily addressed by existing techniques. We propose GMOVE, a group-level mobility modeling method using GeoSM data. Our insight is that the GeoSM data usually contains multiple user groups, where the users within the same group share significant movement regularity. Meanwhile, user grouping and mobility modeling are two intertwined tasks: (1) better user grouping offers better within-group data consistency and thus leads to more reliable mobility models; and (2) better mobility models serve as useful guidance that helps infer the group a user belongs to. GMOVE thus alternates between user grouping and mobility modeling, and generates an ensemble of Hidden Markov Models (HMMs) to characterize group-level movement regularity. Furthermore, to reduce text sparsity of GeoSM data, GMOVE also features a text augmenter. The augmenter computes keyword correlations by examining their spatiotemporal distributions. With such correlations as auxiliary knowledge, it performs sampling-based augmentation to alleviate text sparsity and produce high-quality HMMs. Our extensive experiments on two real-life data sets demonstrate that GMOVE can effectively generate meaningful group-level mobility models. 
Moreover, with context-aware location prediction as an example application, we find that GMOVE significantly outperforms baseline mobility models in terms of prediction accuracy. More on http://www.kdd.org/kdd2016/ KDD2016 Conference is published on http://videolectures.net/
Views: 132 KDD2016 video
Lecture 47 — Singular Value Decomposition | Stanford University
 
13:40
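The singular value decomposition the lecture covers factors any matrix as A = U S Vᵀ, and truncating the factorization gives the best low-rank approximation. A minimal NumPy sketch:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Thin SVD: A = U @ diag(s) @ Vt, with s sorted in decreasing order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-1 approximation: keep only the largest singular value
A1 = s[0] * np.outer(U[:, 0], Vt[0])
```

By the Eckart-Young theorem, the spectral-norm error of the rank-1 approximation equals the first discarded singular value, which is why truncated SVD is the workhorse of low-rank methods.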
data.bythebay.io: Matt Seal, Unsupervised NLP Classification with Clustering
 
39:16
In the world of local government finances, training data is sparse. Language-based training data is almost non-existent. Furthermore, fiscal language in governments has a high domain-knowledge requirement to build training data and garner strong intuitions. This makes traditional supervised methods difficult to use successfully, as the training data you generate always lags raw data growth. To help tackle these challenges in performing NLP analysis we'll be showing techniques around relationship extraction and clustering to perform data understanding on domain-heavy topics. We'll be exploring these techniques on published local government budget PDFs to extract topics and gain insights into the purpose of domain-specific text. The format of the talk will follow each key point with code examples. First we’ll talk about data challenges in local government, and the lack of established knowledge bases around that data. Specifically we’ll explore the unknown-number-of-classes problem and how unsupervised algorithms can garner insights. Then we’ll focus on the families of clustering algorithms available and how they allow you to focus on edge associations rather than holistic state spaces. Following that we’ll explore some useful techniques for optimizing computation and how missing or skipped data points can be linked by association. Finally we’ll combine the pieces we’ve shown to perform topic extraction and understanding from public financial budgets.
Views: 352 FunctionalTV
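The unknown-number-of-classes problem the talk raises is one reason to prefer density-based clustering, which does not fix the number of clusters in advance. A minimal scikit-learn sketch on TF-IDF vectors, with toy texts and a hypothetical `eps` setting:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

texts = ["road repair budget", "street paving budget",
         "school lunch program", "school meal funding",
         "unrelated memo"]

X = TfidfVectorizer().fit_transform(texts)
# cosine distance suits sparse TF-IDF vectors; eps is a hypothetical setting
labels = DBSCAN(eps=0.9, min_samples=2, metric="cosine").fit_predict(X)
print(labels)   # -1 marks noise points that joined no cluster
```

DBSCAN discovers however many dense groups the data supports and marks outliers as noise, which fits a domain where you cannot enumerate the classes up front.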
Low-Sparsity Unobservable Attacks Against Smart Grid
 
10:19
KDD2016 paper 1190
 
03:23
Title: Pseudo-Document-based Topic Modeling of Short Texts without Auxiliary Information Authors: Yuan Zuo*, Beihang University; Junjie Wu, Beihang University; Has Lin, Beihang University; Hui Xiong, Rutgers Abstract: Recent years have witnessed the unprecedented growth of online social media, which has made short texts the prevalent format for information on the Internet. Given their sparse nature, however, short text topic modeling remains a critical yet open challenge in both academia and industry. Considerable research effort has been put into building different types of probabilistic topic models for short texts, among which self-aggregation methods that use no auxiliary information have become an emerging solution for providing informative cross-text word co-occurrences. However, models along this line are still rarely seen, and the representative one, SATM, is prone to overfitting and computationally expensive. In light of this, we propose a novel probabilistic model called PTM for short text topic modeling. PTM introduces the concept of a pseudo document to implicitly aggregate short texts against data sparsity. By modeling the topic distributions of latent pseudo documents rather than short texts, PTM is expected to gain excellent performance in both accuracy and efficiency. A sparsity-enhanced PTM (SPTM for short) is also proposed by applying a spike-and-slab prior, with the purpose of eliminating undesired correlations between pseudo documents and latent topics. Extensive experiments on various real-world data sets against state-of-the-art baselines demonstrate the high quality of topics learned by PTM and its robustness with reduced training samples. It is also interesting to show that (i) SPTM gains a clear edge over PTM when the number of pseudo documents is relatively small, and (ii) the constraint that a short text belongs to only one pseudo document is critically important for the success of PTM.
We finally take an in-depth semantic analysis to unveil directly the fabulous function of pseudo documents in finding cross-text word co-occurrences for topic modeling. More on http://www.kdd.org/kdd2016/ KDD2016 Conference will be recorded and published on http://videolectures.net/
Views: 386 KDD2016 video
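The pseudo-document idea in the abstract, aggregating short texts into longer documents before topic modeling to fight sparsity, can be sketched with scikit-learn. Here the text-to-pseudo-document assignment is a fixed, hypothetical grouping purely to show the pipeline; in PTM that assignment is learned jointly with the topics:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy short texts with a hypothetical pseudo-document id per text
short_texts = ["cheap flight deal", "flight ticket sale",
               "goal scored late", "late goal wins match"]
groups = [0, 0, 1, 1]

# Concatenate the short texts in each group into one pseudo document
pseudo_docs = {}
for g, t in zip(groups, short_texts):
    pseudo_docs.setdefault(g, []).append(t)
docs = [" ".join(ts) for ts in pseudo_docs.values()]

# Word-count vectors of pseudo documents are denser than per-text counts,
# giving the topic model usable word co-occurrences
X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
```

This uses a standard LDA stand-in rather than PTM itself; the point is only how aggregation turns near-empty count vectors into documents a topic model can learn from.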
A Parallel and Primal-Dual Sparse Method for Extreme Classification
 
01:53
A Parallel and Primal-Dual Sparse Method for Extreme Classification. Ian Yen (Carnegie Mellon University), Xiangru Huang (University of Texas at Austin), Wei Dai (Carnegie Mellon University), Pradeep Ravikumar (Carnegie Mellon University), Inderjit Dhillon (University of Texas at Austin), Eric Xing (Carnegie Mellon University). Extreme classification considers the problem of multiclass or multilabel prediction when there is a huge number of classes: a scenario that occurs in many real-world applications such as text and image tagging. In this setting, standard classification methods whose complexity is linear in the number of classes become intractable, while enforcing structural constraints among classes (such as low-rank or tree-structured) to reduce complexity often sacrifices accuracy for efficiency. The recent PD-Sparse method addresses this issue, giving an algorithm that is sublinear in the number of variables by exploiting the primal-dual sparsity inherent in the max-margin loss. However, the objective requires training the models of all classes together, which incurs large memory consumption and prohibits the simple parallelization scheme that a one-versus-all method can easily take advantage of. In this work, we propose a primal-dual sparse method that enjoys the same parallelizability and space efficiency as the one-versus-all approach, while having complexity sublinear in the number of classes. On several large-scale benchmark data sets, the proposed method achieves accuracy competitive with state-of-the-art methods while reducing training time from days to tens of minutes compared to existing parallel or sparse methods on a cluster of 100 cores. More on http://www.kdd.org/kdd2017/
Views: 194 KDD2017 video
KATE: K­-Competitive Autoencoder for Text
 
20:11
Author: Yu Chen, Computer Science Department, Rensselaer Polytechnic Institute Abstract: Autoencoders have been successful in learning meaningful representations from image datasets. However, their performance on text datasets has not been widely studied. Traditional autoencoders tend to learn possibly trivial representations of text documents due to their confounding properties such as high dimensionality, sparsity and power-law word distributions. In this paper, we propose a novel k-competitive autoencoder, called KATE, for text documents. Due to the competition between the neurons in the hidden layer, each neuron becomes specialized in recognizing specific data patterns, and overall the model can learn meaningful representations of textual data. A comprehensive set of experiments shows that KATE can learn better representations than traditional autoencoders including denoising, contractive, variational, and k-sparse autoencoders. Our model also outperforms deep generative models, probabilistic topic models, and even word representation models (e.g., Word2Vec) in terms of several downstream tasks such as document classification, regression, and retrieval. More on http://www.kdd.org/kdd2017/ KDD2017 Conference is published on http://videolectures.net/
Views: 275 KDD2017 video
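The hidden-layer competition KATE uses is related to the simpler k-sparse rule, which keeps only the k largest-magnitude activations per example. A NumPy sketch of that simpler rule (a simplification: KATE additionally reallocates the energy of the suppressed neurons to the winners, which is omitted here):

```python
import numpy as np

def k_sparse(h, k):
    """Keep the k largest-magnitude activations in each row, zero the rest."""
    out = np.zeros_like(h)
    idx = np.argsort(np.abs(h), axis=1)[:, -k:]   # winner indices per row
    rows = np.arange(h.shape[0])[:, None]
    out[rows, idx] = h[rows, idx]
    return out

h = np.array([[0.1, -2.0, 0.5, 1.2]])
print(k_sparse(h, 2))   # only -2.0 and 1.2 survive
```

Forcing most activations to zero is what makes each surviving neuron specialize, the effect the abstract credits for KATE's representations.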
Matrix Computations and Optimization in Apache Spark
 
22:52
Author: Reza Bosagh Zadeh, Institute for Computational and Mathematical Engineering, Stanford University Abstract: We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark provides abstractions and implementations for distributed matrices and optimization routines using these matrices. When translating single-node algorithms to run on a distributed cluster, we observe that often a simple idea is enough: separating matrix operations from vector operations and shipping the matrix operations to be run on the cluster, while keeping vector operations local to the driver. In the case of the Singular Value Decomposition, by taking this idea to an extreme, we are able to exploit the computational power of a cluster, while running code written decades ago for a single core. Another example is our Spark port of the popular TFOCS optimization package, originally built for MATLAB, which allows for solving linear programs as well as a variety of other convex programs. We conclude with a comprehensive set of benchmarks for hardware-accelerated matrix computations from the JVM, which is interesting in its own right, as many cluster programming frameworks use the JVM. The contributions described in this paper are already merged into Apache Spark and available on Spark installations by default, and commercially supported by a slew of companies which provide further services. More on http://www.kdd.org/kdd2016/ KDD2016 Conference is published on http://videolectures.net/
Views: 917 KDD2016 video
NPC2016: DeepTech Summit: Discovering Topics from Unstructured Text
 
17:29
Speaker: Prof Chiranjib Bhattacharyya, Professor, Dept. of Comp. Science & Automation, Indian Institute of Science - Bangalore
Dimensionality Reduction - The Math of Intelligence #5
 
10:49
Most of the datasets you'll find will have more than 3 dimensions. How are you supposed to understand and visualize n-dimensional data? Enter dimensionality reduction techniques. We'll go over the math behind the most popular such technique, Principal Component Analysis. Code for this video: https://github.com/llSourcell/Dimensionality_Reduction Ong's winning code: https://github.com/jrios6/Math-of-Intelligence/tree/master/4-Self-Organizing-Maps Hammad's runner-up code: https://github.com/hammadshaikhha/Math-of-Machine-Learning-Course-by-Siraj/tree/master/Self%20Organizing%20Maps%20for%20Data%20Visualization I used a screengrab from 3blue1brown's videos: https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw More learning resources: https://plot.ly/ipython-notebooks/principal-component-analysis/ https://www.youtube.com/watch?v=lrHboFMio7g https://www.dezyre.com/data-science-in-python-tutorial/principal-component-analysis-tutorial https://georgemdallas.wordpress.com/2013/10/30/principal-component-analysis-4-dummies-eigenvectors-eigenvalues-and-dimension-reduction/ http://setosa.io/ev/principal-component-analysis/ http://sebastianraschka.com/Articles/2015_pca_in_3_steps.html https://algobeans.com/2016/06/15/principal-component-analysis-tutorial/
Views: 78138 Siraj Raval
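The PCA technique the video covers can be written in a few lines of NumPy: center the data, take its SVD, and project onto the leading right singular vectors. A minimal sketch on synthetic data:

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via the SVD."""
    Xc = X - X.mean(axis=0)                  # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T          # scores in the reduced space

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
Z = pca(X, 2)
print(Z.shape)   # (100, 2)
```

Because the singular values come back sorted, the first projected coordinate always carries at least as much variance as the second, which is the ordering property PCA is built on.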
Sparse and large-scale learning with heterogeneous data
 
54:58
Google Tech Talks September 5, 2006 Gert Lanckriet is assistant professor in the Electrical and Computer Engineering Department at the University of California, San Diego. He conducts research on machine learning, applied statistics and convex optimization with applications in computational biology, finance, music and vision. ABSTRACT An important challenge for the field of machine learning is to deal with the increasing amount of data that is available for learning and to leverage the (also increasing) diversity of information sources, describing these data. Beyond classical vectorial data formats, data in the format of graphs, trees, strings and beyond have become widely available for data...
Views: 2833 GoogleTechTalks
MIC visualization of datasets
 
04:46
A short visualization of the usage of MIC for exploring the structure within a data set from the World Health Organization and its partner organizations. Video: David Reshef Read more at http://web.mit.edu/newsoffice/2011/large-data-sets-algorithm-1216.html Learn about the work at http://www.exploredata.net/
Neural Question Answering over Knowledge Graphs
 
57:43
Questions in real-world scenarios are mostly factoid, such as "any universities in Seattle?". In order to answer factoid questions, a system needs to extract world knowledge and reason over facts. Knowledge graphs (KGs), e.g., Freebase, NELL, YAGO, etc., provide large-scale structured knowledge for factoid question answering. What we usually do is parse the raw questions into path queries over KGs. This talk introduces three pieces of work at different abstraction levels to handle this challenge: (i) in case a path query, containing the topical entity and relation chain referred to by a question, is available precisely in a KG, how to perform effective path query answering over KGs directly -- KGs usually suffer from severe sparsity. The first part of this talk presents three sequence-to-sequence models for path query answering and vector space learning of KG elements (entities & relations); (ii) as questions in reality are raw text and mostly contain a single relation, the second part of this talk presents an effective entity linker and an attentive max-pooling based convolutional neural network to match (question, single KG fact) pairs, which enables the system to pick the best KG fact -- a one-hop path query -- to retrieve the answer; (iii) subsequently, the final part shows how to build on single-relation KGQA to handle the multi-relation KGQA problem -- projecting the multi-relation question into a multi-hop path query for answer retrieval. See more on this video at https://www.microsoft.com/en-us/research/video/neural-question-answering-knowledge-graphs/
Views: 3138 Microsoft Research
Deep Learning Approach for Extreme Multi-label Text Classification
 
28:54
Extreme classification is a rapidly growing research area focusing on multi-class and multi-label problems involving an extremely large number of labels. Many applications have been found in diverse areas ranging from language modeling to document tagging in NLP, face recognition to learning universal feature representations in computer vision, gene function prediction in bioinformatics, etc. Extreme classification has also opened up a new paradigm for ranking and recommendation by reformulating them as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. Consequently, extreme classifiers have been deployed in many real-world applications in industry. This workshop aims to bring together researchers interested in these areas to encourage discussion and improve upon the state-of-the-art in extreme classification. In particular, we aim to bring together researchers from the natural language processing, computer vision and core machine learning communities to foster interaction and collaboration. Find more talks at https://www.youtube.com/playlist?list=PLD7HFcN7LXReN-0-YQeIeZf0jMG176HTa
Views: 9537 Microsoft Research
A Framework for Mining Signatures from Event Sequences and Its Applications in Healthcare Data
 
01:00
A Framework for Mining Signatures from Event Sequences and Its Applications in Healthcare Data. This paper proposes a novel temporal knowledge representation and learning framework to perform large-scale temporal signature mining of longitudinal heterogeneous event data. The framework enables the representation, extraction, and mining of high-order latent event structure and relationships within single and multiple event sequences. The proposed knowledge representation maps the heterogeneous event sequences to a geometric image by encoding events as a structured spatial-temporal shape process. We present a doubly constrained convolutional sparse coding framework that learns interpretable and shift-invariant latent temporal event signatures. We show how to cope with the sparsity in the data as well as in the latent factor model by inducing a double sparsity constraint on the β-divergence to learn an overcomplete sparse latent factor model. A novel stochastic optimization scheme performs large-scale incremental learning of group-specific temporal event signatures. We validate the framework on synthetic data and on an electronic health record dataset.
Views: 192 jpinfotechprojects
Proactive Learning and Structural Transfer Learning: Building Blocks of Cognitive Systems
 
28:45
Dr. Jaime Carbonell is an expert in machine learning, scalable data mining (“big data”), text mining, machine translation, and computational proteomics. He invented Proactive Machine Learning, including its underlying decision-theoretic framework, and new Transfer Learning methods. He is also known for the Maximal Marginal Relevance principle in information retrieval. Dr. Carbonell has published some 350 papers and books and supervised 65 Ph.D. dissertations. He has served on multiple governmental advisory committees, including the Human Genome Committee of the National Institutes of Health, and is Director of the Language Technologies Institute. At CMU, Dr. Carbonell has designed degree programs and courses in language technologies, machine learning, data sciences, and electronic commerce. He received his Ph.D. from Yale University. For more, read the white paper, "Computing, cognition, and the future of knowing" https://ibm.biz/BdHErb
Views: 1763 IBM Research
Structural Event Detection from Log Messages
 
02:54
Structural Event Detection from Log Messages. Fei Wu (Penn State University), Pranay Anchuri (NEC Labs America), Zhenhui Li (Penn State University). A wide range of modern web applications are only possible because of the composable nature of the web services they are built upon. It is, therefore, often critical to ensure the proper functioning of these web services. As the server side of web services is often not directly accessible, several log-message-based analyses have been developed to monitor the status of web services. Existing techniques focus on using clusters of messages (log patterns) to detect important system events. We argue that meaningful system events are often representable by groups of cohesive log messages and the relationships among these groups. We propose a novel method to mine structural events as directed workflow graphs (where nodes represent log patterns, and edges represent relations among patterns). The structural events are inclusive and correspond to interpretable episodes in the system. The problem is non-trivial due to the nature of log data: (i) individual log messages contain limited information, and (ii) log messages in a large-scale web system are often interleaved even though the log messages from individual components are ordered. As a result, the patterns and relationships mined directly from the messages and their ordering can be erroneous and unreliable in practice. Our solution is based on the observation that meaningful log patterns and relations often form workflow structures that are connected. Our method directly models the overall quality of structural events. Through both qualitative and quantitative experiments on real-world datasets, we demonstrate the effectiveness and the expressiveness of our event detection method. More on http://www.kdd.org/kdd2017/
Views: 429 KDD2017 video
Sparse Optimization Algorithm, Shearlet-Based Method for Image Deblurring
 
08:41
GMove: Group-Level Mobility Modeling Using Geo-Tagged Social Media (KDD 2016)
 
18:28
GMove: Group-Level Mobility Modeling Using Geo-Tagged Social Media KDD 2016 Chao Zhang Keyang Zhang Quan Yuan Luming Zhang Tim Hanratty Jiawei Han Understanding human mobility is of great importance to various applications, such as urban planning, traffic scheduling, and location prediction. While there has been fruitful research on modeling human mobility using tracking data (e.g., GPS traces), the recent growth of geo-tagged social media (GeoSM) brings new opportunities to this task because of its sheer size and multi-dimensional nature. Nevertheless, how to obtain quality mobility models from the highly sparse and complex GeoSM data remains a challenge that cannot be readily addressed by existing techniques. We propose GMove, a group-level mobility modeling method using GeoSM data. Our insight is that the GeoSM data usually contains multiple user groups, where the users within the same group share significant movement regularity. Meanwhile, user grouping and mobility modeling are two intertwined tasks: (1) better user grouping offers better within-group data consistency and thus leads to more reliable mobility models; and (2) better mobility models serve as useful guidance that helps infer the group a user belongs to. GMove thus alternates between user grouping and mobility modeling, and generates an ensemble of Hidden Markov Models (HMMs) to characterize group-level movement regularity. Furthermore, to reduce text sparsity of GeoSM data, GMove also features a text augmenter. The augmenter computes keyword correlations by examining their spatiotemporal distributions. With such correlations as auxiliary knowledge, it performs sampling-based augmentation to alleviate text sparsity and produce high-quality HMMs. Our extensive experiments on two real-life data sets demonstrate that GMove can effectively generate meaningful group-level mobility models. 
Moreover, with context-aware location prediction as an example application, we find that GMove significantly outperforms baseline mobility models in terms of prediction accuracy.
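GMove's text augmenter scores keyword pairs by comparing their spatiotemporal distributions. A minimal sketch of that idea, using cosine similarity over occurrence counts (the binning scheme and counts are illustrative assumptions, not GMove's actual implementation):

```python
import numpy as np

def keyword_correlation(counts_a, counts_b):
    """Cosine similarity between two keywords' occurrence counts over
    the same spatiotemporal bins; a high value suggests the keywords
    can augment each other's sparse tweets."""
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Illustrative counts over four (region, hour) bins for two keywords.
corr = keyword_correlation([5, 0, 2, 0], [4, 0, 3, 0])
```

Pairs with high correlation could then drive the sampling-based augmentation the paper describes.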
VC-Dimension and Rademacher Averages - Part 1
 
01:31:07
Author: Matteo Riondato, Eli Upfal Abstract: Rademacher Averages and the Vapnik-Chervonenkis dimension are fundamental concepts from statistical learning theory. They make it possible to study simultaneous deviation bounds of empirical averages from their expectations for classes of functions, by considering properties of the functions, of their domain (the dataset), and of the sampling process. In this tutorial, we survey the use of Rademacher Averages and the VC-dimension in sampling-based algorithms for graph analysis and pattern mining. We start from their theoretical foundations at the core of machine learning, then show a generic recipe for formulating data mining problems in a way that allows these concepts to be used in efficient randomized algorithms for those problems. Finally, we show examples of the application of the recipe to graph problems (connectivity, shortest paths, betweenness centrality) and pattern mining. Our goal is to expose the usefulness of these techniques for the data mining researcher, and to encourage research in the area. ACM DL: http://dl.acm.org/citation.cfm?id=2789984 DOI: http://dx.doi.org/10.1145/2783258.2789984
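The empirical Rademacher average surveyed in this tutorial can be estimated by Monte Carlo: draw random signs, take the supremum of the signed empirical average over the class, and average over draws. A toy sketch for a finite class of threshold functions (the class and data are illustrative, not from the tutorial):

```python
import numpy as np

def empirical_rademacher(F, n_trials=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher average of a
    finite function class, given as a matrix F whose rows are the
    vectors (f(x_1), ..., f(x_n)) for each function f in the class."""
    rng = np.random.default_rng(seed)
    n = F.shape[1]
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)  # random signs
        total += np.max(F @ sigma) / n           # sup over the class
    return total / n_trials

# Tiny example: threshold functions f_t(x) = 1[x <= t] on 5 points.
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
thresholds = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
F = (x[None, :] <= thresholds[:, None]).astype(float)
r = empirical_rademacher(F)
```

A small value of `r` certifies that empirical averages over the whole class are simultaneously close to their expectations.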
Credit Scoring & R: Reject inference, nested conditional models, & joint scores
 
44:37
Credit scoring tends to involve the balancing of mutually contradictory objectives spiced with a liberal dash of methodological conservatism. This talk emphasises the craft of credit scoring, focusing on combining technical components with some less common analytical techniques. The talk describes an analytical project which R helped to make relatively straightforward. Ross Gayler describes himself as a recovered psychologist who studied rats and stats (minus the rats) a very long time ago. Since then he has mostly worked in credit scoring (predictive modelling of risk-related customer behaviour in retail finance) and has forgotten most of the statistics he ever knew. Credit scoring involves counterfactual reasoning. Lenders want to set policies based on historical experience, but what they really want to know is what would have happened if their historical policies had been different. The statistical consequence of this is that we are required to build statistical models of structure that is not explicitly present in the available data, and that the available data is systematically censored. The simplest example of this is that the applicants who are estimated to have the highest risk are declined credit and consequently, we do not have explicit knowledge of how they would have performed if they had been accepted. Overcoming this problem is known as 'reject inference' in credit scoring. Reject inference is typically discussed as a single-level phenomenon, but in reality there can be multiple levels of censoring. For example, an applicant who has been accepted by the lender may withdraw their application, with the consequence that we don't know whether they would have successfully repaid the loan had they taken up the offer. Independently of reject inference, it is standard to summarise all the available predictive information as a single score that predicts a behaviour of interest.
In reality, there may be multiple behaviours that need to be simultaneously considered in decision making. These may be predicted by multiple scores and in general there will be interactions between the scores -- so they need to be considered jointly in decision making. The standard technique for implementing this is to divide each score into a small number of discrete levels and consider the cross-tabulation of both scores. This is simple but limited because it does not make optimal use of the data, raises problems of data sparsity, and makes it difficult to achieve a fine level of control. This talk covers a project that dealt with multiple, nested reject inference problems in the context of two scores to be considered jointly. It involved multivariate smoothing spline regression and some general R carpentry to plug all the pieces together.
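The standard cross-tabulation the talk describes — discretize each score into a few bands and count applicants per cell — can be sketched as follows (the cut points, band labels, and data are illustrative assumptions):

```python
import bisect
from collections import Counter

def band(score, cuts, labels):
    """Map a continuous score into a discrete band via cut points.
    len(labels) must be len(cuts) + 1."""
    return labels[bisect.bisect_right(cuts, score)]

def cross_tab(records, cuts_a, cuts_b, labels):
    """Count applicants falling into each (band_a, band_b) cell."""
    table = Counter()
    for a, b in records:
        table[(band(a, cuts_a, labels), band(b, cuts_b, labels))] += 1
    return table

# Illustrative (risk score, response score) pairs.
apps = [(620, 0.2), (700, 0.8), (640, 0.5), (760, 0.9)]
labels = ["low", "mid", "high"]
tab = cross_tab(apps, [650, 720], [0.4, 0.7], labels)
```

As the talk notes, this is simple but coarse: the multivariate smoothing splines used in the project give finer control over the joint score surface.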
Views: 5561 Jeromy Anglim
Word Representation Learning without unk Assumptions
 
47:18
Chris Dyer, Carnegie Mellon University Representation Learning https://simons.berkeley.edu/talks/chris-dyer-2017-3-31
Views: 925 Simons Institute
Learning Representations of Large-scale Networks part 2
 
01:07:52
Authors: Qiaozhu Mei, Department of Electrical Engineering and Computer Science, University of Michigan Jian Tang, Montreal Institute for Learning Algorithms (MILA), University of Montreal Abstract: Large-scale networks such as social networks, citation networks, the World Wide Web, and traffic networks are ubiquitous in the real world. Networks can also be constructed from text, time series, behavior logs, and many other types of data. Mining network data attracts increasing attention in academia and industry, covers a variety of applications, and influences the methodology of mining many types of data. A prerequisite to network mining is to find an effective representation of networks, which largely determines the performance of downstream data mining tasks. Traditionally, networks are usually represented as adjacency matrices, which suffer from data sparsity and high dimensionality. Recently, there has been a fast-growing interest in learning continuous and low-dimensional representations of networks. This is a challenging problem for multiple reasons: (1) network data (nodes and edges) are sparse, discrete, and globally interactive; (2) real-world networks are very large, usually containing millions of nodes and billions of edges; and (3) real-world networks are heterogeneous. Edges can be directed, undirected or weighted, and both nodes and edges may carry different semantics. In this tutorial, we will introduce the recent progress on learning continuous and low-dimensional representations of large-scale networks. This includes methods that learn the embeddings of nodes, methods that learn representations of larger graph structures (e.g., an entire network), and methods that lay out very large networks in extremely low-dimensional (2D or 3D) spaces. We will introduce methods for learning different types of node representations: representations that can be used as features for node classification, community detection, link prediction, and network visualization.
We will introduce end-to-end methods that learn the representation of the entire graph structure through directly optimizing tasks such as information cascade prediction, chemical compound classification, and protein structure classification, using deep neural networks. We will highlight open source implementations of these techniques. Link to tutorial: https://sites.google.com/site/pkujiantang/home/kdd17-tutorial More on http://www.kdd.org/kdd2017/ KDD2017 Conference is published on http://videolectures.net/
Views: 64 KDD2017 video
MIA: Barbara Engelhardt, Structured factor models to identify interpretable signal in genomic data
 
01:04:42
March 3, 2017 Models, Inference and Algorithms: Broad Institute of MIT and Harvard Structured factor models to identify interpretable signal in genomic data Barbara Engelhardt Associate Professor Princeton University Copyright Broad Institute, 2018. All rights reserved.
Views: 965 Broad Institute
Lecture 3 | GloVe: Global Vectors for Word Representation
 
01:18:40
Lecture 3 introduces the GloVe model for training word vectors. Then it extends our discussion of word vectors (interchangeably called word embeddings) by seeing how they can be evaluated intrinsically and extrinsically. As we proceed, we discuss the example of word analogies as an intrinsic evaluation technique and how it can be used to tune word embedding techniques. We then discuss training model weights/parameters and word vectors for extrinsic tasks. Lastly we motivate artificial neural networks as a class of models for natural language processing tasks. Key phrases: Global Vectors for Word Representation (GloVe). Intrinsic and extrinsic evaluations. Effect of hyperparameters on analogy evaluation tasks. Correlation of human judgment with word vector distances. Dealing with ambiguity in word using contexts. Window classification. ------------------------------------------------------------------------------- Natural Language Processing with Deep Learning Instructors: - Chris Manning - Richard Socher Natural language processing (NLP) deals with the key artificial intelligence technology of understanding complex human language communication. This lecture series provides a thorough introduction to the cutting-edge research in deep learning applied to NLP, an approach that has recently obtained very high performance across many different NLP tasks including question answering and machine translation. It emphasizes how to implement, train, debug, visualize, and design neural network models, covering the main technologies of word vectors, feed-forward models, recurrent neural networks, recursive neural networks, convolutional neural networks, and recent models involving a memory component. For additional learning opportunities please visit: http://online.stanford.edu/
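At the core of GloVe is a weighted least-squares objective over co-occurrence counts: f(X_ij) (w_i . w_j + b_i + b_j - log X_ij)^2, with weighting f(x) = (x / x_max)^alpha capped at 1. A sketch of one term of that loss (the vectors here are random placeholders, not trained embeddings):

```python
import numpy as np

def glove_loss(w_i, w_j, b_i, b_j, x_ij, x_max=100.0, alpha=0.75):
    """GloVe's weighted least-squares term for one co-occurrence count:
    f(X_ij) * (w_i . w_j + b_i + b_j - log X_ij)^2."""
    f = (x_ij / x_max) ** alpha if x_ij < x_max else 1.0
    return f * (w_i @ w_j + b_i + b_j - np.log(x_ij)) ** 2

rng = np.random.default_rng(0)
wi, wj = rng.normal(size=5), rng.normal(size=5)
loss = glove_loss(wi, wj, 0.0, 0.0, x_ij=20.0)
```

Training sums this term over all nonzero co-occurrence cells, so frequent pairs dominate while rare, noisy counts are down-weighted.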
[PURDUE MLSS] Introduction to Machine Learning by Dale Schuurmans Part 3/6
 
01:00:11
Lecture slides: http://learning.stat.purdue.edu/mlss/_media/mlss/schuurmans.pdf Abstract of the talk: This course will provide a simple unified introduction to batch training algorithms for supervised, unsupervised and partially-supervised learning. The concepts introduced will provide a basis for the more advanced topics in other lectures. The first part of the course will cover supervised training algorithms, establishing a general foundation through a series of extensions to linear prediction, including: nonlinear input transformations (features), L2 regularization (kernels), prediction uncertainty (Gaussian processes), L1 regularization (sparsity), nonlinear output transformations (matching losses), surrogate losses (classification), multivariate prediction, and structured prediction. Relevant optimization concepts will be acquired along the way. The second part of the course will then demonstrate how unsupervised and semi-supervised formulations follow from a relationship between forward and reverse prediction problems. This connection allows dimensionality reduction and sparse coding to be unified with regression, and clustering and vector quantization to be unified with classification—even in the context of other extensions. Current convex relaxations of such training problems will be discussed. The last part of the course covers partially-supervised learning—the problem of learning an input representation concurrently with a predictor. A brief overview of current research will be presented, including recent work on boosting and convex relaxations. See other lectures at Purdue MLSS Playlist: http://www.youtube.com/playlist?list=PL2A65507F7D725EFB&feature=view_all
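Among the extensions listed, L1 regularization (sparsity) is typically handled with the soft-thresholding proximal operator, which is what drives coefficients exactly to zero. A minimal sketch (the example weights are illustrative):

```python
import numpy as np

def soft_threshold(w, lam):
    """Proximal operator of lam * ||w||_1: shrinks each coefficient
    toward zero and sets ones smaller than lam exactly to zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([3.0, -0.5, 0.05, -2.0])
w_sparse = soft_threshold(w, lam=0.1)  # the 0.05 entry becomes 0
```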
Views: 898 Purdue University
5.2 Linear Soft-Margin SVM | 5 Support Vector Machines | Pattern Recognition Class 2012
 
01:14:03
The Pattern Recognition Class 2012 by Prof. Fred Hamprecht. It took place at the HCI / University of Heidelberg during the summer term of 2012. Website: http://hci.iwr.uni-heidelberg.de/MIP/Teaching/pr/ Playlist with all videos: http://goo.gl/gmOI6 Contents of this recording: soft margin, hard margin linear soft-margin SVM slack variables version space sparsity quadratic program (QP) kernel trick KKT condition Karush-Kuhn-Tucker conditions Wolfe dual complementary slackness support vectors Gram matrix similarity measure Syllabus: 1. Introduction 1.1 Applications of Pattern Recognition 1.2 k-Nearest Neighbors Classification 1.3 Probability Theory 1.4 Statistical Decision Theory 2. Correlation Measures, Gaussian Models 2.1 Pearson Correlation 2.2 Alternative Correlation Measures 2.3 Gaussian Graphical Models 2.4 Discriminant Analysis 3. Dimensionality Reduction 3.1 Regularized LDA/QDA 3.2 Principal Component Analysis (PCA) 3.3 Bilinear Decompositions 4. Neural Networks 4.1 History of Neural Networks 4.2 Perceptrons 4.3 Multilayer Perceptrons 4.4 The Projection Trick 4.5 Radial Basis Function Networks 5. Support Vector Machines 5.1 Loss Functions 5.2 Linear Soft-Margin SVM 5.3 Nonlinear SVM 6. Kernels, Random Forest 6.1 Kernels 6.2 One-Class SVM 6.3 Random Forest 6.4 Random Forest Feature Importance 7. Regression 7.1 Least-Squares Regression 7.2 Optimum Experimental Design 7.3 Case Study: Functional MRI 7.4 Case Study: Computer Tomography 7.5 Regularized Regression 8. Gaussian Processes 8.1 Gaussian Process Regression 8.2 GP Regression: Interpretation 8.3 Gaussian Stochastic Processes 8.4 Covariance Function 9. Unsupervised Learning 9.1 Kernel Density Estimation 9.2 Cluster Analysis 9.3 Expectation Maximization 9.4 Gaussian Mixture Models 10. Directed Graphical Models 10.1 Bayesian Networks 10.2 Variable Elimination 10.3 Message Passing 10.4 State Space Models 11. 
Optimization 11.1 The Lagrangian Method 11.2 Constraint Qualifications 11.3 Linear Programming 11.4 The Simplex Algorithm 12. Structured Learning 12.1 structSVM 12.2 Cutting Planes
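The slack variables of the linear soft-margin SVM covered in 5.2 equal the hinge losses at the optimum; samples with nonzero slack (or sitting on the margin) are the support vectors. A tiny sketch with fixed weights rather than a trained SVM (data and weights are illustrative):

```python
import numpy as np

def hinge_losses(w, b, X, y):
    """Per-sample hinge loss max(0, 1 - y * (w.x + b)); the nonzero
    entries correspond to the slack variables xi_i of the soft margin."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))

# Toy 1-D data with separator w = 1, b = 0 and unit margin.
X = np.array([[2.0], [-2.0], [0.5]])
y = np.array([1.0, -1.0, 1.0])
xi = hinge_losses(np.array([1.0]), 0.0, X, y)  # third point is inside the margin
```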
Views: 6108 UniHeidelberg
ClusType
 
19:26
Authors: Xiang Ren, Ahmed El-Kishky, Chi Wang, Fangbo Tao, Clare R. Voss, Jiawei Han Abstract: Entity recognition is an important but challenging research problem. In reality, many text collections are from specific, dynamic, or emerging domains, which poses significant new challenges for entity recognition with increase in name ambiguity and context sparsity, requiring entity detection without domain restriction. In this paper, we investigate entity recognition (ER) with distant-supervision and propose a novel relation phrase-based ER framework, called ClusType, that runs data-driven phrase mining to generate entity mention candidates and relation phrases, and enforces the principle that relation phrases should be softly clustered when propagating type information between their argument entities. Then we predict the type of each entity mention based on the type signatures of its co-occurring relation phrases and the type indicators of its surface name, as computed over the corpus. Specifically, we formulate a joint optimization problem for two tasks, type propagation with relation phrases and multi-view relation phrase clustering. Our experiments on multiple genres---news, Yelp reviews and tweets---demonstrate the effectiveness and robustness of ClusType, with an average of 37% improvement in F1 score over the best compared method. ACM DL: http://dl.acm.org/citation.cfm?id=2783362 DOI: http://dx.doi.org/10.1145/2783258.2783362
Scalable Multi-label Annotation
 
00:31
Full Title: Scalable Multi-label Annotation Authors: Jia Deng, Olga Russakovsky, Jonathan Krause, Michael S Bernstein, Alex Berg, Li Fei-Fei Abstract: We study strategies for scalable multi-label annotation, or for efficiently acquiring multiple labels from humans for a collection of items. We propose an algorithm that exploits correlation, hierarchy, and sparsity of the label distribution. A case study of labeling 200 objects using 20,000 images demonstrates the effectiveness of our approach. The algorithm results in up to 6x reduction in human computation time compared to the naive method of querying a human annotator for the presence of every object in every image. DOI:http://doi.acm.org/10.1145/2556288.2557011
[PURDUE MLSS] Introduction to Machine Learning by Dale Schuurmans Part 6/6
 
01:02:37
Lecture slides: http://learning.stat.purdue.edu/mlss/_media/mlss/schuurmans.pdf Abstract of the talk: This course will provide a simple unified introduction to batch training algorithms for supervised, unsupervised and partially-supervised learning. The concepts introduced will provide a basis for the more advanced topics in other lectures. The first part of the course will cover supervised training algorithms, establishing a general foundation through a series of extensions to linear prediction, including: nonlinear input transformations (features), L2 regularization (kernels), prediction uncertainty (Gaussian processes), L1 regularization (sparsity), nonlinear output transformations (matching losses), surrogate losses (classification), multivariate prediction, and structured prediction. Relevant optimization concepts will be acquired along the way. The second part of the course will then demonstrate how unsupervised and semi-supervised formulations follow from a relationship between forward and reverse prediction problems. This connection allows dimensionality reduction and sparse coding to be unified with regression, and clustering and vector quantization to be unified with classification—even in the context of other extensions. Current convex relaxations of such training problems will be discussed. The last part of the course covers partially-supervised learning—the problem of learning an input representation concurrently with a predictor. A brief overview of current research will be presented, including recent work on boosting and convex relaxations. See other lectures at Purdue MLSS Playlist: http://www.youtube.com/playlist?list=PL2A65507F7D725EFB&feature=view_all
Views: 546 Purdue University
Topic Mining over Asynchronous Text Sequences
 
04:51
Min-Wise Hashing for Large-Scale Regression and Classification
 
37:17
Nicolai Meinshausen, University of Oxford and ETH Zürich Succinct Data Representations and Applications http://simons.berkeley.edu/talks/nicolai-meinshausen-2013-09-18
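Min-wise hashing, the subject of this talk, summarizes a set by the minimum value of each of several hash functions; the probability that two sets agree in a signature slot equals their Jaccard similarity. A small sketch (seeding each hash by a string prefix is an illustrative choice, not the talk's construction):

```python
import hashlib

def minhash_signature(items, n_hashes=64):
    """Min-wise hashing: for each of n_hashes seeded hash functions,
    keep the minimum hash value over the set's items."""
    return [
        min(int(hashlib.sha1(f"{seed}:{x}".encode()).hexdigest(), 16)
            for x in items)
        for seed in range(n_hashes)
    ]

def estimate_jaccard(a, b):
    """Fraction of matching signature slots estimates Jaccard(A, B)."""
    sa, sb = minhash_signature(a), minhash_signature(b)
    return sum(x == y for x, y in zip(sa, sb)) / len(sa)

j = estimate_jaccard({"a", "b", "c", "d"}, {"a", "b", "c", "e"})
```

Short signatures of this kind are what let regression and classification scale to very large sparse feature sets.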
Views: 692 Simons Institute
Efficient Correlated Topic Modeling with Topic Embedding
 
03:07
Efficient Correlated Topic Modeling with Topic Embedding Junxian He (Carnegie Mellon University) Zhiting Hu (Carnegie Mellon University) Taylor Berg-Kirkpatrick (Carnegie Mellon University) Ying Huang (Shanghai Jiaotong University) Eric Xing (Carnegie Mellon University) Correlated topic models have been limited to small model and problem sizes due to their high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing previous cubic or quadratic time complexity to linear w.r.t. the topic size. We further speed up variational inference with a fast sampler to exploit the sparsity of topic occurrence. Extensive experiments show that our approach is capable of handling model and data scales which are several orders of magnitude larger than existing correlation results, without sacrificing modeling quality, providing competitive or superior performance in document classification and retrieval. More on http://www.kdd.org/kdd2017/
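The paper's key idea — representing topic correlation as closeness between topic embedding vectors — can be sketched with plain cosine similarity (the 2-D topic vectors below are illustrative, not learned embeddings):

```python
import numpy as np

def topic_correlations(T):
    """Pairwise cosine similarity between rows of T (topic embeddings);
    closeness in the embedding space stands in for topic correlation."""
    U = T / np.linalg.norm(T, axis=1, keepdims=True)
    return U @ U.T

T = np.array([[1.0, 0.0],    # topic 0
              [0.9, 0.1],    # topic 1: close to topic 0
              [0.0, 1.0]])   # topic 2: orthogonal to topic 0
C = topic_correlations(T)
```

Because correlations come from k vectors of small dimension rather than a k x k covariance matrix, inference cost scales linearly in the topic count.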
Views: 567 KDD2017 video
Collaborative Knowledge Base Embedding for Recommender Systems
 
20:55
Author: Fuzheng Zhang, Microsoft Research Asia, Microsoft Research Abstract: Among different recommendation techniques, collaborative filtering usually suffers from limited performance due to the sparsity of user-item interactions. To address this issue, auxiliary information is usually used to boost the performance. Due to the rapid collection of information on the web, the knowledge base provides heterogeneous information including both structured and unstructured data with different semantics, which can be consumed by various applications. In this paper, we investigate how to leverage the heterogeneous information in a knowledge base to improve the quality of recommender systems. First, by exploiting the knowledge base, we design three components to extract items’ semantic representations from structural content, textual content and visual content, respectively. To be specific, we adopt a heterogeneous network embedding method, termed TransR, to extract items’ structural representations by considering the heterogeneity of both nodes and relationships. We apply stacked denoising auto-encoders and stacked convolutional auto-encoders, which are two types of deep-learning-based embedding techniques, to extract items’ textual representations and visual representations, respectively. Finally, we propose our final integrated framework, termed Collaborative Knowledge Base Embedding (CKE), to jointly learn the latent representations in collaborative filtering as well as items’ semantic representations from the knowledge base. To evaluate the performance of each embedding component as well as the whole system, we conduct extensive experiments with two real-world datasets from different scenarios. The results reveal that our approaches outperform several widely adopted state-of-the-art recommendation methods. More on http://www.kdd.org/kdd2016/ KDD2016 Conference is published on http://videolectures.net/
Views: 475 KDD2016 video
Distributed Coordinate Descent for Regularized Logistic Regression - Ilya Trofimov
 
22:25
Yandex School of Data Analysis Conference Machine Learning: Prospects and Applications https://yandexdataschool.com/conference Logistic regression with regularization is the method of choice for solving classification and class probability estimation problems in text classification, clickstream data analysis and web data mining. Despite the fact that logistic regression can build only linear separating surfaces, its testing accuracy, with proper regularization, is often good for high-dimensional input spaces. For several problems the testing accuracy has been shown to be close to that of nonlinear classifiers such as kernel methods. At the same time, training and testing of linear classifiers is much faster, which makes logistic regression a good choice for large-scale problems. Choosing the right regularizer is problem dependent. L2-regularization is known to shrink coefficients towards zero, leaving correlated ones in a model. L1-regularization leads to a sparse solution and typically selects only one coefficient from a group of correlated ones. The elastic net regularizer is a linear combination of L1 and L2, allowing a trade-off between them. Other regularizers are less often used: group lasso and the non-convex SCAD regularizer. Nowadays we see a growing number of problems where both the number of examples and the number of features are very large. Many problems grow beyond the capabilities of a single computer and need to be handled by distributed systems. Distributed machine learning is now an area of active research. We propose a new architecture for fitting logistic regression with regularizers in the distributed setting. Inside this architecture we implement a new parallel coordinate descent algorithm for L1- and L2-regularized logistic regression and guarantee its convergence. We show how our algorithm can be modified to solve the slow-node problem, which is common in distributed machine learning.
We empirically show the effectiveness of our algorithm and its implementation in comparison with several state-of-the-art methods.
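A single-machine sketch of coordinate descent for L2-regularized logistic regression — one Newton step per coordinate per sweep — conveys the core update the talk parallelizes (this is the serial baseline under illustrative data, not the authors' distributed algorithm):

```python
import numpy as np

def cd_logistic_l2(X, y, lam=1.0, n_sweeps=50):
    """Cyclic coordinate descent for logistic regression with penalty
    (lam/2) * ||w||^2; each sweep takes one Newton step per coordinate."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_sweeps):
        for j in range(d):
            p = 1.0 / (1.0 + np.exp(-X @ w))        # P(y = 1 | x)
            g = X[:, j] @ (p - y) + lam * w[j]      # partial derivative
            h = X[:, j] ** 2 @ (p * (1 - p)) + lam  # curvature (> 0)
            w[j] -= g / h
    return w

# Toy data: the first feature separates the classes.
X = np.array([[1.0, 0.0], [1.0, 1.0], [-1.0, 0.5], [-1.0, -1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w = cd_logistic_l2(X, y)
```

In a distributed setting, the expensive part is keeping the predictions `p` consistent while many coordinates update in parallel, which is the problem the proposed architecture addresses.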
Using Hashtag Graph-based Topic Model to Connect Semantically-related Words
 
17:38
Using Hashtag Graph-based Topic Model to Connect Semantically-related Words without Co-occurrence in Microblogs. In this paper, we introduce a new topic model to understand the chaotic microblogging environment by using hashtag graphs. Inferring topics on Twitter becomes a vital but challenging task in many important applications. The shortness and informality of tweets lead to extremely sparse vector representations with a large vocabulary. This makes the conventional topic models (e.g., Latent Dirichlet Allocation [1] and Latent Semantic Analysis [2]) fail to learn high-quality topic structures. Tweets, however, come with rich user-generated hashtags. The hashtags make tweets semi-structured inside and semantically related to each other. Since hashtags are utilized as keywords in tweets to mark messages or to form conversations, they provide an additional path to connect semantically related words. In this paper, treating tweets as semi-structured texts, we propose a novel topic model, denoted as Hashtag Graph-based Topic Model (HGTM), to discover topics of tweets. By utilizing hashtag relation information in hashtag graphs, HGTM is able to discover word semantic relations even if the words do not co-occur within a specific tweet. With this method, HGTM successfully alleviates the sparsity problem. Our investigation illustrates that user-contributed hashtags can serve as weakly-supervised information for topic modeling, and that relations between hashtags can reveal latent semantic relations between words.
We evaluate the effectiveness of HGTM on tweet (hashtag) clustering and hashtag classification problems. Experiments on two real-world tweet data sets show that HGTM has strong capability to handle sparseness and noise problem in tweets. Furthermore, HGTM can discover more distinct and coherent topics than the state-of-the-art baselines.
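The hashtag-graph idea — linking words that never co-occur in any single tweet but attach to the same hashtag — can be sketched as a simple bipartite expansion (toy tweets, not the HGTM inference itself):

```python
from collections import defaultdict

def words_linked_by_hashtags(tweets):
    """Connect words through shared hashtags: two words are linked if
    some hashtag appears alongside both of them, even when the words
    themselves never co-occur in one tweet."""
    by_tag = defaultdict(set)
    for words, tags in tweets:
        for t in tags:
            by_tag[t].update(words)
    linked = defaultdict(set)
    for ws in by_tag.values():
        for w in ws:
            linked[w] |= ws - {w}
    return linked

# "dunk" never co-occurs with "game", yet both carry #nba.
tweets = [({"game", "score"}, {"#nba"}), ({"dunk"}, {"#nba"})]
links = words_linked_by_hashtags(tweets)
```

HGTM uses such hashtag-mediated relations as weak supervision during topic inference rather than as hard links.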
Views: 139 jpinfotechprojects
KDD 2016 paper 392
 
02:48
Title: Large-scale Item Categorization in e-Commerce Using Multiple Recurrent Neural Networks Authors: Jung-woo Ha, Hyuna Pyo, Jeonghee Kim (NAVER LABS) Abstract: Precise item categorization is a key issue in e-commerce domains. However, it still remains a challenging problem due to data size, category skewness, and noisy metadata. Here, we report a successful deployment of a deep-learning-based item categorization method, the deep categorization network (DeepCN), in an e-commerce website. DeepCN is an end-to-end model using multiple recurrent neural networks (RNNs) dedicated to metadata attributes for generating features from text metadata, and fully connected layers for classifying item categories from the generated features. The categorization errors are propagated back through the fully connected layers to the RNNs for weight updates in the learning process. This deep-learning-based approach allows diverse attributes to be integrated into a common representation, thus overcoming sparsity and scalability problems. We evaluate DeepCN on large-scale real-world data including more than 94 million items with approximately 4,100 leaf categories from a Korean e-commerce website. Experiment results show our method improves the categorization accuracy compared to a model using a single RNN as well as a standard classification model using a unigram-based bag-of-words. Furthermore, we investigate how much the model parameters and the used attributes influence categorization performance. http://www.kdd.org/kdd2016/subtopic/view/large-scale-item-categorization-in-e-commerce-using-multiple-recurrent-neur
Views: 422 NAVER LABS
Joint Hypergraph Learning for Tag based Image Retrieval
 
15:54
2018 IEEE Transactions on Image Processing.
Views: 115 manju nath
Learning Representations of Large-scale Networks part 1
 
01:43:45
Authors: Qiaozhu Mei, Department of Electrical Engineering and Computer Science, University of Michigan Jian Tang, Montreal Institute for Learning Algorithms (MILA), University of Montreal Abstract: Large-scale networks such as social networks, citation networks, the World Wide Web, and traffic networks are ubiquitous in the real world. Networks can also be constructed from text, time series, behavior logs, and many other types of data. Mining network data attracts increasing attention in academia and industry, covers a variety of applications, and influences the methodology of mining many types of data. A prerequisite to network mining is to find an effective representation of networks, which largely determines the performance of downstream data mining tasks. Traditionally, networks are usually represented as adjacency matrices, which suffer from data sparsity and high dimensionality. Recently, there has been a fast-growing interest in learning continuous and low-dimensional representations of networks. This is a challenging problem for multiple reasons: (1) network data (nodes and edges) are sparse, discrete, and globally interactive; (2) real-world networks are very large, usually containing millions of nodes and billions of edges; and (3) real-world networks are heterogeneous. Edges can be directed, undirected or weighted, and both nodes and edges may carry different semantics. In this tutorial, we will introduce the recent progress on learning continuous and low-dimensional representations of large-scale networks. This includes methods that learn the embeddings of nodes, methods that learn representations of larger graph structures (e.g., an entire network), and methods that lay out very large networks in extremely low-dimensional (2D or 3D) spaces. We will introduce methods for learning different types of node representations: representations that can be used as features for node classification, community detection, link prediction, and network visualization.
We will introduce end-to-end methods that learn the representation of the entire graph structure through directly optimizing tasks such as information cascade prediction, chemical compound classification, and protein structure classification, using deep neural networks. We will highlight open source implementations of these techniques. Link to tutorial: https://sites.google.com/site/pkujiantang/home/kdd17-tutorial More on http://www.kdd.org/kdd2017/ KDD2017 Conference is published on http://videolectures.net/
Views: 383 KDD2017 video
BWCA Lecture 6 (Stable Clustering I)
 
01:19:05
Stable clustering, part 1. The k-median problem and the BBG algorithm.