A lexical approach for taxonomy mapping
Obtaining a useful complete overview of Web-based product information has become difficult nowadays due to the ever-growing amount of information available on online shops. Findings from previous studies suggest that better search capabilities, such as the exploitation of annotated data, are needed to keep online shopping transparent for the user. Annotations can, for example, help present information from multiple sources in a uniform manner. In order to support the product data integration process, we propose an algorithm that can autonomously map heterogeneous product taxonomies from different online shops. The proposed approach uses word sense disambiguation techniques, approximate lexical matching, and a mechanism that deals with composite categories. Our algorithm’s performance compared favorably against two other state-of-the-art taxonomy mapping algorithms on three real-life datasets. The results show that the F1-measure for our algorithm is on average 60% higher than a state-of-the-art product taxonomy mapping algorithm
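As a rough illustration of the approximate lexical matching and composite-category handling mentioned in this abstract, the following minimal sketch matches a source category name against candidate target categories. It is not the authors' algorithm. The normalized Levenshtein similarity, the splitting of composite names on ",", "&" and "and", the 0.8 threshold, and all function names are assumptions made for this example, and word sense disambiguation is omitted entirely.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical (case-insensitive)."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def split_composite(name: str) -> list:
    """Hypothetical rule: split 'Movies, Music & Games' into its parts."""
    parts = re.split(r"\s*(?:,|&|\band\b)\s*", name.lower())
    return [p for p in parts if p]


def best_match(source: str, targets: list, threshold: float = 0.8):
    """Return the target whose best-matching part is most similar, or None."""
    best, best_score = None, 0.0
    for target in targets:
        for s_part in split_composite(source):
            for t_part in split_composite(target):
                score = similarity(s_part, t_part)
                if score > best_score:
                    best, best_score = target, score
    return best if best_score >= threshold else None
```

For instance, `best_match("Laptops & Notebooks", ["Cameras", "Notebook", "Televisions"])` picks `"Notebook"` because the part "notebooks" differs from it by a single character, while a source category with no sufficiently similar candidate yields `None`.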
Ontology population from web product information
With the vast amount of information available on the Web, there is an increasing need to structure Web data in order to make it accessible to both users and machines. E-commerce is one of the areas in which growing data congestion on the Web has serious consequences. This paper proposes a framework that is capable of populating a product ontology using tabular product information from Web shops. By formalizing product information in this way, better product comparison or recommendation applications could be built. Our approach employs both lexical and syntactic matching for mapping properties and instantiating values. The performed evaluation shows that instantiating consumer electronics from Best Buy and Newegg.com results in an F1 score of approximately 77%
Intelligent Information Systems for Web Product Search
Over the last few years, we have experienced an increase in online shopping. Consequently, there is a need for efficient and effective product search engines. The rapid growth of e-commerce, however, has also introduced some challenges. Studies show that users can get overwhelmed by the information and offerings presented online while searching for products. In an attempt to lighten this information overload burden on consumers, there are several product search engines that aggregate product descriptions and price information from the Web and allow the user to easily query this information. Most of these search engines expect to receive the data from the participating Web shops in a specific format, which means Web shops need to transform their data more than once, as each product search engine requires a different format. Because currently most product information aggregation services rely on Web shops to send them their data, there is a big opportunity for solutions that aim to tackle this problem using a more automated approach. This dissertation addresses key aspects of implementing such a system, including hierarchical product classification, entity resolution, ontology population and schema mapping, and lastly, the optimization of faceted user interfaces. The findings of this work show us how one can design Web product search engines that automatically aggregate product information while allowing users to perform effective and efficient queries
A framework for product description classification in e-commerce
We propose the Hierarchical Product Classification (HPC) framework for the purpose of classifying products using a hierarchical product taxonomy. The framework uses a classification system with multiple classification nodes, each residing on a different level of the taxonomy. The innovative part of the framework stems from the definition of classification recipes that can be used to construct high-quality classifier nodes, making the best possible use of the product descriptions. These classifier recipes are specifically tailored for the e-commerce domain. The use of these classifier recipes enables flexible classifiers that adjust to the depth-specific characteristics of product taxonomies. Furthermore, in order to gain insight into which components are required to perform high-quality product classification, we evaluate several feature selection methods and classification techniques in the context of our framework. Based on 3000 product descriptions obtained from Amazon.com, HPC achieves an overall accuracy of 76.80% for product classification. Using 110 categories from CircuitCity.com and Amazon.com, we obtain a precision of 93.61% for mapping the categories to the taxonomy of shopping.com
Analysing the Effect of Offline Media on Online Conversion Actions
With the web recognised as one of the most popular mediums for information distribution, online advertising is nowadays a booming business. Users often arrive at a company website by interacting with search engine advertisements. In this paper, we investigate how offline advertising by means of TV and radio influences the search engine advertisement that leads to users visiting a company marketing website (a conversion action). Our research is based on the search engine-driven conversion actions of the 2012 marketing campaign 'Do Us A Flavour' of the chips manufacturer Lays, for which we experimented with several prediction models: linear regression (linear model), support vector regression (nonlinear model), and six distributed lag models (linear autoregressive models). Our results show that offline commercials positively influence the online marketing campaign. We have also determined that this influence is higher for TV than for radio, and that general-purpose TV channels have a higher impact on the number of conversion actions than specialised TV channels. In addition, we observed that TV advertisements have the highest influence on conversion actions in the first 50 minutes after the advertisement is broadcast
Dynamic facet ordering for faceted product search engines
Faceted browsing is widely used in Web shops and product comparison sites. In these cases, a fixed ordered list of facets is often employed. This approach suffers from two main issues. First, one needs to invest a significant amount of time to devise an effective list. Second, with a fixed list of facets, it can happen that a facet becomes useless if all products that match the query are associated with that particular facet. In this work, we present a framework for dynamic facet ordering in e-commerce. Based on measures for specificity and dispersion of facet values, the fully automated algorithm ranks those properties and facets on top that lead to a quick drill-down for any possible target product. In contrast to existing solutions, the framework addresses e-commerce specific aspects, such as the possibility of multiple clicks, the grouping of facets by their corresponding properties, and the abundance of numeric facets. In a large-scale simulation and user study, our approach generally compared favorably to a facet list created by domain experts, a greedy approach as baseline, and a state-of-the-art entropy-based solution
A Data Type-Driven Property Alignment Framework for Product Duplicate Detection on the Web
During the last decade, daily life has morphed into a world of broadband ubiquity, where devices facilitate constant engagement. As a consequence, the area of e-commerce has seen immense growth. Despite the market opportunities for retailers and the ease with which customers can acquire products through webshops, the shift to digital retail has its drawbacks. For example, it leads to cluttered and incomparable information among different webshops, which calls for an automated method to regain homogeneity in product representations. This paper presents a product duplicate detection solution, which exploits a data type-driven property alignment framework. Based on the performed experiment, we show a statistically significant improvement of the F-score from 47.91% to 78.13% compared to an existing state-of-the-art approach
Scalable entity resolution for Web product descriptions
Consumers are increasingly using the Web to find product information and make online purchases. This is reflected by the ongoing growth of worldwide e-commerce sales figures. Entity resolution is an important task that supports many services that have arisen from this growth, such as Web shop aggregators. In this paper, we propose a scalable framework for multi-source entity resolution. Our blocking approach employs model words to produce blocks that make our solution highly effective and efficient for the considered domains. An in-depth evaluation, performed using millions of experiments and three large datasets (on consumer electronics and software products), shows that our model words-based approach outperforms other approaches in most cases. Furthermore, we also evaluate our approach with an imperfect similarity function and find that model words-based blocking schemes provide the best blocks with respect to the F1-measure.
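To make the idea of blocking on model words more concrete, here is a minimal sketch of how such blocks might be built. It is not the paper's implementation. It assumes that a model word is a title token containing both letters and digits (the abstract does not define the term), and the function names and sample titles are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def model_words(title: str) -> set:
    """Assumed definition: tokens that mix letters and digits, e.g. 'un46es6500'."""
    tokens = re.findall(r"[a-z0-9\-]+", title.lower())
    return {
        t for t in tokens
        if any(c.isdigit() for c in t) and any(c.isalpha() for c in t)
    }


def build_blocks(titles: dict) -> dict:
    """Map each model word to the set of product ids whose title contains it."""
    blocks = defaultdict(set)
    for product_id, title in titles.items():
        for word in model_words(title):
            blocks[word].add(product_id)
    return dict(blocks)


def candidate_pairs(blocks: dict) -> set:
    """Only products sharing at least one block are compared later on."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical product titles from different Web shops.
titles = {
    "a": "Samsung UN46ES6500 46-Inch LED TV",
    "b": "Samsung 46\" UN46ES6500 Smart TV",
    "c": "Sony KDL40EX640 40-Inch LCD",
}
pairs = candidate_pairs(build_blocks(titles))
```

With these three titles, only products "a" and "b" share a block, so a later (and more expensive) similarity function would need to compare one pair instead of all three.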
